Time-Series Analysis: Wearable Devices using DTW and kNN

Recently, there has been great success in applying dynamic time warping (DTW) to time-series analysis. Indeed, when combined with simple algorithms such as k-Nearest Neighbours (kNN), DTW has matched the accuracy of state-of-the-art methods such as deep neural networks.

Here we discuss a supervised classification example from wearable devices as a possible use case for DTW.

k-Nearest Neighbours is one of the simplest machine learning algorithms. At its heart, it is an instance-based classifier that assigns new data to the class most common among its k nearest labelled examples.

For example, say we have two classes of elements (blue squares and red triangles), and we are trying to place a new element (shown as a green circle) into one of these classes. We look at the “nearest” neighbours and let them vote on the new element’s class. Figure 1 shows what happens: if we take k=1, the new example is classified as Class 1; for k=3, it would be placed in Class 2.

Figure 1: k-Nearest Neighbour classification. A new example is classified by a vote of its neighbours.

Note that this is a purely local, instance-based method, with the usual caveats: the result is sensitive to the choice of k and of distance metric, and it implicitly assumes clusters that are roughly spherical (in the chosen metric).
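To make the voting concrete, here is a minimal sketch of kNN with the usual Euclidean distance (pure NumPy; the 2-D points, labels and query point are illustrative placeholders, not the data behind Figure 1):

```python
import numpy as np

def knn_classify(x_new, X_train, y_train, k=3):
    """Classify x_new by a majority vote of its k nearest training points."""
    # Euclidean distance from the new point to every labelled example.
    dists = np.linalg.norm(X_train - x_new, axis=1)
    # Labels of the k closest examples.
    votes = y_train[np.argsort(dists)[:k]]
    # Majority vote.
    classes, counts = np.unique(votes, return_counts=True)
    return classes[np.argmax(counts)]

# Toy 2-D data: two class-1 points ("squares") and three class-2 points ("triangles").
X_train = np.array([[1.0, 1.0], [0.0, 0.0], [2.0, 2.0], [2.2, 2.2], [2.4, 2.0]])
y_train = np.array([1, 1, 2, 2, 2])
x_new = np.array([1.4, 1.4])  # the "green circle"

print(knn_classify(x_new, X_train, y_train, k=1))  # -> 1
print(knn_classify(x_new, X_train, y_train, k=3))  # -> 2
```

With this toy data the decision flips from Class 1 at k=1 to Class 2 at k=3, echoing the behaviour described above.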

K-Nearest Neighbours and Dynamic Time Warping

The typical kNN algorithm uses the Euclidean distance to define nearest neighbours. Instead, we will use DTW as the distance-like similarity score, and for simplicity we take k=1.

Figure 2: Combining kNN with DTW.

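As a rough illustration of what the DTW score itself looks like, here is the classic O(n·m) dynamic-programming recurrence with an absolute-difference point cost (a sketch of the idea, not necessarily the exact implementation behind Figure 2):

```python
import numpy as np

def dtw_distance(s, t):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(s), len(t)
    # cost[i, j] = best cumulative cost of aligning s[:i] with t[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])              # point-wise cost (simple norm)
            cost[i, j] = d + min(cost[i - 1, j],      # stretch s
                                 cost[i, j - 1],      # stretch t
                                 cost[i - 1, j - 1])  # match both
    return cost[n, m]

# Two sine waves with the same shape but a phase shift:
x = np.linspace(0, 2 * np.pi, 50)
a, b = np.sin(x), np.sin(x - 0.5)

print(dtw_distance(a, b))    # small: the warping absorbs most of the shift
print(np.abs(a - b).sum())   # much larger: a point-wise norm penalises the shift
```

Note the quadratic cost in the sequence lengths; in practice, windowing constraints such as Sakoe-Chiba bands are often used to keep this manageable.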

Wearable Devices

Fitbits, Apple Watches, Google Glass and similar wearables all record time-series measurements to track a user’s movements. The general idea is to classify movements into categories such as sitting, standing, walking and lying down. Given a template wave for each of these activities, we can compare a new sequence against them using kNN with DTW.

Comparing new movements with the template waves is straightforward: compute the DTW distance (using a simple point-wise norm) between the new sequence and each template, and assign the label of the nearest template.
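One possible sketch of that comparison, using the dtw_distance function from the earlier sketch and a plain 1-NN rule (the template waves and the incoming window below are made-up placeholders, not actual device data):

```python
import numpy as np

def classify_activity(window, templates, distance):
    """1-nearest-neighbour: return the label of the closest template wave."""
    return min(templates, key=lambda label: distance(window, templates[label]))

# Hypothetical template waves, one per activity (in practice these would come
# from labelled sensor recordings rather than synthetic signals).
t = np.linspace(0, 2 * np.pi, 100)
templates = {
    "sitting":  np.zeros_like(t),   # flat signal
    "standing": 0.1 * np.sin(t),    # small, slow drift
    "walking":  np.sin(4 * t),      # regular oscillation
}

# A new window that looks like "walking", but shifted in time.
window = np.sin(4 * t - 0.7)

# dtw_distance as defined in the earlier sketch.
print(classify_activity(window, templates, dtw_distance))  # -> "walking"
```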

Plugging this into a test dataset, we find the following confusion matrix:

This corresponds to an F1 score of approximately 85% across the various categories.
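For reference, a confusion matrix and a macro-averaged F1 score of this kind can be computed with scikit-learn; the labels and predictions below are illustrative placeholders, not the actual test results:

```python
from sklearn.metrics import confusion_matrix, f1_score

labels = ["sitting", "standing", "walking", "lying down"]

# Placeholder ground truth and 1-NN/DTW predictions for a handful of test windows.
y_true = ["sitting", "walking", "walking", "standing", "lying down", "standing"]
y_pred = ["sitting", "walking", "standing", "standing", "lying down", "standing"]

print(confusion_matrix(y_true, y_pred, labels=labels))
print(f1_score(y_true, y_pred, labels=labels, average="macro"))
```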

Final Thoughts

Even with this minimal setup, we have already achieved high accuracy without any feature engineering. These methods can be combined with any distance-based algorithm and applied to a wide range of problems; they have shown great success in time-series analyses in healthcare, sports, finance and more. Further, we can also ensemble these techniques with more standard feature-generation methods to achieve even better results.

DTW as a distance measure can also be used with more sophisticated algorithms, and indeed there is much ongoing research to further improve on the results shown in this analysis.