DeepSense: A unified deep learning framework for time-series mobile sensing data processing

移动端时序感知数据处理的标准深度学习框架–DeepSense

http://www.kdnuggets.com/2017/08/deepsense-unified-deep-learning-framework-time-series-mobile.html?utm_content=bufferd0bc1&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer

DeepSense is a deep learning framework that runs on mobile devices, and can be used for regression and classification tasks based on data coming from mobile sensors (e.g., motion sensors). An example of a classification task is heterogeneous human activity recognition (HHAR) – detecting which activity someone might be engaged in (walking, biking, standing, and so on) based on motion sensor measurements. Another example is biometric motion analysis where a user must be identified from their gait. An example of a regression task is tracking the location of a car using acceleration measurements to infer position.
DeepSense是一种在移动设备上运行的深入学习框架,可以根据来自移动传感器(例如,运动传感器)的数据开展回归和分类任务。 分类任务的一个例子是异构人体活动识别(HHAR) — 基于运动传感器测量数据来检测某人可能在进行哪些活动(步行,骑车,站立等等)。 另一个例子是生物特征运动分析–从步态识别出用户。 回归任务的一个例子是利用加速度测量数据来跟踪汽车的位置来推断汽车以后的位置。

Compared to the state of the art, DeepSense provides an estimator with far smaller tracking error on the car tracking problem, and outperforms state-of-the-art algorithms on the HHAR and biometric user identification tasks by a large margin.
与最先进的框架相比,DeepSense提供了一个估计器,该估计器的跟踪误差远小于汽车跟踪问题的误差,并且明显优于HHAR和生物识别用户识别任务方面的最先进的算法。

Despite a general shift towards remote cloud processing for a range of mobile applications, we argue that it is intrinsically desirable that heavy sensing tasks be carried out locally on-device, due to the usually tight latency requirements, and the prohibitively large data transmission requirement as dictated by the high sensor sampling frequency (e.g., accelerometer, gyroscope). Therefore we also demonstrate the feasibility of implementing and deploying DeepSense on mobile devices by showing its moderate energy consumption and low overhead for all three tasks on two different types of smart device.

尽管我们一般将多数移动应用程序转移到远程云处理,但是转移过程对时间延迟要求很高,而且高传感器采样频率(如加速器,陀螺仪)导致数据传输难以支撑,因此我们打心底希望在本地设备上执行大型测试任务。 因此,我们通过展示在两种不同类型的智能设备上针对所有三项任务的适度能耗和低开销,表现出在移动设备上实现和部署DeepSense的可行性。

I’d add that on-device processing is also an important component of privacy for many potential applications.
In working through this paper, I ended up with quite a few sketches in my notebook before I reached a proper understanding of how DeepSense works. In this write-up I’m going to focus on taking you through the core network design, and if that piques your interest, the rest of the evaluation details etcetera should then be easy to pick up from the paper itself.
补充说明,对于许多潜在的应用程序来说,本地处理也是处理隐私的一个重要组成部分。
在撰写本文的过程中,在我深入了解DeepSense的工作原理之前,我在笔记本中已经画了不少草图。 在行文中,如果您感兴趣的话,我将重点带你了解核心网络设计,其余的评估细节等等应该很容易从论文中找到。

Processing the data from a single sensor

处理单传感器数据

Let’s start off by considering a single sensor (ultimately we want to build applications that combine data from multiple sensors). The sensor may provide multi-dimensional measurements: for example, a motion sensor that reports motion along the x, y, and z axes. We collect sensor readings in each of these d dimensions at regular intervals (i.e., a time series), which we can represent as a matrix with one row per dimension and one column per reading.
我们先考虑单个传感器(最终我们希望构建应用程序,将多个传感器的数据组合起来)。传感器可以提供多维测量。例如,一个运动传感器,它沿着x、y和z轴报告运动。我们每隔一段时间(即时间序列)收集这些D维中的传感器读数(即时间序列),我们可以用如下形式表示:

We’re going to process the data in non-overlapping windows of width τ. Dividing the duration of the time series sample by τ gives us the total number of windows, T. For example, if we have 5 seconds of motion sensor data and we divide it into windows lasting 0.25s each, we’ll have 20 windows.
我们要在宽度为τ的非重叠窗口中处理数据。 将时间序列样本中的数据点数除以τ可以得到总的窗口T数。例如,如果我们有5秒的运动传感器数据,并打算将它们划分为持续0.25秒的窗口,那么我们将得到 20 个窗口。
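As a rough illustration of the windowing step, here is a minimal Python sketch. The function name `split_into_windows` and the 100 Hz sampling rate are assumptions made for the example; the write-up only fixes the 5 second duration and the 0.25 second window width.

```python
def split_into_windows(samples, sample_rate_hz, window_seconds):
    """Split a list of readings into non-overlapping windows of equal length.

    Trailing readings that do not fill a whole window are dropped.
    """
    per_window = round(sample_rate_hz * window_seconds)  # readings per window
    if per_window <= 0:
        raise ValueError("window must hold at least one reading")
    num_windows = len(samples) // per_window  # this is T
    return [samples[i * per_window:(i + 1) * per_window] for i in range(num_windows)]


# 5 seconds of single-axis data at an assumed 100 Hz sampling rate.
readings = [0.0] * 500
windows = split_into_windows(readings, sample_rate_hz=100, window_seconds=0.25)
print(len(windows), len(windows[0]))  # T windows, readings per window
```

With these numbers the sketch yields T = 20 windows of 25 readings each, matching the example in the text.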

Finding patterns in the time series data works better in the frequency dimension than in the time dimension, so the next step is to take each of the T windows, and pass them through a Fourier transform resulting in f frequency components, each with a magnitude and phase. This gives us a d x 2f matrix for each window.
在频率维度上对时间序列数据中寻找模式比在时间维度上更好,所以下一步是将T窗口中的每一个小窗口通过傅立叶变换传递给f频率分量,每个频率分量具有大小和相位。每个窗口得到一个d x 2f的矩阵。
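To make the d x 2f shape concrete, the sketch below transforms one window with a plain discrete Fourier transform. It is an illustration only: the function name, the choice of f as the n/2 + 1 non-redundant components of a real signal, and the interleaved magnitude/phase layout are all assumptions, and a real implementation would use an FFT routine.

```python
import cmath
import math


def window_to_frequency_features(window):
    """Turn one window (d rows of n readings) into a d x 2f list of lists.

    Each row holds [magnitude_0, phase_0, magnitude_1, phase_1, ...].
    A plain DFT is used for clarity; real code would call an FFT routine.
    """
    features = []
    for axis_readings in window:
        n = len(axis_readings)
        num_freqs = n // 2 + 1  # f: non-redundant components of a real signal
        row = []
        for k in range(num_freqs):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(axis_readings)
            )
            row.extend([abs(coeff), cmath.phase(coeff)])
        features.append(row)
    return features


# One window of a 3-axis sensor (d = 3) with 8 readings per axis.
n = 8
window = [
    [math.cos(2 * math.pi * t / n) for t in range(n)],  # x: one cycle per window
    [1.0] * n,                                          # y: constant signal
    [0.0] * n,                                          # z: silent axis
]
features = window_to_frequency_features(window)
print(len(features), len(features[0]))  # d rows, 2f columns
```

Here the x axis concentrates its energy in the first non-zero frequency and the constant y axis in the zero-frequency component, which is the kind of structure the later convolutional layers can pick up.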

We’ve got T of these, and we can pack all of that data into a d x 2f x T tensor.
我们有了这些T,就可以将所有数据包装到一个d x 2f x T张量。
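The packing step can be sketched with nested lists. The helper names `pack_windows` and `window_slice`, and the sizes used, are hypothetical; a real implementation would use a tensor library rather than Python lists.

```python
def pack_windows(window_matrices):
    """Stack T matrices of shape d x 2f into a nested list indexed [d][2f][T]."""
    d = len(window_matrices[0])
    cols = len(window_matrices[0][0])
    return [
        [[m[i][j] for m in window_matrices] for j in range(cols)]
        for i in range(d)
    ]


def window_slice(tensor, t):
    """Recover the d x 2f matrix for window t."""
    return [[cell[t] for cell in row] for row in tensor]


# Hypothetical sizes: d = 3 axes, 2f = 4 columns, T = 5 windows.
# Each value encodes its own position as t*100 + i*10 + j.
matrices = [
    [[t * 100 + i * 10 + j for j in range(4)] for i in range(3)]
    for t in range(5)
]
tensor = pack_windows(matrices)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # d, 2f, T
```

Slicing along the last axis with `window_slice` gives back the matrix for a single window, which is how the data is consumed in the next step.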


It’s handy for the implementation to have everything nicely wrapped up in a single tensor at this point, but actually we’re going to process slice by slice in the T dimension (one window at a time). Each d x 2f window slice is passed through a convolutional neural network component comprising three stages.
实现所有的信息都很好地封装在单个张量中这一点是很方便的,但实际上我们将在t维中逐层处理(一次一个窗口)。每个d x 2f窗口切片通过卷积神经网络组件包括三个阶段,如下图所示:
First we use 2D convolutional filters to capture interactions among dimensions, and in the local frequency domain. The output is then passed through 1D convolutional filter layers to capture high-level relationships. The output of the last filter layer is flattened to yield the sensor feature vector.
首先,我们使用二维卷积滤波器捕捉各维数之间的交互,本地频率域进行同样的操作。然后输出通过一维卷积滤波器层来捕获高级关系。最后一个滤波器层的输出被压平以产生传感器特征向量。
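The three stages can be traced with a small pure-Python sketch. It is a simplification under stated assumptions: one filter per layer instead of many, a 2D kernel that spans all d rows, "valid" cross-correlation (as deep learning libraries usually implement convolution), and hypothetical kernel sizes and function names.

```python
def conv2d_valid(matrix, kernel):
    """'Valid' 2D cross-correlation of one matrix with one kernel, then ReLU."""
    rows, cols = len(matrix), len(matrix[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out = []
    for r in range(rows - k_rows + 1):
        out_row = []
        for c in range(cols - k_cols + 1):
            total = sum(
                matrix[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            out_row.append(max(0.0, total))  # ReLU
        out.append(out_row)
    return out


def sensor_feature_vector(window_matrix, kernel_2d, kernel_1d_a, kernel_1d_b):
    """2D conv across dimensions and frequency, two 1D convs, then flatten."""
    x = conv2d_valid(window_matrix, kernel_2d)
    x = conv2d_valid(x, [kernel_1d_a])  # a 1D filter is a kernel with one row
    x = conv2d_valid(x, [kernel_1d_b])
    return [value for row in x for value in row]


# d = 3 dimensions, 2f = 10 frequency columns, all-ones data and kernels.
window_matrix = [[1.0] * 10 for _ in range(3)]
kernel_2d = [[1.0] * 4 for _ in range(3)]  # spans all 3 dimensions, 4 columns
vector = sensor_feature_vector(window_matrix, kernel_2d, [1.0] * 3, [1.0] * 2)
print(len(vector))
```

The 2D stage collapses the d rows into one, and each 1D stage then shortens the frequency axis, so the 3 x 10 input ends up as a flat vector of length 4 in this toy configuration.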

Combining data from multiple sensors

整合多传感器数据

Follow the above process for each of the K sensors that are used by the application. We now have K sensor feature vectors, which we can pack into a matrix with K rows.
按照上述应用程序使用每K个传感器的流程。我们现在有了K个传感器特征向量,我们可以把它打包成一个具有K行的矩阵。
The sensor feature matrix is then fed through a second convolutional neural network component with the same structure as the one we just looked at. That is, a 2D convolutional filter layer followed by two 1D layers. Again, we take the output of the last filter layer and flatten it into a combined sensors feature vector. The window width τ is tacked onto the end of this vector.
然后传感器特征矩阵通过与我们刚刚看到结构相同的第二个卷积神经网络组件传送。 也就是说,二维卷积滤波层后面是两个一维向量层。 最后,我们将最后一个滤波器的输出压平为整合后的传感器特征向量。 窗口宽度τ附加在在该矢量的末端。
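The sketch below shows only the data flow of this merge step: stack K vectors, run them through the second network, and append τ. The second convolutional component is replaced by a stand-in function (`flatten_only`, hypothetical) so that the example stays short; the sensor values are made up.

```python
def combine_sensors(sensor_vectors, merge_network, window_width):
    """Stack K sensor feature vectors, run the merge network, append tau."""
    lengths = {len(v) for v in sensor_vectors}
    if len(lengths) != 1:
        raise ValueError("all sensor feature vectors must have the same length")
    feature_matrix = [list(v) for v in sensor_vectors]  # K rows
    combined = merge_network(feature_matrix)            # flattened output
    return list(combined) + [window_width]              # tau goes on the end


def flatten_only(matrix):
    """Stand-in for the second convolutional component (hypothetical)."""
    return [value for row in matrix for value in row]


accel = [0.1, 0.2, 0.3]  # made-up feature vector for an accelerometer
gyro = [0.4, 0.5, 0.6]   # made-up feature vector for a gyroscope
vector = combine_sensors([accel, gyro], flatten_only, window_width=0.25)
print(vector)
```

In a real network `merge_network` would be the 2D-then-1D convolutional stack described above, applied to the K-row matrix.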

For each convolutional layer, DeepSense learns 64 filters, and uses ReLU as the activation function. In addition, batch normalization is applied at each layer to reduce internal covariate shift.

对于每个卷积层, DeepSenses学习64个滤波器,并使用ReLU作为激活功能。 另外,在每层施加批量归一化以减少内部协变量。
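For readers unfamiliar with batch normalization, here is a minimal sketch of what it does to one filter's output across a mini-batch, followed by ReLU. The ordering (normalise, then ReLU), the parameter names `gamma` and `beta`, and the sample values are assumptions for illustration, not details taken from the paper.

```python
import math


def batch_norm_relu(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one feature across a mini-batch, rescale, then apply ReLU."""
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    normalised = [(x - mean) / math.sqrt(variance + eps) for x in batch]
    return [max(0.0, gamma * x + beta) for x in normalised]


activations = [2.0, 4.0, 6.0, 8.0]  # one filter output across a batch of four
print(batch_norm_relu(activations))
```

After normalisation the values are centred on zero, so with the default `beta` the ReLU zeroes out the below-average half of the batch; in training, `gamma` and `beta` are learned.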

Now we have a combined sensors feature vector for one time window. Repeat the above process for all T windows.
现在我们有一个时间窗口的组合传感器特征向量。 对所有T窗口重复上述过程。

So now we have T combined sensor feature vectors, each learning intra-window interactions. But of course it’s also important to learn inter-window relationships across time windows. To do this the feature vectors are fed into an RNN.
At this point I think we’re ready for the big picture.
所以现在我们有组合的传感器特征向量的T窗口,每个T窗口都学习窗口内的相互作用。 当然,跨时间窗口学习窗口之间的关系也很重要。 为了做到这一点,T特征向量被馈送到RNN中。关于这一点,我想我们已经准备好了一张大图来说明问题。

Instead of using LSTMs, the authors choose to use Gated Recurrent Units (GRUs) for the RNN layer.
作者在RNN层中采用了门循环单元(GRUs)而不是LSTMs。

… GRUs show similar performance to LSTMs on various tasks, while having a more concise expression, which reduces network complexity for mobile applications.

GRU在各种任务中表现出与LSTM相似的性能,同时具有更简单的结构,这降低了移动应用程序的网络复杂性。

DeepSense uses a stacked GRU structure with two layers. These can run incrementally when there is a new time window, resulting in faster processing of stream data.
DeepSense使用两层的堆叠式GRU结构。 当有新的时间窗口时,此结构可以逐步运行,从而更快地处理流数据。
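The incremental behaviour is easier to see in code. The following is a minimal, untrained sketch of a two-layer stacked GRU in pure Python. The class names, random weight initialisation, and sizes are assumptions; biases are omitted for brevity, and the update rule uses one of the two common conventions, h = (1 - z) * h_prev + z * candidate.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, vector):
    """Matrix-vector product: one output per row of weights."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


class GRUCell:
    """Single GRU layer with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        width = input_size + hidden_size  # gates read [input, previous state]

        def matrix():
            return [[rng.uniform(-0.5, 0.5) for _ in range(width)]
                    for _ in range(hidden_size)]

        self.w_update, self.w_reset, self.w_candidate = matrix(), matrix(), matrix()

    def step(self, x, h_prev):
        z = [sigmoid(a) for a in affine(self.w_update, x + h_prev)]  # update gate
        r = [sigmoid(a) for a in affine(self.w_reset, x + h_prev)]   # reset gate
        gated = [ri * hi for ri, hi in zip(r, h_prev)]
        cand = [math.tanh(a) for a in affine(self.w_candidate, x + gated)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]


class StackedGRU:
    """Two stacked GRU layers that keep their state between calls."""

    def __init__(self, input_size, hidden_size):
        self.layers = [GRUCell(input_size, hidden_size, seed=1),
                       GRUCell(hidden_size, hidden_size, seed=2)]
        self.states = [[0.0] * hidden_size for _ in self.layers]

    def push(self, window_vector):
        """Consume one new window and return the top layer's output."""
        x = list(window_vector)
        for i, layer in enumerate(self.layers):
            self.states[i] = layer.step(x, self.states[i])
            x = self.states[i]
        return x


rnn = StackedGRU(input_size=3, hidden_size=4)
for window_vector in ([0.1, 0.2, 0.3], [0.0, -0.1, 0.4]):
    output = rnn.push(window_vector)  # only the new window is processed
print(len(output))
```

Because each call to `push` reuses the stored hidden states, a new window costs one step per layer rather than a re-run over the whole sequence, which is the property that helps with streaming data.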

Top it all with an output layer

将其全部数据输出到输出层

The output of the recurrent layer is a series of T vectors, one for each time window.
For regression-based tasks (e.g., predicting car location), the output layer is a fully connected layer on top of each of those vectors, sharing weights and bias term across windows to produce a prediction for each window.
循环层的输出是一系列T向量(每个时间窗口一个T向量)。对于基于回归的任务(例如,预测汽车位置),输出层是所有向量之上的全连接层,所有向量共享用于学习的权重和偏置项。
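A shared fully connected layer simply means the same weights and bias are applied to every window's vector. A minimal sketch, with hypothetical weights and a made-up two-dimensional output:

```python
def regression_outputs(recurrent_outputs, weights, bias):
    """Apply one shared fully connected layer to each of the T vectors."""
    predictions = []
    for vector in recurrent_outputs:
        predictions.append([
            sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)
        ])
    return predictions


# T = 3 windows, hidden size 2, and a 2-D output (e.g. an x/y displacement).
recurrent_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[2.0, 0.0], [0.0, 3.0]]  # hypothetical trained weights
bias = [0.5, -0.5]
predictions = regression_outputs(recurrent_outputs, weights, bias)
print(predictions)
```

The result has one prediction per time window, so the output sequence has the same length T as the input sequence.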
For classification tasks, the individual vectors are composed into a single fixed-length vector for further processing. You could use something fancy like a weighted average over time learned by an attention network, but in this paper excellent results are obtained simply by averaging over time (adding up the vectors and dividing by T). This final feature vector is fed into a softmax layer to generate the final category prediction.
对于分类任务,将单个矢量组成一个固定长度的单矢量以便进一步处理。 您可以使用类似注意网络学习的加权平均值一样的技巧,但是在本文中,通过取时间的平均值(加上向量和除以T)可以获得优异的结果。 最终特征向量被传送到softmax层以生成最终类别预测分数。
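Averaging over time followed by a softmax layer can be sketched in a few lines. The function names, the weights, and the three-class setup are assumptions chosen for the example.

```python
import math


def average_over_time(vectors):
    """Element-wise mean of T equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(recurrent_outputs, weights, bias):
    """Average over time, apply a linear layer, and return class probabilities."""
    pooled = average_over_time(recurrent_outputs)
    logits = [sum(w * v for w, v in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


# T = 2 windows, hidden size 2, three hypothetical activity classes.
recurrent_outputs = [[1.0, 3.0], [3.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
bias = [0.0, 0.0, 0.0]
probabilities = classify(recurrent_outputs, weights, bias)
print(probabilities)
```

Averaging makes the classifier input independent of T, so sequences of different lengths map to a vector of the same size.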

Customise for the application in hand

自定义手中的应用程序

To tailor DeepSense for a particular mobile sensing and computing task, the following steps are taken:
遵从以下步骤将DeepSense打磨成适合特定的移动测绘和计算任务的系统

  • Identify the number of sensor inputs, K, and pre-process the inputs into a set of d x 2f x T tensors.
  • 确定传感器的输入个数k,将预处理输入到一组d x 2f x T张量中。
  • Identify the type of the task and select the appropriate output layer.
  • 确定任务类型并选择合适的输出层
  • Optionally customise the cost function. The default cost function for regression oriented tasks is mean squared error, and for classification it is cross-entropy error.
  • 可选择自定义代价函数。 用于面向回归的任务的默认代价函数是均方误差,对于分类任务,选择交叉熵误差。
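The two default cost functions named in the last step are standard; a minimal sketch for a single example follows. The function names and sample values are made up for illustration.

```python
import math


def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def cross_entropy(probabilities, true_class):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(probabilities[true_class])


# Regression: one of three predictions is off by 2.
print(mean_squared_error([1.0, 2.0, 4.0], [1.0, 2.0, 2.0]))

# Classification: the correct class (index 1) received probability 0.5.
print(cross_entropy([0.25, 0.5, 0.25], true_class=1))
```

Both losses shrink towards zero as predictions improve: the squared error as predictions approach the targets, and the cross-entropy as the probability of the correct class approaches 1.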

For the activity recognition (HHAR) and user identification tasks in the evaluation, the default cost function is used. For car location tracking, a negative log likelihood function is used (see section 4.2 of the paper for details).
对于评估中的活动识别(HHAR)和用户识别任务,使用默认代价函数。 对于汽车位置跟踪任务,使用负对数似然函数(详见4.2节)。

Key results

主要成果

Here’s the accuracy that DeepSense achieves on the car tracking task, versus the sensor-fusion and eNav algorithms. The map-aided accuracy column shows the accuracy achieved when the location is mapped to the nearest road segment on a map.
以下是DeepSense相对于传感器融合和eNav算法在汽车跟踪任务上的准确性。 地图辅助精度栏显示了位置被映射到地图上最近的道路段时得到的准确度。

On the HHAR task, DeepSense outperforms other methods by 10%.
DeepSense 在异构人体活动识别(HHAR)任务上优于其它方法10%。
And on the user identification task, by 20%.
在用户识别任务上优于其它方法20%。

We evaluated DeepSense via three representative mobile sensing tasks, where DeepSense outperformed state of the art baselines by significant margins while still claiming its mobile-feasibility through moderate energy consumption and low latency on both mobile and embedded platforms.

我们通过三个具有代表性的移动端测量任务评估了DeepSense,DeepSense的高利润高于现有技术基准线,同时仍然通过在移动和嵌入式平台上适度的能耗和低延迟的表现证明其移动端部署可行性。

The evaluation tasks focused mostly on motion sensors, but the approach can be applied to many other sensor types including microphone, Wi-Fi signal, barometer, and light sensors.
评估任务主要集中在运动传感器上,但该方法还可以应用于许多其他传感器类型,包括麦克风,路由器,气压计和光传感器。
