Effective 2D Stroke-based Gesture Augmentation for RNNs


Abstract

Recurrent neural networks (RNNs) require large training datasets from which they learn new class models. This limitation prohibits their use in custom gesture applications where only one or two end user samples are given per gesture class. One common way to enhance sparse datasets is to use data augmentation to synthesize new samples. Although there are numerous known techniques, they are often treated as standalone approaches when in reality they are often complementary. We show that by intelligently chaining together augmentation techniques that simulate different gesture production variability types, such as those affecting the temporal and spatial qualities of a gesture, we can significantly increase RNN accuracy without sacrificing training time. Through experimentation on four public stroke-based 2D gesture datasets, we show that RNNs trained with our data augmentation chaining technique achieve state-of-the-art recognition accuracy in both writer-dependent and writer-independent test scenarios.

Publication
In CHI ’23 Conference on Human Factors in Computing Systems

Short Summary

RNNs require a lot of original training samples and training time to achieve high accuracy. We evaluate the effectiveness of gesture augmentation techniques for RNNs. Chaining these techniques in a specific way improves accuracy even more.
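
To make the idea of chaining concrete, here is a minimal sketch in Python. It is not the paper's implementation. This page does not name the individual techniques, their order, or their parameters, so everything below is an assumption chosen for illustration. The function names `resample_nonuniform`, `rotate_scale`, and `augment` are hypothetical, and so are the values 32 points, 0.3 spacing variance, 10 degrees, and 0.1 scale range. One step perturbs the temporal quality of a stroke by resampling it with randomly varied spacing, and the next perturbs its spatial quality with a small random rotation and scaling.

```python
import math
import random


def resample_nonuniform(points, n, variance, rng):
    """Temporal stand-in: resample a stroke to n points with random spacing.

    Assumes len(points) >= 2, n >= 2, and 0 <= variance < 1.
    """
    # Cumulative arc length along the original stroke.
    arc = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        arc.append(arc[-1] + math.hypot(x1 - x0, y1 - y0))
    # Random positive interval lengths, later normalized to the path length.
    steps = [1.0 + rng.uniform(-variance, variance) for _ in range(n - 1)]
    total = sum(steps)
    out, travelled, seg = [points[0]], 0.0, 0
    for step in steps[:-1]:
        travelled += step / total * arc[-1]
        # Advance to the original segment that contains this distance.
        while seg < len(arc) - 2 and arc[seg + 1] < travelled:
            seg += 1
        span = arc[seg + 1] - arc[seg]
        t = (travelled - arc[seg]) / span if span > 0 else 0.0
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    out.append(points[-1])  # keep the exact end point
    return out


def rotate_scale(points, max_angle, max_scale, rng):
    """Spatial stand-in: random rotation and per-axis scaling about the centroid."""
    angle = rng.uniform(-max_angle, max_angle)
    sx = 1.0 + rng.uniform(-max_scale, max_scale)
    sy = 1.0 + rng.uniform(-max_scale, max_scale)
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y in points:
        dx, dy = (x - cx) * sx, (y - cy) * sy
        out.append((cx + c * dx - s * dy, cy + s * dx + c * dy))
    return out


def augment(sample, count, seed=0):
    """Synthesize `count` gestures from one sample by applying the chain in order."""
    rng = random.Random(seed)
    chain = [
        lambda p: resample_nonuniform(p, 32, 0.3, rng),        # temporal step
        lambda p: rotate_scale(p, math.radians(10), 0.1, rng),  # spatial step
    ]
    synthetic = []
    for _ in range(count):
        points = sample
        for step in chain:
            points = step(points)
        synthetic.append(points)
    return synthetic
```

Calling `augment([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)], 5)` would return five 32-point variants of an L-shaped stroke, which could then be added to the training set alongside the original sample. The order of the steps in `chain` is arbitrary in this sketch; the summary above indicates that the specific chaining order matters for accuracy.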

Mykola Maslych
Computer Science PhD Candidate

My research interests include machine learning applied to 3D user interfaces and HCI in general.