Towards Interpretable Imputation of Missing Observations with Time Slice Synthetic Minority Oversampling Technique

ORAL

Abstract

Sparse and irregular time series data make it difficult to compare the trajectories of time-dependent systems in a controlled manner. In particular, data points sampled at different times do not allow for such a comparison, since two independent signals will generally behave differently at different times. Further, the irregularity of this type of data makes it difficult to use state-of-the-art machine learning techniques such as recurrent neural networks, due to the rigidity of their architectures. To deal with these issues, we developed a simple yet novel non-parametric time series imputation technique with the goal of constructing a time series, possibly irregularly spaced, whose observation times are uniform across every sample in a data set. Specifically, we fix a grid defined by the midpoints of non-overlapping bins (dubbed "slices") of observation times and ensure that each sample has values for all of the features at each grid time. This makes it possible both to impute fully missing observations, enabling uniform time series classification across the entire data set, and, in special cases, to impute individually missing features. We illustrate the technique with a number of examples and outline proof-of-concept directions for future research in biology and medicine.
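
The grid construction described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not specify how values are synthesized at the slice midpoints, so plain linear interpolation between a sample's neighboring observations (via NumPy's `np.interp`) is swapped in as a stand-in. Equal-width slices and the function names `slice_midpoints` and `impute_on_grid` are likewise assumptions made for illustration.

```python
import numpy as np


def slice_midpoints(t_start, t_end, n_slices):
    """Midpoints of n_slices equal-width, non-overlapping bins on [t_start, t_end].

    Equal-width bins are an assumption; the abstract only requires
    that the slices do not overlap.
    """
    edges = np.linspace(t_start, t_end, n_slices + 1)
    return (edges[:-1] + edges[1:]) / 2.0


def impute_on_grid(times, values, grid):
    """Give one sample a value for every feature at every grid time.

    times  : shape (n_obs,), the sample's own observation times
    values : shape (n_obs, n_features), fully observed feature values
    grid   : shape (n_grid,), the shared slice midpoints

    Stand-in rule (assumption): linear interpolation per feature.
    np.interp holds the first/last observed value outside the observed range.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(times)  # np.interp expects increasing x-coordinates
    times, values = times[order], values[order]
    columns = [np.interp(grid, times, values[:, j]) for j in range(values.shape[1])]
    return np.column_stack(columns)


# Two hypothetical samples observed at different, irregular times.
grid = slice_midpoints(0.0, 4.0, n_slices=2)
sample_a = impute_on_grid([0.0, 4.0], [[0.0, 10.0], [4.0, 30.0]], grid)
sample_b = impute_on_grid([0.5, 1.5, 3.5], [[1.0, 5.0], [2.0, 6.0], [3.0, 7.0]], grid)

# Both samples now share the same time axis and can be stacked for a classifier.
stacked = np.stack([sample_a, sample_b])  # shape: (n_samples, n_grid, n_features)
```

After this step every sample has the same number of time points at the same times, which is the property that fixed-architecture models such as recurrent neural networks need. Imputing individually missing features would require extra handling that this sketch does not attempt.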

Presenters

  • Andrew Baumgartner

    Institute for Systems Biology

Authors

  • Andrew Baumgartner

    Institute for Systems Biology