Feature Engineering for Data Driven Applications in Physical Sciences
ORAL
Abstract
In data-driven applications, features are the attributes of a system that represent the underlying problem to the algorithm. Feature engineering transforms raw data into descriptive and discriminative features. Beyond improving the performance of predictive models, it can also improve their interpretability. Owing to this importance, many approaches have been developed for feature construction, selection, and transformation. However, the nature of the data produced in physical science problems makes some of these approaches sub-optimal and can render others misleading.
In this talk, we focus on how well different feature engineering approaches suit physical science applications. In particular, we use illustrative datasets to investigate multicollinearity among groups of features of arbitrary size. Finally, we compare the performance of the different approaches on a corpus of data from fluid flow simulations.
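As a rough illustration of why collinearity within a group of features can go unnoticed, the sketch below builds a synthetic dataset in which one feature is nearly a linear combination of three others. It uses the variance inflation factor (VIF), a standard diagnostic computed here from the diagonal of the inverse correlation matrix. The abstract does not say which diagnostic the talk uses, so the choice of VIF, the helper name `variance_inflation_factors`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return one VIF per column of X (hypothetical helper).

    For standardized features, VIF_j equals the j-th diagonal entry of the
    inverse of the feature correlation matrix.
    """
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))


# Synthetic data (an assumption, not the fluid flow corpus from the talk).
rng = np.random.default_rng(0)
n = 500
a, b, c = rng.normal(size=(3, n))
d = a + b + c + 0.05 * rng.normal(size=n)  # nearly a + b + c
e = rng.normal(size=n)                     # independent of the rest
X = np.column_stack([a, b, c, d, e])

# Largest absolute pairwise correlation, ignoring the diagonal.
corr = np.corrcoef(X, rowvar=False)
max_pairwise = np.abs(corr - np.eye(X.shape[1])).max()

vif = variance_inflation_factors(X)

print("max |pairwise correlation|:", round(float(max_pairwise), 3))
print("VIF per feature:", np.round(vif, 1))
```

In this construction no single pair of features is strongly correlated, yet the VIF of the dependent feature `d` is very large, because the dependence only appears when the group of four features is considered together. A pairwise correlation screen alone would therefore miss it.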
Presenters
-
Aashwin Mishra
Stanford University
Authors
-
Aashwin Mishra
Stanford University
-
Gianluca Iaccarino
Stanford University