Predicting Microbial Community Compositions and Oil Contamination in Water Samples Using Neural Networks and Generative Models

Tong Gao; Isaac Bigcraft; Stephen Techtmann; Issei Nakamura

ORAL

Abstract

Understanding microbial communities and their response to oil contamination is vital for effective environmental monitoring and biodegradation. However, modeling high-dimensional biological datasets is often challenging because experimental data are limited. To address this, we developed a model that predicts microbial compositions and oil contamination in water samples using artificial neural networks. Our approach integrates dimensionality reduction, a noise injection algorithm, and a variational autoencoder (VAE) to handle high-dimensional, non-linear, and sparse data. We demonstrate that dimensionality reduction based on feature importance from decision trees enhances model training performance. Additionally, we employ a noise injection method to generate synthetic data, which improves VAE training by helping the model learn the underlying data distribution. This straightforward combination of standard neural networks significantly enhances training performance and predictive power, achieving an R² of up to 0.99.
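
As a rough illustration of the noise injection step, the sketch below generates synthetic samples by perturbing each measured composition with small random noise. The abstract does not specify the noise model, so everything here is an assumption: Gaussian noise, clipping at zero, renormalizing each sample to sum to 1 (treating the features as relative abundances), and the hypothetical names `inject_noise`, `n_copies`, and `sigma`. The `r_squared` helper is the standard coefficient of determination and is included only to show how a reported R² could be computed from predictions.

```python
import random


def inject_noise(samples, n_copies=5, sigma=0.01, seed=0):
    """Create synthetic samples by adding Gaussian noise to each real sample.

    Assumptions (not taken from the abstract): noise is Gaussian with standard
    deviation `sigma`, negative values are clipped to zero, and each synthetic
    sample is renormalized so its entries sum to 1.
    """
    rng = random.Random(seed)  # seeded generator for reproducibility
    synthetic = []
    for sample in samples:
        for _ in range(n_copies):
            noisy = [max(0.0, x + rng.gauss(0.0, sigma)) for x in sample]
            total = sum(noisy)
            if total == 0.0:
                continue  # skip the degenerate case where everything was clipped
            synthetic.append([x / total for x in noisy])
    return synthetic


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Requires y_true to contain at least two distinct values (SS_tot > 0).
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Two made-up relative-abundance profiles, used purely for illustration.
real = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
augmented = inject_noise(real, n_copies=3, sigma=0.02)
```

In a workflow like the one described, the augmented samples would be combined with the measured ones before training the VAE; a suitable value of `sigma` depends on the measurement noise in the data and would need tuning.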

March 18, 2025, 12:24 PM – March 18, 2025, 12:36 PM

Presenters

Tong Gao

Department of Physics, Michigan Technological University

Authors

Tong Gao

Department of Physics, Michigan Technological University

Isaac Bigcraft

Department of Biological Sciences, Michigan Technological University

Stephen Techtmann

Department of Biological Sciences, Michigan Technological University

Issei Nakamura

Department of Physics, Michigan Technological University