VASPsol Parameterization and Optimization using Machine Learning Surrogate Models and Active Learning
ORAL
Abstract
This work parameterizes VASPsol, a DFT solvation package, for 18 non-aqueous solvents by combining a machine learning (ML) surrogate model, experimental data from the Minnesota Solvation Database, and an active learning (AL) loop. We train a convolutional neural network (CNN) on sigma profiles and VASPsol parameters to learn the error in VASPsol's predicted solvation energies. The CNN is first trained on grid searches over three VASPsol parameters for each solvent: the electronic density cutoff, the cavity diffusivity, and the effective surface tension. We then predict the optimal parameters for each solvent using particle swarm optimization coupled with the CNN. Solvation energy errors are computed with each solvent's new parameters and added to the training dataset, so that AL retrains the CNN to minimize VASPsol errors while reducing the number of DFT evaluations needed. The AL loop is repeated three times, yielding parameters for 18 non-aqueous solvents with a mean absolute error (MAE) of 1.3 kcal/mol, down from 4.95 kcal/mol. The MAE for aqueous systems improves from 1.2 to 1.11 kcal/mol. Our efforts (1) extend VASPsol to 18 non-aqueous solvents through data-driven parameterization and (2) provide a framework for the continued improvement of VASPsol.
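The loop described in the abstract (grid search, surrogate fit, particle swarm search on the surrogate, re-evaluation, retraining) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: a quadratic ridge regression stands in for the CNN, the hypothetical `toy_error` function stands in for a VASPsol/DFT error evaluation, the three parameters are rescaled to the unit cube, and only a single solvent with no sigma profile input is considered.

```python
import random

random.seed(0)
TRUE_OPT = (0.3, 0.6, 0.5)  # hypothetical optimum of the toy error surface

def toy_error(x):
    """Stand-in for an expensive DFT error evaluation (assumption)."""
    return sum((a - b) ** 2 for a, b in zip(x, TRUE_OPT))

def features(x):
    """Quadratic feature map: bias, linear terms, and pairwise products."""
    quad = [x[i] * x[j] for i in range(len(x)) for j in range(i, len(x))]
    return [1.0, *x, *quad]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_surrogate(X, y, ridge=1e-8):
    """Ridge-regularized least squares on quadratic features (replaces the CNN)."""
    F = [features(x) for x in X]
    m = len(F[0])
    AtA = [[sum(f[i] * f[j] for f in F) + (ridge if i == j else 0.0)
            for j in range(m)] for i in range(m)]
    Aty = [sum(f[i] * yi for f, yi in zip(F, y)) for i in range(m)]
    w = solve(AtA, Aty)
    return lambda x: sum(wi * fi for wi, fi in zip(w, features(x)))

def pso_minimize(f, dim, n_particles=30, n_iter=100, w=0.7, c1=1.5, c2=1.5):
    """Basic global-best particle swarm optimization on the unit box."""
    pos = [[random.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best = [p[:] for p in pos]
    best_val = [f(p) for p in pos]
    g = best[best_val.index(min(best_val))][:]
    for _ in range(n_iter):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[k][d] = (w * vel[k][d] + c1 * r1 * (best[k][d] - pos[k][d])
                             + c2 * r2 * (g[d] - pos[k][d]))
                pos[k][d] = min(1.0, max(0.0, pos[k][d] + vel[k][d]))
            val = f(pos[k])
            if val < best_val[k]:
                best[k], best_val[k] = pos[k][:], val
        g = best[best_val.index(min(best_val))][:]
    return g

def active_learning(n_rounds=3, grid_pts=3):
    # Initial training set: coarse grid search over the three parameters.
    axis = [i / (grid_pts - 1) for i in range(grid_pts)]
    X = [[a, b, c] for a in axis for b in axis for c in axis]
    y = [toy_error(x) for x in X]
    for _ in range(n_rounds):
        surrogate = fit_surrogate(X, y)         # retrain on all data so far
        x_new = pso_minimize(surrogate, dim=3)  # cheap search on the surrogate
        X.append(x_new)                         # one new expensive evaluation
        y.append(toy_error(x_new))
    return X, y
```

Calling `active_learning()` returns the 27 grid points plus one new evaluation per round. In this toy setting the expensive function is called only once per round, while the swarm makes thousands of cheap surrogate calls, which is the cost trade-off the abstract attributes to AL.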
Presenters
- Eric C Fonseca, University of Florida
Authors
- Eric C Fonseca, University of Florida
- Sean Florez, University of Florida
- Richard G Hennig, Department of Materials Science and Engineering, University of Florida