VASPsol Parameterization and Optimization using Machine Learning Surrogate Models and Active Learning

ORAL

Abstract

This work parameterizes VASPsol, a DFT solvation package, for 18 non-aqueous solvents using a machine learning (ML) surrogate model, experimental data from the Minnesota Solvation Dataset, and an active learning (AL) loop. We train a convolutional neural network (CNN) on sigma profiles and VASPsol parameters to learn the error of VASPsol's solvation energy predictions. The CNN is first trained on grid searches over the electronic density cutoff, cavity diffusivity, and effective surface tension for each solvent. We then predict the optimal parameters for each solvent using particle swarm optimization coupled with the CNN, compute new solvation energy errors with those parameters, and add them to the training dataset. This AL loop retrains the CNN to minimize VASPsol errors while reducing the number of DFT evaluations needed. Repeating the loop three times yields parameterizations for 18 non-aqueous solvents with a mean absolute error (MAE) of 1.3 kcal/mol, down from 4.95 kcal/mol. The MAE for aqueous systems improves from 1.2 to 1.11 kcal/mol. Our efforts (1) enhance the generalizability of VASPsol to 18 non-aqueous solvents through data-driven parameterization and (2) provide a framework for the continued improvement of VASPsol.
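
The loop described above (grid search, surrogate fit, particle swarm search on the surrogate, re-evaluation, retraining) can be illustrated with a minimal sketch. It is not the authors' implementation: the CNN is replaced by a simple least-squares quadratic surrogate, the DFT evaluation by a hypothetical `toy_error` function, the sigma-profile input is omitted, and the three parameters are assumed to be rescaled to the range [0, 1]. All function names are illustrative.

```python
import numpy as np

def toy_error(x):
    """Hypothetical stand-in for the expensive VASPsol/DFT error evaluation."""
    target = np.array([0.3, 0.6, 0.5])  # assumed optimum, for illustration only
    return float(np.sum((x - target) ** 2))

def fit_surrogate(X, y):
    """Fit a least-squares quadratic surrogate (stand-in for the CNN)."""
    def features(Z):
        Z = np.atleast_2d(Z)
        return np.hstack([np.ones((Z.shape[0], 1)), Z, Z ** 2])
    coef, *_ = np.linalg.lstsq(features(X), y, rcond=None)
    return lambda Z: features(Z) @ coef

def pso_minimize(f, dim, rng, n_particles=30, n_iter=100, w=0.7, c1=1.5, c2=1.5):
    """Basic particle swarm optimization on the unit box [0, 1]^dim."""
    pos = rng.random((n_particles, dim))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()          # per-particle best positions
    best_val = f(pos)              # per-particle best surrogate values
    g = best_pos[np.argmin(best_val)].copy()  # swarm-wide best position
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)
        val = f(pos)
        improved = val < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = val[improved]
        g = best_pos[np.argmin(best_val)].copy()
    return g

def active_learning(n_rounds=3, seed=0):
    """Grid search, then repeat: fit surrogate, optimize it, evaluate, augment."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 3)
    X = np.array(np.meshgrid(grid, grid, grid)).reshape(3, -1).T  # 27 grid points
    y = np.array([toy_error(x) for x in X])
    for _ in range(n_rounds):
        surrogate = fit_surrogate(X, y)
        x_new = pso_minimize(surrogate, dim=3, rng=rng)
        X = np.vstack([X, x_new])              # add the new point to the dataset
        y = np.append(y, toy_error(x_new))     # one "expensive" evaluation per round
    return X[np.argmin(y)], float(y.min())

if __name__ == "__main__":
    best_params, best_err = active_learning()
    print(best_params, best_err)
```

In this sketch each round costs one expensive evaluation, while the swarm queries only the cheap surrogate, which is the source of the savings in DFT evaluations that the abstract refers to.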

Presenters

  • Eric C Fonseca

    University of Florida

Authors

  • Eric C Fonseca

    University of Florida

  • Sean Florez

    University of Florida

  • Richard G Hennig

    Department of Materials Science and Engineering, University of Florida