Expanding the Molecular Alphabet of DNA Data Storage Systems with Single-molecule Nanopore Readout
ORAL
Abstract
DNA is a promising next-generation data storage medium; however, the recording latency and synthesis cost of DNA oligos built from the four natural nucleotides remain high. In this talk, we describe a new DNA storage system that uses an extended 11-letter molecular alphabet combining natural and chemically modified nucleotides. Experimental results on a library of 77 oligo sequences show that different combinations of monomers can be readily discriminated using single-molecule detection with MspA nanopores. We further demonstrate full nanopore sequencing of hybrid synthetic DNA oligos on commercial Oxford Nanopore sequencers by developing a custom neural network architecture to classify raw current signals, yielding an average accuracy exceeding 60%, which is 39 times higher than random guessing. Molecular dynamics simulations show that most chemically modified nucleotides do not dramatically disrupt the DNA double helix, which suggests that the extended alphabet is compatible with PCR-based random-access data retrieval. Broadly, these methodologies provide a path forward for new implementations of molecular recorders.
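As a rough illustration of why a larger alphabet matters, the sketch below compares the ideal information content per letter of the 4-letter natural alphabet and the 11-letter extended alphabet. It assumes every letter is equally likely and read without error, which the abstract does not claim; the function name `bits_per_symbol` is hypothetical and used only for this example.

```python
import math


def bits_per_symbol(alphabet_size: int) -> float:
    """Ideal information per letter, assuming equiprobable, error-free symbols."""
    return math.log2(alphabet_size)


# Natural DNA alphabet (A, C, G, T) versus the extended 11-letter alphabet.
natural = bits_per_symbol(4)
extended = bits_per_symbol(11)

# Ratio of ideal per-letter capacities; an upper bound, not a measured density.
gain = extended / natural

print(f"natural:  {natural:.2f} bits per letter")
print(f"extended: {extended:.2f} bits per letter")
print(f"ideal gain: {gain:.2f}x")
```

Under these idealized assumptions, each letter of the extended alphabet carries about 3.46 bits instead of 2. Any practical figure would be lower once read errors and coding overhead are taken into account.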
Presenters
- Kasra Tabatabaei, University of Illinois at Urbana-Champaign
Authors
- Kasra Tabatabaei, University of Illinois at Urbana-Champaign
- Charles M Schroeder, University of Illinois at Urbana-Champaign
- Olgica Milenkovic, University of Illinois at Urbana-Champaign
- Chao Pan, University of Illinois at Urbana-Champaign
- Aleksei Aksimentiev, University of Illinois at Urbana-Champaign
- Jingqian Liu, University of Illinois at Urbana-Champaign
- Bach Pham, University of Massachusetts at Amherst
- Min Chen, University of Massachusetts at Amherst
- Shubham Chandak, Stanford University
- Alvaro G Hernandez, University of Illinois at Urbana-Champaign
- Spencer A Shorkey, University of Massachusetts at Amherst