DNA data storage: AI method speeds up data retrieval by 3,200 times

March 21, 2025

Data used for DNA experiments. Credit: Nature Machine Intelligence (2025). DOI: 10.1038/s42256-025-01003-z

Researchers from the Henry and Marilyn Taub Faculty of Computer Science have developed an AI-based method that accelerates DNA-based data retrieval by three orders of magnitude while significantly improving accuracy. The research team included Ph.D. student Omer Sabary, Dr. Daniella Bar-Lev, Dr. Itai Orr, Prof. Eitan Yaakobi, and Prof. Tuvi Etzion.

The research is published in the journal Nature Machine Intelligence.

DNA data storage is an emerging field that uses DNA as a platform for storing information. DNA offers significant advantages as a storage medium, including:

  • Long-term preservation: In 2013, researchers in Denmark successfully extracted DNA from a horse bone dating back 700,000 years. In 2021, an international team recovered DNA from mammoths that lived over a million years ago. In contrast, magnetic disks used in data centers have lifespans measured in years or, at best, a few decades. This highlights DNA's potential for long-term storage.
  • Energy and cost efficiency: The "cloud" that powers most of today's computing services relies on data centers that consume approximately 3% of global electricity and emit around 2% of total carbon emissions. With the exponential growth of data, the environmental impact of current technologies is expected to increase significantly.
  • Unmatched data density: DNA storage offers data density up to 100 million times greater than traditional digital storage. This means that a volume currently holding one megabyte could theoretically store up to 100 terabytes using DNA.
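
The density figure can be checked with a line of arithmetic. The short Python sketch below simply multiplies one megabyte by the "up to 100 million" factor quoted above; decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and the function name are assumptions made for illustration.

```python
# Worked arithmetic for the density claim. Decimal units are an assumption.
MEGABYTE = 10**6
TERABYTE = 10**12
DENSITY_FACTOR = 100_000_000  # "up to 100 million times", as quoted in the article


def dna_capacity_bytes(conventional_bytes: int, factor: int = DENSITY_FACTOR) -> int:
    """Upper-bound capacity of the same volume if it held DNA instead."""
    return conventional_bytes * factor


capacity = dna_capacity_bytes(MEGABYTE)
print(capacity / TERABYTE)  # 100.0 (terabytes)
```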

DNA is a molecule composed of a sequence of organic compounds called nucleotides. These nucleotides are categorized into four types, represented by the letters A, C, G, and T. Unlike traditional computing, where data is encoded using only two digits (0 and 1), DNA storage is based on sequences of four letters, dramatically increasing the number of possible combinations.
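
Because there are four letters, each nucleotide can carry two bits. The Python sketch below shows the most naive mapping one could use (00→A, 01→C, 10→G, 11→T). This mapping and the function names are illustrative assumptions, not the encoding used by the researchers, whose scheme also includes an error-correction code.

```python
# Naive 2-bits-per-base mapping. This is an assumed toy encoding, not the paper's code.
BASES = "ACGT"  # index 0..3 corresponds to bit pairs 00, 01, 10, 11


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna; the strand length must be a multiple of four."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


strand = bytes_to_dna(b"Hi")
print(strand)                # CAGACGGC
print(dna_to_bytes(strand))  # b'Hi'
```

A real system would not use this mapping directly, since it offers no protection against the synthesis and sequencing errors the article goes on to describe.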

To write (store) data with this technology, DNA synthesis is required: DNA molecules are created based on the sequences encoding the information. To read the stored data, DNA sequencing is necessary.

DNAformer: where nature meets AI
Test tubes containing DNA encoding the information. Credit: Rami Shlush

Challenges in DNA data storage

Developing DNA-based storage technology presents several technological challenges:

  • Both synthesis and sequencing are lengthy and error-prone processes, introducing deletion, insertion, and substitution errors
  • Due to the limitations of the synthesis process, multiple copies of each DNA molecule encoding the data are produced. These copies are stored together, unordered, in a storage container
  • During sequencing, many copies of these molecules are retrieved, most of them containing errors, while some molecules disappear entirely
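
The three challenges above can be made concrete with a toy simulation. The Python sketch below is only an illustration: the function names, the error probabilities, the number of copies, and the dropout rate are all hypothetical, and it is not the simulator developed at Technion.

```python
import random

BASES = "ACGT"


def corrupt(strand, rng, p_sub=0.01, p_del=0.01, p_ins=0.01):
    """Return one noisy copy with substitution, deletion, and insertion errors.

    All probabilities are hypothetical placeholders.
    """
    out = []
    for base in strand:
        r = rng.random()
        if r < p_del:
            continue  # deletion: the base is dropped
        if r < p_del + p_sub:
            out.append(rng.choice([b for b in BASES if b != base]))  # substitution
        else:
            out.append(base)
        if rng.random() < p_ins:
            out.append(rng.choice(BASES))  # insertion after this base
    return "".join(out)


def sequence_pool(strands, rng, copies=5, p_dropout=0.05, **error_rates):
    """Simulate reading an unordered pool: noisy copies, shuffled, some strands lost."""
    reads = []
    for strand in strands:
        if rng.random() < p_dropout:
            continue  # this strand disappears entirely
        reads.extend(corrupt(strand, rng, **error_rates) for _ in range(copies))
    rng.shuffle(reads)  # copies are stored together with no ordering
    return reads


rng = random.Random(42)  # fixed seed so runs are repeatable
for read in sequence_pool(["ACGTACGTAC", "TTGACCAGTA"], rng, p_sub=0.05):
    print(read)
```

The retrieval problem is then to recover the original strands from reads like these, without knowing which read came from which strand.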

DNAformer: AI-powered data retrieval

The current research presents a comprehensive computational solution for retrieving data and correcting errors in complex DNA-based storage systems. Using advanced algorithms and encoding techniques, the researchers have demonstrated that their solution reduces data retrieval and reading time from several days to just 10 minutes.

The Technion-developed method, DNAformer, is based on a transformer model trained on simulated data (generated using a simulator that was also developed at Technion) to reconstruct accurate DNA sequences from erroneous copies. The method also includes a custom error-correction code tailored for DNA, ensuring robust data integrity.
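
To give a sense of what "reconstructing a sequence from erroneous copies" means, the Python sketch below uses a far simpler stand-in: a position-wise majority vote. This is not DNAformer. It is a classical baseline that assumes all copies have the same length and contain only substitution errors, whereas the transformer described above must also cope with insertions and deletions. The function name and example reads are hypothetical.

```python
from collections import Counter


def majority_consensus(copies):
    """Position-wise majority vote over equal-length noisy copies.

    A simple baseline for illustration only, not the DNAformer model.
    It handles substitution errors but not insertions or deletions.
    """
    if not copies:
        raise ValueError("need at least one copy")
    length = len(copies[0])
    if any(len(c) != length for c in copies):
        raise ValueError("majority vote requires equal-length copies")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*copies))


# Hypothetical reads of the same strand, three of them with one substitution each.
noisy_copies = ["ACGTAC", "ACGTTC", "ACCTAC", "ACGTAC"]
print(majority_consensus(noisy_copies))  # ACGTAC
```

A single deleted or inserted base shifts every later position and breaks this kind of vote, which is one reason reconstruction is hard in practice.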

Additionally, an extra safety margin mechanism detects particularly noisy DNA sequences (noise here means unwanted signals or errors that occur during the sequencing process, which can interfere with the correct interpretation of the data) and applies powerful algorithmic tools to handle them efficiently. At the end of the process, the data is converted back into digital information.

The new method enables the reading of 100 megabytes of data at a speed 3,200 times faster than the most accurate existing method, without any loss of accuracy. Compared to previously known fast methods, DNAformer also improves accuracy by up to 40% while significantly reducing processing time. This was demonstrated on a 3.1-megabyte dataset, which included:

  • A color still image
  • A 24-second audio clip of astronaut Neil Armstrong's words on the moon
  • A written text discussing DNA's advantages as a promising data storage method
  • Random data to illustrate the applicability to encrypted or compressed data

The researchers plan to develop customized versions of DNAformer tailored to different needs. They emphasize that their technology is scalable and adaptable, meaning it can be optimized for large-scale data storage applications, meeting market demands and keeping pace with future developments in DNA synthesis and sequencing.

More information: Daniella Bar-Lev et al, Scalable and robust DNA-based storage via coding theory and deep learning, Nature Machine Intelligence (2025). DOI: 10.1038/s42256-025-01003-z

Journal information: Nature Machine Intelligence

Provided by Technion – Israel Institute of Technology

Citation: DNA data storage: AI method speeds up data retrieval by 3,200 times (2025, March 21) retrieved 21 March 2025 from https://techxplore.com/information/2025-03-dna-storage-ai-method.html
