CIS Distinguished Speaker Series

Erez Lieberman Aiden

Baylor College of Medicine & Rice University

October 17, 2017

Time: 10:15-11:15 am
Location: Center for the Arts, Gore Recital Hall

Parallel Processing of the Genomes, by the Genomes and for the Genomes

Abstract: The human genome is a sequence of 3 billion chemical letters inscribed in a molecule called DNA. Famously, short stretches of DNA (~10 letters, or base pairs) fold into a double helix. But what about longer pieces? How does a 2-meter-long macromolecule, the genome, fold up inside a 6-micrometer-wide nucleus? And, once packed, how does the information contained in this ultra-dense structure remain accessible to the cell? This talk will discuss how the human genome folds in three dimensions, a configuration that enables the cell to access and process massive quantities of information in parallel. To probe how genomes fold, we developed Hi-C, a method that can determine not only the genome's 1D sequence, but also its 3D fold. Hi-C maps collisions between pairs of DNA sequences as they fluctuate inside the nucleus (Lieberman Aiden et al., Science, 2009; Rao & Huntley et al., Cell, 2014). To reconstruct the underlying folds from the billions of collisions we record, we, too, must engage in massively parallel computation. Working together with IBM, NVIDIA, Mellanox, and Edico Genome, we built a specialized hardware platform integrating graphics processing units and field-programmable gate arrays to accelerate our research. In one recent application, we determined the genome sequence of Aedes aegypti, the mosquito that carries Zika virus, using a new methodology that exploits folding patterns to assemble genome sequences more easily (Dudchenko et al., Science, 2017). Assembling this genome had been an urgent biomedical challenge, highlighted only a few months earlier on the front page of the New York Times.

Bio: Erez Lieberman Aiden received his PhD from Harvard and MIT in 2010. After several years at Harvard's Society of Fellows and at Google as Visiting Faculty, he became Assistant Professor of Genetics at Baylor College of Medicine and of Computer Science and Applied Mathematics at Rice University. Dr. Aiden's inventions include the Hi-C method for three-dimensional DNA sequencing, which enables scientists to examine how the two-meter-long human genome folds up inside the tiny space of the cell nucleus. In 2014, his laboratory reported the first comprehensive map of loops across the human genome, mapping their anchors with single-base-pair resolution. In 2015, his lab showed that these loops form by extrusion, and that it is possible to add and remove loops and domains in a predictable fashion using targeted mutations as short as a single base pair. Together with Jean-Baptiste Michel, Dr. Aiden also developed the Google Ngram Viewer, a tool for probing cultural change by exploring the frequency of words and phrases in books over the centuries. The Ngram Viewer is used every day by millions of users worldwide. Dr. Aiden's research has won numerous awards, including recognition by Popular Mechanics for one of the top 20 "Biotech Breakthroughs that will Change Medicine"; membership in Technology Review's 2009 TR35, which recognizes the top 35 innovators under 35; and inclusion in Cell's 2014 40 Under 40. His work has been featured on the front page of the New York Times, the Boston Globe, the Wall Street Journal, and the Houston Chronicle. One of his talks has been viewed over 1 million times. Three of his research papers have appeared on the cover of Nature and Science. In 2012, he received the President's Early Career Award in Science and Engineering, the highest government honor for young scientists, from Barack Obama. In 2014, Fast Company called him "America's brightest young academic." In 2015, his laboratory was recognized on the floor of the US House of Representatives for its discoveries about the structure of DNA.