Brian Teague: Distributions and DNA: A Hidden Markov Model for Optical Mapping Data

Room: Biotechnology Center Auditorium, 425 Henry Mall
Speaker Name: Brian Teague
Speaker Institution: Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison
Cookies: No
Cookies Location: Refreshments in lobby after seminar.

Abstract:

Optical Mapping is a unique platform for analyzing genomes: it uses measurements of single molecules of DNA to infer genome structure, which complements genome sequence to yield biological insight. Information on a genome's structure is useful in a wide variety of contexts, including aiding sequence assembly, understanding normal human genetic variation, and probing cancer genomes in search of new therapeutic targets. Upcoming advances in DNA enzymology, molecule presentation and image analysis herald dramatic improvements in Optical Mapping's speed and resolution, promising a deeper understanding of the biology of genomes. Alongside improvements in sample preparation and data collection, advances in algorithms and analyses allow us to probe these single-molecule data sets in new ways, asking questions that were previously inaccessible.

This seminar will describe the development of a hidden Markov model (HMM) for Optical Mapping data. Developed at Bell Labs in the 1970s for voice recognition, HMMs have found use in a variety of bioinformatics endeavours including gene finding, copy number analysis, secondary structure prediction and multiple sequence alignment. Their success is founded on their ability to perform inference on systems whose internal state is *hidden* by noisy or incomplete data, resulting in algorithms that are fast, accurate and well-grounded in theory. They work particularly well when paired with large data sets whose error processes are well-characterized.

Happily, the data sets produced by the Optical Mapping platform fit this description: we want to infer properties of the genome based on a large ensemble of single-molecule observations whose generation is subject to a number of well-characterized error processes. The best-studied problems for hidden Markov models (evaluation, decoding, and learning) translate directly to common tasks in analyzing Optical Mapping data, and provide a jumping-off point for solving more difficult (but more interesting!) problems, including map refinement and haplotype discernment.
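
For readers unfamiliar with the standard HMM problems, the following is a minimal Python sketch of two of them: evaluation (the forward algorithm) and decoding (the Viterbi algorithm). It is not the model described in the seminar. The state names ("site", "gap"), the observation symbols ("cut", "none"), and every probability are hypothetical values chosen only to make the example runnable. Learning (for example, Baum-Welch re-estimation) is omitted for brevity.

```python
# Toy two-state HMM. All names and numbers here are illustrative
# assumptions, not parameters of any real Optical Mapping model.
START = {"site": 0.5, "gap": 0.5}
TRANS = {
    "site": {"site": 0.3, "gap": 0.7},
    "gap": {"site": 0.4, "gap": 0.6},
}
EMIT = {
    "site": {"cut": 0.8, "none": 0.2},
    "gap": {"cut": 0.1, "none": 0.9},
}


def forward(obs):
    """Evaluation: probability of the observations, summed over all hidden paths."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in START}
    for o in obs[1:]:
        alpha = {
            s: EMIT[s][o] * sum(alpha[p] * TRANS[p][s] for p in alpha)
            for s in START
        }
    return sum(alpha.values())


def viterbi(obs):
    """Decoding: the single most probable hidden state path for the observations."""
    # best[s] = (probability of the best path ending in s, that path)
    best = {s: (START[s] * EMIT[s][obs[0]], [s]) for s in START}
    for o in obs[1:]:
        new_best = {}
        for s in START:
            prob, path = max(
                ((best[p][0] * TRANS[p][s], best[p][1]) for p in best),
                key=lambda t: t[0],
            )
            new_best[s] = (prob * EMIT[s][o], path + [s])
        best = new_best
    return max(best.values(), key=lambda t: t[0])[1]


if __name__ == "__main__":
    observations = ["cut", "none", "cut"]
    print(forward(observations))
    print(viterbi(observations))
```

Both functions run in time linear in the length of the observation sequence, which is part of why HMM inference scales to large data sets. A practical implementation would typically work in log space to avoid numerical underflow on long sequences.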

Event Date: Tuesday, May 1, 2012, 4:00pm - 5:00pm