Friday, November 3, 2006 - 2:00pm

Junhyong Kim

Penn Center for Bioinformatics & Penn Genomics Institute

Location

University of Pennsylvania

Berger Auditorium

Dan T. Gillespie's talk will be rescheduled for the spring.

Phylogenies are tree graphs depicting the genealogical relationships of biological objects. Phylogenetic methods provide the temporal history of biological diversity and have been used in many applications: for example, to track the history of infectious diseases, to reconstruct ancestral molecules, to reveal functional patterns in comparative genomics, and even, in criminal cases, to infer the relatedness of biological evidence. While ideas about genealogical reconstruction have been around since Darwin, quantitative algorithmic approaches to the problem have been developed only in the last 50 years. The basic structure of the problem involves considering all possible tree graph structures compatible with an organismal genealogy and measuring their fit to observed data by various objective functions. There are now many algorithms based on various inferential principles, including maximum information, maximum likelihood, and Bayesian posterior. Many flavors of the phylogeny reconstruction problem have been shown to be NP-hard, and there is a considerable body of literature on the associated computational and mathematical problems. As the problem becomes more complex in terms of the size of the input data and the complexity of the assumed models, interesting questions arise as to whether the tree graph is indeed estimable under these assumptions. Several counterexamples are known that show conditions under which trees are not estimable. Some of these problems can be approached by applying computational algebraic geometry. In this talk, I first provide a basic overview of the phylogenetic estimation problem and then discuss these problems, including techniques to connect models with different biological assumptions.
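
As a rough illustration of why considering all possible tree structures becomes hard, the sketch below counts the unrooted, fully resolved (binary) tree topologies on n labeled leaves using the standard double-factorial formula (2n-5)!!. It is not taken from the talk; the function name `num_unrooted_binary_trees` is hypothetical, and the restriction to unrooted binary trees is an assumption made for the example.

```python
def num_unrooted_binary_trees(n_leaves: int) -> int:
    """Count unrooted binary tree topologies on n labeled leaves.

    Uses the standard formula (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5),
    valid for n >= 3. Illustrative helper only, not the speaker's method.
    """
    if n_leaves < 3:
        raise ValueError("need at least 3 leaves for an unrooted binary tree")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (the factor 1 is implicit).
    for k in range(3, 2 * n_leaves - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    # Show how quickly the search space grows with the number of taxa.
    for n in (4, 5, 10, 20, 50):
        print(n, num_unrooted_binary_trees(n))
```

Under this counting, 10 leaves already give about two million candidate topologies, so scoring every tree against an objective function is feasible only for very small data sets.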