The quantitative analysis of biological sequence data is based on methods from statistics coupled with efficient algorithms from computer science. Algebra provides a framework for unifying many of the seemingly disparate techniques used by computational biologists. Algebraic Statistics for Computational Biology offers an introduction to this mathematical framework and describes tools from computational algebra for designing new algorithms for exact, accurate results. These algorithms can be applied to biological problems such as aligning genomes, finding genes and constructing phylogenies. The first part of Algebraic Statistics for Computational Biology consists of four chapters on the themes of Statistics, Computation, Algebra and Biology, offering speedy, self-contained introductions to the emerging field of algebraic statistics and its applications to genomics. In the second part, the four themes are combined and developed to tackle real problems in computational genomics. As the first book in this exciting and dynamic area, it will be welcomed as a text for self-study or for advanced undergraduate and beginning graduate courses.
Preface
Part I. Introduction to the Four Themes
1. Statistics L. Pachter and B. Sturmfels
2. Computation L. Pachter and B. Sturmfels
3. Algebra L. Pachter and B. Sturmfels
4. Biology L. Pachter and B. Sturmfels
Part II. Studies on the Four Themes
5. Parametric inference R. Mihaescu
6. Polytope propagation on graphs M. Joswig
7. Parametric sequence alignment C. Dewey and K. Woods
8. Bounds for optimal sequence alignment S. Elizalde
9. Inference functions S. Elizalde
10. Geometry of Markov chains E. Kuo
11. Equations defining hidden Markov models N. Bray and J. Morton
12. The EM algorithm for hidden Markov models I. B. Hallgrimsdottir, A. Milowski and J. Yu
13. Homology mapping with Markov random fields A. Caspi
14. Mutagenetic tree models N. Beerenwinkel and M. Drton
15. Catalog of small trees M. Casanellas, L. Garcia and S. Sullivant
16. The strand symmetric model M. Casanellas and S. Sullivant
17. Extending statistical models from trees to splits graphs D. Bryant
18. Small trees and generalized neighbor-joining M. Contois and D. Levy
19. Tree construction using Singular Value Decomposition N. Eriksson
20. Applications of interval methods to phylogenetics R. Sainudiin and R. Yoshida
21. Analysis of point mutations in vertebrate genomes J. Al-Aidroos and S. Snir
22. Ultra-conserved elements in vertebrate genomes M. Drton, N. Eriksson and G. Leung
Index
Lior Pachter is Associate Professor of Mathematics at the University of California, Berkeley. He received his Ph.D. in mathematics from the Massachusetts Institute of Technology in 1999. He then moved to the mathematics department at UC Berkeley, where he was a postdoctoral researcher for two years before being hired as an assistant professor. He has received an NSF CAREER Award, as well as a Sloan Fellowship for his work on molecular biology and evolution. Equally at home among mathematicians and biologists, he has published over 40 research articles in areas ranging from combinatorics to gene finding, and has participated in several large genome projects.
Bernd Sturmfels is Professor of Mathematics and Computer Science at the University of California, Berkeley. His honors include a National Young Investigator Fellowship, a Sloan Fellowship, and a David and Lucile Packard Fellowship. Sturmfels served as von Neumann Professor at TU Munich in Summer 2002 and as the Hewlett-Packard Research Professor at MSRI Berkeley in 2003/04, and he was a Clay Senior Scholar in 2004.
"As the first book in this exciting and dynamic area, it will be welcomed as a text for self-study or for advanced undergraduate and beginning graduate courses."
- L'Enseignement Mathématique
"[...] substantial, enthusiastically presented, and confidently written [...]"
- Publication of the International Statistical Institute
"This book is of great interest to research workers, teachers and students in applied statistics, biology, medicine and genetics."
- Zentralblatt MATH