About this book
Concise, practical and useful: computing for biologists.
Extracts from the book: "Bioinformatics is the science of using information to understand biology; it's the tool we can use to help us answer these questions and many others like them. Unfortunately, with all the hype about mapping the human genome, bioinformatics has achieved buzzword status; the term is being used in a number of ways, depending on who is using it. Strictly speaking, bioinformatics is a subset of the larger field of computational biology, the application of quantitative analytical techniques in modeling biological systems. In this book, we stray from bioinformatics into computational biology and back again. The distinctions between the two aren't important for our purpose here, which is to cover a range of tools and techniques we believe are critical for molecular biologists who want to understand and apply the basic computational tools that are available today.
The field of bioinformatics relies heavily on work by experts in statistical methods and pattern recognition. Researchers come to bioinformatics from many fields, including mathematics, computer science, and linguistics. Unfortunately, biology is a science of the specific as well as the general. Bioinformatics is full of pitfalls for those who look for patterns and make predictions without a complete understanding of where biological data comes from and what it means. By providing algorithms, databases, user interfaces, and statistical tools, bioinformatics makes it possible to do exciting things such as compare DNA sequences and generate results that are potentially significant. "Potentially significant" is perhaps the most important phrase. These new tools also give you the opportunity to overinterpret data and assign meaning where none really exists. We can't overstate the importance of understanding the limitations of these tools. But once you gain that understanding and become an intelligent consumer of bioinformatics methods, the speed at which your research progresses can be truly amazing."
Developing Bioinformatics Computer Skills offers an in-depth yet clear insight into the techniques, IT resources and skills of computational biology, with a special focus on its application to gene sequencing. It concentrates unashamedly on the open Linux/Unix platform and covers the fundamentals of data analysis, database management and straightforward system administration. O'Reilly have provided a wonderful reference for computing in the natural sciences.
The book will help biologists, researchers, and students develop a structured approach to biological data and the computer tools they'll need to analyze it. The book covers the Unix file system, building tools and databases for bioinformatics, computational approaches to biological problems, an introduction to Perl for bioinformatics, data mining, data visualization, and tips for tailoring data analysis software to individual research needs.
I. Introduction
1. Biology in the Computer Age
How Is Computing Changing Biology?
Isn't Bioinformatics Just About Building Databases?
What Does Informatics Mean to Biologists?
What Challenges Does Biology Offer Computer Scientists?
What Skills Should a Bioinformatician Have?
Why Should Biologists Use Computers?
How Can I Configure a PC to Do Bioinformatics Research?
What Information and Software Are Available?
Can I Learn a Programming Language Without Classes?
How Can I Use Web Information?
How Do I Understand Sequence Alignment Data?
How Do I Write a Program to Align Two Biological Sequences?
How Do I Predict Protein Structure from Sequence?
What Questions Can Bioinformatics Answer?
2. Computational Approaches to Biological Questions
Molecular Biology's Central Dogma
What Biologists Model
Why Biologists Model
Computational Methods Covered in This Book
A Computational Biology Experiment
II. The Bioinformatics Workstation
3. Setting Up Your Workstation
Working on a Unix System
Setting Up a Linux Workstation
How to Get Software Working
What Software Is Needed?
4. Files and Directories in Unix
Commands for Working with Directories and Files
Working in a Multiuser Environment
5. Working on a Unix System
The Unix Shell
Issuing Commands on a Unix System
Viewing and Editing Files
Transformations and Filters
File Statistics and Comparisons
The Language of Regular Expressions
Unix Shell Scripts
Communicating with Other Computers
Playing Nicely with Others in a Shared Environment
III. Tools for Bioinformatics
6. Biological Research on the Web
Using Search Engines
Finding Scientific Articles
The Public Biological Databases
Searching Biological Databases
Depositing Data into the Public Databases
Judging the Quality of Information
7. Sequence Analysis, Pairwise Alignment, and Database Searching
Chemical Composition of Biomolecules
Composition of DNA and RNA
Watson and Crick Solve the Structure of DNA
Development of DNA Sequencing Methods
Genefinders and Feature Detection in DNA
Pairwise Sequence Comparison
Sequence Queries Against Biological Databases
Multifunctional Tools for Sequence Analysis
8. Multiple Sequence Alignments, Trees, and Profiles
The Morphological to the Molecular
Multiple Sequence Alignment
Profiles and Motifs
9. Visualizing Protein Structures and Computing Structural Properties
A Word About Protein Structure Data
The Chemistry of Proteins
Web-Based Protein Structure Tools
Solvent Accessibility and Interactions
Computing Physicochemical Properties
Protein Resource Databases
Putting It All Together
10. Predicting Protein Structure and Function from Sequence
Determining the Structures of Proteins
Predicting the Structures of Proteins
From 3D to 1D
Feature Detection in Protein Sequences
Secondary Structure Prediction
Predicting 3D Structure
Putting It All Together: A Protein Modeling Project
Summary
11. Tools for Genomics and Proteomics
From Sequencing Genes to Sequencing Genomes
Accessing Genome Information on the Web
Annotating and Analyzing Whole Genome Sequences
Functional Genomics: New Data Analysis Challenges
Biochemical Pathway Databases
Modeling Kinetics and Physiology
IV. Databases and Visualization
12. Automating Data Analysis with Perl
Pattern Matching and Regular Expressions
Parsing BLAST Output Using Perl
Applying Perl to Bioinformatics
13. Building Biological Databases
Types of Databases
Introduction to SQL
Installing the MySQL DBMS
Developing Web-Based Software That Interacts with Databases
14. Visualization and Data Mining
Preparing Your Data
Sequence Data Visualization
Networks and Pathway Visualization
Working with Numerical Data
Data Mining and Biological Information
Bibliography
Cynthia Gibas is an assistant professor of biology at Virginia Tech, in Blacksburg, VA. Her research interest is in physicochemical properties of proteins and protein structure/function relationships. While at Virginia Tech, she has built a 32-node AMD Athlon-based Linux cluster from parts, and helped her colleagues design curriculum options in bioinformatics. She teaches introductory courses in bioinformatics and biological sequence analysis. She has a Ph.D. in biophysics and computational biology from the University of Illinois.
Per Jambeck is a Ph.D. student in the bioengineering department at the University of California, San Diego. He has worked on computational biology since 1994, concentrating on machine learning applications in understanding multidimensional biological data. Per smiles wistfully at the mention of free time, but he manages to host shows at community and student-run radio stations anyway.