Bioinformatics encompasses a broad and ever-changing range of activities involved with the management and analysis of data from molecular biology experiments. Despite the diversity of activities and applications, the basic methodology and core tools needed to tackle bioinformatics problems are common to many projects. This unique book provides an invaluable introduction to three of the main tools used in the development of bioinformatics software – Perl, R and MySQL – and explains how these can be used together to tackle the complex data-driven challenges that typify modern biology. These industry-standard open-source tools form the core of many bioinformatics projects, both in academia and industry. The methodologies introduced are platform independent, and all the featured examples have been tested on Windows, Linux and Mac OS.
Building Bioinformatics Solutions is suitable for graduate students and researchers in the life sciences who wish to automate analyses or create their own databases and web-based tools. No prior knowledge of software development is assumed. Having worked through Building Bioinformatics Solutions, the reader should have the necessary core skills to develop computational solutions for their specific research programmes. The book will also help the reader overcome the inertia associated with entering this field, and provide them with the confidence and understanding required to go on to develop more advanced bioinformatics skills.
New to the second edition:
- Now contains more bioinformatics-specific content
- Includes a new chapter on good software engineering practices to help people working in teams
1: Introduction
2: Building Biological Databases with SQL
3: Beginning Programming in Perl
4: Numerical data analysis using R
5: Developing Web Resources
6: Software Engineering for Bioinformatics
Appendix A: Using Command Line Interfaces
Appendix B: Getting started with Apache HTTP Server
Appendix C: Setting up a Linux Virtual Machine in Windows
Conrad Bessant is Professor of Bioinformatics at Queen Mary, University of London. He is active in both teaching and research, and has been involved in a number of software development projects in the areas of proteomics and metabolomics.
Darren Oakley is a Software Developer at Nature Publishing Group (NPG). During his time at NPG he has been involved in numerous projects for improving the meta-data related to NPG articles. His current role is lead developer on NPG's next-generation online publishing platform.
Ian Shadforth is Director of Integrated Health and Bioinformatics at Alere Inc, a global medical devices and health management company. In this role Ian leads new concept development across technology, analytics and web to better enable individuals to achieve their health goals.
"[...] A book like this must represent a massive challenge for the writers. Bioinformatics is a huge field [...] Moreover, the sources of data and the types of questions to which they are applied vary massively, and the variety of tools available to achieve those ends is ever increasing. This makes it almost impossible to produce a text that will introduce any budding bioinformatician to all relevant aspects of the particular problems of interest to them. Nonetheless, Building Bioinformatics Solutions seems an excellent start point for any biologist seeking to improve their bioinformatics capabilities. [...] I strongly recommend this book for postgraduates (and beyond) interested in bioinformatics."
- Phil Stephens, The Bulletin of the British Ecological Society 46(1), March 2015