This volume details several important databases and data mining tools. Data Mining Techniques for the Life Sciences, Second Edition guides readers through archives of macromolecular three-dimensional structures, databases of protein-protein interactions, thermodynamics information on protein and mutant stability, "Kbdock" protein domain structure database, PDB_REDO databank, erroneous sequences, substitution matrices, tools to align RNA sequences, interesting procedures for kinase family/subfamily classifications, new tools to predict protein crystallizability, metabolomics data, drug-target interaction predictions, and a recipe for protein-sequence-based function prediction and its implementation in the latest version of the ANNOTATOR software suite. Written in the highly successful Methods in Molecular Biology series format, chapters include introductions to their respective topics, lists of the necessary materials and reagents, step-by-step, readily reproducible laboratory protocols, and tips on troubleshooting and avoiding known pitfalls. Authoritative and cutting-edge, Data Mining Techniques for the Life Sciences, Second Edition aims to ensure successful results in the further study of this vital field.
Part I: Data Bases
1. Update on Genomic Databases and Resources at the National Center for Biotechnology Information Tatiana Tatusova
2. Protein Structure Databases Roman A. Laskowski
3. The MIntAct Project and Molecular Interaction Databases Luana Licata and Sandra Orchard
4. Applications of Protein Thermodynamic Database for Understanding Protein Mutant Stability and Designing Stable Mutants M. Michael Gromiha, P. Anoosha, and Liang-Tsung Huang
5. Classification and Exploration of 3D Protein Domain Interactions using Kbdock Anisah W. Ghoorah, Marie-Dominique Devignes, Malika Smail-Tabbone, and David W. Ritchie
6. Data Mining of Macromolecular Structures Bart van Beusekom, Anastassis Perrakis, and Robbie P. Joosten
7. Criteria to Extract High Quality Protein Data Bank Subsets for Structure Users Oliviero Carugo and Kristina Djinovic-Carugo
8. Homology-based Annotation of Large Protein Datasets Marco Punta and Jaina Mistry
Part II: Computational Techniques
9. Identification and Correction of Erroneous Protein Sequences in Public Databases Laszlo Patthy
10. Improving the Accuracy of Fitted Atomic Models in Cryo-EM Density Maps of Protein Assemblies Using Evolutionary Information from Aligned Homologous Proteins Ramachandran Rakesh and Narayanaswamy Srinivasan
11. Systematic Exploration of an Efficient Amino Acid Substitution Matrix, MIQS Kentaro Tomii and Kazunori Yamada
12. Promises and Pitfalls of High Throughput Biological Assays Greg Finak and Raphael Gottardo
13. Optimizing RNA-seq Mapping with STAR Alexander Dobin and Thomas R. Gingeras
Part III: Prediction Methods
14. Predicting Conformational Disorder Philippe Lieutaud, Francois Ferron, and Sonia Longhi
15. Classification of Protein Kinases Influenced by Conservation of Substrate Binding Residues Chintalapati Janaki, Narayanaswamy Srinivasan, and Malini Manoharan
16. Spectral-Statistical Approach for Revealing Latent Regular Structures in DNA Sequence Maria Chaley and Vladimir Kutyrkin
17. Protein Crystallizability Pawel Smialowski and Philip Wong
18. Analysis and Visualization of ChIP-Seq and RNA-Seq Sequence Alignments using ngs.plot Yong-Hwee Eddie Loh and Li Shen
19. Data Mining with Ontologies Robert Hoehndorf, Georgios V. Gkoutos, and Paul N. Schofield
20. Functional Analysis of Metabolomics Data Monica Chagoyen, Javier Lopez-Ibanez, and Florencio Pazos
21. Bacterial Genomics Data Analysis in the Next-Generation Sequencing Era Massimiliano Orsini, Gianmauro Cuccuru, Paolo Uva, and Giorgio Fotia
22. A Broad Overview of Computational Methods for Predicting the Pathophysiological Effects of Non-Synonymous Variants Stefano Castellana, Caterina Fusilli, and Tommaso Mazza
23. Recommendation Techniques for Drug-Target Interaction Prediction and Drug-Repositioning Salvatore Alaimo, Rosalba Giugno, and Alfredo Pulvirenti
24. Protein Residue Contacts and Prediction Methods Badri Adhikari and Jianlin Cheng
25. The Recipe for Protein Sequence-Based Function Prediction and its Implementation in the Annotator Software Environment Birgit Eisenhaber, Durga Kuchibhatla, Westley Sherman, Fernanda L. Sirota, Igor N. Berezovsky, Wing-Cheong Wong, and Frank Eisenhaber
Part IV: Big Data
26. Big Data, Evolution, and Metagenomes: Predicting Disease from Gut Microbiota Codon Usage Profiles Maja Fabijanic and Kristian Vlahovicek
27. Big Data in Plant Science: Resources and Data Mining Tools for Plant Genomics and Proteomics George V. Popescu, Christos Noutsos, and Sorina C. Popescu