About this book
Pulls together all of the vital information about the most commonly used databases, analytical tools, and tables used in sequence analysis. Contains details and examples of the common database formats (GenBank, EMBL, SWISS-PROT) and the GenBank/EMBL/DDBJ Feature Table Definitions. Also provides the command-line syntax for popular analysis applications such as Readseq, BLAST, ClustalW, MEME/MAST, and the EMBOSS suite, as well as tables of nucleotide, genetic, and amino acid codes. Written in O'Reilly's popular, straightforward "Nutshell" format.
Contents
Preface

I. Data Formats
1. FASTA Format
   NCBI's Sequence Identifier Syntax
   NCBI's Non-Redundant Database Syntax
   References
2. GenBank/EMBL/DDBJ
   Example Flat Files
   GenBank Example Flat File
   DDBJ Example Flat File
   GenBank/DDBJ Field Definitions
   EMBL Example Flat File
   EMBL Field Definitions
   DDBJ/EMBL/GenBank Feature Table
   References
3. SWISS-PROT
   SWISS-PROT Example Flat File
   SWISS-PROT Field Definitions
   SWISS-PROT Feature Table
   References
4. Pfam
   Pfam Example Flat File
   Pfam Field Definitions
   References
5. PROSITE
   PROSITE Example Flat File
   PROSITE Field Definitions
   References

II. Tools
6. Readseq
   Supported Formats
   Command-Line Options
   References
7. BLAST
   formatdb
   blastall
   megablast
   blastpgp
   PSI-BLAST
   PHI-BLAST
   bl2seq
   References
8. BLAT
   Command-Line Options
   References
9. ClustalW
   Command-Line Options
   References
10. HMMER
   hmmalign
   hmmbuild
   hmmcalibrate
   hmmconvert
   hmmemit
   hmmfetch
   hmmindex
   hmmpfam
   hmmsearch
   References
11. MEME/MAST
   MEME
   MAST
   References
12. EMBOSS
   Common Themes
   List of All EMBOSS Programs
   Details of EMBOSS Programs
   References

III. Appendixes
A. Nucleotide and Amino Acid Tables
B. Genetic Codes
C. Resources
D. Future Plans

Index
Biography
Scott Markel is a Principal Software Architect at LION bioscience Inc., where he is responsible for providing architectural direction in the development of software for the life sciences, including the use and development of standards. He is a co-chair of the Life Sciences Research Domain Task Force of the Object Management Group, and also chairs the LSR's Architecture and Roadmap Working Group. Prior to working at LION, Scott worked at NetGenics, Johnson & Johnson Pharmaceutical Research & Development, and Sarnoff Corporation. He has a Ph.D. in mathematics from the University of Wisconsin-Madison. When Scott's not working or writing, he enjoys spending time with his wife and kids, reading European history books, and just enjoying life in sunny San Diego.

Darryl Leon is a Principal Scientific Architect at LION bioscience Inc., where he is responsible for providing scientific direction in the development of software for the life sciences. Prior to working at LION, Darryl worked at NetGenics, DoubleTwist, and Genset. He has taught at California Polytechnic State University, San Luis Obispo, and currently teaches a bioinformatics class at U.C. Santa Cruz Extension and U.C. San Diego Extension. He is also a member of the Bioinformatics Advisory Committee at U.C. San Diego Extension. Darryl has a Ph.D. in biochemistry from the University of California, San Diego, and did his postdoctoral research at the University of California, Santa Cruz.