Friday 7 September 2012

A Practical Approach to Microarray Data Analysis


A Practical Approach to Microarray Data Analysis is for all life scientists, statisticians, computer experts, technology developers, managers, and other professionals tasked with developing, deploying, and using microarray technology, including the necessary computational infrastructure and analytical tools. The book addresses the need of scientists and researchers to gain a basic understanding of microarray analysis methodologies and tools. It is intended for students, teachers, researchers, and research managers who want to understand the state of the art of the presented methodologies and the areas in which gaps in our knowledge demand further research and development. The book is designed to be used by the practicing professional tasked with the design and analysis of microarray experiments, or as a text for a senior undergraduate- or graduate-level course in analytical genetics, biology, bioinformatics, computational biology, statistics and data mining, or applied computer science.
Key topics covered include:
-Format of result from data analysis, analytical modeling/experimentation;
-Validation of analytical results;
-Data analysis/Modeling task;
-Analysis/modeling tools;
-Scientific questions, goals, and tasks;
-Application;
-Data analysis methods;
-Criteria for assessing analysis methodologies, models, and tools

Download

Structure and Interpretation of Computer Programs (SICP)


Structure and Interpretation of Computer Programs (SICP) is a textbook on general computer programming concepts, published by MIT Press in 1984 and written by Massachusetts Institute of Technology (MIT) professors Harold Abelson and Gerald Jay Sussman, with Julie Sussman. It was formerly used as the textbook for MIT's introductory programming class and has also been used at other schools.
Using a dialect of the Lisp programming language known as Scheme, the book explains core computer science concepts, including abstraction, recursion, interpreters and metalinguistic abstraction, and teaches modular programming.
The book also introduces a practical implementation of the register machine concept, defining and developing an assembler for it. This register machine serves as a virtual machine for the interpreters and compilers implemented in the book, and as a testbed for illustrating the implementation and effect of modifications to the evaluation mechanism. Working Scheme systems based on the design described in this book are quite common student projects.

Download 

Computational Biology: Genomes, Networks, Evolution


Lecture Slides

Lecture 1
Course Overview and Outline - Intro to Biology - Why Computational Biology - Regulatory Motif discovery

Lecture 2 - Sequence Alignment + Dynamic Programming
Fibonacci - Paths & Alignments - Bounded DP - Linear Space Alignment

Lecture 3 - Sequence Alignment II
Global/Local/Semi-global alignment + Affine gaps + Alignment statistics

Lecture 4 - Exact string matching
Semi-numerical methods, prelude to hashing

Lecture 5 - Hashing + Blast
Database search - Hashing - Blast - Extensions - Combs - Suffix Trees

Lecture 6 - Modeling Biological Sequences with HMMs
Dishonest Casino, CpG islands, Markov Chains, HMMs, Viterbi

Lecture 7 - HMM decoding evaluation training
Viterbi+Decoding, Forward+Evaluation, Backward+Posterior Decoding, Baum-Welch+Training


Lecture 9 - Clustering and Dimensionality Reduction
Running time analysis, feature selection, SVD, PCA

Lecture 10 - Regulatory Motif Discovery
Combinatorial/probabilistic formulation, weight matrices, Gibbs sampling, EM

Lecture 11 - Graph algorithms
Connected components, spectral partitioning
Evolution, trees, distance-based methods, parsimony 

Lecture 13 - Phylogenetics
Jukes-Cantor, Kimura, ultrametric, additive, UPGMA, Neighbor-Joining, Dynamic programming parsimony

Lecture 15 - RNA folding
RNA folding - Nussinov's algorithm - Zuker's algorithm - context-free grammars - parsing

Lecture 16 - Stochastic Context-Free Grammars
CYK algorithm - Inside/Outside - HMM similarity - Posterior decoding

Lecture 17 - Genome Rearrangements
Evolution by rearrangements - Sorting by reversals - greedy algorithms - approximation algorithms - breakpoint graphs

Lecture 18 - Genome Duplication
Orthologs - Paralogs - Phylogenetic Tree Reconciliation - Genome Duplication - Duplicate gene divergence - Accelerated Evolution

Lecture 19 - Genome Assembly
Sequencing, assembly, whole genome shotgun, hierarchical approach

Lecture 22 - Biological Networks
Guest lecture by Laszlo Barabasi - Scale-free networks - Network growth - Robustness - Modularity - Hierarchical - Flux

Lecture 23 - Advanced Multiple Alignment and Assembly
Traditional assembly - String-graph assembly - Global and glocal alignment - Alignment with polymorphism

Lecture 24 - Whole-Genome Analysis
HMMs for Gene Finding - Classification based gene finding - Human Motif Finding - MicroRNA regulation

Recitation Notes







Problem Sets

Problem Set 1     

Problem Set 2     



Wednesday 5 September 2012

Listen about Bioinformatics....






One More is here

An Introduction to Genetic Algorithms - Melanie Mitchell

Science arises from the very human desire to understand and control the world. Over the course of history, we humans have gradually built up a grand edifice of knowledge that enables us to predict, to varying extents, the weather, the motions of the planets, solar and lunar eclipses, the courses of diseases, the rise and fall of economic growth, the stages of language development in children, and a vast panorama of other natural, social, and cultural phenomena. More recently we have even come to understand some fundamental limits to our abilities to predict. Over the eons we have developed increasingly complex means to control many aspects of our lives and our interactions with nature, and we have learned, often the hard way, the extent to which other aspects are uncontrollable.

The advent of electronic computers has arguably been the most revolutionary development in the history of science and technology. This ongoing revolution is profoundly increasing our ability to predict and control nature in ways that were barely conceived of even half a century ago. For many, the crowning achievements of this revolution will be the creation—in the form of computer programs—of new species of intelligent beings, and even of new forms of life.

The goals of creating artificial intelligence and artificial life can be traced back to the very beginnings of the computer age. The earliest computer scientists (Alan Turing, John von Neumann, Norbert Wiener, and others) were motivated in large part by visions of imbuing computer programs with intelligence, with the life-like ability to self-replicate, and with the adaptive capability to learn and to control their environments. These early pioneers of computer science were as much interested in biology and psychology as in electronics, and they looked to natural systems as guiding metaphors for how to achieve their visions. It should be no surprise, then, that from the earliest days computers were applied not only to calculating missile trajectories and deciphering military codes but also to modeling the brain, mimicking human learning, and simulating biological evolution. These biologically motivated computing activities have waxed and waned over the years, but since the early 1980s they have all undergone a resurgence in the computation research community. The first has grown into the field of neural networks, the second into machine learning, and the third into what is now called "evolutionary computation," of which genetic algorithms are the most prominent example.

Read

DNA Molecular Structure and Dynamics

Author: I.C. Baianu, editor with several contributors


Description

A concise overview with color image galleries of important DNA molecular dynamics applications to computing and quantum computations of DNA structure and dynamics.

Includes several image galleries covering instrumentation and techniques, with brilliant contributed images. 113-page textbook PDF, 24 MB, dated May 25th, 2009.

Read

The Forbidden Combinations of Amino acids & Genetic Codes (codons)




The proteinogenic amino acids tryptophan and methionine each have only a single codon in the table of the universal genetic code (TGG and ATG); cysteine has two (TGT and TGC), and TGT is the one used here. Each codon is 1 of the 64 entries in the table, a relative frequency of 1.5625%. Strikingly, the relative abundance of each of these amino acids in enzymes is also invariably less than 3.0%, irrespective of the class and type of the reaction catalyzed, whereas the other amino acids show variable distributions. The following genetic code combinations are also very rare in nature. Databases such as NCBI contain some hypothetical, predicted, or cloned sequences and proteins with these combinations, but none of them are natural.

The list of forbidden genetic code combinations:

1.  TGGTGTATG   corresponding to the amino acid combination WCM
2.  TGGATGTGT   corresponding to the amino acid combination WMC
3.  TGTATGTGG   corresponding to the amino acid combination CMW
4.  TGTTGGATG   corresponding to the amino acid combination CWM
5.  ATGTGTTGG  corresponding to the amino acid combination MCW
6.  ATGTGGTGT  corresponding to the amino acid combination MWC
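The mapping between the nine-base sequences and the three-letter amino acid combinations can be checked mechanically. The following is a minimal Python sketch, not part of the original post; the names `CODON_TO_AA`, `AA_TO_CODON`, `translate`, and `combos` are illustrative, and only the three codons used in the list (TGG, TGT, ATG) are handled.

```python
from itertools import permutations

# The three codons used in the list above (DNA alphabet, standard code).
CODON_TO_AA = {"TGG": "W", "TGT": "C", "ATG": "M"}
AA_TO_CODON = {aa: codon for codon, aa in CODON_TO_AA.items()}


def translate(dna):
    """Translate a DNA string made only of TGG, TGT and ATG codons."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(CODON_TO_AA[c] for c in codons)


# Every ordering of W, C and M, mapped back to its nine-base sequence.
combos = {
    "".join(p): "".join(AA_TO_CODON[aa] for aa in p)
    for p in permutations("WCM")
}

for peptide, dna in combos.items():
    # Round-trip check: translating the DNA gives back the peptide.
    assert translate(dna) == peptide
    print(dna, peptide)
```

Three distinct amino acids give 3! = 6 orderings, which is why the list has six entries.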

Based on these observations, I conclude that nature does not allow all the genetic code combinations to occur with equal probability. If all combinations were equally likely, these combinations would be observed with the same relative frequency as other code combinations. Why nature forbids such combinations is yet to be answered. Is the restriction biophysical or genetic? These questions remain open.
One could also make proteins containing these restricted combinations, if possible (either by site-directed mutagenesis or by solid-phase synthesis), and study their biophysical properties. The above observation is based purely on the data available from NCBI and RCSB.
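The equal-probability argument can be made concrete with a small calculation. This is only a sketch under a strong assumption that is not from the post: bases are treated as independent and uniformly distributed, which real genomes do not satisfy. The function name `expected_hits` and its parameters are hypothetical.

```python
def expected_hits(seq_len, k=9, alphabet=4):
    """Expected count of one specific k-mer in a random sequence.

    Assumes independent, uniformly distributed symbols (an
    idealisation; real sequences have composition biases).
    """
    windows = max(seq_len - k + 1, 0)  # number of start positions
    return windows / alphabet ** k


# Number of distinct nine-base sequences under this model.
print(4 ** 9)

# Expected occurrences of any one of them in a million random bases.
print(expected_hits(1_000_000))
```

Under this idealised model every nine-base sequence has the same expected count, so a sequence observed far less often than that baseline stands out as under-represented.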

To verify this claim:
1. Run blastp at http://blast.ncbi.nlm.nih.gov/Blast.cgi for wcmwmccmwcwmmcwmwc and check the output: look at the matching proteins and find whether they are hypothetical or biochemically characterized.
2. Run blastn at http://blast.ncbi.nlm.nih.gov/Blast.cgi for each of TGGTGTATGAAAAAAAAAAA, TGGATGTGTAAAAAAAAAAA, TGTATGTGGAAAAAAAAAAA, TGTTGGATGAAAAAAAAAAA, ATGTGTTGGAAAAAAAAAAA, and ATGTGGTGTAAAAAAAAAAA, and check each output. One would find only similar sequences, with no exact match (except some clones).
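The query strings for the two steps above can be generated rather than typed by hand. The sketch below only builds the strings to paste into the BLAST web form; it does not contact NCBI. The function names and the `pad_len=11` default (meant to mirror the poly-A padding of the blastn queries above) are assumptions for illustration.

```python
# Codons as used in the post (TGT chosen for cysteine).
AA_TO_CODON = {"W": "TGG", "C": "TGT", "M": "ATG"}

# Orderings in the same sequence as the blastp query string above.
ORDER = ["WCM", "WMC", "CMW", "CWM", "MCW", "MWC"]


def blastp_query(order=ORDER):
    """Concatenate all tripeptides into one lower-case protein query."""
    return "".join(order).lower()


def blastn_queries(order=ORDER, pad_len=11):
    """Build one nucleotide query per tripeptide, padded with poly-A."""
    pad = "A" * pad_len  # assumed padding length; adjust if needed
    return ["".join(AA_TO_CODON[aa] for aa in tri) + pad for tri in order]


print(blastp_query())
for query in blastn_queries():
    print(query)
```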


Sivashanmugam. P., Lecturer, Biophysical Chemistry, 
Department of Bioinformatics, Jamal Mohamed College, Tiruchirappalli – 620020 – India
e-mail: soundaryanayaki@aol.com