Projects and Software

Population Genetics

 
HAPAA: a tool for ancestral haploblock reconstruction. Given the genotype of an individual of admixed ancestry (for instance, as derived from an Illumina genotyping array), HAPAA infers the source population for each segment of the individual's genome.

 

Protein Interaction Networks

 
Graemlin: a tool for aligning multiple global protein interaction networks. Graemlin also supports searching for homology between a query module of proteins and a database of interaction networks.

 

Machine Learning

 
CONTRA: Conditionally trained models for sequence analysis. See CONTRAlign, a protein sequence aligner with very high accuracy, especially in twilight alignments. See CONTRAfold, an RNA secondary structure prediction tool. Stay tuned for more...

RNA Structure Prediction

CONTRAfold: Prediction of RNA secondary structure with a Conditional Log-Linear model that relies on automatically trained parameters, rather than on a physics-based energy model of RNA folding.

Protein Alignment

CONTRAlign: A protein sequence aligner that users can optionally train on feature sets such as secondary structure and solvent accessibility; see the CONTRA project above.
A protein multiple sequence aligner that exhibits high accuracy on popular benchmarks.
A protein multiple aligner that automatically finds domain structures of sequences with shuffled and repeated domain architectures.

Motif Finding

MotifCut: a non-parametric graph-based motif finding algorithm.
MotifScan: a non-parametric method for representing motifs and scanning DNA sequences for known motifs.
CompareProspector: motif finding with Gibbs sampling & alignment.

Genomic Alignment

Stanford ENCODE: multiple alignments of 1% of the human genome.
Typhon: BLAST-like sequence search against a database of multiple alignments.
LAGAN: tools for genomic alignment. These include the MLAGAN multiple alignment tool, and Shuffle-LAGAN for alignment with rearrangements.

Microarray Analysis

Application of Independent Component Analysis (ICA) to microarrays.