Below is a listing of software and bioinformatics tools developed by DCMB faculty and researchers.
Genomics, regulatory genomics and epigenomics
- Bamnostic:
A pure Python multi-version tolerant, runtime and OS-agnostic BAM file parser and random access tool.
- Broad-Enrich:
test for enriched biological pathways, Gene Ontology terms, or other gene sets (Sartor)
- Canny:
assessing copy number variation genotypes (Mills)
- Dinumt:
identification and genotyping of nuclear insertions of mitochondrial origin (Mills)
- fastCN:
A pipeline for the fast estimation of copy-number based on read depth using the multi-mapper mrsFAST (Kidd)
- F-seq:
Individual sequence data summarization and display (Boyle)
- F-seq2:
A generic peak caller using kernel density estimation (Boyle)
- GREGOR:
evaluation of global enrichment of trait-associated variants (Willer)
- insertion-genotype:
Genotyping Mobile Element Insertions based on remapping reads (Kidd)
- Islet eQTL variants:
exploration of variants in islet expression quantitative trait loci (Parker)
- LocusZoom:
plotting regional association results (Willer)
- LRpath:
gene set enrichment testing using logistic regression (Sartor)
- MethylSig:
analyzing bisulfite sequencing data (Sartor)
- Nephroseq:
analysis of publicly available renal gene expression data (Kretzler)
- Palmer:
Pre-mAsking Long reads for Mobile Element inseRtion
- PePr:
Peak Prioritization Pipeline, an analysis pipeline for ChIP-Seq experiments with biological replicates (Sartor)
- QuicK-mer2:
k-mer based analysis for paralog specific copy number estimation (Kidd)
- RegulomeDB:
a database that annotates SNPs (Boyle)
- SAIGE:
Efficiently controlling for case-control imbalance and sample relatedness in single-variant association tests (SAIGE) and controlling for sample relatedness in region-based association tests in large cohorts and biobanks (SAIGE-GENE) (Willer)
- Self Organizing Maps:
exploration of the combinatorial space of transcription factor binding (Boyle)
- Svelter:
identification of rearrangements from paired-end sequencing data (Mills)
Protein structure, proteomics, and alternative splicing
- 3DRobot:
protein decoy structure generator (Zhang)
- ABACUS:
extraction of label-free quantitative information from MS/MS data sets (Nesvizhskii)
- ANGLOR:
protein backbone torsion angle prediction (Zhang)
- BatMass:
mass spectrometry data visualization (Nesvizhskii)
- BSpred:
sequence-based protein-protein binding site prediction (Zhang)
- COACH:
protein ligand binding site prediction (Zhang)
- COFACTOR:
structure-based protein function prediction (Zhang)
- CRAPome:
Contaminant Repository for Affinity Purification (Nesvizhskii)
- Crystal-C:
A computational tool for refinement of open search results (Nesvizhskii)
- DEMO:
protein domain structure assembly (Zhang)
- DIA-UMPIRE:
analysis of data independent acquisition (DIA) mass spectrometry-based proteomics data (Nesvizhskii)
- Disorder Atlas:
interpretation of intrinsic disorder predictions using proteome-based descriptive statistics (Schnell)
- EDTsurf:
construction of macromolecular surface (Zhang)
- FragPipe:
A complete proteomics pipeline with the MSFragger search engine at its heart
- HAAD:
hydrogen atom addition for protein structures (Zhang)
- I-TASSER:
protein structure prediction and structure-based function annotation (Zhang)
- I-TASSER-MR:
determining phase of X-ray crystallography using structure prediction (Zhang)
- IonCom:
protein ion ligand binding site prediction (Zhang)
- LOMETS:
meta-server for protein threading and fold-recognition (Zhang)
- Luciphor:
localization of post-translational modifications on peptide sequences (Nesvizhskii)
- Luciphor2:
an expansion of Luciphor in Java (Nesvizhskii)
- MAP-DIA:
Model-based Analysis of Quantitative Proteomics from Data Independent Acquisition Mass Spectrometry (Nesvizhskii)
- MM-align:
protein-protein complex structural alignment (Zhang)
- ModRefiner:
high resolution protein structure refinement program (Zhang)
- MSFragger:
Ultrafast and comprehensive peptide identification in mass spectrometry–based proteomics (Nesvizhskii)
- MUSTER:
protein threading program for identifying global structure template for target sequence (Zhang)
- NESTEDCLUSTER:
construction of protein complexes (Nesvizhskii)
- PD-Nodes:
The implementation of MSFragger and Philosopher (PeptideProphet) as Proteome Discoverer nodes (Nesvizhskii)
- Philosopher:
A complete toolkit for shotgun proteomics data analysis (Nesvizhskii)
- PROHITS:
a Laboratory Information Management System (LIMS) for interaction proteomics (Nesvizhskii)
- PSSpred:
protein secondary structure prediction (Zhang)
- QPROT:
analysis of differential protein expression (Nesvizhskii)
- QSPEC:
analysis of differential protein expression with label-free spectral count data (Nesvizhskii)
- QUARK:
ab initio protein structure prediction (Zhang)
- REMO:
reconstructing full-atom protein structure model from C-alpha trace (Zhang)
- ResQ:
estimating B-factor and residue-level quality of protein structure (Zhang)
- RW and RWplus:
atomic-level potential for protein structure recognition (Zhang)
- SAINT:
Significance Analysis of INTeractome (Nesvizhskii)
- SEGMER:
protein threading program for identifying local conserved structure motifs (Zhang)
- Spectre:
identification of regions of active translation from ribosome profiling sequence data (Mills)
- SPICKER:
protein decoy selection through structure clustering (Zhang)
- STRUM:
prediction of protein stability change upon single point mutation (Zhang)
- SVMSEQ:
protein residue-residue contact prediction by Support Vector Machine (Zhang)
- ThreaDom:
threading-based protein domain boundary prediction (Zhang)
- TM-align:
protein structural alignment (Zhang)
- TM-score:
quantitative assessment of protein structure similarity (Zhang)
- TMT-Integrator:
A tool that integrates channel abundances from multiple TMT samples and exports a general report for downstream analysis (Nesvizhskii)
- Trans-Proteomic Pipeline:
primary processing of mass spectrometry-based proteomic data (Nesvizhskii)
Systems biology and networks analysis
- MetDisease Plugin for Cytoscape:
annotation of a metabolic network with MeSH disease terms (Karnovsky)
- MetScape3:
the visualization and interpretation of metabolomic and expression profiling data (Karnovsky)
- Network WorkBench:
a large-scale network analysis, modeling and visualization toolkit (Schnell)
Biomedical data science, translational bioinformatics, and pharmacogenomics
- ConceptMetab:
mapping and exploring the relationships among metabolite sets (Sartor)
- MetDisease Plugin for Cytoscape:
annotation of a metabolic network with MeSH disease terms (Karnovsky)
- MetScape3:
the visualization and interpretation of metabolomic and expression profiling data (Karnovsky)
Methodological development in computational biology
- Compressive Big Data Analytics (CBDA): (Dinov)
- Data Science: Time Complexity, Inferential Uncertainty, and Spacekime Analytics: (Dinov)
- DataSifter: Statistical Obfuscation of sensitive health data:
(Dinov)
- DPC:
Dual Projection onto Convex Sets for screening (Ye)
- LONI Pipeline:
graphical Workflow Environment for Imaging, Informatics and Genomics Computing (Dinov)
- Probability Distributome Project:
for diverse probability distributions (Dinov)
- R Predictive Big Data Analytics (PBDA):
Big Data Discovery Science, and Big Data To Knowledge Software (Dinov)
- SOCR:
Statistics Online Computational Resource (Dinov)
- WAIR:
Wavelet Analysis of Image Registration (Dinov)