DCMB Software and Bioinformatics Tools

 Below is a listing of software and bioinformatics tools developed by DCMB faculty and researchers.

Genomics, regulatory genomics and epigenomics

  • Bamnostic:
    A pure Python, multi-version-tolerant, runtime- and OS-agnostic BAM file parser and random access tool
  • Broad-Enrich:
    test for enriched biological pathways, Gene Ontology terms, or other gene sets (Sartor)
  • Canny:
    assessing copy number variation genotypes (Mills)
  • Dinumt:
    identification and genotyping of nuclear insertions of mitochondrial origin (Mills)
  • fastCN:
    A pipeline for the fast estimation of copy number based on read depth using the multi-mapper mrsFAST (Kidd)
  • F-seq:
    Individual sequence data summarization and display (Boyle)
  • F-seq2:
    A generic peak caller using kernel density estimation (Boyle)
  • GREGOR:
    evaluation of global enrichment of trait-associated variants (Willer)
  • insertion-genotype:
    Genotyping Mobile Element Insertions based on remapping reads (Kidd)
  • Islet eQTL variants:
    exploration of variants in islet expression quantitative trait loci (Parker)
  • LocusZoom:
    plotting regional association results (Willer)
  • LRpath:
    gene set enrichment testing using logistic regression (Sartor)
  • MethylSig:
    analyzing bisulfite sequencing data (Sartor)
  • Nephroseq:
    analysis of publicly available renal gene expression data (Kretzler)
  • Palmer:
    Pre-mAsking Long reads for Mobile Element inseRtion
  • PePr:
    Peak Prioritization Pipeline, an analysis pipeline for ChIP-Seq experiments with biological replicates (Sartor)
  • QuicK-mer2:
    k-mer-based analysis for paralog-specific copy number estimation (Kidd)
  • RegulomeDB:
    a database that annotates SNPs (Boyle)
  • SAIGE:
    Efficiently controlling for case-control imbalance and sample relatedness in single-variant association tests (SAIGE) and controlling for sample relatedness in region-based association tests in large cohorts and biobanks (SAIGE-GENE) (Willer)
  • Self Organizing Maps:
    exploration of the combinatorial space of transcription factor binding (Boyle)
  • Svelter:
    identification of rearrangements from paired-end sequencing data (Mills)

Protein structure, proteomics, and alternative splicing

  • 3DRobot:
    protein decoy structure generator (Zhang)
  • ABACUS:
    extraction of label-free quantitative information from MS/MS data sets (Nesvizhskii)
  • ANGLOR:
    protein backbone torsion angle prediction (Zhang)
  • BatMass:
    mass spectrometry data visualization (Nesvizhskii)
  • BSpred:
    sequence-based protein-protein binding site prediction (Zhang)
  • COACH:
    protein ligand binding site prediction (Zhang)
  • COFACTOR:
    structure-based protein function prediction (Zhang)
  • CRAPome:
    Contaminant Repository for Affinity Purification (Nesvizhskii)
  • Crystal-C:
    A computational tool for refinement of open search results (Nesvizhskii)
  • DEMO:
    protein domain structure assembly (Zhang)
  • DIA-UMPIRE:
    analysis of data independent acquisition (DIA) mass spectrometry-based proteomics data (Nesvizhskii)
  • Disorder Atlas:
    interpretation of intrinsic disorder predictions using proteome-based descriptive statistics (Schnell)
  • EDTsurf:
    construction of macromolecular surface (Zhang)
  • FragPipe:
    A complete proteomics pipeline with the MSFragger search engine at its core
  • HAAD:
    hydrogen atom addition for protein structures (Zhang)
  • I-TASSER:
    protein structure prediction and structure-based function annotation (Zhang)
  • I-TASSER-MR:
    determining phase of X-ray crystallography using structure prediction (Zhang)
  • IonCom:
    Protein Ion ligand binding site prediction (Zhang)
  • LOMETS:
    meta-server for protein threading and fold-recognition (Zhang)
  • Luciphor:
    localization of post-translational modifications on peptide sequences (Nesvizhskii)
  • Luciphor2:
    an expansion of Luciphor in Java (Nesvizhskii)
  • MAP-DIA:
    Model-based Analysis of Quantitative Proteomics from Data Independent Acquisition Mass Spectrometry (Nesvizhskii)
  • MM-align:
    protein-protein complex structural alignment (Zhang)
  • ModRefiner:
    high-resolution protein structure refinement program (Zhang)
  • MSFragger:
    Ultrafast and comprehensive peptide identification in mass spectrometry–based proteomics (Nesvizhskii)
  • MUSTER:
    protein threading program for identifying global structure template for target sequence (Zhang)
  • NESTEDCLUSTER:
    construction of protein complexes (Nesvizhskii)
  • PD-Nodes:
    The implementation of MSFragger and Philosopher (PeptideProphet) as Proteome Discoverer nodes (Nesvizhskii)
  • Philosopher:
    A complete toolkit for shotgun proteomics data analysis (Nesvizhskii)
  • PROHITS:
    a Laboratory Information Management System (LIMS) for interaction proteomics (Nesvizhskii)
  • PSSpred:
    protein secondary structure prediction (Zhang)
  • QPROT:
    analysis of differential protein expression (Nesvizhskii)
  • QSPEC:
    analysis of differential protein expression with label-free spectral count data (Nesvizhskii)
  • QUARK:
    ab initio protein structure prediction (Zhang)
  • REMO:
    reconstructing full-atom protein structure model from C-alpha trace (Zhang)
  • ResQ:
    estimating B-factor and residue-level quality of protein structure (Zhang)
  • RW and RWplus:
    atomic-level potential for protein structure recognition (Zhang)
  • SAINT:
    Significance Analysis of INTeractome (Nesvizhskii)
  • SEGMER:
    protein threading program for identifying local conserved structure motifs (Zhang)
  • Spectre:
    identification of regions of active translation from ribosome profiling sequence data (Mills)
  • SPICKER:
    protein decoy selection through structure clustering (Zhang)
  • STRUM:
    prediction of protein stability change upon single point mutation (Zhang)
  • SVMSEQ:
    protein residue-residue contact prediction by Support Vector Machine (Zhang)
  • ThreaDom:
    threading-based protein domain boundary prediction (Zhang)
  • TM-align:
    protein structural alignment (Zhang)
  • TM-score:
    quantitative assessment of protein structure similarity (Zhang)
  • TMT-Integrator:
    A tool that integrates channel abundances from multiple TMT samples and exports a general report for downstream analysis (Nesvizhskii)
  • Trans-Proteomic Pipeline:
    primary processing of mass spectrometry-based proteomic data (Nesvizhskii)

Systems biology and networks analysis

  • MetDisease Plugin for Cytoscape:
    annotation of a metabolic network with MeSH disease terms (Karnovsky)
  • MetScape3:
    the visualization and interpretation of metabolomic and expression profiling data (Karnovsky)
  • Network WorkBench:
    a large-scale network analysis, modeling and visualization toolkit (Schnell)

Biomedical data science, translational bioinformatics, and pharmacogenomics

  • ConceptMetab:  
    mapping and exploring the relationships among metabolite sets (Sartor)
  • MetDisease Plugin for Cytoscape:
    annotation of a metabolic network with MeSH disease terms (Karnovsky)
  • MetScape3:
    the visualization and interpretation of metabolomic and expression profiling data (Karnovsky)

Methodological development in computational biology