Topic: "genome"
google/deepvariant
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
Language: Python - Size: 867 MB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 3,493 - Forks: 758

ivanseidel/IAMDinosaur
🦄 An Artificial Intelligence to teach Google's Dinosaur to jump cactus
Language: JavaScript - Size: 348 KB - Last synced at: 3 months ago - Pushed at: over 7 years ago - Stars: 2,812 - Forks: 536

broadinstitute/gatk
Official code repository for GATK versions 4 and up
Language: Java - Size: 463 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,847 - Forks: 614

primaryobjects/AI-Programmer
Using artificial intelligence and genetic algorithms to automatically write programs. Tutorial: http://www.primaryobjects.com/cms/article149
Language: C# - Size: 6.48 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1,095 - Forks: 265

brentp/mosdepth
fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing
Language: Nim - Size: 1.17 MB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 725 - Forks: 100

jerryji1993/DNABERT
DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome
Language: Python - Size: 11.3 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 680 - Forks: 172

MAGICS-LAB/DNABERT_2
[ICLR 2024] DNABERT-2: Efficient Foundation Model and Benchmark for Multi-Species Genome
Language: Shell - Size: 854 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 404 - Forks: 86

vanheeringen-lab/genomepy
genes and genomes at your fingertips
Language: Python - Size: 11.8 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 399 - Forks: 39

bernatgel/karyoploteR
karyoploteR - An R/Bioconductor package to plot arbitrary data along the genome
Language: R - Size: 3.08 MB - Last synced at: 11 days ago - Pushed at: 3 months ago - Stars: 337 - Forks: 44

HICAI-ZJU/Scientific-LLM-Survey
Scientific Large Language Models: A Survey on Biological & Chemical Domains
Size: 578 KB - Last synced at: 27 minutes ago - Pushed at: about 2 hours ago - Stars: 328 - Forks: 31

bcgsc/abyss
🔬 Assemble large genomes using short reads
Language: C++ - Size: 60.9 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 323 - Forks: 110

tariqdaouda/pyGeno
Personalized Genomics and Proteomics. Main diet: Ensembl, side dishes: SNPs
Language: Python - Size: 10.6 MB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 321 - Forks: 49

genometools/genometools
GenomeTools genome analysis system.
Language: C - Size: 53.1 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 316 - Forks: 64

Gaius-Augustus/Augustus
Genome annotation with AUGUSTUS
Language: C++ - Size: 545 MB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 314 - Forks: 115

lmdu/pyfastx
a python package for fast random access to sequences from plain and gzipped FASTA/Q files
Language: C - Size: 9.41 MB - Last synced at: 10 days ago - Pushed at: 8 months ago - Stars: 286 - Forks: 23

MariaNattestad/Ribbon
A genome browser designed for complex structural variants and long reads.
Language: JavaScript - Size: 33.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 283 - Forks: 30

bcgsc/NanoSim
Nanopore sequence read simulator
Language: Python - Size: 1010 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 274 - Forks: 61

alekseyzimin/masurca
Language: M4 - Size: 3.9 GB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 254 - Forks: 33

aquaskyline/SOAPdenovo2
Next generation sequencing reads de novo assembler.
Language: C - Size: 2.1 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 232 - Forks: 77

zengxiaofei/HapHiC
HapHiC: a fast, reference-independent, allele-aware scaffolding tool based on Hi-C data
Language: Python - Size: 42.1 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 228 - Forks: 11

ropensci/biomartr
Genomic Data Retrieval with R
Language: R - Size: 6.03 MB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 223 - Forks: 29

Nextomics/NextPolish
Fast and accurate polishing of genomes generated from long reads.
Language: C - Size: 14.2 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 223 - Forks: 28

broadinstitute/viral-ngs
Viral genomics analysis pipelines
Language: Python - Size: 64.5 MB - Last synced at: 6 months ago - Pushed at: 12 months ago - Stars: 192 - Forks: 68

genome-spy/genome-spy
A visualization grammar and GPU-accelerated toolkit for genomic data
Language: JavaScript - Size: 14.4 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 184 - Forks: 11

google/deepsomatic
DeepSomatic is an analysis pipeline that uses a deep neural network to call somatic variants from tumor-normal and tumor-only sequencing data.
Size: 66.4 KB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 180 - Forks: 22

nf-core/eager
A fully reproducible and state-of-the-art ancient DNA analysis pipeline
Language: Nextflow - Size: 64.7 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 179 - Forks: 83

althonos/pyrodigal
Cython bindings and Python interface to Prodigal, an ORF finder for genomes and metagenomes. Now with SIMD!
Language: Cython - Size: 7.03 MB - Last synced at: 3 days ago - Pushed at: 19 days ago - Stars: 169 - Forks: 9

pirovc/genome_updater
Bash script to download/update snapshots of files from NCBI genomes repository (refseq/genbank) with track of changes and without redundancy
Language: Shell - Size: 1.29 MB - Last synced at: 2 days ago - Pushed at: 9 months ago - Stars: 160 - Forks: 15

XMU-Kuangnan-Fang-Team/GENetLib
A Python library for Gene–environment interaction analysis via deep learning
Language: Python - Size: 4.37 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 157 - Forks: 19

aehrc/VariantSpark
machine learning for genomic variants
Language: JavaScript - Size: 75.1 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 146 - Forks: 45

baoxingsong/AnchorWave
Sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism and whole-genome duplication variation
Language: C++ - Size: 26.9 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 145 - Forks: 19

tolkit/telomeric-identifier
Identify and find telomeres, or telomeric repeats in a genome.
Language: Rust - Size: 4.28 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 130 - Forks: 14

cslarsen/arv
A fast 23andMe DNA parser and inferrer for Python
Language: C++ - Size: 325 KB - Last synced at: 18 days ago - Pushed at: almost 6 years ago - Stars: 121 - Forks: 7

neherlab/pangraph
A bioinformatic toolkit to align genome assemblies into pangenome graphs
Language: C - Size: 75.5 MB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 112 - Forks: 7

DECIPHER-genomics/Genoverse
HTML5 scrollable genome browser
Language: JavaScript - Size: 15.3 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 111 - Forks: 45

COMBINE-lab/pufferfish
An efficient index for the colored, compacted, de Bruijn graph
Language: C - Size: 6.88 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 107 - Forks: 19

mentatpsi/OSGenome
An Open Source Web Application for Genetic Data (SNPs) using 23andMe and Data Crawling Technologies
Language: Python - Size: 5.75 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 107 - Forks: 17

genotoul-bioinfo/dgenies
Dotplot large Genomes in an Interactive, Efficient and Simple way
Language: JavaScript - Size: 16.7 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 104 - Forks: 12

mobinasri/flagger
Evaluating genome assemblies
Language: C - Size: 36.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 97 - Forks: 10

bcgsc/arcs
🌈 Scaffold genome sequence assemblies using linked or long read sequencing data
Language: C++ - Size: 106 MB - Last synced at: 6 days ago - Pushed at: 10 months ago - Stars: 96 - Forks: 16

rnabioco/valr
Genome Interval Arithmetic in R
Language: R - Size: 69.3 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 93 - Forks: 25

broadinstitute/catch
A package for designing compact and comprehensive capture probe sets.
Language: Python - Size: 5.68 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 15

flowhub-team/WholeGenomeSequencing
Whole Genome Sequencing analysis, WGS analysis
Size: 889 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 78 - Forks: 10

kennethreitz/context
Raw dump of Kenneth Reitz's DNA Sequence, Ancestry, Genealogy, &c.
Language: HTML - Size: 129 MB - Last synced at: about 8 hours ago - Pushed at: about 11 hours ago - Stars: 77 - Forks: 8

ychuest/Awesome-LLMs-meet-genomes
Explore a comprehensive collection of basic theories, applications, papers, and best practices about Large Language Models (LLMs) in genomes.
Size: 24.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 75 - Forks: 4

SAMtoBAM/MUMandCo
MUM&Co is a simple bash script that uses Whole Genome Alignment information provided by MUMmer (only v4) to detect Structural Variation
Language: Shell - Size: 4.67 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 71 - Forks: 15

robert-koch-institut/SARS-CoV-2-Sequenzdaten_aus_Deutschland
A central component of successful pathogen surveillance is understanding how a pathogen spreads and what its pathogenic properties are. Knowledge of the pathogen genome is an important source of information here. Detecting mutations in a pathogen's genome makes it possible to reconstruct relationships...
Size: 11.5 GB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 68 - Forks: 7

evotools/nf-LO
A Nextflow workflow to generate lift over files for any pair of genomes
Language: Nextflow - Size: 19.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 66 - Forks: 10

muellan/metacache
memory efficient, fast & precise taxonomic classification system for metagenomic read mapping
Language: C++ - Size: 149 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 60 - Forks: 13

tubanlee/MD
Matrix dissimilarity from the differences of Moments and sparsity
Language: R - Size: 2.42 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 56 - Forks: 3

icebert/pblat
parallelized blat with multi-threads support
Language: C - Size: 2.45 MB - Last synced at: 7 months ago - Pushed at: 8 months ago - Stars: 52 - Forks: 15

drostlab/LTRpred
De novo annotation of young retrotransposons
Language: R - Size: 8.26 MB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 48 - Forks: 9

quxiaojian/PGA
Plastid Genome Annotator
Language: Perl - Size: 3.84 MB - Last synced at: 8 months ago - Pushed at: almost 5 years ago - Stars: 48 - Forks: 18

zhaotao1987/SynNet-Pipeline
Workflow for Building Microsynteny Networks
Language: Shell - Size: 3.78 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 47 - Forks: 27

jasperlinthorst/reveal
Graph based multi genome aligner
Language: Python - Size: 6.48 MB - Last synced at: about 2 months ago - Pushed at: almost 4 years ago - Stars: 47 - Forks: 3

YaoLab-Bioinfo/shinyChromosome
an R/Shiny application for interactive creation of non-circular plots of whole genomes
Language: R - Size: 106 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 45 - Forks: 9

ay-lab/dcHiC
dcHiC: Differential compartment analysis for Hi-C datasets
Language: R - Size: 155 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 44 - Forks: 9

deepomicslab/GCNFrame
This is a python package for genomics study with a GCN framework.
Language: Python - Size: 2.46 MB - Last synced at: 12 days ago - Pushed at: 9 months ago - Stars: 42 - Forks: 8

mkpython3/Mutation-Simulator
A tool for simulating random mutations in any genome
Language: Python - Size: 4.47 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 6

ginkgobioworks/edge
Efficiently keep track of changes to genomes
Language: Python - Size: 37.6 MB - Last synced at: about 15 hours ago - Pushed at: about 1 year ago - Stars: 38 - Forks: 4

estebanpw/chromeister
A dotplot generator for large chromosomes
Language: C - Size: 927 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 38 - Forks: 4

Plant-Food-Research-Open/assemblyqc
A Nextflow pipeline for evaluating assembly quality
Language: Nextflow - Size: 65.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 37 - Forks: 8

wangyibin/CPhasing
C-Phasing/CPhasing: Phasing and scaffolding polyploid genomes based on Pore-C, HiFi-C/CiFi or Hi-C.
Language: Python - Size: 70.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 37 - Forks: 4

guyleonard/get_jgi_genomes
A quick and easy way to download the genomes/predicted proteins of taxa available in JGI's Genome Portal.
Language: Perl - Size: 97.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 37 - Forks: 6

BioinformaticsLabAtMUN/Promotech
Machine-learning-based general bacterial promoter prediction tool.
Language: C - Size: 89.3 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 37 - Forks: 10

cggh/panoptes
Eyes on your (genomic) data
Language: JavaScript - Size: 51.4 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 37 - Forks: 6

bpucker/MGSE
Mapping-based Genome Size Estimation (MGSE) performs an estimation of a genome size based on a read mapping to an existing genome sequence assembly.
Language: Python - Size: 16.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 34 - Forks: 3

sjteresi/TE_Density
Python script calculating transposable element density for all genes in a genome. Publication: https://mobilednajournal.biomedcentral.com/articles/10.1186/s13100-022-00264-4
Language: Python - Size: 7.5 MB - Last synced at: 7 days ago - Pushed at: 11 months ago - Stars: 34 - Forks: 5

SirBob01/NEAT-Python
Genetic learning algorithm implementation for simulations, games, or general machine learning problems
Language: Python - Size: 321 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 34 - Forks: 10

RKMlab/perf
PERF is an Exhaustive Repeat Finder
Language: HTML - Size: 16 MB - Last synced at: 12 days ago - Pushed at: over 4 years ago - Stars: 34 - Forks: 11

junjunlab/BioSeqUtils
Extract Sequence from Genome According to Annotation File
Language: R - Size: 1.03 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 32 - Forks: 3

3DGenomes/TADkit 📦
3D Genome Browser
Language: JavaScript - Size: 99.8 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 31 - Forks: 10

jlab-code/MethylStar
A fast and robust pre-processing pipeline for bulk or single-cell whole-genome bisulfite sequencing (WGBS) data.
Language: Python - Size: 2.96 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 29 - Forks: 6

csoderlund/SyMAP
Synteny Mapping and Analysis Program
Language: Java - Size: 178 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 27 - Forks: 7

aquaskyline/16GT
Simultaneous detection of SNPs and Indels using a 16-genotype probabilistic model
Language: Perl - Size: 1.26 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 8

bcgsc/btllib
📚 Bioinformatics Technology Lab common code library
Language: C++ - Size: 12.9 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 25 - Forks: 6

nageshsinghc4/DNA-Sequence-Machine-learning
Understand DNA structure and how machine learning can be used to work with DNA sequence data.
Language: Python - Size: 2.83 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 25 - Forks: 23

GMOD/jbrowse-jupyter
A python package for showing JBrowse views
Language: Python - Size: 6.45 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 24 - Forks: 4

lpryszcz/pyScaf
Genome assembly scaffolding using information from paired-end/mate-pair libraries, long reads, and synteny to closely related species.
Language: Python - Size: 3.07 MB - Last synced at: 8 days ago - Pushed at: almost 7 years ago - Stars: 24 - Forks: 11

SouradiptoC/CodonU
A python project for analysis of codon usage for gene or genome analysis
Language: Python - Size: 64.4 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 23 - Forks: 1

Rinoahu/SwiftOrtho
A high performance tool to identify orthologs and paralogs across genomes.
Language: Python - Size: 20.1 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 22 - Forks: 12

resendislab/corda
An implementation of genome-scale model reconstruction using Cost Optimization Reaction Dependency Assessment by Schultz et al.
Language: Python - Size: 1.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 8

institut-de-genomique/HAPO-G
Hapo-G is a tool that aims to improve the quality of genome assemblies by polishing the consensus with accurate reads.
Language: C - Size: 177 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 2

GFA-spec/assembler-components
Components of genome sequence assembly tools
Language: Shell - Size: 41 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 21 - Forks: 3

sanger-pathogens/companion 📦
This repository has been archived, currently maintained version is at https://github.com/iii-companion/companion
Language: Lua - Size: 23.1 MB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 21 - Forks: 18

Phillip-a-richmond/GenomeAnalysisModule
Welcome to the website and github repository for the Genome Analysis Module. This website will guide the learning experience for trainees in the UBC MSc Genetic Counselling Training Program, as they embark on a journey to learn about analyzing genomes.
Language: HTML - Size: 10.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 20 - Forks: 6

wtsi-hpag/Scaff10X
Pipeline for scaffolding and breaking a genome assembly using 10x genomics linked-reads
Language: C - Size: 3.94 MB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 20 - Forks: 4

nf-core/references
nf-core/references is a bioinformatics pipeline that builds references for multiple use cases
Language: Nextflow - Size: 1.34 MB - Last synced at: 2 days ago - Pushed at: 9 days ago - Stars: 18 - Forks: 5

guigolab/tmerge
Merge transcriptome read-to-genome alignments into non-redundant transcript models
Language: Perl - Size: 1.16 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 18 - Forks: 3

teepean/BAM-Analysis-Kit
Language: Forth - Size: 2.18 GB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 18 - Forks: 6

devinus/genome
My genomic data
Language: Standard ML - Size: 5.61 MB - Last synced at: 6 days ago - Pushed at: over 5 years ago - Stars: 18 - Forks: 2

superphy/semantic
SuperPhy for the semantic web
Language: Web Ontology Language - Size: 40.1 MB - Last synced at: over 1 year ago - Pushed at: over 8 years ago - Stars: 18 - Forks: 3

iferres/MLSTar
An easy way of MLSTyping your genomes in R.
Language: R - Size: 4.37 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 17 - Forks: 10

wiedenhoeft/HaMMLET
Fast Bayesian Hidden Markov Model with Wavelet Compression
Language: C++ - Size: 2.52 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 17 - Forks: 4

zyxue/biogrinder
Grinder is a versatile open-source bioinformatic tool to create simulated omic shotgun and amplicon sequence libraries for all main sequencing platforms.
Language: Perl - Size: 246 KB - Last synced at: 4 months ago - Pushed at: about 7 years ago - Stars: 17 - Forks: 1

pyani-plus/pyani-plus
Development repo for pyani-plus (the next iteration of pyani)
Language: Python - Size: 12.8 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 16 - Forks: 2

swvanderlaan/MetaGWASToolKit
A ToolKit to perform a Meta-analysis of Genome-Wide Association Studies
Language: Shell - Size: 694 MB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 16 - Forks: 2

PASSIONLab/ELBA
Parallel String Graph Construction, Transitive Reduction, and Contig Generation for De Novo Genome Assembly
Language: C++ - Size: 106 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 10

tamerh/biobtree
A bioinformatics tool to search, map and retrieve identifiers, keywords and attributes
Language: Go - Size: 7.11 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 3

satoshikawato/gbdraw
A genome diagram generator for microbes and organelles
Language: Python - Size: 566 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 15 - Forks: 4
