An open API service providing repository metadata for many open source software ecosystems.

Topic: "genome"

google/deepvariant

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.

Language: Python - Size: 867 MB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 3,493 - Forks: 758

ivanseidel/IAMDinosaur

🦄 An Artificial Intelligence to teach Google's Dinosaur to jump cactus

Language: JavaScript - Size: 348 KB - Last synced at: 3 months ago - Pushed at: over 7 years ago - Stars: 2,812 - Forks: 536

broadinstitute/gatk

Official code repository for GATK versions 4 and up

Language: Java - Size: 463 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,847 - Forks: 614

primaryobjects/AI-Programmer

Using artificial intelligence and genetic algorithms to automatically write programs. Tutorial: http://www.primaryobjects.com/cms/article149

Language: C# - Size: 6.48 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1,095 - Forks: 265

brentp/mosdepth

fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing

Language: Nim - Size: 1.17 MB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 725 - Forks: 100

jerryji1993/DNABERT

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Language: Python - Size: 11.3 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 680 - Forks: 172

MAGICS-LAB/DNABERT_2

[ICLR 2024] DNABERT-2: Efficient Foundation Model and Benchmark for Multi-Species Genome

Language: Shell - Size: 854 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 404 - Forks: 86

vanheeringen-lab/genomepy

genes and genomes at your fingertips

Language: Python - Size: 11.8 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 399 - Forks: 39

bernatgel/karyoploteR

karyoploteR - An R/Bioconductor package to plot arbitrary data along the genome

Language: R - Size: 3.08 MB - Last synced at: 11 days ago - Pushed at: 3 months ago - Stars: 337 - Forks: 44

HICAI-ZJU/Scientific-LLM-Survey

Scientific Large Language Models: A Survey on Biological & Chemical Domains

Size: 578 KB - Last synced at: 27 minutes ago - Pushed at: about 2 hours ago - Stars: 328 - Forks: 31

bcgsc/abyss

🔬 Assemble large genomes using short reads

Language: C++ - Size: 60.9 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 323 - Forks: 110

tariqdaouda/pyGeno

Personalized Genomics and Proteomics. Main diet: Ensembl, side dishes: SNPs

Language: Python - Size: 10.6 MB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 321 - Forks: 49

genometools/genometools

GenomeTools genome analysis system.

Language: C - Size: 53.1 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 316 - Forks: 64

Gaius-Augustus/Augustus

Genome annotation with AUGUSTUS

Language: C++ - Size: 545 MB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 314 - Forks: 115

lmdu/pyfastx

a python package for fast random access to sequences from plain and gzipped FASTA/Q files

Language: C - Size: 9.41 MB - Last synced at: 10 days ago - Pushed at: 8 months ago - Stars: 286 - Forks: 23

MariaNattestad/Ribbon

A genome browser designed for complex structural variants and long reads.

Language: JavaScript - Size: 33.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 283 - Forks: 30

bcgsc/NanoSim

Nanopore sequence read simulator

Language: Python - Size: 1010 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 274 - Forks: 61

alekseyzimin/masurca

Language: M4 - Size: 3.9 GB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 254 - Forks: 33

aquaskyline/SOAPdenovo2

Next generation sequencing reads de novo assembler.

Language: C - Size: 2.1 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 232 - Forks: 77

zengxiaofei/HapHiC

HapHiC: a fast, reference-independent, allele-aware scaffolding tool based on Hi-C data

Language: Python - Size: 42.1 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 228 - Forks: 11

ropensci/biomartr

Genomic Data Retrieval with R

Language: R - Size: 6.03 MB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 223 - Forks: 29

Nextomics/NextPolish

Quickly and accurately polish the genome generated by long reads.

Language: C - Size: 14.2 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 223 - Forks: 28

broadinstitute/viral-ngs

Viral genomics analysis pipelines

Language: Python - Size: 64.5 MB - Last synced at: 6 months ago - Pushed at: 12 months ago - Stars: 192 - Forks: 68

genome-spy/genome-spy

A visualization grammar and GPU-accelerated toolkit for genomic data

Language: JavaScript - Size: 14.4 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 184 - Forks: 11

google/deepsomatic

DeepSomatic is an analysis pipeline that uses a deep neural network to call somatic variants from tumor-normal and tumor-only sequencing data.

Size: 66.4 KB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 180 - Forks: 22

nf-core/eager

A fully reproducible and state-of-the-art ancient DNA analysis pipeline

Language: Nextflow - Size: 64.7 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 179 - Forks: 83

althonos/pyrodigal

Cython bindings and Python interface to Prodigal, an ORF finder for genomes and metagenomes. Now with SIMD!

Language: Cython - Size: 7.03 MB - Last synced at: 3 days ago - Pushed at: 19 days ago - Stars: 169 - Forks: 9

pirovc/genome_updater

Bash script to download/update snapshots of files from NCBI genomes repository (refseq/genbank) with track of changes and without redundancy

Language: Shell - Size: 1.29 MB - Last synced at: 2 days ago - Pushed at: 9 months ago - Stars: 160 - Forks: 15

XMU-Kuangnan-Fang-Team/GENetLib

A Python library for Gene–environment interaction analysis via deep learning

Language: Python - Size: 4.37 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 157 - Forks: 19

aehrc/VariantSpark

machine learning for genomic variants

Language: JavaScript - Size: 75.1 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 146 - Forks: 45

baoxingsong/AnchorWave

Sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism and whole-genome duplication variation

Language: C++ - Size: 26.9 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 145 - Forks: 19

tolkit/telomeric-identifier

Identify and find telomeres, or telomeric repeats in a genome.

Language: Rust - Size: 4.28 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 130 - Forks: 14

cslarsen/arv

A fast 23andMe DNA parser and inferrer for Python

Language: C++ - Size: 325 KB - Last synced at: 18 days ago - Pushed at: almost 6 years ago - Stars: 121 - Forks: 7

neherlab/pangraph

A bioinformatic toolkit to align genome assemblies into pangenome graphs

Language: C - Size: 75.5 MB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 112 - Forks: 7

DECIPHER-genomics/Genoverse

HTML5 scrollable genome browser

Language: JavaScript - Size: 15.3 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 111 - Forks: 45

COMBINE-lab/pufferfish

An efficient index for the colored, compacted, de Bruijn graph

Language: C - Size: 6.88 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 107 - Forks: 19

mentatpsi/OSGenome

An Open Source Web Application for Genetic Data (SNPs) using 23AndMe and Data Crawling Technologies

Language: Python - Size: 5.75 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 107 - Forks: 17

genotoul-bioinfo/dgenies

Dotplot large Genomes in an Interactive, Efficient and Simple way

Language: JavaScript - Size: 16.7 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 104 - Forks: 12

mobinasri/flagger

Evaluating genome assemblies

Language: C - Size: 36.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 97 - Forks: 10

bcgsc/arcs

🌈 Scaffold genome sequence assemblies using linked or long read sequencing data

Language: C++ - Size: 106 MB - Last synced at: 6 days ago - Pushed at: 10 months ago - Stars: 96 - Forks: 16

rnabioco/valr

Genome Interval Arithmetic in R

Language: R - Size: 69.3 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 93 - Forks: 25

broadinstitute/catch

A package for designing compact and comprehensive capture probe sets.

Language: Python - Size: 5.68 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 15

flowhub-team/WholeGenomeSequencing

Whole Genome Sequencing analysis, WGS analysis

Size: 889 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 78 - Forks: 10

kennethreitz/context

Raw dump of Kenneth Reitz's DNA Sequence, Ancestry, Genealogy, &c.

Language: HTML - Size: 129 MB - Last synced at: about 8 hours ago - Pushed at: about 11 hours ago - Stars: 77 - Forks: 8

ychuest/Awesome-LLMs-meet-genomes

Explore a comprehensive collection of basic theories, applications, papers, and best practices about Large Language Models (LLMs) in genomes.

Size: 24.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 75 - Forks: 4

SAMtoBAM/MUMandCo

MUM&Co is a simple bash script that uses Whole Genome Alignment information provided by MUMmer (only v4) to detect Structural Variation

Language: Shell - Size: 4.67 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 71 - Forks: 15

robert-koch-institut/SARS-CoV-2-Sequenzdaten_aus_Deutschland

A central component of successful pathogen surveillance is understanding how a pathogen spreads and what its pathogenic properties are. Knowledge of the pathogen genome is an important source of information here. For example, detecting mutations in a pathogen's genome makes it possible to reconstruct relatedness...

Size: 11.5 GB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 68 - Forks: 7

evotools/nf-LO

A Nextflow workflow to generate lift over files for any pair of genomes

Language: Nextflow - Size: 19.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 66 - Forks: 10

muellan/metacache

memory efficient, fast & precise taxonomic classification system for metagenomic read mapping

Language: C++ - Size: 149 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 60 - Forks: 13

tubanlee/MD

Matrix dissimilarity from the differences of Moments and sparsity

Language: R - Size: 2.42 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 56 - Forks: 3

icebert/pblat

parallelized blat with multi-threads support

Language: C - Size: 2.45 MB - Last synced at: 7 months ago - Pushed at: 8 months ago - Stars: 52 - Forks: 15

drostlab/LTRpred

De novo annotation of young retrotransposons

Language: R - Size: 8.26 MB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 48 - Forks: 9

quxiaojian/PGA

Plastid Genome Annotator

Language: Perl - Size: 3.84 MB - Last synced at: 8 months ago - Pushed at: almost 5 years ago - Stars: 48 - Forks: 18

zhaotao1987/SynNet-Pipeline

Workflow for Building Microsynteny Networks

Language: Shell - Size: 3.78 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 47 - Forks: 27

jasperlinthorst/reveal

Graph based multi genome aligner

Language: Python - Size: 6.48 MB - Last synced at: about 2 months ago - Pushed at: almost 4 years ago - Stars: 47 - Forks: 3

YaoLab-Bioinfo/shinyChromosome

an R/Shiny application for interactive creation of non-circular plots of whole genomes

Language: R - Size: 106 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 45 - Forks: 9

ay-lab/dcHiC

dcHiC: Differential compartment analysis for Hi-C datasets

Language: R - Size: 155 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 44 - Forks: 9

deepomicslab/GCNFrame

This is a python package for genomics study with a GCN framework.

Language: Python - Size: 2.46 MB - Last synced at: 12 days ago - Pushed at: 9 months ago - Stars: 42 - Forks: 8

mkpython3/Mutation-Simulator

A tool for simulating random mutations in any genome

Language: Python - Size: 4.47 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 6

ginkgobioworks/edge

Efficiently keep track of changes to genomes

Language: Python - Size: 37.6 MB - Last synced at: about 15 hours ago - Pushed at: about 1 year ago - Stars: 38 - Forks: 4

estebanpw/chromeister

A dotplot generator for large chromosomes

Language: C - Size: 927 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 38 - Forks: 4

Plant-Food-Research-Open/assemblyqc

A Nextflow pipeline for evaluating assembly quality

Language: Nextflow - Size: 65.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 37 - Forks: 8

wangyibin/CPhasing

C-Phasing/CPhasing: Phasing and scaffolding polyploid genomes based on Pore-C, HiFi-C/CiFi or Hi-C.

Language: Python - Size: 70.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 37 - Forks: 4

guyleonard/get_jgi_genomes

A quick and easy way to download the genomes/predicted proteins of taxa available in JGI's Genome Portal.

Language: Perl - Size: 97.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 37 - Forks: 6

BioinformaticsLabAtMUN/Promotech

Machine-learning-based general bacterial promoter prediction tool.

Language: C - Size: 89.3 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 37 - Forks: 10

cggh/panoptes

Eyes on your (genomic) data

Language: JavaScript - Size: 51.4 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 37 - Forks: 6

bpucker/MGSE

Mapping-based Genome Size Estimation (MGSE) performs an estimation of a genome size based on a read mapping to an existing genome sequence assembly.

Language: Python - Size: 16.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 34 - Forks: 3

sjteresi/TE_Density

Python script calculating transposable element density for all genes in a genome. Publication: https://mobilednajournal.biomedcentral.com/articles/10.1186/s13100-022-00264-4

Language: Python - Size: 7.5 MB - Last synced at: 7 days ago - Pushed at: 11 months ago - Stars: 34 - Forks: 5

SirBob01/NEAT-Python

Genetic learning algorithm implementation for simulations, games, or general machine learning problems

Language: Python - Size: 321 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 34 - Forks: 10

RKMlab/perf

PERF is an Exhaustive Repeat Finder

Language: HTML - Size: 16 MB - Last synced at: 12 days ago - Pushed at: over 4 years ago - Stars: 34 - Forks: 11

junjunlab/BioSeqUtils

Extract Sequence from Genome According to Annotation File

Language: R - Size: 1.03 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 32 - Forks: 3

3DGenomes/TADkit 📦

3D Genome Browser

Language: JavaScript - Size: 99.8 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 31 - Forks: 10

jlab-code/MethylStar

A fast and robust pre-processing pipeline for bulk or single-cell whole-genome bisulfite sequencing (WGBS) data.

Language: Python - Size: 2.96 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 29 - Forks: 6

csoderlund/SyMAP

Synteny Mapping and Analysis Program

Language: Java - Size: 178 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 27 - Forks: 7

aquaskyline/16GT

Simultaneous detection of SNPs and Indels using a 16-genotype probabilistic model

Language: Perl - Size: 1.26 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 8

bcgsc/btllib

📚 Bioinformatics Technology Lab common code library

Language: C++ - Size: 12.9 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 25 - Forks: 6

nageshsinghc4/DNA-Sequence-Machine-learning

Understand DNA structure and how machine learning can be used to work with DNA sequence data.

Language: Python - Size: 2.83 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 25 - Forks: 23

GMOD/jbrowse-jupyter

A python package for showing JBrowse views

Language: Python - Size: 6.45 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 24 - Forks: 4

lpryszcz/pyScaf

Genome assembly scaffolding using information from paired-end/mate-pair libraries, long reads, and synteny to closely related species.

Language: Python - Size: 3.07 MB - Last synced at: 8 days ago - Pushed at: almost 7 years ago - Stars: 24 - Forks: 11

SouradiptoC/CodonU

A python project for analysis of codon usage for gene or genome analysis

Language: Python - Size: 64.4 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 23 - Forks: 1

Rinoahu/SwiftOrtho

A high performance tool to identify orthologs and paralogs across genomes.

Language: Python - Size: 20.1 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 22 - Forks: 12

resendislab/corda

An implementation of genome-scale model reconstruction using Cost Optimization Reaction Dependency Assessment by Schultz et al.

Language: Python - Size: 1.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 8

institut-de-genomique/HAPO-G

Hapo-G is a tool that aims to improve the quality of genome assemblies by polishing the consensus with accurate reads.

Language: C - Size: 177 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 2

GFA-spec/assembler-components

Components of genome sequence assembly tools

Language: Shell - Size: 41 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 21 - Forks: 3

sanger-pathogens/companion 📦

This repository has been archived, currently maintained version is at https://github.com/iii-companion/companion

Language: Lua - Size: 23.1 MB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 21 - Forks: 18

Phillip-a-richmond/GenomeAnalysisModule

Welcome to the website and github repository for the Genome Analysis Module. This website will guide the learning experience for trainees in the UBC MSc Genetic Counselling Training Program, as they embark on a journey to learn about analyzing genomes.

Language: HTML - Size: 10.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 20 - Forks: 6

wtsi-hpag/Scaff10X

Pipeline for scaffolding and breaking a genome assembly using 10x genomics linked-reads

Language: C - Size: 3.94 MB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 20 - Forks: 4

nf-core/references

nf-core/references is a bioinformatics pipeline that builds references for multiple use cases

Language: Nextflow - Size: 1.34 MB - Last synced at: 2 days ago - Pushed at: 9 days ago - Stars: 18 - Forks: 5

guigolab/tmerge

Merge transcriptome read-to-genome alignments into non-redundant transcript models

Language: Perl - Size: 1.16 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 18 - Forks: 3

teepean/BAM-Analysis-Kit

Language: Forth - Size: 2.18 GB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 18 - Forks: 6

devinus/genome

My genomic data

Language: Standard ML - Size: 5.61 MB - Last synced at: 6 days ago - Pushed at: over 5 years ago - Stars: 18 - Forks: 2

superphy/semantic

SuperPhy for the semantic web

Language: Web Ontology Language - Size: 40.1 MB - Last synced at: over 1 year ago - Pushed at: over 8 years ago - Stars: 18 - Forks: 3

iferres/MLSTar

An easy way of MLSTyping your genomes in R.

Language: R - Size: 4.37 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 17 - Forks: 10

wiedenhoeft/HaMMLET

Fast Bayesian Hidden Markov Model with Wavelet Compression

Language: C++ - Size: 2.52 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 17 - Forks: 4

zyxue/biogrinder

Grinder is a versatile open-source bioinformatic tool to create simulated omic shotgun and amplicon sequence libraries for all main sequencing platforms.

Language: Perl - Size: 246 KB - Last synced at: 4 months ago - Pushed at: about 7 years ago - Stars: 17 - Forks: 1

pyani-plus/pyani-plus

Development repo for pyani-plus (the next iteration of pyani)

Language: Python - Size: 12.8 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 16 - Forks: 2

swvanderlaan/MetaGWASToolKit

A ToolKit to perform a Meta-analysis of Genome-Wide Association Studies

Language: Shell - Size: 694 MB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 16 - Forks: 2

PASSIONLab/ELBA

Parallel String Graph Construction, Transitive Reduction, and Contig Generation for De Novo Genome Assembly

Language: C++ - Size: 106 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 10

tamerh/biobtree

A bioinformatics tool to search, map and retrieve identifiers, keywords and attributes

Language: Go - Size: 7.11 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 3

satoshikawato/gbdraw

A genome diagram generator for microbes and organelles

Language: Python - Size: 566 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 15 - Forks: 4