An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: genomics-data

ncbi/datasets

NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.

Language: Jupyter Notebook - Size: 20.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 430 - Forks: 51

AmpliconSuite/AmpliconRepository

Website to host AmpliconSuite outputs, including AA outputs and resulting focal amplification classifications, such as ecDNA.

Language: HTML - Size: 26.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4 - Forks: 5

lehner-lab/ABA_receptor Fork of MaximilianStammnitz/ABA_receptor

Companion scripts for DMS data processing, dose-response curve fitting and figure reproduction ("The genetic architecture of an allosteric hormone receptor", Stammnitz & Lehner, bioRxiv 2025)

Language: R - Size: 30.5 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

pdimens/mimick

Linked-read sequence simulator

Language: Python - Size: 13.2 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

ML-Bioinfo-CEITEC/genomic_benchmarks

Benchmarks for classification of genomic sequences

Language: Jupyter Notebook - Size: 24.2 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 145 - Forks: 20

MaximilianStammnitz/ABA_receptor

Companion scripts for DMS data processing, dose-response curve fitting and figure reproduction (Stammnitz & Lehner, bioRxiv 2025)

Language: R - Size: 30.5 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 1

remytuyeras/HaploDynamics

A Python library to develop genomic data simulators

Language: Python - Size: 1.01 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 0

HallLab/pandas-genomics

Pandas ExtensionDtypes for dealing with genomics data

Language: Python - Size: 9.18 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 47 - Forks: 8

AstraBert/simON-reads

simON-reads ("Simulate Oxford Nanopore Reads") is a simple yet powerful tool to generate FASTQ files containing MinION-like long reads

Language: Python - Size: 95.7 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

yaacoo/multiPGS_py

multiPGS_py is a fast, simple and low-memory Python method to calculate polygenic scores (PGS/PRS)

Language: Python - Size: 30.3 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 2 - Forks: 0

gmboowa/AMRSurveillanceDashboard

AMR Surveillance Dashboard is an interactive tool for visualizing AMR trends regionally. It provides dynamic insights into resistance patterns, genotypic markers, & lineage distribution across countries using genomic data. Designed for public health researchers & policy analysts, it supports real-time exploration of AMR surveillance data.

Language: Python - Size: 0 Bytes - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

ForomePlatform/AStorage-Java

AStorage - a specialized data server for Genomics data

Language: Java - Size: 2.45 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

kyegomez/Prometheus

Welcome to Prometheus, the revolutionary AI model that allows you to generate DNA sequences for any creature you can imagine. Whether it’s a pink panda, an elephant-sized turtle, or a completely new lifeform from your wildest dreams, Prometheus decodes the mystery of biology and synthesizes genetic blueprints with precision.

Language: Python - Size: 2.17 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 8 - Forks: 0

mixOmicsTeam/mixOmics

Development repository for the Bioconductor package 'mixOmics'

Language: R - Size: 16.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 192 - Forks: 61

pritampanda15/ML-Genomics

Machine learning in Genomics

Language: Jupyter Notebook - Size: 6.25 MB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

Replicon-genetics/rg_exploder_shared

Python code for generating synthetic sequence data: DNASEQ and RNASEQ reads for use as standards in genomics data analysis pipelines

Language: Python - Size: 209 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

IPK-BIT/divbrowse

A web application for interactive visualization and exploratory data analysis of variant call matrices

Language: Svelte - Size: 2.6 MB - Last synced at: 26 days ago - Pushed at: almost 2 years ago - Stars: 19 - Forks: 3

AstraBert/drosmel-in-asia

Drosophila melanogaster IndSeq and PoolSeq genomics data in Asia and Europe

Language: HTML - Size: 6.51 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

GenomicMedLab/wags-tails

Data acquisition tools for Wagnerds

Language: Python - Size: 303 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Sentieon/sentieon-dnascope-ml

Sentieon DNAscope + Machine Learning Model

Language: Shell - Size: 111 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 12 - Forks: 4

outbreak-info/python-outbreak-info

Python package to access the genomics and epidemiology data and Research Library metadata compiled and standardized on outbreak.info

Language: Jupyter Notebook - Size: 31.4 MB - Last synced at: 25 days ago - Pushed at: 4 months ago - Stars: 6 - Forks: 2

tseemann/kounta

🧮 🔢 Generate multi-sample k-mer count matrix from WGS

Language: Perl - Size: 93.8 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 12 - Forks: 3

ArjunBasandrai/tcga-paad-survival-analysis

Survival Analysis on the TCGA-PAAD (The Cancer Genome Atlas-Pancreatic Adenocarcinoma) dataset

Language: Jupyter Notebook - Size: 20.3 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

FredHutch/cbioportal-data-formatting

A repository with easy-to-follow instructions on how to prepare your study data files for upload into cBioPortal

Language: Python - Size: 101 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 4 - Forks: 1

ReddyLab/ggr-cwl-ipynb-gen

Jupyter notebook generator to download and execute the processing files for GGR related datasets

Language: Python - Size: 283 KB - Last synced at: 22 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 6

RicoLeiser/EZNCBIdownloader

Python-based tool to easily bulk download bacteria genomes from NCBI

Language: Python - Size: 54.7 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

kpatel427/R-scripts

Miscellaneous R scripts to wrangle data and generate visualizations

Language: R - Size: 146 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 15 - Forks: 10

EBISPOT/DUO

Ontology for consent codes and data use requirements

Language: Makefile - Size: 6.43 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 64 - Forks: 15

bio-ontology-research-group/STARVar

STARVar: Symptom-based Tool for Automatic Ranking of Variants using evidence from literature and genomes

Language: Python - Size: 152 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 5 - Forks: 1

ToledoEM/msigdf Fork of stephenturner/msigdf

Molecular Signatures Database (MSigDB) in a data frame

Language: HTML - Size: 239 MB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 23 - Forks: 6

theislab/cellrank_reproducibility

CellRank's reproducibility repository.

Language: Jupyter Notebook - Size: 135 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 14 - Forks: 8

MosaeSat/Livestreams

An open-source project to create a centralized hub for live streaming content.

Language: HTML - Size: 62.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

vangiangtran/BWGS

2024 BreedWheat Genomic Selection pipeline

Language: R - Size: 20 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 5

gitikabhardwaj/Genomic-Data-Exploration-Visualizing-Genetic-Variants-and-Allelic-Imbalance

Advanced bioinformatics analysis of RNA sequencing data and genomic databases using R. Explore allelic imbalances, SNP variants, and phylogenetic trees to uncover genetic insights and visualize complex data interactions.

Language: R - Size: 80.1 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

histolab/gdc-api-wrapper

Genomic Data Commons API wrapper

Language: Python - Size: 32.2 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 0

lucianhu/DNA-Methylation-Assessment

Applying R and Bioconductor to assess DNA Methylation

Size: 26.4 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

kagningemma/sapovirus-evolution

These are scripts relevant to the study of sapovirus intrahost and interhost genome evolution. The scripts are relevant to (1) the mining of inter-genotype evolution, (2) within- and between-host sapovirus genome evolutionary analyses, (3) sapovirus genetic divergence and diversity.

Language: Jupyter Notebook - Size: 17.9 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

s1monj/nft-peppers

Generative Art NFTs from Genomic Data

Language: JavaScript - Size: 118 MB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 20 - Forks: 8

BioJulia/ReadDatastores.jl

Datastores for reads, not your papa's FASTQ files.

Language: Julia - Size: 618 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 11 - Forks: 4

BioInterchange/BioInterchangeC

Genomic linked data converter. Interchange GFF, GVF, VCF files via JSON.

Language: C - Size: 14.5 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 1

ncherric/Iliad

ILIAD: A suite of automated Snakemake workflows for processing genomic data for downstream applications

Language: Python - Size: 54.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 2

danymukesha/ensembl-variant-lookup

Python tool providing a user-friendly interface for searching individual variants, running batch queries, and exploring gene regions from the Ensembl database.

Language: Python - Size: 26.7 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

lakhanp1/fungal_resources

R AnnotationDB resources like org.db, tx.db for various fungal species

Language: R - Size: 196 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 2

LotharukpongJS/phylomapr

Get precomputed gene age maps (phylomaps) in R

Language: R - Size: 12.1 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 2

mansikath/whole-genome-sequencing-pipeline

A comprehensive workflow for de novo assembly of whole-genome shotgun sequencing data using Velvet, followed by BLAST searches to analyze assembled contigs.

Language: HTML - Size: 1.96 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

GenomicsDB/GenomicsDB-R

Experimental R bindings to the native GenomicsDB library

Language: R - Size: 5.45 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 2

Sentieon/sentieon-dnaseq 📦

Sentieon DNAseq

Language: Shell - Size: 60.5 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 19 - Forks: 12

Arvindiyer/scRNA_seq_tutorial

A case study notebook for doing scRNA-seq analysis

Language: Jupyter Notebook - Size: 5.03 MB - Last synced at: 5 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1

kevin-wamae/plasmo-seq-grabber

A bash script for downloading sequence datasets from PlasmoDB, a Plasmodium informatics resource

Language: Shell - Size: 26.4 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

furkanmtorun/gnomad_python_api

🧬 gnomAD Python API is used to obtain data from gnomAD (Genome Aggregation Database).

Language: Python - Size: 280 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 17 - Forks: 9

alejandrorgijon/a_genomic_perspective_2022

Sup. Table 1 from Rodríguez-Gijón and Nuy et al., 2022 (Front. in Microbiology)

Size: 6.66 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ACDBio/AutoMR

An R script to perform two sample Mendelian randomization screening (with TwoSampleMR) for a custom summary statistic against a set of summary statistics from the IEU GWAS database.

Language: R - Size: 6.52 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 19 - Forks: 1

jkissing/jcklab-public

Kissinger Research Group Shared Code

Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ijazali678/2--Python-for-Genomic-Data-Science

The GitHub repository for genomic data science is a collection of resources, tools, and code snippets for analyzing and interpreting genomic data.

Language: Jupyter Notebook - Size: 21.6 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

outbreak-info/R-outbreak-info

R package to access the genomics and epidemiology data and Research Library metadata compiled and standardized on outbreak.info.

Language: R - Size: 91.1 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 9

elijahedmondson/HS.ATmutated

A collection of R scripts for genomics

Language: R - Size: 56.6 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

victorskl/genomic-bigdata-spark

Genomic BigData Warehousing with Apache Spark and LakeHouse Architecture

Language: Jupyter Notebook - Size: 172 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 0

thierrygosselin/gotedna

Guidance on optimal eDNA sampling periods to develop, optimize and interpret monitoring programs

Language: R - Size: 860 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

aWormGuy/Mansonella-Genomes-Sinha-et-al.-2023

The bioinformatic analysis pipelines and scripts used for de novo genome assembly and analysis of filarial parasites Mansonella perstans and Mansonella ozzardi

Language: Shell - Size: 170 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Sarah-Hesham-2022/BioStatistics-R-Coding-Using-Dataset-AntiProfilesData

Bio Statistics R Coding Using Dataset AntiProfilesData Genomics Data Structure Processing.

Language: R - Size: 259 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

GenGrim76/Pareto-Optimal-GDC

📉 Disk Storage of Compressed k-mer Dictionaries, with or without Random Access in Main Memory.

Language: Java - Size: 66.4 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

jajokine/Genomics-and-High-Dimensional-Data

MITx - MicroMasters Program on Statistics and Data Science - Data Analysis: Statistical Modeling and Computation in Applications - Second Project

Language: Jupyter Notebook - Size: 1.46 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

legacy-repo/level4-datamgt

Level 4 Data Management for Genomics Data

Language: JavaScript - Size: 10.1 MB - Last synced at: over 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

omics-lab/VirusTaxo_Hierarchical

Implementation of https://doi.org/10.1016/j.ygeno.2022.110414

Language: Roff - Size: 58 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

a-r-j/TB-Resources

List of online databases and resources for Mycobacterium tuberculosis

Size: 10.7 KB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 1

luisagmazuca/Post-Genomics

This repository contains the code files used to complete the final project for the course Post Genomic Analysis.

Language: Python - Size: 18.6 KB - Last synced at: 8 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

thondeboer/genomicsdatascience

Data science in genomics and healthcare

Language: Jupyter Notebook - Size: 2.25 MB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

petermchale/denoising_coverage_profiles

Using Convolutional Neural Networks to model an association between a genomic sequence and the number of sequenced reads that align to it

Language: Jupyter Notebook - Size: 44.5 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 1

biodata-fun/nf_genomics_pipelines

This repo contains pipelines implemented using the Nextflow workflow management system for the analysis of genomics data

Language: Nextflow - Size: 72.3 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

agenorrneto/tracking-the-enemy

An analysis of the SARS-CoV-2 genomic sequencing scenario in Brazil

Size: 372 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

jpwhalley/combat_sda_processing

Python code for the processing of the input data for SDA tensor and matrix decomposition and the resulting output data.

Language: Python - Size: 11.7 KB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

akshayparopkari/shell-genomics Fork of datacarpentry/shell-genomics

Covers the following bash commands: cat, chmod, cd, cp, curl, head, history, ls, less, man, mkdir, mv, nano, pwd, rm, tail, wc, wget

Language: Python - Size: 50.5 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

sturkarslan/evolution-of-syntrophy

Code and data related to analysis of genomics data

Language: R - Size: 39.4 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

ArnaudFickinger/LotteryTicket-PyTorch

Manifestation of the Lottery Ticket Phenomenon in supervised, reinforcement and unsupervised learning.

Language: Python - Size: 178 KB - Last synced at: over 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

NathanSiemers/FauxFlow

FauxFlow - Analyze Single Cell Genomics Data in a Flow Cytometry Paradigm

Language: R - Size: 90.1 MB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0