GitHub topics: genomics-data
ncbi/datasets
NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.
Language: Jupyter Notebook - Size: 20.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 430 - Forks: 51

AmpliconSuite/AmpliconRepository
Website to host AmpliconSuite outputs, including AA outputs and resulting focal amplification classifications, such as ecDNA.
Language: HTML - Size: 26.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4 - Forks: 5

lehner-lab/ABA_receptor Fork of MaximilianStammnitz/ABA_receptor
Companion scripts for DMS data processing, dose-response curve fitting and figure reproduction ("The genetic architecture of an allosteric hormone receptor", Stammnitz & Lehner, bioRxiv 2025)
Language: R - Size: 30.5 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

pdimens/mimick
Linked-read sequence simulator
Language: Python - Size: 13.2 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

ML-Bioinfo-CEITEC/genomic_benchmarks
Benchmarks for classification of genomic sequences
Language: Jupyter Notebook - Size: 24.2 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 145 - Forks: 20

MaximilianStammnitz/ABA_receptor
Companion scripts for DMS data processing, dose-response curve fitting and figure reproduction (Stammnitz & Lehner, bioRxiv 2025)
Language: R - Size: 30.5 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 1

remytuyeras/HaploDynamics
A Python library to develop genomic data simulators
Language: Python - Size: 1.01 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 0

HallLab/pandas-genomics
Pandas ExtensionDtypes for dealing with genomics data
Language: Python - Size: 9.18 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 47 - Forks: 8

AstraBert/simON-reads
simON-reads ("Simulate Oxford Nanopore Reads") is a simple yet powerful tool to generate FASTQ files containing MinION-like long reads
Language: Python - Size: 95.7 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

yaacoo/multiPGS_py
multiPGS_py is a fast, simple and low-memory Python method to calculate polygenic scores (PGS/PRS)
Language: Python - Size: 30.3 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 2 - Forks: 0

gmboowa/AMRSurveillanceDashboard
AMR Surveillance Dashboard is an interactive tool for visualizing AMR trends regionally. It provides dynamic insights into resistance patterns, genotypic markers, & lineage distribution across countries using genomic data. Designed for public health researchers & policy analysts, it supports real-time exploration of AMR surveillance data.
Language: Python - Size: 0 Bytes - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

ForomePlatform/AStorage-Java
AStorage - a specialized data server for Genomics data
Language: Java - Size: 2.45 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

kyegomez/Prometheus
Welcome to Prometheus, the revolutionary AI model that allows you to generate DNA sequences for any creature you can imagine. Whether it’s a pink panda, an elephant-sized turtle, or a completely new lifeform from your wildest dreams, Prometheus decodes the mystery of biology and synthesizes genetic blueprints with precision.
Language: Python - Size: 2.17 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 8 - Forks: 0

mixOmicsTeam/mixOmics
Development repository for the Bioconductor package 'mixOmics'
Language: R - Size: 16.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 192 - Forks: 61

pritampanda15/ML-Genomics
Machine learning in Genomics
Language: Jupyter Notebook - Size: 6.25 MB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

Replicon-genetics/rg_exploder_shared
Python code for generating synthetic sequence data: DNASEQ and RNASEQ reads for use as standards in genomics data analysis pipelines
Language: Python - Size: 209 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

IPK-BIT/divbrowse
A web application for interactive visualization and exploratory data analysis of variant call matrices
Language: Svelte - Size: 2.6 MB - Last synced at: 26 days ago - Pushed at: almost 2 years ago - Stars: 19 - Forks: 3

AstraBert/drosmel-in-asia
Drosophila melanogaster IndSeq and PoolSeq genomics data in Asia and Europe
Language: HTML - Size: 6.51 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

GenomicMedLab/wags-tails
Data acquisition tools for Wagnerds
Language: Python - Size: 303 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Sentieon/sentieon-dnascope-ml
Sentieon DNAscope + Machine Learning Model
Language: Shell - Size: 111 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 12 - Forks: 4

outbreak-info/python-outbreak-info
Python package to access the genomics and epidemiology data and Research Library metadata compiled and standardized on outbreak.info
Language: Jupyter Notebook - Size: 31.4 MB - Last synced at: 25 days ago - Pushed at: 4 months ago - Stars: 6 - Forks: 2

tseemann/kounta
🧮 🔢 Generate multi-sample k-mer count matrix from WGS
Language: Perl - Size: 93.8 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 12 - Forks: 3

ArjunBasandrai/tcga-paad-survival-analysis
Survival Analysis on the TCGA-PAAD (The Cancer Genome Atlas-Pancreatic Adenocarcinoma) dataset
Language: Jupyter Notebook - Size: 20.3 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

FredHutch/cbioportal-data-formatting
A repository with easy-to-follow instructions on how to prepare your study data files for upload into cBioPortal
Language: Python - Size: 101 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 4 - Forks: 1

ReddyLab/ggr-cwl-ipynb-gen
Jupyter notebook generator to download and execute the processing files for GGR related datasets
Language: Python - Size: 283 KB - Last synced at: 22 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 6

RicoLeiser/EZNCBIdownloader
Python-based tool to easily bulk download bacteria genomes from NCBI
Language: Python - Size: 54.7 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

kpatel427/R-scripts
Miscellaneous R scripts to wrangle data and generate visualizations
Language: R - Size: 146 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 15 - Forks: 10

EBISPOT/DUO
Ontology for consent codes and data use requirements
Language: Makefile - Size: 6.43 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 64 - Forks: 15

bio-ontology-research-group/STARVar
STARVar: Symptom-based Tool for Automatic Ranking of Variants using evidence from literature and genomes
Language: Python - Size: 152 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 5 - Forks: 1

ToledoEM/msigdf Fork of stephenturner/msigdf
Molecular Signatures Database (MSigDB) in a data frame
Language: HTML - Size: 239 MB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 23 - Forks: 6

theislab/cellrank_reproducibility
CellRank's reproducibility repository.
Language: Jupyter Notebook - Size: 135 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 14 - Forks: 8

MosaeSat/Livestreams
An open-source project to create a centralized hub for live streaming content.
Language: HTML - Size: 62.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

vangiangtran/BWGS
2024 BreedWheat Genomic Selection pipeline
Language: R - Size: 20 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 5

gitikabhardwaj/Genomic-Data-Exploration-Visualizing-Genetic-Variants-and-Allelic-Imbalance
Advanced bioinformatics analysis of RNA sequencing data and genomic databases using R. Explore allelic imbalances, SNP variants, and phylogenetic trees to uncover genetic insights and visualize complex data interactions.
Language: R - Size: 80.1 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

histolab/gdc-api-wrapper
Genomic Data Commons API wrapper
Language: Python - Size: 32.2 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 0

lucianhu/DNA-Methylation-Assessment
Applying R and Bioconductor to assess DNA Methylation
Size: 26.4 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

kagningemma/sapovirus-evolution
These are scripts relevant to the study of sapovirus intrahost and interhost genome evolution. The scripts are relevant to (1) the mining of inter-genotype evolution, (2) within and between-host sapovirus genome evolutionary analyses, (3) sapovirus genetic divergence and diversity.
Language: Jupyter Notebook - Size: 17.9 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

s1monj/nft-peppers
Generative Art NFTs from Genomic Data
Language: JavaScript - Size: 118 MB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 20 - Forks: 8

BioJulia/ReadDatastores.jl
Datastores for reads, not your papa's FASTQ files.
Language: Julia - Size: 618 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 11 - Forks: 4

BioInterchange/BioInterchangeC
Genomic linked data converter. Interchange GFF, GVF, VCF files via JSON.
Language: C - Size: 14.5 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 1

ncherric/Iliad
ILIAD: A suite of automated Snakemake workflows for processing genomic data for downstream applications
Language: Python - Size: 54.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 2

danymukesha/ensembl-variant-lookup
Python tool that provides a user-friendly interface for searching individual variants, running batch queries, and exploring gene regions from the Ensembl database.
Language: Python - Size: 26.7 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

lakhanp1/fungal_resources
R AnnotationDB resources like org.db, tx.db for various fungal species
Language: R - Size: 196 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 2

LotharukpongJS/phylomapr
Get precomputed gene age maps (phylomaps) in R
Language: R - Size: 12.1 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 2

mansikath/whole-genome-sequencing-pipeline
A comprehensive workflow for de novo assembly of whole-genome shotgun sequencing data using Velvet, followed by BLAST searches to analyze assembled contigs.
Language: HTML - Size: 1.96 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

GenomicsDB/GenomicsDB-R
Experimental R bindings to the native GenomicsDB library
Language: R - Size: 5.45 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 2

Sentieon/sentieon-dnaseq 📦
Sentieon DNAseq
Language: Shell - Size: 60.5 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 19 - Forks: 12

Arvindiyer/scRNA_seq_tutorial
A case study notebook for doing scRNA seq analysis
Language: Jupyter Notebook - Size: 5.03 MB - Last synced at: 5 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1

kevin-wamae/plasmo-seq-grabber
A bash script for downloading sequence datasets from PlasmoDb, a Plasmodium informatics resource
Language: Shell - Size: 26.4 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

furkanmtorun/gnomad_python_api
🧬 gnomAD Python API is used to obtain data from gnomAD (genome aggregation database).
Language: Python - Size: 280 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 17 - Forks: 9

alejandrorgijon/a_genomic_perspective_2022
Sup. Table 1 from Rodríguez-Gijón and Nuy et al., 2022 (Front. in Microbiology)
Size: 6.66 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ACDBio/AutoMR
An R script to perform two sample Mendelian randomization screening (with TwoSampleMR) for a custom summary statistic against a set of summary statistics from the IEU GWAS database.
Language: R - Size: 6.52 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 19 - Forks: 1

jkissing/jcklab-public
Kissinger Research Group Shared Code
Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ijazali678/2--Python-for-Genomic-Data-Science
The GitHub repository for genomic data science is a collection of resources, tools, and code snippets for analyzing and interpreting genomic data.
Language: Jupyter Notebook - Size: 21.6 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

outbreak-info/R-outbreak-info
R package to access the genomics and epidemiology data and Research Library metadata compiled and standardized on outbreak.info.
Language: R - Size: 91.1 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 9

elijahedmondson/HS.ATmutated
A collection of R scripts for genomics
Language: R - Size: 56.6 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

victorskl/genomic-bigdata-spark
Genomic BigData Warehousing with Apache Spark and LakeHouse Architecture
Language: Jupyter Notebook - Size: 172 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 0

thierrygosselin/gotedna
Guidance on optimal eDNA sampling periods to develop, optimize and interpret monitoring programs
Language: R - Size: 860 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

aWormGuy/Mansonella-Genomes-Sinha-et-al.-2023
The bioinformatic analysis pipelines and scripts used for de novo genome assembly and analysis of filarial parasites Mansonella perstans and Mansonella ozzardi
Language: Shell - Size: 170 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Sarah-Hesham-2022/BioStatistics-R-Coding-Using-Dataset-AntiProfilesData
Bio Statistics R Coding Using Dataset AntiProfilesData Genomics Data Structure Processing.
Language: R - Size: 259 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

GenGrim76/Pareto-Optimal-GDC
📉 Disk Storage of Compressed k-mer Dictionaries, with or without Random Access in Main Memory.
Language: Java - Size: 66.4 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

jajokine/Genomics-and-High-Dimensional-Data
MITx - MicroMasters Program on Statistics and Data Science - Data Analysis: Statistical Modeling and Computation in Applications - Second Project
Language: Jupyter Notebook - Size: 1.46 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

legacy-repo/level4-datamgt
Level 4 Data Management for Genomics Data
Language: JavaScript - Size: 10.1 MB - Last synced at: over 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

omics-lab/VirusTaxo_Hierarchical
Implementation of https://doi.org/10.1016/j.ygeno.2022.110414
Language: Roff - Size: 58 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

a-r-j/TB-Resources
List of online databases and resources for Mycobacterium tuberculosis
Size: 10.7 KB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 1

luisagmazuca/Post-Genomics
This repository contains the code files used to complete the final project for the course Post Genomic Analysis.
Language: Python - Size: 18.6 KB - Last synced at: 8 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

thondeboer/genomicsdatascience
Data science in genomics and healthcare
Language: Jupyter Notebook - Size: 2.25 MB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

petermchale/denoising_coverage_profiles
Using Convolutional Neural Networks to model an association between a genomic sequence and the number of sequenced reads that align to it
Language: Jupyter Notebook - Size: 44.5 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 1

biodata-fun/nf_genomics_pipelines
This repo contains pipelines implemented using the Nextflow workflow management system for the analysis of genomics data
Language: Nextflow - Size: 72.3 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

agenorrneto/tracking-the-enemy
An analysis of the SARS-CoV-2 genomic sequencing scenario in Brazil
Size: 372 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

jpwhalley/combat_sda_processing
Python code for the processing of the input data for SDA tensor and matrix decomposition and the resulting output data.
Language: Python - Size: 11.7 KB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

akshayparopkari/shell-genomics Fork of datacarpentry/shell-genomics
Covers the following bash commands: cat, chmod, cd, cp, curl, head, history, ls, less, man, mkdir, mv, nano, pwd, rm, tail, wc, wget
Language: Python - Size: 50.5 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

sturkarslan/evolution-of-syntrophy
Code and data related to analysis of genomics data
Language: R - Size: 39.4 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

ArnaudFickinger/LotteryTicket-PyTorch
Manifestation of the Lottery Ticket Phenomenon in supervised, reinforcement and unsupervised learning.
Language: Python - Size: 178 KB - Last synced at: over 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

NathanSiemers/FauxFlow
FauxFlow - Analyze Single Cell Genomics Data in a Flow Cytometry Paradigm
Language: R - Size: 90.1 MB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0
