An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: scalable-data-analysis

parashardhapola/scarf

Toolkit for highly memory-efficient analysis of single-cell RNA-Seq, scATAC-Seq and CITE-Seq data. Analyze atlas-scale datasets with millions of cells on a laptop.

Language: Python - Size: 32.4 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 105 - Forks: 15

COM6012/ScalableML

COM6012 Scalable Machine Learning - University of Sheffield. Enjoy our resources? ⭐ Star this repository to show your support and help others discover it!

Language: HTML - Size: 268 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 88 - Forks: 85

kaydotdev/stochastic-quantization

Robust and Scalable Stochastic Quasi-Gradient Clustering

Language: Jupyter Notebook - Size: 17.7 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 1

efeag/aga-MSDA

This repository contains projects completed during my graduate study in Data Science & Analytics at the J. Mack Robinson College of Business, Georgia State University. I worked as part of a team of 4 or 6 members, and we contributed equally to completing tasks and preparing the final documentation (code file, report & PowerPoint presentation).

Language: Jupyter Notebook - Size: 6.87 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

emmalanguage/emma

A quotation-based Scala DSL for scalable data analysis.

Language: Scala - Size: 9.16 MB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 63 - Forks: 19

terilios/automated_data_scientist

Automated Data Scientist: An intelligent, adaptive data analysis tool that leverages AI-driven automation to dynamically plan, execute, and refine data science workflows. Automatically handles data preparation, analysis planning, code generation, and result interpretation using advanced language models.

Language: Python - Size: 207 KB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 1

YogiOnBioinformatics/Computational-Drug-Discovery-Internship-at-Merck

Description of work done at Merck pharmaceutical company in the summer of 2018 as a Computational Drug Discovery Intern at West Point, PA. Information excludes all proprietary information belonging to Merck & Co.

Language: Python - Size: 293 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 2

Caleydo/lineupjs (fork of lineupjs/lineupjs)

Fork and custom implementation of LineUp Library for Visual Analysis of Multi-Attribute

Language: TypeScript - Size: 11.3 MB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 70 - Forks: 4

Caleydo/taggle 📦

Deprecated: use the lineup.js develop branch instead.

Language: TypeScript - Size: 416 KB - Last synced at: 11 months ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

mmaguero/cloud-based-tool-SA

A cloud-based tool for sentiment analysis of restaurant reviews on TripAdvisor

Language: Python - Size: 12.9 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

lapets/course-data-mechanics

Lecture notes and other materials for a one-semester course on data mechanics.

Language: HTML - Size: 73.2 KB - Last synced at: 3 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1

JayLohokare/sparkGIS

Spark GIS (Docker + Flask Webserver + SparkGIS)

Language: Java - Size: 11.4 MB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

manuparra/knowledgegraphs

Knowledge data processing

Language: HTML - Size: 92.8 KB - Last synced at: 6 months ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 1

emmalanguage/emma-lib

Language: Scala - Size: 146 KB - Last synced at: 3 months ago - Pushed at: about 8 years ago - Stars: 2 - Forks: 4

Related Keywords
scalable-data-analysis (14), python (3), data-visualization (2), data-science (2), visualization (2), research (2), table (2), numpy (2), dsl (2), emma (2), machine-learning (2), scala (2), clustering (2), openai-gpt (1), computational-proteomics (1), containerization (1), cloud-native (1), computer-science (1), vis (1), typescript (1), drug-development (1), ranking (1), drug-discovery (1), multi-attribute (1), drug-targets (1), lineup (1), high-performance (1), labkey (1), javascript-library (1), d3 (1), lc-ms (1), pharmaceuticals (1), xml (1), tkinter-python (1), swissprot (1), publication (1), proteomics (1), pharmacokinetics (1), pharmacodynamics (1), library (1), knowledge-representation (1), knowledge-graph (1), d3js (1), spatial-analysis (1), inmemory-db (1), geospatial-database (1), distributed-systems (1), apache-spark (1), urban-data-science (1), optimization-algorithms (1), optimization (1), nosql (1), model-checking (1), introduction-to-statistics (1), graph-algorithms (1), geojson (1), data-mechanics (1), vagrant-box (1), vader-sentiment-analysis (1), sentiment-analysis (1), scrapy (1), scalable-machine-learning (1), rules-based (1), nginx-proxy (1), mongodb (1), lexicon-based (1), end-to-end-pipeline (1), docker-image (1), docker-compose (1), django (1), stochastic-optimization (1), sgd (1), scikit-learn (1), quantization (1), pytorch (1), paper (1), k-means-clustering (1), k-means (1), jupyter-notebook (1), academic-paper (1), open-course (1), umap (1), tsne (1), single-cell-rna-seq (1), single-cell-genomics (1), single-cell-atac-seq (1), scrna-seq (1), scatac-seq (1), pseudotime (1), multiomics (1), memory-efficient (1), graph-analytics (1), genomics (1), gene-modules (1), dimension-reduction (1), differential-expression (1), cite-seq (1), bioinformatics (1), big-data (1), ml-ops (1)