Topic: "data"
TanStack/query
🤖 Powerful asynchronous state management, server-state utilities and data fetching for the web. TS/JS, React Query, Solid Query, Svelte Query and Vue Query.
Language: TypeScript - Size: 92.3 MB - Last synced at: 2 days ago - Pushed at: 4 days ago - Stars: 47,881 - Forks: 3,620
run-llama/llama_index
LlamaIndex is the leading framework for building LLM-powered agents over your data.
Language: Python - Size: 362 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 45,922 - Forks: 6,651
metabase/metabase
The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊
Language: Clojure - Size: 1.38 GB - Last synced at: 18 days ago - Pushed at: 19 days ago - Stars: 44,855 - Forks: 6,065
DataExpert-io/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
Language: Jupyter Notebook - Size: 59.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 38,528 - Forks: 7,412
SheetJS/sheetjs
📗 SheetJS Spreadsheet Data Toolkit -- New home https://git.sheetjs.com/SheetJS/sheetjs
Size: 101 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 36,087 - Forks: 7,990
vercel/swr
React Hooks for Data Fetching
Language: TypeScript - Size: 7.95 MB - Last synced at: 10 days ago - Pushed at: 12 days ago - Stars: 32,211 - Forks: 1,304
sinaptik-ai/pandas-ai
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Language: Python - Size: 54.8 MB - Last synced at: 11 days ago - Pushed at: about 2 months ago - Stars: 22,808 - Forks: 2,234
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Language: Python - Size: 187 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 21,142 - Forks: 2,039
airbytehq/airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Language: Python - Size: 756 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 20,303 - Forks: 4,973
fivethirtyeight/data
Data and code behind the articles and graphics at FiveThirtyEight
Language: Jupyter Notebook - Size: 155 MB - Last synced at: 7 months ago - Pushed at: 10 months ago - Stars: 17,060 - Forks: 11,127
prestodb/presto
The official home of the Presto distributed SQL query engine for big data
Language: Java - Size: 250 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 16,594 - Forks: 5,509
akfamily/akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
Language: Python - Size: 4.85 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 14,899 - Forks: 2,659
faker-js/faker
Generate massive amounts of fake data in the browser and node.js
Language: TypeScript - Size: 30.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 14,747 - Forks: 1,037
oxnr/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
Size: 845 KB - Last synced at: 5 days ago - Pushed at: 28 days ago - Stars: 14,102 - Forks: 2,589
pwxcoo/chinese-xinhua
📙 Database of the Chinese Xinhua Dictionary, including xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.
Language: Python - Size: 34.6 MB - Last synced at: 7 months ago - Pushed at: almost 2 years ago - Stars: 11,204 - Forks: 2,621
apple/pkl
A configuration as code language with rich validation and tooling.
Language: Java - Size: 7.12 MB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 10,981 - Forks: 348
PRQL/prql
PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
Language: Rust - Size: 22.9 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 10,572 - Forks: 247
bchavez/Bogus
📇 A simple fake data generator for C#, F#, and VB.NET. Based on and ported from the famed faker.js.
Language: C# - Size: 6.12 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 9,540 - Forks: 536
rawgraphs/rawgraphs-app
A web interface to create custom vector-based visualizations on top of RAWGraphs core
Language: JavaScript - Size: 51 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 8,893 - Forks: 1,857
mage-ai/mage-ai
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Language: Python - Size: 233 MB - Last synced at: 13 days ago - Pushed at: 14 days ago - Stars: 8,583 - Forks: 893
D4Vinci/Scrapling
🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!
Language: Python - Size: 4.02 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 8,318 - Forks: 475
mrdbourke/machine-learning-roadmap
A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
Size: 24.8 MB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 7,740 - Forks: 1,168
olifolkerd/tabulator
Interactive Tables and Data Grids for JavaScript
Language: JavaScript - Size: 86 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 7,403 - Forks: 871
snowplow/snowplow
The leader in Next-Generation Customer Data Infrastructure
Language: Scala - Size: 25.5 MB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 6,924 - Forks: 1,189
flyteorg/flyte
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
Language: Go - Size: 331 MB - Last synced at: about 22 hours ago - Pushed at: 1 day ago - Stars: 6,643 - Forks: 767
cloudquery/cloudquery
Data pipelines for cloud config and security data. Build cloud asset inventory, CSPM, FinOps, and vulnerability management solutions. Extract from AWS, Azure, GCP, and 70+ cloud and SaaS sources.
Language: Go - Size: 179 MB - Last synced at: 6 days ago - Pushed at: 8 days ago - Stars: 6,277 - Forks: 544
dformoso/machine-learning-mindmap
A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.
Size: 14.8 MB - Last synced at: 7 months ago - Pushed at: over 5 years ago - Stars: 6,193 - Forks: 1,007
axa-group/Parsr
Transforms PDF, Documents and Images into Enriched Structured Data
Language: JavaScript - Size: 52.6 MB - Last synced at: 4 months ago - Pushed at: about 2 years ago - Stars: 6,001 - Forks: 318
cue-lang/cue
The home of the CUE language! Validate and define text-based and dynamic configuration
Language: Go - Size: 56.7 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 5,848 - Forks: 346
Countly/countly-server
Countly is a product analytics platform that helps teams track, analyze and act on their users' actions and behaviour on mobile, web and desktop applications.
Language: JavaScript - Size: 665 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 5,804 - Forks: 979
datajuicer/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Language: Python - Size: 723 MB - Last synced at: 3 days ago - Pushed at: 6 days ago - Stars: 5,643 - Forks: 304
airbnb/knowledge-repo
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Language: Python - Size: 74 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 5,532 - Forks: 685
mdn/browser-compat-data
Browser compatibility data for Web technologies as displayed on MDN
Language: JSON - Size: 113 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 5,522 - Forks: 2,435
brianvoe/gofakeit
Random fake data generator written in Go
Language: Go - Size: 7.8 MB - Last synced at: 6 days ago - Pushed at: 21 days ago - Stars: 5,264 - Forks: 293
superduper-io/superduper
Superduper: End-to-end framework for building custom AI applications and agents.
Language: Python - Size: 73.8 MB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 5,233 - Forks: 532
cocoindex-io/cocoindex
Data transformation framework for AI. Ultra performant, with incremental processing. 🌟 Star if you like it!
Language: Rust - Size: 97.4 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 4,960 - Forks: 371
ckan/ckan
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
Language: Python - Size: 215 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 4,912 - Forks: 2,071
tinyplex/tinybase
A reactive data store & sync engine.
Language: TypeScript - Size: 363 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 4,833 - Forks: 117
glideapps/glide-data-grid
🚀 Glide Data Grid is a no compromise, outrageously fast react data grid with rich rendering, first class accessibility, and full TypeScript support.
Language: TypeScript - Size: 95.4 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 4,714 - Forks: 368
dlt-hub/dlt
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
Language: Python - Size: 101 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 4,707 - Forks: 412
ArroyoSystems/arroyo
Distributed stream processing engine in Rust
Language: Rust - Size: 15.6 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 4,705 - Forks: 325
lk-geimfari/mimesis
Mimesis is a fast Python library for generating fake data in multiple languages.
Language: Python - Size: 33.9 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 4,655 - Forks: 346
tensorflow/datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Language: Python - Size: 952 MB - Last synced at: 18 days ago - Pushed at: 20 days ago - Stars: 4,507 - Forks: 1,592
StructuredLabs/preswald
Preswald is a WASM packager for Python-based interactive data apps: bundle full complex data workflows, particularly visualizations, into single files, runnable completely in-browser, using Pyodide, DuckDB, Pandas, and Plotly, Matplotlib, etc. Build dashboards, reports, and notebooks that run offline, load fast, and share like a document.
Language: Python - Size: 97.2 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 4,309 - Forks: 666
jonschlinkert/gray-matter
Smarter YAML front matter parser, used by metalsmith, Gatsby, Netlify, Assemble, mapbox-gl, phenomic, vuejs vitepress, TinaCMS, Shopify Polaris, Ant Design, Astro, hashicorp, garden, slidev, saber, sourcegraph, and many others. Simple to use, and battle tested. Parses YAML by default but can also parse JSON Front Matter, Coffee Front Matter, TOML Front Matter, and has support for custom parsers. Please follow gray-matter's author: https://github.com/jonschlinkert
Language: JavaScript - Size: 342 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 4,293 - Forks: 151
truefoundry/cognita
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
Language: Python - Size: 50.3 MB - Last synced at: 28 days ago - Pushed at: 29 days ago - Stars: 4,289 - Forks: 361
speedyapply/2026-AI-College-Jobs
2026 AI/ML internship & new graduate job list updated daily
Size: 4.28 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 4,249 - Forks: 172
Quartz/bad-data-guide
An exhaustive reference to problems seen in real-world data along with suggestions on how to resolve them.
Size: 125 KB - Last synced at: 6 months ago - Pushed at: over 4 years ago - Stars: 4,071 - Forks: 403
quadratichq/quadratic
Spreadsheet with AI, Code, Connections
Language: TypeScript - Size: 1.33 GB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 3,898 - Forks: 254
mlabonne/llm-datasets
Curated list of datasets and tools for post-training.
Size: 103 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3,850 - Forks: 318
LazyAGI/LazyLLM
The easiest and laziest way to build multi-agent LLM applications.
Language: Python - Size: 14.6 MB - Last synced at: about 15 hours ago - Pushed at: about 21 hours ago - Stars: 3,637 - Forks: 352
Belval/TextRecognitionDataGenerator
A synthetic data generator for text recognition
Language: Python - Size: 149 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 3,612 - Forks: 1,017
dtinit/data-transfer-project
The Data Transfer Project makes it easy for platforms to build interoperable user data portability features. We are establishing a common framework, including data models and protocols, to enable direct transfer of data both into and out of participating online service providers.
Language: Java - Size: 10.7 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3,594 - Forks: 487
jdorfman/awesome-json-datasets
A curated list of awesome JSON datasets that don't require authentication.
Language: JavaScript - Size: 238 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 3,526 - Forks: 386
Docta-ai/docta
A Doctor for your data
Language: Python - Size: 27.8 MB - Last synced at: 3 months ago - Pushed at: 12 months ago - Stars: 3,478 - Forks: 258
heroku/react-refetch
A simple, declarative, and composable way to fetch data for React components
Language: JavaScript - Size: 1.04 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 3,422 - Forks: 141
superstreamlabs/memphis
Memphis.dev is a highly scalable and effortless data streaming platform
Language: Go - Size: 468 MB - Last synced at: 25 days ago - Pushed at: over 1 year ago - Stars: 3,417 - Forks: 229
ngneat/falso
All the Fake Data for All Your Real Needs 🙂
Language: TypeScript - Size: 11.4 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 3,322 - Forks: 121
ucbepic/docetl
A system for agentic LLM-powered data processing and ETL
Language: Python - Size: 62.3 MB - Last synced at: 3 days ago - Pushed at: 6 days ago - Stars: 3,290 - Forks: 352
ruc-datalab/DeepAnalyze
DeepAnalyze is the first agentic LLM for autonomous data science. 🎈 Your AI data analyst: it automatically analyzes large volumes of data and generates professional analysis reports in one click!
Language: Python - Size: 22.5 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 3,191 - Forks: 472
pydata/pandas-datareader
Extract data from a wide range of Internet sources into a pandas DataFrame.
Language: Python - Size: 12.3 MB - Last synced at: 19 days ago - Pushed at: 9 months ago - Stars: 3,127 - Forks: 684
uber/aresdb
A GPU-powered real-time analytics storage and query engine.
Language: Go - Size: 12.4 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 3,065 - Forks: 235
Kanaries/graphic-walker
An open source alternative to Tableau. Embeddable visual analytics
Language: TypeScript - Size: 3.73 MB - Last synced at: 25 days ago - Pushed at: 28 days ago - Stars: 3,012 - Forks: 163
weld-project/weld
High-performance runtime for data analytics applications
Language: Rust - Size: 2.88 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 3,001 - Forks: 256
montanaflynn/stats
A well tested and comprehensive Golang statistics library package with no dependencies.
Language: Go - Size: 333 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 2,992 - Forks: 170
datafold/data-diff 📦
Compare tables within or across databases
Language: Python - Size: 3.98 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 2,987 - Forks: 295
apache/incubator-devlake
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
Language: Go - Size: 38.8 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 2,884 - Forks: 659
kayak/pypika
PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
Language: Python - Size: 1.27 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 2,753 - Forks: 319
spiceai/spiceai
A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
Language: Rust - Size: 66.9 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 2,645 - Forks: 150
spotify/scio
A Scala API for Apache Beam and Google Cloud Dataflow.
Language: Scala - Size: 89.5 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 2,614 - Forks: 526
mito-ds/mito
Jupyter extensions that help you write code faster: Context aware AI Chat, Autocomplete, and Spreadsheet
Language: Jupyter Notebook - Size: 278 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2,601 - Forks: 205
unsplash/datasets
🎁 6,500,000+ Unsplash images made available for research and machine learning
Language: Jupyter Notebook - Size: 70.3 KB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 2,597 - Forks: 131
justinzm/gopup
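Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; Qianlima (high-growth) and unicorn companies; Xinwen Lianbo transcripts; film and TV box office data; university lists; epidemic data…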
Language: Python - Size: 689 KB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 2,561 - Forks: 387
EntilZha/PyFunctional
Python library for creating data pipelines with chain functional programming
Language: Python - Size: 893 KB - Last synced at: about 3 hours ago - Pushed at: 10 months ago - Stars: 2,487 - Forks: 133
colour-science/colour
Colour Science for Python
Language: Python - Size: 124 MB - Last synced at: 3 days ago - Pushed at: 6 days ago - Stars: 2,469 - Forks: 279
deepnote/deepnote
Deepnote is a drop-in replacement for Jupyter with an AI-first design, sleek UI, new blocks, and native data integrations. Use Python, R, and SQL locally in your favorite IDE, then scale to Deepnote cloud for real-time collaboration, Deepnote agent, and deployable data apps. https://deepnote.com/
Language: TypeScript - Size: 20.7 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 2,453 - Forks: 157
rilldata/rill
Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.
Language: Go - Size: 570 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 2,426 - Forks: 159
any4ai/AnyCrawl
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
Language: TypeScript - Size: 1.68 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2,423 - Forks: 247
github/CodeSearchNet 📦
Datasets, tools, and benchmarks for representation learning of code.
Language: Jupyter Notebook - Size: 28.6 MB - Last synced at: 2 months ago - Pushed at: almost 4 years ago - Stars: 2,378 - Forks: 408
lukes/ISO-3166-Countries-with-Regional-Codes
ISO 3166-1 country lists merged with their UN Geoscheme regional codes in ready-to-use JSON, XML, CSV data sets
Language: Ruby - Size: 188 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 2,375 - Forks: 3,322
Visualize-ML/Book6_First-Course-in-Data-Science
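Book_6_《数据有道》 | Iris Book series: from basic arithmetic to machine learning. Feedback and corrections are welcome! Readers who report many errors will receive a free book as thanks!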
Language: Jupyter Notebook - Size: 169 MB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 2,347 - Forks: 432
emirozer/fake2db
Create custom test databases that are populated with fake data
Language: Python - Size: 1020 KB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 2,341 - Forks: 124
malloydata/malloy
Malloy is a modern open source language for describing data relationships and transformations.
Language: TypeScript - Size: 339 MB - Last synced at: 10 days ago - Pushed at: 12 days ago - Stars: 2,315 - Forks: 113
meltano/meltano
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Language: Python - Size: 145 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 2,293 - Forks: 191
approximatelabs/sketch
AI code-writing assistant that understands data content
Language: Python - Size: 8.98 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 2,287 - Forks: 119
benkeen/generatedata
A powerful, feature-rich, random test data generator.
Language: TypeScript - Size: 82.5 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 2,272 - Forks: 620
TigerResearch/TigerBot
TigerBot: A multi-language multi-task LLM
Language: Python - Size: 74.2 MB - Last synced at: 11 days ago - Pushed at: 12 months ago - Stars: 2,261 - Forks: 190
apache/gobblin
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
Language: Java - Size: 128 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 2,257 - Forks: 749
MarcSkovMadsen/awesome-streamlit
The purpose of this project is to share knowledge on how awesome Streamlit is and can be
Language: HTML - Size: 115 MB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 2,232 - Forks: 367
DeepInsight-AI/DeepBI
LLM based data scientist, AI native data application. AI-driven infinite thinking redefines BI.
Language: Python - Size: 134 MB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 2,227 - Forks: 357
GSA/data
Assorted data from the General Services Administration.
Language: HTML - Size: 10.9 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 2,168 - Forks: 276
pretzelai/pretzelai
The modern replacement for Jupyter Notebooks
Language: TypeScript - Size: 264 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 2,155 - Forks: 155
man-group/ArcticDB
ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.
Language: C++ - Size: 203 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 2,117 - Forks: 155
mara/mara-pipelines
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Language: Python - Size: 3.29 MB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 2,086 - Forks: 99
mahmoud/glom
☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
Language: Python - Size: 1.27 MB - Last synced at: 2 months ago - Pushed at: 12 months ago - Stars: 2,080 - Forks: 68
onyx-platform/onyx 📦
Distributed, masterless, high performance, fault tolerant data processing
Language: Clojure - Size: 16.2 MB - Last synced at: 5 days ago - Pushed at: over 6 years ago - Stars: 2,044 - Forks: 202
keajs/kea
Batteries Included State Management for React
Language: JavaScript - Size: 7.33 MB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 1,986 - Forks: 51
illacceptanything/illacceptanything
The project where literally anything* goes.
Language: Ruby - Size: 1.47 GB - Last synced at: 23 days ago - Pushed at: 25 days ago - Stars: 1,961 - Forks: 591
brimdata/zui
Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.
Language: TypeScript - Size: 222 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 1,912 - Forks: 136
baidu/tera
An Internet-Scale Database.
Language: C++ - Size: 15.7 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 1,904 - Forks: 436