Topic: "datafusion"
apache/datafusion
Apache DataFusion SQL Query Engine
Language: Rust - Size: 138 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 7,069 - Forks: 1,456

ibis-project/ibis
the portable Python dataframe library
Language: Python - Size: 173 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 5,703 - Forks: 633

roapi/roapi
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
Language: Rust - Size: 1.21 MB - Last synced at: about 2 hours ago - Pushed at: 2 days ago - Stars: 3,292 - Forks: 191

lakesoul-io/LakeSoul
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
Language: Java - Size: 35 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,710 - Forks: 403

kwai/blaze
Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
Language: Rust - Size: 9.47 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,446 - Forks: 151

apache/datafusion-comet
Apache DataFusion Comet Spark Accelerator
Language: Rust - Size: 16.4 MB - Last synced at: about 2 hours ago - Pushed at: about 12 hours ago - Stars: 934 - Forks: 200

lakehq/sail
LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive (AI) workloads.
Language: Rust - Size: 3.39 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 726 - Forks: 25

paradedb/pg_analytics 📦
DuckDB-powered data lake analytics from Postgres
Language: Rust - Size: 814 KB - Last synced at: 25 days ago - Pushed at: about 1 month ago - Stars: 522 - Forks: 21

splitgraph/seafowl
Analytical database for data-driven Web applications 🪶
Language: Rust - Size: 4.47 MB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 482 - Forks: 13

arkflow-rs/arkflow
High-performance Rust stream processing engine, providing powerful data stream processing capabilities, supporting multiple input/output sources and processors.
Language: Rust - Size: 998 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 459 - Forks: 14

kamu-data/kamu-cli
Next-generation decentralized data lakehouse and a multi-party stream processing network
Language: Rust - Size: 37.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 318 - Forks: 15

JanKaul/iceberg-rust
Rust implementation of Apache Iceberg with integration for Datafusion
Language: Rust - Size: 4.65 MB - Last synced at: about 9 hours ago - Pushed at: 4 days ago - Stars: 166 - Forks: 24

datafusion-contrib/datafusion-dft
Batteries included CLI, TUI, and server implementations for DataFusion.
Language: Rust - Size: 15.8 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 147 - Forks: 13

PRQL/prql-query 📦
Query and transform data with PRQL
Language: Rust - Size: 1.32 MB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 130 - Forks: 7

XiangpengHao/liquid-cache
10x lower latency for cloud-native DataFusion
Language: Rust - Size: 3.9 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 125 - Forks: 8

datafusion-contrib/datafusion-java
Java binding to Apache DataFusion
Language: Java - Size: 479 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 74 - Forks: 13

duo-rs/duo
A lightweight Logging and Tracing observability solution for Rust, built with Apache Arrow, Apache Parquet and Apache DataFusion.
Language: Rust - Size: 2.51 MB - Last synced at: 28 days ago - Pushed at: 7 months ago - Stars: 73 - Forks: 7

hw2499/etl-engine
ETL engine: a lightweight, cross-platform ETL engine with unified stream and batch processing for data extraction, transformation, and loading.
Language: Go - Size: 1.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 68 - Forks: 13

datafusion-contrib/datafusion-objectstore-s3
S3 as an ObjectStore for DataFusion
Language: Rust - Size: 73.2 KB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 61 - Forks: 13

jorgecarleitao/datafusion-python 📦
A Python library to run analytics workloads with the performance of Rust, the flexibility of Python and O(1) cost in moving data between the two. Uses Apache Arrow in-memory format and respective query engine DataFusion.
Language: Rust - Size: 125 KB - Last synced at: 7 days ago - Pushed at: almost 4 years ago - Stars: 61 - Forks: 4

wheretrue/exon
Exon is an OLAP query engine specifically for biology and life science applications.
Language: Rust - Size: 59.3 MB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 60 - Forks: 5

datafusion-contrib/datafusion-python 📦
Python binding for DataFusion
Language: Python - Size: 234 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 59 - Forks: 13

shauryashaurya/learn-data-munging
Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.
Language: Jupyter Notebook - Size: 627 MB - Last synced at: 9 days ago - Pushed at: 15 days ago - Stars: 48 - Forks: 21

biodatageeks/polars-bio
Blazing-Fast Bioinformatic Operations on Python DataFrames
Language: Python - Size: 8.04 MB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 46 - Forks: 1

baggiponte/awesome-pandas-alternatives
Awesome list of alternative dataframe libraries in Python.
Size: 21.5 KB - Last synced at: about 19 hours ago - Pushed at: over 2 years ago - Stars: 46 - Forks: 3

metrico/influxdb3-community
Community InfluxDB 3.0 "IOx" static builds + containers + Examples for Developers & Integrators. Experiment with low-cost storage, unlimited cardinality and FlightSQL APIs
Language: Shell - Size: 223 KB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 45 - Forks: 2

splitgraph/seafowl-gcsfuse
Scale to zero Seafowl hosting with Cloud Run
Language: Dockerfile - Size: 13.7 KB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 38 - Forks: 0

datafusion-contrib/datafusion-materialized-views
Incremental view maintenance & query rewriting for materialized views in DataFusion
Language: Rust - Size: 87.9 KB - Last synced at: 7 days ago - Pushed at: 12 days ago - Stars: 29 - Forks: 2

treebee/elixir-arrow
Experimental Elixir bindings for Apache Arrow including Parquet and DataFusion
Language: Rust - Size: 130 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 27 - Forks: 3

grouzen/zio-apache-arrow
Scala ZIO-powered Apache Arrow library
Language: Scala - Size: 482 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 21 - Forks: 1

apache/datafusion-benchmarks
Apache DataFusion Benchmarks
Language: Python - Size: 123 KB - Last synced at: about 2 hours ago - Pushed at: 21 days ago - Stars: 18 - Forks: 8

madesroches/micromegas
Scalable Observability
Language: Rust - Size: 2.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 16 - Forks: 4

sal-openlab/datafusion-server
Rust DataFusion Server
Language: Rust - Size: 1.51 MB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 16 - Forks: 3

ModelarData/ModelarDB-RS
ModelarDB: Model-Based Time Series Management from Edge to Client
Language: Rust - Size: 1.56 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 13 - Forks: 5

fmenat/MultiviewCropClassification
Public repository of our IGARSS 2023 submission
Language: Python - Size: 132 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 13 - Forks: 1

datafusion-contrib/datafusion-c
C language bindings for DataFusion
Language: C - Size: 5.75 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 13 - Forks: 3

blaze-init/spark-blaze-extension 📦
Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
Language: Shell - Size: 288 MB - Last synced at: 26 days ago - Pushed at: about 3 years ago - Stars: 11 - Forks: 4

datafusion-contrib/datafusion-functions-extra
Various additional function packages for Apache DataFusion (unofficial)
Language: Rust - Size: 52.7 KB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 9 - Forks: 5

hengfeiyang/how-query-engines-work-zh-CN
How Query Engines Work (Chinese edition)
Size: 1.77 MB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 8 - Forks: 2

milenkovicm/wasaffi 📦
Datafusion WASM User Defined Functions
Language: Rust - Size: 1.22 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 8 - Forks: 0

roeap/flight-fusion
Language: Rust - Size: 4.3 MB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 2

systemxlabs/datafusion-remote-table
A DataFusion table provider for executing SQL queries on remote databases.
Language: Rust - Size: 637 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 5 - Forks: 1

irtimmer/rust-kql
Kusto Query Language parser and planner for DataFusion
Language: Rust - Size: 112 KB - Last synced at: about 9 hours ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 0

milenkovicm/adhesive
Apache Datafusion JVM User Defined Functions (UDF), integration nobody asked for 😀
Language: Rust - Size: 39.1 KB - Last synced at: 15 days ago - Pushed at: 25 days ago - Stars: 4 - Forks: 1

metrico/kompactor
Parquet + Metadata Compactor for InfluxDB 3 Core
Language: TypeScript - Size: 150 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

splitgraph/experimental-datafusion-webassembly
proof-of-concept: compile datafusion to `wasm32-wasi` (run in `wasmedge`) and `wasm32-unknown-unknown` (run in browser)
Size: 104 KB - Last synced at: 2 days ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

milenkovicm/torchfusion
Torchfusion is a very opinionated torch inference on datafusion.
Language: Rust - Size: 92.8 KB - Last synced at: 16 days ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

Caoxuheng/HyMS
Hyperspectral Image Super-resolution via Multi-stage Scheme without Employing Spatial Degradation
Language: Python - Size: 91.8 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 3

milenkovicm/ballista_delta
Datafusion Ballista support for Delta Table (showcase project)
Language: Rust - Size: 309 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 0

lostmygithubaccount/ibis-bench
A composable data system benchmark in a Python package.
Language: Python - Size: 965 KB - Last synced at: 15 days ago - Pushed at: 8 months ago - Stars: 2 - Forks: 1

jychen7/datafusion-bigtable 📦
Bigtable data source for Apache Arrow Datafusion
Language: Rust - Size: 34.2 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

matsadler/bishop
Query MongoDB via Apache Arrow and DataFusion
Language: Rust - Size: 37.1 KB - Last synced at: about 2 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

MaciekLesiczka/bazof
Lakehouse with time travel
Language: Rust - Size: 47.2 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

milenkovicm/ballista_python
Ballista cluster pyarrow udf support
Language: Rust - Size: 152 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

apache/datafusion-testing
Apache DataFusion SQL Query Engine Testing
Size: 191 MB - Last synced at: about 2 hours ago - Pushed at: 17 days ago - Stars: 1 - Forks: 5

milenkovicm/lightfusion
LightGBM Inference on Datafusion
Language: Rust - Size: 9.83 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

hienduyph/fusionj
An Incomplete DataFusion Query Engine implemented in Java
Language: Java - Size: 281 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

duhanmin/arrow-sql-yarn
Execute SQL on the DataFusion/Polars engines via JNI
Language: Java - Size: 26.9 MB - Last synced at: 17 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

DFKI-Earth-And-Space-Applications/MVCC_IGARSS
Public repository of our IGARSS 2023 submission
Language: Python - Size: 132 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

f-aguzzi/ChemFuseKit
Chemometrics library for data fusion, model training and prediction of data from multiple sensor sources.
Language: Jupyter Notebook - Size: 22.2 MB - Last synced at: 10 days ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

jychen7/BigQL
SQL Query Layer for Google Cloud Bigtable
Language: Python - Size: 110 KB - Last synced at: 5 days ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

myryfe/dataframely
A declarative, 🐻❄️-native data frame validation library.
Language: Python - Size: 292 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

svenslaggare/gitrends
Gitrends - Web-based behavior code analysis tool.
Language: TypeScript - Size: 1.75 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

duyet/ballista
Example of Ballista Rust
Language: Rust - Size: 35.7 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

surfingreg/rust-in-memory-db-with-chart
Implement a fast, in-memory, time-series database. Query data using SQL and visualize in ChartJS over websocket.
Language: Rust - Size: 358 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

QizhiPei/MathFusion
MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion
Language: Python - Size: 9.33 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

andriidemus/exo
Toy Data REPL
Language: Rust - Size: 116 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

dadepo/df_extras
A collection of user defined functions, from your favourite databases, in Apache Datafusion
Language: Rust - Size: 161 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

PyramidGithub/data_fusion
Apache Spark Comet Wsl2
Language: Rust - Size: 3.21 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

rurumimic/apache-datafusion
Language: Rust - Size: 4.88 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

stormasm/dply-rs (fork of vincev/dply-rs)
A dataframe manipulation tool inspired by dplyr.
Language: Rust - Size: 508 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Mboubaker/Lidar_Evidential_occupancy_grid_mapping-
This repository presents an approach to building 2D evidential occupancy grid maps with Lidar data
Language: Jupyter Notebook - Size: 7.45 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 1

dsaad68/azurefunction-deltatable-pipeline-with-rust
A Delta Table pipeline in Rust, triggered by Azure Functions responding to blob storage events in a specific container subfolder. The pipeline processes CSV files, updating or creating Delta Tables as needed, using merges for row changes.
Language: Rust - Size: 1.05 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

jfrazier-eth/file-fusion
A file explorer for data warehouses
Language: TypeScript - Size: 1.07 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

zhxiaogg/dfq
A CLI for running SQLs over various data sources.
Language: Rust - Size: 26.4 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

datafusion-contrib/datafusion-objectstore-azure 📦
Azure Storage as an ObjectStore for DataFusion
Language: Rust - Size: 23.4 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

selvakrishnan/DataFusion_Airflow_Trigger
A simple dag for triggering the Cloud Data Fusion Pipeline using Apache Airflow.
Language: Python - Size: 2.93 KB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

selvakrishnan/DataFusion_CDAP_Wrangler_Directives
Google Cloud Data Fusion - Data Transformation Logics using CDAP Wrangler Directives.
Size: 83 KB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

gongouveia/DataFusion-2021-22
Time series analysis, state estimation, stratification, classification and data mining
Language: Jupyter Notebook - Size: 8.03 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

sscosta/datafusion-on-demand
Set of Airflow DAGs to create and destroy a Cloud Data Fusion instance
Language: Python - Size: 4.88 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

IFF-0303/lvh-fusion (fork of AshleyLab/lvh-fusion)
Size: 65.4 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Georgsiedel/data_fusion_dipl
Code featured in the diploma thesis, which is written in German. The full text is available via the link in the README below.
Language: R - Size: 124 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

ivangonzalezacuna/datafusion_collect_transform_data
Functions for the main process to collect and store the data received via MQTT and merge all the entries of each sensor into one
Language: Go - Size: 1.58 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0
