An open API service providing repository metadata for many open source software ecosystems.

Topic: "datafusion"

apache/datafusion

Apache DataFusion SQL Query Engine

Language: Rust - Size: 138 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 7,069 - Forks: 1,456

ibis-project/ibis

the portable Python dataframe library

Language: Python - Size: 173 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 5,703 - Forks: 633

roapi/roapi

Create full-fledged APIs for slowly moving datasets without writing a single line of code.

Language: Rust - Size: 1.21 MB - Last synced at: about 2 hours ago - Pushed at: 2 days ago - Stars: 3,292 - Forks: 191

lakesoul-io/LakeSoul

LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.

Language: Java - Size: 35 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,710 - Forks: 403

kwai/blaze

Blazing-fast query execution engine speaks Apache Spark language and has Arrow-DataFusion at its core.

Language: Rust - Size: 9.47 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,446 - Forks: 151

apache/datafusion-comet

Apache DataFusion Comet Spark Accelerator

Language: Rust - Size: 16.4 MB - Last synced at: about 2 hours ago - Pushed at: about 12 hours ago - Stars: 934 - Forks: 200

lakehq/sail

LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive (AI) workloads.

Language: Rust - Size: 3.39 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 726 - Forks: 25

paradedb/pg_analytics 📦

DuckDB-powered data lake analytics from Postgres

Language: Rust - Size: 814 KB - Last synced at: 25 days ago - Pushed at: about 1 month ago - Stars: 522 - Forks: 21

splitgraph/seafowl

Analytical database for data-driven Web applications 🪶

Language: Rust - Size: 4.47 MB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 482 - Forks: 13

arkflow-rs/arkflow

High-performance Rust stream processing engine, providing powerful data stream processing capabilities, supporting multiple input/output sources and processors.

Language: Rust - Size: 998 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 459 - Forks: 14

kamu-data/kamu-cli

Next-generation decentralized data lakehouse and a multi-party stream processing network

Language: Rust - Size: 37.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 318 - Forks: 15

JanKaul/iceberg-rust

Rust implementation of Apache Iceberg with integration for Datafusion

Language: Rust - Size: 4.65 MB - Last synced at: about 9 hours ago - Pushed at: 4 days ago - Stars: 166 - Forks: 24

datafusion-contrib/datafusion-dft

Batteries included CLI, TUI, and server implementations for DataFusion.

Language: Rust - Size: 15.8 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 147 - Forks: 13

PRQL/prql-query 📦

Query and transform data with PRQL

Language: Rust - Size: 1.32 MB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 130 - Forks: 7

XiangpengHao/liquid-cache

10x lower latency for cloud-native DataFusion

Language: Rust - Size: 3.9 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 125 - Forks: 8

datafusion-contrib/datafusion-java

Java binding to Apache DataFusion

Language: Java - Size: 479 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 74 - Forks: 13

duo-rs/duo

A lightweight Logging and Tracing observability solution for Rust, built with Apache Arrow, Apache Parquet and Apache DataFusion.

Language: Rust - Size: 2.51 MB - Last synced at: 28 days ago - Pushed at: 7 months ago - Stars: 73 - Forks: 7

hw2499/etl-engine

etl engine: a lightweight, cross-platform ETL engine with unified stream and batch processing for data extraction, transformation, and loading

Language: Go - Size: 1.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 68 - Forks: 13

datafusion-contrib/datafusion-objectstore-s3

S3 as an ObjectStore for DataFusion

Language: Rust - Size: 73.2 KB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 61 - Forks: 13

jorgecarleitao/datafusion-python 📦

A Python library to run analytics workloads with the performance of Rust, the flexibility of Python and O(1) cost in moving data between the two. Uses Apache Arrow in-memory format and respective query engine DataFusion.

Language: Rust - Size: 125 KB - Last synced at: 7 days ago - Pushed at: almost 4 years ago - Stars: 61 - Forks: 4

wheretrue/exon

Exon is an OLAP query engine specifically for biology and life science applications.

Language: Rust - Size: 59.3 MB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 60 - Forks: 5

datafusion-contrib/datafusion-python 📦

Python binding for DataFusion

Language: Python - Size: 234 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 59 - Forks: 13

shauryashaurya/learn-data-munging

Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.

Language: Jupyter Notebook - Size: 627 MB - Last synced at: 9 days ago - Pushed at: 15 days ago - Stars: 48 - Forks: 21

biodatageeks/polars-bio

Blazing-Fast Bioinformatic Operations on Python DataFrames

Language: Python - Size: 8.04 MB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 46 - Forks: 1

baggiponte/awesome-pandas-alternatives

Awesome list of alternative dataframe libraries in Python.

Size: 21.5 KB - Last synced at: about 19 hours ago - Pushed at: over 2 years ago - Stars: 46 - Forks: 3

metrico/influxdb3-community

Community InfluxDB 3.0 "IOx" static builds + containers + Examples for Developers & Integrators. Experiment with low-cost storage, unlimited cardinality and FlightSQL APIs

Language: Shell - Size: 223 KB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 45 - Forks: 2

splitgraph/seafowl-gcsfuse

Scale to zero Seafowl hosting with Cloud Run

Language: Dockerfile - Size: 13.7 KB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 38 - Forks: 0

datafusion-contrib/datafusion-materialized-views

Incremental view maintenance & query rewriting for materialized views in DataFusion

Language: Rust - Size: 87.9 KB - Last synced at: 7 days ago - Pushed at: 12 days ago - Stars: 29 - Forks: 2

treebee/elixir-arrow

Experimental Elixir bindings for Apache Arrow including Parquet and DataFusion

Language: Rust - Size: 130 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 27 - Forks: 3

grouzen/zio-apache-arrow

Scala ZIO-powered Apache Arrow library

Language: Scala - Size: 482 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 21 - Forks: 1

apache/datafusion-benchmarks

Apache DataFusion Benchmarks

Language: Python - Size: 123 KB - Last synced at: about 2 hours ago - Pushed at: 21 days ago - Stars: 18 - Forks: 8

madesroches/micromegas

Scalable Observability

Language: Rust - Size: 2.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 16 - Forks: 4

sal-openlab/datafusion-server

Rust DataFusion Server

Language: Rust - Size: 1.51 MB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 16 - Forks: 3

ModelarData/ModelarDB-RS

ModelarDB: Model-Based Time Series Management from Edge to Client

Language: Rust - Size: 1.56 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 13 - Forks: 5

fmenat/MultiviewCropClassification

Public repository of our IGARSS 2023 submission

Language: Python - Size: 132 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 13 - Forks: 1

datafusion-contrib/datafusion-c

C language bindings for DataFusion

Language: C - Size: 5.75 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 13 - Forks: 3

blaze-init/spark-blaze-extension 📦

Blazing-fast query execution engine speaks Apache Spark language and has Arrow-DataFusion at its core.

Language: Shell - Size: 288 MB - Last synced at: 26 days ago - Pushed at: about 3 years ago - Stars: 11 - Forks: 4

datafusion-contrib/datafusion-functions-extra

Various additional function packages for Apache DataFusion (unofficial)

Language: Rust - Size: 52.7 KB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 9 - Forks: 5

hengfeiyang/how-query-engines-work-zh-CN

How Query Engines Work, Chinese edition

Size: 1.77 MB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 8 - Forks: 2

milenkovicm/wasaffi 📦

Datafusion WASM User Defined Functions

Language: Rust - Size: 1.22 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 8 - Forks: 0

roeap/flight-fusion

Language: Rust - Size: 4.3 MB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 2

systemxlabs/datafusion-remote-table

A DataFusion table provider for executing SQL queries on remote databases.

Language: Rust - Size: 637 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 5 - Forks: 1

irtimmer/rust-kql

Kusto Query Language parser and planner for DataFusion

Language: Rust - Size: 112 KB - Last synced at: about 9 hours ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 0

milenkovicm/adhesive

Apache Datafusion JVM User Defined Functions (UDF), integration nobody asked for 😀

Language: Rust - Size: 39.1 KB - Last synced at: 15 days ago - Pushed at: 25 days ago - Stars: 4 - Forks: 1

metrico/kompactor

Parquet + Metadata Compactor for InfluxDB 3 Core

Language: TypeScript - Size: 150 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

splitgraph/experimental-datafusion-webassembly

proof-of-concept: compile datafusion to `wasm32-wasi` (run in `wasmedge`) and `wasm32-unknown-unknown` (run in browser)

Size: 104 KB - Last synced at: 2 days ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

milenkovicm/torchfusion

Torchfusion is a very opinionated torch inference on datafusion.

Language: Rust - Size: 92.8 KB - Last synced at: 16 days ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

Caoxuheng/HyMS

Hyperspectral Image Super-resolution via Multi-stage Scheme without Employing Spatial Degradation

Language: Python - Size: 91.8 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 3

milenkovicm/ballista_delta

Datafusion Ballista support for Delta Table (showcase project)

Language: Rust - Size: 309 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 0

lostmygithubaccount/ibis-bench

A composable data system benchmark in a Python package.

Language: Python - Size: 965 KB - Last synced at: 15 days ago - Pushed at: 8 months ago - Stars: 2 - Forks: 1

jychen7/datafusion-bigtable 📦

Bigtable data source for Apache Arrow Datafusion

Language: Rust - Size: 34.2 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

matsadler/bishop

Query MongoDB via Apache Arrow and DataFusion

Language: Rust - Size: 37.1 KB - Last synced at: about 2 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

MaciekLesiczka/bazof

Lakehouse with time travel

Language: Rust - Size: 47.2 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

milenkovicm/ballista_python

Ballista cluster pyarrow udf support

Language: Rust - Size: 152 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

apache/datafusion-testing

Apache DataFusion SQL Query Engine Testing

Size: 191 MB - Last synced at: about 2 hours ago - Pushed at: 17 days ago - Stars: 1 - Forks: 5

milenkovicm/lightfusion

LightGBM Inference on Datafusion

Language: Rust - Size: 9.83 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

hienduyph/fusionj

An Incomplete DataFusion Query Engine implemented in Java

Language: Java - Size: 281 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

duhanmin/arrow-sql-yarn

Execute SQL on the datafusion/polars engines via JNI

Language: Java - Size: 26.9 MB - Last synced at: 17 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

DFKI-Earth-And-Space-Applications/MVCC_IGARSS

Public repository of our IGARSS 2023 submission

Language: Python - Size: 132 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

f-aguzzi/ChemFuseKit

Chemometrics library for data fusion, model training and prediction of data from multiple sensor sources.

Language: Jupyter Notebook - Size: 22.2 MB - Last synced at: 10 days ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

jychen7/BigQL

SQL Query Layer for Google Cloud Bigtable

Language: Python - Size: 110 KB - Last synced at: 5 days ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

myryfe/dataframely

A declarative, 🐻‍❄️-native data frame validation library.

Language: Python - Size: 292 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

svenslaggare/gitrends

Gitrends - Web-based behavior code analysis tool.

Language: TypeScript - Size: 1.75 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

duyet/ballista

Example of Ballista Rust

Language: Rust - Size: 35.7 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

surfingreg/rust-in-memory-db-with-chart

Implement a fast, in-memory, time-series database. Query data using SQL and visualize in ChartJS over websocket.

Language: Rust - Size: 358 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

QizhiPei/MathFusion

MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion

Language: Python - Size: 9.33 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

andriidemus/exo

Toy Data REPL

Language: Rust - Size: 116 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

dadepo/df_extras

A collection of user defined functions, from your favourite databases, in Apache Datafusion

Language: Rust - Size: 161 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

PyramidGithub/data_fusion

Apache Spark Comet Wsl2

Language: Rust - Size: 3.21 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

rurumimic/apache-datafusion

Language: Rust - Size: 4.88 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

stormasm/dply-rs Fork of vincev/dply-rs

A dataframe manipulation tool inspired by dplyr.

Language: Rust - Size: 508 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Mboubaker/Lidar_Evidential_occupancy_grid_mapping-

This repository presents an approach to building 2D evidential occupancy grid maps with Lidar data

Language: Jupyter Notebook - Size: 7.45 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 1

dsaad68/azurefunction-deltatable-pipeline-with-rust

A Delta Table pipeline in Rust, triggered by Azure Functions responding to blob storage events in a specific container subfolder. The pipeline processes CSV files, updating or creating Delta Tables as needed, using merges for row changes.

Language: Rust - Size: 1.05 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

jfrazier-eth/file-fusion

A file explorer for data warehouses

Language: TypeScript - Size: 1.07 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

zhxiaogg/dfq

A CLI for running SQL queries over various data sources.

Language: Rust - Size: 26.4 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

datafusion-contrib/datafusion-objectstore-azure 📦

Azure Storage as an ObjectStore for DataFusion

Language: Rust - Size: 23.4 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

selvakrishnan/DataFusion_Airflow_Trigger

A simple DAG for triggering the Cloud Data Fusion Pipeline using Apache Airflow.

Language: Python - Size: 2.93 KB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

selvakrishnan/DataFusion_CDAP_Wrangler_Directives

Google Cloud Data Fusion - Data Transformation Logics using CDAP Wrangler Directives.

Size: 83 KB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

gongouveia/DataFusion-2021-22

Time series analysis, state estimation, stratification, classification and data mining

Language: Jupyter Notebook - Size: 8.03 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

sscosta/datafusion-on-demand

Set of Airflow DAGs to create and destroy a Cloud Data Fusion instance

Language: Python - Size: 4.88 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

IFF-0303/lvh-fusion Fork of AshleyLab/lvh-fusion

Size: 65.4 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Georgsiedel/data_fusion_dipl

Code featured in the diploma thesis, which is written in German. Full text available with link in the Readme below.

Language: R - Size: 124 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

ivangonzalezacuna/datafusion_collect_transform_data

Functions for the main process to collect and store the data received via MQTT and transform all the entries of each sensor in one

Language: Go - Size: 1.58 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0