An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: apache-arrow

pixie-io/pixie

Instant Kubernetes-Native Application Observability

Language: C++ - Size: 115 MB - Last synced at: about 5 hours ago - Pushed at: about 19 hours ago - Stars: 6,045 - Forks: 469

tansu-io/tansu

Apache Kafka® compatible broker with S3, PostgreSQL, Apache Iceberg and Delta Lake

Language: Rust - Size: 2.36 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 388 - Forks: 11

graphext/lector

A fast reader for messy CSV files with optional type inference.

Language: Python - Size: 245 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 17 - Forks: 0

lancedb/lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming.

Language: Rust - Size: 22.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4,727 - Forks: 311

aws/aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

Language: Python - Size: 17 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 4,024 - Forks: 707

tradewelltech/beavers

Python stream processing for analytics

Language: Python - Size: 646 KB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 39 - Forks: 2

scikit-hep/awkward

Manipulate JSON-like data with NumPy-like idioms.

Language: Python - Size: 26.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 887 - Forks: 93

madesroches/micromegas

Scalable Observability

Language: Rust - Size: 2.77 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 17 - Forks: 5

polarsignals/frostdb

❄️ Coolest database around 🧊 Embeddable column database written in Go.

Language: Go - Size: 14.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,421 - Forks: 68

abdenlab/oxbow

Oxbow makes genomic data accessible for high-performance analytics.

Language: Rust - Size: 16.1 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 79 - Forks: 8

geopolars/geopolars

Geospatial extensions for Polars

Language: Rust - Size: 5.93 MB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 712 - Forks: 24

apache/arrow-dotnet

Official .NET implementation of Apache Arrow

Size: 0 Bytes - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 3 - Forks: 0

visgl/loaders.gl

Loaders for big data visualization.

Language: TypeScript - Size: 293 MB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 753 - Forks: 206

apache/arrow-js

Official JavaScript implementation of Apache Arrow

Language: TypeScript - Size: 7.66 MB - Last synced at: 6 days ago - Pushed at: 10 days ago - Stars: 16 - Forks: 3

kylebarron/parquet-wasm

Rust-based WebAssembly bindings to read and write Apache Parquet data

Language: Rust - Size: 2.67 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 588 - Forks: 20

mluttikh/xml2arrow-python

Convert XML data to Apache Arrow tables

Language: Python - Size: 66.4 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

mluttikh/xml2arrow

Efficiently convert XML data to Apache Arrow format for high-performance data processing

Language: Rust - Size: 249 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 5 - Forks: 0

elixir-explorer/adbc

Apache Arrow ADBC bindings for Elixir

Language: C++ - Size: 4.83 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 66 - Forks: 17

apache/arrow-swift

Official Swift implementation of Apache Arrow

Language: Swift - Size: 332 KB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 1

apache/arrow-go

Official Go implementation of Apache Arrow

Language: Assembly - Size: 19.2 MB - Last synced at: 8 days ago - Pushed at: 10 days ago - Stars: 186 - Forks: 35

mattf96s/QuackDB

Open-source in-browser DuckDB SQL editor

Language: TypeScript - Size: 3.6 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 160 - Forks: 8

kylebarron/arrow-wasm

Building block library for using Apache Arrow in Rust WebAssembly modules.

Language: Rust - Size: 290 KB - Last synced at: 5 days ago - Pushed at: 12 days ago - Stars: 24 - Forks: 6

JosiahParry/arrow-extendr

Integration between arrow-rs and extendr

Language: Rust - Size: 77.1 KB - Last synced at: about 23 hours ago - Pushed at: 10 days ago - Stars: 22 - Forks: 2

kylebarron/arrow-js-ffi

Zero-copy reading of Arrow data from WebAssembly

Language: TypeScript - Size: 360 KB - Last synced at: 5 days ago - Pushed at: 12 months ago - Stars: 116 - Forks: 9

geoarrow/geoarrow-rs

GeoArrow in Rust, Python, and JavaScript (WebAssembly) with vectorized geometry operations

Language: Rust - Size: 14.2 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 330 - Forks: 27

man-group/sparrow

C++20 idiomatic APIs for the Apache Arrow Columnar Format

Language: C++ - Size: 1.54 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 86 - Forks: 19

geoarrow/geoarrow

Specification for storing geospatial data in Apache Arrow

Size: 120 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 467 - Forks: 27

unum-cloud/ustore

Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings for C 99, C++ 17, Python 3, Java, GoLang 🗄️

Language: C++ - Size: 6.56 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 600 - Forks: 34

apache/arrow-java

Official Java implementation of Apache Arrow

Language: Java - Size: 25.1 MB - Last synced at: 6 days ago - Pushed at: 10 days ago - Stars: 49 - Forks: 54

kylebarron/arro3

A minimal Python library for Apache Arrow, connecting to the Rust arrow crate

Language: Rust - Size: 3.5 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 147 - Forks: 12

mongodb-labs/mongo-arrow

MongoDB integrations for Apache Arrow. Export MongoDB documents to numpy array, parquet files, and pandas dataframes in one line of code.

Language: Python - Size: 556 KB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 106 - Forks: 16

rpy2/rpy2-arrow

Share Apache Arrow datasets between Python and R.

Language: Python - Size: 671 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 18 - Forks: 3

tansu-io/example-spark

Tansu schema-backed topics, instantly accessible as Apache Iceberg tables in Apache Spark

Language: Just - Size: 13.7 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

tansu-io/example-pyiceberg

Tansu schema-backed topics, instantly accessible as Apache Iceberg tables with pyiceberg

Language: Just - Size: 38.1 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

G-Research/ParquetSharp

ParquetSharp is a .NET library for reading and writing Apache Parquet files.

Language: C# - Size: 1.72 MB - Last synced at: 10 days ago - Pushed at: 17 days ago - Stars: 206 - Forks: 55

geoarrow/deck.gl-layers

deck.gl layers for rendering GeoArrow data

Language: TypeScript - Size: 2.59 MB - Last synced at: 5 days ago - Pushed at: 4 months ago - Stars: 121 - Forks: 8

nanoporetech/pod5-file-format

Pod5: a high performance file format for nanopore reads.

Language: C++ - Size: 29 MB - Last synced at: 16 days ago - Pushed at: 18 days ago - Stars: 149 - Forks: 20

lykmapipo/Python-Spark-Log-Analysis

Python scripts to process and analyze log files using PySpark.

Language: Python - Size: 131 KB - Last synced at: 6 days ago - Pushed at: 11 months ago - Stars: 6 - Forks: 0

lykmapipo/NYC-TLC-Trip-Data

Python scripts to download, process, and analyze the New York City Taxi and Limousine Commission (TLC) Trip Record Data dataset

Language: Jupyter Notebook - Size: 100 MB - Last synced at: 6 days ago - Pushed at: 10 months ago - Stars: 5 - Forks: 1

developmentseed/lonboard

A Python library for fast, interactive geospatial vector data visualization in Jupyter.

Language: Python - Size: 122 MB - Last synced at: 17 days ago - Pushed at: 26 days ago - Stars: 754 - Forks: 39

dabevlohn/vispar

Build graphs from data in Parquet files

Language: JavaScript - Size: 1000 Bytes - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 1 - Forks: 0

red-data-tools/red_amber

A dataframe library for Rubyists.

Language: Ruby - Size: 5.34 MB - Last synced at: 14 days ago - Pushed at: 23 days ago - Stars: 71 - Forks: 13

1duo/awesome-ai-infrastructures

Infrastructures™ for Machine Learning Training/Inference in Production.

Size: 11.8 MB - Last synced at: 19 days ago - Pushed at: about 6 years ago - Stars: 416 - Forks: 74

abs-tudelft/fletcher

Fletcher: A framework to integrate FPGA accelerators with Apache Arrow

Language: VHDL - Size: 8.05 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 226 - Forks: 31

mbrobbel/narrow

An experimental (work-in-progress) statically typed implementation of Apache Arrow

Language: Rust - Size: 1.32 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 19 - Forks: 5

google/space 📦

Unified storage framework for the entire machine learning lifecycle

Language: Python - Size: 825 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 155 - Forks: 8

kszucs/firebolt

Arrow implementation in Mojo

Language: Mojo - Size: 47.9 KB - Last synced at: about 10 hours ago - Pushed at: about 1 month ago - Stars: 21 - Forks: 1

apache/arrow-julia

Official Julia implementation of Apache Arrow

Language: Julia - Size: 1.99 MB - Last synced at: 6 days ago - Pushed at: 28 days ago - Stars: 291 - Forks: 64

ModelarData/ModelarDB-RS

ModelarDB: Model-Based Time Series Management from Edge to Client

Language: Rust - Size: 1.84 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 14 - Forks: 5

nevi-me/rust-dataframe 📦

A Rust DataFrame implementation, built on Apache Arrow

Language: Rust - Size: 253 KB - Last synced at: 4 days ago - Pushed at: over 4 years ago - Stars: 280 - Forks: 20

alekLukanen/ChapterhouseQE

A simple distributed SQL query engine written in Rust

Language: Rust - Size: 4.85 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 1

grouzen/zio-apache-arrow

Scala ZIO-powered Apache Arrow library

Language: Scala - Size: 485 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 21 - Forks: 1

wilhelmagren/falkorflight

Apache Arrow Flight server for OpenCypher queries to FalkorDB.

Language: Python - Size: 12.7 KB - Last synced at: 21 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

baggiponte/awesome-pandas-alternatives

Awesome list of alternative dataframe libraries in Python.

Size: 21.5 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 48 - Forks: 2

amoeba/arrow-python-js-ipc-example

Example showing how to send Arrow RecordBatches from a Python backend to a web browser.

Language: JavaScript - Size: 26.4 KB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 2

influxdata/flightsql-dbapi

DB API 2 interface for Flight SQL with SQLAlchemy extras.

Language: Python - Size: 188 KB - Last synced at: 13 days ago - Pushed at: 2 months ago - Stars: 39 - Forks: 5

cldellow/sqlite-parquet-vtable

A SQLite vtable extension to read Parquet files

Language: C++ - Size: 404 KB - Last synced at: 5 days ago - Pushed at: about 4 years ago - Stars: 271 - Forks: 32

tradewelltech/protarrow

Convert from protobuf to arrow and back

Language: Python - Size: 9.06 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 25 - Forks: 3

ippras/metadata

Metadata for Apache Arrow IPC format

Language: Rust - Size: 41 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

UWHustle/hustle

In-memory, columnar, arrow-based database.

Language: C++ - Size: 13.8 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 46 - Forks: 7

amoeba/QLArrow

WIP QuickLook plugin for Apache Arrow and Parquet

Language: C - Size: 23 MB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 16 - Forks: 1

cmudig/falcon-vis (fork of vega/falcon)

Cross-filter millions (or even billions) of data entries with no interaction delay

Language: Jupyter Notebook - Size: 131 MB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 100 - Forks: 2

gr-oss-devops/ParquetSharp (fork of G-Research/ParquetSharp)

ParquetSharp is a .NET library for reading and writing Apache Parquet files.

Language: C# - Size: 1.72 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

spaghettifunk/norman

Real-time distributed OLAP datastore written in Go, designed to answer OLAP queries with low latency. In active development.

Language: Go - Size: 370 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 5 - Forks: 0

scikit-hep/awkward-0.x 📦

Manipulate arrays of complex data structures as easily as Numpy.

Language: Python - Size: 6.42 MB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 214 - Forks: 40

duo-rs/duo

A lightweight Logging and Tracing observability solution for Rust, built with Apache Arrow, Apache Parquet and Apache DataFusion.

Language: Rust - Size: 2.51 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 73 - Forks: 7

svraka/asmisc

🧰 Miscellaneous R utility functions

Language: R - Size: 164 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

geoarrow/geoarrow-js

TypeScript implementation of GeoArrow

Language: TypeScript - Size: 308 KB - Last synced at: 26 days ago - Pushed at: 4 months ago - Stars: 28 - Forks: 6

cpg314/polarhouse

Interoperability between Polars and ClickHouse

Language: Rust - Size: 87.9 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 9 - Forks: 1

igor-suhorukov/openstreetmap_h3

High-performance data loader for OSM planet dumps. Transforms an OpenStreetMap world or region PBF dump into a lossless PostGIS pgsnapshot OSM schema representation partitioned by H3 regions, and/or into ArrowIPC/Parquet dumps.

Language: Java - Size: 6.06 MB - Last synced at: 19 days ago - Pushed at: 4 months ago - Stars: 92 - Forks: 8

cldellow/csv2parquet

Convert a CSV to a parquet file.

Language: Python - Size: 97.7 KB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 64 - Forks: 14

Benjamin-Philip/serde_arrow

Serialization and deserialization to Apache Arrow for Erlang

Language: Erlang - Size: 158 KB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 11 - Forks: 1

iljavaleev/arrow_examples

Apache Arrow C++ examples

Language: Jupyter Notebook - Size: 157 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

makcymal/arrow-view

CLI preview of Apache Arrow files

Language: C++ - Size: 214 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

amoeba/arrow-pybind11-example

Minimal example of passing Arrow objects from Python to a C++ extension

Language: C++ - Size: 9.77 KB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

amoeba/arrow-cmake-fetchcontent

Minimal example of including Arrow in a C++ project using CMake and FetchContent

Language: C++ - Size: 18.6 KB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

cupiddb/cupiddb

In-memory Columnar Database

Language: Rust - Size: 49.8 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

sonhmai/how-sqlite-works

A book about how SQLite works. Rewriting SQLite in Rust for learning and fun, and writing the book I wish I had when I started.

Language: Rust - Size: 16 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 21 - Forks: 1

amoeba/arrow-cpp-csv-examples

Short demonstration of Apache Arrow's CSV readers

Language: C++ - Size: 171 KB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

amoeba/arrow-gcs-test

Short example showing how to use GCS with Arrow C++

Language: C++ - Size: 6.84 KB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

amoeba/arrow-declaration-to-examples

Language: C++ - Size: 7.81 KB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

amoeba/arrow-opentelemetry-example

Example of using OpenTelemetry and Apache Arrow

Language: Python - Size: 115 KB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

amoeba/arrow-cpp-conan-example

Example using conan to package and use libarrow

Language: CMake - Size: 7.81 KB - Last synced at: 5 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

BauplanLabs/playlist-recomendations-with-bauplan-and-mongodb

Reference implementation of embedding-based, sequential recommendations, using Bauplan (with Apache Iceberg + Apache Arrow) for data preparation and training, and MongoDB for serving real-time suggestions.

Size: 20.5 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

webysther/aws-glue-docker 📦

🐋 Docker image for AWS Glue Spark/Python

Language: Dockerfile - Size: 56.6 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 23 - Forks: 8

glimmerphoenix/dataeng_book

Book: Fundamentos de Ingeniería de Datos (Fundamentals of Data Engineering)

Language: TeX - Size: 634 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

poopoothegorilla/fastframe

DataFrame project that utilizes Apache Arrow

Language: Go - Size: 218 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 0

firelink-data/evolution

Efficiently evolve your old fixed-length data files into modern file formats.

Language: Rust - Size: 657 KB - Last synced at: 1 day ago - Pushed at: 5 months ago - Stars: 6 - Forks: 0

tiwater/rerun-query

Query and extract entity data from Rerun data files.

Language: Rust - Size: 7.94 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

dadepo/df_extras

A collection of user-defined functions, from your favourite databases, in Apache DataFusion

Language: Rust - Size: 161 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

pachadotdev/tradestatistics-plumber-api

tradestatistics.io API, reads from PostgreSQL and provides tidy CSV and Apache Arrow data

Language: R - Size: 166 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 3 - Forks: 2

rupurt/zodbc

A blazing fast ODBC Zig client

Language: Zig - Size: 125 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 2

Desdaemon/polars_dart

Dart bindings for the polars library

Language: Dart - Size: 968 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 1

alekLukanen/arrow-ops

Golang implementation of common Apache Arrow operations

Language: Go - Size: 62.5 KB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

amoeba/arrow-cpp-examples

Various Arrow C++ examples

Language: C++ - Size: 132 KB - Last synced at: 6 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

amoeba/arrow-cpp-wasm

Playing around with Arrow C++ and WASM, see Website for demo

Language: HTML - Size: 4.48 MB - Last synced at: 6 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

tradestatistics/database-postgresql (fork of pachadotdev/tradestatistics-database-postgresql)

Tidy trade data from UN COMTRADE and also countries, commodities, units, and reporting system tables. Writes to PostgreSQL.

Language: R - Size: 51.4 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

perspective-community/arrow-wasm-cpp 📦

Standalone Apache Arrow compiled to WebAssembly, extracted from https://github.com/finos/perspective

Language: CMake - Size: 88.9 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

tradestatistics/plumber-api (fork of pachadotdev/tradestatistics-plumber-api)

tradestatistics.io API, reads from PostgreSQL and provides tidy CSV and Apache Arrow data

Language: R - Size: 166 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

roeap/flight-sql-client-node

A Flight SQL client for Node.js

Language: Rust - Size: 1.11 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 2