GitHub topics: apache-arrow
pixie-io/pixie
Instant Kubernetes-Native Application Observability
Language: C++ - Size: 115 MB - Last synced at: about 5 hours ago - Pushed at: about 19 hours ago - Stars: 6,045 - Forks: 469

tansu-io/tansu
Apache Kafka® compatible broker with S3, PostgreSQL, Apache Iceberg and Delta Lake
Language: Rust - Size: 2.36 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 388 - Forks: 11

graphext/lector
A fast reader for messy CSV files with optional type inference.
Language: Python - Size: 245 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 17 - Forks: 0

lancedb/lance
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
Language: Rust - Size: 22.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4,727 - Forks: 311

aws/aws-sdk-pandas
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Language: Python - Size: 17 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 4,024 - Forks: 707

tradewelltech/beavers
Python stream processing for analytics
Language: Python - Size: 646 KB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 39 - Forks: 2

scikit-hep/awkward
Manipulate JSON-like data with NumPy-like idioms.
Language: Python - Size: 26.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 887 - Forks: 93

madesroches/micromegas
Scalable Observability
Language: Rust - Size: 2.77 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 17 - Forks: 5

polarsignals/frostdb
❄️ Coolest database around 🧊 Embeddable column database written in Go.
Language: Go - Size: 14.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,421 - Forks: 68

abdenlab/oxbow
Oxbow makes genomic data accessible for high-performance analytics.
Language: Rust - Size: 16.1 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 79 - Forks: 8

geopolars/geopolars
Geospatial extensions for Polars
Language: Rust - Size: 5.93 MB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 712 - Forks: 24

apache/arrow-dotnet
Official .NET implementation of Apache Arrow
Size: 0 Bytes - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 3 - Forks: 0

visgl/loaders.gl
Loaders for big data visualization.
Language: TypeScript - Size: 293 MB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 753 - Forks: 206

apache/arrow-js
Official JavaScript implementation of Apache Arrow
Language: TypeScript - Size: 7.66 MB - Last synced at: 6 days ago - Pushed at: 10 days ago - Stars: 16 - Forks: 3

kylebarron/parquet-wasm
Rust-based WebAssembly bindings to read and write Apache Parquet data
Language: Rust - Size: 2.67 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 588 - Forks: 20

mluttikh/xml2arrow-python
Convert XML data to Apache Arrow tables
Language: Python - Size: 66.4 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

mluttikh/xml2arrow
Efficiently convert XML data to Apache Arrow format for high-performance data processing
Language: Rust - Size: 249 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 5 - Forks: 0

elixir-explorer/adbc
Apache Arrow ADBC bindings for Elixir
Language: C++ - Size: 4.83 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 66 - Forks: 17

apache/arrow-swift
Official Swift implementation of Apache Arrow
Language: Swift - Size: 332 KB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 1

apache/arrow-go
Official Go implementation of Apache Arrow
Language: Assembly - Size: 19.2 MB - Last synced at: 8 days ago - Pushed at: 10 days ago - Stars: 186 - Forks: 35

mattf96s/QuackDB
Open-source in-browser DuckDB SQL editor
Language: TypeScript - Size: 3.6 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 160 - Forks: 8

kylebarron/arrow-wasm
Building block library for using Apache Arrow in Rust WebAssembly modules.
Language: Rust - Size: 290 KB - Last synced at: 5 days ago - Pushed at: 12 days ago - Stars: 24 - Forks: 6

JosiahParry/arrow-extendr
Integration between arrow-rs and extendr
Language: Rust - Size: 77.1 KB - Last synced at: about 23 hours ago - Pushed at: 10 days ago - Stars: 22 - Forks: 2

kylebarron/arrow-js-ffi
Zero-copy reading of Arrow data from WebAssembly
Language: TypeScript - Size: 360 KB - Last synced at: 5 days ago - Pushed at: 12 months ago - Stars: 116 - Forks: 9

geoarrow/geoarrow-rs
GeoArrow in Rust, Python, and JavaScript (WebAssembly) with vectorized geometry operations
Language: Rust - Size: 14.2 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 330 - Forks: 27

man-group/sparrow
C++20 idiomatic APIs for the Apache Arrow Columnar Format
Language: C++ - Size: 1.54 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 86 - Forks: 19

geoarrow/geoarrow
Specification for storing geospatial data in Apache Arrow
Size: 120 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 467 - Forks: 27

unum-cloud/ustore
Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings for C 99, C++ 17, Python 3, Java, GoLang 🗄️
Language: C++ - Size: 6.56 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 600 - Forks: 34

apache/arrow-java
Official Java implementation of Apache Arrow
Language: Java - Size: 25.1 MB - Last synced at: 6 days ago - Pushed at: 10 days ago - Stars: 49 - Forks: 54

kylebarron/arro3
A minimal Python library for Apache Arrow, connecting to the Rust arrow crate
Language: Rust - Size: 3.5 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 147 - Forks: 12

mongodb-labs/mongo-arrow
MongoDB integrations for Apache Arrow. Export MongoDB documents to NumPy arrays, Parquet files, and pandas DataFrames in one line of code.
Language: Python - Size: 556 KB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 106 - Forks: 16

rpy2/rpy2-arrow
Share Apache Arrow datasets between Python and R.
Language: Python - Size: 671 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 18 - Forks: 3

tansu-io/example-spark
Tansu schema-backed topics, instantly accessible as Apache Iceberg tables in Apache Spark
Language: Just - Size: 13.7 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

tansu-io/example-pyiceberg
Tansu schema-backed topics, instantly accessible as Apache Iceberg tables with pyiceberg
Language: Just - Size: 38.1 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

G-Research/ParquetSharp
ParquetSharp is a .NET library for reading and writing Apache Parquet files.
Language: C# - Size: 1.72 MB - Last synced at: 10 days ago - Pushed at: 17 days ago - Stars: 206 - Forks: 55

geoarrow/deck.gl-layers
deck.gl layers for rendering GeoArrow data
Language: TypeScript - Size: 2.59 MB - Last synced at: 5 days ago - Pushed at: 4 months ago - Stars: 121 - Forks: 8

nanoporetech/pod5-file-format
Pod5: a high performance file format for nanopore reads.
Language: C++ - Size: 29 MB - Last synced at: 16 days ago - Pushed at: 18 days ago - Stars: 149 - Forks: 20

lykmapipo/Python-Spark-Log-Analysis
Python scripts to process and analyze log files using PySpark.
Language: Python - Size: 131 KB - Last synced at: 6 days ago - Pushed at: 11 months ago - Stars: 6 - Forks: 0

lykmapipo/NYC-TLC-Trip-Data
Python scripts to download, process, and analyze the New York City Taxi and Limousine Commission (TLC) Trip Record Data dataset
Language: Jupyter Notebook - Size: 100 MB - Last synced at: 6 days ago - Pushed at: 10 months ago - Stars: 5 - Forks: 1

developmentseed/lonboard
A Python library for fast, interactive geospatial vector data visualization in Jupyter.
Language: Python - Size: 122 MB - Last synced at: 17 days ago - Pushed at: 26 days ago - Stars: 754 - Forks: 39

dabevlohn/vispar
Build graphs from data in Parquet files
Language: JavaScript - Size: 1000 Bytes - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 1 - Forks: 0

red-data-tools/red_amber
A dataframe library for Rubyists.
Language: Ruby - Size: 5.34 MB - Last synced at: 14 days ago - Pushed at: 23 days ago - Stars: 71 - Forks: 13

1duo/awesome-ai-infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
Size: 11.8 MB - Last synced at: 19 days ago - Pushed at: about 6 years ago - Stars: 416 - Forks: 74

abs-tudelft/fletcher
Fletcher: A framework to integrate FPGA accelerators with Apache Arrow
Language: VHDL - Size: 8.05 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 226 - Forks: 31

mbrobbel/narrow
An experimental (work-in-progress) statically typed implementation of Apache Arrow
Language: Rust - Size: 1.32 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 19 - Forks: 5

google/space 📦
Unified storage framework for the entire machine learning lifecycle
Language: Python - Size: 825 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 155 - Forks: 8

kszucs/firebolt
Arrow implementation in Mojo
Language: Mojo - Size: 47.9 KB - Last synced at: about 10 hours ago - Pushed at: about 1 month ago - Stars: 21 - Forks: 1

apache/arrow-julia
Official Julia implementation of Apache Arrow
Language: Julia - Size: 1.99 MB - Last synced at: 6 days ago - Pushed at: 28 days ago - Stars: 291 - Forks: 64

ModelarData/ModelarDB-RS
ModelarDB: Model-Based Time Series Management from Edge to Client
Language: Rust - Size: 1.84 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 14 - Forks: 5

nevi-me/rust-dataframe 📦
A Rust DataFrame implementation, built on Apache Arrow
Language: Rust - Size: 253 KB - Last synced at: 4 days ago - Pushed at: over 4 years ago - Stars: 280 - Forks: 20

alekLukanen/ChapterhouseQE
A simple distributed SQL query engine written in Rust
Language: Rust - Size: 4.85 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 1

grouzen/zio-apache-arrow
Scala ZIO-powered Apache Arrow library
Language: Scala - Size: 485 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 21 - Forks: 1

wilhelmagren/falkorflight
Apache Arrow Flight server for OpenCypher queries to FalkorDB.
Language: Python - Size: 12.7 KB - Last synced at: 21 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

baggiponte/awesome-pandas-alternatives
Awesome list of alternative dataframe libraries in Python.
Size: 21.5 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 48 - Forks: 2

amoeba/arrow-python-js-ipc-example
Example showing how to send Arrow RecordBatches from a Python backend to a web browser.
Language: JavaScript - Size: 26.4 KB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 2

influxdata/flightsql-dbapi
DB API 2 interface for Flight SQL with SQLAlchemy extras.
Language: Python - Size: 188 KB - Last synced at: 13 days ago - Pushed at: 2 months ago - Stars: 39 - Forks: 5

cldellow/sqlite-parquet-vtable
A SQLite vtable extension to read Parquet files
Language: C++ - Size: 404 KB - Last synced at: 5 days ago - Pushed at: about 4 years ago - Stars: 271 - Forks: 32

tradewelltech/protarrow
Convert from protobuf to arrow and back
Language: Python - Size: 9.06 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 25 - Forks: 3

ippras/metadata
Metadata for Apache Arrow IPC format
Language: Rust - Size: 41 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

UWHustle/hustle
In-memory, columnar, arrow-based database.
Language: C++ - Size: 13.8 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 46 - Forks: 7

amoeba/QLArrow
WIP QuickLook plugin for Apache Arrow and Parquet
Language: C - Size: 23 MB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 16 - Forks: 1

cmudig/falcon-vis (fork of vega/falcon)
Cross-filter millions (or even billions) of data entries with no interaction delay
Language: Jupyter Notebook - Size: 131 MB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 100 - Forks: 2

gr-oss-devops/ParquetSharp (fork of G-Research/ParquetSharp)
ParquetSharp is a .NET library for reading and writing Apache Parquet files.
Language: C# - Size: 1.72 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

spaghettifunk/norman
Realtime distributed OLAP datastore written in Go, designed to answer OLAP queries with low latency. In active development.
Language: Go - Size: 370 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 5 - Forks: 0

scikit-hep/awkward-0.x 📦
Manipulate arrays of complex data structures as easily as Numpy.
Language: Python - Size: 6.42 MB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 214 - Forks: 40

duo-rs/duo
A lightweight Logging and Tracing observability solution for Rust, built with Apache Arrow, Apache Parquet and Apache DataFusion.
Language: Rust - Size: 2.51 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 73 - Forks: 7

svraka/asmisc
🧰 Miscellaneous R utility functions
Language: R - Size: 164 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

geoarrow/geoarrow-js
TypeScript implementation of GeoArrow
Language: TypeScript - Size: 308 KB - Last synced at: 26 days ago - Pushed at: 4 months ago - Stars: 28 - Forks: 6

cpg314/polarhouse
Interoperability between Polars and ClickHouse
Language: Rust - Size: 87.9 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 9 - Forks: 1

igor-suhorukov/openstreetmap_h3
OSM planet dump high performance data loader. Transform OpenStreetMap World/Region PBF dump into partitioned by H3 regions PostGIS pgsnapshot (lossless) OSM schema representation and/or into ArrowIPC/Parquet dumps
Language: Java - Size: 6.06 MB - Last synced at: 19 days ago - Pushed at: 4 months ago - Stars: 92 - Forks: 8

cldellow/csv2parquet
Convert a CSV to a parquet file.
Language: Python - Size: 97.7 KB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 64 - Forks: 14

Benjamin-Philip/serde_arrow
Serialization and deserialization to Apache Arrow for Erlang
Language: Erlang - Size: 158 KB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 11 - Forks: 1

iljavaleev/arrow_examples
Apache Arrow C++ examples
Language: Jupyter Notebook - Size: 157 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

makcymal/arrow-view
CLI preview of Apache Arrow files
Language: C++ - Size: 214 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

amoeba/arrow-pybind11-example
Minimal example of passing Arrow objects from Python to a C++ extension
Language: C++ - Size: 9.77 KB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

amoeba/arrow-cmake-fetchcontent
Minimal example of including Arrow in a C++ project using CMake and FetchContent
Language: C++ - Size: 18.6 KB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

cupiddb/cupiddb
In-memory Columnar Database
Language: Rust - Size: 49.8 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

sonhmai/how-sqlite-works
A book about how SQLite works. Rewriting SQLite in Rust for learning and fun, and writing the book I wish I had when I started.
Language: Rust - Size: 16 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 21 - Forks: 1

amoeba/arrow-cpp-csv-examples
Short demonstration of Apache Arrow's CSV readers
Language: C++ - Size: 171 KB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

amoeba/arrow-gcs-test
Short example showing how to use GCS with Arrow C++
Language: C++ - Size: 6.84 KB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

amoeba/arrow-declaration-to-examples
Language: C++ - Size: 7.81 KB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

amoeba/arrow-opentelemetry-example
Example of using OpenTelemetry and Apache Arrow
Language: Python - Size: 115 KB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

amoeba/arrow-cpp-conan-example
Example using conan to package and use libarrow
Language: CMake - Size: 7.81 KB - Last synced at: 5 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

BauplanLabs/playlist-recomendations-with-bauplan-and-mongodb
Reference implementation of embedding-based, sequential recommendations, using Bauplan (with Apache Iceberg + Apache Arrow) for data preparation and training, and MongoDB for serving real-time suggestions.
Size: 20.5 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

webysther/aws-glue-docker 📦
🐋 Docker image for AWS Glue Spark/Python
Language: Dockerfile - Size: 56.6 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 23 - Forks: 8

glimmerphoenix/dataeng_book
Book: Fundamentals of Data Engineering (Fundamentos de Ingeniería de Datos)
Language: TeX - Size: 634 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

poopoothegorilla/fastframe
DataFrame project that utilizes Apache Arrow
Language: Go - Size: 218 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 0

firelink-data/evolution
Efficiently evolve your old fixed-length data files into modern file formats.
Language: Rust - Size: 657 KB - Last synced at: 1 day ago - Pushed at: 5 months ago - Stars: 6 - Forks: 0

tiwater/rerun-query
Query and extract entity data from Rerun data files.
Language: Rust - Size: 7.94 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

dadepo/df_extras
A collection of user-defined functions from your favourite databases, in Apache DataFusion
Language: Rust - Size: 161 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

pachadotdev/tradestatistics-plumber-api
tradestatistics.io API, reads from PostgreSQL and provides tidy CSV and Apache Arrow data
Language: R - Size: 166 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 3 - Forks: 2

rupurt/zodbc
A blazing fast ODBC Zig client
Language: Zig - Size: 125 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 2

Desdaemon/polars_dart
Dart bindings for the polars library
Language: Dart - Size: 968 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 1

alekLukanen/arrow-ops
Golang implementation of common Apache Arrow operations
Language: Go - Size: 62.5 KB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

amoeba/arrow-cpp-examples
Various Arrow C++ examples
Language: C++ - Size: 132 KB - Last synced at: 6 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

amoeba/arrow-cpp-wasm
Playing around with Arrow C++ and WASM, see Website for demo
Language: HTML - Size: 4.48 MB - Last synced at: 6 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

tradestatistics/database-postgresql (fork of pachadotdev/tradestatistics-database-postgresql)
Tidy trade data from UN COMTRADE and also countries, commodities, units, and reporting system tables. Writes to PostgreSQL.
Language: R - Size: 51.4 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

perspective-community/arrow-wasm-cpp 📦
Standalone Apache Arrow compiled to WebAssembly, extracted from https://github.com/finos/perspective
Language: CMake - Size: 88.9 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

tradestatistics/plumber-api (fork of pachadotdev/tradestatistics-plumber-api)
tradestatistics.io API, reads from PostgreSQL and provides tidy CSV and Apache Arrow data
Language: R - Size: 166 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

roeap/flight-sql-client-node
A Flight SQL client for Node.js
Language: Rust - Size: 1.11 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 2
