Topic: "dataframe"
pola-rs/polars
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
Language: Rust - Size: 191 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 33,888 - Forks: 2,253

Kanaries/pygwalker
PyGWalker: Turn your dataframe into an interactive UI for visual analysis
Language: Python - Size: 62.9 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 14,835 - Forks: 792

modin-project/modin
Modin: Scale your Pandas workflows by changing a single line of code
Language: Python - Size: 50.9 MB - Last synced at: 7 days ago - Pushed at: 9 days ago - Stars: 10,179 - Forks: 666

rapidsai/cudf
cuDF - GPU DataFrame Library
Language: C++ - Size: 157 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 8,956 - Forks: 951

vaexio/vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second
Language: Python - Size: 133 MB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 8,387 - Forks: 598

apache/datafusion
Apache DataFusion SQL Query Engine
Language: Rust - Size: 142 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 7,281 - Forks: 1,501

haifengl/smile
Statistical Machine Intelligence & Learning Engine
Language: Java - Size: 245 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6,193 - Forks: 1,141

javascriptdata/danfojs
Danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.
Language: TypeScript - Size: 79 MB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 4,935 - Forks: 216

lk-geimfari/mimesis
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
Language: Python - Size: 33.8 MB - Last synced at: 5 days ago - Pushed at: 28 days ago - Stars: 4,578 - Forks: 341

jtablesaw/tablesaw
Java dataframe and visualization library
Language: Java - Size: 63.2 MB - Last synced at: 8 days ago - Pushed at: 2 months ago - Stars: 3,642 - Forks: 649

databricks/koalas
Koalas: pandas API on Apache Spark
Language: Python - Size: 11.7 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 3,359 - Forks: 365

adamerose/PandasGUI
A GUI for Pandas DataFrames
Language: Python - Size: 8.66 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 3,228 - Forks: 238

mars-project/mars
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
Language: Python - Size: 37 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 2,731 - Forks: 327

hosseinmoein/DataFrame
C++ DataFrame for statistical, Financial, and ML analysis -- in modern C++ using native types and contiguous memory storage
Language: C++ - Size: 45 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2,710 - Forks: 334

sngyai/Sequoia
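An automatic stock-picking program for China's A-share market that implements the Turtle Trading rules, the Chan theory (缠中说禅) bull-market buy point, and several other technical patterns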
Language: Python - Size: 21.1 MB - Last synced at: 18 days ago - Pushed at: 10 months ago - Stars: 2,709 - Forks: 665

sfu-db/connector-x
Fastest library to load data from DB to DataFrames in Rust and Python
Language: Rust - Size: 236 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2,308 - Forks: 179

approximatelabs/sketch
AI code-writing assistant that understands data content
Language: Python - Size: 8.98 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 2,269 - Forks: 119

apache/hamilton
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
Language: Jupyter Notebook - Size: 77.5 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 2,148 - Forks: 146

alexhallam/tv
📺(tv) Tidy Viewer is a cross-platform CLI csv pretty printer that uses column styling to maximize viewer enjoyment.
Language: Rust - Size: 33.2 MB - Last synced at: 19 days ago - Pushed at: 5 months ago - Stars: 2,096 - Forks: 40

man-group/ArcticDB
ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.
Language: C++ - Size: 176 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,923 - Forks: 131

apache/datafusion-ballista
Apache DataFusion Ballista Distributed Query Engine
Language: Rust - Size: 20.6 MB - Last synced at: 4 days ago - Pushed at: 8 days ago - Stars: 1,769 - Forks: 219

shramos/Awesome-Cybersecurity-Datasets
A curated list of amazingly awesome Cybersecurity datasets
Size: 26.4 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 1,637 - Forks: 291

pyjanitor-devs/pyjanitor
Clean APIs for data cleaning. Python implementation of R package Janitor
Language: Python - Size: 11.3 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 1,425 - Forks: 173

skrub-data/skrub
Machine learning with dataframes
Language: Python - Size: 12.4 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,404 - Forks: 131

uwdata/arquero
Query processing and transformation of array-backed data tables.
Language: JavaScript - Size: 1.46 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1,400 - Forks: 68

rocketlaunchr/dataframe-go
DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
Language: Go - Size: 1010 KB - Last synced at: about 12 hours ago - Pushed at: about 3 years ago - Stars: 1,252 - Forks: 98

michaelchu/optopsy
A nimble options backtesting library for Python
Language: Python - Size: 8.87 MB - Last synced at: 21 days ago - Pushed at: 11 months ago - Stars: 1,115 - Forks: 175

comet-ml/kangas
🦘 Explore multimedia datasets at scale
Language: Jupyter Notebook - Size: 40.3 MB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 1,057 - Forks: 52

graphframes/graphframes
GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs
Language: Scala - Size: 4.06 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 1,053 - Forks: 250

RedisLabs/spark-redis
A connector for Spark that allows reading and writing to/from Redis cluster
Language: Scala - Size: 2.16 MB - Last synced at: 19 days ago - Pushed at: 8 months ago - Stars: 947 - Forks: 369

microsoft/Mobius
C# and F# language binding and extensions to Apache Spark
Language: C# - Size: 6.44 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 940 - Forks: 211

Kotlin/dataframe
Structured data processing in Kotlin
Language: Kotlin - Size: 142 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 929 - Forks: 73

freqtrade/technical
Various indicators developed or collected for the Freqtrade
Language: Python - Size: 7.23 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 886 - Forks: 234

stitchfix/hamilton 📦
A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
Language: Python - Size: 7.72 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 863 - Forks: 37

mrpowers-io/spark-daria
Essential Spark extensions and helper methods ✨😲
Language: Scala - Size: 3.02 MB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 760 - Forks: 153

pdpipe/pdpipe
Easy pipelines for pandas DataFrames.
Language: Jupyter Notebook - Size: 2.69 MB - Last synced at: 21 days ago - Pushed at: 7 months ago - Stars: 719 - Forks: 45

techascent/tech.ml.dataset
A Clojure high performance data processing system
Language: Clojure - Size: 9.59 MB - Last synced at: 11 days ago - Pushed at: 27 days ago - Stars: 704 - Forks: 34

elastic/eland
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Language: Python - Size: 20.9 MB - Last synced at: 5 days ago - Pushed at: 24 days ago - Stars: 677 - Forks: 110

flow-php/flow
The most advanced data processing framework for building scalable data processing pipelines and moving data between various data sources and destinations.
Language: PHP - Size: 31.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 663 - Forks: 42

dmnfarrell/pandastable
Table analysis in Tkinter using pandas DataFrames.
Language: Python - Size: 8.99 MB - Last synced at: 19 days ago - Pushed at: 3 months ago - Stars: 646 - Forks: 124

andygrove/datafusion-archive 📦
DataFusion has now been donated to the Apache Arrow project
Language: Rust - Size: 945 KB - Last synced at: 8 months ago - Pushed at: over 6 years ago - Stars: 630 - Forks: 57

Axect/Peroxide
Rust numeric library with high performance and friendly syntax
Language: Rust - Size: 12.6 MB - Last synced at: 7 days ago - Pushed at: 13 days ago - Stars: 628 - Forks: 32

Squarespace/datasheets
Read data from, write data to, and modify the formatting of Google Sheets
Language: Python - Size: 911 KB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 621 - Forks: 55

ranaroussi/pystore
Fast data store for Pandas time-series data
Language: Python - Size: 155 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 577 - Forks: 101

Gmousse/dataframe-js 📦
No Maintenance Intended
Language: JavaScript - Size: 3.13 MB - Last synced at: 28 days ago - Pushed at: 10 months ago - Stars: 462 - Forks: 37

firmai/pandasvault
Advanced Pandas Vault - Utilities, Functions and Snippets (by @firmai).
Language: Python - Size: 2.91 MB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 426 - Forks: 73

tobgu/qframe
Immutable data frame for Go
Language: Go - Size: 3.56 MB - Last synced at: 16 days ago - Pushed at: 11 months ago - Stars: 406 - Forks: 33

DeepSpace2/StyleFrame
A library that wraps pandas and openpyxl and allows easy styling of dataframes in Excel
Language: Python - Size: 571 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 377 - Forks: 54

manzt/quak
a scalable data profiler
Language: TypeScript - Size: 2.46 MB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 359 - Forks: 15

tirthajyoti/Spark-with-Python
Fundamentals of Spark with Python (using PySpark), code examples
Language: Jupyter Notebook - Size: 8.97 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 347 - Forks: 271

bluenote10/NimData
DataFrame API written in Nim, enabling fast out-of-core data processing
Language: Nim - Size: 416 KB - Last synced at: 20 days ago - Pushed at: almost 4 years ago - Stars: 339 - Forks: 22

scicloj/tablecloth
Dataset manipulation library built on the top of tech.ml.dataset
Language: Clojure - Size: 28.1 MB - Last synced at: 16 days ago - Pushed at: about 1 month ago - Stars: 331 - Forks: 27

tidyverse/duckplyr
A drop-in replacement for dplyr, powered by DuckDB for speed.
Language: R - Size: 15.6 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 323 - Forks: 20

Quantco/dataframely
A declarative, 🐻‍❄️-native data frame validation library.
Language: Python - Size: 358 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 315 - Forks: 9

snowflakedb/snowpark-python
Snowflake Snowpark Python API
Language: Python - Size: 57.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 300 - Forks: 126

lifeomic/sparkflow
Easy to use library to bring Tensorflow on Apache Spark
Language: Python - Size: 8.79 MB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 296 - Forks: 45

cylondata/cylon
Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame.
Language: C++ - Size: 10.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 293 - Forks: 44

zero-one-group/geni
A Clojure dataframe library that runs on Spark
Language: Clojure - Size: 1.86 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 293 - Forks: 27

nevi-me/rust-dataframe 📦
A Rust DataFrame implementation, built on Apache Arrow
Language: Rust - Size: 253 KB - Last synced at: 6 days ago - Pushed at: over 4 years ago - Stars: 280 - Forks: 20

tirthajyoti/Design-of-experiment-Python
Design-of-experiment (DOE) generator for science, engineering, and statistics
Language: Jupyter Notebook - Size: 547 KB - Last synced at: 14 days ago - Pushed at: about 1 year ago - Stars: 268 - Forks: 97

dflib/dflib
In-memory Java DataFrame library
Language: Java - Size: 5.38 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 266 - Forks: 25

alastairrushworth/inspectdf
Tools for Exploring and Comparing Data Frames
Language: R - Size: 24.9 MB - Last synced at: 9 days ago - Pushed at: 10 months ago - Stars: 254 - Forks: 24

zavtech/morpheus-core
The foundational library of the Morpheus data science framework
Language: Java - Size: 56.1 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 240 - Forks: 22

kszucs/pandahouse
Pandas interface for Clickhouse database
Language: Python - Size: 61.5 KB - Last synced at: 16 days ago - Pushed at: over 4 years ago - Stars: 236 - Forks: 69

ank0409/Ditching-Excel-for-Python
Functionalities in Excel translated to Python
Language: Jupyter Notebook - Size: 14.6 KB - Last synced at: 2 months ago - Pushed at: about 4 years ago - Stars: 231 - Forks: 90

datasweet/datatable
A go in-memory table
Language: Go - Size: 130 KB - Last synced at: 12 months ago - Pushed at: almost 3 years ago - Stars: 228 - Forks: 13

alanmarazzi/panthera
Data-frames & arrays on Clojure
Language: Clojure - Size: 414 KB - Last synced at: 14 days ago - Pushed at: about 5 years ago - Stars: 190 - Forks: 15

alteryx/woodwork
Woodwork is a Python library that provides robust methods for managing and communicating data typing information.
Language: Python - Size: 3.2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 153 - Forks: 21

noahgift/rust-mlops-template (fork of nogibjj/rust-mlops-template)
A work in progress to build out solutions in Rust for MLOPs
Language: Rust - Size: 57.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 146 - Forks: 31

archivesunleashed/aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Language: Scala - Size: 39.5 MB - Last synced at: 28 days ago - Pushed at: over 1 year ago - Stars: 143 - Forks: 33

SciNim/Datamancer
A dataframe library with a dplyr like API
Language: Nim - Size: 1010 KB - Last synced at: 12 minutes ago - Pushed at: about 2 months ago - Stars: 140 - Forks: 8

dmnfarrell/tablexplore
Table analysis and plotting application written in PySide2/PyQt5
Language: Python - Size: 14 MB - Last synced at: 25 days ago - Pushed at: almost 2 years ago - Stars: 135 - Forks: 26

bertrandmartel/tableau-scraping
Tableau scraper python library. R and Python scripts to scrape data from Tableau viz
Language: Python - Size: 485 KB - Last synced at: 21 days ago - Pushed at: about 1 year ago - Stars: 134 - Forks: 22

scipp/scipp
Multi-dimensional data arrays with labeled dimensions
Language: C++ - Size: 29.7 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 124 - Forks: 21

clojure-finance/clojask
Clojask is a Clojure data processing framework with parallel computing on larger-than-memory datasets
Language: Clojure - Size: 10.1 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 121 - Forks: 5

kfultz07/go-dataframe
A simple package to abstract away the process of creating usable DataFrames for data analytics. This package is heavily inspired by the amazing Python library, Pandas.
Language: Go - Size: 3.93 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 120 - Forks: 7

finos/ipyregulartable
High performance, editable, stylable datagrids in jupyter and jupyterlab
Language: JavaScript - Size: 8.17 MB - Last synced at: 26 days ago - Pushed at: 7 months ago - Stars: 113 - Forks: 13

yash1994/dframcy
Dataframe Integration with spaCy.
Language: Python - Size: 179 KB - Last synced at: 26 days ago - Pushed at: about 4 years ago - Stars: 103 - Forks: 4

hkpeaks/peaks-consolidation
The Peaks Consolidation is equipped with state-of-the-art algorithms and data structures that support high-performance databending exercises. It specializes in management accounting and consolidation, with some special topics in machine learning and bioinformatics.
Language: Go - Size: 246 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 102 - Forks: 8

areshytko/typedframe
Typed wrappers over pandas DataFrames with schema validation
Language: Python - Size: 318 KB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 102 - Forks: 8

jgperrin/net.jgp.labs.spark
Apache Spark examples exclusively in Java
Language: Java - Size: 1.75 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 101 - Forks: 49

tidypyverse/tidypandas
A grammar of data manipulation for pandas inspired by tidyverse
Language: Python - Size: 5.75 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 98 - Forks: 8

facultyai/lens
Summarise and explore Pandas DataFrames
Language: Python - Size: 229 KB - Last synced at: 29 days ago - Pushed at: almost 5 years ago - Stars: 98 - Forks: 8

CybercentreCanada/jupyterlab-sql-editor
A JupyterLab extension providing a SQL formatter, auto-completion, and syntax highlighting for Spark SQL and Trino
Language: Jupyter Notebook - Size: 90.6 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 88 - Forks: 14

oreilles/polars-st
Spatial extension for Polars DataFrames.
Language: Python - Size: 1.54 MB - Last synced at: 15 days ago - Pushed at: 18 days ago - Stars: 85 - Forks: 5

nmandery/h3ron 📦
Rust crates for the H3 geospatial indexing system
Language: Rust - Size: 19.4 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 85 - Forks: 12

chitralverma/scala-polars
Polars for Scala & Java projects!
Language: Scala - Size: 4.08 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 84 - Forks: 7

instacart/jardin-archived 📦
A pandas.DataFrame-based ORM.
Language: Python - Size: 866 KB - Last synced at: 10 months ago - Pushed at: about 3 years ago - Stars: 84 - Forks: 8

mahmoudparsian/pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Language: Python - Size: 40.5 MB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 84 - Forks: 44

abdenlab/oxbow
Oxbow makes genomic data accessible for high-performance analytics.
Language: Rust - Size: 16.1 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 79 - Forks: 8

hablapps/doric
Type safety for spark columns
Language: Scala - Size: 13.6 MB - Last synced at: 14 days ago - Pushed at: 16 days ago - Stars: 78 - Forks: 11

rsheftel/raccoon
Python DataFrame with fast insert and appends
Language: Python - Size: 486 KB - Last synced at: 23 days ago - Pushed at: about 2 months ago - Stars: 75 - Forks: 10

red-data-tools/red_amber
A dataframe library for Rubyists.
Language: Ruby - Size: 5.34 MB - Last synced at: 16 days ago - Pushed at: 25 days ago - Stars: 71 - Forks: 13

evetion/GeoDataFrames.jl
Simple geographical vector interaction built on top of ArchGDAL
Language: Julia - Size: 2.65 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 68 - Forks: 7

AlexMili/torch-dataframe
Utility class to manipulate dataset from CSV file
Language: Lua - Size: 989 KB - Last synced at: about 2 months ago - Pushed at: over 7 years ago - Stars: 67 - Forks: 8

alttch/myval
Lightweight Apache Arrow data frame for Rust
Language: Rust - Size: 219 KB - Last synced at: 20 days ago - Pushed at: about 2 years ago - Stars: 63 - Forks: 3

IDouble/Pandas-Python-Data-Analysis-Playground
Data Analysis with the Pandas Library & Notes
Language: Python - Size: 8.93 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 59 - Forks: 8

tirthajyoti/Julia-data-science
Data science and numerical computing with Julia
Language: Jupyter Notebook - Size: 1.6 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 58 - Forks: 19

bhrnjica/daany
Daany - .NET DAta ANalYtics library with the implementation of DataFrame, Time series decompositions and Linear Algebra routines BLAS and LAPACK.
Language: C# - Size: 31.5 MB - Last synced at: 15 days ago - Pushed at: 21 days ago - Stars: 57 - Forks: 5

nRo/DataFrame
DataFrame Library for Java
Language: Java - Size: 888 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 57 - Forks: 13
