An open API service providing repository metadata for many open source software ecosystems.

Topic: "dataframe"

pola-rs/polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Language: Rust - Size: 191 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 33,888 - Forks: 2,253

Kanaries/pygwalker

PyGWalker: Turn your dataframe into an interactive UI for visual analysis

Language: Python - Size: 62.9 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 14,835 - Forks: 792

modin-project/modin

Modin: Scale your Pandas workflows by changing a single line of code

Language: Python - Size: 50.9 MB - Last synced at: 7 days ago - Pushed at: 9 days ago - Stars: 10,179 - Forks: 666

rapidsai/cudf

cuDF - GPU DataFrame Library

Language: C++ - Size: 157 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 8,956 - Forks: 951

vaexio/vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

Language: Python - Size: 133 MB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 8,387 - Forks: 598

apache/datafusion

Apache DataFusion SQL Query Engine

Language: Rust - Size: 142 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 7,281 - Forks: 1,501

haifengl/smile

Statistical Machine Intelligence & Learning Engine

Language: Java - Size: 245 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6,193 - Forks: 1,141

javascriptdata/danfojs

Danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.

Language: TypeScript - Size: 79 MB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 4,935 - Forks: 216

lk-geimfari/mimesis

Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.

Language: Python - Size: 33.8 MB - Last synced at: 5 days ago - Pushed at: 28 days ago - Stars: 4,578 - Forks: 341

jtablesaw/tablesaw

Java dataframe and visualization library

Language: Java - Size: 63.2 MB - Last synced at: 8 days ago - Pushed at: 2 months ago - Stars: 3,642 - Forks: 649

databricks/koalas

Koalas: pandas API on Apache Spark

Language: Python - Size: 11.7 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 3,359 - Forks: 365

adamerose/PandasGUI

A GUI for Pandas DataFrames

Language: Python - Size: 8.66 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 3,228 - Forks: 238

mars-project/mars

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.

Language: Python - Size: 37 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 2,731 - Forks: 327

hosseinmoein/DataFrame

C++ DataFrame for statistical, Financial, and ML analysis -- in modern C++ using native types and contiguous memory storage

Language: C++ - Size: 45 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2,710 - Forks: 334

sngyai/Sequoia

Automated stock-picking program for China's A-share market, implementing the Turtle Trading rules, the Chan Zhong Shuo Chan bull-market buy point, and several other technical patterns

Language: Python - Size: 21.1 MB - Last synced at: 18 days ago - Pushed at: 10 months ago - Stars: 2,709 - Forks: 665

sfu-db/connector-x

Fastest library to load data from DB to DataFrames in Rust and Python

Language: Rust - Size: 236 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2,308 - Forks: 179

approximatelabs/sketch

AI code-writing assistant that understands data content

Language: Python - Size: 8.98 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 2,269 - Forks: 119

apache/hamilton

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage/tracing and metadata. Runs and scales everywhere python does.

Language: Jupyter Notebook - Size: 77.5 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 2,148 - Forks: 146

alexhallam/tv

📺(tv) Tidy Viewer is a cross-platform CLI csv pretty printer that uses column styling to maximize viewer enjoyment.

Language: Rust - Size: 33.2 MB - Last synced at: 19 days ago - Pushed at: 5 months ago - Stars: 2,096 - Forks: 40

man-group/ArcticDB

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

Language: C++ - Size: 176 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,923 - Forks: 131

apache/datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine

Language: Rust - Size: 20.6 MB - Last synced at: 4 days ago - Pushed at: 8 days ago - Stars: 1,769 - Forks: 219

shramos/Awesome-Cybersecurity-Datasets

A curated list of amazingly awesome Cybersecurity datasets

Size: 26.4 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 1,637 - Forks: 291

pyjanitor-devs/pyjanitor

Clean APIs for data cleaning. Python implementation of R package Janitor

Language: Python - Size: 11.3 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 1,425 - Forks: 173

skrub-data/skrub

Machine learning with dataframes

Language: Python - Size: 12.4 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,404 - Forks: 131

uwdata/arquero

Query processing and transformation of array-backed data tables.

Language: JavaScript - Size: 1.46 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1,400 - Forks: 68

rocketlaunchr/dataframe-go

DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration

Language: Go - Size: 1010 KB - Last synced at: about 12 hours ago - Pushed at: about 3 years ago - Stars: 1,252 - Forks: 98

michaelchu/optopsy

A nimble options backtesting library for Python

Language: Python - Size: 8.87 MB - Last synced at: 21 days ago - Pushed at: 11 months ago - Stars: 1,115 - Forks: 175

comet-ml/kangas

🦘 Explore multimedia datasets at scale

Language: Jupyter Notebook - Size: 40.3 MB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 1,057 - Forks: 52

graphframes/graphframes

GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs

Language: Scala - Size: 4.06 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 1,053 - Forks: 250

RedisLabs/spark-redis

A connector for Spark that allows reading and writing to/from Redis cluster

Language: Scala - Size: 2.16 MB - Last synced at: 19 days ago - Pushed at: 8 months ago - Stars: 947 - Forks: 369

microsoft/Mobius

C# and F# language binding and extensions to Apache Spark

Language: C# - Size: 6.44 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 940 - Forks: 211

Kotlin/dataframe

Structured data processing in Kotlin

Language: Kotlin - Size: 142 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 929 - Forks: 73

freqtrade/technical

Various indicators developed or collected for the Freqtrade

Language: Python - Size: 7.23 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 886 - Forks: 234

stitchfix/hamilton 📦

A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton

Language: Python - Size: 7.72 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 863 - Forks: 37

mrpowers-io/spark-daria

Essential Spark extensions and helper methods ✨😲

Language: Scala - Size: 3.02 MB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 760 - Forks: 153

pdpipe/pdpipe

Easy pipelines for pandas DataFrames.

Language: Jupyter Notebook - Size: 2.69 MB - Last synced at: 21 days ago - Pushed at: 7 months ago - Stars: 719 - Forks: 45

techascent/tech.ml.dataset

A Clojure high performance data processing system

Language: Clojure - Size: 9.59 MB - Last synced at: 11 days ago - Pushed at: 27 days ago - Stars: 704 - Forks: 34

elastic/eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

Language: Python - Size: 20.9 MB - Last synced at: 5 days ago - Pushed at: 24 days ago - Stars: 677 - Forks: 110

flow-php/flow

The most advanced data processing framework allowing to build scalable data processing pipelines and move data between various data sources and destinations.

Language: PHP - Size: 31.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 663 - Forks: 42

dmnfarrell/pandastable

Table analysis in Tkinter using pandas DataFrames.

Language: Python - Size: 8.99 MB - Last synced at: 19 days ago - Pushed at: 3 months ago - Stars: 646 - Forks: 124

andygrove/datafusion-archive 📦

DataFusion has now been donated to the Apache Arrow project

Language: Rust - Size: 945 KB - Last synced at: 8 months ago - Pushed at: over 6 years ago - Stars: 630 - Forks: 57

Axect/Peroxide

Rust numeric library with high performance and friendly syntax

Language: Rust - Size: 12.6 MB - Last synced at: 7 days ago - Pushed at: 13 days ago - Stars: 628 - Forks: 32

Squarespace/datasheets

Read data from, write data to, and modify the formatting of Google Sheets

Language: Python - Size: 911 KB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 621 - Forks: 55

ranaroussi/pystore

Fast data store for Pandas time-series data

Language: Python - Size: 155 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 577 - Forks: 101

Gmousse/dataframe-js 📦

No Maintenance Intended

Language: JavaScript - Size: 3.13 MB - Last synced at: 28 days ago - Pushed at: 10 months ago - Stars: 462 - Forks: 37

firmai/pandasvault

Advanced Pandas Vault — Utilities, Functions and Snippets (by @firmai).

Language: Python - Size: 2.91 MB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 426 - Forks: 73

tobgu/qframe

Immutable data frame for Go

Language: Go - Size: 3.56 MB - Last synced at: 16 days ago - Pushed at: 11 months ago - Stars: 406 - Forks: 33

DeepSpace2/StyleFrame

A library that wraps pandas and openpyxl and allows easy styling of dataframes in excel

Language: Python - Size: 571 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 377 - Forks: 54

manzt/quak

a scalable data profiler

Language: TypeScript - Size: 2.46 MB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 359 - Forks: 15

tirthajyoti/Spark-with-Python

Fundamentals of Spark with Python (using PySpark), code examples

Language: Jupyter Notebook - Size: 8.97 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 347 - Forks: 271

bluenote10/NimData

DataFrame API written in Nim, enabling fast out-of-core data processing

Language: Nim - Size: 416 KB - Last synced at: 20 days ago - Pushed at: almost 4 years ago - Stars: 339 - Forks: 22

scicloj/tablecloth

Dataset manipulation library built on the top of tech.ml.dataset

Language: Clojure - Size: 28.1 MB - Last synced at: 16 days ago - Pushed at: about 1 month ago - Stars: 331 - Forks: 27

tidyverse/duckplyr

A drop-in replacement for dplyr, powered by DuckDB for speed.

Language: R - Size: 15.6 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 323 - Forks: 20

Quantco/dataframely

A declarative, 🐻‍❄️-native data frame validation library.

Language: Python - Size: 358 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 315 - Forks: 9

snowflakedb/snowpark-python

Snowflake Snowpark Python API

Language: Python - Size: 57.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 300 - Forks: 126

lifeomic/sparkflow

Easy to use library to bring Tensorflow on Apache Spark

Language: Python - Size: 8.79 MB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 296 - Forks: 45

cylondata/cylon

Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame.

Language: C++ - Size: 10.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 293 - Forks: 44

zero-one-group/geni

A Clojure dataframe library that runs on Spark

Language: Clojure - Size: 1.86 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 293 - Forks: 27

nevi-me/rust-dataframe 📦

A Rust DataFrame implementation, built on Apache Arrow

Language: Rust - Size: 253 KB - Last synced at: 6 days ago - Pushed at: over 4 years ago - Stars: 280 - Forks: 20

tirthajyoti/Design-of-experiment-Python

Design-of-experiment (DOE) generator for science, engineering, and statistics

Language: Jupyter Notebook - Size: 547 KB - Last synced at: 14 days ago - Pushed at: about 1 year ago - Stars: 268 - Forks: 97

dflib/dflib

In-memory Java DataFrame library

Language: Java - Size: 5.38 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 266 - Forks: 25

alastairrushworth/inspectdf

🛠️ 📊 Tools for Exploring and Comparing Data Frames

Language: R - Size: 24.9 MB - Last synced at: 9 days ago - Pushed at: 10 months ago - Stars: 254 - Forks: 24

zavtech/morpheus-core

The foundational library of the Morpheus data science framework

Language: Java - Size: 56.1 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 240 - Forks: 22

kszucs/pandahouse

Pandas interface for Clickhouse database

Language: Python - Size: 61.5 KB - Last synced at: 16 days ago - Pushed at: over 4 years ago - Stars: 236 - Forks: 69

ank0409/Ditching-Excel-for-Python

Functionalities in Excel translated to Python

Language: Jupyter Notebook - Size: 14.6 KB - Last synced at: 2 months ago - Pushed at: about 4 years ago - Stars: 231 - Forks: 90

datasweet/datatable

A go in-memory table

Language: Go - Size: 130 KB - Last synced at: 12 months ago - Pushed at: almost 3 years ago - Stars: 228 - Forks: 13

alanmarazzi/panthera

Data-frames & arrays on Clojure

Language: Clojure - Size: 414 KB - Last synced at: 14 days ago - Pushed at: about 5 years ago - Stars: 190 - Forks: 15

alteryx/woodwork

Woodwork is a Python library that provides robust methods for managing and communicating data typing information.

Language: Python - Size: 3.2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 153 - Forks: 21

noahgift/rust-mlops-template Fork of nogibjj/rust-mlops-template

A work in progress to build out solutions in Rust for MLOPs

Language: Rust - Size: 57.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 146 - Forks: 31

archivesunleashed/aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Language: Scala - Size: 39.5 MB - Last synced at: 28 days ago - Pushed at: over 1 year ago - Stars: 143 - Forks: 33

SciNim/Datamancer

A dataframe library with a dplyr like API

Language: Nim - Size: 1010 KB - Last synced at: 12 minutes ago - Pushed at: about 2 months ago - Stars: 140 - Forks: 8

dmnfarrell/tablexplore

Table analysis and plotting application written in PySide2/PyQt5

Language: Python - Size: 14 MB - Last synced at: 25 days ago - Pushed at: almost 2 years ago - Stars: 135 - Forks: 26

bertrandmartel/tableau-scraping

Tableau scraper python library. R and Python scripts to scrape data from Tableau viz

Language: Python - Size: 485 KB - Last synced at: 21 days ago - Pushed at: about 1 year ago - Stars: 134 - Forks: 22

scipp/scipp

Multi-dimensional data arrays with labeled dimensions

Language: C++ - Size: 29.7 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 124 - Forks: 21

clojure-finance/clojask

Clojask is a Clojure data processing framework with parallel computing on larger-than-memory datasets

Language: Clojure - Size: 10.1 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 121 - Forks: 5

kfultz07/go-dataframe

A simple package to abstract away the process of creating usable DataFrames for data analytics. This package is heavily inspired by the amazing Python library, Pandas.

Language: Go - Size: 3.93 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 120 - Forks: 7

finos/ipyregulartable

High performance, editable, stylable datagrids in jupyter and jupyterlab

Language: JavaScript - Size: 8.17 MB - Last synced at: 26 days ago - Pushed at: 7 months ago - Stars: 113 - Forks: 13

yash1994/dframcy

Dataframe Integration with spaCy.

Language: Python - Size: 179 KB - Last synced at: 26 days ago - Pushed at: about 4 years ago - Stars: 103 - Forks: 4

hkpeaks/peaks-consolidation

The Peaks Consolidation is equipped with state-of-the-art algorithms and data structures that support high-performance databending exercises. It specializes in management accounting and consolidation, with some special topics in machine learning and bioinformatics.

Language: Go - Size: 246 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 102 - Forks: 8

areshytko/typedframe

Typed wrappers over pandas DataFrames with schema validation

Language: Python - Size: 318 KB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 102 - Forks: 8

jgperrin/net.jgp.labs.spark

Apache Spark examples exclusively in Java

Language: Java - Size: 1.75 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 101 - Forks: 49

tidypyverse/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

Language: Python - Size: 5.75 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 98 - Forks: 8

facultyai/lens

Summarise and explore Pandas DataFrames

Language: Python - Size: 229 KB - Last synced at: 29 days ago - Pushed at: almost 5 years ago - Stars: 98 - Forks: 8

CybercentreCanada/jupyterlab-sql-editor

A JupyterLab extension providing, SQL formatter, auto-completion, syntax highlighting, Spark SQL and Trino

Language: Jupyter Notebook - Size: 90.6 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 88 - Forks: 14

oreilles/polars-st

Spatial extension for Polars DataFrames.

Language: Python - Size: 1.54 MB - Last synced at: 15 days ago - Pushed at: 18 days ago - Stars: 85 - Forks: 5

nmandery/h3ron 📦

Rust crates for the H3 geospatial indexing system

Language: Rust - Size: 19.4 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 85 - Forks: 12

chitralverma/scala-polars

Polars for Scala & Java projects!

Language: Scala - Size: 4.08 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 84 - Forks: 7

instacart/jardin-archived 📦

A pandas.DataFrame-based ORM.

Language: Python - Size: 866 KB - Last synced at: 10 months ago - Pushed at: about 3 years ago - Stars: 84 - Forks: 8

mahmoudparsian/pyspark-algorithms

PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2

Language: Python - Size: 40.5 MB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 84 - Forks: 44

abdenlab/oxbow

Oxbow makes genomic data accessible for high-performance analytics.

Language: Rust - Size: 16.1 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 79 - Forks: 8

hablapps/doric

Type safety for spark columns

Language: Scala - Size: 13.6 MB - Last synced at: 14 days ago - Pushed at: 16 days ago - Stars: 78 - Forks: 11

rsheftel/raccoon

Python DataFrame with fast insert and appends

Language: Python - Size: 486 KB - Last synced at: 23 days ago - Pushed at: about 2 months ago - Stars: 75 - Forks: 10

red-data-tools/red_amber

A dataframe library for Rubyists.

Language: Ruby - Size: 5.34 MB - Last synced at: 16 days ago - Pushed at: 25 days ago - Stars: 71 - Forks: 13

evetion/GeoDataFrames.jl

Simple geographical vector interaction built on top of ArchGDAL

Language: Julia - Size: 2.65 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 68 - Forks: 7

AlexMili/torch-dataframe

Utility class to manipulate dataset from CSV file

Language: Lua - Size: 989 KB - Last synced at: about 2 months ago - Pushed at: over 7 years ago - Stars: 67 - Forks: 8

alttch/myval

Lightweight Apache Arrow data frame for Rust

Language: Rust - Size: 219 KB - Last synced at: 20 days ago - Pushed at: about 2 years ago - Stars: 63 - Forks: 3

IDouble/Pandas-Python-Data-Analysis-Playground

🐍 Data Analysis with the Pandas Library & Notes 📊📈

Language: Python - Size: 8.93 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 59 - Forks: 8

tirthajyoti/Julia-data-science

Data science and numerical computing with Julia

Language: Jupyter Notebook - Size: 1.6 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 58 - Forks: 19

bhrnjica/daany

Daany - .NET DAta ANalYtics .NET library with the implementation of DataFrame, Time series decompositions and Linear Algebra routines BLAS and LAPACK.

Language: C# - Size: 31.5 MB - Last synced at: 15 days ago - Pushed at: 21 days ago - Stars: 57 - Forks: 5

nRo/DataFrame

DataFrame Library for Java

Language: Java - Size: 888 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 57 - Forks: 13