Topic: "data-processing"
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Language: Python - Size: 133 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 50,011 - Forks: 1,454
onceupon/Bash-Oneliner
A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.
Size: 919 KB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 10,595 - Forks: 641
johnkerl/miller
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Language: Go - Size: 201 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 9,521 - Forks: 227
TomWright/dasel
Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
Language: Go - Size: 10.1 MB - Last synced at: 20 days ago - Pushed at: about 1 month ago - Stars: 7,664 - Forks: 157
NVIDIA/DALI
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
Language: C++ - Size: 398 MB - Last synced at: 3 days ago - Pushed at: 6 days ago - Stars: 5,551 - Forks: 649
datajuicer/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Language: Python - Size: 560 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 5,517 - Forks: 289
deepseek-ai/smallpond
A lightweight data processing framework built on DuckDB and 3FS.
Language: Python - Size: 1.77 MB - Last synced at: 4 days ago - Pushed at: 9 months ago - Stars: 4,839 - Forks: 431
unionai-oss/pandera
A light-weight, flexible, and expressive statistical data testing library
Language: Python - Size: 4.46 MB - Last synced at: 16 days ago - Pushed at: 20 days ago - Stars: 4,073 - Forks: 363
cocoindex-io/cocoindex
Data transformation framework for AI. Ultra performant, with incremental processing. 🌟 Star if you like it!
Language: Rust - Size: 78.3 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 3,258 - Forks: 262
dashbitco/broadway
Concurrent and multi-stage data ingestion and data processing with Elixir
Language: Elixir - Size: 665 KB - Last synced at: 9 days ago - Pushed at: about 2 months ago - Stars: 2,586 - Forks: 169
microsoft/DialoGPT
Large-scale pretraining for dialogue
Language: Python - Size: 43.6 MB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 2,412 - Forks: 346
numaproj/numaflow
Kubernetes-native platform to run massively parallel data/streaming jobs
Language: Rust - Size: 52.7 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 2,411 - Forks: 147
asyml/texar
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Language: Python - Size: 13.6 MB - Last synced at: 20 days ago - Pushed at: about 4 years ago - Stars: 2,391 - Forks: 369
bytewax/bytewax
Python Stream Processing
Language: Python - Size: 12 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 1,861 - Forks: 96
python-bonobo/bonobo
Extract Transform Load for Python 3.5+
Language: Python - Size: 1.46 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 1,600 - Forks: 142
pyper-dev/pyper
Concurrent Python made simple
Language: Python - Size: 462 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 1,503 - Forks: 30
OpenDCAI/DataFlow
Easy Data Preparation with latest LLMs-based Operators and Pipelines.
Language: Python - Size: 4.58 MB - Last synced at: about 12 hours ago - Pushed at: about 13 hours ago - Stars: 1,476 - Forks: 102
GoogleCloudPlatform/data-science-on-gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Language: Jupyter Notebook - Size: 6.51 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 1,387 - Forks: 727
allenai/dolma
Data and tools for generating and inspecting OLMo pre-training data.
Language: Python - Size: 62.9 MB - Last synced at: 5 days ago - Pushed at: 14 days ago - Stars: 1,343 - Forks: 154
NVIDIA-NeMo/Curator
Scalable data preprocessing and curation toolkit for LLMs
Language: Python - Size: 18.6 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1,219 - Forks: 188
run-house/kubetorch
Distribute and run AI workloads on Kubernetes magically in Python, like PyTorch for ML infra.
Language: Python - Size: 31.2 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,093 - Forks: 45
microsoft/GODEL
Large-scale pretrained models for goal-directed dialog
Language: Python - Size: 49.8 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 883 - Forks: 112
GoogleCloudPlatform/DataflowJavaSDK 📦
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Size: 12.9 MB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 852 - Forks: 320
benibela/xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
Language: Pascal - Size: 2.09 MB - Last synced at: 6 days ago - Pushed at: 9 months ago - Stars: 818 - Forks: 45
asyml/texar-pytorch
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/
Language: Python - Size: 3.08 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 747 - Forks: 113
hstreamdb/hstream
HStreamDB is an open-source, cloud-native streaming database for IoT and beyond. Modernize your data stack for real-time applications.
Language: Haskell - Size: 6.28 MB - Last synced at: 17 days ago - Pushed at: 11 months ago - Stars: 727 - Forks: 55
ChenghaoMou/text-dedup
All-in-one text de-duplication
Language: Python - Size: 58.9 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 714 - Forks: 73
SebKrantz/collapse
Advanced and Fast Data Transformation in R
Language: C - Size: 111 MB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 693 - Forks: 35
jofpin/synthBTC
A tool that uses advanced Monte Carlo simulations and Turbit parallel processing to create possible Bitcoin prediction scenarios.
Language: JavaScript - Size: 6.46 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 672 - Forks: 398
infoslack/awesome-kafka
A list about Apache Kafka
Size: 96.7 KB - Last synced at: 9 days ago - Pushed at: 8 months ago - Stars: 583 - Forks: 165
kousun12/eternal
👾~ music, eternal ~ 👾
Language: JavaScript - Size: 91.4 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 555 - Forks: 34
Puchaczov/Musoq
SQL Syntax without any database
Language: C# - Size: 17.2 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 493 - Forks: 21
constellation-rs/amadeus
Harmonious distributed data analysis in Rust.
Language: Rust - Size: 2.46 MB - Last synced at: 22 days ago - Pushed at: over 4 years ago - Stars: 482 - Forks: 25
polyaxon/haupt
Lineage metadata API, artifacts streams, sandbox, API, and spaces for Polyaxon
Language: Python - Size: 1.18 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 451 - Forks: 210
maykulkarni/Machine-Learning-Notebooks
Machine Learning notebooks for refreshing concepts.
Language: Jupyter Notebook - Size: 13.2 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 420 - Forks: 218
msamogh/nonechucks
Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!
Language: Python - Size: 25.4 KB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 377 - Forks: 27
flow-php/etl
PHP - ETL (Extract Transform Load) data processing library
Language: PHP - Size: 3.73 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 366 - Forks: 22
ml6team/fondant
Production-ready data processing made easy and shareable
Language: Python - Size: 23 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 354 - Forks: 27
lithops-cloud/lithops
A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
Language: Python - Size: 12.9 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 348 - Forks: 114
keithorange/PatternPy
📈 PatternPy: A Python package revolutionizing trading analysis with high-speed pattern recognition, leveraging Pandas & Numpy. Effortlessly spot Head & Shoulders, Tops & Bottoms, Supports & Resistances. For experts & beginners. #TradingMadeEasy 🔥
Language: Python - Size: 404 KB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 335 - Forks: 78
matousc89/padasip
Python Adaptive Signal Processing
Language: Python - Size: 5.93 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 314 - Forks: 52
PytLab/VASPy
Manipulating VASP files with Python.
Language: Python - Size: 21.1 MB - Last synced at: 21 days ago - Pushed at: over 3 years ago - Stars: 289 - Forks: 99
alttch/rapidtables
Super fast list of dicts to pre-formatted tables conversion library for Python 2/3
Language: Python - Size: 240 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 288 - Forks: 9
streamnative/pulsar-flink 📦
Elastic data processing with Apache Pulsar and Apache Flink
Language: Java - Size: 2.16 MB - Last synced at: 4 months ago - Pushed at: about 3 years ago - Stars: 279 - Forks: 120
ColasGael/Machine-Learning-for-Solar-Energy-Prediction
Predict the Power Production of a solar panel farm from Weather Measurements using Machine Learning
Language: Python - Size: 922 MB - Last synced at: 9 days ago - Pushed at: about 6 years ago - Stars: 279 - Forks: 120
svenkreiss/pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Language: Python - Size: 3.45 MB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 271 - Forks: 45
Yord/pxi
🧚 pxi (pixie) is a small, fast, and magical command-line data processor similar to jq, mlr, and awk.
Language: JavaScript - Size: 19.6 MB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 268 - Forks: 3
scramjetorg/scramjet
Public tracker for Scramjet Cloud Platform, a platform that brings data from many environments together.
Size: 2.71 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 252 - Forks: 20
airscholar/e2e-data-engineering
An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All components are containerized with Docker for easy deployment and scalability.
Language: Python - Size: 289 KB - Last synced at: 6 months ago - Pushed at: 9 months ago - Stars: 250 - Forks: 123
asyml/forte
Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/
Language: Python - Size: 17.8 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 248 - Forks: 59
mech-lang/mech
🦾 Mech is a programming language for building data-driven systems like robots, games, and interfaces. Start here!
Language: Rust - Size: 18.5 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 245 - Forks: 14
apache/incubator-wayang
Apache Wayang (incubating) is the first cross-platform data processing system.
Language: Java - Size: 19.3 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 234 - Forks: 107
helmholtz-analytics/heat
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
Language: Python - Size: 22.2 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 225 - Forks: 57
senbox-org/snap-engine
ESA Earth Observation Toolbox and Java Development Platform
Language: Java - Size: 1.01 GB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 199 - Forks: 105
LibreCat/Catmandu
Catmandu - a data processing toolkit
Language: Perl - Size: 53.3 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 191 - Forks: 36
hxz393/BrutalityExtractor
Multiprocess decompression software for high-performance systems
Language: Python - Size: 4.91 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 187 - Forks: 12
markus-wa/cq
Clojure Query: A Command-line Data Processor for JSON, YAML, EDN, XML and more
Language: Clojure - Size: 202 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 183 - Forks: 11
fluxus-labs/fluxus
Fluxus Stream Processing Engine
Language: Rust - Size: 5.07 MB - Last synced at: 29 days ago - Pushed at: about 2 months ago - Stars: 166 - Forks: 22
remotesensinginfo/rsgislib
Remote Sensing and GIS Software Library; python module tools for processing spatial data.
Language: C++ - Size: 140 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 153 - Forks: 28
senbox-org/snap-desktop
Desktop GUI for SNAP based on NetBeans Platform
Language: Java - Size: 77.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 143 - Forks: 66
iam-mhaseeb/Skytrax-Data-Warehouse 📦
A full data warehouse infrastructure with ETL pipelines running inside docker on Apache Airflow for data orchestration, AWS Redshift for cloud data warehouse and Metabase to serve the needs of data visualizations such as analytical dashboards.
Language: Python - Size: 1.34 MB - Last synced at: 3 months ago - Pushed at: over 5 years ago - Stars: 137 - Forks: 30
tollwerk/data-processing-agreements
Collection of Data Processing Agreement (DPA) and GDPR compliance resources
Language: SCSS - Size: 98.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 134 - Forks: 24
Nonanti/PipeFlow
High-performance ETL pipeline library for .NET. Process CSV, JSON, Excel, and SQL data with minimal memory usage through streaming operations.
Language: C# - Size: 4.01 MB - Last synced at: 24 days ago - Pushed at: about 2 months ago - Stars: 132 - Forks: 10
luckylittle/blinkist-m4a-downloader
Grabs all of the audio files from all of the Blinkist books
Language: Go - Size: 101 KB - Last synced at: 7 months ago - Pushed at: almost 3 years ago - Stars: 132 - Forks: 25
kfultz07/go-dataframe
A simple package to abstract away the process of creating usable DataFrames for data analytics. This package is heavily inspired by the amazing Python library, Pandas.
Language: Go - Size: 3.96 MB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 130 - Forks: 8
thu-coai/cotk
Conversational Toolkit. An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation
Language: Python - Size: 10.5 MB - Last synced at: 2 months ago - Pushed at: about 5 years ago - Stars: 128 - Forks: 26
LiberTEM/LiberTEM
Open pixelated STEM framework
Language: Python - Size: 230 MB - Last synced at: 17 days ago - Pushed at: 30 days ago - Stars: 122 - Forks: 68
NVIDIA/nvImageCodec
nvImageCodec: a library of GPU- and CPU-accelerated codecs featuring a unified interface
Language: Jupyter Notebook - Size: 30.6 MB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 122 - Forks: 9
drshahizan/HPDP
High performance data processing employs high performance computing (HPC) to process data, which is then translated into information and knowledge. The advent of high-performance computing and data analytics enabled real-time interrogation of extremely large data sets.
Language: Jupyter Notebook - Size: 527 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 122 - Forks: 89
Siteimprove/alfa
♿ Suite of open and standards-based tools for performing reliable accessibility conformance testing at scale
Language: HTML - Size: 64.5 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 121 - Forks: 15
utdemir/distributed-dataset
A distributed data processing framework in Haskell.
Language: Haskell - Size: 875 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 117 - Forks: 5
streamnative/pulsar-spark
Spark Connector to read and write with Pulsar
Language: Scala - Size: 722 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 116 - Forks: 53
zengwangfa/2019-Electronic-Design-Competition
[Electronic Design Contest] 2019 National Undergraduate Electronic Design Contest (Problem F): paper sheet counting device (based on STM32F407 & FDC2214 & USART HMI)
Language: C - Size: 80.9 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 105 - Forks: 41
kubeflow/mcp-apache-spark-history-server
MCP Server for Apache Spark History Server. The bridge between Agentic AI and Apache Spark.
Language: Python - Size: 2.33 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 102 - Forks: 33
whoiskatrin/financial-statement-pdf-extractor
Python script to extract as much structured information as possible from annual/quarterly reports.
Language: Python - Size: 17.6 KB - Last synced at: 8 months ago - Pushed at: almost 2 years ago - Stars: 99 - Forks: 24
akashlevy/Deep-Learn-Oil
Deep learning tools for predicting oil well data
Language: Python - Size: 512 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 97 - Forks: 55
docwire/docwire
DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Supports nearly 100 data formats, including email boxes and OCR. Boost efficiency in text extraction, web data extraction, data mining, document analysis. Offline processing is possible for security and confidentiality.
Language: C++ - Size: 36.4 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 94 - Forks: 24
asavinov/prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Language: Python - Size: 1.95 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 91 - Forks: 5
MDSplus/mdsplus
The MDSplus data management system
Language: Java - Size: 196 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 87 - Forks: 49
DRUMNICORN/Visio
Visio is an AI-powered IDE concept that turns software development into a visual, code-free experience, making programming accessible to everyone.
Size: 1020 KB - Last synced at: 9 months ago - Pushed at: about 1 year ago - Stars: 83 - Forks: 5
aces/cbrain
CBRAIN is a flexible Ruby on Rails framework for accessing and processing large data on high-performance computing infrastructures.
Language: Ruby - Size: 20.5 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 80 - Forks: 51
pauliacomi/pyGAPS
A framework for processing adsorption data and isotherm fitting
Language: Python - Size: 26.4 MB - Last synced at: 10 days ago - Pushed at: 9 months ago - Stars: 78 - Forks: 26
vortex-exoplanet/VIP
VIP is a python package/library for angular, reference star and spectral differential imaging for exoplanet/disk detection through high-contrast imaging.
Language: Python - Size: 330 MB - Last synced at: 20 days ago - Pushed at: 28 days ago - Stars: 77 - Forks: 62
duoan/ijcai18-mama-ads-competition
IJCAI-18 Alimama search ad conversion prediction: preliminary-round solution
Language: Jupyter Notebook - Size: 1.11 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 72 - Forks: 22
alirezatheh/perke
A keyphrase extractor for Persian
Language: Python - Size: 143 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 69 - Forks: 8
JusperLee/LRS3-For-Speech-Separation
Multi-modal speech separation task data generation script on LRS3 data set.
Language: MATLAB - Size: 3.4 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 69 - Forks: 14
p-ranav/pipeline
Pipelines for Modern C++
Language: C++ - Size: 245 KB - Last synced at: 7 months ago - Pushed at: about 5 years ago - Stars: 67 - Forks: 8
UrbanOS-Public/smartcitiesdata
The core micro services of UrbanOS as an umbrella project with component documentation
Language: Elixir - Size: 14.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 64 - Forks: 11
NVIDIA-AI-IOT/deepstream_libraries
DeepStream Libraries offer CVCUDA, NvImageCodec, and PyNvVideoCodec modules as Python APIs for seamless integration into custom frameworks.
Language: Python - Size: 246 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 63 - Forks: 1
unidentifieddeveloper/blaze
A blazing fast exporter for your Elasticsearch data.
Language: C++ - Size: 34.2 KB - Last synced at: 6 months ago - Pushed at: 11 months ago - Stars: 62 - Forks: 9
AtomGraph/Processor
Ontology-driven Linked Data processor and server for SPARQL backends. Apache License.
Language: Java - Size: 1.51 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 60 - Forks: 7
BjoernKW/ZenQuery
Enterprise backend as a service
Language: Java - Size: 5.84 MB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 60 - Forks: 15
josephmachado/online_store
End to end data engineering project
Language: Python - Size: 1.53 MB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 57 - Forks: 18
wq/itertable
⇔ IterTable is a Pythonic API for iterating through tabular data formats, including CSV, XLSX, XML, and JSON.
Language: Python - Size: 248 KB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 53 - Forks: 11
31z4/storm-docker 📦
Docker image packaging for Apache Storm
Language: Dockerfile - Size: 81.1 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 52 - Forks: 27
TirendazAcademy/Data-Visualization-with-Python
Data Visualization Tutorial | Matplotlib | Seaborn | Pandas
Language: Jupyter Notebook - Size: 25.5 MB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 51 - Forks: 34
luisbelloch/data_processing_course
Some class materials for a data processing course using PySpark
Language: Python - Size: 563 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 51 - Forks: 24
Samson-Mano/Fast_Fourier_Transform
C# implementation of Cooley–Tukey's FFT algorithm.
Language: C# - Size: 1.44 MB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 48 - Forks: 17
adelekuzmiakova/CS229-machine-learning-solar-energy-predictions
Predicting solar energy using machine learning (LSTM, PCA, boosting). This is our CS 229 project from autumn 2017. Report and poster are included.
Language: Python - Size: 922 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 48 - Forks: 12
gabyx/ExecutionGraph
Fast Generic Execution Graph/Network
Language: C++ - Size: 24.8 MB - Last synced at: 6 months ago - Pushed at: about 3 years ago - Stars: 46 - Forks: 7