GitHub topics: data-processing
Soriano-R/disaster-response-pipeline
A data science pipeline for analyzing and responding to disaster-related data
Language: HTML - Size: 60 MB - Last synced at: about 7 hours ago - Pushed at: about 7 hours ago - Stars: 0 - Forks: 0

qbxlvnf11/data-preprocessing-methods
Image/Text/Signal data processing methods & data parser & other utils etc.
Language: Jupyter Notebook - Size: 4.14 MB - Last synced at: about 11 hours ago - Pushed at: about 11 hours ago - Stars: 2 - Forks: 1

cocoindex-io/cocoindex
Real-time data transformation framework for AI. Ultra performant, with incremental processing.
Language: Rust - Size: 7.78 MB - Last synced at: about 13 hours ago - Pushed at: about 14 hours ago - Stars: 1,927 - Forks: 130

pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Language: Python - Size: 132 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 27,732 - Forks: 623

johnkerl/miller
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Language: Go - Size: 201 MB - Last synced at: about 12 hours ago - Pushed at: 9 days ago - Stars: 9,329 - Forks: 224

Labs64/labs64.io-auditflow
Labs64.IO - Scalable & Searchable Auditing Solution
Language: Python - Size: 117 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

deepseek-ai/smallpond
A lightweight data processing framework built on DuckDB and 3FS.
Language: Python - Size: 1.77 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 4,700 - Forks: 415

unionai-oss/pandera
A light-weight, flexible, and expressive statistical data testing library
Language: Python - Size: 3.98 MB - Last synced at: about 21 hours ago - Pushed at: 4 days ago - Stars: 3,861 - Forks: 345

ull0sm/Drawer
Drawer automates single-elimination draw systems, ensuring fairness with balanced group allocation and bias-free brackets. Now enhanced with Docker, it eliminates dependency issues for seamless event management.
Language: Python - Size: 495 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

apache/incubator-wayang
Apache Wayang (incubating) is the first cross-platform data processing system.
Language: Java - Size: 19.1 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 222 - Forks: 96

ndjapic/mat7-2024
Materials for the seventh-grade mathematics course in the 2024/2025 school year
Language: TeX - Size: 1.27 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

Efidieeieiddidfkkfkfkf/Generador-De-Oficios
Flask web application that generates personalized official letters in Word from a template, using recipient data stored in an Excel business directory.
Language: Python - Size: 14.6 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

olympus-terminal/data-processing
Data analysis and processing tools
Language: Python - Size: 14.6 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

Dat09123/btc_address_sorter_by_type
🔎 Ultra-fast Bitcoin address sorter with real-time multiprocessing, address format detection, and low RAM usage. Ideal for forensic research, data analytics, and blockchain intelligence.
Language: Python - Size: 9.77 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

Tyson-cyber/GetMerlin2Api
GetMerlin2Api is a versatile API that allows users to seamlessly integrate Merlin2 software capabilities into their own applications, enabling enhanced project management and collaboration features. With its comprehensive documentation and user-friendly endpoints, developers can easily leverage the power of Merlin2 within their projects for optimal
Size: 1000 Bytes - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

docwire/docwire
DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Supports nearly 100 data formats, including email boxes and OCR. Boost efficiency in text extraction, web data extraction, data mining, document analysis. Offline processing is possible for security and confidentiality
Language: C++ - Size: 36.1 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 88 - Forks: 18

TomWright/dasel
Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
Language: Go - Size: 8.56 MB - Last synced at: about 20 hours ago - Pushed at: 3 months ago - Stars: 7,485 - Forks: 149

bytewax/bytewax
Python Stream Processing
Language: Python - Size: 12 MB - Last synced at: about 2 hours ago - Pushed at: 3 months ago - Stars: 1,765 - Forks: 82

BADER76/solar-power-measurement
This repository hosts a solar power measurement system that tracks voltage, current, and power using the STM32F103C8T6 microcontroller. The data is visualized in real-time on an OLED display and sent to the ThingSpeak IoT cloud for further analysis. 🌐🌞
Language: C - Size: 18 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

numaproj/numaflow
Kubernetes-native platform to run massively parallel data/streaming jobs
Language: Go - Size: 45.2 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,878 - Forks: 133

Naveen-526/Federated-Learning-based-IDS
This repository features a federated learning system designed for intrusion detection in IoT networks, ensuring data privacy while maintaining high accuracy. The project utilizes the Flower framework and includes essential components like data processing, server-client architecture, and SSL certificates for secure communication. 🐙🌐
Language: Jupyter Notebook - Size: 19.5 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

tathithienthanh/WomenFashionProductRecommendationSystem
Build a recommendation system for recommending women's fashion products on e-commerce platforms
Language: Jupyter Notebook - Size: 49.8 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

Pig85236/45K-Udemy-Course-WordPress-Posts
XML files of 45K+ Udemy courses for WordPress—Share Knowledge, Drive Traffic, & Make Money! 🔥🚀
Size: 1.95 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4 - Forks: 1

modelscope/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Language: Python - Size: 223 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4,607 - Forks: 243

nxoti/cnpj-data-pipeline
🇧🇷 CNPJ Data Pipeline: A modular, configurable script for processing CNPJ files from Brazil's Federal Revenue Service (Receita Federal do Brasil). 🐙 This project supports multiple databases and enables intelligent processing of more than 50 million companies.
Language: Python - Size: 384 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

crate/cratedb-toolkit
CrateDB Toolkit, an SDK for CrateDB and CrateDB Cloud.
Language: Python - Size: 3.54 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 10 - Forks: 4

johnhany/awesome-list
A list of useful stuff in Machine Learning, Computer Graphics, Software Development, ...
Size: 1.13 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 5

LiberTEM/LiberTEM
Open pixelated STEM framework
Language: Python - Size: 229 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 117 - Forks: 68

allenai/dolma
Data and tools for generating and inspecting OLMo pre-training data.
Language: Python - Size: 63.5 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1,241 - Forks: 144

seinecle/nocodefunctions-web-app
The code base of the front-end of nocodefunctions.com
Language: Java - Size: 37.9 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 39 - Forks: 7

drshahizan/HPDP
High performance data processing employs high performance computing (HPC) to process data, which is then translated into information and knowledge. The advent of high-performance computing and data analytics enabled real-time interrogation of extremely large data sets.
Language: Jupyter Notebook - Size: 400 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 121 - Forks: 86

microsoft/GODEL
Large-scale pretrained models for goal-directed dialog
Language: Python - Size: 49.8 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 869 - Forks: 112

deermichel/flowing
🔀 Rusty flow graph processing library
Language: Rust - Size: 17.6 KB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 1

infoslack/awesome-kafka
A list about Apache Kafka
Size: 96.7 KB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 579 - Forks: 164

KikiBoum4980/2025-One-Billion-Row-Challenge
One Billion Row project updated for 2025
Language: Python - Size: 438 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

louisejuliedelhaye/counting-ocean-particles
A set of simple scripts for processing data on marine suspended particles collected with different sensors
Language: Jupyter Notebook - Size: 7.33 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

venis-majkofci/Log2Csv
A PowerShell script designed to parse and convert unstructured log files into structured CSV format, facilitating easier analysis and processing.
Language: PowerShell - Size: 34.2 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

speedcell4/torchglyph
Data Processor Combinators for Natural Language Processing
Language: Python - Size: 546 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 7 - Forks: 1

ChenghaoMou/text-dedup
All-in-one text de-duplication
Language: Python - Size: 5.77 MB - Last synced at: 4 days ago - Pushed at: 28 days ago - Stars: 688 - Forks: 74

dd-hebert/uv_pro
Command line tool for parsing and processing UV-Vis data from the Agilent 845x Chemstation software.
Language: Python - Size: 4.96 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 1

streamnative/pulsar-spark
Spark Connector to read and write with Pulsar
Language: Scala - Size: 726 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 113 - Forks: 51

StatCan/gensol-gseries
Package gseries - R version of the generalized system G-Series. English site: https://StatCan.github.io/gensol-gseries/en/ ; French site: https://StatCan.github.io/gensol-gseries/fr/
Language: R - Size: 12.6 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 5 - Forks: 1

AtomGraph/Processor
Ontology-driven Linked Data processor and server for SPARQL backends. Apache License.
Language: Java - Size: 1.51 MB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 60 - Forks: 7

NVIDIA-NeMo/Curator
Scalable data preprocessing and curation toolkit for LLMs
Language: Jupyter Notebook - Size: 7.99 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 948 - Forks: 138

ObinnaOkoye89/diet-coach-app
A Python-based Diet Coach app that calculates total nutritional values—calories, fats, proteins, carbohydrates, and sugars—based on user-selected foods and quantities. Built using a JSON nutrition dataset for real-time feedback on dietary choices. Ideal for health-conscious users and developers interested in nutrition-focused applications.
Language: Python - Size: 238 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

technologiestiftung/erfrischungskarte-daten
Code for preprocessing and modeling, plus the raw and resulting data, for the 'Erfrischungskarte'.
Language: R - Size: 32.9 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 8 - Forks: 2

arm-university/Arm-Helium-Technology
A reference book on M-Profile Vector Extensions (MVE) for Arm Cortex-M Processors
Size: 9.58 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 12 - Forks: 0

brunocampos01/data-engineering
Language: Python - Size: 165 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 11 - Forks: 2

subhayu99/datasetpipeline
A data processing and analysis pipeline designed to handle various jobs related to data transformation, quality assessment, deduplication, and formatting.
Language: Python - Size: 1.59 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 0

FaninhoFrade/GSP632-convolution
This repository offers a practical approach to image processing using convolutions and pooling on Google Cloud. 🖼️ Dive into hands-on experiments with SciPy and NumPy to enhance your understanding of deep learning concepts. 💻
Language: Python - Size: 373 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

microsoft/DialoGPT
Large-scale pretraining for dialogue
Language: Python - Size: 43.6 MB - Last synced at: 2 days ago - Pushed at: over 2 years ago - Stars: 2,389 - Forks: 348

ljubogdan/GSP632-convolution
A project for image processing using convolutions and pooling in Google Cloud. Load and process images with SciPy and NumPy, create 3x3 filters, and analyze output effects. Focuses on practical applications of deep learning and computer vision.
Language: Python - Size: 374 KB - Last synced at: 5 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

ictchenbo/SmartETL
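SmartETL: a simple, flexible, configurable, out-of-the-box Python ETL framework with domain-specific features that avoids reinventing the wheel! Provides processing pipelines for open data sources such as Wikidata, Wikipedia, and GDELT; supports file formats such as txt/json/csv/excel and databases such as MySQL/PostgreSQL/MongoDB/ClickHouse/ElasticSearch as input and output; provides a variety of processing operators, including large language models and Web APIs.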
Language: Python - Size: 4.74 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 17 - Forks: 3

NVIDIA/nvImageCodec
A nvImageCodec library of GPU- and CPU- accelerated codecs featuring a unified interface
Language: Jupyter Notebook - Size: 22.3 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 106 - Forks: 8

NVIDIA/DALI
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
Language: C++ - Size: 395 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 5,429 - Forks: 639

QuantumRevenant/ListProductsImages
ListProductsImages: C#/.NET 8 console utility for listing and filtering files in directories with advanced rules (folders, regex) and interactive menus. Exports to .txt.
Language: C# - Size: 36.1 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

polyaxon/haupt
Lineage metadata API, artifacts streams, sandbox, API, and spaces for Polyaxon
Language: Python - Size: 1.16 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 453 - Forks: 209

CEA-MetroCarac/SPECTROview
SPECTROview : A Tool for Spectroscopic Data Processing and Visualization.
Language: Python - Size: 208 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

niamoto/niamoto
Niamoto is a command-line application and library focused on processing and publishing botanical data
Language: Python - Size: 11.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 3 - Forks: 0

dashbitco/broadway
Concurrent and multi-stage data ingestion and data processing with Elixir
Language: Elixir - Size: 656 KB - Last synced at: 7 days ago - Pushed at: 16 days ago - Stars: 2,536 - Forks: 167

abhimehro/Seatek_Analysis
R-based analysis tier for Seatek sensor data processing and Excel workbook generation. Part of a three-tier analysis system working in conjunction with a Python-based visualization project.
Language: Python - Size: 50.7 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

adanSiqueira/modular-data-pipeline
Data pipeline in Python, structured with Object-Oriented Programming (OOP), using pandas for processing, requests for automated downloads, and pathlib for directory handling. Modular and organized to transform raw data (JSON and CSV) into analysis-ready datasets with a single command.
Language: Python - Size: 56.1 MB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

simsam8/ers_data_processing
This repo contains code for processing and visualizing ERS and AIS data from Fiskeridirektoratet.
Language: Jupyter Notebook - Size: 556 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

tinosingh/multipass
Universal API Wrapper - Turn ANY Python Library into a Robust API
Language: Python - Size: 41 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

senbox-org/snap-engine
ESA Earth Observation Toolbox and Java Development Platform
Language: Java - Size: 1020 MB - Last synced at: 7 days ago - Pushed at: 11 days ago - Stars: 193 - Forks: 102

lispking/fluxus
Fluxus Stream Processing Engine
Language: Rust - Size: 5.02 MB - Last synced at: 7 days ago - Pushed at: 29 days ago - Stars: 150 - Forks: 22

aces/cbrain
CBRAIN is a flexible Ruby on Rails framework for accessing and processing large data on high-performance computing infrastructures.
Language: Ruby - Size: 20.4 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 78 - Forks: 51

mech-lang/mech
🦾 Mech is a programming language for building data-driven systems like robots, games, and interfaces. Start here!
Language: Rust - Size: 11.2 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 222 - Forks: 12

kwadwo-Oppong/gdp-natural-cubic-spline-regression
This project investigates economic growth factors, specifically GDP, by applying ordinary least squares (OLS) and a more robust, proposed estimator. It includes data preparation, feature engineering with natural cubic splines, and detailed analysis
Language: Jupyter Notebook - Size: 76.2 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

senbox-org/snap-desktop
Desktop GUI for SNAP based on NetBeans Platform
Language: Java - Size: 77.8 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 142 - Forks: 64

helmholtz-analytics/heat
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
Language: Python - Size: 21.1 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 222 - Forks: 55

etsap-TIMES/xl2times
Open source tool to convert TIMES models specified in Excel
Language: Python - Size: 931 KB - Last synced at: 4 days ago - Pushed at: 13 days ago - Stars: 18 - Forks: 9

markus-wa/cq
Clojure Query: A Command-line Data Processor for JSON, YAML, EDN, XML and more
Language: Clojure - Size: 202 KB - Last synced at: 3 days ago - Pushed at: 10 months ago - Stars: 178 - Forks: 11

rohankharche34/Solar-panel-performance-optimization
Stacked regression ensemble using sensor and environmental data to forecast solar panel efficiency.
Language: Jupyter Notebook - Size: 5.25 MB - Last synced at: 8 minutes ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

havrak/fmcw-surveillance-radar
Repository of my bachelor's thesis, whose subject is constructing a surveillance radar based on FMCW SiRad Easy
Language: MATLAB - Size: 148 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

legend-exp/Juleanita.jl
Meta-package for the Julia software stack to analyse teststand data for the LEGEND experiment.
Language: Julia - Size: 1.15 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2 - Forks: 1

flow-php/etl-adapter-elasticsearch
PHP ETL Adapter: Elasticsearch
Language: PHP - Size: 289 KB - Last synced at: 10 days ago - Pushed at: 13 days ago - Stars: 2 - Forks: 1

flow-php/etl
PHP - ETL (Extract Transform Load) data processing library
Language: PHP - Size: 3.7 MB - Last synced at: 7 days ago - Pushed at: 13 days ago - Stars: 359 - Forks: 20

tealtools/awesome-apache-pulsar
A curated list of resources about Apache Pulsar.
Size: 367 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 30 - Forks: 3

abrahamkoloboe27/Housing-Price-Prediction
Language: Jupyter Notebook - Size: 937 KB - Last synced at: 13 days ago - Pushed at: 14 days ago - Stars: 2 - Forks: 0

zazuko/barnard59
An intuitive and flexible RDF pipeline solution designed to simplify and automate ETL processes for efficient data management.
Language: JavaScript - Size: 3.66 MB - Last synced at: 10 days ago - Pushed at: 2 months ago - Stars: 33 - Forks: 2

ddeutils/ddeutil-extensions
🏗️ Dynamic data processing & transformation plugins
Language: Python - Size: 604 KB - Last synced at: about 17 hours ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

karimosman89/iot-predictive-maintenance
This repository will simulate an IoT-based predictive maintenance system designed to monitor industrial equipment through sensors. It will include data ingestion, processing, and machine learning components to predict potential failures, optimizing maintenance schedules and reducing downtime.
Language: Python - Size: 38.1 KB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 2 - Forks: 0

MrGL1TCH/Exportador_csv_a_base_de_datos
Exportador CSV a Base de Datos is a web application designed to simplify and automate importing data from CSV files into MySQL databases. Ideal for users and developers who need a fast, reliable, easy-to-use tool for handling large volumes of data without technical complications.
Language: PHP - Size: 13.7 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

ColasGael/Machine-Learning-for-Solar-Energy-Prediction
Predict the Power Production of a solar panel farm from Weather Measurements using Machine Learning
Language: Python - Size: 922 MB - Last synced at: 7 days ago - Pushed at: over 5 years ago - Stars: 267 - Forks: 113

CoreBlader/autobiz-api-extractor
Autobiz API Extractor. Description: This project extracts data from the [Autobiz API](https://corporate.autobiz.com/es/nuestros-productos/autobizapi/), storing it in JSON or CSV files, and analyzes the results. It features a modular structure for easy data extraction, processing, and visualization. 🐙📊
Language: Python - Size: 14.6 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

jwalsh/emacsconf-2024
EmacsConf 2024 conference notes, transcript processing, and analysis tools
Language: Python - Size: 356 KB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

getstrm/pace
Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you to programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain 'ol Postgres, even!) with definitions imported from Collibra, Datahub, ODD and the like.
Language: Kotlin - Size: 13.1 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 36 - Forks: 1

asavinov/prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Language: Python - Size: 1.95 MB - Last synced at: 15 days ago - Pushed at: over 3 years ago - Stars: 91 - Forks: 5

pyper-dev/pyper
Concurrent Python made simple
Language: Python - Size: 462 KB - Last synced at: 14 days ago - Pushed at: 5 months ago - Stars: 1,421 - Forks: 28

IBM/ibm-cloud-functions-data-processing-message-hub 📦
Create a serverless, event-driven application with Apache OpenWhisk on IBM Cloud Functions that executes code in response to messages or to handle streams of data records from Apache Kafka or IBM Message Hub.
Language: Shell - Size: 1.55 MB - Last synced at: 13 days ago - Pushed at: about 6 years ago - Stars: 21 - Forks: 26

MDSplus/mdsplus
The MDSplus data management system
Language: Java - Size: 148 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 82 - Forks: 48

Kellybrackets/data-Analytics-projects
A collection of end-to-end analytics projects demonstrating expertise in transforming raw data into actionable business insights using modern analytics tools and methodologies.
Size: 1.95 KB - Last synced at: 18 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

Ambeteco/faster-os
6800% faster "os" module replacement. A drop-in replacement for Python's standard 'os' module. Fully rewritten, optimized, and sped-up functions that replace the ones in the os.path module.
Language: Python - Size: 1.53 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 2

Siteimprove/alfa
♿ Suite of open and standards-based tools for performing reliable accessibility conformance testing at scale
Language: TypeScript - Size: 52.2 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 115 - Forks: 12

ml6team/fondant
Production-ready data processing made easy and shareable
Language: Python - Size: 23 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 351 - Forks: 27

ElecGeek/HealthMeter
Converts the binary file (.DAT) into a more readable format and computes some statistics for health or sport meters.
Language: C++ - Size: 53.7 KB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

remotesensinginfo/rsgislib
Remote Sensing and GIS Software Library; python module tools for processing spatial data.
Language: C++ - Size: 140 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 153 - Forks: 28

Siteimprove/alfa-act-r
📋 Acceptance testing of rules authored by the ACT Rules Community Group (@act-rules) and implemented by Alfa
Language: TypeScript - Size: 34.6 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 1 - Forks: 2

AmirAli104/Text2Excel
A GUI desktop application that can extract data from a text file and put them in an Excel or CSV file using regular expression (regex) patterns
Language: Python - Size: 208 KB - Last synced at: 4 days ago - Pushed at: 21 days ago - Stars: 4 - Forks: 0
