An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-processing

ayeujjawalsingh/Databricks

Size: 163 KB - Last synced at: about 2 hours ago - Pushed at: about 4 hours ago - Stars: 0 - Forks: 0

legend-exp/legend-dataflow

LEGEND data flow management

Language: Python - Size: 1.21 MB - Last synced at: about 2 hours ago - Pushed at: about 4 hours ago - Stars: 2 - Forks: 13

Labs64/labs64.io-auditflow

Labs64.IO - Scalable & Searchable Auditing Solution

Language: Java - Size: 118 KB - Last synced at: about 13 hours ago - Pushed at: about 15 hours ago - Stars: 1 - Forks: 0

IstariRobotics/egohub

An end-to-end toolkit for ingesting, normalizing, and processing diverse egocentric datasets for humanoid robotics research. It provides a flexible pipeline for converting multiple data formats into a unified canonical schema, enriching them with features like object detection, training toolsets and visualizations.

Language: Python - Size: 3.46 MB - Last synced at: about 24 hours ago - Pushed at: 1 day ago - Stars: 2 - Forks: 0

Pig85236/45K-Udemy-Course-WordPress-Posts

XML files of 45K+ Udemy courses for WordPress—Share Knowledge, Drive Traffic, & Make Money! 🔥🚀

Size: 1.95 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4 - Forks: 1

OpenDCAI/DataFlow

Easy Data Preparation with latest LLMs-based Operators and Pipelines.

Language: Python - Size: 70.3 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 881 - Forks: 49

Pablo-gitub/excel_category

Flutter application for working with Excel files. It helps you filter and view items based on what you are searching for, much as you could with pivot tables, but through a simple interface.

Language: Dart - Size: 8.61 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

modelscope/data-juicer

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Language: Python - Size: 281 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 4,756 - Forks: 245

Efidieeieiddidfkkfkfkf/Generador-De-Oficios

Flask web application that generates personalized official letters (oficios) in Word from a template, using recipient data stored in an Excel business directory.

Language: Python - Size: 14.6 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

Dat09123/btc_address_sorter_by_type

🔎 Ultra-fast Bitcoin address sorter with real-time multiprocessing, address format detection, and low RAM usage. Ideal for forensic research, data analytics, and blockchain intelligence.

Language: Python - Size: 9.77 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

abhimehro/Seatek_Analysis

R-based analysis tier for Seatek sensor data processing and Excel workbook generation. Part of a three-tier analysis system that works in conjunction with a Python-based visualization project.

Language: Python - Size: 50.7 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

Tyson-cyber/GetMerlin2Api

GetMerlin2Api is a versatile API that allows users to seamlessly integrate Merlin2 software capabilities into their own applications, enabling enhanced project management and collaboration features. With its comprehensive documentation and user-friendly endpoints, developers can easily leverage the power of Merlin2 within their projects for optimal

Size: 1000 Bytes - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

microsoft/GODEL

Large-scale pretrained models for goal-directed dialog

Language: Python - Size: 49.8 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 873 - Forks: 113

microsoft/DialoGPT

Large-scale pretraining for dialogue

Language: Python - Size: 43.6 MB - Last synced at: 2 days ago - Pushed at: over 2 years ago - Stars: 2,395 - Forks: 348

johnkerl/miller

Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

Language: Go - Size: 201 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 9,355 - Forks: 225

brooks-code/silver-broccoli

ATMO-viz submission for the DGE 2024 Challenge.

Language: JavaScript - Size: 5.41 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

CNIC-Proteomics/TurboPutative-web

Language: HTML - Size: 298 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

ChenghaoMou/text-dedup

All-in-one text de-duplication

Language: Python - Size: 5.77 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 700 - Forks: 74

venis-majkofci/Log2Csv

A PowerShell script designed to parse and convert unstructured log files into structured CSV format, facilitating easier analysis and processing.

Language: PowerShell - Size: 43.9 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

YM1KTC/DMRListEditor

An online DMR Contact List editor built with Tailwind CSS and DataTables. Includes features for uploading, editing, and exporting CSV files. Supports fetching DMR users in Turkey from RadioID.net.

Language: HTML - Size: 355 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

bytewax/bytewax

Python Stream Processing

Language: Python - Size: 12 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 1,776 - Forks: 84

KuznetcovIvan/csv_file_processor

A Python script for processing CSV files with support for filtering and aggregation operations. It allows filtering rows by text or numeric columns using operators (>, <, =) and aggregating numeric columns with functions (avg, min, max). The script outputs results in a formatted table to the console.

Language: Python - Size: 50.8 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

Conqxeror/veloxx

Veloxx: A high-performance, lightweight Rust library for in-memory data processing and analytics. Features DataFrames, Series, CSV/JSON I/O, powerful transformations, aggregations, and statistical functions for efficient data science and engineering.

Language: Rust - Size: 555 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

sodascience/kansenkaart_preprocessing

The processing pipeline for the Dutch Opportunity atlas

Language: R - Size: 620 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 2

KikiBoum4980/2025-One-Billion-Row-Challenge

One Billion Row project updated for 2025

Language: Python - Size: 438 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

NVIDIA-NeMo/Curator

Scalable data preprocessing and curation toolkit for LLMs

Language: Jupyter Notebook - Size: 8.48 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,017 - Forks: 145

L0g0rhythm/URL-Refiner

A Python tool to efficiently process, modify, and deduplicate URL lists. Ideal for security professionals, analysts, and developers, with both CLI and GUI support.

Language: Python - Size: 15.6 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

aces/cbrain

CBRAIN is a flexible Ruby on Rails framework for accessing and processing large data on high-performance computing infrastructures.

Language: Ruby - Size: 20.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 78 - Forks: 51

albertopeinador/fDSC

Streamlit app to process flashDSC data. 'Kinetic' and 'Annealing' type measurements are supported as of now

Language: Python - Size: 80.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

niamoto/niamoto

Niamoto is a command-line application and library focused on processing and publishing botanical data

Language: Python - Size: 11.8 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 3 - Forks: 0

flow-php/etl-adapter-elasticsearch

PHP ETL Adapter: Elasticsearch

Language: PHP - Size: 291 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 1

flow-php/etl-adapter-parquet

PHP ETL Adapter: Parquet

Language: PHP - Size: 3.16 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 7 - Forks: 0

NVIDIA/DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

Language: C++ - Size: 395 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 5,453 - Forks: 642

numaproj/numaflow

Kubernetes-native platform to run massively parallel data/streaming jobs

Language: Go - Size: 46.9 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,889 - Forks: 136

LibreCat/Catmandu

Catmandu - a data processing toolkit

Language: Perl - Size: 53.3 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 191 - Forks: 36

cocoindex-io/cocoindex

Data transformation framework for AI. Ultra performant, with incremental processing.

Language: Rust - Size: 9.61 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2,111 - Forks: 145

Siteimprove/alfa-act-r

📋 Acceptance testing of rules authored by the ACT Rules Community Group (@act-rules) and implemented by Alfa

Language: TypeScript - Size: 28 MB - Last synced at: about 22 hours ago - Pushed at: 7 days ago - Stars: 1 - Forks: 2

pathwaycom/pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

Language: Python - Size: 133 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 28,567 - Forks: 625

streamnative/pulsar-hub

The canonical source of StreamNative Hub.

Language: JavaScript - Size: 14.4 MB - Last synced at: about 4 hours ago - Pushed at: about 6 hours ago - Stars: 17 - Forks: 11

polyaxon/haupt

Lineage metadata API, artifacts streams, sandbox, API, and spaces for Polyaxon

Language: Python - Size: 1.11 MB - Last synced at: 1 day ago - Pushed at: 7 days ago - Stars: 454 - Forks: 210

deepseek-ai/smallpond

A lightweight data processing framework built on DuckDB and 3FS.

Language: Python - Size: 1.77 MB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 4,726 - Forks: 415

stephenm7777/pointcloud-rs

🦀☁️ Rust library for 3D point cloud processing

Language: Rust - Size: 11.7 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

FaninhoFrade/GSP632-convolution

This repository offers a practical approach to image processing using convolutions and pooling on Google Cloud. 🖼️ Dive into hands-on experiments with SciPy and NumPy to enhance your understanding of deep learning concepts. 💻

Language: Python - Size: 373 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

ion-fusion/fusion-java

Ion Fusion is a customizable programming language for working with JSON and Amazon Ion data.

Language: Java - Size: 4.58 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 8 - Forks: 4

MeguminBOT/fnf-chart-info-tool

The FNF Chart Info Tool is a web-based utility designed to analyze and display information about chart files used in Friday Night Funkin' (FNF). This tool supports multiple FNF engines and provides information about the chart, including note counts, BPM, scroll speeds, and more.

Language: JavaScript - Size: 12.9 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

unionai-oss/pandera

A light-weight, flexible, and expressive statistical data testing library

Language: Python - Size: 4.14 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 3,895 - Forks: 349

allenai/dolma

Data and tools for generating and inspecting OLMo pre-training data.

Language: Python - Size: 62.5 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,258 - Forks: 146

Iyanuvicky22/data_processing_comparison

The goal of this project is to build a FastAPI-based web service that processes large datasets using Polars and Pandas, and to compare the two packages' performance in loading, cleaning, and transforming large datasets. The API is tested via an interactive `/docs` interface to reveal the results of the comparison.

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

mech-lang/mech

🦾 Mech is a programming language for building data-driven systems like robots, games, and interfaces. Start here!

Language: Rust - Size: 16.2 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 225 - Forks: 12

TomWright/dasel

Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.

Language: Go - Size: 8.56 MB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 7,493 - Forks: 149

nzamski/playlist-compare

A Python tool to compare two music playlist files, identify common and unique tracks, and output the results. It uses pandas for data handling and supports flexible normalization of track and artist names

Language: Python - Size: 0 Bytes - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

dashbitco/broadway

Concurrent and multi-stage data ingestion and data processing with Elixir

Language: Elixir - Size: 656 KB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 2,540 - Forks: 167

theveryhim/Basic-Data-analysis

Working with basic Python tools frequently used in data science

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

CarToi/BE

'새길' data pipeline & consumption WAS

Language: Java - Size: 314 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

CEA-MetroCarac/SPECTROview

SPECTROview : A Tool for Spectroscopic Data Processing and Visualization.

Language: Python - Size: 206 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

helmholtz-analytics/heat

Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python

Language: Python - Size: 21.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 223 - Forks: 55

ThomasSalamone/Titanic-Prediction-NN

This notebook demonstrates how to predict passenger survival on the Titanic using a neural network. It covers data preprocessing steps such as handling missing values, encoding categorical features, and feature scaling. Then, it builds and trains a simple feedforward neural network using a deep learning framework to classify passengers as survived

Language: Jupyter Notebook - Size: 483 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

StatCan/gensol-gseries

(EN) Package gseries - R version of the generalized system G-Series https://StatCan.github.io/gensol-gseries/en/ (FR) Package gseries - R version of the generalized system G-Séries https://StatCan.github.io/gensol-gseries/fr/

Language: R - Size: 12.9 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 5 - Forks: 1

NVIDIA/nvImageCodec

The nvImageCodec library of GPU- and CPU-accelerated codecs featuring a unified interface

Language: Jupyter Notebook - Size: 22.3 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 111 - Forks: 8

crate/cratedb-toolkit

CrateDB Toolkit, an SDK for CrateDB and CrateDB Cloud.

Language: Python - Size: 1.27 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 10 - Forks: 4

corese-stack/corese-core

A powerful Java library for RDF data manipulation, SPARQL querying, reasoning, and semantic web standard compliance.

Language: Java - Size: 26.9 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 6 - Forks: 0

LiberTEM/LiberTEM

Open pixelated STEM framework

Language: Python - Size: 229 MB - Last synced at: 6 days ago - Pushed at: 20 days ago - Stars: 118 - Forks: 69

MDSplus/mdsplus

The MDSplus data management system

Language: Java - Size: 148 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 84 - Forks: 48

Krypton3/AnomalyNodeML

Using TensorFlow to analyze server logs for anomaly detection involves training a machine learning model to identify unusual patterns in the data.

Language: Python - Size: 5.21 MB - Last synced at: 6 days ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

vortex-exoplanet/VIP

VIP is a Python package/library for angular, reference star, and spectral differential imaging for exoplanet/disk detection through high-contrast imaging.

Language: Python - Size: 407 MB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 72 - Forks: 61

BADER76/solar-power-measurement

This repository hosts a solar power measurement system that tracks voltage, current, and power using the STM32F103C8T6 microcontroller. The data is visualized in real-time on an OLED display and sent to the ThingSpeak IoT cloud for further analysis. 🌐🌞

Language: C - Size: 18 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

CityofToronto/bdit_data-sources

Data sources used by the Big Data Innovation Team

Language: Jupyter Notebook - Size: 119 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 40 - Forks: 8

nxoti/cnpj-data-pipeline

🇧🇷 CNPJ Data Pipeline: a modular, configurable script for processing CNPJ files from Brazil's Receita Federal. 🐙 This project supports multiple databases and enables intelligent processing of more than 50 million companies.

Language: Python - Size: 384 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 2 - Forks: 0

drshahizan/HPDP

High performance data processing employs high performance computing (HPC) to process data, which is then translated into information and knowledge. The advent of high-performance computing and data analytics enabled real-time interrogation of extremely large data sets.

Language: Jupyter Notebook - Size: 400 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 122 - Forks: 86

Naveen-526/Federated-Learning-based-IDS

This repository features a federated learning system designed for intrusion detection in IoT networks, ensuring data privacy while maintaining high accuracy. The project utilizes the Flower framework and includes essential components like data processing, server-client architecture, and SSL certificates for secure communication. 🐙🌐

Language: Jupyter Notebook - Size: 19.5 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

alirezatheh/perke

A keyphrase extractor for Persian

Language: Python - Size: 143 KB - Last synced at: 5 days ago - Pushed at: 20 days ago - Stars: 69 - Forks: 8

flow-php/etl

PHP - ETL (Extract Transform Load) data processing library

Language: PHP - Size: 3.36 MB - Last synced at: 6 days ago - Pushed at: 11 days ago - Stars: 360 - Forks: 20

seinecle/nocodefunctions-web-app

The code base of the front-end of nocodefunctions.com

Language: Java - Size: 38 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 39 - Forks: 7

LMLK-seal/ArrowShelf

A lightning-fast, zero-copy, cross-process data store for Python using Apache Arrow.

Language: Python - Size: 75.2 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

senbox-org/snap-engine

ESA Earth Observation Toolbox and Java Development Platform

Language: Java - Size: 1020 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 196 - Forks: 103

danielnorlan/prospectinator-backend-github

Serverless Azure Functions backend for Prospectinator, automates Excel/CSV uploads, uses an AI API to retrieve contact phone numbers & sources, tracks progress, and stores results in Blob Storage.

Language: Python - Size: 1.55 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

ColasGael/Machine-Learning-for-Solar-Energy-Prediction

Predict the Power Production of a solar panel farm from Weather Measurements using Machine Learning

Language: Python - Size: 922 MB - Last synced at: 7 days ago - Pushed at: over 5 years ago - Stars: 269 - Forks: 114

apache/incubator-wayang

Apache Wayang (incubating) is the first cross-platform data processing system.

Language: Java - Size: 19.2 MB - Last synced at: 6 days ago - Pushed at: 20 days ago - Stars: 224 - Forks: 98

ictchenbo/SmartETL

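SmartETL: a simple, flexible, configurable, out-of-the-box Python ETL framework with domain-specific features that avoids reinventing the wheel! Provides processing pipelines for a variety of open data sources such as Wikidata / Wikipedia / GDELT; supports file formats such as txt/json/csv/excel and databases such as MySQL/PostgreSQL/MongoDB/ClickHouse/ElasticSearch as input and output; provides a range of processing operators, including large models and Web APIs.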

Language: Python - Size: 4.78 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 19 - Forks: 3

zazuko/barnard59

An intuitive and flexible RDF pipeline solution designed to simplify and automate ETL processes for efficient data management.

Language: JavaScript - Size: 3.66 MB - Last synced at: 10 days ago - Pushed at: 3 months ago - Stars: 36 - Forks: 2

Smart-Shaped/chaM3Leon

By Smart Shaped s.r.l. (https://www.smartshaped.com/)

Language: Java - Size: 899 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 28 - Forks: 2

Solrikk/MagicXML-url

🧙‍♂️ MagicXML is a FastAPI-based service designed to fetch, process, and convert XML data into structured CSV files. It is optimized for handling large XML files by processing them in chunks asynchronously, making it suitable for heavy data processing tasks.

Language: Python - Size: 6.21 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 4 - Forks: 0

tathithienthanh/WomenFashionProductRecommendationSystem

Build a recommendation system for women's fashion products on e-commerce platforms

Language: Jupyter Notebook - Size: 66 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

asyml/texar

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

Language: Python - Size: 13.6 MB - Last synced at: 9 days ago - Pushed at: almost 4 years ago - Stars: 2,387 - Forks: 371

Joanna20Carrion/Generador-De-Oficios

Flask web application that generates personalized official letters (oficios) in Word from a template, using recipient data stored in an Excel business directory.

Language: HTML - Size: 37.1 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 1 - Forks: 0

getstrm/pace

Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you to programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain ol' Postgres, even!) with definitions imported from Collibra, Datahub, ODD and the like.

Language: Kotlin - Size: 13.1 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 36 - Forks: 1

rs2lab/fatigueset-data-preprocessing

This repository contains the code used to preprocess the FatigueSet dataset. It was produced as part of the research project "Early Fatigue Detection using Wearables as a means for the Reduction of Critical Accidents in the Mining Process".

Language: Python - Size: 12.7 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

ChanMeng666/douban-review-scraper

【One star = One happy developer doing a little dance 💃⭐️】A robust Python scraper for collecting and analyzing movie reviews from Douban.com, featuring comprehensive data processing and analysis capabilities.

Language: Python - Size: 105 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

malika-n/Data-Science-Cryptocurrencies-Data-Analysis-Forecasting

Analyze and forecast prices of 10 cryptocurrencies using historical data from Binance API. Visualize predictions with a Flask web interface. 🐙📊

Language: Python - Size: 7.56 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

flow-php/etl-adapter-csv

PHP ETL Adapter: CSV

Language: PHP - Size: 1.39 MB - Last synced at: 12 days ago - Pushed at: 16 days ago - Stars: 5 - Forks: 2

Keerthi-rithi/Bike-Sales-Analysis-Project

A comprehensive data analytics project focused on bike sales performance across different countries, months, and product categories. The goal is to provide valuable business insights through visual dashboards and data exploration.

Size: 2.75 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

speedcell4/torchglyph

Data Processor Combinators for Natural Language Processing

Language: Python - Size: 549 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 7 - Forks: 1

dandr94/tender-project-backend

A Django management script that automates the parsing and importing of contract award data from XML files from TED Europa

Language: Python - Size: 949 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

brunocampos01/data-engineering

Language: Python - Size: 165 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 11 - Forks: 2

Menziess/slipstream-async

Slipstream provides a data-flow model to simplify development of stateful streaming applications.

Language: Python - Size: 628 KB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 38 - Forks: 1

zeynepcol/Data-Science-Cryptocurrencies-Data-Analysis-Forecasting

Cryptocurrency price analysis and prediction using regression models

Language: Python - Size: 8.88 MB - Last synced at: 7 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

dd-hebert/uv_pro

Command line tool for parsing and processing UV-Vis data from the Agilent 845x Chemstation software.

Language: Python - Size: 5.05 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 1

josephmachado/online_store

End to end data engineering project

Language: Python - Size: 1.53 MB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 57 - Forks: 18

flow-php/doctrine-dbal-bulk

Doctrine DBAL Bulk Operations for selected database engines

Language: PHP - Size: 349 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 11 - Forks: 5

flow-php/etl-adapter-json

PHP ETL Adapter: JSON

Language: PHP - Size: 1.72 MB - Last synced at: 12 days ago - Pushed at: 18 days ago - Stars: 6 - Forks: 3
