GitHub topics: ingestion
cyclotruc/gitingest
Replace 'hub' with 'ingest' in any github url to get a prompt-friendly extract of a codebase
Language: Python - Size: 505 KB - Last synced at: about 5 hours ago - Pushed at: about 1 month ago - Stars: 8,890 - Forks: 698

giulianoc/CatraMMS
Media Management System: ingestion, processing, encoding, delivery, ...
Language: C++ - Size: 89.8 MB - Last synced at: about 17 hours ago - Pushed at: 1 day ago - Stars: 39 - Forks: 15

getlago/lago
Open Source Metering and Usage Based Billing API ⭐️ Consumption tracking, Subscription management, Pricing iterations, Payment orchestration & Revenue analytics
Language: Go - Size: 132 MB - Last synced at: 1 day ago - Pushed at: 5 days ago - Stars: 7,686 - Forks: 377

jitsucom/bulker
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL)
Language: Go - Size: 5.65 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 176 - Forks: 28

opensearch-project/data-prepper
OpenSearch Data Prepper is a component of the OpenSearch project that accepts, filters, transforms, enriches, and routes data at scale.
Language: Java - Size: 133 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 304 - Forks: 232

StarlightSearch/EmbedAnything
Production-ready Inference, Ingestion and Indexing built in Rust 🦀
Language: Rust - Size: 37.1 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 585 - Forks: 51

netboxlabs/diode
Diode data model and ingestion services for NetBox, from NetBox Labs
Language: Go - Size: 2.2 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 97 - Forks: 8

lcandy2/gitingest-extension
✨ An extension that helps you open gitingest to turn any git repository into a prompt-friendly text ingest for LLMs.
Language: TypeScript - Size: 111 KB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 148 - Forks: 13

NASA-PDS/nucleus
Nucleus is a software platform used to create workflows for the Planetary Data System (PDS).
Language: HCL - Size: 15.8 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

mrsimonemms/gobblr
Make your development databases gobble up known data
Language: Go - Size: 138 KB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 6 - Forks: 0

apache/gobblin
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
Language: Java - Size: 127 MB - Last synced at: 1 day ago - Pushed at: 12 days ago - Stars: 2,237 - Forks: 750

ryhkml/ytingest
Extract YouTube video, feed it to any LLM as knowledge
Language: C - Size: 191 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 1 - Forks: 0

netboxlabs/diode-netbox-plugin
Official NetBox Labs plugin for NetBox for Diode
Language: Python - Size: 443 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 46 - Forks: 11

akram0zaki/breach-ingestor
A resilient, prefix-sharded ingestion pipeline for large static breach dumps (e.g. AntiPublic), optimized for low-resource environments (e.g., Raspberry Pi + NAS/SSD).
Language: JavaScript - Size: 25.4 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

PerisN/Data-Ingestion-Pipeline
Python-based data pipeline that extracts CSV files from a ZIP archive, converts them to Parquet format, and ingests them into a PostgreSQL database. Ideal for automating ETL workflows with minimal configuration.
Language: Jupyter Notebook - Size: 104 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

datainsider-co/rocket-bi
A free, open-source, web-based self-service BI tool tailor-made for ClickHouse, Google BigQuery, MySQL, PostgreSQL, and Vertica
Language: TypeScript - Size: 69.5 MB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 111 - Forks: 31

alekLukanen/ChapterhouseDB-v1
Allows you to create simple data streaming warehouses written in Golang using Apache Parquet and Arrow.
Language: Go - Size: 189 KB - Last synced at: 8 days ago - Pushed at: 23 days ago - Stars: 1 - Forks: 0

ylem-co/ylem
Ylem is an open-source platform for real-time data streaming orchestration
Language: JavaScript - Size: 5.87 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 71 - Forks: 0

7-docs/7-docs
Use local files or a public GitHub repository as a source and ask questions about it through ChatGPT
Language: TypeScript - Size: 363 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 119 - Forks: 9

emcd/python-mimeogram
Exchange collections of files with Large Language Models.
Language: Python - Size: 520 KB - Last synced at: 24 days ago - Pushed at: 2 months ago - Stars: 3 - Forks: 0

Dicklesworthstone/automatic_log_collector_and_analyzer
Replace Splunk in your small company with this one weird trick!
Language: Python - Size: 824 KB - Last synced at: 10 days ago - Pushed at: 3 months ago - Stars: 407 - Forks: 37

jrcichra/ingestd
HTTP server that easily ingests data into a database
Language: Go - Size: 388 KB - Last synced at: 14 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

tagbase/tagbase-server
Tagbase is a data lifecycle management system for electronic timeseries sensor data. It supports different types of data and works with equipment from various manufacturers.
Language: Python - Size: 2.18 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 7 - Forks: 2

beebeeep/chafka
Real-time Kafka to ClickHouse ingestion service
Language: Rust - Size: 119 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 33 - Forks: 0

souzomain/logflow
LogFlow is an ETL (Extract, Transform, Load) application specialized in log processing
Language: Python - Size: 3.13 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

samber/go-quickwit
🍱 A Go ingestion client for Quickwit
Language: Go - Size: 26.4 KB - Last synced at: 8 days ago - Pushed at: 11 months ago - Stars: 3 - Forks: 2

EricZoop/vsingest
Transform any codebase or techstack in Visual Studio to prompt-friendly text for LLMs!
Language: JavaScript - Size: 59 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

nathadriele/spotify-data-pipeline
This project implements a full-stack data engineering solution that connects to the Spotify Web API to extract a user’s recently played tracks, stores the data in a PostgreSQL database, applies transformations using dbt, and delivers actionable insights via Metabase dashboards.
Language: Python - Size: 1.98 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

endernoke/linkedingest
Turn LinkedIn profiles into AI-friendly text ingests.
Language: JavaScript - Size: 354 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 5 - Forks: 3

FellowTraveler/ngest
Python script for ingesting various files into a semantic graph. For text, images, cpp, python, rust, javascript, and PDFs.
Language: Python - Size: 2.98 MB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 25 - Forks: 2

vertica/PSTL
Parallel Streaming Transformation Loader
Language: Java - Size: 106 MB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 9 - Forks: 6

jmfeck/bigquery-local-framework
This repo provides tools to manage BigQuery operations locally, simplifying tasks like uploading flat files, running SQL queries, and downloading tables. It offers a unified interface for local BigQuery interactions, enabling more efficient interaction with it.
Language: Python - Size: 44.9 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

abakermi/gitllm
A powerful GitHub repository analysis tool that helps you process and analyze repository content efficiently. Built with Next.js, Cloudflare Workers, and modern web technologies
Language: TypeScript - Size: 3.55 MB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

aymane-maghouti/Big-Data-Project
This project aims to predict smartphone prices using a combination of batch and stream processing techniques in a Big Data environment. The architecture follows the Lambda Architecture pattern, providing both real-time and batch processing capabilities to users.
Language: Python - Size: 960 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 2

jgperrin/net.jgp.labs.spark
Apache Spark examples exclusively in Java
Language: Java - Size: 1.75 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 101 - Forks: 49

xycloo/rs-ingest
Single and multi-threaded custom ingestion crate for Stellar Futurenet, written in Rust.
Language: Rust - Size: 101 KB - Last synced at: 17 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 1

Jaebum0505/subscription-tracker-api
Skip the basic CRUD—this Backend Crash Course is all about building a production-ready Subscription Management System with real users, real money, and real business logic. You'll learn JWT authentication, database modeling, API architecture, security, automated workflows, and much more!
Size: 1.95 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Clarifai/clarifai-python-datautils
Extract, transform, and load unstructured data into Clarifai's AI platform
Language: Python - Size: 1.02 MB - Last synced at: 14 days ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 0

rapidomize/rapidomize
Rapidly Access, Process, Analyze & Visualize Your Data
Size: 14.6 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 1

ahammadnafiz/RepoRAG
A fully interactive tool designed to streamline your GitHub repository prompt generation process and facilitate RAG (Retrieval-Augmented Generation) workflows
Language: Python - Size: 222 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

jgperrin/net.jgp.books.spark.ch09
Spark in Action, 2e - chapter 9 - Advanced ingestion: finding data sources and building your own
Language: Java - Size: 26.9 MB - Last synced at: 26 days ago - Pushed at: about 2 years ago - Stars: 18 - Forks: 14

akornatskyy/sample-etl-flink-java
The sample ingests multiline gzipped files of popular books into postgres.
Language: Java - Size: 61.5 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

BonnardValentin/nmemo-foundation
Nmemo Foundation is a minimal, domain-driven platform for ingesting and retrieving documents with PostgreSQL and a modular architecture. It’s designed for easy expansion to other data sources and advanced search features
Language: Python - Size: 22.5 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

Azure/azure-event-hubs-spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Language: Scala - Size: 19.6 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 235 - Forks: 178

garethcmurphy/SciCat-Data-Ingestion-with-TypeScript
This repository provides a TypeScript-based tool for importing and ingesting data into SciCat, the science data catalog used at the European Spallation Source (ESS). Features: Data Ingestion: automates data import into SciCat. TypeScript Implementation: ensures ty…
Language: TypeScript - Size: 15.6 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

vicentedpsantos/repo2text
repo2text is a command-line tool that converts the content of a Git repository into a structured text file. It extracts all committed files and outputs them in a format suitable for easy ingestion by AI tools like ChatGPT. Ideal for sharing or analyzing repository contents in AI-driven conversations. 🤖
Language: Rust - Size: 23.4 KB - Last synced at: 30 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

lopezj1/youtube_fishing
This project ingests YouTube video data related to fishing, stores it in MongoDB, and provides visualizations through Metabase for analysis.
Language: Python - Size: 3.91 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

averemee-si/oralog
Ingestion tool for various database logs
Language: Java - Size: 126 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 1

rodrigo85/dms_ingestion
This project demonstrates how to use Apache Airflow to orchestrate AWS Database Migration Service (DMS) tasks for data ingestion. The solution leverages the power of Airflow for workflow automation and AWS DMS for seamless data migration to Amazon S3.
Language: Python - Size: 881 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

AbsaOSS/hyperdrive
Extensible streaming ingestion pipeline on top of Apache Spark
Language: Scala - Size: 1.64 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 44 - Forks: 13

Scody0/SQL-Injection-Training-Site
SQL Injection Training Site
Language: HTML - Size: 15.6 KB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

zezs/Langchain-Docs---AI-Chat-Assistant
This repository is dedicated to learning LangChain by creating a generative AI application. This web application uses Pinecone as a vector store to answer questions related to LangChain, utilizing sources from the official LangChain documentation.
Language: Python - Size: 1.21 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

lmolas/http-ingestor
Go implementation for handling huge amounts of http uploads
Language: Go - Size: 5.86 KB - Last synced at: over 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

rachita27/AUTOMATING
Automating ingestion of Excel files into Azure Data Studio (SQL Server)
Language: Jupyter Notebook - Size: 13.2 MB - Last synced at: 28 days ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

kharigardner/pyfivetran
Simple python interface for the Fivetran API. Powered by HTTPx.
Language: Python - Size: 139 KB - Last synced at: 1 day ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

ClarityNLP/ingest-api
Ingest data into Solr from a variety of sources
Language: JavaScript - Size: 1.69 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ClarityNLP/ingest-client
React client for Solr ingest
Language: JavaScript - Size: 1.87 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

postlang/posthog-llm-examples
Upload data to PostHog-LLM
Size: 3.91 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

apivideo/ingest.new
A simple demo application for uploading, ingesting, embedding videos and converting them to mp4s. From api.video (https://api.video)
Language: JavaScript - Size: 739 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 0

CocoaPriest/AssistAI
macOS app to chat with your local documents
Language: Swift - Size: 164 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

streamsqldb/streamsql-js
The JavaScript ingestion API for streamsql.
Language: TypeScript - Size: 59.6 KB - Last synced at: 6 days ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

marceloboeira/crowd
👥 [WIP] An experimental Highly Available Reverse Proxy for Massive Asynchronous Message Consumption
Language: Go - Size: 559 KB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 6 - Forks: 1

se02035/azure-eventhub-ingestor
This repo contains samples of a single-process, high-performance Event Hub ingestor (using C#)
Language: C# - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

CocoaPriest/bubbleai
FastAPI server to process client requests for both ingestion and inference
Language: Python - Size: 48.8 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Azure/Azure-AppServices-Diagnostics-KustoIngestor
Azure App Service Diagnostics Kusto Ingestor gives developers the ability to write custom logic for aggregating and ingesting Kusto logs in ways that may not be possible within a single query. Supported ingestion mechanisms are ingest from query and ingest from DataTable.
Language: C# - Size: 39.1 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 3

abideenml/RealTime-StarRatingPrediction-with-AWSKinesis
This repository contains an End to End Real time 🕰️ Machine Learning Pipeline to predict star ⭐️ rating of product reviews. This project uses AWS Sagemaker, Kinesis, Lambda, S3, Redshift, Athena, and Step functions. Deployment of multiple models for AB testing and Bandit testing is also included.
Language: Jupyter Notebook - Size: 15.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

tweag/lagoon
Data centralization tool
Language: Haskell - Size: 341 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 35 - Forks: 1

mahmudie/data_engineering_projects
List of my data engineering projects
Language: Jupyter Notebook - Size: 384 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

biocaddie/foundry-nlp-enhancer
NLP enhancer plugin for Foundry-ES pipeline management system. The service that enhances elasticsearch functionality with NLP elements.
Language: Java - Size: 156 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

coxwave/impaction-ai-sdk-python
Server-side impaction.ai Data Ingestion SDK for Python
Language: Python - Size: 78.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

coxwave/impaction-ai-sdk-node
Server-side impaction.ai Data Ingestion SDK for Node.js
Language: TypeScript - Size: 22.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

dativebase/dailp-ingest-clj
DAILP Ingest (of Cherokee language data from Google Sheets)
Language: Clojure - Size: 197 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 0

jacobmarks/twilio-automation-plugin
Automate data ingestion into FiftyOne with Twilio
Language: Python - Size: 16.6 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

3amory99/Building-Sales-Data-Mart-Using-ETL-SSIS
A Sales Data Mart built from the AdventureWorks2022 dataset using SQL Server Integration Services (SSIS) and SQL Server modeling. The data mart offers several benefits, making it a valuable component for data management and analytics wi…
Language: TSQL - Size: 1.87 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

rconjoe/etl.ts (fork of smartive/proc-that)
smartive/proc-that forked to play with
Size: 406 KB - Last synced at: about 13 hours ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

mbsuraj/postgresql_ingestion_script
Ingest data in any format into a PostgreSQL database
Language: Python - Size: 15.6 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

Grokery/grokerylab
A data pipeline management platform
Language: JavaScript - Size: 32.3 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

ldaniels528/transgress
A distributed processing/orchestration server and ETL for NodeJS
Language: Scala - Size: 685 KB - Last synced at: almost 2 years ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

seb7887/janus
Data ingestion service written in Go
Language: Go - Size: 110 KB - Last synced at: 12 months ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

c-drault/ingest-csv-for-elasticsearch 📦
Lab n°2 of "Applications of Big-Data" @ Efrei Paris
Language: Jupyter Notebook - Size: 5.86 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

woodRock/psychic-invention
NZODN Data Ingestion Project
Language: Shell - Size: 5.34 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

crosslibs/incremental-ingestion-using-airflow
Periodically ingest incremental updates (inserts / deletes) into BigQuery using Cloud Composer / Airflow orchestration workflow
Language: Python - Size: 7.81 KB - Last synced at: 10 months ago - Pushed at: over 5 years ago - Stars: 9 - Forks: 1

gpism/OpenDataCore
Welcome to the fascinating intersection of Web3, Artificial Intelligence (AI), Open Data Core (ODC), and Composable Enterprise Fabric - a nexus of modern technologies that are significantly reshaping the enterprise landscape
Language: Java - Size: 14.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

phphoebe/Python-Data-Analysis-with-NumPy-and-Pandas
NumPy & Pandas for data science, data analysis & business intelligence, with practical, hands-on Python projects
Language: Jupyter Notebook - Size: 54.2 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

azuregig/work_with_OrdnanceSurvey_data
Sample Azure Data Factory pipeline for ingesting Data Packages directly from the Download API of the Ordnance Survey Data Hub into Azure Storage.
Size: 2.21 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 3

luccayz/dataengineer_project_001
Download files from the web with Python. Insert data from a dataframe into the Azure cloud with Azure SQL Database. Transform the data with Azure Data Factory.
Language: Jupyter Notebook - Size: 11.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

helenamin/deb-finalProject-group3
End-To-End-Solution-DataEngineering-FinalProject
Language: Python - Size: 2.87 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 3

phdata/pipeforge 📦
Language: Scala - Size: 12.9 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 6 - Forks: 6

sorcero/ingestum
Read-only mirror of https://gitlab.com/sorcero/community/ingestum
Language: Python - Size: 2.54 MB - Last synced at: 26 days ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0

snollygolly/borrow-bot
💰 A bot for maximizing the borrow subreddit
Language: JavaScript - Size: 860 KB - Last synced at: about 1 month ago - Pushed at: over 8 years ago - Stars: 27 - Forks: 0

italia/daf-replicate-ingestion 📦
Microservice to ingest data from Replicate and push it into DAF. Warning: this repo is deprecated.
Language: Java - Size: 151 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 6

zalando-zmon/zmon-data-service 📦
Receiving end of new worker to push data across DC boundaries
Language: Java - Size: 555 KB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 6 - Forks: 2

timxor/bitcoind-data-ingestion
crypto payments bitcoind data ingestion
Language: JavaScript - Size: 90.8 KB - Last synced at: 17 days ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

san089/Yelp_Project
This project creates a data lake for the Yelp dataset, uses it to build an analytical sandbox for data science, and creates a data warehouse for reporting.
Language: Jupyter Notebook - Size: 351 KB - Last synced at: 3 months ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 2

Cigna/ibis
IBIS is a workflow creation-engine that abstracts the Hadoop internals of ingesting RDBMS data.
Language: Python - Size: 749 KB - Last synced at: 6 months ago - Pushed at: about 3 years ago - Stars: 51 - Forks: 15

padogrid/bundle-hazelcast-3n4n5-app-pado_dbsched-perf_test_dbsched-docker-mysql
The dbsched bundle is preconfigured with the Pado scheduler to periodically execute jobs that dump database tables to CSV files from which it automatically extracts column information to generate the corresponding VersionedPortable classes. It then transforms the CSV records to objects using the generated classes before ingesting them into Hazelcast.
Language: Shell - Size: 744 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Kinjuriu/python-ingestion
Data Ingestion, reading files, working with databases, troubleshooting data, calling APIs and schemas
Language: Julia - Size: 18.2 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

magengit/magen-in
Ingestion Server for Magen Data Leak Prevention Software
Language: Python - Size: 537 KB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 1

projectkeas/ingestion
The core ingestion API for KEAS
Language: Go - Size: 70.3 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Soumyadeep-github/Data-Ingestion
The aim of this project is to automate data ingestion from flat files such as CSV and GZIP-compressed files into a database such as Postgres. The entire setup is automated with Docker and runs quickly because it uses multiprocessing.
Language: Python - Size: 33.1 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 2
