Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: ingestion-pipeline
vivek-bombatkar/Graph-Datastructure-for-Movielens-dataset
Language: Jupyter Notebook - Size: 726 KB - Last synced: 22 days ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0
bruin-data/ingestr
ingestr is a CLI tool that copies data between databases seamlessly with a single command.
Language: Python - Size: 721 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 2,315 - Forks: 47
ds-rafaelfelippe/DataIngestionPython
Mini project developed for the Non-Relational Databases course of the graduate program in Data Science and Machine Learning at PUC Campinas.
Language: Jupyter Notebook - Size: 1.06 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0
opensemanticsearch/open-semantic-etl
Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database.
Language: Python - Size: 615 KB - Last synced: 7 months ago - Pushed: over 1 year ago - Stars: 241 - Forks: 68
tmcgrath/cassandra-ingest
DataStax or Cassandra Ingest from Relational Databases with StreamSets
Language: PLSQL - Size: 12.3 MB - Last synced: 9 months ago - Pushed: about 5 years ago - Stars: 4 - Forks: 13
Morphl-AI/MorphL-Model-User-Search-Intent
Google Cloud Storage connector, pre-processor and model for predicting user search intent based on keywords
Language: Python - Size: 70.3 KB - Last synced: 3 months ago - Pushed: over 4 years ago - Stars: 21 - Forks: 4
mbsuraj/postgresql_ingestion_script
Ingest data of any format into a PostgreSQL database
Language: Python - Size: 15.6 KB - Last synced: 9 months ago - Pushed: over 2 years ago - Stars: 1 - Forks: 1
KnudsenMorten/AzLogDcrIngestPS
AzLogDcrIngestPS - Unleashing the power of Log Ingestion API with Azure LogAnalytics custom table v2, Azure Data Collection Rules and Azure Data Ingestion Pipeline
Language: PowerShell - Size: 22.9 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 17 - Forks: 0
Charanaicore/multinational-retail-data-centralisation
The multinational retail data centralisation project is a data warehousing project that focuses on ingesting data from disparate sources to create a centralised warehouse.
Language: Python - Size: 20.5 KB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0
rohitshubham/Cloud-pipeline
A real-life end-to-end cloud sub-system scenario
Language: Python - Size: 294 KB - Last synced: 10 months ago - Pushed: almost 4 years ago - Stars: 1 - Forks: 0
akshaybahadur21/Emancipitaion-of-Apache-Spark
My experiments with Apache Spark for Humans ⭐
Language: Java - Size: 12.6 MB - Last synced: about 1 month ago - Pushed: about 1 year ago - Stars: 5 - Forks: 6
alanzhaonys/workmail-intercepter-excel-to-csv
Transform incoming AWS WorkMail email with Excel attachment to CSV and save to S3 bucket
Language: Python - Size: 367 KB - Last synced: 5 days ago - Pushed: 12 months ago - Stars: 0 - Forks: 0
azuregig/work_with_OrdnanceSurvey_data
Sample Azure Data Factory pipeline for ingesting Data Packages directly from the Download API of the Ordnance Survey Data Hub into Azure Storage.
Size: 2.21 MB - Last synced: 10 months ago - Pushed: almost 2 years ago - Stars: 3 - Forks: 3
amyth-singh/multinational-retail-data-centralisation
The multinational retail data centralisation project is a data warehousing project that focuses on ingesting data from disparate sources to create a centralised warehouse.
Language: Python - Size: 981 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
Ludovik99/Analysis-of-Gas-Stations-with-Apache-Spark
Simulating a consultancy project for Repsol, the repository contains both the code notebook and the analysis.
Language: Jupyter Notebook - Size: 13 MB - Last synced: 11 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
Morphl-AI/MorphL-Model-Publishers-Churning-Users
Google Analytics connector, pre-processor and model for predicting churning users for digital publishers.
Language: Python - Size: 212 KB - Last synced: 3 months ago - Pushed: about 5 years ago - Stars: 10 - Forks: 6
siddharth271101/Stock-Exchange-Analysis
A data pipeline that uses Sqoop to ingest data from SQL Server into a Hive table, with Hive used for feature engineering and analysis.
Language: Shell - Size: 14.5 MB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 2 - Forks: 1
fnldesign/crypthobot-ingestion
An automated cryptocurrency bot
Size: 14.6 KB - Last synced: over 1 year ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0
Sqooba/mssql-to-avro-with-spark
Apache Spark example that reads from MSSQL and converts the data to Avro format.
Language: Java - Size: 9.77 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0