Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: ingestion-pipeline

vivek-bombatkar/Graph-Datastructure-for-Movielens-dataset

Language: Jupyter Notebook - Size: 726 KB - Last synced: 22 days ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

bruin-data/ingestr

ingestr is a CLI tool that seamlessly copies data between any databases with a single command.

Language: Python - Size: 721 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 2,315 - Forks: 47

ds-rafaelfelippe/DataIngestionPython

Mini project developed for the Non-Relational Databases course of the graduate program in Data Science and Machine Learning at PUC Campinas.

Language: Jupyter Notebook - Size: 1.06 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

opensemanticsearch/open-semantic-etl

Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database

Language: Python - Size: 615 KB - Last synced: 7 months ago - Pushed: over 1 year ago - Stars: 241 - Forks: 68

tmcgrath/cassandra-ingest

DataStax or Cassandra Ingest from Relational Databases with StreamSets

Language: PLSQL - Size: 12.3 MB - Last synced: 9 months ago - Pushed: about 5 years ago - Stars: 4 - Forks: 13

Morphl-AI/MorphL-Model-User-Search-Intent

Google Cloud Storage connector, pre-processor and model for predicting user search intent based on keywords

Language: Python - Size: 70.3 KB - Last synced: 3 months ago - Pushed: over 4 years ago - Stars: 21 - Forks: 4

mbsuraj/postgresql_ingestion_script

Ingest data in any format into a PostgreSQL database

Language: Python - Size: 15.6 KB - Last synced: 9 months ago - Pushed: over 2 years ago - Stars: 1 - Forks: 1

KnudsenMorten/AzLogDcrIngestPS

AzLogDcrIngestPS - Unleashing the power of Log Ingestion API with Azure LogAnalytics custom table v2, Azure Data Collection Rules and Azure Data Ingestion Pipeline

Language: PowerShell - Size: 22.9 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 17 - Forks: 0

Charanaicore/multinational-retail-data-centralisation

The multinational retail data centralisation project is a data warehousing project that focuses on ingesting data from disparate sources to create a centralised warehouse

Language: Python - Size: 20.5 KB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0

rohitshubham/Cloud-pipeline

A real-life end-to-end cloud sub-system scenario

Language: Python - Size: 294 KB - Last synced: 10 months ago - Pushed: almost 4 years ago - Stars: 1 - Forks: 0

akshaybahadur21/Emancipitaion-of-Apache-Spark

My experiments with Apache Spark for Humans ⭐

Language: Java - Size: 12.6 MB - Last synced: about 1 month ago - Pushed: about 1 year ago - Stars: 5 - Forks: 6

alanzhaonys/workmail-intercepter-excel-to-csv

Transform incoming AWS WorkMail email with Excel attachment to CSV and save to S3 bucket

Language: Python - Size: 367 KB - Last synced: 5 days ago - Pushed: 12 months ago - Stars: 0 - Forks: 0

azuregig/work_with_OrdnanceSurvey_data

Sample Azure Data Factory pipeline for ingesting Data Packages directly from the Download API of the Ordnance Survey Data Hub into Azure Storage.

Size: 2.21 MB - Last synced: 10 months ago - Pushed: almost 2 years ago - Stars: 3 - Forks: 3

amyth-singh/multinational-retail-data-centralisation

The multinational retail data centralisation project is a data warehousing project that focuses on ingesting data from disparate sources to create a centralised warehouse

Language: Python - Size: 981 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

Ludovik99/Analysis-of-Gas-Stations-with-Apache-Spark

Simulating a consultancy project for Repsol, the repository contains both the code notebook and the analysis.

Language: Jupyter Notebook - Size: 13 MB - Last synced: 11 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

Morphl-AI/MorphL-Model-Publishers-Churning-Users

Google Analytics connector, pre-processor and model for predicting churning users for digital publishers.

Language: Python - Size: 212 KB - Last synced: 3 months ago - Pushed: about 5 years ago - Stars: 10 - Forks: 6

siddharth271101/Stock-Exchange-Analysis

A data pipeline that uses Sqoop to ingest data from SQL Server into a Hive table, with Hive used for feature engineering and analysis.

Language: Shell - Size: 14.5 MB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 2 - Forks: 1

fnldesign/crypthobot-ingestion

An automated cryptocurrency bot

Size: 14.6 KB - Last synced: over 1 year ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

Sqooba/mssql-to-avro-with-spark

Apache Spark example that reads from MSSQL and converts the data to Avro format.

Language: Java - Size: 9.77 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0