An open API service providing repository metadata for many open source software ecosystems.

Topic: "data-integration"

apache/airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Language: Python - Size: 374 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 39,930 - Forks: 14,972

Avaiga/taipy

Turns Data and AI algorithms into production-ready web applications in no time.

Language: Python - Size: 150 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 18,048 - Forks: 1,881

airbytehq/airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

Language: Python - Size: 666 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 18,044 - Forks: 4,503

dagster-io/dagster

An orchestration platform for the development, production, and observation of data assets.

Language: Python - Size: 1.26 GB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 13,081 - Forks: 1,668

apache/seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

Language: Java - Size: 42.4 MB - Last synced at: about 1 hour ago - Pushed at: about 2 hours ago - Stars: 8,478 - Forks: 1,966

mage-ai/mage-ai

🧙 Build, run, and manage data pipelines for integrating and transforming data.

Language: Python - Size: 233 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 8,300 - Forks: 841

cloudquery/cloudquery

The developer-first cloud governance platform

Language: Go - Size: 172 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 6,083 - Forks: 528

apache/flink-cdc

Flink CDC is a streaming data integration tool

Language: Java - Size: 40.9 MB - Last synced at: 2 days ago - Pushed at: 12 days ago - Stars: 6,051 - Forks: 2,010

apache/hudi

Upserts, Deletes And Incremental Processing on Big Data.

Language: Java - Size: 1.74 GB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 5,756 - Forks: 2,396

infinyon/fluvio

🦀 event stream processing for developers to stream and process data in motion to power responsive data intensive applications.

Language: Rust - Size: 34.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 4,916 - Forks: 514

jitsucom/jitsu

Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.

Language: TypeScript - Size: 43 MB - Last synced at: about 7 hours ago - Pushed at: 5 days ago - Stars: 4,292 - Forks: 312

rudderlabs/rudder-server

Privacy and Security focused Segment-alternative, in Golang and React

Language: Go - Size: 308 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 4,176 - Forks: 330

DTStack/chunjun

A data integration framework

Language: Java - Size: 126 MB - Last synced at: 13 days ago - Pushed at: 2 months ago - Stars: 4,046 - Forks: 1,699

seandavi/awesome-single-cell

Community-curated list of software packages and data resources for single-cell, including RNA-seq, ATAC-seq, etc.

Size: 1.43 MB - Last synced at: 12 days ago - Pushed at: 20 days ago - Stars: 3,358 - Forks: 1,016

bruin-data/ingestr

ingestr is a CLI tool to copy data between any databases with a single command seamlessly.

Language: Python - Size: 167 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2,951 - Forks: 79

apache/incubator-devlake

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.

Language: Go - Size: 38.3 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 2,721 - Forks: 577

mara/mara-pipelines

A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow

Language: Python - Size: 3.29 MB - Last synced at: 28 days ago - Pushed at: over 1 year ago - Stars: 2,078 - Forks: 100

bytedance/bitsail

BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.

Language: Java - Size: 26.4 MB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 1,654 - Forks: 331

apache/hop

Hop Orchestration Platform

Language: Java - Size: 197 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,130 - Forks: 367

kuwala-io/kuwala

Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data for data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demographics data b) Points of Interest from OpenStreetMap c) Google Popular Times

Language: JavaScript - Size: 7.79 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 792 - Forks: 54

apache/seatunnel-web

SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).

Language: Java - Size: 17.4 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 677 - Forks: 302

artie-labs/transfer

Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift, Databricks) in real-time.

Language: Go - Size: 3.85 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 651 - Forks: 33

immunogenomics/harmony

Fast, sensitive and accurate integration of single-cell data with Harmony

Language: R - Size: 52.9 MB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 568 - Forks: 102

leesf/hudi-resources

A collection of Apache Hudi-related resources

Size: 23.7 MB - Last synced at: about 1 hour ago - Pushed at: about 3 hours ago - Stars: 551 - Forks: 161

saeyslab/nichenetr

NicheNet: predict active ligand-target links between interacting cells

Language: R - Size: 152 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 534 - Forks: 124

ConduitIO/conduit

Conduit streams data between data stores. Kafka Connect replacement. No JVM required.

Language: Go - Size: 12.6 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 490 - Forks: 51

theislab/scarches

Reference mapping for single-cell genomics

Language: Jupyter Notebook - Size: 825 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 347 - Forks: 52

gabledata/recap

Work with your web service, database, and streaming schemas in a single format.

Language: Python - Size: 1.42 MB - Last synced at: 14 days ago - Pushed at: 16 days ago - Stars: 344 - Forks: 26

CategoricalData/CQL

Categorical Query Language IDE

Language: Java - Size: 145 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 299 - Forks: 22

cuebook/cuelake

Use SQL to build ELT pipelines on a data lakehouse.

Language: JavaScript - Size: 28 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 285 - Forks: 28

hetio/hetionet

Hetionet: an integrative network of disease

Language: HTML - Size: 380 MB - Last synced at: 28 days ago - Pushed at: about 2 years ago - Stars: 282 - Forks: 69

pracdata/awesome-open-source-data-engineering

A curated list of open source tools used in analytics platforms and the data engineering ecosystem

Size: 219 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 274 - Forks: 29

CommonCoreOntology/CommonCoreOntologies

The Common Core Ontology Repository holds the current released version of the Common Core Ontology suite.

Language: Makefile - Size: 16.5 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 230 - Forks: 61

dataplane-app/dataplane

Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capability to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a React front end.

Language: JavaScript - Size: 281 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 226 - Forks: 33

slowkow/harmonypy

🎼 Integrate multiple high-dimensional datasets with fuzzy k-means and locally linear adjustments.

Language: Python - Size: 2.77 MB - Last synced at: 6 days ago - Pushed at: 10 months ago - Stars: 217 - Forks: 22

morph-kgc/morph-kgc

Powerful RDF Knowledge Graph Generation with RML Mappings

Language: Python - Size: 32.8 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 209 - Forks: 39

opensanctions/nomenklatura

Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources

Language: Python - Size: 2.54 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 209 - Forks: 38

mara/mara-example-project-2

An example mini data warehouse for python project stats, template for new projects

Language: Python - Size: 24 MB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 178 - Forks: 39

ceumicrodata/mETL

mito ETL tool

Language: Python - Size: 7.43 MB - Last synced at: 4 days ago - Pushed at: almost 4 years ago - Stars: 163 - Forks: 41

mims-harvard/scikit-fusion

scikit-fusion: Data fusion via collective latent factor models

Language: Python - Size: 9.28 MB - Last synced at: 26 days ago - Pushed at: almost 2 years ago - Stars: 147 - Forks: 44

google/megalista 📦

First Party data integration solution built for marketing teams to enable audience and conversion onboarding into Google Marketing products (Google Ads, Campaign Manager, Google Analytics).

Language: Python - Size: 1.34 MB - Last synced at: 22 days ago - Pushed at: 4 months ago - Stars: 137 - Forks: 55

genular/pandora

PANDORA 💻

Language: Vue - Size: 16.4 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 135 - Forks: 21

SDM-TIB/SDM-RDFizer

An Efficient RML-Compliant Engine for Knowledge Graph Construction

Language: Python - Size: 21 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 118 - Forks: 25

starlake-ai/starlake

Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.

Language: Scala - Size: 170 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 107 - Forks: 23

olehmberg/winter

WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result evaluation.

Language: Java - Size: 18.6 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 105 - Forks: 32

thedataengineeringbook/thedataengineeringbook

The Data Engineering Book - a data engineering book by Thai people, for Thai people

Language: JavaScript - Size: 1.54 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 103 - Forks: 43

Teichlab/cellhint

A tool for semi-automatic cell type harmonization and integration

Language: Python - Size: 6.78 MB - Last synced at: 28 days ago - Pushed at: about 2 months ago - Stars: 102 - Forks: 14

runprism/prism

Prism is the easiest way to develop, orchestrate, and execute data pipelines in Python.

Language: Python - Size: 2.42 MB - Last synced at: 14 days ago - Pushed at: 6 months ago - Stars: 85 - Forks: 2

SysBioChalmers/GECKO

Toolbox for including enzyme constraints on a genome-scale model.

Language: MATLAB - Size: 107 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 70 - Forks: 52

saezlab/cosmosR

COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets.

Language: R - Size: 53.2 MB - Last synced at: 23 days ago - Pushed at: 2 months ago - Stars: 60 - Forks: 16

munchy-bytes/SchemaMapper

A .NET class library that allows you to import data from different sources into a unified destination

Language: C# - Size: 5.9 MB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 60 - Forks: 16

jupyter-naas/drivers

Low-code Python library enabling access to APIs, tools, data sources in seconds.

Language: Python - Size: 1.53 MB - Last synced at: 28 days ago - Pushed at: 10 months ago - Stars: 59 - Forks: 12

siyul-park/uniflow

A high-performance, extremely flexible, and easily extensible universal workflow engine.

Language: Go - Size: 2.94 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 51 - Forks: 5

CogStack/CogStack-NiFi

Building data processing pipelines for document processing with NLP using Apache NiFi and related services

Language: Python - Size: 74.4 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 50 - Forks: 20

DP6/marketing-data-sync (fork of google/megalista)

First Party data integration solution built for marketing teams to enable audience and conversion onboarding into Google Marketing products and Facebook Ads.

Language: Python - Size: 959 KB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 49 - Forks: 6

linkml/linkml-model

Link Modeling Language (LinkML) model

Language: Python - Size: 12.7 MB - Last synced at: 23 days ago - Pushed at: 24 days ago - Stars: 48 - Forks: 20

datasphere-oss/datasphere-integration

A data-centric integration platform

Language: Java - Size: 20.7 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 48 - Forks: 17

umer7/Data-Warehouse-Concepts-Design-and-Data-Integration

Repo for Data Warehouse Concepts, Design, and Data Integration by University of Colorado System (Coursera): notes, assignments, quizzes, and research papers

Size: 35 MB - Last synced at: 6 months ago - Pushed at: almost 7 years ago - Stars: 45 - Forks: 32

neuroforgede/nfcompose

Build REST APIs/Integrations in minutes instead of hours - NF Compose is a (data) integration platform that allows developers to define REST APIs in seconds instead of hours. Generated REST APIs are backed by postgres and support automatic consumer webhook notifications on data changes out of the box.

Language: Python - Size: 2.57 MB - Last synced at: about 6 hours ago - Pushed at: 24 days ago - Stars: 39 - Forks: 3

Azure/data-product-batch

Template to deploy a Data Product for Batch data processing into a Data Landing Zone of the Data Management & Analytics Scenario (former Enterprise-Scale Analytics). The Data Product template can be used by cross-functional teams to ingest, provide and create new data assets within the platform.

Language: Bicep - Size: 11.3 MB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 38 - Forks: 22

mara/mara-etl-tools

Utilities for creating ETL pipelines with mara

Language: PLpgSQL - Size: 54.7 KB - Last synced at: about 23 hours ago - Pushed at: almost 3 years ago - Stars: 36 - Forks: 4

Azure/data-product-streaming

Template to deploy a Data Product for data stream processing into a Data Landing Zone of the Data Management & Analytics Scenario (former Enterprise-Scale Analytics). The Data Product template can be used by cross-functional teams to ingest, provide and create new data assets within the platform.

Language: Bicep - Size: 12.1 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 35 - Forks: 12

AltschulerWu-Lab/MUSE

MUSE is a deep learning approach characterizing tissue composition through combined analysis of morphologies and transcriptional states for spatially resolved transcriptomics data.

Language: Jupyter Notebook - Size: 153 MB - Last synced at: 19 days ago - Pushed at: about 3 years ago - Stars: 34 - Forks: 8

selbouhaddani/OmicsPLS

R package for High dimensional data analysis and integration with O2PLS!

Language: HTML - Size: 31.4 MB - Last synced at: 28 days ago - Pushed at: about 1 month ago - Stars: 32 - Forks: 8

DerwenAI/ERKG

Demonstrate integration of Senzing and Neo4j to construct an Entity Resolved Knowledge Graph

Size: 13.9 MB - Last synced at: 9 days ago - Pushed at: 9 months ago - Stars: 32 - Forks: 6

JonnyTran/OpenOmics

A bioinformatics API to interface with public multi-omics bio databases for wicked fast data integration.

Language: Python - Size: 68.5 MB - Last synced at: 3 days ago - Pushed at: 10 months ago - Stars: 32 - Forks: 11

oeg-upm/mapeathor

Translator of spreadsheet mappings into R2RML, RML or YARRRML

Language: Python - Size: 58.8 MB - Last synced at: 30 days ago - Pushed at: 12 months ago - Stars: 32 - Forks: 10

dhimmel/integrate

Scripts and resources to create Hetionet v1.0, a heterogeneous network for drug repurposing

Language: Jupyter Notebook - Size: 565 MB - Last synced at: 29 days ago - Pushed at: over 7 years ago - Stars: 32 - Forks: 17

zazuko/barnard59

An intuitive and flexible RDF pipeline solution designed to simplify and automate ETL processes for efficient data management.

Language: JavaScript - Size: 3.66 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 30 - Forks: 2

linkedin/data-integration-library

The Data Integration Library project provides a library of generic components based on a multi-stage architecture for data ingress and egress.

Language: Java - Size: 1.51 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 30 - Forks: 14

YangLabHKUST/Portal

Adversarial domain translation networks for integrating large-scale atlas-level single-cell datasets

Language: Python - Size: 119 KB - Last synced at: 22 days ago - Pushed at: almost 2 years ago - Stars: 30 - Forks: 6

DTUComputeStatisticsAndDataAnalysis/MBPLS

(Multiblock) Partial Least Squares Regression for Python

Language: Python - Size: 16.6 MB - Last synced at: 8 days ago - Pushed at: over 5 years ago - Stars: 30 - Forks: 7

thymeflow/thymeflow

Installer for Thymeflow, a personal knowledge management system.

Size: 20.5 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 30 - Forks: 5

raamana/pyradigm

Research data management in biomedical and machine learning applications

Language: Python - Size: 7.25 MB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 29 - Forks: 12

cthoyt/doctoral-thesis

📖 Generation and Applications of Knowledge Graphs in Systems and Networks Biology

Language: TeX - Size: 68.6 MB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 29 - Forks: 2

ginkgobioworks/geckopy

Enzyme-constrained genome-scale models in Python

Language: Python - Size: 4.84 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 26 - Forks: 7

dosorio/rPanglaoDB

An R package to download and merge labeled single-cell RNA-seq data from the PanglaoDB database into a Seurat object.

Language: HTML - Size: 2.24 MB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 26 - Forks: 3

cloudquery/plugin-sdk

CloudQuery Go SDK for source and destination plugins

Language: Go - Size: 18.2 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 25 - Forks: 25

glasgowcompbio/pyMultiOmics

Python toolbox for multi-omics data mapping and analysis

Language: Jupyter Notebook - Size: 45.9 MB - Last synced at: 22 days ago - Pushed at: about 2 years ago - Stars: 24 - Forks: 5

davidfoerster/schema-matching

Match schema attributes of relational databases by value similarity. As a study assignment, this isn't well documented, but you can contact me for questions and I may even add docs, if I sense enough interest.

Language: Python - Size: 271 KB - Last synced at: 17 days ago - Pushed at: over 5 years ago - Stars: 24 - Forks: 8

JinmiaoChenLab/FastIntegration

FastIntegrate integrates thousands of scRNA-seq datasets and outputs batch-corrected values for downstream analysis

Language: R - Size: 2.37 MB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 23 - Forks: 4

shuxiaoc/mario-py

MARIO: single-cell proteomic data matching and integration using both shared and distinct features

Language: Jupyter Notebook - Size: 660 MB - Last synced at: 24 days ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 2

abcsys/libem

Compound AI toolchain for fast and accurate entity matching, powered by LLMs.

Language: Python - Size: 3.54 MB - Last synced at: about 13 hours ago - Pushed at: about 2 months ago - Stars: 22 - Forks: 4

yezhengSTAT/ADTnorm

ADTnorm normalizes the cell surface protein measurements of CITE-seq data, facilitating data integration across batches and across studies.

Language: R - Size: 48.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 22 - Forks: 5

bio2bel/bio2bel

A Python framework for integrating biological databases and structured data sources in Biological Expression Language (BEL)

Language: Python - Size: 417 KB - Last synced at: about 15 hours ago - Pushed at: over 3 years ago - Stars: 21 - Forks: 5

JohnnyBravo75/DataBridge.NET

Configurable data bridge for permanent ETL jobs

Language: C# - Size: 11.1 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 20 - Forks: 10

Amine-Smahi/R-Learning-Journey

Some of the projects I made when starting to learn R for Data Science at the university

Language: R - Size: 63.5 KB - Last synced at: about 1 month ago - Pushed at: almost 6 years ago - Stars: 20 - Forks: 0

CloudFormations/CF.Cumulus

A cloud data platform product to accelerate time to insights. Our open-source framework is designed for the real world. Stripping away the complexity, giving you the power to build, scale, and manage your dataflows with ease, accelerating data delivery.

Language: TSQL - Size: 10.5 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 19 - Forks: 11

caokai1073/Pamona

The software of Pamona, a partial manifold alignment algorithm.

Language: Jupyter Notebook - Size: 41.4 MB - Last synced at: 26 days ago - Pushed at: about 4 years ago - Stars: 19 - Forks: 3

NPLinker/nplinker

A Python framework for microbial natural products data mining by integrating genomics and metabolomics data

Language: Python - Size: 116 MB - Last synced at: 20 days ago - Pushed at: 27 days ago - Stars: 18 - Forks: 13

oeg-upm/gtfs-bench

GTFS-Madrid-Bench: A Benchmark for Knowledge Graph Construction Engines

Language: Python - Size: 197 MB - Last synced at: 14 minutes ago - Pushed at: about 2 months ago - Stars: 18 - Forks: 13

scify/jedai-ui

UI for JedAI Toolkit

Language: Java - Size: 1.09 MB - Last synced at: 29 days ago - Pushed at: almost 3 years ago - Stars: 17 - Forks: 5

cutterkom/remove-na-lgbtiq-queer-knowledge-graph

A knowledge graph on queer history

Language: R - Size: 9.45 MB - Last synced at: 14 days ago - Pushed at: 5 months ago - Stars: 16 - Forks: 1

alexkychen/assignPOP

Population Assignment using Genetic, Non-genetic or Integrated Data in a Machine-learning Framework. Methods in Ecology and Evolution. 2018;9:439–446.

Language: R - Size: 8.81 MB - Last synced at: 28 days ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 4

NYXFLOWER/GripNet

GripNet: Graph Information Propagation on Supergraph for Heterogeneous Graphs (PatternRecognit, 2023)

Language: Python - Size: 88 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 16 - Forks: 2

MeltanoLabs/Singer-Working-Group

Working group for ongoing development and iteration of the Singer Spec, the de-facto protocol for open source data connectors. Please use "Issues" to create discussion items - or use "Discussions" for general questions.

Size: 28.3 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 15 - Forks: 4

KarrLab/datanator

Toolkit for discovering and aggregating data for whole-cell modeling

Language: Python - Size: 73.9 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 14 - Forks: 4

michaelbironneau/analyst

A declarative, SQL-like DSL for data integration tasks.

Language: Go - Size: 4.05 MB - Last synced at: 11 months ago - Pushed at: almost 7 years ago - Stars: 14 - Forks: 2

cognitedata/python-extractor-utils

Framework for developing extractors in Python

Language: Python - Size: 1.24 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 13 - Forks: 5

lisad/phaser

The missing layer for complex data batch integration pipelines

Language: Python - Size: 548 KB - Last synced at: 18 days ago - Pushed at: 2 months ago - Stars: 13 - Forks: 1