Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-infrastructure

uktrade/pg-sync-roles

Python utility function to ensure that a PostgreSQL role has certain permissions

Language: Python - Size: 253 KB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 2 - Forks: 0

uktrade/data-workspace

PostgreSQL-based open source data analysis platform

Language: HCL - Size: 1.14 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 0 - Forks: 1

carbonitech/data-api

Data Virtualization improving accessibility to datasets and enriching those datasets - for the HVAC Industry

Language: Python - Size: 1.95 MB - Last synced: 11 days ago - Pushed: 12 days ago - Stars: 0 - Forks: 0

uktrade/mirror-git-to-s3

Python functions and CLI to mirror git repositories to S3

Language: Python - Size: 104 KB - Last synced: 11 days ago - Pushed: 9 months ago - Stars: 3 - Forks: 1

uktrade/public-data-api

The source for the Department for International Trade's Public Data API

Language: HTML - Size: 1.61 MB - Last synced: 19 days ago - Pushed: 19 days ago - Stars: 5 - Forks: 0

uktrade/stream-zip

Python function to construct a ZIP archive on the fly

Language: Python - Size: 665 KB - Last synced: 19 days ago - Pushed: 19 days ago - Stars: 87 - Forks: 7

uktrade/pg-bulk-ingest

Python utility function to ingest data into a SQLAlchemy-defined PostgreSQL table

Language: Python - Size: 678 KB - Last synced: 18 days ago - Pushed: 19 days ago - Stars: 34 - Forks: 0

uktrade/stream-read-xbrl

Python package to parse Companies House accounts data in a streaming way

Language: Python - Size: 727 KB - Last synced: 18 days ago - Pushed: 19 days ago - Stars: 13 - Forks: 2

uktrade/stream-unzip

Python function to stream unzip all the files in a ZIP archive on the fly

Language: Python - Size: 666 KB - Last synced: 18 days ago - Pushed: 19 days ago - Stars: 252 - Forks: 11

uktrade/quicksight-bulk-update-datasets

Command line interface (CLI) to make bulk updates to QuickSight datasets

Language: Python - Size: 114 KB - Last synced: 21 days ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

CrunchyData/postgres-operator

Production PostgreSQL for Kubernetes, from high availability Postgres clusters to full-scale database-as-a-service.

Language: Go - Size: 623 MB - Last synced: 23 days ago - Pushed: 23 days ago - Stars: 3,732 - Forks: 573

uktrade/vulnerability-priority-list

A command line report on a GitHub organisation's repositories, ordered by priority, and including time-to-SLA for each severity level

Language: Python - Size: 221 KB - Last synced: 25 days ago - Pushed: 26 days ago - Stars: 3 - Forks: 0

zalando/nakadi

A distributed event bus that implements a RESTful API abstraction on top of Kafka-like queues

Language: Java - Size: 14.7 MB - Last synced: 24 days ago - Pushed: about 2 months ago - Stars: 949 - Forks: 294

uktrade/data-workspace-superset

Language: Python - Size: 36.1 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

uktrade/data-workspace-mlflow

Language: Python - Size: 25.4 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

zalando/spilo

Highly available elephant herd: HA PostgreSQL cluster using Docker

Language: Python - Size: 27.8 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1,316 - Forks: 359

uktrade/data-workspace-tools

Language: HTML - Size: 4.14 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 3 - Forks: 0

uktrade/streamlit-gov-uk-components

A collection of Streamlit components that use or are inspired by the GOV.UK Design System

Language: Shell - Size: 2.58 MB - Last synced: 24 days ago - Pushed: 25 days ago - Stars: 5 - Forks: 0

zalando/postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes

Language: Go - Size: 32.5 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 3,951 - Forks: 937

uktrade/countries-of-interest-service

Lightweight API service for querying for companies that have expressed interest in exporting to specific countries

Language: Python - Size: 2.06 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 3 - Forks: 0

uktrade/legal-basis-api

Legal Basis for Consent Service API Server

Language: Python - Size: 1.19 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 3 - Forks: 1

uktrade/data-workspace-frontend

An open source data analysis platform with features for users with a range of technical skills

Language: Python - Size: 49.3 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 43 - Forks: 20

uktrade/data-engineering-common

Library of common functionality used by data engineering microservices

Language: Python - Size: 84 KB - Last synced: about 1 month ago - Pushed: 7 months ago - Stars: 1 - Forks: 0

uktrade/to-file-like-obj

Python utility function to convert an iterable of bytes or str to a readable file-like object

Language: Python - Size: 42 KB - Last synced: 11 days ago - Pushed: 7 months ago - Stars: 7 - Forks: 0

uktrade/sqlite-s3vfs

Python writable virtual filesystem for SQLite on S3

Language: Python - Size: 157 KB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 109 - Forks: 7

aivanzhang/panda_patrol

Language: Python - Size: 33.2 MB - Last synced: 18 days ago - Pushed: 5 months ago - Stars: 21 - Forks: 0

uktrade/mbtiles-s3-server

Python server to extract and serve vector tiles on the fly from an mbtiles file on S3

Language: Python - Size: 6.78 MB - Last synced: 14 days ago - Pushed: over 1 year ago - Stars: 136 - Forks: 4

zalando-incubator/spark-json-schema

JSON schema parser for Apache Spark

Language: Scala - Size: 78.1 KB - Last synced: 2 months ago - Pushed: over 1 year ago - Stars: 79 - Forks: 42

uktrade/data-workspace-gitlab

Language: Shell - Size: 5.86 KB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

tensorbase/tensorbase

TensorBase is a new big data warehouse with modern efforts.

Language: Rust - Size: 32.9 MB - Last synced: 3 months ago - Pushed: about 2 years ago - Stars: 1,423 - Forks: 116

thedataengineeringbook/thedataengineeringbook

The Data Engineering Book - a data engineering book by Thai people, for Thai people

Language: JavaScript - Size: 1.54 MB - Last synced: about 1 month ago - Pushed: 7 months ago - Stars: 103 - Forks: 43

amkrajewski/mpdd-alignn Fork of usnistgov/alignn

MPDD Calculator for Atomistic Line Graph Neural Network Deployment

Language: Python - Size: 151 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 3 - Forks: 0

uktrade/fargatespawner

Spawns JupyterHub single user servers in Docker containers running in AWS Fargate

Language: Python - Size: 67.4 KB - Last synced: 9 days ago - Pushed: almost 4 years ago - Stars: 44 - Forks: 21

zalando/PGObserver 📦

A battle-tested, flexible & comprehensive monitoring solution for your PostgreSQL databases

Language: Python - Size: 4.75 MB - Last synced: about 1 month ago - Pushed: almost 4 years ago - Stars: 315 - Forks: 64

zalando-nakadi/kanadi

Kanadi is a Nakadi client for Scala

Language: Scala - Size: 406 KB - Last synced: 25 days ago - Pushed: about 2 months ago - Stars: 30 - Forks: 20

uktrade/mobius3

Continuously sync folder to S3, using inotify under the hood

Language: Python - Size: 4.15 MB - Last synced: 13 days ago - Pushed: 2 months ago - Stars: 47 - Forks: 3

uktrade/stream-sqlite

Python function to extract rows from a SQLite file while iterating over its bytes

Language: Python - Size: 10.4 MB - Last synced: 12 days ago - Pushed: over 1 year ago - Stars: 23 - Forks: 5

uktrade/jwt-postgresql-proxy

Stateless JWT authentication in front of PostgreSQL

Language: Python - Size: 249 KB - Last synced: about 1 month ago - Pushed: over 3 years ago - Stars: 6 - Forks: 1

uktrade/git-lfs-http-mirror

Simple Python server to serve a read only HTTP mirror of git repositories that use Large File Storage (LFS)

Language: Python - Size: 32.2 KB - Last synced: 11 days ago - Pushed: 4 months ago - Stars: 0 - Forks: 1

uktrade/postgresql-proxy

Language: Python - Size: 17.6 KB - Last synced: about 1 month ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

uktrade/stream-read-ods

Python function to extract data from an ODS spreadsheet on the fly - without having to store the entire file in memory or disk

Language: Python - Size: 150 KB - Last synced: 11 days ago - Pushed: 10 months ago - Stars: 1 - Forks: 0

uktrade/stream-write-ods

Python function to construct an ODS spreadsheet on the fly - without having to store the entire file in memory or disk

Language: Python - Size: 126 KB - Last synced: 11 days ago - Pushed: 10 months ago - Stars: 3 - Forks: 0

uktrade/pg-force-execute

Context manager to run PostgreSQL queries with SQLAlchemy, terminating any other clients that block it

Language: Python - Size: 86.9 KB - Last synced: 14 days ago - Pushed: 10 months ago - Stars: 4 - Forks: 0

uktrade/iterable-subprocess

Python context manager to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be streamed

Language: Python - Size: 83 KB - Last synced: 29 days ago - Pushed: 9 months ago - Stars: 7 - Forks: 2

uktrade/tidy-json-to-csv

Convert JSON to a set of tidy CSV files

Language: Python - Size: 104 KB - Last synced: 18 days ago - Pushed: over 3 years ago - Stars: 22 - Forks: 0

abhishek-ch/data-machinelearning-the-boring-way

Build & Learn Data Engineering, Machine Learning over Kubernetes. No Shortcut approach.

Language: Python - Size: 3.33 MB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 52 - Forks: 9

uktrade/streampq

Python PostgreSQL adapter to stream results of multi-statement queries without a server-side cursor

Language: Python - Size: 228 KB - Last synced: 18 days ago - Pushed: over 1 year ago - Stars: 8 - Forks: 0

alphagov/consent-api

Service for sharing user consent to cookies across multiple domains

Language: Python - Size: 1.13 MB - Last synced: 13 days ago - Pushed: 13 days ago - Stars: 7 - Forks: 0

uktrade/theia-postgres

PostgreSQL plugin for Theia providing explorer, highlighting, diagnostics, and intellisense

Language: TypeScript - Size: 2.35 MB - Last synced: about 1 month ago - Pushed: 11 months ago - Stars: 1 - Forks: 0

uktrade/dt08-data-tools 📦

Tools which may be useful for data processing and data science applications

Language: Python - Size: 42 KB - Last synced: about 1 month ago - Pushed: about 4 years ago - Stars: 0 - Forks: 0

GiorgiaAuroraAdorni/virtual-CAT-data-infrastructure

This repository contains the data infrastructure for the Virtual Cross Array Task (CAT) platform designed to assess algorithmic skills among K-12 students.

Language: Java - Size: 937 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

uktrade/dns-rewrite-proxy

A DNS proxy server that conditionally rewrites and filters A record requests

Language: Python - Size: 114 KB - Last synced: 28 days ago - Pushed: over 3 years ago - Stars: 30 - Forks: 5

uktrade/activity-stream

Activity Stream is a collector of various interactions between contacts at companies.

Language: Python - Size: 1.65 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1 - Forks: 3

alphagov/analytics-settings-database Fork of google/analytics-settings-database

Export Google Analytics (GA4 and UA) settings

Language: Python - Size: 47.9 KB - Last synced: about 2 months ago - Pushed: 9 months ago - Stars: 1 - Forks: 0

uktrade/data-flow-metrics

Language: Python - Size: 6.84 KB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 1 - Forks: 0

uktrade/kibana-paas

Dockerfile and associated files for deploying Kibana in GOV.UK PaaS

Language: Python - Size: 17.6 KB - Last synced: about 1 month ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

SurenNihalani/incubator-iceberg Fork of apache/iceberg

Apache Iceberg (Incubating)

Language: Java - Size: 4.53 MB - Last synced: 10 months ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0

uktrade/s3-dropbox

A simple bearer token authenticated dropbox that drops its payloads into an S3 bucket

Language: Python - Size: 48.8 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1 - Forks: 0

realize-engineering/pipebird

Pipebird is open source infrastructure for securely sharing data with customers.

Language: TypeScript - Size: 1.91 MB - Last synced: 7 months ago - Pushed: 11 months ago - Stars: 168 - Forks: 7

yennanliu/data_infra_repo

Collections of POC/dev data infrastructure. | #SE

Language: Python - Size: 7.06 MB - Last synced: about 1 month ago - Pushed: about 1 year ago - Stars: 6 - Forks: 0

anna-geller/kestra-terraform-examples

Bring Infrastructure as Code best practices to your data workflows with Kestra and Terraform

Language: HCL - Size: 735 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

alphagov/sde-prototype-govuk

A fake GOV.UK homepage and start pages for SDE prototype services

Language: HTML - Size: 343 KB - Last synced: about 2 months ago - Pushed: 9 months ago - Stars: 3 - Forks: 0

Jzbonner/dataengineering-db

Information relating to topics on Data Engineering, Data Infrastructure, Data Storing, Data Warehouses and Business Analysis. For those interested in both conceptual theory and use case examples for database design and development.

Size: 1020 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 4 - Forks: 2

uktrade/mlflow-tracking-server 📦

Language: Python - Size: 4.88 KB - Last synced: about 1 month ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

alphagov/sde-prototype-haas Fork of Nyzl/HaaS

SDE prototype dummy service - Hexagrams as a Service

Language: HTML - Size: 410 KB - Last synced: about 2 months ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

uktrade/jupyters3

Jupyter Notebook Contents Manager for AWS S3

Language: Python - Size: 127 KB - Last synced: 11 days ago - Pushed: about 4 years ago - Stars: 17 - Forks: 6

zalando-incubator/darty 📦

Data dependency manager

Language: Python - Size: 35.2 KB - Last synced: 2 months ago - Pushed: about 4 years ago - Stars: 22 - Forks: 3

uktrade/data-store-service

Language: Python - Size: 6.84 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

uktrade/ecs-new-task-definition

Creates a new task definition of an ECS task

Language: Shell - Size: 5.86 KB - Last synced: about 1 month ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

bizzabo/elasticsearch_to_bigquery_data_pipeline

A generic data pipeline which will map Elasticsearch documents to BigQuery table rows

Language: Kotlin - Size: 627 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 12 - Forks: 3

uktrade/python-streaming-left-join

Join iterables in code without loading them all in memory: similar to a SQL left join

Language: Python - Size: 39.1 KB - Last synced: about 1 month ago - Pushed: over 3 years ago - Stars: 2 - Forks: 0

uktrade/kibana-proxy

Language: Python - Size: 15.6 KB - Last synced: about 1 month ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

uktrade/aio-throttle-to-next-second Fork of michalc/aiothrottler

Throttler for asyncio Python that throttles to the next whole second

Language: Python - Size: 51.8 KB - Last synced: 15 days ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

uktrade/hawk-server-asyncio

Utility function to perform the server-side of Hawk authentication for asyncio HTTP servers

Language: Python - Size: 70.3 KB - Last synced: about 1 month ago - Pushed: almost 4 years ago - Stars: 0 - Forks: 0

uktrade/data-engineering-sample-app

A sample app showing how to use the data-engineering-common repo to create a lightweight Flask app with Hawk authentication

Language: Python - Size: 16.6 KB - Last synced: about 1 month ago - Pushed: almost 4 years ago - Stars: 0 - Forks: 0

uktrade/hawk-server

Utility function to perform the server-side of Hawk authentication

Language: Python - Size: 42 KB - Last synced: 21 days ago - Pushed: almost 4 years ago - Stars: 0 - Forks: 0