Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: iceberg

dbt-athena/dbt-athena

The athena adapter plugin for dbt (https://getdbt.com)

Language: Python - Size: 1.01 MB - Last synced: about 2 hours ago - Pushed: about 11 hours ago - Stars: 191 - Forks: 84

trinodb/trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Language: Java - Size: 235 MB - Last synced: about 5 hours ago - Pushed: about 7 hours ago - Stars: 9,633 - Forks: 2,790

apache/iceberg-python

Apache PyIceberg

Language: Python - Size: 5.87 MB - Last synced: about 16 hours ago - Pushed: about 20 hours ago - Stars: 242 - Forks: 99

alldatacenter/alldata

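🔥🔥 AllData is a definable big data middle-platform product: it uses the data platform as its foundation, the data middle platform as its bridge, the machine learning platform as its middle-layer framework, and large-model applications as its upstream products, providing a full-chain digital solution. New commercial membership edition X WeChat group: https://docs.qq.com/doc/DVHlkSEtvVXVCdEFo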

Language: Java - Size: 1.55 GB - Last synced: about 5 hours ago - Pushed: about 21 hours ago - Stars: 2,321 - Forks: 816

projectnessie/iceberg-catalog-migrator

CLI tool to bulk migrate tables from one catalog to another without a data copy

Language: Java - Size: 451 KB - Last synced: about 15 hours ago - Pushed: 1 day ago - Stars: 31 - Forks: 10

conduitio-labs/conduit-connector-s3-iceberg

Language: Java - Size: 196 KB - Last synced: about 19 hours ago - Pushed: 2 days ago - Stars: 0 - Forks: 0

linkedin/openhouse

Open Control Plane for Tables in Data Lakehouse

Language: Java - Size: 4.25 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 257 - Forks: 36

WeBankFinTech/Streamis

Streaming application development and management system, based on Linkis and DSS, planning to provide the workflow-like graphical drag-and-drop development capability.

Language: Java - Size: 70 MB - Last synced: 5 days ago - Pushed: about 1 month ago - Stars: 97 - Forks: 40

ComputeAI/computeAI-integrations

Supercharge Your Compute for Analytics & AI

Language: Jupyter Notebook - Size: 186 KB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 7 - Forks: 1

apache/iceberg-go

Apache Iceberg - Go

Language: Go - Size: 136 KB - Last synced: 6 days ago - Pushed: 7 days ago - Stars: 71 - Forks: 14

apache/iceberg-rust

Apache Iceberg

Language: Rust - Size: 1.5 MB - Last synced: 15 days ago - Pushed: 15 days ago - Stars: 405 - Forks: 84

apache/doris-streamloader

Stream Loader for Apache Doris

Language: Go - Size: 37.1 KB - Last synced: 7 days ago - Pushed: 7 days ago - Stars: 12 - Forks: 10

explodingcamera/lastfm-iceberg

Generate an Iceberg-Chart based on your Last.fm music history

Language: JavaScript - Size: 2.22 MB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 9 - Forks: 1

icelake-io/icelake

Pure Rust Iceberg Implementation

Language: Rust - Size: 647 KB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 160 - Forks: 17

Quocc1/OpenStack

An end-to-end open-source data stack for crawling and visualizing real estate data, facilitating insights into market trends.

Language: Jupyter Notebook - Size: 6.97 MB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 1 - Forks: 0

DashDipti/cdw-workshop

This workshop uses a publicly available airlines data set to showcase how to use CDW for an Open Data Lakehouse with Apache Iceberg.

Size: 33.2 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 4 - Forks: 8

projectnessie/nessie

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

Language: Java - Size: 173 MB - Last synced: 30 days ago - Pushed: 30 days ago - Stars: 824 - Forks: 115

adform/stream-loader

Components for building stream loaders from Kafka to arbitrary storages

Language: Scala - Size: 2.47 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 31 - Forks: 7

Mrkuhuo/bigdata_learning

Learning code for big data components

Language: Java - Size: 36.7 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 22 - Forks: 7

mytiki/platform-cap-bulk

Bulk load data into a mytiki.com data lake

Language: Rust - Size: 85 KB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 0 - Forks: 1

wherobots/havasu

The spatial table format for spatial lakehouse

Size: 9.77 KB - Last synced: 11 days ago - Pushed: 12 days ago - Stars: 13 - Forks: 0

apache/iceberg

Apache Iceberg

Language: Java - Size: 60.2 MB - Last synced: 14 days ago - Pushed: 14 days ago - Stars: 5,531 - Forks: 1,998

memiiso/debezium-server-iceberg

Replicates database CDC events to Apache Iceberg Tables

Language: Java - Size: 670 KB - Last synced: 12 days ago - Pushed: 13 days ago - Stars: 159 - Forks: 33

ngmy/wt-settings

ngmy's Windows Terminal settings

Language: Shell - Size: 17.6 KB - Last synced: 13 days ago - Pushed: 7 months ago - Stars: 1 - Forks: 0

ngmy/mac-terminal-settings

ngmy's Mac Terminal settings

Size: 11.7 KB - Last synced: 13 days ago - Pushed: almost 4 years ago - Stars: 1 - Forks: 0

Lapig/Iceberg

LASTFM thing

Language: Java - Size: 2.21 MB - Last synced: 14 days ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

ebyhr/puffin-tools

Language: Java - Size: 61.5 KB - Last synced: 14 days ago - Pushed: over 1 year ago - Stars: 5 - Forks: 0

cocopon/xcode-iceberg

Dark blue color theme for Xcode

Size: 6.84 KB - Last synced: 14 days ago - Pushed: over 1 year ago - Stars: 13 - Forks: 1

cocopon/hyper-iceberg

Dark blue color theme for Hyper™

Language: JavaScript - Size: 4.88 KB - Last synced: 14 days ago - Pushed: over 5 years ago - Stars: 18 - Forks: 1

cocopon/atom-iceberg-syntax

A dark blue color scheme for Atom, originally for Vim

Language: CSS - Size: 14.6 KB - Last synced: 14 days ago - Pushed: over 7 years ago - Stars: 7 - Forks: 1

cocopon/iceberg.vim

:antarctica: Bluish color scheme for Vim and Neovim

Language: Vim Script - Size: 1.78 MB - Last synced: 14 days ago - Pushed: 16 days ago - Stars: 2,131 - Forks: 130

apache/doris-thirdparty

Self-managed thirdparty dependencies for Apache Doris

Size: 244 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 28 - Forks: 25

mytiki/platform-cap-async-stage

Staging environment for asynchronous data writes to mytiki.com data lakes

Size: 31.3 KB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 0 - Forks: 0

mytiki/platform-cap-async-endpoint

Generate REST endpoints to asynchronously load data into a mytiki.com data lake

Language: Rust - Size: 102 KB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 0 - Forks: 0

mytiki/platform-cap-async-write

Combine and load staged data from REST endpoints into a mytiki.com data lake

Language: Java - Size: 102 KB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 0 - Forks: 0

StarRocks/starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. InfoWorld’s 2023 BOSSIE Award for best open source software.

Language: Java - Size: 343 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 7,708 - Forks: 1,600

seanpm2001/BWS_IceBerg

Technical specification and documentation for BWS IceBergs, the highest tier big data storage for the BWS Ice hardware series.

Language: Markdown - Size: 533 KB - Last synced: 18 days ago - Pushed: over 1 year ago - Stars: 2 - Forks: 1

apache/doris

Apache Doris is an easy-to-use, high performance and unified analytics database.

Language: Java - Size: 696 MB - Last synced: 21 days ago - Pushed: 21 days ago - Stars: 11,354 - Forks: 3,045

sutoiku/puffin

Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg

Size: 17.8 MB - Last synced: 21 days ago - Pushed: about 1 year ago - Stars: 281 - Forks: 12

DaniJonesOcean/southern_fwflux Fork of mark-hammond/southern_fwflux

Language: MATLAB - Size: 27.5 MB - Last synced: 28 days ago - Pushed: 28 days ago - Stars: 1 - Forks: 0

waiyan1612/postgres-kafka-iceberg-pipeline

Example pipeline to stream the data changes from RDBMS to Apache Iceberg tables

Language: Python - Size: 29.3 KB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 0 - Forks: 0

slidoapp/duckberg

Python package for querying iceberg data through duckdb.

Language: Jupyter Notebook - Size: 548 KB - Last synced: 3 days ago - Pushed: 3 months ago - Stars: 29 - Forks: 0

apache/doris-website

Apache Doris Website

Language: TypeScript - Size: 275 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 62 - Forks: 110

cocopon/vscode-iceberg-theme

Dark blue color theme for Visual Studio Code

Language: TypeScript - Size: 95.7 KB - Last synced: 14 days ago - Pushed: over 1 year ago - Stars: 112 - Forks: 9

wolfeidau/duckdb-docker-iceberg

This is a Docker image containing the C++ libs required to run DuckDB with Apache Iceberg

Language: Makefile - Size: 23.4 KB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 2 - Forks: 0

1ambda/lakehouse

Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)

Language: Kotlin - Size: 3.28 MB - Last synced: about 1 month ago - Pushed: 8 months ago - Stars: 26 - Forks: 5

tlepple/data_origination_workshop

Hands-on workshop with Iceberg, Redpanda, Debezium and Kafka-Connect

Language: Shell - Size: 3.78 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 11 - Forks: 1

WesleyJw/modern-data-stack

Creating a modern data stack in Kubernetes with open-source products, both on-premises and cloud-agnostic, is an increasingly popular approach. By leveraging Kubernetes for container orchestration, you can deploy and manage data processing, storage, and analysis tools more efficiently.

Language: Python - Size: 28 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

zsvoboda/ngods-stocks

New Generation Opensource Data Stack Demo

Language: Jupyter Notebook - Size: 22.1 MB - Last synced: 2 months ago - Pushed: over 1 year ago - Stars: 365 - Forks: 86

Stefen-Taime/Iceberg-Dbt-Trino-Hive-modern-open-source-data-stack

To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a music streaming platform, let’s delve into the detailed workflow and benefits of each component.

Language: Shell - Size: 33.2 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 1 - Forks: 0

ydah/irb-theme-iceberg

🟦 Bluish color scheme for irb

Language: Ruby - Size: 14.6 KB - Last synced: 14 days ago - Pushed: 3 months ago - Stars: 6 - Forks: 0

ismailsimsek/iceberg-examples

Apache Iceberg Spark S3 examples

Language: Java - Size: 33.2 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 12 - Forks: 8

aws-samples/emr-serverless-samples

Example code for running Spark and Hive jobs on EMR Serverless.

Language: Python - Size: 1.69 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 131 - Forks: 68

izhangzhihao/Real-time-Data-Warehouse

Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi

Language: Dockerfile - Size: 106 KB - Last synced: 2 months ago - Pushed: 5 months ago - Stars: 95 - Forks: 40

jeongwhanchoi/MLND-Capstone-Project

Capstone Project for Udacity Machine Learning Nanodegree

Language: Jupyter Notebook - Size: 22.2 MB - Last synced: 25 days ago - Pushed: over 5 years ago - Stars: 1 - Forks: 0

apache/iceberg-docs 📦

Apache Iceberg Documentation Site

Language: SCSS - Size: 28.9 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 40 - Forks: 98

apache/doris-sdk

SDK for Apache Doris

Language: Thrift - Size: 35.2 KB - Last synced: 6 days ago - Pushed: about 1 month ago - Stars: 6 - Forks: 7

tkasuz/pyiceberg-lambda-layer

Public Lambda Layer for pyiceberg

Language: HCL - Size: 92.8 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

projectnessie/nessie-demos

Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.

Language: Jupyter Notebook - Size: 813 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 22 - Forks: 19

spatialx-project/geolake

Universal solution for geospatial data tailored to data lakehouse systems for the first time in the industry

Language: Java - Size: 20.2 MB - Last synced: 4 months ago - Pushed: 7 months ago - Stars: 45 - Forks: 3

spatialx-project/sedona-iceberg-extension

Unleash the power of Apache Sedona when processing Iceberg Tables.

Language: Scala - Size: 107 KB - Last synced: 4 months ago - Pushed: about 1 year ago - Stars: 2 - Forks: 0

bloomberg/trino Fork of trinodb/trino

Trino, the distributed SQL query engine for big data

Size: 223 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 10 - Forks: 8

luckyQing/cdc

Real-time data synchronization via Flink CDC, supporting MySQL to MySQL (database migration scenarios) and MySQL to Iceberg (reporting scenarios)

Language: Java - Size: 97.7 KB - Last synced: 23 days ago - Pushed: 7 months ago - Stars: 1 - Forks: 0

wwisser/spark-dsl-generator

Java DSL generator for Spark columns supporting Parquet & SQL files using Hive/Iceberg dialect

Language: Java - Size: 23.4 KB - Last synced: 4 months ago - Pushed: 6 months ago - Stars: 1 - Forks: 0

MOBIN-F/iceberg-spark-tpcds-benchmark

iceberg-spark-tpcds-benchmark

Language: Scala - Size: 13.7 MB - Last synced: 7 months ago - Pushed: about 2 years ago - Stars: 2 - Forks: 0

luatnc87/robust-data-analytics-platform-with-duckdb-dbt-iceberg

Discover the simplicity and strength of Duckdb, dbt, and Iceberg in this project. Create an efficient, versatile data analytics solution for valuable insights.

Language: Shell - Size: 545 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 2 - Forks: 0

Joel-hanson/Iceberg-locations

Current Antarctic large iceberg positions derived from ASCAT and OSCAT-2

Language: Python - Size: 147 KB - Last synced: about 1 month ago - Pushed: 9 months ago - Stars: 6 - Forks: 1

OrvilleX/DataLake

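This tutorial mainly shares knowledge about the current mainstream data lake frameworks; the current plan is to write tutorials on how to use the three mainstream frameworks: Delta Lake, Hudi, and Iceberg.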

Language: Scala - Size: 17.6 KB - Last synced: 10 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

Alexiscomete/iceberg_dibistan

Website and Discord bot for Dibistan's Iceberg. Can be forked to create another iceberg. An iceberg classifies all references to a world by their fame.

Language: Kotlin - Size: 11.4 MB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 1 - Forks: 1

tj---/iceberg-demo

A sample implementation of stream writes to an Iceberg table on GCS using Flink and reading it using Trino

Language: Java - Size: 13.7 KB - Last synced: 10 months ago - Pushed: almost 2 years ago - Stars: 8 - Forks: 3

moj-analytical-services/iceberg-evaluation

Language: Jupyter Notebook - Size: 2.6 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 1 - Forks: 1

aguirreSL/Iceberg

This is an example of using the hybrid auralization method with VBAP and Ambisonics

Language: MATLAB - Size: 82.7 MB - Last synced: 6 months ago - Pushed: 8 months ago - Stars: 5 - Forks: 0

vontikov/flink-iceberg-demo

Apache Flink and Apache Iceberg Demo

Language: Dockerfile - Size: 4.88 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

Chenzhiling/datalake-metadata-api

This project helps you get metadata for Iceberg, Delta, and Hudi tables using Java

Language: Java - Size: 13.7 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 1 - Forks: 0

bihaiyang/datalake-example

Data lake implementation demo, including Iceberg on Flink, Iceberg on Spark, Hudi on Flink, and Hudi on Spark

Language: Java - Size: 924 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 4 - Forks: 0

sudohainguyen/mini-lakehouse

Data lakehouse at home with k8s and helm

Language: Jupyter Notebook - Size: 530 KB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 0 - Forks: 0

maubat/IceDT

An alternative machine-learning implementation focused on superpixel segmentation, deep learning, and ensemble learning for iceberg detection from SAR images

Language: Jupyter Notebook - Size: 6.46 MB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 1 - Forks: 1

10xfuturetechnologies/kafka-connect-iceberg

Kafka Connector for Iceberg tables

Language: Java - Size: 117 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 9 - Forks: 2

dacort/modern-data-lake-storage-layers

Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work

Language: Jupyter Notebook - Size: 262 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 33 - Forks: 21

ExpediaGroup/hiveberg 📦

Demonstration of a Hive Input Format for Iceberg

Language: Java - Size: 172 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 24 - Forks: 8

Chenzhiling/dataLake-template

Some demos of using Spark to write MySQL and Kafka data to data lakes such as Delta, Hudi, and Iceberg

Language: Scala - Size: 65.4 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

zsvoboda/ngods

New generation opensource data stack

Language: Dockerfile - Size: 1.62 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 19 - Forks: 3

kuni3933/pomotroid-Iceberg

Iceberg color theme in Splode/pomotroid https://github.com/Splode/pomotroid https://github.com/cocopon/iceberg.vim

Size: 11.7 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0

jaehyeon-kim/iceberg-etl-demo

Data Warehousing ETL Demo with Apache Iceberg on EMR Local Environment

Language: Python - Size: 1.26 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 1

gkeep/iceberg-dark

🇦🇶 Dark blue color scheme for various programs, complementary to iceberg.vim

Language: Vim script - Size: 451 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 84 - Forks: 4

Dam1029/iceberg-assembly

A collection of the latest articles, resources, and demos related to Apache Iceberg

Size: 51.8 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 25 - Forks: 10

jaehyeon-kim/dbt-on-aws

dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats

Language: HCL - Size: 192 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 3 - Forks: 0

spancer/zeus

Zeus is an open-source analytical engine for big data held in a data lake; it was designed to provide OLAP (Online Analytical Processing) capability in the big data era. You can use Zeus to store, query, analyze, and manage data.

Language: Java - Size: 631 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 20 - Forks: 9

nikini/iceberg 📦

Karma, flow, eslint, sass lint packer

Language: JavaScript - Size: 542 KB - Last synced: 9 days ago - Pushed: over 6 years ago - Stars: 1 - Forks: 0

lvyanquan/learning

A collection of personal study materials

Size: 162 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 5 - Forks: 1

kai-ten/dotsDB

Exploring ways to extend Apache Iceberg's capabilities.

Language: Rust - Size: 109 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 2 - Forks: 1

victorcouste/iceberg-trino-sql-demo

Trino SQL queries examples for Iceberg connector

Size: 2.93 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 1

unexist/showcase-hadoop-cdc-quarkus

Showcase for Hadoop with CDC on Quarkus [MIRROR]

Language: Java - Size: 25 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

OpenTableFormat/OpenTableFormat.github.io

Website for open table format 🕸

Language: CSS - Size: 4.59 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

ev2900/EMR_Studio_Iceberg

Apache Iceberg examples designed to be run on AWS Elastic MapReduce (EMR) via EMR Studio or EMR Notebooks

Language: Jupyter Notebook - Size: 113 KB - Last synced: 30 days ago - Pushed: about 1 month ago - Stars: 1 - Forks: 0

jolshylar/qazmine

⛏ Qazmine - a Web Game w/ Periodic Table Elements

Language: TypeScript - Size: 5 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

iomete/kafka-streaming-job

Kafka streaming job from iomete. This streaming job copies data from Kafka to Iceberg.

Language: Python - Size: 383 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 2 - Forks: 0

brandonsueur/Iceberg-iTerm2

dark blue color scheme for iTerm2

Size: 2.93 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 1 - Forks: 1

lgo/iceberg-lookup-srv

Service library for efficient and fast Iceberg dataset point lookups

Language: Java - Size: 24.4 KB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0

gkeep/dotfiles

Dotfiles for i3wm/sway, neovim and others with iceberg.vim colorscheme

Language: Shell - Size: 19.4 MB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 7 - Forks: 0