Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: spark-structured-streaming

Pedro-Manoel/iot-analytics-solution-tcc

🎓 Repository with an IoT Analytics solution, developed as part of the final-year project (Trabalho de Conclusão de Curso, TCC) for the Computer Science program at the Universidade Federal de Campina Grande (UFCG)

Language: TypeScript - Size: 165 MB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 0 - Forks: 0

pprzetacznik/datalake

Simple datalake

Language: Python - Size: 34.2 KB - Last synced: 25 days ago - Pushed: 25 days ago - Stars: 1 - Forks: 0

guidok91/spark-structured-streaming-kafka

Spark Structured Streaming data pipeline that processes movie ratings data in real-time.

Language: Python - Size: 111 KB - Last synced: 27 days ago - Pushed: 28 days ago - Stars: 11 - Forks: 3

ohmycloud/sub_trip_with_structured_spark_streaming

Trip segmentation using Structured Spark Streaming

Language: Scala - Size: 59.6 KB - Last synced: 30 days ago - Pushed: 30 days ago - Stars: 0 - Forks: 0

AlexRogalskiy/spark-patterns

🏆 Spark4You Design patterns

Language: Shell - Size: 15.6 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1 - Forks: 0

qubole/kinesis-sql

Kinesis Connector for Structured Streaming

Language: Scala - Size: 251 KB - Last synced: 30 days ago - Pushed: 10 months ago - Stars: 137 - Forks: 80

jaceklaskowski/spark-workshop

Apache Spark™ and Scala Workshops

Language: HTML - Size: 57 MB - Last synced: 16 days ago - Pushed: over 1 year ago - Stars: 253 - Forks: 143

AmadeusITGroup/Elastic-Scaling

Elastic scaling is a library that lets you control the number of resources (executors or workers) instantiated by a Spark Structured Streaming job in order to optimize the effective micro-batch duration.

Language: Scala - Size: 32.2 KB - Last synced: about 1 month ago - Pushed: 6 months ago - Stars: 4 - Forks: 1

roksolana-d/spark-streaming-examples

Research on legacy and structured streaming with Spark

Language: Scala - Size: 22.5 KB - Last synced: 4 months ago - Pushed: about 5 years ago - Stars: 2 - Forks: 2

ramottamado/isabel

Spark Structured Streaming with Kafka Integration

Language: Python - Size: 14.6 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 1 - Forks: 0

hadiezatpanah/Spark_Structured_Streaming_Java

In this solution, the issue of creating a table with case-sensitive columns (in the scenario where the table doesn't exist or when writing the table in overwrite mode) in Oracle has been addressed by developing a custom Oracle dialect and registering it.

Language: Java - Size: 385 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

tienlepham094/TwitterSparkStreaming

Twitter Streaming Project

Language: Python - Size: 202 KB - Last synced: 6 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

RosarioB/spark-streaming-kafka

Exploring Spark Structured Streaming features using Jupyter notebooks and PySpark, interacting with a Kafka cluster.

Size: 130 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

ozancicek/artan

Online latent state estimation with Spark

Language: Scala - Size: 553 KB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 5 - Forks: 2

AbsaOSS/hyperdrive

Extensible streaming ingestion pipeline on top of Apache Spark

Language: Scala - Size: 1.62 MB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 41 - Forks: 13

SimpleSoulll/ss-aof

Spark Structured Streaming append-only file source based on the DataSource API v2. Streaming capture of incremental logs with Spark.

Language: Scala - Size: 20.5 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

liupeirong/spark-structured-streaming-ci-cd

Spark structured streaming with unit tests integrated with Travis CI

Language: Scala - Size: 5.86 KB - Last synced: 8 months ago - Pushed: almost 6 years ago - Stars: 1 - Forks: 3

nama1arpit/reddit-streaming-pipeline

A real-time reddit data streaming pipeline for sentiment analysis of various subreddits

Language: HCL - Size: 15.1 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 51 - Forks: 2

jwsmai/ScalaTools

This project provides Apache Spark SQL and Flink DataStream API examples in Scala

Language: Scala - Size: 3.19 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

yuanzhaoYZ/spark_structured_streaming_demo

A Log Analytics demo based on Spark Structured Streaming + Kafka

Language: Python - Size: 1.41 MB - Last synced: 9 months ago - Pushed: almost 5 years ago - Stars: 1 - Forks: 3

fhuertas/uah-mbi-2019-streaming

Repository for the UAM class, Master's in Business Intelligence, Data Parallelization, Streaming module

Language: Jupyter Notebook - Size: 9.04 MB - Last synced: 10 months ago - Pushed: over 1 year ago - Stars: 2 - Forks: 1

kklimexk/zio-playground

Playground for ZIO library

Language: Scala - Size: 16.6 KB - Last synced: 10 months ago - Pushed: almost 4 years ago - Stars: 0 - Forks: 0

NashTech-Labs/Sparkathon

A library having Java and Scala examples for Spark 2.x

Language: Java - Size: 113 MB - Last synced: 7 months ago - Pushed: over 7 years ago - Stars: 7 - Forks: 9

firecast/dhs-2019-demo

DataHack Summit 2019 demo files

Size: 33.7 MB - Last synced: 10 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 1

JulienPeloton/mini_spark_broker

Design and proof-of-concept for a Broker for astronomy using Apache Spark

Language: Jupyter Notebook - Size: 8.98 MB - Last synced: 10 months ago - Pushed: about 5 years ago - Stars: 3 - Forks: 2

Mark1002/sf-crime-statistics-spark-streaming

my udacity project

Language: Jupyter Notebook - Size: 1.69 MB - Last synced: 10 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

garystafford/streaming-sales-generator

Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python

Language: Python - Size: 9.28 MB - Last synced: 10 months ago - Pushed: over 1 year ago - Stars: 28 - Forks: 11

chermenin/spark-states

Custom state store providers for Apache Spark

Language: Scala - Size: 260 KB - Last synced: 24 days ago - Pushed: over 2 years ago - Stars: 92 - Forks: 26

hadiezatpanah/Spark_Java_Stateful

This project presents a distributable solution based on Spark Java that connects start and end session events in a stateful manner. It uses `flatMapGroupsWithState`, a powerful feature for stateful stream processing in Spark that lets you maintain and update state across batches.

Language: Java - Size: 95.7 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

hadiezatpanah/Spark_Java_MostValuableCustomers

This Spark Java project serves as a demonstration of Gradle Spark configuration, specifically focusing on utilizing the MemoryStream class as the streaming source.

Language: Java - Size: 65.4 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 2 - Forks: 0

AndrewKuzmin/spark-structured-streaming-examples

Spark Structured Streaming examples using version 3.4.0

Language: Scala - Size: 1.06 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 25 - Forks: 14

stephen29xie/tweet-streaming-data-pipeline

Real-time streaming data pipeline for Twitter Tweets

Language: Scala - Size: 301 KB - Last synced: 10 months ago - Pushed: over 2 years ago - Stars: 11 - Forks: 9

Mathews-Tom/MSc-in-Machine-Learning-and-Artificial-Intelligence

Master of Science in Machine Learning & Artificial Intelligence - Indian Institute of Technology Madras & Liverpool John Moores University

Language: Jupyter Notebook - Size: 2.12 GB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 6 - Forks: 7

xkondix/MsgBrokerSys

Spark Structured Streaming vs Kafka Streams

Language: Python - Size: 55.4 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 1 - Forks: 0

rajat2004/twitter-kafka

Twitter web app that uses Apache Kafka and Spark to perform analysis

Language: Python - Size: 29.3 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

streamnative/awesome-pulsar

A curated list of Pulsar tools, integrations and resources.

Size: 11.7 KB - Last synced: 2 days ago - Pushed: over 3 years ago - Stars: 78 - Forks: 9

Uriah372-DS/DDBMSPysparkProject

A course project implementing machine learning with Spark Structured Streaming in Python

Language: Jupyter Notebook - Size: 20.8 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

anmollp/Zootopia

A distributed streaming data processing pipeline.

Language: Python - Size: 1.15 MB - Last synced: 11 months ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

b-b3rn4rd/terraform-provider-emrstreaming

The emrstreaming provider offers continuous deployment functionality for streaming steps into an EMR cluster.

Language: Go - Size: 116 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

vvittis/CCFD-RF

Credit Card Fraud Detection with Random Forest

Language: Java - Size: 4.49 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 2 - Forks: 0

tomaztk/Azure-Databricks

Azure Databricks - Advent of 2020 Blogposts

Language: Jupyter Notebook - Size: 44.9 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 46 - Forks: 33

martinKindall/NYC-Taxi-Limousine-Data-Spark

NYC Taxi & Limousine Commission's open data with Spark Streaming 3.0.0

Language: Scala - Size: 43.9 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 1

vodkolav/DataEngineerProject

This is my final project for Data Engineer Expert course at Naya College.

Language: Jupyter Notebook - Size: 930 KB - Last synced: 12 months ago - Pushed: over 4 years ago - Stars: 1 - Forks: 0

CloudComputingProject-2022/Data_visualization_and_analysis_tool_for_telemetry_data

A naive anomaly detection and data visualization tool for F1 onboard telemetry data.

Language: Python - Size: 1.4 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 8 - Forks: 1

rajeshsantha/MonitoredStructuredStreaming

Repository for Spark structured streaming use case implementations.

Language: Scala - Size: 65.4 KB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 1 - Forks: 1

hoseinlook/cpu-anomaly-detection-with-spark

CPU anomaly detection with Spark

Language: Python - Size: 333 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 5 - Forks: 0

falaybeg/SparkStreaming-Network-Anomaly-Detection

This repository includes supervised and unsupervised machine learning methods which are used to detect anomalies on network datasets. Decision Tree, Random Forest, Gradient Boost Tree, Naive Bayes, and Logistic Regression were used for supervised learning. K-Means was used for unsupervised learning.

Language: Jupyter Notebook - Size: 2.98 MB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 11 - Forks: 3

hadiezatpanah/Trending_Topic_Spark_Streaming_Scala

An end-to-end solution that reads data from a streaming source (Kafka), extracts topics from the data in each time window, calculates hot topics using a modified z-score algorithm, and stores the final trending topics in a PostgreSQL database.

Language: Scala - Size: 6.47 MB - Last synced: 11 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

jacobceles/ChicagoTaxiTrips-SparkStreaming-RealTimeDashboard

Analyzing Chicago taxi trips dataset using Spark Streaming, and a real-time dashboard for reporting using Flask.

Language: CSS - Size: 15.4 MB - Last synced: 9 months ago - Pushed: over 2 years ago - Stars: 1 - Forks: 1

iomete/sql-streaming-sqs

Fork of the Apache Bahir sql-streaming-sqs, compatible with Spark 3

Language: Scala - Size: 25.4 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

bluejoe2008/spark-http-stream

Spark Structured Streaming via HTTP communication

Language: Scala - Size: 207 KB - Last synced: 10 months ago - Pushed: almost 2 years ago - Stars: 18 - Forks: 10

dharaneeshvrd/spark-examples

Spark Examples

Language: Python - Size: 35.2 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 2 - Forks: 5

thedevd/techBlog

Examples of IT ruling technologies

Language: Scala - Size: 29.3 MB - Last synced: 14 days ago - Pushed: 3 months ago - Stars: 1 - Forks: 1

ArmanShakeri/Pyspark-upsert-oracle

PySpark sample for upserting data into an Oracle table

Language: Python - Size: 23.4 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

renardeinside/spark-streaming-state-store-example

Spark Structured Streaming with State Store

Language: Scala - Size: 27.3 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 5 - Forks: 3

iomete/kafka-streaming-job

Kafka streaming job from iomete. This streaming job copies data from Kafka to Iceberg.

Language: Python - Size: 383 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 2 - Forks: 0

hosnaa/Apache-Spark-Streaming-Analysis

Analysis of streaming daily retail data using Spark Structured Streaming, with queries over the data to extract insights

Language: HTML - Size: 57.6 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0

Thelin90/deiteo

P.O.C Spark On Kubernetes

Language: Shell - Size: 1.21 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 1 - Forks: 2

mpfishe2/eventhubs-databricks-quickstart

Get up and running quickly with Spark Structured Streaming on Azure Databricks using Azure Event Hubs

Language: Scala - Size: 3.91 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 1 - Forks: 0

PierreVerbe/Scala-Spark-Template

🛠️ Template to do data processing with Scala and Apache Spark ✨

Language: Scala - Size: 137 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

michelheil/BigData

Projects related to Big Data technologies

Language: Java - Size: 2.24 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

AndrewKuzmin/Analytics-For-IoT-Devices-Using-Spark

Analytics for IoT devices using Apache Spark Structured Streaming 2.4.0

Language: Scala - Size: 1.03 MB - Last synced: 12 months ago - Pushed: about 5 years ago - Stars: 5 - Forks: 1

aTechGuide/click-stream-analysis

Spark Structured Streaming app that aggregates data over a rolling window of events (not necessarily time-based)

Language: Scala - Size: 8.79 KB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 0 - Forks: 0

PLarboulette/spark-structured-streaming

Language: Scala - Size: 22.5 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

sunujh6/spark_practice

Language: Jupyter Notebook - Size: 1.62 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 2 - Forks: 0

LuckyZXL2016/Spark-Example

Examples for Spark 1.6 and Spark 2.2, covering Kafka, Flume, Structured Streaming, Jedis, Elasticsearch, MySQL, and DataFrame

Language: Scala - Size: 2.06 MB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 15 - Forks: 6

aTechGuide/spark-streaming

Spark Streaming Scripts and integrations with other technologies

Language: TSQL - Size: 32.4 MB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 0 - Forks: 0

aqib1/java-spark-structured-streaming

Language: Java - Size: 3.91 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

aqib1/spark-structured-streaming-java

Language: Java - Size: 21.5 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

greysap/microbatch2cassandra

Language: Java - Size: 8.79 KB - Last synced: 11 months ago - Pushed: over 5 years ago - Stars: 1 - Forks: 1

conker84/kafka-rome-june-2k19

Size: 35 MB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0

haozhang-x/log-analysis-spark

Structured Streaming Log Analysis

Language: Scala - Size: 72.3 KB - Last synced: about 1 year ago - Pushed: about 5 years ago - Stars: 2 - Forks: 2

AndrewKuzmin/spark-ml-pipelines-with-structured-streaming-examples

Examples of using Apache Spark MLlib Pipelines and Structured Streaming on version 2.4.0

Language: Shell - Size: 1020 KB - Last synced: 12 months ago - Pushed: over 5 years ago - Stars: 1 - Forks: 0

ramkashyap-s/Live-Dash

Stream processing pipeline for analyzing live chat data

Language: Python - Size: 5 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

kthristov/cubos-olap

Language: Scala - Size: 24.4 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 1 - Forks: 0

sergei-grigorev/spark-streaming-project

In-Stream final project

Language: Scala - Size: 107 KB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 1 - Forks: 1

SevakAvet/gridu-spark-streaming

Study project: Apache Kafka + Apache Spark

Language: Scala - Size: 19.5 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 1 - Forks: 1

tspannhw/nifi-spark-structuredstreaming

Language: Scala - Size: 5.86 KB - Last synced: about 1 month ago - Pushed: about 6 years ago - Stars: 1 - Forks: 0