GitHub topics: spark-sql | Ecosyste.ms: Repos

aessing/demo-azuresynapse

This repository includes the demos and code I use to play around with Azure Synapse Analytics.

Size: 80 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 5

MM24J/Home_Sales_Analysis

Using SparkSQL, I analyzed home sales data to identify key metrics.

Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

vim89/datapipelines-essentials-python

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

Language: Python - Size: 1.76 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 53 - Forks: 34

amy-panda/NY_Taxi_Data_Analysis_and_Modelling

Analysing the taxi trips in New York City and predicting total fare amount of taxi trips

Language: Jupyter Notebook - Size: 1.84 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

sakethmukkanti/Demand-Navigator-Real-Time-Streaming-with-Azure

A real-time application that guides cab drivers looking for rides toward the areas of a city experiencing higher demand.

Language: Jupyter Notebook - Size: 156 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

xiaruolei/SparkSQLProject

Language: Scala - Size: 865 KB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

nelsonssjunior/Python_Spark

Data streaming studies with Python and Spark

Language: Jupyter Notebook - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

aliabbasi2000/Spark

Solving Big Data Problems using Spark framework in Java. Running the Project on HDFS clusters (BigData@Polito) to get the results.

Language: Java - Size: 143 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

sakethmukkanti/Movielens-Dataset-Analysis-Azure-Data-Engineering-Project

Created a movie recommendation system on Azure utilizing Spark SQL for analyzing the MovieLens dataset.

Language: Jupyter Notebook - Size: 1.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

TiagoCebola/BigData-GooglePlayStore

This project was developed to solidify the use of Scala for manipulating files and DataFrames to generate metrics.

Language: Scala - Size: 3.97 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

techmonad/spark-datasets

This example gives a quick overview of the Spark DataFrame API.

Language: Scala - Size: 88.9 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

rohitkulkarni08/Azure-ETL-AmazonSalesAnalysis

A comprehensive ETL pipeline and sales analysis project leveraging Microsoft Azure and PySpark, designed to optimize e-commerce sales by providing actionable insights through detailed data analysis.

Language: Jupyter Notebook - Size: 8.04 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

assamese/spark-python

Spark Python examples

Language: Python - Size: 83 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

MoustafaAMahmoud/spark-sandbox

Spark Sandbox project

Language: Scala - Size: 8.79 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

mrogove/NewHampshireOpioidDeepDive

Using spark and other tools to analyze large, disparate data sources. Term Group Project for COMP119 Tufts F'19

Language: Jupyter Notebook - Size: 17.3 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

Lakshmiaddepalli/BigDataProject

CSCI-GA.3033-005 - Big Data Application Development

Language: Python - Size: 41.4 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

IcarusSO/Sparksql-UnitTest

Simple utilities for testing Spark SQL queries, functions, and applications

Size: 12.7 KB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

jkanclerz/data-science-workshop-2022

The repository contains notebook templates for the purposes of the data science course at the Cracow University of Economics.

Language: Jupyter Notebook - Size: 2.13 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

abulbasar/SparkJavaExamples

Example code for working with Apache Spark using Java

Language: Java - Size: 399 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 8

JBris/time-series-airflow-kafka-spark

A simple demonstration of an Airflow-Kafka-Spark (AKS) stack for online time series forecasting.

Language: Python - Size: 699 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

zy969/film-genre-insights

DataTalksClub Data Engineering Zoomcamp Project

Language: Python - Size: 32.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Safaa-p/Machine-Failure-Prediction

Predicting machine failure using machine learning on a synthetic dataset of an existing milling machine consisting of 10,000 data points

Language: Jupyter Notebook - Size: 4.7 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

amitnema/spark-coach

This project contains learning and experiments with Apache Spark.

Language: Scala - Size: 46.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

AbdelmajidLh/Spark_ML_Weather

Scala and Spark learning project: predicting tomorrow's rain from historical data

Language: Scala - Size: 13.7 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

mliarakos/spark-typed-ops

Lightweight type-safe operations for Spark

Language: Scala - Size: 64.5 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0

LakshMundhada/Real-Time-Fraudulent-Transaction-Analytics-Pipeline

A Big Data project leveraging AWS services and Apache frameworks to identify and visualize fraudulent credit card transaction patterns, providing actionable insights to mitigate financial fraud.

Language: Python - Size: 33.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

bhanu-kanamarlapudi/EarthquakeAnalysis-PySpark

Language: Python - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Robyn2024/Home_Sales

I'll use my knowledge of SparkSQL to determine key metrics about home sales data. Then I'll use Spark to create temporary views, partition the data, cache and uncache a temporary table, and verify that the table has been uncached.

Language: Jupyter Notebook - Size: 9.77 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

DalyaLami/Home_Sales

Determine key metrics about home sales data using SparkSQL and then use Spark to create temporary views, partition the data, cache and uncache a temporary table, and verify that the table has been uncached.

Language: Jupyter Notebook - Size: 1.25 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Mr-Mens/Analyzing-Wikipedia-Clickstreams-with-PySpark-Project

This project focuses on analyzing Wikipedia's clickstream data to uncover patterns in how users navigate from one article to another. Utilizing Apache Spark and PySpark for data manipulation and analysis, the project aims to provide insights into user behavior on Wikipedia, including the most popular pathways to specific articles.

Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

adnanrahin/NFL-Big-Data-Bowl-2022

The 2022 Big Data Bowl data contains Next Gen Stats player tracking, play, game, player, and PFF scouting data for all 2018-2020 Special Teams play. Here, you'll find a summary of each data set in the 2022 Data Bowl, a list of key variables to join on, and a description of each variable.

Language: Scala - Size: 1.02 MB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

bobxwang/predict-stock-in-spark

Using Spark to predict stocks; the data comes from Sina

Language: Scala - Size: 143 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

bmazzarol/SparkTest.NET 📦

Support for testing 🧪 Spark dotnet applications

Language: C# - Size: 284 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

CaioBrainer/Hadoop_Projects

Small projects using tools from the Apache Hadoop ecosystem

Language: Python - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

lydia-ath/SparkLinux

Assignment for Big Data course of MSc

Language: Python - Size: 20.5 KB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

aravinthsci/Spark-DB-Connector

Sharing Examples for Apache Spark

Language: Scala - Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 2

aravinthsci/Miscellaneous1

Language: Jupyter Notebook - Size: 42.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

mohammad-safari/spark-hadoop-exercise

spark hadoop exercise of cloud computing course - aut 1402-1403 fall

Language: Jupyter Notebook - Size: 33.2 MB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

kosminus/polyflow

Polyflow is an ETL tool based on Apache Spark.

Language: Scala - Size: 41 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

tlepple/iceberg-intro-workshop

Hands-on workshop with Apache Iceberg

Language: Shell - Size: 2.31 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 0

nardyjh/Home_Sales

Spark Home Sales Analysis utilizes Apache Spark to explore and analyze home sales data, providing insights into average prices based on various criteria. The project employs Spark SQL queries for efficient data processing and is designed for easy setup and usage.

Language: Jupyter Notebook - Size: 1.25 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

dhiraa/spark-tpcds

Apache Spark TPC-DS benchmark setup with EMR launch setup

Language: Smarty - Size: 1.3 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 10 - Forks: 4

lifeomic/spark-vcf

Spark VCF data source implementation for Dataframes

Language: Scala - Size: 314 KB - Last synced at: 9 days ago - Pushed at: almost 3 years ago - Stars: 14 - Forks: 2

HuemulSolutions/huemul-bigdatagovernance

Huemul BigDataGovernance is a framework that runs on Spark, Hive, and HDFS. It enables a corporate single-source-of-data strategy based on Data Governance best practices. It supports tables with Primary Key and Foreign Key control when inserting and updating data through the library, along with validation of nulls, text lengths, maximum/minimum values for numbers and dates, unique values, and default values. It also lets you classify fields by the applicability of ARCO rights to ease compliance with GDPR-style data protection laws, and identify security levels and whether any kind of encryption is being applied. Additionally, it allows more complex validation rules to be added on the same table.

Language: Scala - Size: 1.27 MB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 11 - Forks: 7

invent-analytics/metaframe

Spark DataFrame with metadata

Language: Python - Size: 14.6 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 1

annagracia12/MassiveDataProcessing

Projects of the subject Massive Data Processing Engineering at Universidad Internacional de La Rioja.

Language: Jupyter Notebook - Size: 3.73 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sangwanamit621/sql-solutions-in-pyspark-dataframe-api-and-spark-sql

This repository contains my solutions to various SQL problems from LeetCode, implemented using PySpark DataFrame API and Spark SQL. The goal is to provide alternative solutions and insights for SQL enthusiasts who want to explore the power of PySpark and Spark SQL.

Language: Jupyter Notebook - Size: 71.3 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

SharpData/SharpETL

Write ETL using your favorite SQL dialects

Language: Scala - Size: 3.37 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 36 - Forks: 5

marcocolangelo/Big-Data-processing-and-Analytics

The current repository contains all the code developed during the Big Data processing and Analytics laboratories. Data are processed and analyzed using Hadoop and Spark

Language: Java - Size: 6.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

spshah1701/World-Development-Indicators

Analysis of World Development Indicators (WDI) using big data technologies, specifically Databricks, Apache Spark, and Scala.

Size: 107 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

arpendu11/graph-based-data-lake

An ETL application written with Quarkus, Spark SQL Streaming, Neo4j, and various types of databases and stores. It also covers DevOps tooling such as Jenkins CI/CD, Docker, and Kubernetes.

Language: Java - Size: 56.6 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 2

Adarsh-Hota/ETL_spark-on-dataproc

A PySpark project that performs ETL on a Dataproc cluster and writes data to Google Cloud Storage/BigQuery.

Language: Jupyter Notebook - Size: 46.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mikerly131/serveUpRecos

Build a mock EMR app and integrate an AI/ML prediction into an encounter workflow

Language: CSS - Size: 54.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

aysekonus/movie_recommendation_system

Movie Recommendation System using PySpark, ALS, SQLite (Movielens Dataset)

Language: Jupyter Notebook - Size: 3.36 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

WazirRohiman/Apache_Spark_Basics

This series explores the basics of Apache Spark with the application of some practical elements of Spark, PySpark & SparkSQL

Language: Jupyter Notebook - Size: 29.3 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

polaroidz/sales_prediction

A Production Machine Learning Pipeline for Predicting Future Sales with Spark

Language: Jupyter Notebook - Size: 90.8 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 0

Jeanette22/pipelines_datos_vuelos-ETL

ETL process: ingestion, transformation, and loading of data into the data warehouse. This is a personal guide to the steps I took to carry out the requested project; any suggestions or error reports are welcome so I can keep learning and improving. Any contribution is welcome!

Language: Python - Size: 1.85 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

anshul1004/MutualFriends

Implementation of Hadoop and Spark

Language: Java - Size: 23 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

sunwu51/bigdatatutorial

bigdatatutorial

Language: Shell - Size: 23.3 MB - Last synced at: 7 months ago - Pushed at: almost 7 years ago - Stars: 35 - Forks: 6

miguelangel43/Prediction-Flight-Arrivals-Delays-Spark

Application that trains a classifier and predicts flight arrival delays based on past information. Uses the libraries pyspark.ml and pyspark.sql, performs feature engineering, cross-validation and tests various ML algorithms.

Language: Python - Size: 41 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ThomasByr/lichess-analysis-of-chess-games

♟️ analysis of stockfish annotated lichess games

Language: Jupyter Notebook - Size: 1.88 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

seyfal/SparkMitMAttackSim

Scalable simulation of MitM attacks using parallel random walks and graph analytics on Spark.

Language: Scala - Size: 76.2 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AjmalSarwary/IoT---assignment-IBM-Data-Science-Specialization

This assignment was part of an IoT motion sensor app running on a watch, predicting the actions of the individual wearing the watch based on arm movements; this IoT Analytics assignment is one of a series of data pipeline coding challenges in the IBM course Scalable Data Science.

Language: Jupyter Notebook - Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

bluishglc/bdp

A prototype big data platform project; the source code for the book Big Data Platform Architecture and Prototype

Language: Java - Size: 403 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 184 - Forks: 135

microsoft/Functional-Validation-Testing-Spark-SQL

Business Validation Testing in Spark SQL

Language: Scala - Size: 43.9 KB - Last synced at: 5 days ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 4

SamaSamrin/Amazon_Product_Analysis_with_PySpark

We are utilizing Big Data technologies and the platform of PySpark to perform an analysis of the Amazon Products with Python.

Language: Jupyter Notebook - Size: 16.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

yennanliu/spark-etl-pipeline

Various data stream/batch processing demos with Apache Spark in Scala 🚀

Language: Scala - Size: 5.06 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 11 - Forks: 8

sahilbhange/spark-slowly-changing-dimension

Spark implementation of Slowly Changing Dimension type 2

Language: Scala - Size: 351 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 11 - Forks: 13

youheekil/udacity-data-streaming

Projects completed in the Udacity Data Streaming Nanodegree program. Tech used: Apache Kafka, Kafka Connect, KSQL, Faust Stream Processing, Spark Structured Streaming

Language: Python - Size: 1.01 MB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 4

LarryLoveIV/PySpark-NBA

Work-in-progress NBA Game Predictor using Spark

Language: Jupyter Notebook - Size: 404 KB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

javieraespinosa/lifranum

Discovering French Digital Literature (LIFRANUM ANR project)

Language: Jupyter Notebook - Size: 871 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ShubhamJagtap2000/Spark-Python

🐍💥Python and Spark for Big Data

Language: Jupyter Notebook - Size: 73.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

arpendu11/stocks-data-pipelining

Data pipeline using Spring Boot to consume Kafka stream data, process it, and forward it to multiple databases such as MySQL and PostgreSQL

Language: Java - Size: 18.6 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 1

chiomauche/Home_Sales

The purpose of the study was to use the knowledge of SparkSQL to determine key metrics about home sales data

Language: Jupyter Notebook - Size: 230 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

xiaogp/recsys_spark

Spark SQL implementations of ItemCF, UserCF, and Swing; recommender systems, recommendation algorithms, collaborative filtering

Language: Scala - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 121 - Forks: 47

SamuelBarbosaDev/Justweb_Technical_Test

This is a technical test for the mid-level Python Developer position.

Language: Jupyter Notebook - Size: 3.65 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

tiatsou/SparkDistributedComputing

Distributed Computing with Spark SQL

Size: 193 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

charalav7/BigData

Explore the technologies of Hadoop, MapReduce, Spark, Spark SQL, Spark Streaming, Kafka, GraphX, HBase, Cassandra

Language: Jupyter Notebook - Size: 2.7 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

sohilsshah91/Airline-Stock-Prediction-Using-Google-Trends-Oil-Prices

This project highlights a Spark application built on Scala. It utilizes Spark Core, Spark SQL and Spark ML (Machine Learning libraries) for predicting stock prices of specific airline companies. We have used the Google trending words (searched on internet and relevant to financial domain) and also macro-economic oil prices as alternate data to predict stock prices.

Language: Scala - Size: 1.02 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 1

Mrk-Nguyen/spark-projects

Assignment and personal projects involving Apache Spark using Scala and Python

Language: Scala - Size: 40 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 3

JunjianS/spark-streaming-kafka-demo

Spark Streaming reads messages from Kafka and writes offsets to Redis; Spark computes word frequencies and finally writes the results to a Hive table

Language: Java - Size: 14.6 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 15 - Forks: 7

Dimitrov-S-Dev/PySpark

PySpark

Language: Jupyter Notebook - Size: 62.5 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ketanpurohit0/python

Self-directed Python PoCs etc. / PostgreSQL / Apache Spark / Pandas

Language: Python - Size: 1.18 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mdrkb/spark-tutorial

A basic Spark project written in Scala

Language: Scala - Size: 7.81 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

piyushgoyal1620/Big-Data-Project-3

This project collects and analyzes large volumes of data. It includes live streaming of data from a FOREX trading API and an Electric Vehicle stocks API. The data is fetched and processed using Kafka Streaming and Spark Streaming. Throughout this project, Forex data and data on the stocks of companies making Electric Vehicle parts were analyzed…

Language: Jupyter Notebook - Size: 34.7 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

Azimboy/feed-stats

Feed statistics with Spark

Language: Scala - Size: 847 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

sanxore/spark-theta-sketch-udfs

This project aims to expose the Yahoo Theta Sketch API as Spark SQL UDFs

Language: Scala - Size: 9.77 KB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 3 - Forks: 0

jaezak/home_sales_sparkSQL

Determine key metrics about home sales data using SparkSQL.

Language: Jupyter Notebook - Size: 11.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

inferrinizzard/prettier-sql Fork of sql-formatter-org/sql-formatter 📦

[ARCHIVED] Please use https://github.com/sql-formatter-org/sql-formatter

Language: TypeScript - Size: 3.08 MB - Last synced at: about 2 months ago - Pushed at: about 3 years ago - Stars: 21 - Forks: 5

wangj1106/recommendMoteur

Movie recommendation system, movie recommendation engine; a movie recommendation engine built with Spark

Language: Scala - Size: 10.4 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 106 - Forks: 37

mathewsrc/machine-failure-prediction

Predicting machine failure

Language: Jupyter Notebook - Size: 6.34 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

zrlio/spark-nullio-fileformat

Spark Null I/O file format

Language: Scala - Size: 98.6 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

zrlio/fileformat-benchmarks Fork of animeshtrivedi/fileformat-benchmarks

file format specific benchmarks for Parquet, ORC, Avro, JSON, and Arrow

Language: Scala - Size: 27.3 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 4 - Forks: 0

bmazzarol/TypedSpark.NET 📦

Typesafe bindings for ⭐ Spark.NET

Language: C# - Size: 497 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0