GitHub topics: sparksql
commoncrawl/cc-pyspark
Process Common Crawl data with Python and Spark
Language: Python - Size: 157 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 430 - Forks: 89

SathyaV99/hadoop-spark-traffic-predictor-toronto
🚦 Toronto Traffic Prediction with Apache Spark, Hadoop and SparkML. Used Random Forest as the model for prediction
Language: Jupyter Notebook - Size: 31.9 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

locationtech/rasterframes
Geospatial Raster support for Spark DataFrames
Language: Jupyter Notebook - Size: 102 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 252 - Forks: 45

zio/zio-quill
Compile-time Language Integrated Queries for Scala
Language: Scala - Size: 12.5 MB - Last synced at: 12 days ago - Pushed at: 16 days ago - Stars: 2,156 - Forks: 352

zio/zio-protoquill
Quill for Scala 3
Language: Scala - Size: 30 MB - Last synced at: 10 days ago - Pushed at: 16 days ago - Stars: 217 - Forks: 52

DarrenDavy12/Databricks-Certification
Topic-specific projects and an end-to-end project
Size: 210 KB - Last synced at: 3 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

Stratio/sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Language: Scala - Size: 123 MB - Last synced at: 23 days ago - Pushed at: over 5 years ago - Stars: 526 - Forks: 196

maddieemihle/Home_Sales
A PySpark-powered analysis of real estate trends using home sales data. This project explores average prices by year, room configuration, and property features, while demonstrating SparkSQL, caching, and partitioning techniques in a scalable data pipeline—all within Google Colab
Language: Jupyter Notebook - Size: 11.7 KB - Last synced at: 20 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

CybercentreCanada/jupyterlab-sql-editor
A JupyterLab extension providing a SQL formatter, auto-completion, and syntax highlighting for Spark SQL and Trino
Language: Jupyter Notebook - Size: 90.5 MB - Last synced at: 9 days ago - Pushed at: about 2 months ago - Stars: 88 - Forks: 14

udao-moo/udao-spark-optimizer
A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning
Language: Python - Size: 4.19 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 0

Tavo17s/PySpark-Tutorial
A hands-on, beginner-friendly walkthrough of PySpark concepts, practical examples, and real-life scenarios using the Databricks platform.
Language: Jupyter Notebook - Size: 526 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

microsoft/data-accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Language: C# - Size: 401 MB - Last synced at: 11 days ago - Pushed at: 2 months ago - Stars: 302 - Forks: 90

zsvoboda/ngods
New generation opensource data stack
Language: Dockerfile - Size: 1.62 MB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 66 - Forks: 9

teeyog/IQL
An ad hoc query service based on the Spark SQL engine.
Language: JavaScript - Size: 12.7 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 383 - Forks: 180

harsha2010/magellan
Geo Spatial Data Analytics on Spark
Language: Scala - Size: 13 MB - Last synced at: 27 days ago - Pushed at: almost 4 years ago - Stars: 532 - Forks: 149

Galaxy092/Samsung-Innovation-Campus-Big-Data-Capstone-Project
Samsung Innovation Campus Big Data Capstone Project - Weather Prediction
Language: Jupyter Notebook - Size: 10.5 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

alisonpezzott/fabric-medallion-cotacoes-bcb
Language: Jupyter Notebook - Size: 56.6 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 2

RenanBjj/Databricks-SQL-Optical-Campaign
Databricks Optical Campaign for Hoya Products
Language: Jupyter Notebook - Size: 163 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

SonicDMG/SonicDMG.github.io
A fun place for me to blog about distributed databases, aerial arts, and life in general
Language: HTML - Size: 19.1 MB - Last synced at: 3 months ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

stonezhong/DataManager
Better organize data in data lake and build ETL pipeline with Web UI tool.
Language: JavaScript - Size: 2.33 MB - Last synced at: 19 days ago - Pushed at: about 4 years ago - Stars: 9 - Forks: 2

microsoft/A-TALE-OF-THREE-CITIES
Analyzing the safety (311) dataset published by Azure Open Datasets for Chicago, Boston and New York City using SparkR, SparkSQL, and Azure Databricks, with visualization using ggplot2 and leaflet. Focus is on descriptive analytics, visualization, clustering, time series forecasting and anomaly detection.
Language: R - Size: 21.8 MB - Last synced at: 4 days ago - Pushed at: about 4 years ago - Stars: 86 - Forks: 34

dazheng/SparkETL
Implement a complete data warehouse ETL using Spark SQL
Language: Java - Size: 132 KB - Last synced at: 27 days ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 7

yjshen/spark-connector-test
A tutorial on how to use pulsar-spark-connector
Language: Scala - Size: 12.7 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 11 - Forks: 3

ravishankar324/Washington-state-electric-vehicles-ETL-pipeline
ETL data pipeline to process Washington's EV data using Apache Spark, Docker, Snowflake, Airflow, and AWS services, and to visualize the transformed Parquet data in Tableau dashboards.
Language: Python - Size: 1.85 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Suryan5h/Apache-Spark
RDDs | Spark SQL | Catalyst Optimizer | Spark Streaming | ALS Algorithm
Language: Jupyter Notebook - Size: 972 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Ren294/Covid-Data-Process
This project integrates real-time data processing and analytics using Apache NiFi, Kafka, Spark, Hive, and AWS services for comprehensive COVID-19 data insights.
Language: Shell - Size: 6.22 MB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 6 - Forks: 0

armahdavi/big_data_spark_building_IOT_sensor_ieq_analytics_ML_thermal_comfort_MURB_retrofit_social_housing
This repository summarizes my analytics, big data, and ML code work from a Multi-Unit Residential Building (MURB) retrofit project run back during my Ph.D.,
Language: Stata - Size: 1.34 MB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

largecats/sparksql-formatter
A SparkSQL formatter based on https://github.com/zeroturnaround/sql-formatter, with customizations and extra features.
Language: Python - Size: 346 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 14 - Forks: 3

BrooksIan/DS_GTDB
KMeans Clustering on Global Terrorism Database
Size: 1.14 MB - Last synced at: about 2 months ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 3

spirom/LearningSpark
Scala examples for learning to use Spark
Language: Scala - Size: 482 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 444 - Forks: 290

saurfang/sparksql-protobuf
Read SparkSQL parquet file as RDD[Protobuf]
Language: Scala - Size: 61.5 KB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 93 - Forks: 37

burhanahmed1/Big-Data-Analytics
Practice tasks in Python programming language using Hadoop, MRJob, PySpark for Big Data Analytics.
Language: Jupyter Notebook - Size: 40 KB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

potix2/spark-google-spreadsheets
Google Spreadsheets datasource for SparkSQL and DataFrames
Language: Scala - Size: 72.3 KB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 57 - Forks: 47

Kmohamedalie/ApacheSpark-Data_Analytics
Data Analytics with Apache Spark ⭐
Language: Jupyter Notebook - Size: 487 KB - Last synced at: 3 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

shanukatiyar111/Pyspark-Project-1
DATABRICKS PROJECT- END TO END SALES ANALYSIS
Language: Python - Size: 19.5 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

shanukatiyar111/Pyspark-Project-2
DATABRICKS PROJECT - END TO END GOOGLE PLAYSTORE DATA ANALYSIS
Language: Jupyter Notebook - Size: 4.08 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

BrooksIan/SBIR_TFIDF_KMeans
Document clustering using KMeans on TF/IDF features on Small Business Innovation Research (SBIR) data
Language: Jupyter Notebook - Size: 2.41 MB - Last synced at: 19 days ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 2

yaooqinn/spark-postgres
PostgreSQL and GreenPlum Data Source for Apache Spark
Language: Scala - Size: 78.1 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 35 - Forks: 13

Pirata-Codex/Sentiment-Analysis-SparkML
Using SparkML to build different machine learning models for simulating a small scale of big data management
Language: Jupyter Notebook - Size: 172 KB - Last synced at: 12 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 1

yaooqinn/spark-ranger 📦
Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.
Language: Scala - Size: 116 KB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 54 - Forks: 56

WinThitiwat/Data_Lake_with_Spark
ETL process to S3 Data Lake through EMR, Spark, Hadoop, Schema-on-Read
Language: Jupyter Notebook - Size: 536 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

darule0/sparkdiff
A rudimentary command line utility for contrasting Apache Spark event logs.
Language: Shell - Size: 703 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

sana1410/NYPD-Arrest-Data-Year-to-Date
This repository is used to perform data analysis using Databricks and Tableau on NYC crime datasets
Language: HTML - Size: 1.77 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

morfious902002/impala-spark-jdbc-kerberos 📦
Language: Java - Size: 4.88 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 7 - Forks: 5

austinlmcconnell/home-sales-data-evaluation
Contains an analysis of key home sales metrics using SparkSQL and Python to manage large amounts of data.
Language: Jupyter Notebook - Size: 1.39 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

rodrigoorf/SparkStudies
Repo with some Spark and SparkSQL exercises
Language: Java - Size: 41.1 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

hyunjoonbok/PySpark
PySpark functions and utilities with examples. Assists ETL process of data modeling
Language: Jupyter Notebook - Size: 3.79 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 89 - Forks: 73

Amarilli/Home_Sales
In pursuit of significant metrics for home sales data, Google Colab and SparkSQL were employed to extract essential insights.
Language: Jupyter Notebook - Size: 68.4 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

SteveTuttle/home-sales-sparkSQL-metrics
Use SparkSQL to determine key metrics of the data. Use Spark to create temporary views, partition the data, cache and uncache a temporary table, and verify that the table has been uncached.
Language: Jupyter Notebook - Size: 27.3 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

bulutenesemre/OsmAnalysis
OpenStreetMap Data Analysis with Python programming language.
Language: Jupyter Notebook - Size: 1.03 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

MoustafaAMahmoud/spark-sandbox
Spark Sandbox project
Language: Scala - Size: 8.79 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

zydusss/Spark
Data Analytics using Spark
Language: Jupyter Notebook - Size: 33.2 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

luisadrianml/nyctaxi2017
Implementing Big Data Methods to Analyze 2017 NYC Yellow Taxi
Language: Scala - Size: 3.16 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

200413-java-spark/Project-2-Group-1
Apache Spark Life Expectancy - Daniel, Sutter, and John
Language: Java - Size: 877 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

longtng/SparkSQL
The laboratory from CLOUDS Course at EURECOM
Language: HTML - Size: 710 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

NavyaTrilok/Advanced-Big-Data-ML-Project
Weather Data Analysis using Python, Pandas, SparkSQL, AutoRegression Model
Size: 1000 Bytes - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

amitnema/spark-coach
This project contains the learning and experiments with the Apache Spark.
Language: Scala - Size: 46.9 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

giucris/yasp
Yet Another SPark Framework
Language: Scala - Size: 228 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 1

cyofeiyue/SparkSQLProject
SparkSQL Training
Language: Scala - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

mehrdadalmasi2020/ApacheSpark_ApacheZeppelin_SQL_Shell
Run your first analysis project on Apache Zeppelin using Scala (Spark), Shell, and SQL
Language: Scala - Size: 1.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

jubins/Spark-And-MLlib-Projects
This repository contains Spark, MLlib, PySpark and Dataframes projects
Language: Jupyter Notebook - Size: 101 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 39 - Forks: 97

gcdev373/example-spark-datasourcev2
A very simple Java implementation of the Spark DataSourceV2 API.
Language: Java - Size: 9.77 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 0

aravinthsci/Miscellaneous1
Language: Jupyter Notebook - Size: 42.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

open-datastudio/spark-thriftserver
Spark thriftserver itself deploys Spark cluster on Kubernetes and allows JDBC/ODBC clients to execute SQL
Size: 8.79 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 1

RJBarker/home_sales
Use PySpark and SparkSQL to execute SQL queries through a temporary view of the DataFrame created. Conduct additional queries on cached and partitioned data to determine runtime comparisons.
Language: Jupyter Notebook - Size: 146 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

andreiramani/Machine-Learning-with-Apache-Spark
Coursera IBM Data Engineering (Course 12 from 13)
Language: Jupyter Notebook - Size: 2.55 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

omar-elaqqad/BigData-Spark-tps-
Lab exercises (TPs) on Spark & Big Data fundamentals
Language: Java - Size: 113 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

RahulGupta16/Pyspark-Theory-and-Code-Basics
Pyspark serves as a Python interface to Apache Spark, enabling the execution of Python and SQL-like instructions for the manipulation and analysis of data within a distributed processing framework.
Language: Jupyter Notebook - Size: 1.07 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

shubhammirajkar/superstore_azure_de_project
Copying data from an Amazon S3 bucket to an Azure Blob container using an Azure Data Factory pipeline. The data is mounted to Databricks and further analysis is done using Spark SQL.
Language: Python - Size: 2.61 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

bluishglc/bdp
A prototype project of big data platform, the source codes of the book Big Data Platform Architecture and Prototype
Language: Java - Size: 403 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 184 - Forks: 135

Heisenberghj7/Retail-Store-BigData
📊 📑 This project provides a step-by-step walkthrough of big data analytics applied to the retail industry, using a variety of big data technologies such as HDFS, Hive and Spark.
Language: Python - Size: 2.11 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

MouhtaramSoufiane/SparkSQL
Language: Java - Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

el-moudni-hicham/bigdata-spark-sql
This repository includes a brief, simple, and informative explanation of Apache Spark and Spark SQL terms, with a Java implementation.
Language: Java - Size: 39.1 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

carmhuo/carmhuo.github.io
Study notes and reflections
Language: HTML - Size: 5.08 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

Heli515/San-Francisco-Crime-Analysis
Language: HTML - Size: 539 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

hbutani/spark-druid-olap 📦
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform(http://bit.ly/2oBJSpP) an Integrated BI platform on Apache Spark.
Language: Scala - Size: 127 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 286 - Forks: 96

vks2106/spark-custom
Spark examples.
Language: Java - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

futurikidis21/Spark-text-analysis-predicting-exchange-rate-shifts
Language: Jupyter Notebook - Size: 156 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

nrothchicago/SparkCode
Spark code used for my Master's Thesis. Run on AWS EMR clusters
Language: Scala - Size: 22.5 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 1

JunjianS/spark-streaming-kafka-demo
Spark Streaming reads messages from Kafka, writes offsets to Redis, computes word frequencies with Spark, and finally writes the results to a Hive table
Language: Java - Size: 14.6 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 15 - Forks: 7

StianPedersen/TDDE31_Big_Data
Advanced Big Data course taught at Linköping University. Topics included parallelisation, machine learning with Big Data and querying on distributed systems.
Language: Python - Size: 2.71 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

liumingmusic/HadoopLearning
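A complete set of introductory big data tutorials, including the basics of CentOS and Maven. The big data material mainly covers HDFS, MapReduce, YARN, HBase, Kafka, Scala, Spark Core, Spark Streaming, and Spark SQL. The tutorials include full source code demos and online documentation.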
Language: Scala - Size: 5.95 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 52 - Forks: 24

Mrying0823/Bigdata_group_project
Group project analyzing a flight dataset
Language: JavaScript - Size: 73.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Pratikdomadiya/Databricks_workspace
Data exploration, Preprocessing, Analysis, and visualization using PySpark, SparkSQL, Pandas, and Python.
Language: Jupyter Notebook - Size: 4.22 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ramapilli16/CCA175-PySpark-Practice-with-solutions
CCA175-PySpark-Practice-with-solutions
Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 2

Charityosai/Analysing-clinical-trial-dataset-to-uncover-insights
This project involved the analysis of two historical datasets (a clinical trial dataset and a dataset listing pharmaceutical violations) to extract valuable insights that can aid medical research and strategic decision making in the medical field. The steps were carried out in 3 phases on Databricks using PySpark (RDD and DataFrame) and Spark SQL.
Size: 32.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

e-petrachi/MisinformationAnalysis
A small Spark and SparkSQL project for running analyses in a distributed setting on data extracted from Twitter and stored in MongoDB.
Language: Java - Size: 253 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

ritamghoshgds/DnA-F1-POC
The project harnessed an ETL multi-hop architecture, ingesting data from the Ergast API into a storage backed by Azure Data Lake. The process involved weekly ingestion of bronze layer data as cutover and delta files. Raw data, in varied formats, was transformed using Azure Databricks PySpark notebooks into enriched Silver and Gold layers.
Language: Python - Size: 5.67 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

jay-chauhan/Spark-TPCH-Benchmark
Course project for University of Washington CSE 599: Big Data Systems on benchmarking bigdata tools
Language: Jupyter Notebook - Size: 9.77 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

Heisenberg0203/Apache_Spark
Apache Spark projects: from beginner to advanced level
Language: Java - Size: 64.5 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

spg-47/Big-Data-Analytics-on-Stocks-Data
Enhanced profitability and research of stocks historical data using distributed system analytics.
Language: Jupyter Notebook - Size: 7.18 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

PeterSchuld/UCSanDiego_MicroMasters_DataScience-BigDataAnalyticsUsingSpark
The University of California, San Diego, course DSE230x "Big Data Analytics Using Spark" (Summer 2019): Learn how to analyze large datasets using Jupyter notebooks, MapReduce and Spark as a platform. Part 4 of the »Data Science« MicroMasters® Program on edX. Instructor: Yoav Freund, Professor of CS and Engineering, University of California San Diego.
Size: 6.12 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 1

ZhuXS/Spring-Shiro-Spark
Spring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs... ...
Language: Java - Size: 15.8 MB - Last synced at: 29 days ago - Pushed at: over 7 years ago - Stars: 114 - Forks: 35

smartlin5228/CCA175
Language: Java - Size: 107 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 7 - Forks: 10

FredrikBakken/propertypartner-transactions
🏡 Application for calculating tax information from Property Partner investments.
Language: Scala - Size: 410 KB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

SharpRay/spark-druid-connector
A library for querying Druid data sources with Apache Spark
Language: Scala - Size: 248 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 22 - Forks: 14

Akhileshkumarkc/Scala-Programs
Language: Scala - Size: 1.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

MhmdSyd/Wuzzuf_Jobs_DataAnalysis
Wuzzuf data analysis in Java using SparkSQL, Spring, XChart, and Spark ML
Language: Java - Size: 333 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 1

margaretkhendre/Home-Sales-vs-Big-Data
In this repository, Google Colab is paired with SparkSQL to determine key metrics about home sales data. Spark is also used to create temporary views, partition data, and cache/uncache a temporary table in the process.
Language: Jupyter Notebook - Size: 14.6 KB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

cami5326/Home_Sales
Big Data Analysis of a fictional home sales dataset using SparkSQL
Language: Jupyter Notebook - Size: 75.2 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0
