An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: rdds

akshitvjain/realtime-twitter-trends-analytics

A big data project to develop a real-time data pipeline for analyzing the popularity and sentiments of trending topics on Twitter.

Language: Scala - Size: 50.6 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 23 - Forks: 8

roshankoirala/pySpark_tutorial

Implementation of Spark code in Jupyter notebooks. Topics include RDDs and DataFrames, exploratory data analysis (EDA), handling multiple DataFrames, visualization, and machine learning.

Language: Jupyter Notebook - Size: 202 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 29 - Forks: 26

aiwithqasim/pyspark_bigdata

Getting started with PySpark for big data analysis

Language: Jupyter Notebook - Size: 835 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 12

drewm8080/data_mining_spark_rdds

Data mining using Spark RDDs

Language: Python - Size: 745 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AjmalSarwary/IoT---assignment-IBM-Data-Science-Specialization

This assignment was part of an IoT motion sensor app running on a watch, which predicts the wearer's actions from their arm movements. It is one of a series of data pipeline coding challenges in the IBM course Scalable Data Science.

Language: Jupyter Notebook - Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

quadrantofsola/PySpark_RDD

Analysis of Clinical Trial Dataset using PySpark RDD implementation.

Size: 3.91 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Thanaraklee/PySpark-Big-Data-RDD-Operations

This project illustrates Apache Spark RDD operations, from creation and transformation to actions and results, enhancing users' understanding of distributed data processing.

Language: Jupyter Notebook - Size: 3.6 MB - Last synced at: 29 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

mdarm/map-reduce-project

Project on MapReduce for the Μ111 - Big Data Management course, NKUA, Spring 2023.

Language: TeX - Size: 3.7 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

thiagoneye/course-pyspark

PySpark studies.

Language: Python - Size: 4.08 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

TrainingByPackt/Big-Data-Processing-with-Apache-Spark-eLearning

Efficiently tackle large datasets and perform big data analysis with Spark and Python

Language: Python - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 7 - Forks: 6

DavideAG/BigData

Spark, RDD, and MapReduce applications for the BigData @Polito course (2019-2020). A set of personal notes is also provided.

Language: Java - Size: 5.7 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0