Topic: "spark-mllib"
LuckyZXL2016/Movie_Recommend
A Spark-based movie recommendation system, comprising a crawler project, a website, an admin back-end system, and a Spark recommendation system
Language: Java - Size: 55.1 MB - Last synced at: about 12 hours ago - Pushed at: about 6 years ago - Stars: 2,936 - Forks: 1,049

databricks/LearningSparkV2
This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Language: Scala - Size: 75.2 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 1,297 - Forks: 770

wzhe06/SparkCTR
CTR prediction models based on Spark (LR, GBDT, DNN)
Language: Scala - Size: 35 MB - Last synced at: 6 days ago - Pushed at: about 5 years ago - Stars: 914 - Forks: 260

qubole/sparklens
Qubole Sparklens tool for performance tuning Apache Spark
Language: Scala - Size: 175 KB - Last synced at: 6 days ago - Pushed at: 11 months ago - Stars: 574 - Forks: 141

derrickburns/generalized-kmeans-clustering
Spark library for generalized K-Means clustering. Supports general Bregman divergences. Suitable for clustering probabilistic data, time series data, high dimensional data, and very large data.
Language: HTML - Size: 7.42 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 300 - Forks: 50

jaceklaskowski/spark-workshop
Apache Spark™ and Scala Workshops
Language: HTML - Size: 57 MB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 264 - Forks: 148

P7h/Spark-MLlib-Twitter-Sentiment-Analysis
🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib
Language: Scala - Size: 19.7 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 135 - Forks: 69

jingpeicomp/product-category-predict
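Product category prediction. Uses the Spring Boot development framework and the Spark MLlib machine learning framework to train a product category prediction model with TF-IDF and Bayes algorithms. The model automatically predicts a product's category from its name. The project exposes a RESTful API.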
Language: Java - Size: 41.2 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 119 - Forks: 60

harishpuvvada/BitCoin-Value-Predictor 📦
[NOT MAINTAINED] Predicting the Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Language: Jupyter Notebook - Size: 3.07 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 113 - Forks: 29

huangyueranbbc/Spark_ALS
Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming
Language: Java - Size: 97.7 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 89 - Forks: 46

vaslnk/Spotify-Song-Recommendation-ML
UC Berkeley team's submission for RecSys Challenge 2018
Language: Jupyter Notebook - Size: 34.2 MB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 86 - Forks: 22

alessandrolulli/reforest
Random Forests in Apache Spark
Language: Scala - Size: 71 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 72 - Forks: 11

OrvilleX/MachineLearning
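This project is application-oriented, covering basic machine learning, deep learning, object detection, and the latest large models. It uses mature third-party libraries, open-source pretrained models, and the latest techniques from related papers. The aim is to record the learning process and share it so that more people can use it directly.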
Language: Jupyter Notebook - Size: 11.4 MB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 67 - Forks: 22

omerbsezer/SparkDeepMlpGADow30 📦
A Deep Neural-Network based (Deep MLP) Stock Trading System based on Evolutionary (Genetic Algorithm) Optimized Technical Analysis Parameters (using Apache Spark MLlib)
Language: Java - Size: 213 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 58 - Forks: 46

uosdmlab/spark-nkp
Natural Korean Processor for Apache Spark
Language: Scala - Size: 53.7 KB - Last synced at: 6 months ago - Pushed at: about 7 years ago - Stars: 53 - Forks: 16

wikistat/AI-Frameworks
Data Science Season 5: Technologies for machine / statistical learning on massive data and Artificial Intelligence
Language: Jupyter Notebook - Size: 646 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 44 - Forks: 42

omerbsezer/SparkMlpDow30 📦
A new stock trading and prediction model based on an MLP neural network utilizing technical analysis indicator values as features (using Apache Spark MLlib)
Language: Java - Size: 201 MB - Last synced at: 24 days ago - Pushed at: about 7 years ago - Stars: 36 - Forks: 17

wengbenjue/spark_recomend
A core recommendation engine that uses Spark MLlib and HBase for the model and Hive for data cleaning; tested on Spark on YARN
Language: C - Size: 41.2 MB - Last synced at: about 1 year ago - Pushed at: about 8 years ago - Stars: 29 - Forks: 17

jingpeicomp/product-relation-mining
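Product association mining. Uses the Spring Boot development framework and the Spark MLlib machine learning framework to analyze users' shopping-cart data with the FP-Growth algorithm and mine associations between products. The project exposes a RESTful API.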
Language: Java - Size: 68.4 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 24 - Forks: 16

FlorentF9/sparkml-som
✨ Spark ML implementation of SOM algorithm (Kohonen self-organizing map)
Language: Scala - Size: 29.3 KB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 18 - Forks: 6

IBM/icp4d-customer-churn-classifier
Infuse AI into your application. Create and deploy a customer churn prediction model with IBM Cloud Private for Data, Db2 Warehouse, Spark MLlib, and Jupyter notebooks.
Language: Jupyter Notebook - Size: 28.1 MB - Last synced at: 16 days ago - Pushed at: about 2 years ago - Stars: 17 - Forks: 22

yennanliu/NYC_Taxi_Trip_Duration
Develop ML models to predict taxi trip duration in NYC. Ranked: Top 6% | RMSLE: 0.377 (Kaggle) | #DS
Language: Jupyter Notebook - Size: 43.2 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 8

akashsethi24/Machine-Learning
Examples of all machine learning algorithms in Apache Spark
Language: Scala - Size: 3.64 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 15 - Forks: 10

amitkumarusc/recommendation-system
A movie recommendation system trained on the MovieLens 20 Million dataset. This system makes use of Collaborative filtering methods to come up with recommendations for a particular user.
Language: Jupyter Notebook - Size: 21.8 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 13 - Forks: 3

lp-dataninja/SparkML
Detailed notes and code to learn machine learning with Apache Spark.
Language: Jupyter Notebook - Size: 4.06 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 12 - Forks: 17

giuseppegambino/Italian-Sentiment-Analysis-with-Spark
Sentiment analysis of Italian tweets with Python and Spark
Language: Python - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 10 - Forks: 0

Rohini2505/Lending-Club-Loan-Analysis
Exploratory Data Analysis and ML model building using Apache Spark and PySpark
Language: HTML - Size: 6.26 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 10 - Forks: 12

josemarialuna/ClusterIndices
This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bouldin and WSSSE indices.
Language: Scala - Size: 588 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 10 - Forks: 3

josemarialuna/ExternalValidity
This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.
Language: Scala - Size: 146 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 1

MHassaanButt/Flight-Delays-Prediction
This project uses decision tree learning as the main algorithm to build the model. Because of the large volume of flight data, it is implemented with MRJob, PySpark and Spark's MLlib, and the performance and accuracy of those implementations are then compared.
Language: Jupyter Notebook - Size: 17.5 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 0

DavideNardone/TwitterSentimentAnalysis
A Spark Streaming implementation for Online Twitter Sentiment Analysis.
Language: Python - Size: 1.78 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 8 - Forks: 3

TrainingByPackt/Big-Data-Processing-with-Apache-Spark-eLearning
Efficiently tackle large datasets and perform big data analysis with Spark and Python
Language: Python - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 7 - Forks: 6

P7h/p7hb-docker-mllib-twitter-sentiment
🚢 Docker image for Twitter Sentiment analysis with Spark MLlib
Language: Shell - Size: 138 KB - Last synced at: about 1 year ago - Pushed at: almost 8 years ago - Stars: 7 - Forks: 3

NashTech-Labs/Sparkathon
A library of Java and Scala examples for Spark 2.x
Language: Java - Size: 113 MB - Last synced at: about 2 months ago - Pushed at: over 8 years ago - Stars: 7 - Forks: 9

cbozan/graduation-project
Graduation project that categorizes popular search phrases using Python and Spark and presents them on a website to inspire creators.
Language: Python - Size: 402 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

ging/fiware-ml-supermarket
Demo: Predicting purchase volume in a supermarket using FIWARE
Language: JavaScript - Size: 290 KB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 1

alivcor/SMORK
Implementation of SMOTE - Synthetic Minority Over-sampling Technique in SparkML / MLLib
Language: Scala - Size: 165 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 2

surajsrivathsa/Supervised_Link_Prediction_Using_Spark_and_Neo4j
A project that analyzes authorship graph data from the Microsoft Academic Graph. We calculate graph features using temporal parameters of the authors and try different classifiers. The final aim is to predict the likelihood of a link (co-authorship) between two authors based on topological graph features, and to assess the feasibility of performing this task on Neo4j and Spark.
Language: Scala - Size: 15.8 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 4

tkachuksergiy/aws-spark-nlp
Work related to a recent project on using Apache Spark and the AWS cloud for an NLP task.
Language: Jupyter Notebook - Size: 2.76 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 5 - Forks: 0

AttitudeAdjuster/Accident-Severity-Prediction
IBM Coursera Capstone Project - Predict Accident Severity Given Weather, Road and Lighting Conditions
Language: Jupyter Notebook - Size: 161 MB - Last synced at: 7 months ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 9

esap120/spark-twitter-streaming
Streaming Twitter Sentiment Analysis with Apache Spark
Language: Scala - Size: 8.79 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 1

giovannigarifo/bigdata
Code samples, summaries, cheatsheets and other study material for Hadoop MapReduce and Apache Spark
Language: Java - Size: 69.1 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 5 - Forks: 2

spoddutur/spark-ml-dashboard
Spark ML dashboard for plugging in and tweaking model parameters to verify classification results on sample test data in real time
Language: Scala - Size: 20.5 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 2

ABigdataer/MovieRecommendSystem
A Spark-based movie recommendation system
Language: HTML - Size: 64.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 4

MHassaanButt/Crime-Spark-ML
In this project I stream data and do crime classification using Spark. This dataset contains incidents derived from the SFPD Crime Incident Reporting system. The data ranges from 1/1/2003 to 5/13/2015. I do some data analysis of crime scenes in different areas and with respect to other parameters.
Language: Python - Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

amanjeetsahu/Apache-Spark-Tutorials
This repo contains my learning and practice notebooks on Spark using PySpark (the Python language API for Spark). All the notebooks in the repo can be used as template code for most ML algorithms and can be built upon for more complex problems.
Language: Jupyter Notebook - Size: 20.9 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 10

IBM/db2-event-store-iot-analytics 📦
IoT sensor temperature analysis and prediction with IBM Db2 Event Store
Language: Jupyter Notebook - Size: 36.6 MB - Last synced at: 16 days ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 22

avaibh/Twitter-Bot-Detection
Big Data Stack: Spark, Kafka, Elasticsearch and NoSQL
Language: Jupyter Notebook - Size: 304 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 4 - Forks: 0

desaiankitb/spark-mllib
Apache Spark is one of the most widely used and supported open-source tools for machine learning and big data. This repo shows how to work with this platform for machine learning. It covers MLlib, the Spark machine learning library, which provides tools for data scientists and analysts who would rather find solutions to business problems than code, test, and maintain their own machine learning libraries. The repo shows how to use DataFrames to organize data, and covers data preparation and the most commonly used types of machine learning algorithms: clustering, classification, regression, and recommendations. It gives you experience loading data into Spark, preprocessing data as needed to apply MLlib algorithms, and applying those algorithms to a variety of machine learning problems.
Language: Python - Size: 150 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 4 - Forks: 5

kocharshaivi19/Stock-Analysis-and-Prediction
Financial Forecasting and its correlation with Human Sentiments using Distributed Computing on Spark Framework
Language: Scala - Size: 811 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 4 - Forks: 4

merrillm1/Olist_Recommender_System
Recommendation engine with a 0.97 AUC, achieved using clustering techniques to create user features. Data represents Olist marketplace transactions and was retrieved from kaggle.com.
Language: Jupyter Notebook - Size: 77.4 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 6

SainathDutkar/Fraud_Transaction_Monitor
Detects fraudulent credit card transactions in real time
Language: Scala - Size: 1000 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 4

mnassrib/pyspark-examples
This tutorial presents some examples in order to give a quick overview of the Spark APIs.
Language: Jupyter Notebook - Size: 8.48 MB - Last synced at: 12 months ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

satyajeetmaharana/floodprediction
The goal of this project is to identify flood-prone areas and the probability of flooding in counties on a future date, using Spark MLlib.
Language: Scala - Size: 3.46 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 1

DerrickBu/Movie_Recommendation_Application
This is a web-based movie recommendation application written in Scala using Apache Spark and Livy.
Language: Scala - Size: 17.6 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 2

polaroidz/sales_prediction
A Production Machine Learning Pipeline for Predicting Future Sales with Spark
Language: Jupyter Notebook - Size: 90.8 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 0

jacopocav/spark-ifs
Iterative filter-based feature selection on large datasets with Apache Spark
Language: Scala - Size: 130 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 3 - Forks: 0

suyash248/recommender_system
Recommendation system using Graph DB(Neo4j), Apache Spark & Machine learning.
Language: Jupyter Notebook - Size: 17.4 MB - Last synced at: 4 months ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 1

LuisFalva/ophelia
Ophelian On Mars! More than a simple framework.
Language: Python - Size: 2.16 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 2 - Forks: 5

DebanjanSarkar/pyspark-maestro
This repo contains implementations of PySpark for real-world use cases for batch data processing, streaming data processing sourced from Kafka, sockets, etc., spark optimizations, business specific bigdata processing scenario solutions, and machine learning use cases.
Language: Jupyter Notebook - Size: 66.1 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 2 - Forks: 1

grishenkovp/apache_spark
Learning Apache Spark. The PySpark library
Language: Jupyter Notebook - Size: 135 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

sunujh6/spark_practice
Language: Jupyter Notebook - Size: 1.62 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

forons/BigDataExamples
Code repository for the MSc course "Big Data and Social Networks" of the University of Trento
Language: Jupyter Notebook - Size: 229 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 13

kapilthakre/Bicycle-Sharing-Demand-Forecasting-Using-Spark-Scala
In this project, we build a bicycle-sharing demand prediction service using Apache Spark and Scala. There are two Spark applications: one for model generation and another for demand prediction.
Language: Scala - Size: 295 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

tweichle/Spark-for-Big-Data
Spark: Work with Big Data and Build Machine Learning Models at Scale
Language: Jupyter Notebook - Size: 63.5 KB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 2 - Forks: 1

askmrsinh/spark-stocksim
Monte Carlo stock simulation using Apache Spark.
Language: Scala - Size: 1.81 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 2

Aveek-Saha/Cricket-score-predictor
A Big data application to predict the outcome of a T20 cricket match.
Language: Jupyter Notebook - Size: 2.17 MB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 0

joyoyoyoyoyo/emojipasta-topic-modeling
😅 A topic model of reddit.com/r/EmojiPasta trained with Spark and an LDA model (NSFW) - Trigger Warning: The r/emojipasta subreddit posts controversial content and anything I have crawled is to provide visibility of a topic modeling some of this controversial content. Unfortunately there is also discriminatory speech which must be called out!
Language: Scala - Size: 700 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 0

crazyalin92/movie_recomendation_system
Spark MLLIB: Collaborative Filtering Movie Recommendation System
Language: Scala - Size: 5.6 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 1

SayamAlt/Amazon-Products-API-ETL-and-ML-pipeline
In this project, I've created an end-to-end ETL pipeline and subsequently developed a machine learning model to predict the price of Amazon products based on several product-related features.
Language: Python - Size: 2.95 MB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

zikzakjack/spark-demos
Apache Spark Demos
Language: Jupyter Notebook - Size: 103 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

aliabbasi2000/Spark
Solving Big Data Problems using Spark framework in Java. Running the Project on HDFS clusters (BigData@Polito) to get the results.
Language: Java - Size: 143 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

simbafl/spark-branch-2.4
Source code analysis of Spark 2.4
Language: Scala - Size: 17.8 MB - Last synced at: 29 days ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 1

svngoku/Pyspark-pour-les-datas-engineers
A hands-on introduction to PySpark for data engineers
Language: Jupyter Notebook - Size: 784 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

MostafaToema/Stroke-Prediction-using-Pyspark
Data preparation, visualization, feature engineering, and classification of people who have had a stroke, using PySpark libraries
Language: Jupyter Notebook - Size: 79.1 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

aabdel-kader/Apache-Spark
A repository for my practice exercises and projects using PySpark
Language: Jupyter Notebook - Size: 11.6 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

szaher/spark
Playing with Spark using Java
Language: Java - Size: 424 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

Coursal/Text-Sentiment-Analysis-In-Hadoop-And-Spark
The source code developed and used for the purposes of my thesis with the same title under the guidance of my supervisor professor Vasilis Mamalis for the Department of Informatics and Computer Engineering of the University of West Attica.
Language: Java - Size: 66.5 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 1

angeligareta/machine-learning-spark
Assignment for Scalable Machine Learning which aims to study the basics of regression and classification in Spark.
Language: Scala - Size: 1.42 MB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

angeligareta/spark-hadoop-hbase-overview
First lab for Data-Intensive Computing course at KTH where we are introduced to Apache Spark MLlib and Spark SQL, Hadoop, and HBase.
Language: Jupyter Notebook - Size: 22.4 MB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

anmolmore/Enzyme-Classifier-Using-ML
Classify enzymes from genomic sequences using spark-ML
Language: Jupyter Notebook - Size: 719 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

shantanu-93/scalable-matrix-multiply
Fast and Scalable Matrix Multiply using spark, breeze and BLAS libraries
Language: JavaScript - Size: 127 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

jr2ngb2/yelp_recommender
Language: Jupyter Notebook - Size: 45.9 KB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

corneliouzbett/Master-Apache-Spark
Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming
Language: Python - Size: 889 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

multivacplatform/multivac-nlp
Testing and benchmarking some of the existing NLP libraries in Apache Spark
Language: Scala - Size: 12 MB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

nahidalam/Spark
Spark, Python, AWS EMR, MLLib, Spark Streaming, Spark - SQL
Language: Jupyter Notebook - Size: 2.93 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

AswathKiruba/Stock_Price_Prediction
This is the CSYE7200 Big Data Systems Engineering Using Scala Final Project for Team 9 Fall 2018
Language: Scala - Size: 3.48 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

tertiarycourses/ApacheSparkTraining
Exercise files for Apache Spark Essential Training
Language: Jupyter Notebook - Size: 4.05 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

AndrewKuzmin/spark-ml-pipelines-with-structured-streaming-examples
Examples of using Apache Spark MLlib Pipelines and Structured Streaming on version 2.4.0
Language: Shell - Size: 1020 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

gavalle94/Songs-Recommender
Recommendation System written in Python, using the pySpark framework and other Data Science libraries
Language: HTML - Size: 5.23 MB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 1

lookuut/raif-competition
Spark application for predicting a customer's home and work coordinates from payment transactions
Language: Scala - Size: 27.7 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

shubham-deb/Neural-Circuit-Tracer
This repository contains the source code and scripts of my project for the Master's level course CS6240 Parallel Data Processing in Map-Reduce at the College of Computer & Information Science, Northeastern University, Boston MA.
Language: Scala - Size: 663 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

shubham-deb/Spark_Scala_Programs
This repository contains all the Spark Scala programs that I implemented during the Master's level course CS6240 Parallel Data Processing in Map-Reduce at the College of Computer & Information Science, Northeastern University, Boston MA.
Language: Makefile - Size: 4 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

samanta-anupam/similar-water-regions
In this project we look at the Global Surface Water Explorer and find patches of area across the world that are similar to each other, using the European Commission Global Surface satellite water dataset.
Language: Jupyter Notebook - Size: 346 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

abulbasar/zeppelin-notebooks
Size: 3.91 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 1

robzoros/TFP-Utad-InferenciaEtiquetas
Final project for the Big Data Expert Program: process the MIRFLICKR image set in Spark with TensorFlow Inception and train a model for label inference. Classification of images uploaded to Twitter with Storm.
Language: Java - Size: 51.8 KB - Last synced at: 8 days ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

bassrehab/zerofish-imaging
Using the Thunder library for image processing with Spark MLlib
Language: Python - Size: 1.83 MB - Last synced at: almost 2 years ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

dorianbg/cs110x-big-data-analysis-with-spark-labs
Graded lab exercises from the CS110x Big Data Analysis with Apache Spark online course on edX
Language: Jupyter Notebook - Size: 74.2 KB - Last synced at: about 2 years ago - Pushed at: over 8 years ago - Stars: 1 - Forks: 2

keanteng/wqd7007-project
Big Data Pipeline for NYC Taxi Trips
Language: Jupyter Notebook - Size: 8.47 MB - Last synced at: about 9 hours ago - Pushed at: about 10 hours ago - Stars: 0 - Forks: 0

lukilme/general-machine-learning-studies
Repository for practice exercises and studies in machine learning
Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0
