An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: spark-mllib

MHassaanButt/Crime-Spark-ML

In this project I stream data and classify crimes using Spark. The dataset contains incidents derived from the SFPD Crime Incident Reporting system, ranging from 1/1/2003 to 5/13/2015. I also analyze crime scenes by area and with respect to other parameters.

Language: Python - Size: 5.86 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

spoddutur/spark-ml-dashboard

Spark ML dashboard for plugging in and tweaking model parameters to verify classification results on sample test data in real time

Language: Scala - Size: 20.5 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 2

jingpeicomp/product-relation-mining

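Product association mining. Built with the Spring Boot framework and the Spark MLlib machine learning framework, it uses the FP-Growth algorithm to analyze users' shopping cart data and mine the association relationships between products. The project exposes a RESTful API.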

Language: Java - Size: 68.4 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 24 - Forks: 16

rishiravikumar-tul-scm/IPL-Analysis

IPL Match Simulation using K-means Clustering and Collaborative Filtering.

Language: Python - Size: 2.64 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

ZhipengHong0123/Steam-Game-Analysis

Language: HTML - Size: 3.68 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 4

samanta-anupam/similar-water-regions

In this project we look at the Global Surface Water Explorer and find patches of area across the entire world that are similar to each other, using the European Commission Global Surface satellite water dataset

Language: Jupyter Notebook - Size: 346 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

tweichle/Spark-for-Big-Data

Spark: Work with Big Data and Build Machine Learning Models at Scale

Language: Jupyter Notebook - Size: 63.5 KB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 2 - Forks: 1

tkachuksergiy/aws-spark-nlp

Work related to a recent project on using Apache Spark and the AWS cloud for an NLP task.

Language: Jupyter Notebook - Size: 2.76 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 5 - Forks: 0

grishenkovp/databricks

A collection of case studies built on the Databricks platform

Language: Jupyter Notebook - Size: 504 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

MostafaToema/Stroke-Prediction-using-Pyspark

Data preparation, visualization, feature engineering, and classification of people who have had a stroke, using PySpark libraries

Language: Jupyter Notebook - Size: 79.1 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

nbegumc/market-basket-analysis

Finding frequent itemsets using the Apriori and FP-Growth algorithms on Spark

Language: Jupyter Notebook - Size: 692 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

DavideNardone/TwitterSentimentAnalysis

A Spark Streaming implementation for Online Twitter Sentiment Analysis.

Language: Python - Size: 1.78 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 8 - Forks: 3

josemarialuna/ClusterIndices

This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bouldin and WSSSE indices.

Language: Scala - Size: 588 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 10 - Forks: 3

lp-dataninja/SparkML

Detailed notes and code to learn machine learning with Apache Spark.

Language: Jupyter Notebook - Size: 4.06 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 12 - Forks: 17

giuseppegambino/Italian-Sentiment-Analysis-with-Spark

Sentiment analysis of Italian tweets with Python and Spark

Language: Python - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 10 - Forks: 0

TrainingByPackt/Big-Data-Processing-with-Apache-Spark-eLearning

Efficiently tackle large datasets and perform big data analysis with Spark and Python

Language: Python - Size: 36.1 KB - Last synced at: 17 days ago - Pushed at: over 6 years ago - Stars: 7 - Forks: 6

OmarZOS/http-storage-service-mediator

This service is a component inside the petroleum production information system that I conceived and proposed.

Language: JavaScript - Size: 81.1 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

askmrsinh/spark-stocksim

Monte Carlo stock simulation using Apache Spark.

Language: Scala - Size: 1.81 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 2

demanejar/sparkml

Demo clustering with LDA Spark MLlib

Language: Python - Size: 831 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

kapilthakre/Bicycle-Sharing-Demand-Forecasting-Using-Spark-Scala

In this project, we build a bicycle-sharing demand prediction service using Apache Spark and Scala. I have created two Spark applications: one for model generation and another for demand prediction.

Language: Scala - Size: 295 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

merrillm1/Olist_Recommender_System

Recommendation engine that achieves a 0.97 AUC, using clustering techniques to create user features. The data represents Olist marketplace transactions and was retrieved from kaggle.com.

Language: Jupyter Notebook - Size: 77.4 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 6

MostafaToema/Titanic-Survival-Prediction-using-Pyspark

Data preparation, visualization, feature engineering, and survival classification using PySpark libraries

Language: Jupyter Notebook - Size: 43 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

MostafaToema/Wuzzuf-jobs

Wuzzuf data analysis and visualization, applying the k-means algorithm using Spark with Java

Language: Java - Size: 565 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

MostafaToema/PySpark-Practices

How to use Pyspark libraries with real data

Language: Jupyter Notebook - Size: 2.22 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

uosdmlab/spark-nkp

Natural Korean Processor for Apache Spark

Language: Scala - Size: 53.7 KB - Last synced at: 5 months ago - Pushed at: about 7 years ago - Stars: 53 - Forks: 16

yizhiru/mllibX

A customized version of mllib

Language: Scala - Size: 35.2 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

simbafl/spark-branch-2.4

Source code analysis of Spark 2.4

Language: Scala - Size: 17.8 MB - Last synced at: 4 days ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 1

szaher/spark

Playing with Spark using Java

Language: Java - Size: 424 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

AlexKbit/titanic-sparkml

Sample with SparkML on Titanic dataset

Language: Scala - Size: 34.2 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

saLeox/Lambda-ServingAPI-HousePricePredict

Embedding machine learning from spark-mllib into Spring Boot to provide a house price prediction API

Language: Java - Size: 26.4 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

saLeox/Lambda-BatchTraining-HousePricePredict

Uses the Spark MLlib Java library to perform machine learning (a regression problem)

Language: Jupyter Notebook - Size: 2.06 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

akashsethi24/Machine-Learning

Examples of all the machine learning algorithms in Apache Spark

Language: Scala - Size: 3.64 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 15 - Forks: 10

giovannigarifo/bigdata

Code samples, summaries, cheatsheets and other study material for Hadoop MapReduce and Apache Spark

Language: Java - Size: 69.1 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 2

dorianbg/cs110x-big-data-analysis-with-spark-labs

Graded lab exercises from the CS110x Big Data Analysis with Apache Spark online course on edX

Language: Jupyter Notebook - Size: 74.2 KB - Last synced at: about 2 years ago - Pushed at: over 8 years ago - Stars: 1 - Forks: 2

angeligareta/machine-learning-spark

Assignment for Scalable Machine Learning which aims to study the basics of regression and classification in Spark.

Language: Scala - Size: 1.42 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

angeligareta/spark-hadoop-hbase-overview

First lab for Data-Intensive Computing course at KTH where we are introduced to Apache Spark MLlib and Spark SQL, Hadoop, and HBase.

Language: Jupyter Notebook - Size: 22.4 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

Bcromas/pyspark_projects

A collection of small projects exploring PySpark features and functionality including packages and modules, algorithms, and general data science techniques.

Language: Jupyter Notebook - Size: 80.1 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

sunujh6/spark_practice

Language: Jupyter Notebook - Size: 1.62 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

tertiarycourses/ApacheSparkTraining

Exercise files for Apache Spark Essential Training

Language: Jupyter Notebook - Size: 4.05 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

Zoe-0925/PredictRain

A Python program that predicts rainfall using Spark MLlib machine learning algorithms.

Language: Jupyter Notebook - Size: 3.62 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

lowener/Spark-social-network-backend

Backend for a social network in Spark in Scala

Language: Scala - Size: 79.8 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

SainathDutkar/Fraud_Transaction_Monitor

Detects fraudulent credit card transactions in real time

Language: Scala - Size: 1000 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 4

joyoyoyoyoyo/emojipasta-topic-modeling

😅 A topic model of reddit.com/r/EmojiPasta trained with Spark and an LDA model (NSFW) - Trigger Warning: The r/emojipasta subreddit posts controversial content and anything I have crawled is to provide visibility of a topic modeling some of this controversial content. Unfortunately there is also discriminatory speech which must be called out!

Language: Scala - Size: 700 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 0

crazyalin92/movie_recomendation_system

Spark MLlib: Collaborative Filtering Movie Recommendation System

Language: Scala - Size: 5.6 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 1

billyean/ztml

Implementations for the Coursera machine learning course, plus some TensorFlow code.

Language: Python - Size: 305 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

shantanu-93/scalable-matrix-multiply

Fast and scalable matrix multiplication using the Spark, Breeze, and BLAS libraries

Language: JavaScript - Size: 127 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

vaibhav50596/DeerfootTrailAnalysis

The goal is to train a linear regression model to predict Deerfoot commute times given weather and accident conditions using Spark RDD and MLlib

Language: Jupyter Notebook - Size: 82 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1

desaiankitb/spark-mllib

Apache Spark is one of the most widely used and supported open-source tools for machine learning and big data. In this repo, discover how to work with this powerful platform for machine learning. This repo discusses MLlib—the Spark machine learning library—which provides tools for data scientists and analysts who would rather find solutions to business problems than code, test, and maintain their own machine learning libraries. Repo shows how to use DataFrames to organize data structure, and covers data preparation and the most commonly used types of machine learning algorithms: clustering, classification, regression, and recommendations. You will have experience loading data into Spark, preprocessing data as needed to apply MLlib algorithms, and applying those algorithms to a variety of machine learning problems.

Language: Python - Size: 150 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 4 - Forks: 5

abhay6694/PySpark-Component

Collection of spark-components functions for big-data processing

Language: Jupyter Notebook - Size: 12.7 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

nahidalam/Spark

Spark, Python, AWS EMR, MLlib, Spark Streaming, Spark SQL

Language: Jupyter Notebook - Size: 2.93 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

Aveek-Saha/Cricket-score-predictor

A Big data application to predict the outcome of a T20 cricket match.

Language: Jupyter Notebook - Size: 2.17 MB - Last synced at: 17 days ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 0

DerrickBu/Movie_Recommendation_Application

This is a web-based movie recommendation application written in Scala using Apache Spark and Livy.

Language: Scala - Size: 17.6 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 2

jeanlks/sparkCourse

Spark Course notebooks.

Language: Jupyter Notebook - Size: 714 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

BinhMinhs10/DataMiningExample

Maven project covering Scala (sparkml, spark_streaming, spark_dataframe, ...) and Java (threadpool, kafka, jpa, timer, request api)

Language: Scala - Size: 1.32 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

trendyol-data-eng-summer-intern-2019/recom-engine-ml

ML component of the project, which is written with Spark ML.

Language: Java - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

nath2709/spark-ml

Language: Scala - Size: 558 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

maniram-yadav/Spark_And_Scala_codes

Language: Scala - Size: 498 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 1

AndrewKuzmin/spark-ml-pipelines-with-structured-streaming-examples

Examples of using Apache Spark MLlib Pipelines and Structured Streaming on version 2.4.0

Language: Shell - Size: 1020 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

hichem/spark-training

Language: Jupyter Notebook - Size: 1.93 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

AswathKiruba/Stock_Price_Prediction

This is the CSYE7200 Big Data Systems Engineering Using Scala Final Project for Team 9 Fall 2018

Language: Scala - Size: 3.48 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

corneliouzbett/Master-Apache-Spark

Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming

Language: Python - Size: 889 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

multivacplatform/multivac-nlp

Testing and benchmarking some of the existing NLP libraries in Apache Spark

Language: Scala - Size: 12 MB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

suyash248/recommender_system

Recommendation system using Graph DB(Neo4j), Apache Spark & Machine learning.

Language: Jupyter Notebook - Size: 17.4 MB - Last synced at: 3 months ago - Pushed at: about 7 years ago - Stars: 3 - Forks: 1

rtahmasbi/Spark

Size: 18.6 KB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

AliAminian/server-response-time-predictor

An example showing how Spark ML can be used to predict the response time of a service in a server-side application

Language: Java - Size: 243 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

markusheilig/san-francisco-calls-for-service

Apache Spark mllib example for seminar 'AI with scala'

Language: CSS - Size: 21.6 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

lp-dataninja/PyData

Code and Data for PyData-Hyderabad-Chapter meetup

Language: HTML - Size: 1.24 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 3

nishantgandhi99/Santander-Product-Recommendation (fork of vaishalilambe/Team7_Santander_Product_Recommendation)

Santander Product Recommendation for Santander Customer Dataset / Kaggle Competition

Language: Scala - Size: 8.68 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 1

memojja/recomendation-engine

Language: Java - Size: 35.7 MB - Last synced at: about 1 month ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

vinfly/vinSparkMLlib

MLlib samples

Language: Scala - Size: 18.6 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

irfanalidv/Breast_Cancer_Prediction_using_Apache_Spark

Predict whether the cancer is benign or malignant

Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

happylittlebunny/Yelp-User-Pattern-And-Recommender-System

Yelp Toronto User Pattern Analysis and Recommender System

Language: Python - Size: 104 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

crazyalin92/spark-logistic-regression

Example of applying logistic regression to predict diabetes in patients

Language: Scala - Size: 10.7 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 3

Related Keywords
spark-mllib 173 spark 111 spark-sql 55 spark-streaming 50 machine-learning 42 scala 40 pyspark 33 spark-ml 32 apache-spark 27 python 23 big-data 20 python3 16 hadoop 11 kafka 8 data-science 7 java 7 mongodb 7 hadoop-mapreduce 7 recommender-system 7 sparksql 7 sparkjava 7 big-data-analytics 6 bigdata 6 docker 6 sentiment-analysis 6 collaborative-filtering 6 jupyter-notebook 6 ml 5 nlp 5 linear-regression 5 spark-structured-streaming 5 clustering 5 data-analysis 5 kmeans-clustering 5 hdfs 4 feature-engineering 4 random-forest 4 natural-language-processing 4 data-visualization 4 hadoop-hdfs 4 databricks 3 structured-streaming 3 elasticsearch 3 apache-kafka 3 logistic-regression 3 kafka-streams 3 spark-nlp 3 docker-compose 3 delta-lake 3 rdd 3 twitter 3 pyspark-python 3 recommendation-system 3 aws-s3 3 mapreduce 3 apache-hadoop 3 alternating-least-squares 3 spring-boot 3 twitter-sentiment-analysis 3 visualization 3 supervised-learning 3 prediction 3 decision-trees 3 distributed-computing 2 neo4j 2 lda 2 pandas 2 product-recommender-system 2 product-recommendation 2 kafka-producer 2 kafka-consumer 2 cassandra 2 hadoop-framework 2 sbt 2 als 2 spark-rdd 2 spark-mllib-library 2 r 2 stock-price-prediction 2 graphx 2 topic-modeling 2 classification-algorithm 2 spark-streaming-kafka 2 spark-core 2 java-8 2 twitter-api 2 aws 2 naive-bayes 2 naive-bayes-classification 2 recommendation 2 decision-tree-classifier 2 classification 2 text-mining 2 matplotlib 2 pyspark-machine-learning 2 pyspark-api 2 spark-dataframes 2 kaggle 2 clustering-evaluation 2 price-prediction 2