Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: spark-mllib
LuisFalva/ophelia
Ophelia, a PySpark analytics wrapper.
Language: Python - Size: 860 KB - Last synced: 2 days ago - Pushed: 3 days ago - Stars: 1 - Forks: 5
abulbasar/zeppelin-notebooks
Size: 3.91 KB - Last synced: 14 days ago - Pushed: over 6 years ago - Stars: 1 - Forks: 1
aabdel-kader/Apache-Spark
A repository for my practices and projects using pyspark
Language: Jupyter Notebook - Size: 11.6 MB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0
wengbenjue/spark_recomend
A core recommendation engine that uses Spark MLlib and HBase for the model and Hive for data cleaning; tested on Spark on YARN
Language: C - Size: 41.2 MB - Last synced: about 1 month ago - Pushed: about 7 years ago - Stars: 29 - Forks: 17
yennanliu/NYC_Taxi_Trip_Duration
Develop ML models to predict taxi trip duration in NYC. Ranked: Top 6% | RMSLE: 0.377 (Kaggle) | #DS
Language: Jupyter Notebook - Size: 43.2 MB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 15 - Forks: 8
P7h/p7hb-docker-mllib-twitter-sentiment
:ship: Docker image for Twitter Sentiment analysis with Spark MLlib
Language: Shell - Size: 138 KB - Last synced: about 1 month ago - Pushed: almost 7 years ago - Stars: 7 - Forks: 3
IBM/db2-event-store-iot-analytics
IoT sensor temperature analysis and prediction with IBM Db2 Event Store
Language: Jupyter Notebook - Size: 36.6 MB - Last synced: about 1 month ago - Pushed: over 3 years ago - Stars: 4 - Forks: 21
P7h/Spark-MLlib-Twitter-Sentiment-Analysis
:star2: :sparkles: Analyze and visualize Twitter Sentiment on a world map using Spark MLlib
Language: Scala - Size: 19.7 MB - Last synced: about 1 month ago - Pushed: about 3 years ago - Stars: 135 - Forks: 69
qubole/sparklens
Qubole Sparklens tool for performance tuning Apache Spark
Language: Scala - Size: 175 KB - Last synced: 30 days ago - Pushed: 11 months ago - Stars: 547 - Forks: 130
aliabbasi2000/Spark
Solving Big Data Problems using Spark framework in Java. Running the Project on HDFS clusters (BigData@Polito) to get the results.
Language: Java - Size: 143 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0
felidsche/movie-recommender
A movie recommendation system built using Apache Spark’s ML library
Language: Python - Size: 829 KB - Last synced: about 2 months ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0
jaceklaskowski/spark-workshop
Apache Spark™ and Scala Workshops
Language: HTML - Size: 57 MB - Last synced: 16 days ago - Pushed: over 1 year ago - Stars: 253 - Forks: 143
josemarialuna/ExternalValidity
This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.
Language: Scala - Size: 146 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 9 - Forks: 1
IBM/icp4d-customer-churn-classifier
Infuse AI into your application. Create and deploy a customer churn prediction model with IBM Cloud Private for Data, Db2 Warehouse, Spark MLlib, and Jupyter notebooks.
Language: Jupyter Notebook - Size: 28.1 MB - Last synced: about 1 month ago - Pushed: 12 months ago - Stars: 17 - Forks: 22
LuckyZXL2016/Movie_Recommend
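A Spark-based movie recommendation system, comprising a web crawler project, a website, a back-office management system, and the Spark recommendation system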
Language: Java - Size: 55.1 MB - Last synced: 3 months ago - Pushed: about 5 years ago - Stars: 2,636 - Forks: 1,030
bobxwang/predict-stock-in-spark
Using Spark to predict stocks; the data comes from Sina
Language: Scala - Size: 143 KB - Last synced: 3 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0
uosdmlab/spark-nkp
Natural Korean Processor for Apache Spark
Language: Scala - Size: 53.7 KB - Last synced: 3 months ago - Pushed: about 6 years ago - Stars: 53 - Forks: 16
CaioBrainer/Hadoop_Projects
Small projects using tools from the Apache Hadoop ecosystem
Language: Python - Size: 18.6 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
OrvilleX/MachineLearning
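A machine learning tutorial covering machine learning with numpy, sklearn, and tensorflow, as well as using Spark and Flink to speed up model training. It aims to give readers a fairly complete introduction to machine learning.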
Language: Python - Size: 10.1 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 47 - Forks: 16
Jayant1234/Malware-classification
Fork of dsp-uga/sabayon-p1
Language: Jupyter Notebook - Size: 11.4 MB - Last synced: 4 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0
derrickburns/generalized-kmeans-clustering
Spark library for generalized K-Means clustering. Supports general Bregman divergences. Suitable for clustering probabilistic data, time series data, high dimensional data, and very large data.
Language: HTML - Size: 7.42 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 284 - Forks: 51
marcocolangelo/Big-Data-processing-and-Analytics
The current repository contains all the code developed during the Big Data processing and Analytics laboratories. Data are processed and analyzed using Hadoop and Spark
Language: Java - Size: 6.1 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0
mikerly131/serveUpRecos
Build a mock EMR app and integrate an AI/ML prediction into an encounter workflow
Language: CSS - Size: 54.9 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0
polaroidz/sales_prediction
A Production Machine Learning Pipeline for Predicting Future Sales with Spark
Language: Jupyter Notebook - Size: 90.8 KB - Last synced: 5 months ago - Pushed: over 5 years ago - Stars: 3 - Forks: 0
venkateshavula/Evaluate-Spark-MLlib-using-PySpark
A UDF to evaluate Spark-MLlib classification model using PySpark
Language: Python - Size: 4.88 KB - Last synced: 5 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0
miguelangel43/Prediction-Flight-Arrivals-Delays-Spark
Application that trains a classifier and predicts flight arrival delays based on past information. Uses the libraries pyspark.ml and pyspark.sql, performs feature engineering, cross-validation and tests various ML algorithms.
Language: Python - Size: 41 KB - Last synced: 7 days ago - Pushed: 5 months ago - Stars: 0 - Forks: 0
amitkumarusc/recommendation-system
A movie recommendation system trained on the MovieLens 20 Million dataset. This system makes use of Collaborative filtering methods to come up with recommendations for a particular user.
Language: Jupyter Notebook - Size: 21.8 MB - Last synced: 2 months ago - Pushed: over 4 years ago - Stars: 13 - Forks: 3
harishpuvvada/BitCoin-Value-Predictor 📦
[NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Language: Jupyter Notebook - Size: 3.07 MB - Last synced: 6 months ago - Pushed: over 4 years ago - Stars: 113 - Forks: 29
ShubhamJagtap2000/Spark-Python
🐍💥Python and Spark for Big Data
Language: Jupyter Notebook - Size: 73.2 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0
wzhe06/SparkCTR
CTR prediction model based on spark(LR, GBDT, DNN)
Language: Scala - Size: 35 MB - Last synced: 7 months ago - Pushed: about 4 years ago - Stars: 880 - Forks: 264
omerbsezer/SparkDeepMlpGADow30 📦
A Deep Neural-Network based (Deep MLP) Stock Trading System based on Evolutionary (Genetic Algorithm) Optimized Technical Analysis Parameters (using Apache Spark MLlib)
Language: Java - Size: 213 MB - Last synced: 7 months ago - Pushed: about 6 years ago - Stars: 58 - Forks: 46
polaternez/Introduction-to-Big-Data
Big Data projects for beginners
Language: Java - Size: 4.59 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0
databricks/LearningSparkV2
This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Language: Scala - Size: 75.2 MB - Last synced: 7 months ago - Pushed: over 1 year ago - Stars: 1,009 - Forks: 647
alessandrolulli/reforest
Random Forests in Apache Spark
Language: Scala - Size: 71 MB - Last synced: 7 months ago - Pushed: almost 5 years ago - Stars: 72 - Forks: 11
rtahmasbi/Spark
Size: 18.6 KB - Last synced: 7 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 1
MNoorFawi/linear-regression-with-spark
Creating an SBT-based Spark Application to predict Online News Popularity using Linear Regression Algorithm ...
Language: Scala - Size: 18.6 KB - Last synced: 7 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0
MNoorFawi/kmeans-clustering-with-spark
Creating an sbt Apache Spark application to perform customer segmentation using Spark MLlib KMeans ...
Language: Scala - Size: 15.6 KB - Last synced: 7 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 1
jingpeicomp/product-category-predict
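Product category prediction. Uses the Spring Boot framework and the Spark MLlib machine learning framework to train a product category prediction model with TF-IDF and Bayes algorithms. The model automatically predicts a product's category from its name. The project exposes a RESTful interface.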
Language: Java - Size: 41.2 MB - Last synced: 7 months ago - Pushed: almost 3 years ago - Stars: 119 - Forks: 60
lookuut/raif-competition
Spark application for predicting a customer's home and work coordinates from payment transactions
Language: Scala - Size: 27.7 MB - Last synced: 9 months ago - Pushed: about 6 years ago - Stars: 1 - Forks: 0
akarsh3007/Recommendation-Systems
Simple content-based and collaborative filtering algorithm implementations
Language: Python - Size: 1.36 MB - Last synced: 9 months ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0
jr2ngb2/yelp_recommender
Language: Jupyter Notebook - Size: 45.9 KB - Last synced: 9 months ago - Pushed: about 5 years ago - Stars: 1 - Forks: 0
xtutran/spark-tutor
Using spark-sql & spark-mllib to tackle Titanic & Movie Recommendation
Language: Scala - Size: 112 KB - Last synced: 9 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0
trendyol-data-eng-summer-intern-2019/recom-engine-streaming
Streaming component of the project, which is written with Spark Streaming.
Language: Scala - Size: 15.6 KB - Last synced: 10 months ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 1
agrimrules/brewery
A spark job that processes data scraped from the web
Language: Scala - Size: 17.6 KB - Last synced: 10 months ago - Pushed: almost 6 years ago - Stars: 0 - Forks: 0
NashTech-Labs/Sparkathon
A library having Java and Scala examples for Spark 2.x
Language: Java - Size: 113 MB - Last synced: 7 months ago - Pushed: over 7 years ago - Stars: 7 - Forks: 9
gavalle94/Songs-Recommender
Recommendation System written in Python, using the pySpark framework and other Data Science libraries
Language: HTML - Size: 5.23 MB - Last synced: 10 months ago - Pushed: about 6 years ago - Stars: 1 - Forks: 1
lucashomuniz/Project-14
Development of an AutoML System to Predict the Compressive Strength of Concrete
Language: Python - Size: 42 KB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0
hoangviet148/Foody
Language: Python - Size: 17.6 MB - Last synced: 10 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 1
avcaliani/spark-ml-app
🤖
Language: Jupyter Notebook - Size: 118 KB - Last synced: 10 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0
Lucass97/FlightAnalysis
This project implemented a lambda architecture for analyzing domestic flight data in the US from 2009 to 2020. It used Apache Spark for batch processing, Spark Streaming for real-time analysis, and SVM models to predict flight cancellations and delays, with Docker for cluster management and Grafana for real-time visualization.
Language: Jupyter Notebook - Size: 5.66 MB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 1
shalakasaraogi/apache-spark-pig-hive-work
This repository contains Apache Spark, Apache Hive, Apache Pig work
Language: PigLatin - Size: 813 KB - Last synced: 10 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0
jacopocav/spark-ifs
Iterative filter-based feature selection on large datasets with Apache Spark
Language: Scala - Size: 130 KB - Last synced: 10 months ago - Pushed: almost 6 years ago - Stars: 3 - Forks: 0
xghan99/bigdata-assignments
This repository consists of code I wrote for CS4225 - Big Data Systems for Data Science
Language: Jupyter Notebook - Size: 3.64 MB - Last synced: 10 months ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
shubham-deb/Neural-Circuit-Tracer
This repository contains the source codes & scripts of my project for Master's level course - CS6240 Parallel Data Processing in Map-Reduce course at College of Computer & Information Science, Northeastern University, Boston MA.
Language: Scala - Size: 663 KB - Last synced: about 2 months ago - Pushed: about 6 years ago - Stars: 1 - Forks: 0
shubham-deb/Spark_Scala_Programs
This repository contains all the Spark Scala programs that I have implemented during my Master's level course - CS6240 Parallel Data Processing in Map-Reduce course at College of Computer & Information Science, Northeastern University, Boston MA.
Language: Makefile - Size: 4 MB - Last synced: about 2 months ago - Pushed: about 6 years ago - Stars: 1 - Forks: 0
esap120/spark-twitter-streaming
Streaming Twitter Sentiment Analysis with Apache Spark
Language: Scala - Size: 8.79 KB - Last synced: 10 months ago - Pushed: over 5 years ago - Stars: 5 - Forks: 1
Siddharth1989/ProspectiveTopUpCustomerPrediction
Developed a model/Spark ML pipeline stream to identify potential customers that may purchase top up services in the future.
Language: Jupyter Notebook - Size: 6.17 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0
huzaifakhan04/amazon-product-recommendation-system-web-application-flask-using-mongodb-pyspark-and-apache-kafka
This repository includes a web application connected to a product recommendation system developed with the comprehensive Amazon Review Data (2018) dataset, consisting of nearly 233.1 million records and occupying approximately 128 gigabytes (GB) of data storage, using MongoDB, PySpark, and Apache Kafka.
Language: Jupyter Notebook - Size: 91 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0
ging/fiware-ml-supermarket
Demo: Predicting purchase volume in a supermarket using FIWARE
Language: JavaScript - Size: 290 KB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 4 - Forks: 1
cbozan/graduation-project
Graduation project categorizes popular search phrases using Python and Spark and presents them on a website to inspire creators.
Language: Python - Size: 402 KB - Last synced: 11 months ago - Pushed: over 1 year ago - Stars: 4 - Forks: 0
vaslnk/Spotify-Song-Recommendation-ML
UC Berkeley team's submission for RecSys Challenge 2018
Language: Jupyter Notebook - Size: 34.2 MB - Last synced: 9 months ago - Pushed: about 6 years ago - Stars: 82 - Forks: 21
surajsrivathsa/Supervised_Link_Prediction_Using_Spark_and_Neo4j
A project analyzing authorship graph data from the Microsoft Academic Graph. We calculate different graph features using temporal parameters of the authors and try different classifiers. The final aim is to predict the possibility of a link, or co-authorship, between two authors based on topological graph features, and to assess the feasibility of performing this task on Neo4j and Spark.
Language: Scala - Size: 15.8 MB - Last synced: 4 months ago - Pushed: almost 4 years ago - Stars: 5 - Forks: 4
MHassaanButt/Flight-Delays-Prediction
In this project, I used a Decision Tree learning model as the main algorithm. Because of the large amount of flight data, the project is implemented using MRJob, PySpark, and Spark's MLlib, and the performance and accuracy of those implementations are compared.
Language: Jupyter Notebook - Size: 17.5 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 8 - Forks: 0
avaibh/Twitter-Bot-Detection
Big Data Stack: Spark, Kafka, Elasticsearch and NoSQL
Language: Jupyter Notebook - Size: 304 KB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 4 - Forks: 0
anmolmore/Enzyme-Classifier-Using-ML
Classify enzymes with genomic sequence using spark-ML
Language: Jupyter Notebook - Size: 719 MB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0
svngoku/Pyspark-pour-les-datas-engineers
A hands-on introduction to PySpark for data engineers
Language: Jupyter Notebook - Size: 784 KB - Last synced: 16 days ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0
bassrehab/zerofish-imaging
Using the Thunder Library for Image Processing with Spark ML Lib
Language: Python - Size: 1.83 MB - Last synced: 12 months ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0
grishenkovp/apache_spark
Learning Apache Spark. The PySpark library
Language: Jupyter Notebook - Size: 135 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 2 - Forks: 0
satyajeetmaharana/floodprediction
The goal of this project is to identify flood-prone areas and estimate the probability of flooding in counties on a future date, using Spark MLlib.
Language: Scala - Size: 3.46 MB - Last synced: 4 months ago - Pushed: over 4 years ago - Stars: 3 - Forks: 1
forons/BigDataExamples
Code repository for the MSc course "Big Data and Social Networks" of the University of Trento
Language: Jupyter Notebook - Size: 229 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 2 - Forks: 13
amanjeetsahu/Apache-Spark-Tutorials
This repo contains my learnings and practice notebooks on Spark using PySpark (Python Language API on Spark). All the notebooks in the repo can be used as template code for most of the ML algorithms and can be built upon it for more complex problems.
Language: Jupyter Notebook - Size: 20.9 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 4 - Forks: 10
Rohini2505/Lending-Club-Loan-Analysis
Exploratory Data Analysis and ML model building using Apache Spark and PySpark
Language: HTML - Size: 6.26 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 10 - Forks: 12
ABigdataer/MovieRecommendSystem
A Spark-based movie recommendation system
Language: HTML - Size: 64.6 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 4 - Forks: 4
alefbt/SparkML-spring-scoring-poc 📦
POC of a scoring REST service for Spark ML Pipelines
Language: Java - Size: 234 KB - Last synced: about 1 year ago - Pushed: about 5 years ago - Stars: 0 - Forks: 0
corvaglia-alessio/big-data-labs 📦
Labs for the course "Big Data: architectures and data analytics" @ Politecnico di Torino a.y. 2021/22
Language: Java - Size: 53.4 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0
chaithrakc/credit_card_default_prediction
Analyzing the likelihood of credit card delinquency without using credit scores or credit history
Language: Jupyter Notebook - Size: 4.16 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
huangyueranbbc/Spark_ALS
Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming
Language: Java - Size: 97.7 KB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 89 - Forks: 46
kocharshaivi19/Stock-Analysis-and-Prediction
Financial Forecasting and its correlation with Human Sentiments using Distributed Computing on Spark Framework
Language: Scala - Size: 811 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 4 - Forks: 4
MHassaanButt/Crime-Spark-ML
In this project I stream data and do crime classification using Spark. This dataset contains incidents derived from the SFPD Crime Incident Reporting system. The data ranges from 1/1/2003 to 5/13/2015. I do some data analysis of crime scenes in different areas and with respect to other parameters.
Language: Python - Size: 5.86 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 4 - Forks: 0
spoddutur/spark-ml-dashboard
Spark ML Dashboard built to plug-in and tweak the model params to real-time verify classification results on sample test data
Language: Scala - Size: 20.5 MB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 5 - Forks: 2
jingpeicomp/product-relation-mining
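Product association mining. Uses the Spring Boot framework and the Spark MLlib machine learning framework to analyze users' shopping cart data with the FP-Growth algorithm and mine associations between products. The project exposes a RESTful interface.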
Language: Java - Size: 68.4 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 24 - Forks: 16
wikistat/AI-Frameworks
Data Science Season 5: Technologies for machine / statistical learning on massive data and Artificial Intelligence
Language: Jupyter Notebook - Size: 646 MB - Last synced: 9 months ago - Pushed: 11 months ago - Stars: 40 - Forks: 41
Wadaboa/production-line-performance
Scala/Spark project, for Languages and Algorithms for Artificial Intelligence class at UNIBO
Language: Scala - Size: 31 MB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0
rishiravikumar-tul-scm/IPL-Analysis
IPL Match Simulation using K-means Clustering and Collaborative Filtering.
Language: Python - Size: 2.64 MB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0
FlorentF9/sparkml-som
:sparkles: Spark ML implementation of SOM algorithm (Kohonen self-organizing map)
Language: Scala - Size: 29.3 KB - Last synced: 7 months ago - Pushed: over 2 years ago - Stars: 16 - Forks: 6
ZhipengHong0123/Steam-Game-Analysis
Language: HTML - Size: 3.68 MB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 0 - Forks: 4
samanta-anupam/similar-water-regions
In this project we look at the Global Surface Water Explorer and find patches of area across the entire world that are similar to each other, using the European Commission Global Surface satellite water dataset
Language: Jupyter Notebook - Size: 346 KB - Last synced: 4 months ago - Pushed: over 6 years ago - Stars: 1 - Forks: 0
tweichle/Spark-for-Big-Data
Spark: Work with Big Data and Build Machine Learning Models at Scale
Language: Jupyter Notebook - Size: 63.5 KB - Last synced: 12 months ago - Pushed: almost 4 years ago - Stars: 2 - Forks: 1
tkachuksergiy/aws-spark-nlp
Works related to recent project on the use of Apache Spark and AWS cloud for NLP task.
Language: Jupyter Notebook - Size: 2.76 MB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 5 - Forks: 0
grishenkovp/databricks
A collection of case studies built on the Databricks platform
Language: Jupyter Notebook - Size: 504 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
omerbsezer/SparkMlpDow30
A new stock trading and prediction model based on a MLP neural network utilizing technical analysis indicator values as features (using Apache Spark MLlib)
Language: Java - Size: 201 MB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 34 - Forks: 21
MostafaToema/Stroke-Prediction-using-Pyspark
Data preparation, visualization, feature engineering, and classification of people who have had a stroke, using PySpark libraries
Language: Jupyter Notebook - Size: 79.1 KB - Last synced: almost 1 year ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0
nbegumc/market-basket-analysis
Finding frequent itemsets using Apriori and FP Growth algorithm on Spark
Language: Jupyter Notebook - Size: 692 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0
DavideNardone/TwitterSentimentAnalysis
A Spark Streaming implementation for Online Twitter Sentiment Analysis.
Language: Python - Size: 1.78 MB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 8 - Forks: 3
omerbsezer/SparkMLlibExamples-Scala-
Spark MLlib Examples (Scala)
Language: Java - Size: 371 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 1 - Forks: 2
josemarialuna/ClusterIndices
This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bouldin and WSSSE indices.
Language: Scala - Size: 588 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 10 - Forks: 3
lp-dataninja/SparkML
Detailed notes and code to learn machine learning with Apache Spark.
Language: Jupyter Notebook - Size: 4.06 MB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 12 - Forks: 17
giuseppegambino/Italian-Sentiment-Analysis-with-Spark
Sentiment analysis of Italian tweets with Python and Spark
Language: Python - Size: 6.84 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 10 - Forks: 0
TrainingByPackt/Big-Data-Processing-with-Apache-Spark-eLearning
Efficiently tackle large datasets and perform big data analysis with Spark and Python
Language: Python - Size: 36.1 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 7 - Forks: 6
berksudan/PySpark-Auto-Clustering
Implemented an auto-clustering tool with seed and number of clusters finder. Optimizing algorithms: Silhouette, Elbow. Clustering algorithms: k-Means, Bisecting k-Means, Gaussian Mixture. Module includes micro-macro pivoting, and dashboards displaying radius, centroids, and inertia of clusters. Used: Python, Pyspark, Matplotlib, Spark MLlib.
Language: Python - Size: 73.2 KB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0