Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: spark-mllib

LuisFalva/ophelia

Ophelia, a PySpark analytics wrapper.

Language: Python - Size: 860 KB - Last synced: 2 days ago - Pushed: 3 days ago - Stars: 1 - Forks: 5

abulbasar/zeppelin-notebooks

Size: 3.91 KB - Last synced: 14 days ago - Pushed: over 6 years ago - Stars: 1 - Forks: 1

aabdel-kader/Apache-Spark

A repository for my practices and projects using pyspark

Language: Jupyter Notebook - Size: 11.6 MB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0

wengbenjue/spark_recomend

Core recommendation engine that uses Spark MLlib and HBase for the model and Hive for data cleaning; tested on Spark on YARN

Language: C - Size: 41.2 MB - Last synced: about 1 month ago - Pushed: about 7 years ago - Stars: 29 - Forks: 17

yennanliu/NYC_Taxi_Trip_Duration

Develop ML models to predict taxi trip duration in NYC. Ranked: Top 6% | RMSLE: 0.377 (Kaggle) | #DS

Language: Jupyter Notebook - Size: 43.2 MB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 15 - Forks: 8

P7h/p7hb-docker-mllib-twitter-sentiment

🚢 Docker image for Twitter Sentiment analysis with Spark MLlib

Language: Shell - Size: 138 KB - Last synced: about 1 month ago - Pushed: almost 7 years ago - Stars: 7 - Forks: 3

IBM/db2-event-store-iot-analytics

IoT sensor temperature analysis and prediction with IBM Db2 Event Store

Language: Jupyter Notebook - Size: 36.6 MB - Last synced: about 1 month ago - Pushed: over 3 years ago - Stars: 4 - Forks: 21

P7h/Spark-MLlib-Twitter-Sentiment-Analysis

🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib

Language: Scala - Size: 19.7 MB - Last synced: about 1 month ago - Pushed: about 3 years ago - Stars: 135 - Forks: 69

qubole/sparklens

Qubole Sparklens tool for performance tuning Apache Spark

Language: Scala - Size: 175 KB - Last synced: 30 days ago - Pushed: 11 months ago - Stars: 547 - Forks: 130

aliabbasi2000/Spark

Solving Big Data Problems using Spark framework in Java. Running the Project on HDFS clusters (BigData@Polito) to get the results.

Language: Java - Size: 143 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

felidsche/movie-recommender

A movie recommendation system built using Apache Spark’s ML library

Language: Python - Size: 829 KB - Last synced: about 2 months ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

jaceklaskowski/spark-workshop

Apache Spark™ and Scala Workshops

Language: HTML - Size: 57 MB - Last synced: 16 days ago - Pushed: over 1 year ago - Stars: 253 - Forks: 143

josemarialuna/ExternalValidity

This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.

Language: Scala - Size: 146 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 9 - Forks: 1

IBM/icp4d-customer-churn-classifier

Infuse AI into your application. Create and deploy a customer churn prediction model with IBM Cloud Private for Data, Db2 Warehouse, Spark MLlib, and Jupyter notebooks.

Language: Jupyter Notebook - Size: 28.1 MB - Last synced: about 1 month ago - Pushed: 12 months ago - Stars: 17 - Forks: 22

LuckyZXL2016/Movie_Recommend

Spark-based movie recommendation system, including a web crawler project, a website, an admin back end, and a Spark recommendation system

Language: Java - Size: 55.1 MB - Last synced: 3 months ago - Pushed: about 5 years ago - Stars: 2,636 - Forks: 1,030

bobxwang/predict-stock-in-spark

Using Spark to predict stocks; the data comes from Sina

Language: Scala - Size: 143 KB - Last synced: 3 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

uosdmlab/spark-nkp

Natural Korean Processor for Apache Spark

Language: Scala - Size: 53.7 KB - Last synced: 3 months ago - Pushed: about 6 years ago - Stars: 53 - Forks: 16

CaioBrainer/Hadoop_Projects

Small projects using tools from the Apache Hadoop ecosystem

Language: Python - Size: 18.6 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

OrvilleX/MachineLearning

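Machine learning tutorial covering machine learning with numpy, sklearn, and tensorflow, as well as using spark and flink to speed up model training. It aims to give readers a fairly complete introduction to machine learning.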

Language: Python - Size: 10.1 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 47 - Forks: 16

Jayant1234/Malware-classification Fork of dsp-uga/sabayon-p1

Language: Jupyter Notebook - Size: 11.4 MB - Last synced: 4 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

derrickburns/generalized-kmeans-clustering

Spark library for generalized K-Means clustering. Supports general Bregman divergences. Suitable for clustering probabilistic data, time series data, high dimensional data, and very large data.

Language: HTML - Size: 7.42 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 284 - Forks: 51

marcocolangelo/Big-Data-processing-and-Analytics

The current repository contains all the code developed during the Big Data processing and Analytics laboratories. Data are processed and analyzed using Hadoop and Spark

Language: Java - Size: 6.1 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

mikerly131/serveUpRecos

Build a mock EMR app and integrate an AI/ML prediction into an encounter workflow

Language: CSS - Size: 54.9 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

polaroidz/sales_prediction

A Production Machine Learning Pipeline for Predicting Future Sales with Spark

Language: Jupyter Notebook - Size: 90.8 KB - Last synced: 5 months ago - Pushed: over 5 years ago - Stars: 3 - Forks: 0

venkateshavula/Evaluate-Spark-MLlib-using-PySpark

A UDF to evaluate Spark-MLlib classification model using PySpark

Language: Python - Size: 4.88 KB - Last synced: 5 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

miguelangel43/Prediction-Flight-Arrivals-Delays-Spark

Application that trains a classifier and predicts flight arrival delays based on past information. Uses the libraries pyspark.ml and pyspark.sql, performs feature engineering, cross-validation and tests various ML algorithms.

Language: Python - Size: 41 KB - Last synced: 7 days ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

amitkumarusc/recommendation-system

A movie recommendation system trained on the MovieLens 20 Million dataset. This system makes use of Collaborative filtering methods to come up with recommendations for a particular user.

Language: Jupyter Notebook - Size: 21.8 MB - Last synced: 2 months ago - Pushed: over 4 years ago - Stars: 13 - Forks: 3

harishpuvvada/BitCoin-Value-Predictor 📦

[NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin

Language: Jupyter Notebook - Size: 3.07 MB - Last synced: 6 months ago - Pushed: over 4 years ago - Stars: 113 - Forks: 29

ShubhamJagtap2000/Spark-Python

🐍💥Python and Spark for Big Data

Language: Jupyter Notebook - Size: 73.2 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

wzhe06/SparkCTR

CTR prediction model based on Spark (LR, GBDT, DNN)

Language: Scala - Size: 35 MB - Last synced: 7 months ago - Pushed: about 4 years ago - Stars: 880 - Forks: 264

omerbsezer/SparkDeepMlpGADow30 📦

A Deep Neural-Network based (Deep MLP) Stock Trading System based on Evolutionary (Genetic Algorithm) Optimized Technical Analysis Parameters (using Apache Spark MLlib)

Language: Java - Size: 213 MB - Last synced: 7 months ago - Pushed: about 6 years ago - Stars: 58 - Forks: 46

polaternez/Introduction-to-Big-Data

Big Data projects for beginners

Language: Java - Size: 4.59 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

databricks/LearningSparkV2

This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Language: Scala - Size: 75.2 MB - Last synced: 7 months ago - Pushed: over 1 year ago - Stars: 1,009 - Forks: 647

alessandrolulli/reforest

Random Forests in Apache Spark

Language: Scala - Size: 71 MB - Last synced: 7 months ago - Pushed: almost 5 years ago - Stars: 72 - Forks: 11

rtahmasbi/Spark

Size: 18.6 KB - Last synced: 7 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 1

MNoorFawi/linear-regression-with-spark

Creating an SBT-based Spark Application to predict Online News Popularity using Linear Regression Algorithm ...

Language: Scala - Size: 18.6 KB - Last synced: 7 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

MNoorFawi/kmeans-clustering-with-spark

Creating an sbt Apache Spark application to perform customer segmentation using Spark MLlib KMeans ...

Language: Scala - Size: 15.6 KB - Last synced: 7 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 1

jingpeicomp/product-category-predict

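Product category prediction. Uses the Spring Boot development framework and the Spark MLlib machine learning framework to train a product category prediction model with TF-IDF and Bayes algorithms. The model automatically predicts a product's category from its name. The project exposes a RESTful interface.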

Language: Java - Size: 41.2 MB - Last synced: 7 months ago - Pushed: almost 3 years ago - Stars: 119 - Forks: 60

lookuut/raif-competition

Spark application for predicting a customer's home and work coordinates from payment transactions

Language: Scala - Size: 27.7 MB - Last synced: 9 months ago - Pushed: about 6 years ago - Stars: 1 - Forks: 0

akarsh3007/Recommendation-Systems

Simple content-based and collaborative filtering algorithm implementations

Language: Python - Size: 1.36 MB - Last synced: 9 months ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

jr2ngb2/yelp_recommender

Language: Jupyter Notebook - Size: 45.9 KB - Last synced: 9 months ago - Pushed: about 5 years ago - Stars: 1 - Forks: 0

xtutran/spark-tutor

Using spark-sql & spark-mllib to tackle Titanic & Movie Recommendation

Language: Scala - Size: 112 KB - Last synced: 9 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

trendyol-data-eng-summer-intern-2019/recom-engine-streaming

Streaming component of the project, which is written with Spark Streaming.

Language: Scala - Size: 15.6 KB - Last synced: 10 months ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 1

agrimrules/brewery

A spark job that processes data scraped from the web

Language: Scala - Size: 17.6 KB - Last synced: 10 months ago - Pushed: almost 6 years ago - Stars: 0 - Forks: 0

NashTech-Labs/Sparkathon

A library having Java and Scala examples for Spark 2.x

Language: Java - Size: 113 MB - Last synced: 7 months ago - Pushed: over 7 years ago - Stars: 7 - Forks: 9

gavalle94/Songs-Recommender

Recommendation System written in Python, using the pySpark framework and other Data Science libraries

Language: HTML - Size: 5.23 MB - Last synced: 10 months ago - Pushed: about 6 years ago - Stars: 1 - Forks: 1

lucashomuniz/Project-14

Development of an AutoML System to Predict the Compressive Strength of Concrete

Language: Python - Size: 42 KB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0

hoangviet148/Foody

Language: Python - Size: 17.6 MB - Last synced: 10 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 1

avcaliani/spark-ml-app

🤖

Language: Jupyter Notebook - Size: 118 KB - Last synced: 10 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

Lucass97/FlightAnalysis

This project implemented a lambda architecture for analyzing domestic flight data in the US from 2009 to 2020. It used Apache Spark for batch processing, Spark Streaming for real-time analysis, and SVM models to predict flight cancellations and delays, with Docker for cluster management and Grafana for real-time visualization.

Language: Jupyter Notebook - Size: 5.66 MB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 1

shalakasaraogi/apache-spark-pig-hive-work

This repository contains Apache Spark, Apache Hive, Apache Pig work

Language: PigLatin - Size: 813 KB - Last synced: 10 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

jacopocav/spark-ifs

Iterative filter-based feature selection on large datasets with Apache Spark

Language: Scala - Size: 130 KB - Last synced: 10 months ago - Pushed: almost 6 years ago - Stars: 3 - Forks: 0

xghan99/bigdata-assignments

This repository consists of code I wrote for CS4225 - Big Data Systems for Data Science

Language: Jupyter Notebook - Size: 3.64 MB - Last synced: 10 months ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

shubham-deb/Neural-Circuit-Tracer

This repository contains the source codes & scripts of my project for Master's level course - CS6240 Parallel Data Processing in Map-Reduce course at College of Computer & Information Science, Northeastern University, Boston MA.

Language: Scala - Size: 663 KB - Last synced: about 2 months ago - Pushed: about 6 years ago - Stars: 1 - Forks: 0

shubham-deb/Spark_Scala_Programs

This repository contains all the Spark Scala programs that I have implemented during my Master's level course - CS6240 Parallel Data Processing in Map-Reduce course at College of Computer & Information Science, Northeastern University, Boston MA.

Language: Makefile - Size: 4 MB - Last synced: about 2 months ago - Pushed: about 6 years ago - Stars: 1 - Forks: 0

esap120/spark-twitter-streaming

Streaming Twitter Sentiment Analysis with Apache Spark

Language: Scala - Size: 8.79 KB - Last synced: 10 months ago - Pushed: over 5 years ago - Stars: 5 - Forks: 1

Siddharth1989/ProspectiveTopUpCustomerPrediction

Developed a model/Spark ML pipeline stream to identify potential customers that may purchase top up services in the future.

Language: Jupyter Notebook - Size: 6.17 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

huzaifakhan04/amazon-product-recommendation-system-web-application-flask-using-mongodb-pyspark-and-apache-kafka

This repository includes a web application connected to a product recommendation system developed with the comprehensive Amazon Review Data (2018) dataset, consisting of nearly 233.1 million records and occupying approximately 128 gigabytes (GB) of data storage, using MongoDB, PySpark, and Apache Kafka.

Language: Jupyter Notebook - Size: 91 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

ging/fiware-ml-supermarket

Demo: Predicting purchase volume in a supermarket using FIWARE

Language: JavaScript - Size: 290 KB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 4 - Forks: 1

cbozan/graduation-project

Graduation project categorizes popular search phrases using Python and Spark and presents them on a website to inspire creators.

Language: Python - Size: 402 KB - Last synced: 11 months ago - Pushed: over 1 year ago - Stars: 4 - Forks: 0

vaslnk/Spotify-Song-Recommendation-ML

UC Berkeley team's submission for RecSys Challenge 2018

Language: Jupyter Notebook - Size: 34.2 MB - Last synced: 9 months ago - Pushed: about 6 years ago - Stars: 82 - Forks: 21

surajsrivathsa/Supervised_Link_Prediction_Using_Spark_and_Neo4j

A project that analyzes authorship graph data from the Microsoft Academic Graph. We calculate different graph features using temporal parameters of the authors and try different classifiers. The final aim is to predict the possibility of a link, or co-authorship, between two authors based on topological graph features, and to assess the feasibility of performing this task on Neo4j and Spark

Language: Scala - Size: 15.8 MB - Last synced: 4 months ago - Pushed: almost 4 years ago - Stars: 5 - Forks: 4

MHassaanButt/Flight-Delays-Prediction

In this project, I used a Decision Tree learning model as the main algorithm. Because of the large volume of flight data, I implemented the project using MRJob, PySpark and Spark's MLlib, then compared the performance and accuracy of those implementations.

Language: Jupyter Notebook - Size: 17.5 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 8 - Forks: 0

avaibh/Twitter-Bot-Detection

Big Data Stack: Spark, Kafka, Elasticsearch and NoSQL

Language: Jupyter Notebook - Size: 304 KB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 4 - Forks: 0

anmolmore/Enzyme-Classifier-Using-ML

Classify enzymes from genomic sequences using spark-ML

Language: Jupyter Notebook - Size: 719 MB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0

svngoku/Pyspark-pour-les-datas-engineers

A hands-on introduction to PySpark for data engineers

Language: Jupyter Notebook - Size: 784 KB - Last synced: 16 days ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

bassrehab/zerofish-imaging

Using the Thunder Library for Image Processing with Spark ML Lib

Language: Python - Size: 1.83 MB - Last synced: 12 months ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0

grishenkovp/apache_spark

Learning Apache Spark. The PySpark library

Language: Jupyter Notebook - Size: 135 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 2 - Forks: 0

satyajeetmaharana/floodprediction

The goal of this project is to identify flood-prone areas, with flood probabilities for counties at a future date, using Spark MLlib.

Language: Scala - Size: 3.46 MB - Last synced: 4 months ago - Pushed: over 4 years ago - Stars: 3 - Forks: 1

forons/BigDataExamples

Code repository for the MSc course "Big Data and Social Networks" of the University of Trento

Language: Jupyter Notebook - Size: 229 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 2 - Forks: 13

amanjeetsahu/Apache-Spark-Tutorials

This repo contains my learnings and practice notebooks on Spark using PySpark (Python Language API on Spark). All the notebooks in the repo can be used as template code for most of the ML algorithms and can be built upon it for more complex problems.

Language: Jupyter Notebook - Size: 20.9 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 4 - Forks: 10

Rohini2505/Lending-Club-Loan-Analysis

Exploratory Data Analysis and ML model building using Apache Spark and PySpark

Language: HTML - Size: 6.26 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 10 - Forks: 12

ABigdataer/MovieRecommendSystem

Spark-based movie recommendation system

Language: HTML - Size: 64.6 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 4 - Forks: 4

alefbt/SparkML-spring-scoring-poc 📦

POC of a scoring REST service for Spark ML Pipelines

Language: Java - Size: 234 KB - Last synced: about 1 year ago - Pushed: about 5 years ago - Stars: 0 - Forks: 0

corvaglia-alessio/big-data-labs 📦

Labs for the course "Big Data: architectures and data analytics" @ Politecnico di Torino a.y. 2021/22

Language: Java - Size: 53.4 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

chaithrakc/credit_card_default_prediction

Analyzing the likelihood of credit card delinquency without using credit scores or credit history

Language: Jupyter Notebook - Size: 4.16 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

huangyueranbbc/Spark_ALS

Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming

Language: Java - Size: 97.7 KB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 89 - Forks: 46

kocharshaivi19/Stock-Analysis-and-Prediction

Financial Forecasting and its correlation with Human Sentiments using Distributed Computing on Spark Framework

Language: Scala - Size: 811 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 4 - Forks: 4

MHassaanButt/Crime-Spark-ML

In this project I stream data and do crime classification using Spark. This dataset contains incidents derived from the SFPD Crime Incident Reporting system. The data ranges from 1/1/2003 to 5/13/2015. I do some data analysis of crime scenes in different areas and with respect to other parameters.

Language: Python - Size: 5.86 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 4 - Forks: 0

spoddutur/spark-ml-dashboard

Spark ML Dashboard built to plug-in and tweak the model params to real-time verify classification results on sample test data

Language: Scala - Size: 20.5 MB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 5 - Forks: 2

jingpeicomp/product-relation-mining

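Product association mining. Uses the Spring Boot development framework and the Spark MLlib machine learning framework to analyze users' shopping cart data with the FP-Growth algorithm and mine associations between products. The project exposes a RESTful interface.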

Language: Java - Size: 68.4 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 24 - Forks: 16

wikistat/AI-Frameworks

Data Science Season 5: Technologies for machine / statistical learning on massive data and Artificial Intelligence

Language: Jupyter Notebook - Size: 646 MB - Last synced: 9 months ago - Pushed: 11 months ago - Stars: 40 - Forks: 41

Wadaboa/production-line-performance

Scala/Spark project, for Languages and Algorithms for Artificial Intelligence class at UNIBO

Language: Scala - Size: 31 MB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0

rishiravikumar-tul-scm/IPL-Analysis

IPL Match Simulation using K-means Clustering and Collaborative Filtering.

Language: Python - Size: 2.64 MB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

FlorentF9/sparkml-som

✨ Spark ML implementation of SOM algorithm (Kohonen self-organizing map)

Language: Scala - Size: 29.3 KB - Last synced: 7 months ago - Pushed: over 2 years ago - Stars: 16 - Forks: 6

ZhipengHong0123/Steam-Game-Analysis

Language: HTML - Size: 3.68 MB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 0 - Forks: 4

samanta-anupam/similar-water-regions

In this project we look at the Global Surface Water Explorer and find patches of areas around the world that are similar to each other, using the European Commission's global surface water satellite dataset

Language: Jupyter Notebook - Size: 346 KB - Last synced: 4 months ago - Pushed: over 6 years ago - Stars: 1 - Forks: 0

tweichle/Spark-for-Big-Data

Spark: Work with Big Data and Build Machine Learning Models at Scale

Language: Jupyter Notebook - Size: 63.5 KB - Last synced: 12 months ago - Pushed: almost 4 years ago - Stars: 2 - Forks: 1

tkachuksergiy/aws-spark-nlp

Works related to recent project on the use of Apache Spark and AWS cloud for NLP task.

Language: Jupyter Notebook - Size: 2.76 MB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 5 - Forks: 0

grishenkovp/databricks

A collection of case studies built on the Databricks platform

Language: Jupyter Notebook - Size: 504 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

omerbsezer/SparkMlpDow30

A new stock trading and prediction model based on a MLP neural network utilizing technical analysis indicator values as features (using Apache Spark MLlib)

Language: Java - Size: 201 MB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 34 - Forks: 21

MostafaToema/Stroke-Prediction-using-Pyspark

Data preparation, visualization, feature engineering, and classification of people who have had a stroke, using PySpark libraries

Language: Jupyter Notebook - Size: 79.1 KB - Last synced: almost 1 year ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0

nbegumc/market-basket-analysis

Finding frequent itemsets using Apriori and FP Growth algorithm on Spark

Language: Jupyter Notebook - Size: 692 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

DavideNardone/TwitterSentimentAnalysis

A Spark Streaming implementation for Online Twitter Sentiment Analysis.

Language: Python - Size: 1.78 MB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 8 - Forks: 3

omerbsezer/SparkMLlibExamples-Scala-

Spark MLlib Examples (Scala)

Language: Java - Size: 371 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 1 - Forks: 2

josemarialuna/ClusterIndices

This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bouldin and WSSSE indices.

Language: Scala - Size: 588 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 10 - Forks: 3

lp-dataninja/SparkML

Detailed notes and code to learn machine learning with Apache Spark.

Language: Jupyter Notebook - Size: 4.06 MB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 12 - Forks: 17

giuseppegambino/Italian-Sentiment-Analysis-with-Spark

Sentiment analysis of Italian tweets with Python and Spark

Language: Python - Size: 6.84 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 10 - Forks: 0

TrainingByPackt/Big-Data-Processing-with-Apache-Spark-eLearning

Efficiently tackle large datasets and perform big data analysis with Spark and Python

Language: Python - Size: 36.1 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 7 - Forks: 6

berksudan/PySpark-Auto-Clustering

Implemented an auto-clustering tool with seed and number of clusters finder. Optimizing algorithms: Silhouette, Elbow. Clustering algorithms: k-Means, Bisecting k-Means, Gaussian Mixture. Module includes micro-macro pivoting, and dashboards displaying radius, centroids, and inertia of clusters. Used: Python, Pyspark, Matplotlib, Spark MLlib.

Language: Python - Size: 73.2 KB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0