GitHub / berksudan / PySpark-Auto-Clustering
An auto-clustering tool that finds a suitable seed and number of clusters. Optimization methods: Silhouette, Elbow. Clustering algorithms: k-Means, Bisecting k-Means, Gaussian Mixture. The module includes micro-macro pivoting and dashboards that display the radius, centroids, and inertia of the clusters. Built with Python, PySpark, Matplotlib, and Spark MLlib.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/berksudan%2FPySpark-Auto-Clustering
PURL: pkg:github/berksudan/PySpark-Auto-Clustering
Stars: 0
Forks: 0
Open issues: 0
License: None
Language: Python
Size: 64.5 KB
Dependencies parsed at: Pending
Created at: over 3 years ago
Updated at: 6 months ago
Pushed at: 6 months ago
Last synced at: 6 months ago
Topics: bisecting-kmeans, clustering, clustering-analysis, elbow-method, gaussian-mixture, kmeans-clustering, pyspark, silhouette-score, spark, spark-mllib
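
The description mentions searching for a seed and a number of clusters using the silhouette and elbow methods. The following is a minimal sketch of that idea, not the repository's code: it uses plain Python instead of PySpark and Spark MLlib so that it runs without a Spark installation, and the function names (kmeans, silhouette, best_k_and_seed) and the toy data are hypothetical. It picks the (k, seed) pair with the highest mean silhouette and records the inertia, which is the quantity an elbow plot would chart against k.

```python
import math
import random


def kmeans(points, k, seed, iters=50):
    """Plain Lloyd's k-means. Returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # the seed only affects this initial pick

    def assign(cents):
        # Index of the nearest centroid for every point.
        return [min(range(k), key=lambda c: math.dist(p, cents[c])) for p in points]

    for _ in range(iters):
        labels = assign(centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    labels = assign(centroids)
    # Inertia: sum of squared distances from each point to its centroid.
    inertia = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return centroids, labels, inertia


def silhouette(points, labels):
    """Mean silhouette coefficient with Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0  # undefined for a single cluster; treated as 0 in this sketch
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k_and_seed(points, k_values, seeds):
    """Try every (k, seed) pair and keep the one with the highest silhouette."""
    best = None
    for k in k_values:
        for seed in seeds:
            _, labels, inertia = kmeans(points, k, seed)
            score = silhouette(points, labels)
            if best is None or score > best["silhouette"]:
                best = {"k": k, "seed": seed, "silhouette": score, "inertia": inertia}
    return best


# Toy data (an assumption for illustration): three Gaussian blobs in 2-D.
data_rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
points = [
    (cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
    for cx, cy in blob_centers
    for _ in range(20)
]
result = best_k_and_seed(points, k_values=range(2, 6), seeds=range(10))
print(result)
```

Because k-means can converge to different local optima depending on its initial centroids, trying several seeds per k gives each candidate k a fairer chance before the scores are compared. In a Spark setting the same loop would presumably wrap the MLlib estimators and an evaluator rather than these hand-written functions.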