Topic: "clustering-evaluation"
scikit-learn-contrib/hdbscan
A high performance implementation of HDBSCAN clustering.
Language: Jupyter Notebook - Size: 27.8 MB - Last synced at: 3 days ago - Pushed at: 26 days ago - Stars: 2,936 - Forks: 515

Clustering4Ever/Clustering4Ever
C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.
Language: Scala - Size: 1.53 MB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 130 - Forks: 14

sandipanpaul21/Clustering-in-Python
Clustering methods in machine learning, with both theory and Python code for each algorithm. Algorithms include K-Means, K-Modes, Hierarchical, DBSCAN, and Gaussian Mixture Model (GMM). Interview questions on clustering are added at the end.
Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 81 - Forks: 35

thieu1995/permetrics
Artificial intelligence (AI, ML, DL) performance metrics implemented in Python
Language: Python - Size: 2.36 MB - Last synced at: 7 days ago - Pushed at: 9 months ago - Stars: 76 - Forks: 18

pedrodbs/Aglomera
A hierarchical agglomerative clustering (HAC) library written in C#
Language: C# - Size: 2.62 MB - Last synced at: 10 days ago - Pushed at: over 2 years ago - Stars: 51 - Forks: 16

iralabdisco/pso-clustering
PSO-Clustering algorithm [Matlab code]
Language: MATLAB - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 51 - Forks: 37

BaderLab/scClustViz
Explore and share your scRNAseq clustering results
Language: R - Size: 696 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 50 - Forks: 9

hhromic/python-bcubed
Simple Extended BCubed implementation in Python for clustering evaluation
Language: Python - Size: 18.6 KB - Last synced at: 23 days ago - Pushed at: over 5 years ago - Stars: 50 - Forks: 9

porterehunley/RACplusplus
A high performance implementation of Reciprocal Agglomerative Clustering in C++
Language: Jupyter Notebook - Size: 191 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 43 - Forks: 2

gagolews/clustering-benchmarks
A framework for benchmarking clustering algorithms
Language: Python - Size: 194 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 39 - Forks: 7

philips-software/latrend
An R package for clustering longitudinal datasets in a standardized way, providing interfaces to various R packages for longitudinal clustering, and facilitating the rapid implementation and evaluation of new methods
Language: R - Size: 62.9 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 31 - Forks: 5

eXascaleInfolab/xmeasures
Extremely fast evaluation of the extrinsic clustering measures: various (mean) F1 measures and Omega Index (Fuzzy Adjusted Rand Index) for the multi-resolution clustering with overlaps/covers, standard NMI, clusters labeling
Language: C++ - Size: 1.1 MB - Last synced at: 7 days ago - Pushed at: almost 4 years ago - Stars: 20 - Forks: 7

eXascaleInfolab/clubmark
Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling of Clustering (Community Detection) Algorithms Considering Overlaps (Covers)
Language: Python - Size: 7.66 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 20 - Forks: 2

waynezhanghk/gacluster
Graph Agglomerative Clustering Library
Language: MATLAB - Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 20 - Forks: 6

eXascaleInfolab/GenConvNMI
Generalized Conventional Mutual Information (GenConvMI) - NMI for overlapping (soft, fuzzy) clusters (communities), compatible with standard NMI, pure C++ version (single executable)
Language: C++ - Size: 124 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 19 - Forks: 7

eXascaleInfolab/OvpNMI (fork of aaronmcdaid/Overlapping-NMI)
Overlapping Normalized Mutual Information and Omega Index evaluation for the overlapping community structure produced by clustering algorithms
Language: C++ - Size: 118 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 17 - Forks: 5

hulianyu/CVDD
An Internal Validity Index Based on Density-Involved Distance (2019).
Language: MATLAB - Size: 855 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 16 - Forks: 3

lettier/interactivekmeans
Interactive HTML canvas-based implementation of k-means.
Language: JavaScript - Size: 4.64 MB - Last synced at: 2 months ago - Pushed at: about 7 years ago - Stars: 15 - Forks: 2

cleanzr/clevr
Clustering and Link Prediction Evaluation in R
Language: R - Size: 114 KB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 3

alashkov83/S_Dbw
S_Dbw validity index. Adapted for DBSCAN (and similar)
Language: Jupyter Notebook - Size: 2.34 MB - Last synced at: 22 days ago - Pushed at: about 6 years ago - Stars: 10 - Forks: 5

josemarialuna/ClusterIndices
This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bouldin and WSSSE indices.
Language: Scala - Size: 588 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 10 - Forks: 3

josemarialuna/ExternalValidity
This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.
Language: Scala - Size: 146 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

eren-ck/finch
A Python implementation of "FINCH Clustering Algorithm (CVPR 2019)"
Language: Python - Size: 460 KB - Last synced at: 10 days ago - Pushed at: 4 months ago - Stars: 8 - Forks: 2

EtzionR/Clustering-by-Silhouette
Optimize clustering labels using Silhouette Score.
Language: Python - Size: 24.4 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 8 - Forks: 2

Theldus/KValid
A simple clustering evaluation of KMeans for WEKA
Language: Java - Size: 1.74 MB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 8 - Forks: 0

pajaskowiak/clusterConfusion
Clustering validation with ROC Curves
Language: R - Size: 1.2 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 7 - Forks: 1

nejci/Pepelka
Pepelka is a MATLAB toolbox for data clustering and visualization.
Language: MATLAB - Size: 38.9 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 0

adanjoga/cvik-toolbox
CVIK is a Toolbox for the automatic determination of the number of clusters on data clustering problems
Language: MATLAB - Size: 4.87 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 2

Cuky88/APICrawler
Language: Python - Size: 78.7 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 2

ghar1821/ParetoBench
Benchmarking framework based on Pareto front concept
Language: Python - Size: 680 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 3

HawxChen/CloudComputing
MapReduce, Spark, Hadoop, PostgreSQL, Cluster Management
Language: Python - Size: 54.7 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 0

mike-liuliu/Min-Max-Jump-distance
Source code of the paper "Min-Max-Jump distance and its applications."
Language: Jupyter Notebook - Size: 104 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

YihDu/CEMUSA
Language: HTML - Size: 1.47 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

AYSE-DUMAN/Clustering-by-Business-Income-and-Expenses
Load and visualize data and clusters with scatter plots; prepare data for cluster analysis; perform centroid clustering with k-means; interpret clustering results and determine the optimal number of clusters for a given dataset.
Language: Jupyter Notebook - Size: 488 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

zcebeci/fcvalid
Internal Validity Indexes for Fuzzy and Possibilistic Clustering
Language: R - Size: 157 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

gulraizchoudhary/Random-Swap-Clustering-Algorithm
The "Random Swap" algorithm with a random dataset, visuals and example notebooks
Language: Python - Size: 539 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 0

wiebket/delarchetypes
A pipeline to construct residential electricity consumer archetypes from the South African Domestic Electrical Load (DEL) database.
Language: Python - Size: 16.9 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 0

mbari-org/ecoz2-whale-cb
LPC Based K-Means Clustering on Humpback Whale Vocalizations
Language: TeX - Size: 210 MB - Last synced at: 6 days ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0

mike-liuliu/gl_index
Source code of the paper "A New Index for Clustering Evaluation Based on Density Estimation."
Language: Jupyter Notebook - Size: 74.4 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

pajaskowiak/dbcv
Density-Based Clustering Validation
Language: MATLAB - Size: 356 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

caesarmario/Mall-Customers-Clustering-Analysis-using-SAS-Enterprise-Miner
This repository contains mall customers clustering analysis. This repository also uses SAS Enterprise Miner to perform clustering and identify each cluster's characteristics. Full explanations about this repository can be seen on: https://medium.com/@caesarmario/mall-customers-clustering-analysis-da594bd2718b
Size: 4.39 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

gulraizchoudhary/CentroidIndex
Centroid Index Algorithm for Cluster Level Evaluation
Language: Python - Size: 467 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

alefabris/information-retrieval-service-elasticsearch
Creation of an Information Retrieval Service with ElasticSearch
Language: Python - Size: 1.52 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

rmerzouki/ml
Best Clustering using silhouette_score
Language: Jupyter Notebook - Size: 658 KB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

nejci/PRAr
Partition relevance analysis with the reduction step
Language: MATLAB - Size: 87.2 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 1

giacomomiolo/clustering-procter-and-gamble
Clustering of consumer data focusing on interpretability and actionable business insights.
Language: Jupyter Notebook - Size: 8.68 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 1

eXascaleInfolab/resmerge
Resolution levels clustering merger with filtering and clusters deduplication. Flattens a hierarchy/list of multiple resolutions levels (clusterings) into the single flat clustering (collection), synchronizing the node base and deduplicating.
Language: C++ - Size: 87.9 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

duttashi/clustering 📦
everything related to unsupervised algorithms in data mining
Language: R - Size: 382 KB - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 0

semoglou/composite_silhouette
A clustering evaluation framework that combines micro- and macro-averaged silhouette scores into a composite metric using statistical weighting.
Language: Jupyter Notebook - Size: 1.71 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

QuantLet/USC (fork of 120BPM/CSC)
Understanding Smart Contracts: Hype or Hope?
Language: Jupyter Notebook - Size: 14.8 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 2

nezumiCodes/bcubed-metrics
This repository contains the source code of the bcubed-metrics library for calculating B-Cubed precision, recall, f1 score and macro f1 score
Language: Python - Size: 3.91 KB - Last synced at: 23 days ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

kreindata/simple-seurat
Simplifying Seurat data processing, clustering, and analysis
Language: R - Size: 27.3 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

ndgigliotti/cluster-optimizer
A GridSearchCV-like hyperparameter optimizer for clustering (no cross-validation).
Language: Jupyter Notebook - Size: 74.2 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

hj-n/btw-dataset-internal-measures
The implementation of between dataset internal measures
Language: Python - Size: 15.6 KB - Last synced at: 4 days ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

sankarshan-bhat/graph-clustering
This repository provides classic clustering algorithms and various internal cluster quality validation metrics and also visualization capabilities to analyse the clustering results
Language: Python - Size: 2.69 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

visibilia/SIION
This is the repo containing code and other resources for the paper entitled "Exploiting Geographical Data to improve Recommender Systems for Business Opportunities in Urban Areas" and published at BRACIS 2019.
Language: Python - Size: 10.1 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

eXascaleInfolab/TInfES
Type Inference Evaluation Scripts & Accessory Apps (used for the StaTIX benchmarking)
Language: Python - Size: 26.1 MB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 2

samnemo/WClust
FentonLab's WClust software used for spike sorting
Language: C++ - Size: 165 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

n-serrette/Cluster_Index
Implementation of some internal and external clustering indexes
Language: Python - Size: 1.01 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

alexandreday/clustering_distance
Distance between clustering assignments. Non-trivial measure weighting L0 and L1 Jaccard norms.
Language: Python - Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 2

semoglou/statistical_composite_silhouette
A clustering evaluation framework that combines micro- and macro-averaged silhouette scores into a composite metric using statistical weighting.
Size: 11.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

neobernad/evaluomeR
evaluomeR is an R package for evaluating the reliability of bioinformatic metrics.
Language: Jupyter Notebook - Size: 32.9 MB - Last synced at: 14 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 1

xGabrielR/cluster-ss
Python cluster-ss Package
Language: Python - Size: 3.05 MB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Kleo-Karap/KPA_thesis
Thesis project for the MSc "Language Technology" of the National and Kapodistrian University of Athens (NKUA)
Language: HTML - Size: 9.75 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

DimFragk/Centroid-clustering-app
Selection of the best centroid-based clustering version with k-medoids and k-means
Language: Python - Size: 260 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Rakhi-TS22/Online-Retail-Store-Analysis
An online retail store is trying to understand the various customer purchase patterns for their firm.
Language: Jupyter Notebook - Size: 5.22 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Gauss-PWr/clustereval
Python package with clustering validation measures.
Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

thanasiskr/K-means-2d-clustering
The implementation of the K-means algorithm for clustering randomly generated 2d-points.
Language: C - Size: 12.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

idastani7/RFM-Clustering
RFM analysis is a type of customer segmentation and behavioral targeting used to help businesses rank and segment customers based on the recency, frequency, and monetary value of a transaction
Language: Jupyter Notebook - Size: 7.15 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

KyriakosPsa/Hyper-Spectral-Image-Clustering
Qualitative and quantitative evaluation of the performance of clustering algorithms in HSI clustering
Language: MATLAB - Size: 2.47 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

KyriakosPsa/Countries-cluster-analysis
The objective of this study is to cluster the countries using socio-economic and health factors that determine the overall development of the country and to characterize each resulting cluster (and, consequently, the countries it comprises) based on the relevant values of the above factors
Language: MATLAB - Size: 2.09 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

moharamfatema/img-segmentation
We intend to perform image segmentation. Image segmentation means that we can group similar pixels together and give these grouped pixels the same label. The grouping problem is a clustering problem. We want to study the use of K-means on the Berkeley Segmentation Benchmark.
Language: Jupyter Notebook - Size: 65.1 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

micheleandreucci/Distributed-Data-Analysis-and-Mining-Project
Language: Jupyter Notebook - Size: 10.2 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

duli-eng/KMeans_sklearn
Language: Python - Size: 2.93 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

VZoche-Golob/ClusterTools
R package with convenience tools for cluster analyses
Language: R - Size: 24.4 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

american-perspectives-capstone/american-attitudes
Our team acquired survey data from the Pew Research Panel, and we explored the drivers of pessimism in American Prospective Attitudes. Understanding what most likely drives pessimistic or optimistic thinking about the future will help business leaders clarify strategies for moving forward and guide expectations of future success in the customers they serve, products offered, investments made, in Marketing and Sales, and throughout their business organization.
Language: Jupyter Notebook - Size: 38.7 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

EtzionR/Kmeans-Simulator
Allows a 2D view of the calculation process of kmeans clustering.
Language: Python - Size: 3.53 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

jonathan-pap/ML_Foundations
Course offered via Coursera.
Language: Jupyter Notebook - Size: 31.4 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

datasci-iopsy/sec-year-proj
Graduate project implementing a pattern-based approach on organizational data
Language: R - Size: 11.7 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

Gulnaz-18/K-means-clustering
The task is to cluster NBA players based on the players' per-game average performance in the 2018-2019 season. The goal is to achieve the best performance by exploring several different clustering methods, feature engineering, distance metrics, and evaluation measures. The NBA players belong to 5 positions on the basketball court: SG (shooting guard), PG (point guard), SF (small forward), PF (power forward), and C (center). The attribute position in the data file thus constitutes proper ground truth for evaluating clustering performance. Therefore, when you cluster the data, the attribute position shouldn't be included.
Language: Jupyter Notebook - Size: 598 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

zcebeci/VatAna
Visual Assessment of Clustering Tendency for Finding the Number of Clusters in Datasets
Language: R - Size: 198 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

samnemo/isoitools
C++ tools for calculating cluster quality using information-theoretic measures
Language: C++ - Size: 900 KB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

GustavoRuedaEnriquez/Clustering_JavaAPI
Java API focused on the management of data clusters, specifically Agglomerative and Divisive Hierarchical clustering.
Language: Java - Size: 14.6 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0
