clustering-evaluation | Topic | Ecosyste.ms: Repos

Topic: "clustering-evaluation"

scikit-learn-contrib/hdbscan

A high performance implementation of HDBSCAN clustering.

Language: Jupyter Notebook - Size: 27.8 MB - Last synced at: 3 days ago - Pushed at: 26 days ago - Stars: 2,936 - Forks: 515

Clustering4Ever/Clustering4Ever

C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.

Language: Scala - Size: 1.53 MB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 130 - Forks: 14

sandipanpaul21/Clustering-in-Python

Clustering methods in machine learning, with both theory and Python code for each algorithm. Algorithms include K-Means, K-Modes, hierarchical clustering, DBSCAN, and Gaussian Mixture Models (GMM). Interview questions on clustering are included at the end.

Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 81 - Forks: 35

thieu1995/permetrics

Artificial intelligence (AI, ML, DL) performance metrics implemented in Python

Language: Python - Size: 2.36 MB - Last synced at: 7 days ago - Pushed at: 9 months ago - Stars: 76 - Forks: 18

pedrodbs/Aglomera

A hierarchical agglomerative clustering (HAC) library written in C#

Language: C# - Size: 2.62 MB - Last synced at: 10 days ago - Pushed at: over 2 years ago - Stars: 51 - Forks: 16

iralabdisco/pso-clustering

PSO-Clustering algorithm [Matlab code]

Language: MATLAB - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 51 - Forks: 37

BaderLab/scClustViz

Explore and share your scRNAseq clustering results

Language: R - Size: 696 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 50 - Forks: 9

hhromic/python-bcubed

Simple Extended BCubed implementation in Python for clustering evaluation

Language: Python - Size: 18.6 KB - Last synced at: 23 days ago - Pushed at: over 5 years ago - Stars: 50 - Forks: 9

porterehunley/RACplusplus

A high performance implementation of Reciprocal Agglomerative Clustering in C++

Language: Jupyter Notebook - Size: 191 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 43 - Forks: 2

gagolews/clustering-benchmarks

A framework for benchmarking clustering algorithms

Language: Python - Size: 194 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 39 - Forks: 7

philips-software/latrend

An R package for clustering longitudinal datasets in a standardized way, providing interfaces to various R packages for longitudinal clustering, and facilitating the rapid implementation and evaluation of new methods

Language: R - Size: 62.9 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 31 - Forks: 5

eXascaleInfolab/xmeasures

Extremely fast evaluation of the extrinsic clustering measures: various (mean) F1 measures and Omega Index (Fuzzy Adjusted Rand Index) for the multi-resolution clustering with overlaps/covers, standard NMI, clusters labeling

Language: C++ - Size: 1.1 MB - Last synced at: 7 days ago - Pushed at: almost 4 years ago - Stars: 20 - Forks: 7

eXascaleInfolab/clubmark

Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling of Clustering (Community Detection) Algorithms Considering Overlaps (Covers)

Language: Python - Size: 7.66 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 20 - Forks: 2

waynezhanghk/gacluster

Graph Agglomerative Clustering Library

Language: MATLAB - Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 20 - Forks: 6

eXascaleInfolab/GenConvNMI

Generalized Conventional Mutual Information (GenConvMI) - NMI for overlapping (soft, fuzzy) clusters (communities), compatible with standard NMI, pure C++ version (single executable)

Language: C++ - Size: 124 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 19 - Forks: 7

eXascaleInfolab/OvpNMI (fork of aaronmcdaid/Overlapping-NMI)

Overlapping Normalized Mutual Information and Omega Index evaluation for the overlapping community structure produced by clustering algorithms

Language: C++ - Size: 118 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 17 - Forks: 5

hulianyu/CVDD

An Internal Validity Index Based on Density-Involved Distance (2019).

Language: MATLAB - Size: 855 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 16 - Forks: 3

lettier/interactivekmeans

Interactive HTML canvas based implementation of k-means.

Language: JavaScript - Size: 4.64 MB - Last synced at: 2 months ago - Pushed at: about 7 years ago - Stars: 15 - Forks: 2

cleanzr/clevr

Clustering and Link Prediction Evaluation in R

Language: R - Size: 114 KB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 3

alashkov83/S_Dbw

S_Dbw validity index. Adapted for DBSCAN (and similar)

Language: Jupyter Notebook - Size: 2.34 MB - Last synced at: 22 days ago - Pushed at: about 6 years ago - Stars: 10 - Forks: 5

josemarialuna/ClusterIndices

This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bouldin and WSSSE indices.

Language: Scala - Size: 588 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 10 - Forks: 3

josemarialuna/ExternalValidity

This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.

Language: Scala - Size: 146 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

eren-ck/finch

A Python implementation of "FINCH Clustering Algorithm (CVPR 2019)"

Language: Python - Size: 460 KB - Last synced at: 10 days ago - Pushed at: 4 months ago - Stars: 8 - Forks: 2

EtzionR/Clustering-by-Silhouette

Optimize clustering labels using Silhouette Score.

Language: Python - Size: 24.4 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 8 - Forks: 2

Theldus/KValid

A simple clustering evaluation of KMeans for WEKA

Language: Java - Size: 1.74 MB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 8 - Forks: 0

pajaskowiak/clusterConfusion

Clustering validation with ROC Curves

Language: R - Size: 1.2 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 7 - Forks: 1

nejci/Pepelka

Pepelka is a MATLAB toolbox for data clustering and visualization.

Language: MATLAB - Size: 38.9 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 0

adanjoga/cvik-toolbox

CVIK is a toolbox for automatically determining the number of clusters in data clustering problems

Language: MATLAB - Size: 4.87 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 2

Cuky88/APICrawler

Language: Python - Size: 78.7 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 2

ghar1821/ParetoBench

Benchmarking framework based on Pareto front concept

Language: Python - Size: 680 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 3

HawxChen/CloudComputing

MapReduce, Spark, Hadoop, PostgreSQL, Cluster Management

Language: Python - Size: 54.7 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 0

mike-liuliu/Min-Max-Jump-distance

Source code of the paper "Min-Max-Jump distance and its applications."

Language: Jupyter Notebook - Size: 104 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

YihDu/CEMUSA

Language: HTML - Size: 1.47 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

AYSE-DUMAN/Clustering-by-Business-Income-and-Expenses

Load and visualize data and clusters with scatter plots; prepare data for cluster analysis; perform centroid clustering with k-means; interpret clustering results and determine the optimal number of clusters for a given dataset.

Language: Jupyter Notebook - Size: 488 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

zcebeci/fcvalid

Internal Validity Indexes for Fuzzy and Possibilistic Clustering

Language: R - Size: 157 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

gulraizchoudhary/Random-Swap-Clustering-Algorithm

The "Random Swap" algorithm with a random dataset, visuals and example notebooks

Language: Python - Size: 539 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 0

wiebket/delarchetypes

A pipeline to construct residential electricity consumer archetypes from the South African Domestic Electrical Load (DEL) database.

Language: Python - Size: 16.9 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 0

mbari-org/ecoz2-whale-cb

LPC Based K-Means Clustering on Humpback Whale Vocalizations

Language: TeX - Size: 210 MB - Last synced at: 6 days ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0

mike-liuliu/gl_index

Source code of the paper "A New Index for Clustering Evaluation Based on Density Estimation."

Language: Jupyter Notebook - Size: 74.4 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

pajaskowiak/dbcv

Density-Based Clustering Validation

Language: MATLAB - Size: 356 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

caesarmario/Mall-Customers-Clustering-Analysis-using-SAS-Enterprise-Miner

This repository contains a clustering analysis of mall customers, using SAS Enterprise Miner to perform the clustering and identify each cluster's characteristics. A full explanation is available at: https://medium.com/@caesarmario/mall-customers-clustering-analysis-da594bd2718b

Size: 4.39 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

gulraizchoudhary/CentroidIndex

Centroid Index Algorithm for Cluster Level Evaluation

Language: Python - Size: 467 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

alefabris/information-retrieval-service-elasticsearch

Creation of an Information Retrieval Service with ElasticSearch

Language: Python - Size: 1.52 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

rmerzouki/ml

Best Clustering using silhouette_score

Language: Jupyter Notebook - Size: 658 KB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

nejci/PRAr

Partition relevance analysis with the reduction step

Language: MATLAB - Size: 87.2 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 1

giacomomiolo/clustering-procter-and-gamble

Clustering of consumer data focusing on interpretability and actionable business insights.

Language: Jupyter Notebook - Size: 8.68 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 1

eXascaleInfolab/resmerge

Resolution levels clustering merger with filtering and clusters deduplication. Flattens a hierarchy/list of multiple resolutions levels (clusterings) into the single flat clustering (collection), synchronizing the node base and deduplicating.

Language: C++ - Size: 87.9 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

duttashi/clustering 📦

Everything related to unsupervised algorithms in data mining

Language: R - Size: 382 KB - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 0

semoglou/composite_silhouette

A clustering evaluation framework that combines micro- and macro-averaged silhouette scores into a composite metric using statistical weighting.

Language: Jupyter Notebook - Size: 1.71 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

QuantLet/USC (fork of 120BPM/CSC)

Understanding Smart Contracts: Hype or Hope?

Language: Jupyter Notebook - Size: 14.8 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 2

nezumiCodes/bcubed-metrics

This repository contains the source code of the bcubed-metrics library for calculating B-Cubed precision, recall, f1 score and macro f1 score

Language: Python - Size: 3.91 KB - Last synced at: 23 days ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

kreindata/simple-seurat

Simplifying Seurat data processing, clustering, and analysis

Language: R - Size: 27.3 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

ndgigliotti/cluster-optimizer

A GridSearchCV-like hyperparameter optimizer for clustering (no cross-validation).

Language: Jupyter Notebook - Size: 74.2 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

hj-n/btw-dataset-internal-measures

Implementation of between-dataset internal measures

Language: Python - Size: 15.6 KB - Last synced at: 4 days ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

sankarshan-bhat/graph-clustering

This repository provides classic clustering algorithms and various internal cluster quality validation metrics and also visualization capabilities to analyse the clustering results

Language: Python - Size: 2.69 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

visibilia/SIION

Code and other resources for the paper "Exploiting Geographical Data to improve Recommender Systems for Business Opportunities in Urban Areas", published at BRACIS 2019.

Language: Python - Size: 10.1 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

eXascaleInfolab/TInfES

Type Inference Evaluation Scripts & Accessory Apps (used for the StaTIX benchmarking)

Language: Python - Size: 26.1 MB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 2

samnemo/WClust

FentonLab's WClust software used for spike sorting

Language: C++ - Size: 165 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

n-serrette/Cluster_Index

Implementation of some internal and external clustering indexes

Language: Python - Size: 1.01 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

alexandreday/clustering_distance

Distance between clustering assignments. Non-trivial measure weighting L0 and L1 Jaccard norms.

Language: Python - Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 2

semoglou/statistical_composite_silhouette

A clustering evaluation framework that combines micro- and macro-averaged silhouette scores into a composite metric using statistical weighting.

Size: 11.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

neobernad/evaluomeR

evaluomeR is an R package for evaluating the reliability of bioinformatic metrics.

Language: Jupyter Notebook - Size: 32.9 MB - Last synced at: 14 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 1

xGabrielR/cluster-ss

Python cluster-ss Package

Language: Python - Size: 3.05 MB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Kleo-Karap/KPA_thesis

Thesis project for the MSc "Language Technology" of the National and Kapodistrian University of Athens (NKUA)

Language: HTML - Size: 9.75 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

DimFragk/Centroid-clustering-app

Selection of the best centroid-based clustering version with k-medoids and k-means

Language: Python - Size: 260 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Rakhi-TS22/Online-Retail-Store-Analysis

An online retail store is trying to understand the various customer purchase patterns for their firm.

Language: Jupyter Notebook - Size: 5.22 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Gauss-PWr/clustereval

Python package with clustering validation measures.

Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

thanasiskr/K-means-2d-clustering

The implementation of the K-means algorithm for clustering randomly generated 2d-points.

Language: C - Size: 12.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

idastani7/RFM-Clustering

RFM analysis is a type of customer segmentation and behavioral targeting used to help businesses rank and segment customers based on the recency, frequency, and monetary value of a transaction

Language: Jupyter Notebook - Size: 7.15 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

KyriakosPsa/Hyper-Spectral-Image-Clustering

Qualitative and quantitative evaluation of the performance of clustering algorithms in HSI clustering

Language: MATLAB - Size: 2.47 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

KyriakosPsa/Countries-cluster-analysis

The objective of this study is to cluster the countries using socio-economic and health factors that determine the overall development of the country and to characterize each resulting cluster (and, consequently, the countries it comprises) based on the relevant values of the above factors

Language: MATLAB - Size: 2.09 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

moharamfatema/img-segmentation

We intend to perform image segmentation. Image segmentation means that we can group similar pixels together and give these grouped pixels the same label. The grouping problem is a clustering problem. We want to study the use of K-means on the Berkeley Segmentation Benchmark.

Language: Jupyter Notebook - Size: 65.1 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

micheleandreucci/Distributed-Data-Analysis-and-Mining-Project

Language: Jupyter Notebook - Size: 10.2 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

duli-eng/KMeans_sklearn

Language: Python - Size: 2.93 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

VZoche-Golob/ClusterTools

R package with convenience tools for cluster analyses

Language: R - Size: 24.4 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

american-perspectives-capstone/american-attitudes

Our team acquired survey data from the Pew Research Panel, and we explored the drivers of pessimism in American Prospective Attitudes. Understanding what most likely drives pessimistic or optimistic thinking about the future will help business leaders clarify strategies for moving forward and guide expectations of future success in the customers they serve, products offered, investments made, in Marketing and Sales, and throughout their business organization.

Language: Jupyter Notebook - Size: 38.7 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

EtzionR/Kmeans-Simulator

Allows a 2D view of the calculation process of kmeans clustering.

Language: Python - Size: 3.53 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

jonathan-pap/ML_Foundations

Course offered via Coursera.

Language: Jupyter Notebook - Size: 31.4 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

datasci-iopsy/sec-year-proj

Graduate project implementing a pattern-based approach on organizational data

Language: R - Size: 11.7 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

Gulnaz-18/K-means-clustering

The task is to cluster NBA players based on the players' per-game average performance in the 2018-2019 season. The goal is to achieve the best performance by exploring several different clustering methods, feature engineering, distance metrics, and evaluation measures. The NBA players belong to 5 positions on the basketball court: SG (shooting guard), PG (point guard), SF (small forward), PF (power forward), and C (center). The attribute position in the data file thus constitutes proper ground truth for evaluating clustering performance. Therefore, when you cluster the data, the attribute position shouldn't be included.

Language: Jupyter Notebook - Size: 598 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0