An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: cluster-analysis

semoglou/composite_silhouette

A clustering evaluation framework that combines micro- and macro-averaged silhouette scores into a composite metric using statistical weighting.

Language: Jupyter Notebook - Size: 1.73 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

SergeyFilipov/SHARE-Wave7-Depression-Study

Comprehensive analysis of depressive symptoms among older adults using SHARE Wave 7 data. Includes data preprocessing, factor analysis, clustering of countries, profiling by demographics, and multiple regression modeling.

Language: R - Size: 8.7 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

CaitHRobinson/indoor-air-quality

Building a classification to assess neighbourhood indoor air quality vulnerabilities

Size: 0 Bytes - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

gagolews/genieclust

Genie: Fast and Robust Hierarchical Clustering with Noise Point Detection - in Python and R

Language: C++ - Size: 79.6 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 62 - Forks: 12

Liang-Team/Sequenzo

A fast, scalable, and intuitive Python package in social sequence analysis.

Language: Jupyter Notebook - Size: 73.4 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 6 - Forks: 2

scikit-learn-contrib/hdbscan

A high performance implementation of HDBSCAN clustering.

Language: Jupyter Notebook - Size: 27.8 MB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 2,947 - Forks: 514

kubesphere/kubeeye

KubeEye aims to find various problems on Kubernetes, such as application misconfiguration, unhealthy cluster components and node problems.

Language: Go - Size: 221 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 832 - Forks: 133

Hazim-HF/Business-Analytics

This course introduces techniques to transform raw data into actionable insights for business analysis, covering customer, operation, and people analytics. Customer analytics examines and predicts customer behavior; operation analytics aligns supply with demand and optimizes decisions; people analytics uses data to manage the workforce effectively.

Language: R - Size: 51.4 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

zeynepcindemir/California-Housing-Data-Mining-Project

This project is developed as part of the Data Mining course. It covers detailed EDA and comparative analysis of various data clustering algorithms on the California Housing 1990 Census dataset to evaluate performance and efficiency.

Language: Jupyter Notebook - Size: 24.5 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

heitornolla/Analysis-of-Voter-Abstention-through-Clustering

Analysis of voter abstention in Brazilian elections

Language: Jupyter Notebook - Size: 289 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

daniau23/motor_vehicle_thefts_analysis

An analysis of vehicle thefts

Language: Jupyter Notebook - Size: 11 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

mlr-org/mlr3cluster

Cluster analysis for mlr3

Language: R - Size: 7.93 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 24 - Forks: 6

hugo-strang/silhouette-upper-bound

An upper bound of the Average Silhouette Width.

Language: Python - Size: 20.5 KB - Last synced at: about 15 hours ago - Pushed at: about 16 hours ago - Stars: 3 - Forks: 0

shubhro2002/Comparing-Clustering-Algorithms-by-Customer-Segmentation

This project applies unsupervised machine learning techniques to perform customer segmentation on a marketing dataset. It uses the BIRCH, K-Means and DBSCAN clustering algorithms to group customers based on their demographic and behavioral features, with a focus on interpretability, performance, and scalability.

Language: Jupyter Notebook - Size: 2.23 MB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

kvesta/vesta

A toolkit for static vulnerability analysis and for checking Docker and Kubernetes cluster configurations, based on real-world cloud penetration testing

Language: Go - Size: 3.93 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 199 - Forks: 29

gagolews/genie

Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)

Language: C++ - Size: 410 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 22 - Forks: 3

GermanoGallicchio/PhysioExplorer

Toolbox for statistical analysis of EEG data in 3D space (temporal, spectral, channels) or any subspace

Language: MATLAB - Size: 109 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

TheJJ/ceph-balancer

Efficient Ceph placement optimization, aiming for maximum storage capacity through equal OSD utilization.

Language: Python - Size: 370 KB - Last synced at: 29 days ago - Pushed at: about 2 months ago - Stars: 122 - Forks: 34

instamatic-dev/edtools

Collection of tools for automated processing and clustering of electron diffraction data

Language: Python - Size: 461 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 11 - Forks: 9

gagolews/clustering-benchmarks

A framework for benchmarking clustering algorithms

Language: Python - Size: 194 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 39 - Forks: 7

philips-labs/comparison-clustering-longitudinal-data

Supplementary materials for the manuscript "A comparison of methods for clustering longitudinal data with slowly changing trends" by N. G. P. Den Teuling, S.C. Pauws, and E.R. van den Heuvel, published in Communications in Statistics - Simulation and Computation (2021).

Language: R - Size: 144 KB - Last synced at: 14 days ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 1

pnavaro/GeometricClusterAnalysis.jl

Geometric methods for Cluster Analysis

Language: Julia - Size: 84.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

MicheleCucinella/Statistics-for-High-Dimentional-Data

Assignment for Statistics and High Dimensional Data Exam

Language: R - Size: 752 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Devinterview-io/cluster-analysis-interview-questions

🟣 Cluster Analysis interview questions and answers to help you prepare for your next machine learning and data science interview in 2025.

Size: 31.3 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 12 - Forks: 1

MBAigner/PDFSegmenter

This library builds a graph-representation of the content of PDFs. The graph is then clustered, resulting page segments are classified and returned. Tables are retrieved formatted as a CSV.

Language: Python - Size: 399 KB - Last synced at: 26 days ago - Pushed at: almost 5 years ago - Stars: 21 - Forks: 3

Cyberoctane29/Optimizing-K-in-K-means-A-Visual-and-Quantitative-Exploration

Exploring K-means clustering through image color compression and high-dimensional data analysis. Learn how pixel grouping in RGB space builds intuition, while inertia/silhouette scores optimize clusters. Demonstrates K-means' power to reveal patterns in both visual and abstract data by optimizing groupings and selecting ideal k-values.

Language: Jupyter Notebook - Size: 17.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

edelweiss611428/dissimilarities

An R Package for Creating, Manipulating, and Subsetting "dist" Objects

Language: R - Size: 560 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

elki-project/elki

ELKI Data Mining Toolkit

Language: Java - Size: 55 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 813 - Forks: 325

sellyrk/Analisis-Segmentasi-Karyawan

This is an employee clustering project I built to meet the graduation requirements of Dicoding's Penerapan Data Science (Applied Data Science) class

Language: Jupyter Notebook - Size: 4.71 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Chaganti-Reddy/AI-Prototype-Customer-Segmentation

Artificial Intelligence Prototype product based model for Customer Segmentation in E-Commerce Industry.

Language: Jupyter Notebook - Size: 13.8 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

lachhebo/pyclustertend

A python package to assess cluster tendency

Language: Python - Size: 6.2 MB - Last synced at: 20 days ago - Pushed at: 6 months ago - Stars: 48 - Forks: 11

Shanmukhi1920/Heart-Failure-Prediction

Explored Heart Failure Prediction Dataset and performed Classification and Clustering on the data using R.

Size: 37.1 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

philips-software/latrend

An R package for clustering longitudinal datasets in a standardized way, providing interfaces to various R packages for longitudinal clustering, and facilitating the rapid implementation and evaluation of new methods

Language: R - Size: 64.4 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 31 - Forks: 5

0quaaD/Museum-Clustering-Model

A clustering model for museums in Canada

Language: Python - Size: 7.08 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

billmouzakis/Disseratation-code

This repository contains an analysis of COVID-19 data for many countries over the period 2020-2022

Language: R - Size: 54.7 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

hetuvpatel/ML-Diabetes-Risk-Progression-Stage

Machine learning project analyzing diabetes risk progression using K-Means and Hierarchical clustering techniques on the Pima Indian Diabetes dataset. 🧠📊

Language: Python - Size: 2.44 MB - Last synced at: 1 day ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

leassis91/allmart

All Mart project using Kedro Framework.

Language: Python - Size: 27.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

clusterking/clusterking

Cluster sets of histograms/curves, in particular kinematic distributions in high energy physics.

Language: Python - Size: 2.95 MB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 12 - Forks: 2

KennyBanwo23/customer-segmentation

The project’s goal is to conduct customer segmentation analysis to identify key customer groups, predict future purchasing behaviours, and devise personalized customer experiences to improve customer engagement and retention rates.

Language: Jupyter Notebook - Size: 25.5 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

erda-project/kubeprober

Large-scale Kubernetes cluster diagnostic tool.

Language: Go - Size: 268 MB - Last synced at: 29 days ago - Pushed at: over 1 year ago - Stars: 143 - Forks: 39

PhoenixDD/Cheapest-Flights-bot

A bot built with Python and Selenium that mines data on the cheapest flights using the Google Flights API

Language: Python - Size: 5.86 KB - Last synced at: 15 days ago - Pushed at: about 6 years ago - Stars: 78 - Forks: 27

epigen/unsupervised_analysis

A general purpose Snakemake workflow and MrBiomics module to perform unsupervised analyses (dimensionality reduction & cluster analysis) and visualizations of high-dimensional data.

Language: Python - Size: 56 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 26 - Forks: 4

fatimagulomova/iu-projects

IU Projects

Language: Jupyter Notebook - Size: 127 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

AlexandrovLab/SigProfilerClusters

Tool for analyzing the inter-mutational distances between SNV-SNV and INDEL-INDEL mutations. Tool separates mutations into clustered and non-clustered groups on a sample-dependent basis.

Language: Python - Size: 1.7 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 12 - Forks: 1

bajor/categorical-cluster

My algo and library for clustering categorical data

Language: Python - Size: 863 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

arsilva87/biotools

biotools: Tools for Biometry and Applied Statistics in Agricultural Science (R package)

Language: R - Size: 3.02 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

volfpeter/localclustering

Python 3 implementation and documentation of the Hermina-Janos local graph clustering algorithm.

Language: Python - Size: 2.48 MB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 22 - Forks: 1

davidj-brewster/agentic-e2e-dicom-medical-pipeline

Multi-agent adaptive FreeSurfer/FSL registration, ROI detection, segmentation, clustering, visualisation and anomaly detection system for DICOM and NIfTI images and sequences

Language: Python - Size: 273 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Cyberoctane29/Penguins-Data-Analysis-and-Modeling

This project applies statistical modeling, including single and multiple linear regression, using Python. It covers exploratory data analysis, data cleaning, and modeling with pandas, NumPy, statsmodels, and scikit-learn. Regression analyzes relationships, while clustering identifies patterns. Seaborn visualizations enhance interpretability.

Language: Jupyter Notebook - Size: 5.15 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

semoglou/statistical_composite_silhouette

A clustering evaluation framework that combines micro- and macro-averaged silhouette scores into a composite metric using statistical weighting.

Size: 11.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Mahdi-s/Twitter_Embedding_Clustering_NL2SQL

Embedding, Clustering, and NL 2 SQL tool for USC's HUMANS Lab's twitter dataset on 2024 Election posts.

Language: Jupyter Notebook - Size: 411 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

LuisScoccola/persistable

density-based clustering for exploratory data analysis based on multi-parameter persistence

Language: Python - Size: 11.3 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 38 - Forks: 2

abh2050/Customer_support_intelligence

An interactive Streamlit app leveraging NLP embeddings and Gemini AI to analyze, classify, and provide insights on customer support issues. It enables trend tracking, root cause analysis, and AI-powered solution recommendations, utilizing NLP-based cluster analysis for semantic grouping.

Language: Jupyter Notebook - Size: 6.25 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

microsoft/dstoolkit-forecasting

Template for forecasting data science projects and identifying consumption profiles in time series

Language: Jupyter Notebook - Size: 5.79 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 4

Beliavsky/Burkardt-Fortran-90-codes

John Burkardt's Fortran 90 codes and documentation

Language: Fortran - Size: 35.3 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 24 - Forks: 1

RamiKrispin/ts-cluster-analysis-r

Materials for the Analyzing Time Series at Scale with Cluster Analysis in R Workshop

Language: HTML - Size: 89 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 19 - Forks: 4

ProntoSbinalla/Python

Language: Jupyter Notebook - Size: 2.03 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ear-team/bambird

Unsupervised classification to improve the quality of a bird song recording dataset. https://doi.org/10.1016/j.ecoinf.2022.101952

Language: Python - Size: 207 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 26 - Forks: 6

milaan9/Clustering_Algorithms_from_Scratch

Implementing Clustering Algorithms from scratch in MATLAB and Python

Language: Jupyter Notebook - Size: 6.5 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 201 - Forks: 179

Beliavsky/Burkardt-Fortran-90

Classification of John Burkardt's many Fortran 90 codes

Size: 29.9 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 46 - Forks: 10

xGabrielR/cluster-ss

Python cluster-ss Package

Language: Python - Size: 3.05 MB - Last synced at: 20 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Robertoarce/Clustering_tools

Clustering tools

Language: Jupyter Notebook - Size: 2.64 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Evintkoo/Functional-Group-Analysis

Analysis of bond characteristic in high drug-likeness score compound

Language: Jupyter Notebook - Size: 172 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

uef-machine-learning/tspgclu

Fast but accurate approximation of Ward's agglomerative clustering using a fully connected TSP graph

Language: C - Size: 8.18 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

philips-labs/demo-clustering-longitudinal-data

Supplementary materials for the manuscript "Clustering of longitudinal data: A tutorial on a variety of approaches" by N. G. P. Den Teuling, S.C. Pauws, and E.R. van den Heuvel (2021)

Language: R - Size: 18.6 KB - Last synced at: 14 days ago - Pushed at: over 3 years ago - Stars: 11 - Forks: 9

aliciagilmatute/Machine-Learning--unsupervised-

Unsupervised Machine Learning projects

Language: Jupyter Notebook - Size: 2.37 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

HarikrishnanK9/Health_Profile_Analysis

Health Profile Analysis: Revealing Disorder Patterns, Medication Guidance and Risk Classification - ML Project

Language: Jupyter Notebook - Size: 3.42 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

DRehan003/Cluster_Analysis_of_Smart_Contract_Risks

I performed cluster analysis on a dataset of smart contracts in Python to identify similar risk profiles.

Language: Python - Size: 1.31 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

c3duan/Time-Series-Classifier

Anomaly Classification in Time Series Data

Language: Jupyter Notebook - Size: 11.8 MB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 0

MohdRasmil7/Customer-Insights-and-Segmentation-with-Machine-Learning

Analyze customer data to segment and understand your ideal customers. This app helps businesses tailor products and marketing strategies for different customer segments using detailed analysis and clustering. 🚀

Language: Jupyter Notebook - Size: 13.4 MB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

zcebeci/fcvalid

Internal Validity Indexes for Fuzzy and Possibilistic Clustering

Language: R - Size: 157 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 3

marianamartiyns/RFM-Cluster-Analysis

Customer behavior and sales analysis, including data cleaning, RFM calculation, churn analysis and customer clustering.

Language: Jupyter Notebook - Size: 1.73 MB - Last synced at: 18 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

tnleite/real-estate-opportunities-analysis

This repository presents an analysis of opportunities in the real estate market, combining time series, clustering, and forecasting to identify states with the greatest growth potential and to guide efficient expansion strategies.

Language: Jupyter Notebook - Size: 15.4 MB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

mike-liuliu/Min-Max-Jump-distance

Source code of the paper "Min-Max-Jump distance and its applications."

Language: Jupyter Notebook - Size: 104 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 4 - Forks: 0

anasantosdev/MQAM-ACH2036

Repository containing the code for the Quantitative Methods for Multivariate Analysis (Métodos Quantitativos para Análise Multivariada) course at the School of Arts, Sciences and Humanities (EACH) of the University of São Paulo (USP).

Language: R - Size: 993 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

QMUL/poLCAParallel Fork of dlinzer/poLCA

C++ Implementation of poLCA (R package)

Language: C++ - Size: 1 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

PavanSugreev04/Document-Classification

Automatically label documents based on the textual content present near key areas of interest

Language: Python - Size: 6.06 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

eZWALT/ADSDB-DS-EtE-Project

MDS-FIB Algorithms, Data Structures and Databases (ADSDB) Subject 2024-25 Q1, Data-Science End-to-End project path

Language: Jupyter Notebook - Size: 14.8 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

NghiaNT3110/customer_segment_9

This is a DA Project about Clustering based on the Customer Mall datasets from Kaggle

Language: Jupyter Notebook - Size: 2.92 MB - Last synced at: 5 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

AiCorsair/Dataquest-Data-Science-Analysis-Projects

A repository dedicated to storing guided projects completed while learning data science concepts with Dataquest.

Language: Jupyter Notebook - Size: 74 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 11 - Forks: 3

shafira-khoirunnisah/Cluster-analysis

This is my course project on cluster analysis of districts/cities in West Java based on socio-economic conditions, using the K-Means method in RStudio.

Size: 12.7 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

hermesespinola/FOA-Kmeans-Color-Image-Segmentation

Clustering analysis using an evolutionary optimization algorithm based on nature, Forest Optimization Algorithm

Language: MATLAB - Size: 627 KB - Last synced at: 3 days ago - Pushed at: almost 6 years ago - Stars: 21 - Forks: 9

bkrai/Top-10-Machine-Learning-Methods-With-R

Includes top ten must know machine learning methods with R.

Size: 82 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 77 - Forks: 66

egy1st/denmune-clustering-algorithm Fork of scikit-learn-contrib/denmune-clustering-algorithm

DenMune is a clustering algorithm that can find clusters of arbitrary size, shape and density in two dimensions. Higher-dimensional data are first reduced to 2-D using t-SNE. The algorithm relies on a single parameter K (the number of nearest neighbors). The results show the superiority of DenMune. Enjoy the simplicity and the power of DenMune.

Language: Jupyter Notebook - Size: 73.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 4 - Forks: 0

V-MalM/Stock-Clustering-and-Prediction Fork of dschoen24/Stock-Prediction

Build, train and test an LSTM model to forecast the next-day 'Close' price, and create diverse stock portfolios using k-means clustering to detect patterns in stocks that move similarly with an underlying trend, i.e., how stocks trend together over a given period. Deploy the findings to an app with an interactive dashboard that predicts the next-day 'Close' for any given stock.

Language: Jupyter Notebook - Size: 58.1 MB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 1

clusterfreak/ClusterCore

Core classes for cluster analysis - Java - Fuzzy-C-Means and Possibilistic-C-Means Algorithms in an only marginally modified version from 2005.

Language: Java - Size: 4.96 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

ITRoselloSignoris/Fraud-Detection-and-Prevention-Model

Final Project for Edvai's Data Science & MLOps Bootcamp

Language: Jupyter Notebook - Size: 1.49 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 1

noorulhudaajmal/Customer-Segmentation-Analysis

Customer segmentation and analysis of purchasing behaviour

Language: Jupyter Notebook - Size: 1.16 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

thenomaniqbal/K-MeansClustering-AirlineCustomerValueAnalysis

K-means Clustering for Airline Customer Value Analysis is a data-driven project focused on segmenting airline customers based on their behavior and value using K-means clustering. It includes an introduction to customer segmentation, dataset preprocessing, clustering methodology, results analysis, and actionable business insights.

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

NicolasH2/ggdendroplot

dendrograms in ggplot2.

Language: R - Size: 626 KB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 10 - Forks: 0

FatimaUriarte/R

R markdown files employed in my research

Size: 893 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

george-gca/asreview-top2vec Fork of asreview/semantic-clusters

Semantic Clustering for ASReview Datasets using Top2Vec

Language: Python - Size: 18.7 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

nafisalawalidris/911-Call-Analysis

The 911 Call Analysis project explores and visualises emergency call data to uncover patterns and trends. It includes data preparation, exploratory analysis, visualising call volume and reasons, and generating heatmaps. Users can customize the code for their dataset. The project relies on libraries like Pandas, NumPy, Matplotlib, Seaborn, and SciPy.

Language: Jupyter Notebook - Size: 24.1 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 0

Forest-Lover/LogStatistic

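Log classification and frequency statistics: a tool for filtering, categorizing, and counting logs. Supports multiple input formats and writes output to files (excel, txt) and to standard output. The config and output directories already contain some real configurations and outputs (without the log source files) for reference. The project's configuration is highly flexible; the main work lies in writing suitable filtering and parsing rules for the actual log format.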

Language: Python - Size: 3.81 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

kardevroop/CSCI723GraphDB

Implementation of the Label Propagation algorithm with a slight variation in the stopping criteria.

Language: Python - Size: 7.43 MB - Last synced at: 18 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

SalmanFarizN/pybdynamics

Python package for simulation and data analysis of interacting colloidal particle systems.

Language: Python - Size: 74.2 KB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

dmattek/ARCOS

An R package to detect collective spatio-temporal phenomena

Language: R - Size: 50.8 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 11 - Forks: 3

PaulRegnier/PICAFlow

PICAFlow: a complete R workflow dedicated to flow/mass cytometry data, from data pre-processing to deep and comprehensive analysis.

Language: R - Size: 200 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 2 - Forks: 2

s-yazhini/PySpark-and-SparkSQL

In Azure DataBricks

Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Paulj1989/player-similarities

Using FB Ref player data to measure player similarity within positions, using clustering methods

Language: Jupyter Notebook - Size: 6.16 MB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Related Keywords
cluster-analysis (492), clustering (139), clustering-algorithm (90), machine-learning (90), python (84), data-science (60), r (57), data-visualization (46), data-analysis (38), k-means-clustering (32), kmeans-clustering (32), unsupervised-learning (28), machine-learning-algorithms (28), data-mining (25), clustering-evaluation (24), cluster (24), pca (21), unsupervised-machine-learning (20), visualization (19), clustering-methods (19), exploratory-data-analysis (19), k-means (18), logistic-regression (15), statistics (15), segmentation (14), pandas (14), hierarchical-clustering (14), time-series (13), customer-segmentation (13), time-series-analysis (13), kmeans (13), random-forest (12), pca-analysis (12), jupyter-notebook (12), numpy (12), principal-component-analysis (12), dimensionality-reduction (12), scikit-learn (11), python3 (11), nlp (11), deep-learning (10), factor-analysis (10), dbscan (10), ggplot2 (10), regression-analysis (9), text-mining (9), classification (9), rstudio (8), covid-19 (8), seaborn (8), supervised-learning (8), matplotlib (8), regression-models (8), decision-trees (7), dbscan-clustering (7), sklearn (7), eda (7), nlp-machine-learning (6), python-3 (6), forecasting (6), feature-selection (6), outlier-detection (6), linear-regression (6), knn (6), knn-classification (6), association-rules (6), clusters (6), data-cleaning (6), clustering-algorithms (5), spatial-analysis (5), random-forest-classifier (5), shiny (5), multivariate-analysis (5), outliers (5), silhouette-score (5), ensemble-learning (5), umap (5), feature-engineering (5), dendrogram (5), spark (5), ml (5), anomaly-detection (5), tableau (5), marketing (5), webscraping (5), clustering-analysis (5), analysis (5), genomics (5), gaussian-mixture-models (5), xgboost (4), recommendation-system (4), r-programming (4), artificial-intelligence (4), network-analysis (4), java (4), shinydashboard (4), decision-tree (4), unsupervised-clustering (4), marketing-analytics (4), hadoop (4)