Package: genieclust 1.1.6.9003

genieclust: Fast and Robust Hierarchical Clustering with Noise Points Detection

A retake on the Genie algorithm (Gagolewski, 2021 <doi:10.1016/j.softx.2021.100722>), which is a robust hierarchical clustering method (Gagolewski, Bartoszuk, Cena, 2016 <doi:10.1016/j.ins.2016.05.003>). It is now faster and more memory efficient; determining the whole cluster hierarchy for datasets of 10M points in low-dimensional Euclidean spaces or 100K points in high-dimensional ones takes only a minute or so. It allows clustering with respect to mutual reachability distances, so it can act as a noise point detector or a robustified version of 'HDBSCAN*' (one that is able to detect a predefined number of clusters and hence does not depend on the somewhat fragile 'eps' parameter). The package also features implementations of inequality indices (e.g., Gini and Bonferroni), external cluster validity measures (e.g., the normalised clustering accuracy, the adjusted Rand index, the Fowlkes-Mallows index, and normalised mutual information), and internal cluster validity indices (e.g., the Calinski-Harabasz, Davies-Bouldin, Ball-Hall, Silhouette, and generalised Dunn indices). See also the 'Python' version of 'genieclust' available on 'PyPI', which supports sparse data, more metrics, and even larger datasets.

Authors: Marek Gagolewski [aut, cre, cph], Maciej Bartoszuk [ctb], Anna Cena [ctb], Peter M. Larsen [ctb]

genieclust_1.1.6.9003.tar.gz
genieclust_1.1.6.9003.zip (r-4.5), genieclust_1.1.6.9003.zip (r-4.4), genieclust_1.1.6.9003.zip (r-4.3)
genieclust_1.1.6.9003.tgz (r-4.5-x86_64), genieclust_1.1.6.9003.tgz (r-4.5-arm64), genieclust_1.1.6.9003.tgz (r-4.4-x86_64), genieclust_1.1.6.9003.tgz (r-4.4-arm64), genieclust_1.1.6.9003.tgz (r-4.3-x86_64), genieclust_1.1.6.9003.tgz (r-4.3-arm64)
genieclust_1.1.6.9003.tar.gz (r-4.5-noble), genieclust_1.1.6.9003.tar.gz (r-4.4-noble)
genieclust.pdf | genieclust.html
genieclust/json (API)
NEWS

# Install 'genieclust' in R:
install.packages('genieclust', repos = c('https://gagolews.r-universe.dev', 'https://cloud.r-project.org'))
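
Once the package is installed, a first clustering run might look like the sketch below. It uses only functions from the export list on this page (`gclust`, `genie`, `adjusted_rand_score`); the built-in `iris` dataset and the choice of `k = 3` are assumptions made purely for illustration, and all other arguments are left at their defaults because they may differ between versions.

```r
library(genieclust)

# Toy input (an assumption for illustration): four numeric columns of iris.
X <- as.matrix(iris[, 1:4])

# gclust() computes the whole cluster hierarchy; the result is compatible
# with stats::hclust objects, so stats::cutree() extracts a k-partition.
h <- gclust(X)
y_hier <- cutree(h, k = 3)

# genie() returns the k-partition directly as a vector of cluster labels.
y_flat <- genie(X, k = 3)

# Compare the detected partition with the reference species labels.
ari <- adjusted_rand_score(iris$Species, y_hier)
print(ari)
```

Use `gclust()` when several cuts of the same hierarchy are needed, and `genie()` when only a single partition with a known number of clusters is of interest.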

Bug tracker: https://github.com/gagolews/genieclust/issues

Uses libs:
  • c++: GNU Standard C++ Library v3
  • openmp: GCC OpenMP (GOMP) support library

cluster-analysis, clustering, clustering-algorithm, data-analysis, data-mining, data-science, genie, hdbscan, hierarchical-clustering, hierarchical-clustering-algorithm, machine-learning, machine-learning-algorithms, mlpack, nmslib, python, python3, sparse, cpp, openmp

7.29 score, 61 stars, 5 packages, 13 scripts, 820 downloads, 28 exports, 4 dependencies

Last updated 9 hours ago from: dce6c20820. Checks: 12 OK. Indexed: yes.

Target               Result   Latest binary
Doc / Vignettes      OK       Mar 12 2025
R-4.5-win-x86_64     OK       Mar 12 2025
R-4.5-mac-x86_64     OK       Mar 12 2025
R-4.5-mac-aarch64    OK       Mar 12 2025
R-4.5-linux-x86_64   OK       Mar 12 2025
R-4.4-win-x86_64     OK       Mar 12 2025
R-4.4-mac-x86_64     OK       Mar 12 2025
R-4.4-mac-aarch64    OK       Mar 12 2025
R-4.4-linux-x86_64   OK       Mar 12 2025
R-4.3-win-x86_64     OK       Mar 12 2025
R-4.3-mac-x86_64     OK       Mar 12 2025
R-4.3-mac-aarch64    OK       Mar 12 2025

Exports: adjusted_fm_score, adjusted_mi_score, adjusted_rand_score, bonferroni_index, calinski_harabasz_index, devergottini_index, dunnowa_index, emst_mlpack, fm_score, gclust, generalised_dunn_index, genie, gini_index, mi_score, mst, negated_ball_hall_index, negated_davies_bouldin_index, negated_wcss_index, normalized_clustering_accuracy, normalized_confusion_matrix, normalized_mi_score, normalized_pivoted_accuracy, normalizing_permutation, pair_sets_index, rand_score, silhouette_index, silhouette_w_index, wcnn_index

Dependencies: mlpack, Rcpp, RcppArmadillo, RcppEnsmallen

Readme and manuals

Help Manual

Help page: Topics
Internal Cluster Validity Measures: calinski_harabasz_index, cluster_validity, dunnowa_index, generalised_dunn_index, negated_ball_hall_index, negated_davies_bouldin_index, negated_wcss_index, silhouette_index, silhouette_w_index, wcnn_index
External Cluster Validity Measures and Pairwise Partition Similarity Scores: adjusted_fm_score, adjusted_mi_score, adjusted_rand_score, compare_partitions, fm_score, mi_score, normalized_clustering_accuracy, normalized_confusion_matrix, normalized_mi_score, normalized_pivoted_accuracy, normalizing_permutation, pair_sets_index, rand_score
Euclidean Minimum Spanning Tree [DEPRECATED]: emst_mlpack
Hierarchical Clustering Algorithm Genie: gclust, gclust.default, gclust.dist, gclust.mst, genie, genie.default, genie.dist, genie.mst
Inequality Measures: bonferroni_index, devergottini_index, gini_index, inequality
Minimum Spanning Tree of the Pairwise Distance Graph: mst, mst.default, mst.dist
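
The inequality measures listed above can be tried on small vectors. The sketch below calls the exported `gini_index` and compares it with a base-R reference that assumes the normalised variant of the Gini index, i.e. the sum of pairwise absolute differences divided by (n - 1) times the sum of the elements. The helper `gini_reference` and the sample vectors are hypothetical and serve only as an illustration.

```r
library(genieclust)

# Hypothetical base-R reference for the normalised Gini index:
# sum over all pairs i < j of |x_i - x_j|, divided by (n - 1) * sum(x).
gini_reference <- function(x) {
    n <- length(x)
    # outer() counts every pair twice, hence the division by 2.
    sum(abs(outer(x, x, "-"))) / 2 / ((n - 1) * sum(x))
}

x_equal  <- c(2, 2, 2, 2, 2)    # all elements equal: no inequality
x_single <- c(0, 0, 10, 0, 0)   # one element holds the whole total
x_mixed  <- c(1, 2, 3, 10)

print(gini_index(x_equal))
print(gini_index(x_single))

# The package result should agree with the reference up to rounding.
print(all.equal(gini_index(x_mixed), gini_reference(x_mixed)))
```

The quadratic-time reference is meant for checking small inputs only.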