Monday, May 2, 2022

Supervised classification-oriented measures for Cluster Analysis


Here we discuss the classification-oriented measures for cluster analysis. These are supervised measures: we assume that the known class labels are correct and use them to supervise the evaluation of our clustering results.

Our Clusters

Purity

What is the Purity of cluster '1' here? In other words, how pure is this cluster: what percentage of the articles in this cluster belong to 'Metro'?

Note: Precision and Purity are the same thing.

Purity = 506 / 677

Similarly, we can calculate Recall and F-score for these clusters.

Recall = 506 / 943

F-score: the harmonic mean of Precision and Recall.

F-score = 2 * precision * recall / (precision + recall)

The formula for Entropy for N clusters is:

Entropy = summation( - probability(i) * log2( probability(i) ) )

Here i goes from 1 to N. The lower the entropy, the better the clustering.
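The arithmetic above can be checked with a small Python sketch. This is only an illustration, not the code used to produce the clusters. It assumes that 506 is the number of 'Metro' articles in cluster '1', 677 is the total number of articles in cluster '1', and 943 is the total number of 'Metro' articles overall. The function names are made up for this example.

```python
import math


def precision_recall_f(in_both, cluster_size, class_size):
    """Return (precision, recall, f_score) for one cluster/class pair.

    in_both      : articles that are in the cluster AND in the class (assumed 506)
    cluster_size : all articles in the cluster (assumed 677)
    class_size   : all articles in the class (assumed 943)
    """
    precision = in_both / cluster_size   # same thing as purity
    recall = in_both / class_size
    f_score = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f_score


def entropy(probabilities):
    """summation( - probability(i) * log2( probability(i) ) ).

    Zero probabilities are skipped because log2(0) is undefined;
    their contribution to the sum is taken as 0.
    """
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


precision, recall, f_score = precision_recall_f(506, 677, 943)
print(round(precision, 3), round(recall, 3), round(f_score, 3))
# prints: 0.747 0.537 0.625

# Hypothetical probabilities, only to show the direction of the measure:
print(entropy([0.5, 0.5]))  # evenly mixed
print(entropy([0.9, 0.1]))  # mostly one label, so lower entropy
```

The two entropy calls use invented probabilities, not values from the clusters above. They only show that a more mixed distribution gives a higher entropy, which is why lower is better.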
