survival8: Supervised classification-oriented measures for Cluster Analysis

Monday, May 2, 2022

Here we discuss the classification-oriented measures for cluster analysis. These measures are supervised: they evaluate the clustering results against known class labels, such as the category of each article.

Our Clusters




Purity

# What is the Purity of cluster '1' here?

# How pure is this cluster?

# What percentage of articles in this cluster belong to 'Metro'?

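Note: The purity of a cluster is the same thing as its precision with respect to its most common class, here 'Metro'.
Purity of cluster 1 = 506 / 677 ≈ 0.747 (506 of the 677 articles in cluster 1 belong to 'Metro')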
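
As a minimal sketch of this calculation, the snippet below computes purity from per-class counts. The function name `purity` and the dictionary layout are illustrative assumptions, not part of the original post. Only the numbers 506 and 677 come from the example; the remaining 171 articles are lumped into a single "other" bucket, which does not change the result because 'Metro' remains the largest class.

```python
def purity(class_counts):
    """Fraction of a cluster's members that belong to its most common class."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("cluster is empty")
    return max(class_counts.values()) / total


# Cluster 1 from the example: 506 of its 677 articles are 'Metro'.
# The other 171 articles are lumped together here (an assumption for brevity).
cluster_1 = {"Metro": 506, "other": 677 - 506}
cluster_1_purity = purity(cluster_1)
print(f"Purity of cluster 1: {cluster_1_purity:.4f}")
```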

Similarly, we can calculate Recall and F-score for these clusters.

Recall: 506 / 943 ≈ 0.537 (506 of the 943 'Metro' articles fall in cluster 1)

F-score: harmonic mean of Precision and Recall.
F-score = 2 * precision * recall / (precision + recall)
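
The three measures can be checked with a few lines of Python. This is only a sketch: the function name `precision_recall_f` and its parameter names are made up for illustration, and the inputs are the counts quoted above (506 'Metro' articles in cluster 1, 677 articles in cluster 1, 943 in the recall denominator).

```python
def precision_recall_f(in_both, cluster_size, class_size):
    """Precision, recall and F-score of one cluster with respect to one class.

    in_both      -- members of the cluster that belong to the class
    cluster_size -- total members of the cluster
    class_size   -- total members of the class
    """
    precision = in_both / cluster_size
    recall = in_both / class_size
    if precision + recall == 0:
        # No overlap at all: define the F-score as 0 to avoid dividing by zero.
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


p, r, f = precision_recall_f(in_both=506, cluster_size=677, class_size=943)
print(f"precision={p:.4f} recall={r:.4f} f_score={f:.4f}")
```

Because the F-score is a harmonic mean, it always lies between the precision and the recall, closer to the smaller of the two.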

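Formula for the entropy of a cluster is: summation( - probability(i) * log2( probability(i) ) )
Here i goes from 1 to N, where N is the number of classes and probability(i) is the fraction of the cluster's members that belong to class i.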

The lower the entropy, the better the clustering.
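
The entropy formula can be sketched in Python as follows. This reads probability(i) as the fraction of a cluster's members that belong to class i, which is an interpretation of the formula above. The function name `cluster_entropy` and all the counts below are hypothetical, chosen only to show how entropy behaves; they are not taken from the clusters in this post.

```python
import math


def cluster_entropy(class_counts):
    """Entropy (in bits) of the class distribution inside one cluster."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("cluster is empty")
    entropy = 0.0
    for count in class_counts:
        if count > 0:  # a class with zero members contributes nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical clusters: each list holds the number of members per class.
pure_cluster = [10, 0, 0]   # every member is from one class
mixed_cluster = [5, 5]      # members split evenly between two classes
skewed_cluster = [8, 1, 1]  # one dominant class plus a few strays

for name, counts in [("pure", pure_cluster),
                     ("mixed", mixed_cluster),
                     ("skewed", skewed_cluster)]:
    print(f"{name}: {cluster_entropy(counts):.4f}")
```

A cluster drawn entirely from one class has entropy 0, and an even split between two classes has entropy 1 bit, which matches the rule of thumb that lower entropy is better.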
