survival8: Interpretation of Decision Tree J48 output in Weka

Sunday, March 13, 2022
Interpretation of Decision Tree J48 output in Weka

Data Set Glimpse
@RELATION iris

@ATTRIBUTE sepallength  NUMERIC
@ATTRIBUTE sepalwidth   NUMERIC
@ATTRIBUTE petallength  NUMERIC
@ATTRIBUTE petalwidth   NUMERIC
@ATTRIBUTE class        {Iris-setosa,Iris-versicolor,Iris-virginica}

The data section of the ARFF file looks like this:

@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
... 

=== Run information ===

Scheme:       weka.classifiers.trees.J48 -C 0.25 -M 2
Relation:     iris
Instances:    150
Attributes:   5
                sepallength
                sepalwidth
                petallength
                petalwidth
                class
Test mode:    10-fold cross-validation

=== Classifier model (full training set) ===

J48 pruned tree
------------------

petalwidth <= 0.6: Iris-setosa (50.0)
petalwidth > 0.6
|   petalwidth <= 1.7
|   |   petallength <= 4.9: Iris-versicolor (48.0/1.0)
|   |   petallength > 4.9
|   |   |   petalwidth <= 1.5: Iris-virginica (3.0)
|   |   |   petalwidth > 1.5: Iris-versicolor (3.0/1.0)
|   petalwidth > 1.7: Iris-virginica (46.0/1.0)

Number of Leaves  : 	5

Size of the tree : 	9


Time taken to build model: 0.36 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         144               96      %
Incorrectly Classified Instances         6                4      %
Kappa statistic                          0.94  
Mean absolute error                      0.035 
Root mean squared error                  0.1586
Relative absolute error                  7.8705 %
Root relative squared error             33.6353 %
Total Number of Instances              150     

=== Detailed Accuracy By Class ===

                    TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                    0.980    0.000    1.000      0.980    0.990      0.985    0.990     0.987     Iris-setosa
                    0.940    0.030    0.940      0.940    0.940      0.910    0.952     0.880     Iris-versicolor
                    0.960    0.030    0.941      0.960    0.950      0.925    0.961     0.905     Iris-virginica
Weighted Avg.    0.960    0.020    0.960      0.960    0.960      0.940    0.968     0.924     

=== Confusion Matrix ===

      a   b   c   <-- classified as
     49   1   0 |  a = Iris-setosa
      0  47   3 |  b = Iris-versicolor
      0   2  48 |  c = Iris-virginica
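
The "J48 pruned tree" in the output above can be read as nested if/else rules. The sketch below is a hand transcription of that printout into Python, not code produced by Weka; the function name `classify_iris` and its parameter names are made up for illustration. In each leaf, a label such as (48.0/1.0) means that 48 training instances reached the leaf and 1 of them belongs to a different class than the leaf predicts.

```python
def classify_iris(petal_length: float, petal_width: float) -> str:
    """Hand-written transcription of the J48 pruned tree printed by Weka.

    Only petalwidth and petallength appear in the tree, so the sepal
    measurements are not needed. The function name is hypothetical.
    """
    if petal_width <= 0.6:
        return "Iris-setosa"          # leaf (50.0)
    if petal_width <= 1.7:
        if petal_length <= 4.9:
            return "Iris-versicolor"  # leaf (48.0/1.0)
        # petallength > 4.9
        if petal_width <= 1.5:
            return "Iris-virginica"   # leaf (3.0)
        return "Iris-versicolor"      # leaf (3.0/1.0)
    return "Iris-virginica"           # leaf (46.0/1.0)


# First data row of the ARFF file: 5.1,3.5,1.4,0.2,Iris-setosa
print(classify_iris(petal_length=1.4, petal_width=0.2))
```

The five return statements correspond to "Number of Leaves: 5", and together with the four tests they give "Size of the tree: 9". The leaf counts add up to 50 + 48 + 3 + 3 + 46 = 150 instances.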

Interpretation of Model Output From Weka

=== Confusion Matrix ===

      a   b   c   <-- classified as
     49   1   0 |  a = Iris-setosa
      0  47   3 |  b = Iris-versicolor
      0   2  48 |  c = Iris-virginica
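
The headline figures in the Summary section can be recovered from this matrix. The sketch below recomputes the number of correctly classified instances, the accuracy, and the Kappa statistic, assuming Weka's Kappa is Cohen's kappa (observed agreement corrected for the agreement expected by chance). The variable names are illustrative only.

```python
# Confusion matrix from the Weka output: rows are true labels,
# columns are the labels assigned by the classifier.
matrix = [
    [49, 1, 0],   # a = Iris-setosa
    [0, 47, 3],   # b = Iris-versicolor
    [0, 2, 48],   # c = Iris-virginica
]

n = sum(sum(row) for row in matrix)                       # total instances
correct = sum(matrix[i][i] for i in range(len(matrix)))   # diagonal
accuracy = correct / n

# Chance agreement: product of row total and column total per class.
row_totals = [sum(row) for row in matrix]
col_totals = [sum(row[j] for row in matrix) for j in range(len(matrix))]
expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n**2

kappa = (accuracy - expected) / (1 - expected)

print(f"correct={correct} of {n}, accuracy={accuracy:.2%}, kappa={kappa:.2f}")
```

The results agree with the Summary: 144 of 150 instances correct (96 %) and a Kappa statistic of 0.94.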
    
    
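TRUE LABEL and CLASSIFIER LABEL:
Each row of the matrix is a true label and each column is the label assigned by the classifier.
When TRUE LABEL == CLASSIFIER LABEL (the diagonal), the data point is a TRUE POSITIVE for that class.

For Setosa:
True Positives (classified as Setosa and actually Setosa): 49
False Positives (classified as Setosa but not actually Setosa): 0
False Negatives (actually Setosa but classified as another class): 1
True Negatives (neither Setosa nor classified as Setosa): 47 + 3 + 2 + 48 = 100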

For Versicolor:
True Positives: 47
False Positives: 1 + 2 = 3
False Negatives: 0 + 3 = 3
True Negatives: 49 + 0 + 0 + 48 = 97

For Virginica:
True Positives: 48
False Positives: 0 + 3 = 3
False Negatives: 0 + 2 = 2 (actually Virginica but predicted as "not Virginica")
True Negatives: 49 + 1 + 0 + 47 = 97
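
The per-class counts above follow one pattern: for class k, TP is the diagonal cell, FN is the rest of row k, FP is the rest of column k, and TN is everything else. A minimal Python sketch of that bookkeeping is given here; the helper name `one_vs_rest_counts` is hypothetical and not part of Weka.

```python
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Rows are true labels, columns are classifier labels.
matrix = [
    [49, 1, 0],
    [0, 47, 3],
    [0, 2, 48],
]


def one_vs_rest_counts(matrix, k):
    """Return (TP, FP, FN, TN) for class index k."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # rest of row k
    fp = sum(row[k] for row in matrix) - tp  # rest of column k
    tn = total - tp - fn - fp                # cells outside row k and column k
    return tp, fp, fn, tn


for k, label in enumerate(labels):
    tp, fp, fn, tn = one_vs_rest_counts(matrix, k)
    print(f"{label}: TP={tp} FP={fp} FN={fn} TN={tn}")
```

For each class the four counts add up to 150, the total number of instances.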

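Recall: Of the data points that actually belong to class X, how many were also predicted as X?
For example: how many of the Setosa data points were predicted as Setosa?

Recall = (TP) / (TP + FN)
For Setosa = 49 / (49 + 1) = 0.98
For Versicolor = 47 / (47 + 3) = 0.94
For Virginica = 48 / (48 + 2) = 0.96

These values match the TP Rate and Recall columns under "Detailed Accuracy By Class".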
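
The same recall figures can be checked in a few lines of Python. This is only a sketch of the formula above applied to the confusion matrix; TP + FN is simply the row total of each class.

```python
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Rows are true labels, columns are classifier labels.
matrix = [
    [49, 1, 0],
    [0, 47, 3],
    [0, 2, 48],
]

# Recall = TP / (TP + FN): diagonal cell divided by its row total.
recall = [matrix[k][k] / sum(matrix[k]) for k in range(len(labels))]

for label, value in zip(labels, recall):
    print(f"{label}: recall = {value:.3f}")
```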

Precision: Of all the data points predicted as X, how many were actually X?

Precision = (TP) / (TP + FP)
For Setosa = 49 / (49 + 0) = 1.000
For Versicolor = 47 / (47 + 3) = 0.940
For Virginica = 48 / (48 + 3) = 0.941
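
Precision can be checked the same way, using column totals instead of row totals. The sketch below also derives the F-Measure column of the Weka table, assuming it is the usual harmonic mean of precision and recall; the variable names are illustrative.

```python
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Rows are true labels, columns are classifier labels.
matrix = [
    [49, 1, 0],
    [0, 47, 3],
    [0, 2, 48],
]

size = len(labels)
col_totals = [sum(row[k] for row in matrix) for k in range(size)]

# Precision = TP / (TP + FP): diagonal cell divided by its column total.
precision = [matrix[k][k] / col_totals[k] for k in range(size)]
# Recall = TP / (TP + FN): diagonal cell divided by its row total.
recall = [matrix[k][k] / sum(matrix[k]) for k in range(size)]
# F-measure: harmonic mean of precision and recall.
f_measure = [2 * p * r / (p + r) for p, r in zip(precision, recall)]

for label, p, f in zip(labels, precision, f_measure):
    print(f"{label}: precision = {p:.3f}, F-measure = {f:.3f}")
```

Rounded to three decimals, the results reproduce the Precision column (1.000, 0.940, 0.941) and the F-Measure column (0.990, 0.940, 0.950) of the Weka output.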