Monday, March 7, 2022

Running Weka Apriori on 9_TXN_5_ITEMS Dataset

A CSV file that did not follow Weka's convention of using question marks for missing values failed to load.

Error message (screenshot):
Full-screen view (screenshot):

The file that Weka rejects opens without any issues in LibreOffice:

So we create a custom file modeled on Weka's Supermarket dataset, where "t" marks an item that is present in the transaction and "?" marks a missing value:
tid,item1,item2,item3,item4,item5
T100,t,t,?,?,t
T200,?,t,?,t,?
T300,?,t,t,?,?
T400,t,t,?,t,?
T500,t,?,t,?,?
T600,?,t,t,?,?
T700,t,?,t,?,?
T800,t,t,t,?,t
T900,t,t,t,?,?
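
For longer transaction lists, writing this file by hand gets tedious. Below is a minimal sketch of generating it, assuming the transactions are available as a mapping from TID to the items bought. The names TRANSACTIONS, ITEMS and to_weka_csv are made up for illustration and are not part of Weka.

```python
import csv
import io

# The nine transactions from this post (hypothetical in-memory representation).
TRANSACTIONS = {
    "T100": {"item1", "item2", "item5"},
    "T200": {"item2", "item4"},
    "T300": {"item2", "item3"},
    "T400": {"item1", "item2", "item4"},
    "T500": {"item1", "item3"},
    "T600": {"item2", "item3"},
    "T700": {"item1", "item3"},
    "T800": {"item1", "item2", "item3", "item5"},
    "T900": {"item1", "item2", "item3"},
}
ITEMS = ["item1", "item2", "item3", "item4", "item5"]


def to_weka_csv(transactions, items):
    """Return CSV text with 't' for a present item and '?' for an absent one."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["tid"] + items)
    for tid, bought in transactions.items():
        writer.writerow([tid] + ["t" if item in bought else "?" for item in items])
    return buf.getvalue()


print(to_weka_csv(TRANSACTIONS, ITEMS), end="")
```

The printed text can be saved with a .csv extension and opened in Weka in the same way as the hand-written file above.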

Weka's Apriori Run Information For Small Dataset As Above With TID

=== Run information ===

Scheme:       weka.associations.Apriori -N 10 -T 0 -C 0.9 -D 0.05 -U 1.0 -M 0.1 -S -1.0 -c -1
Relation:     9_txn_5_items
Instances:    9
Attributes:   6
                tid
                item1
                item2
                item3
                item4
                item5
=== Associator model (full training set) ===


Apriori
=======

Minimum support: 0.16 (1 instances)
Minimum metric <confidence>: 0.9
Number of cycles performed: 17

Generated sets of large itemsets:

Size of set of large itemsets L(1): 14

Size of set of large itemsets L(2): 31

Size of set of large itemsets L(3): 25

Size of set of large itemsets L(4): 8

Size of set of large itemsets L(5): 1

Best rules found:

1. item5=t 2 ==> item1=t 2    <conf:(1)> lift:(1.5) lev:(0.07) [0] conv:(0.67)
2. item4=t 2 ==> item2=t 2    <conf:(1)> lift:(1.29) lev:(0.05) [0] conv:(0.44)
3. item5=t 2 ==> item2=t 2    <conf:(1)> lift:(1.29) lev:(0.05) [0] conv:(0.44)
4. item2=t item5=t 2 ==> item1=t 2    <conf:(1)> lift:(1.5) lev:(0.07) [0] conv:(0.67)
5. item1=t item5=t 2 ==> item2=t 2    <conf:(1)> lift:(1.29) lev:(0.05) [0] conv:(0.44)
6. item5=t 2 ==> item1=t item2=t 2    <conf:(1)> lift:(2.25) lev:(0.12) [1] conv:(1.11)
7. tid=T100 1 ==> item1=t 1    <conf:(1)> lift:(1.5) lev:(0.04) [0] conv:(0.33)
8. tid=T100 1 ==> item2=t 1    <conf:(1)> lift:(1.29) lev:(0.02) [0] conv:(0.22)
9. tid=T100 1 ==> item5=t 1    <conf:(1)> lift:(4.5) lev:(0.09) [0] conv:(0.78)
10. tid=T200 1 ==> item2=t 1    <conf:(1)> lift:(1.29) lev:(0.02) [0] conv:(0.22) 
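
The metrics printed next to each rule can be recomputed from plain counts. The sketch below is an assumption about how the figures are derived, not a copy of Weka's code: confidence, lift and leverage follow their usual definitions, while the conviction formula (with +1 in the denominator) is used here because it reproduces the values printed for the rules above. The function name rule_metrics is hypothetical.

```python
def rule_metrics(n, premise, consequence, both):
    """Metrics for a rule A ==> B from counts.

    n           -- total number of instances
    premise     -- instances containing A
    consequence -- instances containing B
    both        -- instances containing A and B
    """
    conf = both / premise
    lift = conf / (consequence / n)
    leverage = both / n - (premise / n) * (consequence / n)
    # Assumed form: matches the conv values in the log above.
    conviction = (premise * (n - consequence) / n) / (premise - both + 1)
    return conf, lift, leverage, conviction


# Rule 1: item5=t (2 rows) ==> item1=t, both in 2 rows, 9 rows in total.
# The count of 6 rows for item1=t is taken from the dataset, not from the rule line.
print([round(m, 2) for m in rule_metrics(9, 2, 6, 2)])
# -> [1.0, 1.5, 0.07, 0.67]
```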

We run Apriori again, this time without the TID column:
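
One way to produce the file without the TID column outside Weka is to drop the first column of the CSV. This is only a sketch using Python's standard csv module; the function name drop_first_column and the two-row sample are for illustration.

```python
import csv
import io


def drop_first_column(csv_text):
    """Return the CSV text with the first column (here: tid) removed."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow(row[1:])
    return out.getvalue()


# First two transactions of the dataset, as a short sample.
sample = "tid,item1,item2,item3,item4,item5\nT100,t,t,?,?,t\nT200,?,t,?,t,?\n"
print(drop_first_column(sample), end="")
```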

Logs from Weka:

=== Run information ===

Scheme:       weka.associations.Apriori -N 10 -T 0 -C 0.9 -D 0.05 -U 1.0 -M 0.1 -S -1.0 -c -1
Relation:     9_txn_5_items_without_tid
Instances:    9
Attributes:   5
                item1
                item2
                item3
                item4
                item5
=== Associator model (full training set) ===


Apriori
=======

Minimum support: 0.16 (1 instances)
Minimum metric <confidence>: 0.9
Number of cycles performed: 17

Generated sets of large itemsets:

Size of set of large itemsets L(1): 5

Size of set of large itemsets L(2): 8

Size of set of large itemsets L(3): 5

Size of set of large itemsets L(4): 1

Best rules found:

1. item5=t 2 ==> item1=t 2    <conf:(1)> lift:(1.5) lev:(0.07) [0] conv:(0.67)
2. item4=t 2 ==> item2=t 2    <conf:(1)> lift:(1.29) lev:(0.05) [0] conv:(0.44)
3. item5=t 2 ==> item2=t 2    <conf:(1)> lift:(1.29) lev:(0.05) [0] conv:(0.44)
4. item2=t item5=t 2 ==> item1=t 2    <conf:(1)> lift:(1.5) lev:(0.07) [0] conv:(0.67)
5. item1=t item5=t 2 ==> item2=t 2    <conf:(1)> lift:(1.29) lev:(0.05) [0] conv:(0.44)
6. item5=t 2 ==> item1=t item2=t 2    <conf:(1)> lift:(2.25) lev:(0.12) [1] conv:(1.11)
7. item1=t item4=t 1 ==> item2=t 1    <conf:(1)> lift:(1.29) lev:(0.02) [0] conv:(0.22)
8. item3=t item5=t 1 ==> item1=t 1    <conf:(1)> lift:(1.5) lev:(0.04) [0] conv:(0.33)
9. item3=t item5=t 1 ==> item2=t 1    <conf:(1)> lift:(1.29) lev:(0.02) [0] conv:(0.22)
10. item2=t item3=t item5=t 1 ==> item1=t 1    <conf:(1)> lift:(1.5) lev:(0.04) [0] conv:(0.33)
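
The sizes of the large itemsets in both runs can be cross-checked with a brute-force count. This is a sketch, not Weka's Apriori implementation: it assumes that only "t" values count as items (with "?" treated as missing), that the minimum support is 1 instance as printed in the logs, and, for the first run, that each tid value acts as one extra item. All names in the code are hypothetical.

```python
from collections import Counter
from itertools import combinations

# The nine transactions from this post, keyed by TID.
TRANSACTIONS = {
    "T100": {"item1", "item2", "item5"},
    "T200": {"item2", "item4"},
    "T300": {"item2", "item3"},
    "T400": {"item1", "item2", "item4"},
    "T500": {"item1", "item3"},
    "T600": {"item2", "item3"},
    "T700": {"item1", "item3"},
    "T800": {"item1", "item2", "item3", "item5"},
    "T900": {"item1", "item2", "item3"},
}


def large_itemset_sizes(transactions, min_count=1):
    """Count, per itemset size, the itemsets that occur in >= min_count rows."""
    support = Counter()
    for txn in transactions:
        items = sorted(txn)
        # Enumerate every non-empty subset of the row (fine for tiny data).
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                support[combo] += 1
    sizes = Counter()
    for itemset, count in support.items():
        if count >= min_count:
            sizes[len(itemset)] += 1
    return dict(sorted(sizes.items()))


without_tid = list(TRANSACTIONS.values())
with_tid = [items | {tid} for tid, items in TRANSACTIONS.items()]

print(large_itemset_sizes(without_tid))  # {1: 5, 2: 8, 3: 5, 4: 1}
print(large_itemset_sizes(with_tid))     # {1: 14, 2: 31, 3: 25, 4: 8, 5: 1}
```

Under these assumptions the counts agree with both logs. The nine extra entries in L(1) of the first run are the nine tid values, which is also where the tid-specific rules 7 to 10 of that run come from.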
Tags: Technology, Machine Learning, Weka
