FP-Tree Growth Algorithm
Bishal Sharma (13000116117)
INTRODUCTION
➢ One of the fastest and most popular algorithms for frequent itemset mining.
➢ Based on a prefix-tree representation of the given transaction database,
which can save a considerable amount of memory when storing the transactions.
➢ The basic idea can be described as a recursive elimination scheme: in a
preprocessing step, delete all items from the transactions that are not
frequent individually.
➢ First, it compresses the input database by creating an FP-tree instance to
represent the frequent items. After this first step, it divides the compressed
database into a set of conditional databases, each one associated with one
frequent pattern. Finally, each such database is mined separately.
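The preprocessing step described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `frequent_items` and `prune_and_sort`, the toy transactions, and the alphabetical tie-breaking rule are all assumptions made for the example.

```python
from collections import Counter

def frequent_items(transactions, min_support):
    """Count each item once per transaction and keep those meeting min_support."""
    counts = Counter(item for t in transactions for item in set(t))
    return {item: c for item, c in counts.items() if c >= min_support}

def prune_and_sort(transactions, min_support):
    """Drop individually infrequent items, then order the rest by descending support."""
    freq = frequent_items(transactions, min_support)

    def rank(item):
        # Assumption: ties in support are broken alphabetically for determinism.
        return (-freq[item], item)

    return [sorted((i for i in set(t) if i in freq), key=rank) for t in transactions]

# Hypothetical toy database: "d" and "e" occur only once and are eliminated.
transactions = [
    ["a", "b", "d"],
    ["b", "c"],
    ["a", "b", "c"],
    ["b", "e"],
]
pruned = prune_and_sort(transactions, min_support=2)
print(pruned)  # [['b', 'a'], ['b', 'c'], ['b', 'a', 'c'], ['b']]
```

Sorting every transaction in the same support-descending order is what lets transactions share prefixes once they are placed in the tree.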
FP-TREE STRUCTURE
➢ The tree has one root labeled “null”, with a set of item-prefix subtrees as
its children, and a frequent-item-header table.
➢ Each node in the item-prefix subtree consists of three fields:
○ Item-name: registers which item is represented by the node;
○ Count: the number of transactions represented by the portion of the
path reaching the node;
○ Node-link: links to the next node in the FP-tree carrying the same
item-name, or null if there is none.
➢ Each entry in the frequent-item-header table consists of two fields:
○ Item-name: the same item-name as stored in the nodes;
○ Head of node-link: a pointer to the first node in the FP-tree carrying
the item-name.
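One possible way to represent these fields in Python is sketched below. The class names `FPNode` and `HeaderEntry` and the helper `linked_nodes` are hypothetical. The `parent` and `children` attributes are assumptions added so that the tree can be traversed; they are not part of the three fields listed above.

```python
class FPNode:
    """One node of an item-prefix subtree."""
    def __init__(self, item, count=0, parent=None):
        self.item = item        # item-name (None stands for the "null" root)
        self.count = count      # transactions represented by the path reaching this node
        self.node_link = None   # next node carrying the same item-name, or None
        self.parent = parent    # assumption: upward link for reading prefix paths
        self.children = {}      # assumption: item-name -> child node

class HeaderEntry:
    """One entry of the frequent-item-header table."""
    def __init__(self, item):
        self.item = item        # item-name
        self.head = None        # head of node-link: first node carrying the item

def linked_nodes(entry):
    """Yield every node for an item by following the node-links from the header."""
    node = entry.head
    while node is not None:
        yield node
        node = node.node_link

# Hand-built toy tree: item "a" appears on two different branches.
root = FPNode(None)
first_a = FPNode("a", count=2, parent=root)
b = FPNode("b", count=1, parent=root)
second_a = FPNode("a", count=1, parent=b)
root.children = {"a": first_a, "b": b}
b.children = {"a": second_a}
first_a.node_link = second_a

entry = HeaderEntry("a")
entry.head = first_a
total_a = sum(node.count for node in linked_nodes(entry))
print(total_a)  # 3
```

Following the node-links from the header entry reaches every occurrence of an item without walking the whole tree.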
FP-TREE CONSTRUCTION ALGORITHM
Input: A transaction database DB and a minimum support threshold ξ.
Output: FP-tree, the frequent-pattern tree of DB.
Method: The FP-tree is constructed as follows.
➢ Scan the transaction database DB once. Collect F, the set of frequent items,
and the support of each frequent item. Sort F in support-descending order
as FList, the list of frequent items.
➢ Create the root of an FP-tree, T, and label it as “null”. For each transaction
Trans in DB do the following:
○ Select the frequent items in Trans and sort them according to the order
of FList. Let the sorted frequent-item list in Trans be [p | P], where p is
the first element and P is the remaining list. Call insert_tree([p | P], T).
○ The function insert_tree([p | P], T) is performed as follows. If T has a
child N such that N.item-name = p.item-name, then increment N’s
count by 1; else create a new node N, with its count initialized to 1, its
parent link linked to T, and its node-link linked to the nodes with the
same item-name via the node-link structure. If P is nonempty, call
insert_tree(P, N) recursively.
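The construction method above can be sketched in Python roughly as follows. The names `FPNode`, `build_fp_tree`, the dictionary-based header table, and the toy transactions are assumptions made for illustration. The sketch appends each new node to the end of its node-link chain, which is one of several reasonable choices.

```python
from collections import Counter

class FPNode:
    def __init__(self, item, parent):
        self.item = item          # item-name (None for the "null" root)
        self.count = 0
        self.parent = parent
        self.children = {}        # item-name -> child node
        self.node_link = None     # next node with the same item-name

def insert_tree(items, node, header):
    """Insert a sorted frequent-item list [p | P] below `node`."""
    if not items:
        return
    p, rest = items[0], items[1:]
    child = node.children.get(p)
    if child is None:
        child = FPNode(p, node)
        node.children[p] = child
        # Attach the new node to the end of the node-link chain for p.
        if header[p] is None:
            header[p] = child
        else:
            last = header[p]
            while last.node_link is not None:
                last = last.node_link
            last.node_link = child
    child.count += 1              # a new node therefore ends up with count 1
    insert_tree(rest, child, header)

def build_fp_tree(transactions, min_support):
    # Scan: collect frequent items and their supports.
    counts = Counter(i for t in transactions for i in set(t))
    # FList: support-descending order (ties broken alphabetically, an assumption).
    flist = sorted((i for i in counts if counts[i] >= min_support),
                   key=lambda i: (-counts[i], i))
    rank = {item: r for r, item in enumerate(flist)}
    header = {item: None for item in flist}   # item -> head of node-link
    root = FPNode(None, None)
    for t in transactions:
        items = sorted((i for i in set(t) if i in rank), key=rank.get)
        insert_tree(items, root, header)
    return root, header, flist

transactions = [["a", "b", "d"], ["b", "c"], ["a", "b", "c"], ["b", "e"]]
root, header, flist = build_fp_tree(transactions, min_support=2)
print(flist)                        # ['b', 'a', 'c']
print(root.children["b"].count)     # 4
```

In this toy database all four transactions share the prefix "b", so the tree stores it in a single node with count 4.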
FP-GROWTH ALGORITHM
Input: A database DB, represented by an FP-tree constructed according to
the FP-tree construction algorithm, and a minimum support threshold ξ.
Output: The complete set of frequent patterns.
Method: call FP-growth(FP-tree, null).
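Procedure FP-growth(Tree, α) {
    if Tree contains a single prefix path then { // Mining single prefix-path FP-tree
        let P be the single prefix-path part of Tree;
        let Q be the multipath part with the top branching node replaced by a null root;
        for each combination (denoted as β) of the nodes in the path P do
            generate pattern β ∪ α with support = minimum support of nodes in β;
        let freq_pattern_set(P) be the set of patterns so generated;
    }
    else let Q be Tree;
    for each item ai in Q do { // Mining multipath FP-tree
        generate pattern β = ai ∪ α with support = ai.support;
        construct β’s conditional pattern-base and then β’s conditional FP-tree Treeβ;
        if Treeβ ≠ Ø then
            call FP-growth(Treeβ, β);
        let freq_pattern_set(Q) be the set of patterns so generated;
    }
    return(freq_pattern_set(P) ∪ freq_pattern_set(Q) ∪ (freq_pattern_set(P) × freq_pattern_set(Q)))
}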
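A simplified Python sketch of the mining step is given below. It is an illustration under stated assumptions, not the procedure exactly as written above: it omits the single prefix-path shortcut (the P and Q split) and always follows the multipath branch, it rebuilds each conditional FP-tree from a conditional pattern base stored as (items, count) pairs, and it uses a plain list per item in place of the node-link chain. The names `build_tree` and `fp_growth` and the toy transactions are hypothetical.

```python
from collections import defaultdict

class FPNode:
    def __init__(self, item, parent):
        self.item, self.count, self.parent = item, 0, parent
        self.children = {}

def build_tree(weighted_transactions, min_support):
    """Build an FP-tree from (items, count) pairs; return (header, supports)."""
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}
    root = FPNode(None, None)
    header = defaultdict(list)    # assumption: list of nodes replaces node-links
    for items, count in weighted_transactions:
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent

def fp_growth(weighted_transactions, min_support, alpha=frozenset()):
    """Return {frozenset(pattern): support} for all frequent patterns."""
    header, frequent = build_tree(weighted_transactions, min_support)
    patterns = {}
    for item, support in frequent.items():
        beta = alpha | {item}     # pattern beta = item united with alpha
        patterns[beta] = support
        # Conditional pattern base: the prefix path of every node carrying `item`.
        base = []
        for node in header[item]:
            path, ancestor = [], node.parent
            while ancestor is not None and ancestor.item is not None:
                path.append(ancestor.item)
                ancestor = ancestor.parent
            if path:
                base.append((path, node.count))
        if base:                  # mine beta's conditional FP-tree recursively
            patterns.update(fp_growth(base, min_support, beta))
    return patterns

transactions = [["a", "b", "d"], ["b", "c"], ["a", "b", "c"], ["b", "e"]]
result = fp_growth([(t, 1) for t in transactions], min_support=2)
for pattern, support in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pattern), support)
```

With a minimum support of 2, the toy database yields five frequent patterns: {a}, {b}, {c}, {a, b}, and {b, c}.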