An efficient algorithm to discover colossal closed itemsets in high dimensional data /

Bibliographic Details
Main Author: Fatimah Audah Md. Zaki (Author)
Format: Thesis Book
Language: English
Published: Kuala Lumpur : Kulliyyah of Engineering, International Islamic University Malaysia, 2020
Online Access: http://studentrepo.iium.edu.my/handle/123456789/11050
LEADER 03555aam a2200397 i 4500
005 20221005162013.0
008 220418s2020 my a f m 000 0 eng d
040 |a UIAM  |b eng  |e rda 
041 |a eng 
043 |a a-my--- 
100 0 |a Fatimah Audah Md. Zaki  |9 9905  |e author 
245 1 3 |a An efficient algorithm to discover colossal closed itemsets in high dimensional data /  |c by Fatimah Audah Md. Zaki 
264 1 |a Kuala Lumpur :   |b Kulliyyah of Engineering, International Islamic University Malaysia,   |c 2020 
300 |a xvi, 175 leaves :  |b illustrations ;  |c 30 cm. 
336 |a text  |2 rdacontent 
337 |a unmediated  |2 rdamedia 
337 |a computer  |2 rdamedia 
338 |a volume  |2 rdacarrier 
338 |a online resource  |2 rdacarrier 
347 |a text file  |b PDF  |2 rda 
500 |a Abstracts in English and Arabic. 
500 |a "A thesis submitted in fulfilment of the requirement for the degree of Doctor of Philosophy (Engineering)." --On title page. 
502 |a Thesis (Ph.D)--International Islamic University Malaysia, 2020. 
504 |a Includes bibliographical references (leaves 167-174). 
520 |a The current trend of data collection involves a small number of observations with a very large number of variables, known as high dimensional data. Mining such data produces an explosive number of small itemsets, which are less important than colossal (large) ones. As the trend in Frequent Itemset Mining moves towards mining colossal itemsets, it is important to understand the challenges in order to formulate a better method that runs faster, scales better and produces useful and interesting knowledge. For this reason, this thesis proposes two new algorithms, RARE and RARE II, which mine colossal closed itemsets. Both algorithms apply a minimum cardinality threshold to limit the search space and a closure computation method that does not require storing previously discovered itemsets for duplicate checking. These approaches reduce both the memory and the time the algorithms need to finish mining tasks. Algorithm RARE searches the row set lattice in a breadth-first manner, which results in fewer itemset intersections compared to other state-of-the-art algorithms, CARPENTER and IsTa. Although the different thresholds used by CARPENTER and IsTa make direct comparison with RARE difficult, RARE performed better. RARE needs only one-third of the memory used by CARPENTER and one-tenth of that used by IsTa, while requiring the least running time to discover 100% of the closed itemsets in the dataset. Meanwhile, RARE II further reduces itemset intersections by evaluating only the closed row sets in order to mine the next closed itemsets, which improves run time by more than 50% compared to RARE. 
655 |a Theses, IIUM local 
690 |a Dissertations, Academic  |x Kulliyyah of Engineering  |z IIUM  |9 4824 
700 0 |a Nurul Fariza Zulkurnain  |e degree supervisor  |9 9906 
700 1 |a Teddy Surya Gunawan  |e degree supervisor  |9 4447 
710 2 |a International Islamic University Malaysia.  |b Kulliyyah of Engineering  |9 4827 
856 4 |u http://studentrepo.iium.edu.my/handle/123456789/11050 
900 |a sz to asbh-mrr 
942 |2 lcc  |c THESIS 
999 |c 500922  |d 533363 
952 |0 0  |1 0  |2 lcc  |4 0  |7 5  |9 968484  |a IIUM  |b IIUM  |c MULTIMEDIA  |d 2022-08-02  |g 0.00  |o XX(572934.1)  |p 11100429139  |r 1900-01-02  |t 1  |v 0.00  |y THESIS 
952 |0 0  |6 XX(572934.000001)CD  |7 5  |8 THESES  |9 984321  |a IIUM  |b IIUM  |c MULTIMEDIA  |g 0.00  |o XX(572934.1)CD  |p 11100429340  |r 1900-01-02  |t 1  |v 0.00  |y THESIS
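
The summary in field 520 mentions three ideas: a breadth-first search over row sets, a minimum cardinality threshold, and a closure check that needs no store of previously found itemsets. The Python sketch below illustrates those ideas in general terms only. It is not the thesis's RARE or RARE II algorithm, and every name in it (`closed_itemsets`, `min_card`, `rows`) is a hypothetical choice made for this illustration. It assumes each row is a set of items, and it reports an itemset only when the row set that produced it is exactly the set of all rows containing that itemset, so each closed itemset is emitted once.

```python
from typing import FrozenSet, Iterator, List, Tuple

Itemset = FrozenSet[str]


def closed_itemsets(
    rows: List[Itemset], min_card: int
) -> Iterator[Tuple[Itemset, Tuple[int, ...]]]:
    """Yield (itemset, supporting row ids) for closed itemsets of size >= min_card.

    Illustrative row-enumeration sketch; assumed design, not the RARE algorithm.
    """
    n = len(rows)
    # Level 1 of the row set lattice: single rows that meet the threshold.
    level = [((i,), rows[i]) for i in range(n) if len(rows[i]) >= min_card]
    while level:  # breadth-first: finish all row sets of size k before k + 1
        next_level = []
        for row_ids, items in level:
            # Closure check: collect every row that contains the itemset.
            support = tuple(j for j in range(n) if items <= rows[j])
            # Emit only from the closed row set, so no duplicate store is needed.
            if support == row_ids:
                yield items, row_ids
            # Extend with higher-numbered rows; intersections only shrink, so a
            # row set below the threshold can be pruned with all its supersets.
            for j in range(row_ids[-1] + 1, n):
                common = items & rows[j]
                if len(common) >= min_card:
                    next_level.append((row_ids + (j,), common))
        level = next_level
```

For example, with the four rows `{a,b,c,d}`, `{a,b,c}`, `{a,b,e}`, `{c,d}` and `min_card=2`, the generator yields the five closed itemsets `abcd`, `abe`, `abc`, `cd` and `ab`. The single-item closed set `c` is skipped by the threshold. The sketch enumerates row sets exhaustively, so it is only practical when the number of rows is small, which is the setting the summary describes.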