Improved collaborative filtering using clustering and association rule mining on implicit data

Recommender systems have recently become more significant because of their ability to help users decide among appropriate choices. Collaborative Filtering (CF) is the most successful and most widely applied technique in recommender system design, where items are recommended to an active user based...

Bibliographic Details
Main Author: Najafabadi, Maryam Khanian
Format: Thesis
Language: English
Published: 2016
Subjects: QA75 Electronic computers. Computer science; T58.5-58.64 Information technology
Online Access: http://eprints.utm.my/id/eprint/98093/1/MaryamKhanianNajafabadiPAIS2016.pdf
id my-utm-ep.98093
record_format uketd_dc
spelling my-utm-ep.98093 2022-11-14T09:52:54Z Improved collaborative filtering using clustering and association rule mining on implicit data 2016 Najafabadi, Maryam Khanian QA75 Electronic computers. Computer science T58.5-58.64 Information technology 2016 Thesis http://eprints.utm.my/id/eprint/98093/ http://eprints.utm.my/id/eprint/98093/1/MaryamKhanianNajafabadiPAIS2016.pdf application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:144356 phd doctoral Universiti Teknologi Malaysia, Advanced Informatics School Advanced Informatics School
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic QA75 Electronic computers. Computer science
T58.5-58.64 Information technology
spellingShingle QA75 Electronic computers. Computer science
T58.5-58.64 Information technology
Najafabadi, Maryam Khanian
Improved collaborative filtering using clustering and association rule mining on implicit data
description Recommender systems have recently become more significant because of their ability to help users decide among appropriate choices. Collaborative Filtering (CF) is the most successful and most widely applied technique in recommender system design: items are recommended to an active user based on the past rating records of like-minded users. Unfortunately, CF may produce poor recommendations when user ratings on items are very sparse (an insufficient number of ratings) compared with the huge number of users and items in the user-item matrix. When user ratings on items are lacking, implicit feedback is used to profile a user’s item preferences. Implicit feedback can indicate users’ preferences by providing more evidence and information through observations of users’ behavior. Data mining, which is the focus of this research, can predict a user’s future behavior without item evaluation and can also analyze the user’s preferences. To investigate the state of research in CF and implicit feedback, a systematic literature review was conducted on published studies in these topic areas. To investigate the user activities that influence a recommender system based on the CF technique, a critical observation of public recommendation datasets was carried out. To overcome the data sparsity problem, this research applies users’ implicit interaction records with items and processes massive data efficiently by employing association rules mining (the Apriori algorithm). It uses item repetition within a transaction as an input for association rules mining, which can achieve high recommendation accuracy. To do this, a modified preprocessing step was employed to discover similar interest patterns among users.
In addition, a clustering technique (hierarchical clustering) was used to reduce the size of the data and the dimensionality of the item space in order to improve the performance of association rules mining. Then, similarities between items based on their features were computed to make recommendations. Experiments were conducted, and the results were compared with basic CF and with other extended versions of CF, including K-Means Clustering, Hybrid Representation, and Probabilistic Learning, using a public dataset, the Million Song Dataset. The experimental results demonstrate that the proposed technique improves on the basic CF technique by an average of 20% in terms of Precision, Recall and F-measure. The technique also achieves better performance (an average improvement of 15% in Precision and Recall) than the other extended versions of CF, even when the data is very sparse.
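The rule mining step named in the description can be illustrated with a minimal sketch. It is not the thesis's implementation: the record does not specify the modified preprocessing, so the repetition threshold (`min_plays`), the function names `binarize` and `pair_rules`, and the restriction to item pairs are all assumptions made for illustration. The sketch turns per-user play counts (implicit feedback) into transactions, applies the Apriori pruning idea (only items that are frequent on their own can form frequent pairs), and keeps rules that meet minimum support and confidence.

```python
from collections import Counter
from itertools import combinations


def binarize(play_counts, min_plays=2):
    """Build one transaction per user from implicit play counts.

    Assumption: an item counts as "preferred" only if the user repeated it
    at least min_plays times. The thesis's own preprocessing may differ.
    """
    return [
        {item for item, n in counts.items() if n >= min_plays}
        for counts in play_counts
    ]


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Mine rules x -> y over item pairs as (x, y, support, confidence)."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    # Apriori pruning: a pair can only be frequent if both items are frequent.
    frequent = {i for i, c in item_counts.items() if c / n >= min_support}
    pair_counts = Counter(
        pair
        for t in transactions
        for pair in combinations(sorted(t & frequent), 2)
    )
    rules = []
    for (a, b), c in pair_counts.items():
        support = c / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for x, y in ((a, b), (b, a)):
            confidence = c / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Hypothetical play counts for five users (not taken from the Million Song Dataset).
sample_play_counts = [
    {"song_a": 3, "song_b": 2, "song_c": 1},
    {"song_a": 5, "song_b": 4},
    {"song_a": 2, "song_c": 3},
    {"song_b": 2, "song_c": 2},
    {"song_a": 4, "song_b": 3, "song_c": 1},
]
sample_rules = pair_rules(binarize(sample_play_counts))
```

A rule such as `song_a -> song_b` can then be read as a recommendation candidate: users who repeatedly played `song_a` also tended to repeat `song_b`. A full Apriori run would extend the same level-wise pruning to larger itemsets.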
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Najafabadi, Maryam Khanian
author_facet Najafabadi, Maryam Khanian
author_sort Najafabadi, Maryam Khanian
title Improved collaborative filtering using clustering and association rule mining on implicit data
title_short Improved collaborative filtering using clustering and association rule mining on implicit data
title_full Improved collaborative filtering using clustering and association rule mining on implicit data
title_fullStr Improved collaborative filtering using clustering and association rule mining on implicit data
title_full_unstemmed Improved collaborative filtering using clustering and association rule mining on implicit data
title_sort improved collaborative filtering using clustering and association rule mining on implicit data
granting_institution Universiti Teknologi Malaysia, Advanced Informatics School
granting_department Advanced Informatics School
publishDate 2016
url http://eprints.utm.my/id/eprint/98093/1/MaryamKhanianNajafabadiPAIS2016.pdf
_version_ 1776100548016078848