Intelligent web proxy caching based on supervised machine learning

Web proxy caching is one of the most successful solutions for improving the performance of web-based systems. In web proxy caching, popular web objects that are likely to be revisited in the near future are stored on the proxy server, which plays a key role between users and web sites by redu...


Bibliographic Details
Main Author: Ali Ahmed, Waleed
Format: Thesis
Language: English
Published: 2012
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/31355/1/WaleedAliAhmedPFSKSM2012.pdf
id my-utm-ep.31355
record_format uketd_dc
spelling my-utm-ep.31355 2018-04-17T08:37:21Z Intelligent web proxy caching based on supervised machine learning 2012-08 Ali Ahmed, Waleed QA75 Electronic computers. Computer science 2012-08 Thesis http://eprints.utm.my/id/eprint/31355/ http://eprints.utm.my/id/eprint/31355/1/WaleedAliAhmedPFSKSM2012.pdf application/pdf en public phd doctoral Universiti Teknologi Malaysia, Faculty of Computer Science and Information Systems Faculty of Computer Science and Information Systems
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic QA75 Electronic computers
Computer science
spellingShingle QA75 Electronic computers
Computer science
Ali Ahmed, Waleed
Intelligent web proxy caching based on supervised machine learning
description Web proxy caching is one of the most successful solutions for improving the performance of web-based systems. In web proxy caching, popular web objects that are likely to be revisited in the near future are stored on the proxy server, which plays a key role between users and web sites by reducing the response time of user requests and saving network bandwidth. However, determining which web objects will be revisited in the future remains a problem for existing conventional web proxy caching techniques. In this study, three popular supervised machine learning techniques were used to enhance the performance of conventional web proxy caching policies: Least-Recently-Used (LRU), Greedy-Dual-Size (GDS), Greedy-Dual-Size-Frequency (GDSF) and Least-Frequently-Used-Dynamic-Aging (LFU-DA). A support vector machine (SVM), a naïve Bayes classifier (NB) and a decision tree (C4.5) were trained on web proxy log files to predict the class of objects that would be revisited. More significantly, the trained SVM, NB and C4.5 classifiers were incorporated into the conventional web proxy caching techniques to form novel intelligent caching approaches, known as the intelligent LRU, GDS, GDSF and DA approaches. For testing and evaluating the proposed proxy caching methods, proxy log files were obtained from several proxy servers of the IRCache network located around the United States; these are the most common proxy datasets used in web proxy caching research. The experimental results showed that SVM, NB and C4.5 achieved better accuracy and were much faster than a back-propagation neural network (BPNN) and an adaptive neuro-fuzzy inference system (ANFIS). Furthermore, the proposed intelligent caching approaches were evaluated by trace-driven simulation and compared with the most relevant web proxy caching policies. The simulation results revealed that the proposed intelligent web proxy caching approaches substantially improved the hit ratio and byte hit ratio of the conventional techniques on a range of datasets. On average, the intelligent LRU, GDS and DA approaches improved the hit ratio over LRU, GDS and LFU-DA by 32.60%, 22.45% and 35.458%, respectively. In terms of byte hit ratio, the intelligent LRU, GDS, GDSF and DA approaches achieved average improvements over LRU, GDS, GDSF and LFU-DA of 69.56%, 229.14%, 407.49% and 69.074%, respectively.
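The two metrics named in the description can be illustrated with a minimal sketch of a trace-driven simulation for plain LRU. This is not the thesis's simulator or its intelligent variants: the function names, the `(url, size_bytes)` trace format, and the reading of "improvement ratio" as a relative percentage change are all assumptions made for illustration.

```python
from collections import OrderedDict


def simulate_lru(trace, capacity_bytes):
    """Replay a list of (url, size_bytes) requests through an LRU cache.

    Returns (hit_ratio, byte_hit_ratio). The trace format is hypothetical.
    """
    cache = OrderedDict()  # url -> size, ordered least to most recently used
    used = 0
    hits = hit_bytes = total_bytes = 0
    for url, size in trace:
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # object can never fit, so it is not cached
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict LRU object
            used -= evicted_size
        cache[url] = size
        used += size
    n = len(trace)
    hit_ratio = hits / n if n else 0.0
    byte_hit_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_ratio, byte_hit_ratio


def improvement_ratio(new, old):
    """Relative improvement of `new` over `old` in percent (assumed definition)."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Toy trace, invented for illustration only.
    trace = [("a", 100), ("b", 300), ("a", 100),
             ("c", 200), ("b", 300), ("a", 100)]
    hr, bhr = simulate_lru(trace, capacity_bytes=400)
    print(f"hit ratio = {hr:.3f}, byte hit ratio = {bhr:.3f}")
```

Hit ratio counts requests served from the cache, while byte hit ratio weights each hit by object size, so a policy can score well on one and poorly on the other. Under the assumed definition, an improvement above 100% means the new value is more than double the baseline.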
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Ali Ahmed, Waleed
author_facet Ali Ahmed, Waleed
author_sort Ali Ahmed, Waleed
title Intelligent web proxy caching based on supervised machine learning
title_short Intelligent web proxy caching based on supervised machine learning
title_full Intelligent web proxy caching based on supervised machine learning
title_fullStr Intelligent web proxy caching based on supervised machine learning
title_full_unstemmed Intelligent web proxy caching based on supervised machine learning
title_sort intelligent web proxy caching based on supervised machine learning
granting_institution Universiti Teknologi Malaysia, Faculty of Computer Science and Information Systems
granting_department Faculty of Computer Science and Information Systems
publishDate 2012
url http://eprints.utm.my/id/eprint/31355/1/WaleedAliAhmedPFSKSM2012.pdf
_version_ 1747815784006025216