Intelligent web proxy caching based on supervised machine learning
Web proxy caching is one of the most successful solutions for improving the performance of web-based systems. In web proxy caching, popular web objects that are likely to be revisited in the near future are stored on the proxy server, which plays a key role between users and websites by redu...
Main Author: | Ali Ahmed, Waleed |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2012 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://eprints.utm.my/id/eprint/31355/1/WaleedAliAhmedPFSKSM2012.pdf |
id |
my-utm-ep.31355 |
---|---|
record_format |
uketd_dc |
spelling |
my-utm-ep.31355 2018-04-17T08:37:21Z Intelligent web proxy caching based on supervised machine learning 2012-08 Ali Ahmed, Waleed QA75 Electronic computers. Computer science 2012-08 Thesis http://eprints.utm.my/id/eprint/31355/ http://eprints.utm.my/id/eprint/31355/1/WaleedAliAhmedPFSKSM2012.pdf application/pdf en public phd doctoral Universiti Teknologi Malaysia, Faculty of Computer Science and Information Systems Faculty of Computer Science and Information Systems |
institution |
Universiti Teknologi Malaysia |
collection |
UTM Institutional Repository |
language |
English |
topic |
QA75 Electronic computers. Computer science |
description |
Web proxy caching is one of the most successful solutions for improving the performance of web-based systems. In web proxy caching, popular web objects that are likely to be revisited in the near future are stored on the proxy server, which plays a key role between users and websites by reducing the response time of user requests and saving network bandwidth. However, determining which web objects will be revisited in the future remains a problem for existing conventional web proxy caching techniques. In this study, three popular supervised machine learning techniques were used to enhance the performance of conventional web proxy caching policies: Least-Recently-Used (LRU), Greedy-Dual-Size (GDS), Greedy-Dual-Size-Frequency (GDSF) and Least-Frequently-Used-Dynamic-Aging (LFU-DA). A support vector machine (SVM), a naïve Bayes classifier (NB) and a decision tree (C4.5) were trained on web proxy log files to predict the class of objects that would be revisited. More significantly, the trained SVM, NB and C4.5 classifiers were incorporated into the conventional web proxy caching techniques to form novel intelligent caching approaches, known as the intelligent LRU, GDS, GDSF and DA approaches. For testing and evaluating the proposed proxy caching methods, proxy log files were obtained from several proxy servers of the IRCache network located around the United States; these are the most common proxy datasets used in web proxy caching research. The experimental results showed that SVM, NB and C4.5 achieved better accuracy and were much faster than a back-propagation neural network (BPNN) and an adaptive neuro-fuzzy inference system (ANFIS). Furthermore, the proposed intelligent caching approaches were evaluated by trace-driven simulation and compared with the most relevant web proxy caching policies.
The simulation results revealed that the proposed intelligent web proxy caching approaches substantially improved on the conventional techniques in terms of hit ratio and byte hit ratio across a range of datasets. On average, the intelligent LRU, GDS and DA approaches improved the hit ratio over LRU, GDS and LFU-DA by 32.60%, 22.45% and 35.458%, respectively. In terms of byte hit ratio, the intelligent LRU, GDS, GDSF and DA approaches achieved average improvements over LRU, GDS, GDSF and LFU-DA of 69.56%, 229.14%, 407.49% and 69.074%, respectively. |
format |
Thesis |
qualification_name |
Doctor of Philosophy (PhD) |
qualification_level |
Doctorate |
author |
Ali Ahmed, Waleed |
title |
Intelligent web proxy caching based on supervised machine learning |
granting_institution |
Universiti Teknologi Malaysia, Faculty of Computer Science and Information Systems |
granting_department |
Faculty of Computer Science and Information Systems |
publishDate |
2012 |
url |
http://eprints.utm.my/id/eprint/31355/1/WaleedAliAhmedPFSKSM2012.pdf |
_version_ |
1747815784006025216 |