The Analysis Of Metadata Based Classification For Classifying Educational Websites

Initially, websites could be easily categorized based on their domain extensions. However, due to the explosive growth of the internet, domain name restrictions are no longer adhered to. Web classification can help categorize websites, especially educational websites, which are the focus of this research....

Bibliographic Details
Main Author: Zaraini, Mohd Nazrien
Format: Thesis
Language: English
Published: 2016
Subjects: Z665 Library Science. Information Science
Online Access:http://eprints.utem.edu.my/id/eprint/18198/1/The%20Analysis%20Of%20Metadata%20Based%20Classification%20For%20Classifying%20Educational%20Websites%2024%20Pages.pdf
http://eprints.utem.edu.my/id/eprint/18198/2/The%20Analysis%20Of%20Metadata%20Based%20Classification%20For%20Classifying%20Educational%20Websites.pdf
id my-utem-ep.18198
record_format uketd_dc
institution Universiti Teknikal Malaysia Melaka
collection UTeM Repository
language English
advisor Hussin, Burairah

topic Z665 Library Science
Information Science
description Initially, websites could be easily categorized based on their domain extensions. However, due to the explosive growth of the internet, domain name restrictions are no longer adhered to. Web classification can help categorize websites, especially educational websites, which are the focus of this research. Classification is performed on both content and metadata in order to measure the impact of metadata implementation on classification accuracy. Three sets of 200 pre-determined educational websites taken from the DMOZ directory were used as training data. This is the total number of educational websites with metadata information available in that directory. For content-based classification, keywords were extracted from the page contents, and TF-IDF ranking was used to obtain the top educational keywords. These keywords were used as training dataset attributes for educational web classification. The same method was applied for metadata-based classification, except that the keywords were taken from each site's meta description. A one-class support vector machine was used because this research focuses on single-class classification only. Cross-validation and two sets of test data, one consisting entirely of educational websites and one of websites from various categories, were used to validate this research. The results show that content-based classification is more accurate than metadata-based classification. The top-ranking educational keywords and an analysis of metadata implementation were obtained from this research through the information retrieval and web classification process.
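The TF-IDF keyword ranking mentioned in the description can be illustrated with a minimal sketch. This is not the thesis's implementation: the tokenizer, the smoothed IDF formula, the summing of scores across documents, and the names `tokenize` and `top_tfidf_keywords` are all assumptions made for illustration, and preprocessing such as stop-word removal is omitted. The one-class support vector machine step is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def top_tfidf_keywords(documents, k=5):
    """Return the k terms with the highest summed TF-IDF weight.

    Assumptions (not taken from the thesis):
      tf(t, d) = count of t in d / number of tokens in d
      idf(t)   = ln((1 + N) / (1 + df(t))) + 1   (smoothed, so terms that
                 occur in every document still receive a non-zero weight)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Sum each term's TF-IDF weight over all documents.
    scores = Counter()
    for tokens in tokenized:
        if not tokens:
            continue  # skip empty documents to avoid dividing by zero
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
            scores[term] += tf * idf

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical meta descriptions, invented for this example.
    descriptions = [
        "university courses and student admissions",
        "online courses for student learning",
        "school learning resources for teachers",
    ]
    for term, score in top_tfidf_keywords(descriptions, k=5):
        print(f"{term}: {score:.3f}")
```

The same function could be applied either to page contents or to meta descriptions, which mirrors the content-based versus metadata-based comparison in the description. The resulting keywords would then serve as attributes for a classifier.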
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Master's degree
author Zaraini, Mohd Nazrien
title The Analysis Of Metadata Based Classification For Classifying Educational Websites
granting_institution Universiti Teknikal Malaysia Melaka
granting_department Faculty of Information and Communication Technology
publishDate 2016
url http://eprints.utem.edu.my/id/eprint/18198/1/The%20Analysis%20Of%20Metadata%20Based%20Classification%20For%20Classifying%20Educational%20Websites%2024%20Pages.pdf
http://eprints.utem.edu.my/id/eprint/18198/2/The%20Analysis%20Of%20Metadata%20Based%20Classification%20For%20Classifying%20Educational%20Websites.pdf
_version_ 1747833916674277376
spelling my-utem-ep.18198 2021-10-08T07:46:31Z http://eprints.utem.edu.my/id/eprint/18198/ https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=100103