METHODOLOGY FOR CONSTRUCTING AND EVALUATING AN ONTOLOGICAL PROFILE FOR CONTENT PERSONALIZATION SYSTEMS: STAGES AND EVALUATION CRITERIA

Abstract

This article presents the development and testing of a methodology for building an ontological profile for content personalization systems. It describes the modular architecture of a web-based personalization system, illustrates the text processing and analysis methods and algorithms used at each stage, and gives a step-by-step procedure for ontology creation. The methodology begins with primary data processing, including the extraction of keywords and key phrases, followed by their hierarchical clustering to reveal the semantic structure of the domain. Subsequent stages define thresholds to filter out insignificant connections, then extract and formalize relationships between concepts using natural language processing techniques such as word sense disambiguation and relationship extraction based on semantic similarity. An integrated pipeline was developed to implement this process. It combines improved algorithms proposed by the author in previous studies, namely an algorithm for extracting key phrases from an individual text based on semantic similarity and a modified algorithm for word sense disambiguation, and it integrates the natural language processing tools these methods need to run efficiently when automatically constructing an ontology from text. The study places particular emphasis on a comprehensive evaluation of the resulting ontology using a specialized set of criteria designed to objectively assess the profile's quality, completeness, and consistency. An important component of the work is a computational experiment that demonstrates the impact of each data processing stage on the final quality and efficacy of the ontology. The results show that the proposed method enables the construction of a practical, scalable, and relevant ontology, suitable for industrial deployment and integration into personalization systems to enhance their accuracy and adaptability.
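As a rough illustration of two of the stages named in the abstract (threshold-based filtering of weak connections and hierarchical clustering of key phrases), the sketch below may help. It is not the author's algorithm: the phrase list is hypothetical, the threshold of 0.2 is arbitrary, and a simple Jaccard word overlap stands in for the semantic similarity measure, which the abstract does not specify.

```python
from itertools import combinations

# Hypothetical key phrases, as if produced by an earlier extraction stage.
PHRASES = [
    "user profile",
    "ontological user profile",
    "keyword extraction",
    "key phrase extraction",
    "hierarchical clustering",
]


def similarity(a, b):
    """Jaccard overlap of word sets; a placeholder for a semantic similarity measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def filter_edges(phrases, threshold):
    """Return {(a, b): score} for phrase pairs whose score reaches the threshold."""
    edges = {}
    for a, b in combinations(phrases, 2):
        score = similarity(a, b)
        if score >= threshold:
            edges[(a, b)] = score
    return edges


def linkage(c1, c2):
    """Single linkage: similarity of the closest pair across two clusters."""
    return max(similarity(a, b) for a in c1 for b in c2)


def cluster(phrases, threshold):
    """Agglomerative clustering that stops once no cluster pair reaches the threshold."""
    clusters = [[p] for p in phrases]
    while len(clusters) > 1:
        # Pick the two most similar clusters (indices i < j).
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters


if __name__ == "__main__":
    for group in cluster(PHRASES, threshold=0.2):
        print(group)
```

In a real pipeline the placeholder `similarity` function would presumably be replaced by an embedding-based or WordNet-based measure, and the threshold would be chosen empirically rather than fixed in advance.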

Published:

2025-12-30

Section:

SECTION IV. MACHINE LEARNING AND NEURAL NETWORKS

Keywords:

Ontology profile, content personalization, keyword extraction, hierarchical clustering, ontology evaluation, semantic model

For citation:

Z.H. Mohammad. METHODOLOGY FOR CONSTRUCTING AND EVALUATING AN ONTOLOGICAL PROFILE FOR CONTENT PERSONALIZATION SYSTEMS: STAGES AND EVALUATION CRITERIA. IZVESTIYA SFedU. ENGINEERING SCIENCES, 2025, No. 6, pp. 248-262.