Description
Over the last two decades, Online Social Network (OSN) platforms have spread widely across the world's population. They differ in characteristics, target users and content, yet they usually share one standard functionality: a search bar backed by an Information Retrieval System (IRS). Although Information Retrieval has deep roots in Computer Science, with more than fifty years of history, the same is not true of IRSs applied to OSNs, also called Social Information Retrieval Systems (SIR). Here, the similarity between two objects is a measure of how ‘close’ they are. Let the searcher u_s be the user who has an information need and submits the query q (for example, by typing in the search bar). The goal of a SIR is to find all the entities (such as OSN users) that are relevant to q with respect to u_s, so as to fulfill that need. To do so, the SIR balances the document/topic similarity (the similarity between the available entities and the query) and the user similarity (the similarity between the available entities and the searcher u_s). The topic/document similarity can be measured, for example, by how semantically close the query and the available entities are: if q = ‘Vittorio’, we can retrieve all the users called ‘Vittorio’. The user similarity, instead, can be determined by assigning each entity a ‘social measure’ of closeness to u_s. In our example, we could give a higher user similarity to u_s's friends. By balancing both measures, the SIR returns a set of results that should be relevant to the user's information need; in this example, the users called ‘Vittorio’ who are the searcher's friends. Studies and analyses in this context are only partially disclosed: companies in the field keep their knowledge private for competitive reasons related to user engagement and the overall user experience.
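The balance between the two similarities can be illustrated with a minimal sketch. It is not the model developed in the thesis: the linear blend, the weight `alpha`, the exact-name match and the binary friend check are all assumptions chosen to mirror the ‘Vittorio’ example, and every name in the code is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    user_id: str
    name: str


def topic_similarity(query: str, candidate: Candidate) -> float:
    # Assumption: exact, case-insensitive name match stands in for
    # a real semantic similarity between query and entity.
    return 1.0 if candidate.name.lower() == query.strip().lower() else 0.0


def user_similarity(searcher_friends: set[str], candidate: Candidate) -> float:
    # Assumption: friends of the searcher get 1.0, everyone else 0.0.
    return 1.0 if candidate.user_id in searcher_friends else 0.0


def rank(query: str, candidates: list[Candidate],
         searcher_friends: set[str], alpha: float = 0.7) -> list[tuple[str, float]]:
    # Hypothetical linear blend: alpha weights the topic similarity,
    # (1 - alpha) weights the user similarity.
    scored = []
    for candidate in candidates:
        topic = topic_similarity(query, candidate)
        if topic == 0.0:
            continue  # entities that do not match the query are not retrieved
        social = user_similarity(searcher_friends, candidate)
        scored.append((alpha * topic + (1 - alpha) * social, candidate))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(c.user_id, round(score, 2)) for score, c in scored]


candidates = [
    Candidate("u2", "Vittorio"),  # not a friend of the searcher
    Candidate("u1", "Vittorio"),  # friend of the searcher
    Candidate("u3", "Marco"),     # friend, but does not match the query
]
results = rank("Vittorio", candidates, searcher_friends={"u1", "u3"})
print(results)
```

Under these assumptions, both users called ‘Vittorio’ are retrieved, and the one who is the searcher's friend is ranked first; ‘Marco’ is excluded even though he is a friend, because he does not match the query.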
Moreover, the little research that is freely available does not consider searching for entity types other than users. This gives us an excellent opportunity to contribute in that direction. The use case studied in this thesis is FullBrain: a social e-learning platform where students collaborate, get help and find personalized learning material. Students can follow university courses, explore concepts and share learning resources, simply called sources, with their fellow students. The aim of this thesis is to design and develop a SIR for FullBrain. Concretely, we focus on Query Suggestions (no query input), Query Autocompletion (QAC) and Search within the context of social networks. We propose and implement a model that handles not only users but also other kinds of entities, and that uses the social connections between them to rank the final results. In particular, FullBrain's SIR makes it possible to explore users, concepts, courses, posts and sources (similar to Twitter, where search results can be both users and posts). Specifically, in this thesis we use the distance in the social graph between the available entities and the searcher. The distances are calculated using a modified version of Landmark Embedding adapted to the specific case of social networks. Moreover, we design the logical and physical layers of the application (using PostgreSQL and Neo4j), taking into account speed of retrieval, updates to the collection and scalability. Finally, we discuss the performance of the system under FullBrain users' day-to-day usage, validating it from a speed-of-retrieval standpoint.

Period | 15 May 2020 → 30 Nov 2020 |
---|---|
Examinee | Vittorio Carmignani |
Degree of Recognition | International |
Keywords
- Educational social network
- Information retrieval