Abstract:
Most research-oriented search queries consist of multiple topics belonging to one or more domains of knowledge. The objective of such queries is to find the relationship between the topics, or the impact that each has on the others. Although web search engines provide an easy means of retrieving information from the web, they are mainly keyword-oriented and do not consider the different topics, or the relationships between those topics, present in the results. This project presents a software solution, `SearchDroid', that acts as an intermediary between a search engine and the end user, refining search results based on the topics identified in the search query and on their presence and relationships in the search results. A discourse parsing approach is used to build discourse structures that represent the rhetorical relations in the text of search results; these structures are used to re-rank results based on the topics identified in the search query. The project combines linguistics research on discourse parsing with web information retrieval techniques. The lack of literature combining discourse parsing techniques with web information retrieval has been compensated for by introducing a new algorithmic approach. An abstract information retrieval mechanism has been created using discourse parsing techniques, and it can be integrated into any web information retrieval approach. The system was evaluated by linguistic experts, technical experts and end users. All experts agreed that a discourse parsing approach was suitable for addressing the problem at hand and that the project had high research value in linking linguistics research with web information retrieval research. Hands-on testing of the system by end users showed high user acceptance of the proposed system.