Digital Repository

ZENSEARCH: Revolutionizing E-Commerce Search through Advanced Multimodal Integration and Retrieval Techniques


dc.contributor.author Ratnalingam, Gajeendran
dc.date.accessioned 2026-04-21T05:25:37Z
dc.date.available 2026-04-21T05:25:37Z
dc.date.issued 2025
dc.identifier.citation Ratnalingam, Gajeendran (2025) ZENSEARCH: Revolutionizing E-Commerce Search through Advanced Multimodal Integration and Retrieval Techniques. BSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 20210328
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/3165
dc.description.abstract Problem: The emergence of multimodal search systems in the e-commerce domain is promising, as online product catalogs continue to grow in variety. These multimodal systems present a challenge for efficient retrieval and recommendation within a unified embedding space. Traditional e-commerce systems, which rely mostly on unimodal approaches, struggle to interpret the user query well enough to return relevant product recommendations at the end of the retrieval stage. This project addresses this gap by developing an efficient multimodal retrieval system based on the ColPali architecture, in which product images and captions are mapped into a unified embedding space to produce accurate, context-aware product recommendations. Methodology: This research uses a vision-language model to generate contextualized vector embeddings from a product catalog. These embeddings are projected into a 128-dimensional vector space and stored in a vector database for retrieval; all of these steps occur in an offline indexing stage. When a user submits a query, it undergoes the same embedding-generation process so that query and product embeddings remain directly comparable, and a similarity search integrating a late-interaction mechanism retrieves the most relevant product suggestions. Testing: To the author's knowledge, this methodology has not been attempted before. The system reaches an accuracy of 89% during the testing and evaluation phase, outperforming the base model it was trained from, and also achieves better results on metrics such as NDCG, Precision, Recall, and F1-Score.
The trained model also performs better on the ViDoRe benchmark, the best-fitting benchmark for the ColPali model, producing highly relevant top-k retrievals and marking a significant advance in the field of multimodal e-commerce search. en_US
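The late-interaction retrieval described in the abstract can be sketched as follows. This is a minimal illustrative example of ColPali-style MaxSim scoring, not the dissertation's implementation: the random arrays stand in for the 128-dimensional multi-vector embeddings produced by the vision-language model, and all function names are hypothetical.

```python
# Sketch of late-interaction (MaxSim) scoring over multi-vector embeddings,
# as used by ColPali-style retrievers. Embeddings here are random stand-ins
# for the 128-dimensional token embeddings described in the abstract.
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Late-interaction score: for each query token vector, take the maximum
    similarity over all document token vectors, then sum over query tokens.
    query_emb: (n_query_tokens, 128); doc_emb: (n_doc_tokens, 128)."""
    sim = query_emb @ doc_emb.T           # pairwise token-level similarities
    return float(sim.max(axis=1).sum())   # max over doc tokens, sum over query

def rank_products(query_emb, product_embs):
    """Return product indices sorted by descending MaxSim score."""
    scores = [maxsim_score(query_emb, d) for d in product_embs]
    return sorted(range(len(product_embs)), key=lambda i: -scores[i])

rng = np.random.default_rng(0)
q = rng.normal(size=(6, 128))                    # query token embeddings
catalog = [rng.normal(size=(40, 128)) for _ in range(3)]
# One product whose embedding contains the query tokens, so it should rank first.
catalog.append(np.vstack([q, rng.normal(size=(30, 128))]))
print(rank_products(q, catalog)[0])              # → 3
```

In a real system the per-product token embeddings would be computed offline and stored in a vector database, while the query is embedded online; only the cheap MaxSim interaction happens at query time, which is the efficiency argument behind late interaction.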
dc.language.iso en en_US
dc.subject Product Search en_US
dc.subject Vision Language Model en_US
dc.subject Contextual Embeddings en_US
dc.subject Information Retrieval en_US
dc.title ZENSEARCH: Revolutionizing E-Commerce Search through Advanced Multimodal Integration and Retrieval Techniques en_US
dc.type Thesis en_US

