“Crawl the News” (Video and Article News Recommendation using Web Crawling)

Diaz, C. L. Melissa

dc.contributor.author	Diaz, C. L. Melissa
dc.date.accessioned	2020-06-01T14:24:38Z
dc.date.available	2020-06-01T14:24:38Z
dc.date.issued	2019
dc.identifier.citation	Diaz, Melissa C. L. (2019) “Crawl the News” (Video and Article News Recommendation using Web Crawling). BSc. Dissertation, Informatics Institute of Technology	en_US
dc.identifier.other	2013232
dc.identifier.uri	http://dlib.iit.ac.lk/xmlui/handle/123456789/454
dc.description.abstract	The Web is an ocean of information covering a vast range of topics, and it is updated with new information daily. With the advancement of the WWW and the expansion of the programmable web, applications and services have become increasingly data-driven. At present, the two main methods used to gather the required data are publisher APIs and web scraping. Although APIs are widely used, they have limitations when gathering data. Many people must visit multiple websites and YouTube news channels, spending valuable time, to get their daily dose of news. This research focuses on introducing a data retrieval mechanism using web crawling in the news domain to recommend news articles with videos and visuals and to help minimize the time users spend visiting multiple sources. A web crawling mechanism was introduced that can crawl multiple websites with different DOM structures in parallel. The crawled news items are categorized based on their headlines using text classification models and then recommended to users by matching the news categories against the category preferences set by each user.	en_US
dc.subject	Information Retrieval	en_US
dc.subject	Web Crawling	en_US
dc.subject	Logistic Regression	en_US
dc.subject	Machine Learning	en_US
dc.title	“Crawl the News” (Video and Article News Recommendation using Web Crawling)	en_US
dc.type	Thesis	en_US
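
The pipeline described in the abstract (crawl sites with different DOM structures in parallel, categorize headlines, match categories to user preferences) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the site names, the per-site extraction rules, the in-memory pages, and the keyword-based `classify` placeholder (standing in for the trained text classification models) are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collects the text of every element with the given tag and class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag, self.css_class = tag, css_class
        self.headlines = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self._inside = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == self.tag:
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.headlines[-1] += data


# Hypothetical per-site rules: each site marks up its headlines differently.
SITE_RULES = {
    "site-a.example": ("h2", "title"),
    "site-b.example": ("a", "story-link"),
}

# In-memory stand-ins for fetched pages, so the sketch runs offline.
PAGES = {
    "site-a.example": '<div><h2 class="title">Cricket final tonight</h2></div>',
    "site-b.example": '<ul><li><a class="story-link" href="#">'
                      "Election results announced</a></li></ul>",
}


def extract(site):
    """Apply the site's own rule to its page and return (site, headline) pairs."""
    parser = HeadlineParser(*SITE_RULES[site])
    parser.feed(PAGES[site])
    return [(site, headline.strip()) for headline in parser.headlines]


def classify(headline):
    """Placeholder keyword lookup; a trained model would go here instead."""
    keywords = {"cricket": "sports", "election": "politics"}
    for word, category in keywords.items():
        if word in headline.lower():
            return category
    return "other"


def recommend(preferences):
    """Process all sites in parallel, then keep items in preferred categories."""
    with ThreadPoolExecutor() as pool:
        items = [item for found in pool.map(extract, SITE_RULES) for item in found]
    return [(site, h) for site, h in items if classify(h) in preferences]
```

Calling `recommend({"sports"})` keeps only the items whose headline falls in a category the user selected. Keeping the extraction rules in a per-site table is one way to handle differing DOM structures without writing a separate crawler for each source.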