| dc.contributor.advisor | ||
| dc.contributor.author | Ranasinghe, Dulika Lavanya | |
| dc.date.accessioned | 2019-03-04T10:24:11Z | |
| dc.date.available | 2019-03-04T10:24:11Z | |
| dc.date.issued | 2018 | |
| dc.identifier.citation | Ranasinghe, D. L. (2018) IceScraper – Optimized Data sets provider. BSc. Dissertation. Informatics Institute of Technology | en_US |
| dc.identifier.other | 2014219 | |
| dc.identifier.uri | http://dlib.iit.ac.lk/xmlui/handle/123456789/157 | |
| dc.description.abstract | Data is said to be the new oil for the coming decade. Organizations now spend millions of dollars acquiring data to stay ahead of their competitors. Web scraping and bots are among the newer tools that businesses are focusing on. Commercial web scraping tools are everywhere, but they have many limitations when applied to a given scenario. Scraping a single page and creating a web scraping agent are two completely different tasks; scraping a single page is just one capability of a web scraping agent. A web scraping agent involves techniques such as bypassing IP blocks and dealing with bandwidth and RAM constraints, CSS changes, and so on. It should be able to scrape multiple pages that rely on JavaScript without failures, do so at the speed of a single-page scraper, and remain efficient and consistent. The proposed solution is a framework for scraping multiple web pages with increased accuracy while maintaining web elements such as JavaScript and CSS. The prototype presented gathers data sets for a set of URLs supplied by a user. | en_US |
| dc.title | IceScraper – Optimized Data sets provider | en_US |
| dc.type | Thesis | en_US |