Instance segmentation based objects detection in digital documents

Balasubramaniam, Bravin

dc.contributor.author	Balasubramaniam, Bravin
dc.date.accessioned	2020-07-24T18:52:55Z
dc.date.available	2020-07-24T18:52:55Z
dc.date.issued	2019
dc.identifier.citation	Balasubramaniam, Bravin (2019) Instance segmentation based objects detection in digital documents. MSc. Dissertation Informatics Institute of Technology	en_US
dc.identifier.other	2017385
dc.identifier.uri	http://dlib.iit.ac.lk/xmlui/handle/123456789/484
dc.description.abstract	The number of digital documents has grown exponentially over the last twenty years. As a result, a large share of the information captured in digital documents is lost. Much research has applied Natural Language Processing to automatically extract, understand and, eventually, summarize key data from digital documents. However, while text is undeniably a basic way to convey data, there are contexts where graphical components are far more powerful. For example, in scientific research papers, several experiments, variables and numbers must be reported in a concise manner that suits tables and figures better than text. Graphical components can convey information that would otherwise be cumbersome to explain in words, both for the author to express and for the reader to grasp. We developed an application that detects the graphical components in a given digital document and extracts them separately. The application not only extracts these graphical components but also classifies them into three categories: 1. Tables 2. Charts 3. Other (any graphical component other than tables and charts). The results show that graphical components are extracted from digital documents and classified correctly with an accuracy of 81%.	en_US
dc.title	Instance segmentation based objects detection in digital documents	en_US
dc.type	Thesis	en_US