Digital Repository

AuthDoc: Content based fingerprinting of physical documents for tamper detection


dc.contributor.author Atapattu, Hashani
dc.date.accessioned 2023-01-12T07:30:05Z
dc.date.available 2023-01-12T07:30:05Z
dc.date.issued 2022
dc.identifier.citation Atapattu, Hashani (2022) AuthDoc: Content based fingerprinting of physical documents for tamper detection. MSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 2017400
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/1384
dc.description.abstract Most of the important documents in a person's life, such as birth certificates, passports, licenses, school or academic certificates, cheques, bills, marriage certificates, and many other kinds of certificates and assertions, are paper-based physical documents. They are typically kept in both digital and physical forms so that they remain available regardless of whether digital infrastructure is available. Unlike with digital documents, validating the authenticity of a physical document is neither straightforward nor cost-effective. The study presented in this thesis proposes two methods to validate the authenticity of a physical document through digitally obtained, content-based, robust fingerprints. The first method uses an autoencoder to digest an image of a document into a fixed-length code that can be used as a fingerprint. After evaluating this method and identifying its limitations, a second method is proposed. The second method combines the features learned by that autoencoder with an image processing pipeline. The two methods were evaluated for robustness and reliability using recall, precision, and F1-score, and compared against a state-of-the-art technique on a self-made dataset and an existing benchmark dataset. The benchmark technique obtained an F1-score of 70.10%, while the autoencoder-based method and the hybrid method achieved 72.14% and 75.33%, respectively. Thus, the experimental results show that both proposed methods outperformed the existing method. Furthermore, of the two proposed methods, the hybrid method based on image processing and machine learning performs better than the autoencoder-based method. en_US
dc.language.iso en en_US
dc.subject Autoencoder en_US
dc.subject Connected Component Labeling (CCL) en_US
dc.subject Digital Watermarking en_US
dc.subject Skew correction en_US
dc.title AuthDoc: Content based fingerprinting of physical documents for tamper detection en_US
dc.type Thesis en_US

