Abstract:
"Preserving and disseminating the knowledge held in ancient manuscripts is difficult, largely
because few experts can read the languages in which they are written. For Pali manuscripts the
problem is particularly acute: proficient readers are scarce, and interviews with subject
specialists reveal declining interest among new entrants to the field, which puts valuable
knowledge at risk of being lost. In response, this paper proposes the development
of an Optical Character Recognition (OCR) system tailored for Pali manuscripts. This system aims
to digitize manuscript content, facilitating access through digital means and serving as an initial
step towards preserving and disseminating the knowledge contained within Pali manuscripts.
The proposed OCR system emphasizes character segmentation and recognition. Given an image of a
palm leaf manuscript, the system segments and recognizes the characters accurately, then converts
the recognized characters into editable text, enabling straightforward comprehension and
analysis of the manuscript content. Combining segmentation and recognition in a single pipeline
provides an efficient solution for reading and interpreting Pali manuscripts."
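
The abstract does not state which segmentation method the system uses. As an illustration only, the sketch below shows one common baseline, vertical projection-profile segmentation, applied to a single text line that has already been binarized (ink = 1, background = 0). The function name `segment_columns`, the `min_width` parameter, and the toy input are assumptions made for this example and are not taken from the paper; touching or overlapping characters, which are frequent on palm leaves, would need a more robust approach.

```python
import numpy as np


def segment_columns(binary: np.ndarray, min_width: int = 1) -> list[tuple[int, int]]:
    """Split a binarized text line into character candidates.

    `binary` is a 2-D array with ink pixels set to 1 and background to 0.
    Returns (start, end) column ranges, end exclusive, one per run of
    consecutive columns that contain ink. Hypothetical helper, not the
    paper's method.
    """
    # Vertical projection profile: amount of ink in each column.
    has_ink = binary.sum(axis=0) > 0

    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a run of ink columns begins
        elif not ink and start is not None:
            if x - start >= min_width:     # drop runs narrower than min_width
                segments.append((start, x))
            start = None
    # Close a run that extends to the right edge of the image.
    if start is not None and len(has_ink) - start >= min_width:
        segments.append((start, len(has_ink)))
    return segments


# Toy line image with three ink blobs separated by blank columns.
line = np.zeros((5, 12), dtype=np.uint8)
line[1:4, 1:3] = 1
line[0:5, 5:8] = 1
line[2, 10] = 1

segments = segment_columns(line)
crops = [line[:, a:b] for a, b in segments]  # one sub-image per candidate
```

Each crop would then be passed to a recognizer that maps it to a character code, and the codes joined into editable text. The recognizer is left out here because the abstract gives no detail about it.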