Abstract:
"This study integrates Natural Language Processing (NLP) and machine learning methods to
generate ledger accounts from given accounting questions. Furthermore, it seeks to
construct balance statements from the generated ledger accounts. The study focuses on identifying transaction types, determining which accounts are affected, and understanding the effect on each account. The process comprises classifying transactions and building a model to identify the affected accounts. In the Sri Lankan Ordinary Level (O/L) and Advanced Level accounting subjects, the primary emphasis is on balance statements, cash accounts, and ledger accounts. The commerce path in tertiary education in Sri Lanka has one of the highest selection rates, and success in this domain requires extensive practice. Practicing with past papers is crucial, but the number of available past papers is limited. Because the main exam is time-constrained and requires preparing numerous financial statements in a short period, ample practice is necessary to complete it successfully. Consequently, access to a larger number of papers gives students more opportunities to validate their answers and increases their likelihood of passing the subject. Trained on TF-IDF vectorized text features, a logistic regression model performed well, achieving a testing accuracy of 86.32% and a training accuracy of 95.02%. The confusion matrix shows that the different transaction types are classified effectively. Precision, recall, and F1 score average 88.47%, 79.67%, and 79.69%, respectively, indicating a balanced model. Notably, some per-class accuracies, such as 100% for 'in' and 88.89% for 'eq', demonstrate the model's strength on particular classes."
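
The averaged precision, recall, and F1 figures quoted in the abstract are derived from a confusion matrix. The abstract does not state which averaging scheme was used, so the sketch below assumes unweighted (macro) averaging over classes. The function names, the third class label 'ex', and all counts in the toy matrix are hypothetical and chosen only for illustration; they are not the study's data.

```python
from typing import Dict, List


def per_class_metrics(labels: List[str],
                      matrix: List[List[int]]) -> Dict[str, Dict[str, float]]:
    """Per-class precision, recall, and F1.

    matrix[i][j] is the number of samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    metrics: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        true_positive = matrix[k][k]
        predicted_as_k = sum(row[k] for row in matrix)  # column sum
        actually_k = sum(matrix[k])                     # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def macro_average(metrics: Dict[str, Dict[str, float]], key: str) -> float:
    """Unweighted mean of one metric across all classes."""
    return sum(m[key] for m in metrics.values()) / len(metrics)


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy example: 'in' and 'eq' are labels mentioned in the abstract,
# 'ex' and every count below are invented for illustration.
toy_labels = ["in", "eq", "ex"]
toy_matrix = [
    [10, 0, 0],  # true 'in'
    [1, 8, 1],   # true 'eq'
    [2, 2, 6],   # true 'ex'
]

toy_metrics = per_class_metrics(toy_labels, toy_matrix)
print("accuracy:", accuracy(toy_matrix))
for name in ("precision", "recall", "f1"):
    print(f"macro {name}:", macro_average(toy_metrics, name))
```

In this toy matrix every true 'in' sample is classified correctly, so the recall for 'in' is 100%, yet its precision is lower because samples from other classes are also predicted as 'in'. This is one way macro precision and macro recall can diverge, as they do in the reported figures.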