Few-Shot Learning in Writer Identification: Achieving Cross-Lingual Generalization with Limited Data

Ranawaka, Imesh

URI: http://dlib.iit.ac.lk/xmlui/handle/123456789/2909

Date: 2025

Abstract:

Writer identification, a crucial task in forensic analysis and historical document authentication, faces significant challenges with data scarcity and cross-lingual application. Existing systems struggle when confronted with limited handwriting samples per writer or diverse languages, limiting their real-world applicability. Traditional methods exhibit reduced accuracy with minimal sample sizes, while practical applications inherently deal with limited data due to resource constraints. Researchers have proposed various machine learning approaches to address these challenges, but current systems mostly focus on supervised learning requiring extensive datasets, which are frequently unavailable in real-world scenarios. The adaptation of few-shot learning techniques specifically for writer identification across multiple languages remains largely unexplored. In this project, the author proposes a novel few-shot learning system combining Prototypical Networks with CNN backbones, specifically adapted for writer identification tasks. The system employs a three-tiered architecture comprising a Python-based ML core implementing episodic training, a .NET Core backend providing REST APIs, and a React frontend for end users. Key innovations include the adaptation of pre-trained GoogLeNet models for grayscale handwriting analysis through weight averaging techniques, and a scalable microservices architecture. Data collection utilizes publicly available datasets (IAM, CVL, CERUG, Firemaker) across the English, German, Chinese and Dutch languages. The final model achieved 87% accuracy with high precision, recall and F1-scores using only 5 samples per writer during training. This study advances few-shot learning applications for robust writer identification in data-constrained multilingual environments, demonstrating practical viability through a scalable architecture that bridges research and forensic applications.
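
The two techniques named in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes that "weight averaging" means averaging a pre-trained first-layer kernel over its three RGB input channels to obtain a single grayscale channel, and that classification follows the standard Prototypical Network rule (assign a query to the writer whose mean support embedding is nearest in Euclidean distance). The function names, the toy 2-dimensional embeddings and the writer labels are hypothetical, and two support samples per writer are used here for brevity where the abstract reports five.

```python
import math


def average_rgb_kernel(kernel_rgb):
    """Collapse one conv kernel of shape [3][H][W] into [H][W].

    Assumption: the grayscale adaptation averages the pre-trained
    weights over the RGB input-channel axis.
    """
    channels = len(kernel_rgb)
    height = len(kernel_rgb[0])
    width = len(kernel_rgb[0][0])
    return [
        [sum(kernel_rgb[c][i][j] for c in range(channels)) / channels
         for j in range(width)]
        for i in range(height)
    ]


def class_prototypes(support):
    """Compute one prototype per writer as the mean of its support embeddings.

    support: dict mapping writer id -> list of embedding vectors.
    """
    prototypes = {}
    for writer, vectors in support.items():
        dim = len(vectors[0])
        prototypes[writer] = [
            sum(v[d] for v in vectors) / len(vectors) for d in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Return the writer whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda w: math.dist(query, prototypes[w]))


# Toy episode: hypothetical embeddings standing in for CNN outputs.
support_set = {
    "writer_a": [[0.0, 0.0], [0.0, 2.0]],
    "writer_b": [[4.0, 4.0], [6.0, 4.0]],
}
prototypes = class_prototypes(support_set)
predicted = classify([1.0, 1.0], prototypes)
```

In a real pipeline the embeddings would come from the CNN backbone, and the averaged kernel would replace the first convolution's weights so that single-channel handwriting images can be fed to a network pre-trained on RGB data.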
