ESG Rating Prediction Using Deep Learning Methods

Rathnayake, Krishalika

dc.contributor.author	Rathnayake, Krishalika
dc.date.accessioned	2026-03-11T08:08:21Z
dc.date.available	2026-03-11T08:08:21Z
dc.date.issued	2025
dc.identifier.citation	Rathnayake, Krishalika (2025) ESG Rating Prediction Using Deep Learning Methods. MSc Dissertation, Informatics Institute of Technology	en_US
dc.identifier.issn	20232180
dc.identifier.uri	http://dlib.iit.ac.lk/xmlui/handle/123456789/2939
dc.description.abstract	Environmental, Social and Governance (ESG) factors are non-financial elements that investors increasingly incorporate into their analyses to uncover significant risks and development prospects. However, existing ESG grading systems suffer from inconsistency and restricted data sources, which undermine their accuracy and dependability. To address these difficulties, this project proposes a transparent multimodal deep learning pipeline that aims to improve accuracy by combining structured (numeric) and unstructured (text) data sources. A multistage pipeline was developed to collect and preprocess company reports, filter ESG-related text using domain-specific keywords, and retrieve numeric financial data from Yahoo Finance. The text branch encodes disclosures with a section-aware ESG BERT and learns a compact signal via ridge regression. The numeric branch models fundamentals with XGBoost. The evaluation uses GroupKFold by company ticker and out-of-fold (OOF) predictions to avoid firm-level leakage. Final predictions are produced by a ridge stacking meta-learner that fuses the text and numeric branches, while an early fusion MLP serves as a deep baseline. Performance was evaluated using R², Mean Absolute Error (MAE), and Mean Squared Error (MSE) across ESG dimensions. Results show that numeric data provides a strong backbone, while text adds a consistent but modest lift, particularly for social factors. Across OOF folds, the stacked model attains an R² of approximately 0.20 (environmental), 0.16 (social), and 0.07 (governance), outperforming text-only models and slightly improving upon numeric-only baselines. Meta-weights from the stacker quantify modality reliance. Explainability is integral to this project.
SHAP identifies global and local numeric drivers, meta-weights expose cross-modal trust, and anchor-free sentence ranking surfaces disclosure snippets linked to predictions. These results validate the feasibility of using deep learning for automated ESG scoring and establish a performance baseline for further improvement while contributing to explainable AI.	en_US
dc.language.iso	en	en_US
dc.subject	ESG Ratings	en_US
dc.subject	Deep Learning	en_US
dc.subject	Sustainability	en_US
dc.title	ESG Rating Prediction Using Deep Learning Methods	en_US
dc.type	Thesis	en_US
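
The evaluation scheme described in the abstract (GroupKFold by company ticker, out-of-fold predictions, and a ridge stacking meta-learner over a text branch and a numeric branch) can be illustrated with a minimal sketch. This is not the thesis code: the data are synthetic, the feature matrices `X_text` and `X_num` are hypothetical placeholders for ESG BERT embeddings and Yahoo Finance fundamentals, and scikit-learn's `GradientBoostingRegressor` is swapped in for XGBoost so the example needs only NumPy and scikit-learn.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import GroupKFold, cross_val_predict

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 40 hypothetical tickers, 5 rows each.
n_firms, n_years = 40, 5
groups = np.repeat(np.arange(n_firms), n_years)  # one group id per ticker
n = groups.size
X_text = rng.normal(size=(n, 16))  # placeholder for text embeddings
X_num = rng.normal(size=(n, 6))    # placeholder for financial fundamentals
y = X_num[:, 0] + 0.3 * X_text[:, 0] + rng.normal(size=n)  # toy ESG score

# All rows of a ticker fall in the same fold, so no firm is in train and test.
cv = GroupKFold(n_splits=5)

def oof(model, X):
    """Predict each row with a model that was not fitted on that row's ticker."""
    return cross_val_predict(model, X, y, groups=groups, cv=cv)

# Branch-level out-of-fold predictions.
oof_text = oof(Ridge(alpha=1.0), X_text)
oof_num = oof(GradientBoostingRegressor(random_state=0), X_num)

# Late fusion: a ridge meta-learner over the two branch predictions,
# itself scored out-of-fold with the same grouped splits.
Z = np.column_stack([oof_text, oof_num])
oof_stack = oof(Ridge(alpha=1.0), Z)

for name, pred in [("text", oof_text), ("numeric", oof_num), ("stacked", oof_stack)]:
    print(f"{name:8s} R2={r2_score(y, pred):.3f} MAE={mean_absolute_error(y, pred):.3f}")

# Meta-weights: coefficients on the text and numeric branch predictions.
meta = Ridge(alpha=1.0).fit(Z, y)
print("meta-weights (text, numeric):", meta.coef_)
```

Scores printed here reflect only the synthetic data and say nothing about the results reported in the thesis. The coefficients of `meta` play the role of the meta-weights mentioned in the abstract: their relative size indicates how much the stacker relies on each modality.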