Digital Repository

GREEN ROOTEDIN : Optimizing Satellite Image-Based Deforestation Detection with CNN and Vision Transformer Integration

Show simple item record

dc.contributor.author Vidanapathirana, Chanuka Subodh Laknath
dc.date.accessioned 2026-04-10T07:37:42Z
dc.date.available 2026-04-10T07:37:42Z
dc.date.issued 2025
dc.identifier.citation Vidanapathirana, Chanuka Subodh Laknath (2025) GREEN ROOTEDIN : Optimizing Satellite Image-Based Deforestation Detection with CNN and Vision Transformer Integration. BSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 20210284
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/3158
dc.description.abstract Deforestation is a significant threat to biodiversity, climate stability, and global heat regulation. Deforestation has various causes: some are natural, but human activities are the largest contributing factor. Governments, NGOs, and international organisations have taken various actions to prevent deforestation, including strict regulations, reforestation programmes, and educating people on preserving land resources. Nevertheless, deforestation still occurs all around the world. In this research, a deep learning based ensemble model for detecting deforestation was developed by combining two powerful architectures: a Convolutional Neural Network (CNN) and a Vision Transformer (ViT). The CNN was designed with four sequential blocks (from 64 to 512 filters) that progressively learned more detailed features from satellite images. Techniques such as batch normalization, dilated convolutions, and residual connections were applied to improve learning stability, expand the model's receptive field over image patterns, and support deeper layers. To prevent overfitting, dropout was gradually increased in the deeper layers. The ViT model was optimized using patch-based input (16×16), a 256-dimensional embedding, and five transformer encoder layers with 8 attention heads, allowing it to learn global patterns across the image. Both models were fine-tuned through systematic hyperparameter tuning with Optuna, which guided the selection of learning rates, dropout rates, and layer sizes. Their outputs were combined using a two-layer neural stacker, creating a robust ensemble capable of predicting multiple deforestation-related labels for each image. Because this research addresses multi-label image classification, metrics suited to that setting were used to evaluate the model.
The evaluation results for the whole model are as follows. Micro-averaged: Precision 0.9641, Recall 0.7458, F1-score 0.8410. Macro-averaged: Precision 0.6030, Recall 0.3444, F1-score 0.4038. Hamming Loss: 0.0480. en_US
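The abstract describes a two-layer neural stacker that combines the per-label probability outputs of the CNN and the ViT into final multi-label predictions. A minimal pure-Python sketch of such a stacker's forward pass follows; the layer sizes, weights, and label count are illustrative assumptions (the dissertation's actual sizes were selected by Optuna), not the thesis implementation.

```python
import math
import random

def dense(x, w, b, act):
    # One fully connected layer: y = act(W x + b), computed row by row.
    return [act(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

def relu(v):
    return max(0.0, v)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def stacker_forward(cnn_probs, vit_probs, w1, b1, w2, b2):
    # Concatenate the two base models' per-label probabilities, pass them
    # through a hidden ReLU layer, then a sigmoid output layer (one sigmoid
    # per label, the usual choice for multi-label classification).
    x = cnn_probs + vit_probs
    h = dense(x, w1, b1, relu)
    return dense(h, w2, b2, sigmoid)

# Toy dimensions: 4 labels, hidden size 8 (hypothetical, not from the thesis).
random.seed(0)
n_labels, hidden = 4, 8
w1 = [[random.uniform(-0.5, 0.5) for _ in range(2 * n_labels)] for _ in range(hidden)]
b1 = [0.0] * hidden
w2 = [[random.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n_labels)]
b2 = [0.0] * n_labels

cnn_probs = [0.9, 0.1, 0.7, 0.2]
vit_probs = [0.8, 0.2, 0.6, 0.3]
out = stacker_forward(cnn_probs, vit_probs, w1, b1, w2, b2)
print(out)  # four per-label probabilities, each in (0, 1)
```

In practice such a stacker is trained on the base models' held-out predictions, so it can learn which of the two architectures to trust for each label.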
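The reported metrics (micro- and macro-averaged precision, recall, F1, and Hamming loss) are standard for multi-label evaluation. The following sketch shows how they are computed, using toy labels and predictions that are illustrative only, not the dissertation's data.

```python
def confusion_counts(y_true, y_pred, label):
    # True positives, false positives, false negatives for one label column.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[label] and p[label])
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t[label] and p[label])
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[label] and not p[label])
    return tp, fp, fn

def prf(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def micro_prf(y_true, y_pred, n_labels):
    # Micro-averaging pools the counts over all labels before computing P/R/F1.
    tp = fp = fn = 0
    for k in range(n_labels):
        a, b, c = confusion_counts(y_true, y_pred, k)
        tp, fp, fn = tp + a, fp + b, fn + c
    return prf(tp, fp, fn)

def macro_prf(y_true, y_pred, n_labels):
    # Macro-averaging computes P/R/F1 per label, then takes the plain mean,
    # which is why rare labels can pull the macro scores well below the micro.
    scores = [prf(*confusion_counts(y_true, y_pred, k)) for k in range(n_labels)]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))

def hamming_loss(y_true, y_pred):
    # Fraction of individual label assignments that are wrong.
    errors = sum(t != p for row_t, row_p in zip(y_true, y_pred)
                 for t, p in zip(row_t, row_p))
    return errors / (len(y_true) * len(y_true[0]))

# Toy example: 3 images, 3 deforestation-related labels per image.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
print(micro_prf(y_true, y_pred, 3))
print(macro_prf(y_true, y_pred, 3))
print(hamming_loss(y_true, y_pred))
```

The gap between the dissertation's micro scores (F1 0.8410) and macro scores (F1 0.4038) is typical when some labels are much rarer than others: micro-averaging is dominated by frequent labels, macro-averaging weights every label equally.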
dc.language.iso en en_US
dc.subject Deforestation en_US
dc.subject Satellite Imagery en_US
dc.subject Deep Learning en_US
dc.title GREEN ROOTEDIN : Optimizing Satellite Image-Based Deforestation Detection with CNN and Vision Transformer Integration en_US
dc.type Thesis en_US

