| dc.description.abstract |
3D object reconstruction from 2D images is a long-standing challenge in computer vision due to the loss of depth information, occlusions, and the geometric ambiguity present in single-view inputs. Traditional reconstruction techniques rely on multi-camera setups, controlled lighting, or depth sensors, making them impractical for everyday users or lightweight applications. Modern deep learning approaches have improved reconstruction accuracy but often demand substantial computational resources, lack intelligent multi-view reasoning, or produce low-resolution voxel outputs with visible artifacts. To address these limitations, this research presents an enhanced multi-view 3D reconstruction system based on an improved Pix2Vox architecture capable of producing accurate voxel models using only three orthographic 2D views: front, side, and top.
The proposed system integrates several key innovations to improve reconstruction quality and accessibility. Shared MobileNetV2 encoders are used to extract consistent features across views while maintaining computational efficiency. A novel attention-based fusion mechanism adaptively weights each view based on its geometric contribution, allowing the network to focus on the most informative perspectives. A progressive decoding pipeline then transforms fused features into a 32³ voxel representation through hierarchical upsampling. A specialized 3D refinement network further enhances surface continuity and reduces voxel-level artifacts. The model is trained on the ModelNet10 dataset using a multi-objective loss function that combines voxel accuracy, surface consistency, and volume preservation.
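The attention-based view fusion described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the thesis implementation: the pooled per-view feature shape `(V, C)` and the scoring vector `w_score` are hypothetical stand-ins for the learned attention parameters of the network.

```python
import numpy as np

def attention_fuse(view_features: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Weight each view's pooled features by a softmax attention score.

    view_features: array of shape (V, C) -- one C-dimensional feature
    vector per view (e.g. front, side, top). The scorer below is a
    fixed random stand-in for the network's learned attention weights.
    """
    V, C = view_features.shape
    rng = np.random.default_rng(0)
    w_score = rng.standard_normal(C) / np.sqrt(C)   # hypothetical learned scorer
    scores = view_features @ w_score                # (V,) one relevance score per view
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # softmax: weights sum to 1
    fused = weights @ view_features                 # (C,) attention-weighted sum
    return fused, weights

# Example: three identical views receive equal weight (1/3 each),
# so the fused vector equals any single view's features.
feats = np.ones((3, 128))
fused, weights = attention_fuse(feats)
```

In the full pipeline the fused tensor would feed the progressive decoder, which hierarchically upsamples it to the 32³ voxel grid; views whose geometry contributes more information receive correspondingly larger softmax weights.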
The complete system is deployed through a user-friendly platform built with Flask and Next.js, enabling real-time inference and interactive 3D visualization via Three.js. Experimental results demonstrate that the enhanced architecture provides improved reconstruction accuracy, faster inference, and higher-quality voxel outputs compared to baseline approaches, making 3D reconstruction more accessible for educational, creative, and research applications. |
en_US |