| dc.description.abstract |
Fraudulent attendance manipulation threatens academic integrity, compliance, and resource use in
schools and businesses. Traditional systems such as sign-in sheets, RFID badges, and basic biometrics are
vulnerable to proxy sign-ins, cloned IDs, and video spoofing (e.g., replay attacks, deepfakes),
leading to inflated grades, false certifications, financial loss, and reduced trust.
This dissertation introduces a robust, end-to-end solution that harnesses spatiotemporal video
analytics via a three-dimensional convolutional neural network (3D CNN). The system ingests
short video clips captured at the point of attendance, uniformly samples 16 key frames, and applies
a preprocessing pipeline comprising frame resizing (112 × 112 px), pixel-value normalization, and
on-the-fly data augmentation (random cropping, horizontal flips, brightness/contrast jitter, and
minor rotations). A bespoke 3D CNN architecture, featuring stacked 3D convolutional blocks with
ReLU activations, max-pooling layers that preserve temporal resolution, a flatten-and-dense head
with dropout regularization, and a sigmoid-activated output neuron, learns to discriminate genuine
live submissions from forgeries by capturing both spatial facial features and temporal
motion cues.
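The sampling and preprocessing steps described above can be sketched as follows. This is a minimal NumPy illustration, not the dissertation's code: the function name, the nearest-neighbour resize (standing in for a proper interpolator), and the synthetic input clip are all assumptions.

```python
import numpy as np

NUM_FRAMES, SIZE = 16, 112  # 16 key frames, resized to 112 x 112 px

def preprocess_clip(video: np.ndarray) -> np.ndarray:
    """video: (T, H, W, 3) uint8 -> (16, 112, 112, 3) float32 in [0, 1]."""
    t, h, w, _ = video.shape
    # Uniformly sample 16 frame indices across the whole clip.
    idx = np.linspace(0, t - 1, NUM_FRAMES).round().astype(int)
    clip = video[idx]
    # Nearest-neighbour resize to 112 x 112 (illustrative stand-in).
    rows = np.linspace(0, h - 1, SIZE).round().astype(int)
    cols = np.linspace(0, w - 1, SIZE).round().astype(int)
    clip = clip[:, rows][:, :, cols]
    # Normalize pixel values to [0, 1].
    return clip.astype(np.float32) / 255.0

# Example: a synthetic 40-frame 240 x 320 clip.
clip = preprocess_clip(
    np.random.randint(0, 256, (40, 240, 320, 3), dtype=np.uint8)
)
print(clip.shape)  # (16, 112, 112, 3)
```

The augmentation steps (random crops, flips, brightness/contrast jitter, small rotations) would be applied on the fly after this stage, during training only.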
The model is trained using the Adam optimizer, binary cross-entropy loss, and early stopping on
155 annotated videos. It achieves 82.53% training accuracy, 67.74% validation accuracy, and an
F₁-score of 0.68, detecting subtle liveness cues such as blinks and head movements while rejecting
spoofed footage. Case studies highlight its sensitivity to deepfake artifacts. The results show the
potential of 3D CNNs for real-time, secure attendance verification. Future work includes dataset
expansion, transfer learning, edge optimization, and multi-modal fusion. |
en_US |