AI-Powered Optical Character Recognition for Automated Timesheet Data Extraction: A Multimodal Approach for Handling Document Degradation

Florence Jean B. Talirongan; Mellanie S. Gambe

doi:10.47772/IJRISS.2026.100300188

Published: March 31, 2026 • DOI: 10.47772/IJRISS.2026.100300188

Abstract

This study explores the use of Optical Character Recognition (OCR) to extract employee information from physical datasheets. The system integrates the Google Gemini 2.5 Flash model with a React frontend. It was evaluated on 80 samples, 20 in each of four degradation states: original, folded, crumpled, and wet. Accuracy was 100% for original documents, 90% for folded documents, 70% for crumpled documents, and 91.66% for wet timesheets, for an overall accuracy of 87.92%. These results suggest that context-aware multimodal reasoning can substantially reduce reliance on standard binarization and template matching in real-world document digitization, achieving accuracy 12–47 percentage points higher than the baseline OCR. This work serves as a baseline for assessing how document degradation affects manual-to-digital data extraction.
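The overall figure can be checked against the per-category results. The short sketch below assumes the overall accuracy is the unweighted mean of the four category accuracies, which is reasonable given that each category has the same number of samples; the abstract does not state the aggregation method, and the names `category_accuracy` and `mean_accuracy` are illustrative only.

```python
# Per-category accuracies (percent) as reported in the abstract.
category_accuracy = {
    "original": 100.0,
    "folded": 90.0,
    "crumpled": 70.0,
    "wet": 91.66,
}


def mean_accuracy(per_category: dict[str, float]) -> float:
    """Unweighted mean of the category accuracies.

    Assumption: every category contributes equally, which matches the
    equal split of 20 samples per category described in the abstract.
    """
    return sum(per_category.values()) / len(per_category)


overall = mean_accuracy(category_accuracy)
print(f"Overall accuracy: {overall:.3f}%")
```

Under this assumption the mean is 351.66 / 4 = 87.915, which matches the reported 87.92% after rounding to two decimal places.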
