2nd GCC International Conference on Industrial Engineering and Operations Management

UDiNet: A dilated U-net for improving OCR performance

Mahyar Fardinfar, Pouya Rashidikia, Mohammadreza Rezaie & Mina Zolfy Lighvan
Publisher: IEOM Society International
Abstract

Image denoising is a critical task in the field of computer vision. This paper introduces the UDiNet architecture, a dilated variant of the U-Net, specifically designed to address image denoising challenges. We present a novel dataset comprising book sheet images to rigorously evaluate the performance of the proposed method. Experimental results demonstrate that UDiNet significantly enhances the performance of established Optical Character Recognition (OCR) systems, such as Tesseract and Genome. The model effectively mitigates severe noise while preserving essential structural details of English characters. This capability positions UDiNet as a valuable preprocessing technique for various applications, including classification, detection, and OCR tasks. To promote further research in this domain, we have made the code and trained models publicly accessible.
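
The abstract does not specify UDiNet's layer configuration, so the following is only a minimal sketch of what a dilated convolution is in general, not the authors' implementation. It assumes stride 1, no padding, and a single channel; the function names, the 5x5 input, and the 3x3 averaging kernel are hypothetical and chosen purely for illustration. The point it shows is that a 3x3 kernel with dilation 2 covers a 5x5 area with the same nine weights, which is why dilation is commonly used to widen a network's receptive field without adding parameters.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered along one axis by a kernel with the given dilation."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2D cross-correlation with a dilated kernel (stride 1, no padding)."""
    k_h, k_w = len(kernel), len(kernel[0])
    span_h = effective_kernel_size(k_h, dilation)
    span_w = effective_kernel_size(k_w, dilation)
    out_h = len(image) - span_h + 1
    out_w = len(image[0]) - span_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel span")
    output = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            total = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    # Kernel taps are spaced `dilation` pixels apart.
                    total += kernel[i][j] * image[row + i * dilation][col + j * dilation]
            out_row.append(total)
        output.append(out_row)
    return output


# Hypothetical 5x5 input and 3x3 averaging kernel, for illustration only.
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

dense = dilated_conv2d(image, kernel, dilation=1)    # 3x3 output, 3x3 span
dilated = dilated_conv2d(image, kernel, dilation=2)  # 1x1 output, 5x5 span
```

With dilation 1 the kernel sees a 3x3 neighbourhood; with dilation 2 the same nine weights sample every other pixel across the full 5x5 input.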

Published in: 2nd GCC International Conference on Industrial Engineering and Operations Management, Muscat, Oman

Date of Conference: December 1-3, 2024

ISBN: 979-8-3507-4442-2
ISSN/E-ISSN: 2169-8767