Image Text to Speech Conversion Using Optical Character Recognition

Authors

DOI:

https://doi.org/10.61841/fbhhgx23

Keywords:

Recurrent Neural Network (RNN), Optical Character Recognition (OCR), Long Short-Term Memory (LSTM), Otsu's Method

Abstract

Nowadays, digital storage is preferred to paper storage. Documents are scanned and stored as image files. To retrieve an image from a large collection, text recognition is performed on it. The text in an image can be in any language and may also be handwritten. Image processing is used to extract the text, which is then converted to audio format to avoid ambiguity in handwritten files, since a person's handwriting can be difficult to understand. A few automated machine learning methods exist, but they have failed to provide accurate results. In this work, the input image is preprocessed using Long Short-Term Memory (LSTM) in a Recurrent Neural Network (RNN), a deep learning algorithm. In addition, Optical Character Recognition (OCR) uses Otsu's method for image binarization and segmentation, and the extracted text is then converted into audio format with better accuracy and clarity.
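
The binarization step named in the abstract can be illustrated with a minimal sketch of Otsu's method. This is not the authors' implementation: the function names `otsu_threshold` and `binarize` are hypothetical, the image is assumed to be a list of rows of 8-bit grayscale values (0 to 255), and the code uses only the Python standard library. It picks the threshold that maximizes the between-class variance of the two pixel groups.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes between-class variance.

    `pixels` is a flat list of integer intensities in 0..255 (an assumption
    of this sketch). Pixels <= t form the background class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of intensities of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue              # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break                 # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(image):
    """Map each pixel to 1 if it is above the Otsu threshold, else 0."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p > t else 0 for p in row] for row in image]


if __name__ == "__main__":
    sample = [[10, 200], [200, 10]]   # hypothetical 2x2 grayscale image
    print(binarize(sample))
```

In practice the binary image would then be passed to the segmentation and recognition stages; a library routine for Otsu thresholding would normally replace this hand-written loop.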

Published

31.07.2020

How to Cite

Image Text to Speech Conversion Using Optical Character Recognition. (2020). International Journal of Psychosocial Rehabilitation, 24(5), 4199-4205. https://doi.org/10.61841/fbhhgx23