Deep Learning-Based Cursive Text Detection and Recognition in Natural Scene Images

Vijay Gugulothu; Dr. Ramu Vankudoth

doi:10.32628/IJSRST523103161

Authors

Vijay Gugulothu, Department of Computer Science, Chaitanya Deemed to be University, Warangal, Telangana, India
Dr. Ramu Vankudoth, Department of Computer Science and Engineering, Data Science, Secunderabad, Telangana, India

DOI:

https://doi.org/10.32628/IJSRST523103161

Keywords:

Cursive text detection, Cursive text recognition, Deep learning, Natural scene images, Convolutional neural networks, Recurrent neural networks, Attention mechanisms.

Abstract

Detecting and recognizing cursive text in natural scene images is difficult because handwriting styles are highly variable and intricate. This research develops a deep learning-based approach to improve the accuracy and robustness of both tasks. A diverse dataset of natural scene images containing cursive text is collected and annotated. A deep learning model for cursive text detection is trained on this dataset, and text line segmentation techniques are investigated to enhance accuracy. A deep learning-based recognition system is then designed and implemented to transcribe cursive text into machine-readable text. The proposed approach is evaluated against existing methods using appropriate metrics and benchmark datasets. The research aims to provide insight into the challenges and opportunities of cursive text analysis in real-world scenarios and to contribute to advances in document digitization, handwriting analysis, and information retrieval.
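
The abstract refers to evaluation with appropriate metrics without naming them. As a minimal sketch of one metric commonly used for text recognition, assuming character error rate (CER) is among those meant, a transcription could be scored against its ground truth as follows. The function names are illustrative and are not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance divided by the reference length (assumed metric)."""
    if not ref:
        raise ValueError("reference transcription must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical ground truth and recognizer output, for illustration only.
    print(character_error_rate("hello", "hallo"))
```

A lower CER indicates a transcription closer to the ground truth; a value of 0 means the recognized text matches it exactly.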

References

Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 580-587).
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016). SSD: Single shot multibox detector. In European conference on computer vision (ECCV) (pp. 21-37).
Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 779-788).
Graves, A., Mohamed, A. R., & Hinton, G. (2013). Speech recognition with deep recurrent neural networks. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6645-6649).
Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
Marti, U. V., & Bunke, H. (2002). The IAM-database: An English sentence database for offline handwriting recognition. International Journal on Document Analysis and Recognition (IJDAR), 5(1), 39-46.
Gatos, B., Pratikakis, I., Ntirogiannis, K., & Halatsis, C. (2009). A holistic approach to handwritten text line recognition. Pattern Recognition, 42(12), 2998-3003.
ICDAR competitions datasets: https://rrc.cvc.uab.es/
