Deep Learning-Based Cursive Text Detection and Recognition in Natural Scene Images

Authors

  • Vijay Gugulothu  Department of Computer Science, Chaitanya Deemed to be University, Warangal, Telangana, India
  • Dr. Ramu Vankudoth  Department of Computer Science and Engineering, Data Science, Secunderabad, Telangana, India

DOI:

https://doi.org/10.32628/IJSRST523103161

Keywords:

Cursive text detection, Cursive text recognition, Deep learning, Natural scene images, Convolutional neural networks, Recurrent neural networks, Attention mechanisms.

Abstract

Detecting and recognizing cursive text in natural scene images is difficult because handwriting styles are highly variable and intricate. This research develops a deep learning-based approach to these challenges, aiming to improve the accuracy and robustness of both detection and recognition. A diverse dataset of natural scene images containing cursive text is collected and annotated. A deep learning detection model is trained on this dataset, and text line segmentation techniques are investigated to improve accuracy. A deep learning-based recognition system is then designed and implemented to transcribe cursive text into machine-readable text. The approach is evaluated against existing methods using appropriate metrics and benchmark datasets. The research aims to provide insight into the challenges and opportunities of cursive text analysis in real-world scenarios and to contribute to document digitization, handwriting analysis, and information retrieval.
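
The abstract does not name the evaluation metrics used. As an illustration only, a metric commonly reported for text recognition is the character error rate (CER): the edit distance between the reference transcription and the predicted one, divided by the reference length. The sketch below assumes that metric; the function names `edit_distance` and `character_error_rate` are hypothetical and do not come from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / reference length (assumed metric, not from the paper)."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

A CER of 0.0 means the predicted transcription matches the reference exactly. Values above 1.0 are possible when the prediction is much longer than the reference.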

Published

2023-06-30

Section

Research Articles

How to Cite

[1]
Vijay Gugulothu, Dr. Ramu Vankudoth, "Deep Learning-Based Cursive Text Detection and Recognition in Natural Scene Images," International Journal of Scientific Research in Science and Technology (IJSRST), Online ISSN: 2395-602X, Print ISSN: 2395-6011, Volume 10, Issue 3, pp. 969-974, May-June 2023. Available at doi: https://doi.org/10.32628/IJSRST523103161