
frankfqchen/Deep-Learning-for-OCR

 
 


Deep Learning for OCR

This is a reading list for deep learning for OCR.

OCR paper and codes

Papers

2018

  • Deng, D., Liu, H., Li, X., & Cai, D. (2018). PixelLink: Detecting Scene Text via Instance Segmentation. arXiv preprint arXiv:1801.01315. [pdf] [code]

2017

  • Detecting Oriented Text in Natural Images by Linking Segments, Shi, B., Bai, X., & Belongie, S. (2017, July). CVPR (pp. 3482-3490). IEEE. [pdf] [code]

  • Gated Recurrent Convolution Neural Network for OCR, Wang, J., & Hu, X. (2017). NIPS (pp. 335-344). [pdf] [code]

  • TextBoxes: A Fast Text Detector with a Single Deep Neural Network, Liao, M., Shi, B., Bai, X., Wang, X., & Liu, W. (2017, February). AAAI (pp. 4161-4167). [pdf] [code]

  • Deep Direct Regression for Multi-Oriented Scene Text Detection, He, W., Zhang, X. Y., Yin, F., & Liu, C. L. (2017). ICCV (pp. 745-753). [pdf]

  • An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition, Shi, B., Bai, X., & Yao, C. (2017). TPAMI, 39(11), 2298-2304. [pdf] [code]

  • EAST: an efficient and accurate scene text detector, Zhou, X., Yao, C., Wen, H., Wang, Y., Zhou, S., He, W., & Liang, J. (2017, July). CVPR (pp. 2642-2651). [pdf] [code]

  • Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention, Theodore Bluche, Jerome Louradour, Ronaldo Messina, ICDAR, 2017. [pdf]

  • Combining Convolutional Neural Networks and LSTMs for Segmentation-Free OCR, Rawls, S., Cao, H., Kumar, S., & Natarajan, P. (2017, November). ICDAR (Vol. 1, pp. 155-160). IEEE. [pdf]

  • Combining deep learning and language modeling for segmentation-free OCR from raw pixels, Rawls, S., Cao, H., Sabir, E., & Natarajan, P. (2017, April). ASAR (pp. 119-123). IEEE. [pdf]

  • Implicit Language Model in LSTM for OCR, Sabir, E., Rawls, S., & Natarajan, P. (2017, November). ICDAR (Vol. 7, pp. 27-31). IEEE. [pdf]

  • Towards end-to-end text spotting with convolutional recurrent neural networks, Li, H., Wang, P., & Shen, C. (2017, October). ICCV (pp. 5238-5246). [pdf]

  • Deep matching prior network: Toward tighter multi-oriented text detection, Liu, Y., & Jin, L. (2017, March). CVPR (pp. 3454-3461). [pdf]

2016

  • Multi-oriented text detection with fully convolutional networks, Zhang, Z., Zhang, C., Shen, W., Yao, C., Liu, W., & Bai, X. (2016). CVPR (pp. 4159-4167). [pdf]

  • Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016, October). SSD: Single Shot MultiBox Detector. In European Conference on Computer Vision (pp. 21-37). Springer, Cham. [pdf] [code]

  • DeepText: A unified framework for text proposal generation and text detection in natural images, Zhong, Z., Jin, L., Zhang, S., & Feng, Z. (2016). arXiv preprint arXiv:1605.07314. [pdf]

  • Synthetic data for text localisation in natural images, Gupta, A., Vedaldi, A., & Zisserman, A. (2016). CVPR (pp. 2315-2324). [pdf]

  • Reading text in the wild with convolutional neural networks (2016), M. Jaderberg et al. IJCV, (DeepMind) [pdf] [code]

  • Detecting text in natural image with connectionist text proposal network, Tian, Z., Huang, W., He, T., He, P., & Qiao, Y. (2016, October). ECCV (pp. 56-72). Springer, Cham. [pdf]

  • Recursive Recurrent Nets with Attention Modeling for OCR in the Wild, Chen-Yu Lee, Simon Osindero, CVPR, 2016. [pdf]

  • Reading Scene Text in Deep Convolutional Sequences, Pan He, Weilin Huang, Yu Qiao, Chen Change Loy, and Xiaoou Tang, AAAI, 2016. [pdf]

  • You only look once: Unified, real-time object detection, Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). CVPR (pp. 779-788). [pdf] [code]

  • Scene text detection via holistic, multi-channel prediction, Yao, C., Bai, X., Sang, N., Zhou, X., Zhou, S., & Cao, Z. (2016). arXiv preprint arXiv:1606.09002. [pdf]

2015

  • Unsupervised feature learning for optical character recognition, Sahu, D. K., & Jawahar, C. V. (2015, August). ICDAR (pp. 1041-1045). IEEE. [pdf]

  • Sequence to sequence learning for optical character recognition, Devendra Kumar Sahu & Mohak Sukhwani, 2015. [pdf]

  • ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks, Francesco Visin, Kyle Kastner, Kyunghyun Cho, Matteo Matteucci, Aaron Courville, Yoshua Bengio. [pdf]

  • Faster R-CNN: Towards real-time object detection with region proposal networks, Ren, S., He, K., Girshick, R., & Sun, J. (2015). NIPS (pp. 91-99). [pdf] [code]

  • Symmetry-based text line detection in natural scenes, Zhang, Z., Shen, W., Yao, C., & Bai, X. (2015). CVPR (pp. 2558-2567). [pdf]

2014

  • Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). CVPR (pp. 580-587). [pdf] [code]

  • A Comparison of Sequence-Trained Deep Neural Networks and Recurrent Neural Networks Optical Modeling for Handwriting Recognition, Theodore Bluche, Hermann Ney, and Christopher Kermorvant, SLSP, 2014. [pdf]

  • Orientation robust text line detection in natural images, Kang, L., Li, Y., & Doermann, D. (2014). CVPR (pp. 4034-4041). [pdf]

  • Towards End-to-End Speech Recognition with Recurrent Neural Networks, Alex Graves, Navdeep Jaitly, ICML, 2014. [pdf]

  • Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks, Ian J. Goodfellow, Yaroslav Bulatov, Julian Ibarz, Sacha Arnoud, Vinay Shet, ICLR, 2014. [pdf]

  • Deep Features for Text Spotting, M. Jaderberg, A. Vedaldi, A. Zisserman, ECCV, 2014. [pdf] [code]

2013

  • PhotoOCR: Reading Text in Uncontrolled Conditions, Alessandro Bissacco, Mark Cummins, Yuval Netzer, Hartmut Neven, ICCV, 2013. [pdf]

  • High Performance OCR for Printed English and Fraktur using LSTM Networks, ICDAR, 2013. [pdf]

  • Image binarization for end-to-end text understanding in natural images, Sergey Milyaev, Olga Barinova, Tatiana Novikova, Pushmeet Kohli, Victor Lempitsky, ICDAR, 2013. [pdf]

  • Multi-digit number recognition from street view imagery using deep convolutional neural networks, Goodfellow, I. J., Bulatov, Y., Ibarz, J., Arnoud, S., & Shet, V. (2013). arXiv preprint arXiv:1312.6082. [pdf]

2012

  • Text Recognition in Videos using a Recurrent Connectionist Approach, Khaoula Elagouni, Christophe Garcia, Franck Mamalet, and Pascale Sebillot, ICANN, 2012. [pdf]

  • A Novel Word Spotting Method Based on Recurrent Neural Networks, Volkmar Frinken, Andreas Fischer, R. Manmatha, and Horst Bunke, TPAMI, 2012. [pdf]

  • End-to-End Text Recognition with Convolutional Neural Networks, Tao Wang, David J. Wu, Adam Coates, Andrew Y. Ng, ICPR, 2012. [pdf]

  • Detecting texts of arbitrary orientations in natural images, Yao, C., Bai, X., Liu, W., Ma, Y., & Tu, Z. (2012, June). CVPR (pp. 1083-1090). IEEE. [pdf] [dataset]

2011

  • Robust text detection in natural images with edge-enhanced maximally stable extremal regions, Chen, H., Tsai, S. S., Schroth, G., Chen, D. M., Grzeszczuk, R., & Girod, B. (2011, September). ICIP (pp. 2609-2612). IEEE. [pdf] [code]

2010

  • Object detection with discriminatively trained part-based models, Felzenszwalb, P. F., Girshick, R. B., McAllester, D., & Ramanan, D. (2010). TPAMI, 32(9), 1627-1645. [pdf] [code]

Other

  • Optical Character Recognition (OCR), Marina Samuel, [blog]

  • The Unreasonable Effectiveness of Recurrent Neural Networks, Andrej Karpathy, 2015, [blog]

Keynote

Lecture Slides

Thesis

  • Deep Learning for Text Spotting, Robotics Research Group, Department of Engineering Science, University of Oxford [pdf]

About

This is a reading list for deep learning for OCR (Based on https://github.com/hs105/Deep-Learning-for-OCR )
