Deep Learning for OCR

This is a reading list of papers and code on deep learning for OCR.

Papers

2018

Deng, D., Liu, H., Li, X., & Cai, D. (2018). PixelLink: Detecting Scene Text via Instance Segmentation. arXiv preprint arXiv:1801.01315. [pdf] [code]

2017

Detecting Oriented Text in Natural Images by Linking Segments, Shi, B., Bai, X., & Belongie, S. (2017, July). CVPR (pp. 3482-3490). IEEE. [pdf] [code]
Gated Recurrent Convolution Neural Network for OCR [pdf] [code] Wang, J., & Hu, X. (2017). NIPS (pp. 335-344).
TextBoxes: A Fast Text Detector with a Single Deep Neural Network, Liao, M., Shi, B., Bai, X., Wang, X., & Liu, W. (2017, February). AAAI (pp. 4161-4167). [pdf] [code]
Deep Direct Regression for Multi-Oriented Scene Text Detection, He, W., Zhang, X. Y., Yin, F., & Liu, C. L. (2017). ICCV (pp. 745-753). [pdf]
An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition [pdf] [code] Shi, B., Bai, X., & Yao, C. (2017). TPAMI, 39(11), 2298-2304.
EAST: an efficient and accurate scene text detector, Zhou, X., Yao, C., Wen, H., Wang, Y., Zhou, S., He, W., & Liang, J. (2017, July). CVPR (pp. 2642-2651). [pdf] [code]
Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention, Theodore Bluche, Jerome Louradour, Ronaldo Messina, ICDAR, 2017. pdf
Combining Convolutional Neural Networks and LSTMs for Segmentation-Free OCR, Rawls, S., Cao, H., Kumar, S., & Natarajan, P. (2017, November). ICDAR, (Vol. 1, pp. 155-160). IEEE. [pdf]
Combining deep learning and language modeling for segmentation-free OCR from raw pixels, Rawls, S., Cao, H., Sabir, E., & Natarajan, P. (2017, April) ASAR, (pp. 119-123). IEEE. [pdf]
Implicit Language Model in LSTM for OCR, Sabir, E., Rawls, S., & Natarajan, P. (2017, November). ICDAR, (Vol. 7, pp. 27-31). IEEE. [pdf]
Towards end-to-end text spotting with convolutional recurrent neural networks, Li, H., Wang, P., & Shen, C. (2017, October). ICCV (pp. 5238-5246). [pdf]
Deep matching prior network: Toward tighter multi-oriented text detection, Liu, Y., & Jin, L. (2017, March). CVPR (pp. 3454-3461). [pdf]

2016

Multi-oriented text detection with fully convolutional networks, Zhang, Z., Zhang, C., Shen, W., Yao, C., Liu, W., & Bai, X. (2016). CVPR (pp. 4159-4167). [pdf]
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016, October). SSD: Single shot multibox detector. In European Conference on Computer Vision (pp. 21-37). Springer, Cham. [pdf] [code]
DeepText: A unified framework for text proposal generation and text detection in natural images, Zhong, Z., Jin, L., Zhang, S., & Feng, Z. (2016). arXiv preprint arXiv:1605.07314. [pdf]
Synthetic data for text localisation in natural images, Gupta, A., Vedaldi, A., & Zisserman, A. (2016). CVPR (pp. 2315-2324). [pdf]
Reading text in the wild with convolutional neural networks (2016), M. Jaderberg et al. IJCV, (DeepMind) [pdf] [code]
Detecting text in natural image with connectionist text proposal network, Tian, Z., Huang, W., He, T., He, P., & Qiao, Y. (2016, October). ECCV (pp. 56-72). Springer, Cham. [pdf]
Recursive Recurrent Nets with Attention Modeling for OCR in the Wild, Chen-Yu Lee, Simon Osindero, CVPR, 2016, pdf
Reading Scene Text in Deep Convolutional Sequences, Pan He, Weilin Huang, Yu Qiao, Chen Change Loy, and Xiaoou Tang, AAAI, 2016, pdf
You only look once: Unified, real-time object detection, Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). CVPR (pp. 779-788). [pdf] [code]
Scene text detection via holistic, multi-channel prediction, Yao, C., Bai, X., Sang, N., Zhou, X., Zhou, S., & Cao, Z. (2016). arXiv preprint arXiv:1606.09002. [pdf]

2015

Unsupervised feature learning for optical character recognition, Sahu, D. K., & Jawahar, C. V. (2015, August). ICDAR (pp. 1041-1045). IEEE. [pdf]
Sequence to sequence learning for optical character recognition, Devendra Kumar Sahu & Mohak Sukhwani, 2015, pdf
ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks, Francesco Visin, Kyle Kastner, Kyunghyun Cho, Matteo Matteucci, Aaron Courville, Yoshua Bengio. pdf
Faster R-CNN: Towards real-time object detection with region proposal networks, Ren, S., He, K., Girshick, R., & Sun, J. (2015). NIPS (pp. 91-99). [pdf] [code]
Symmetry-based text line detection in natural scenes, Zhang, Z., Shen, W., Yao, C., & Bai, X. (2015). CVPR (pp. 2558-2567). [pdf]

2014

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). CVPR (pp. 580-587). [pdf] [code]
A Comparison of Sequence-Trained Deep Neural Networks and Recurrent Neural Networks Optical Modeling for Handwriting Recognition, Theodore Bluche, Hermann Ney, and Christopher Kermorvant, SLSP, 2014. pdf
Orientation robust text line detection in natural images, Kang, L., Li, Y., & Doermann, D. (2014). CVPR (pp. 4034-4041). [pdf]
Towards End-to-End Speech Recognition with Recurrent Neural Networks. Alex Graves, Navdeep Jaitly. ICML, 2014. pdf
Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks, Ian J. Goodfellow, Yaroslav Bulatov, Julian Ibarz, Sacha Arnoud, Vinay Shet, ICLR, 2014. pdf
Deep Features for Text Spotting, ECCV, M. Jaderberg, A. Vedaldi, A. Zisserman, 2014. pdf, code

2013

PhotoOCR: Reading Text in Uncontrolled Conditions, Alessandro Bissacco, Mark Cummins, Yuval Netzer, Hartmut Neven, ICCV, 2013. pdf
High Performance OCR for Printed English and Fraktur using LSTM Networks. ICDAR, 2013. pdf
Image binarization for end-to-end text understanding in natural images, Sergey Milyaev, Olga Barinova, Tatiana Novikova, Pushmeet Kohli, Victor Lempitsky. ICDAR, 2013, pdf
Multi-digit number recognition from street view imagery using deep convolutional neural networks, Goodfellow, I. J., Bulatov, Y., Ibarz, J., Arnoud, S., & Shet, V. (2013). arXiv preprint arXiv:1312.6082. [pdf]

2012

Text Recognition in Videos using a Recurrent Connectionist Approach, Khaoula Elagouni, Christophe Garcia, Franck Mamalet, and Pascale Sebillot, ICANN, 2012. pdf
A Novel Word Spotting Method Based on Recurrent Neural Networks, Volkmar Frinken, Andreas Fischer, R. Manmatha, and Horst Bunke, TPAMI, 2012. pdf
End-to-End Text Recognition with Convolutional Neural Networks, Tao Wang, David J. Wu, Adam Coates, Andrew Y. Ng, ICPR, 2012. pdf
Detecting texts of arbitrary orientations in natural images, Yao, C., Bai, X., Liu, W., Ma, Y., & Tu, Z. (2012, June). CVPR (pp. 1083-1090). IEEE. [pdf] [dataset]

2011

Robust text detection in natural images with edge-enhanced maximally stable extremal regions, Chen, H., Tsai, S. S., Schroth, G., Chen, D. M., Grzeszczuk, R., & Girod, B. (2011, September). ICIP (pp. 2609-2612). IEEE. [pdf] [code]

2010

Object detection with discriminatively trained part-based models, Felzenszwalb, P. F., Girshick, R. B., McAllester, D., & Ramanan, D. (2010). TPAMI, 32(9), 1627-1645. [pdf] [code]

Other

Optical Character Recognition (OCR), Marina Samuel, [blog]
The Unreasonable Effectiveness of Recurrent Neural Networks, Andrej Karpathy, 2015, [blog]

Keynote

Deep Neural Networks for Scene Text Reading, Xiang Bai, Huazhong University of Science and Technology

Lecture Slides

Scene Text Detection and Recognition [pdf] Dr. Cong Yao, Megvii (Face++) Researcher

Thesis

Deep Learning for Text Spotting, Robotics Research Group, Department of Engineering Science, University of Oxford [pdf]


frankfqchen/Deep-Learning-for-OCR

Folders and files

Latest commit

History

Repository files navigation

Deep Learning for OCR

Papers

2018

2017

2016

2015

2014

2013

2012

2011

2010

Other

Keynote

Lecture Slides

Thesis

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages