Commit 28359ae

Include and update the restructured README
For better integration with PyPI
1 parent 17ba5c2 commit 28359ae

1 file changed

+65
-0
lines changed

README.rst

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
Python Tesseract
================

Python-tesseract is an optical character recognition (OCR) tool for Python.
That is, it will recognize and "read" the text embedded in images.

Python-tesseract is a wrapper for `Google's Tesseract-OCR Engine`_. It is also useful as a
stand-alone invocation script to tesseract, as it can read all image types
supported by the Python Imaging Library, including jpeg, png, gif, bmp, tiff,
and others, whereas tesseract-ocr by default only supports tiff and bmp.
Additionally, if used as a script, Python-tesseract will print the recognized
text instead of writing it to a file. Support for confidence estimates and
bounding box data is planned for future releases.

.. _Google's Tesseract-OCR Engine: https://github.com/tesseract-ocr/tesseract

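To illustrate the wrapper idea described above, here is a minimal sketch (not the actual pytesseract implementation) of how a script might assemble a tesseract command line. The helper name ``build_tesseract_command`` and its parameters are hypothetical; the argument order (image, output base, optional ``-l`` language) follows the tesseract command-line usage.

```python
def build_tesseract_command(image_path, output_base, lang=None,
                            tesseract_cmd="tesseract"):
    """Return the argument list for one tesseract run (hypothetical helper)."""
    # tesseract expects: <command> <image> <output base> [-l <language>]
    command = [tesseract_cmd, image_path, output_base]
    if lang is not None:
        command += ["-l", lang]
    return command


if __name__ == "__main__":
    # Show the command that would be run for a French-language image.
    print(build_tesseract_command("page.bmp", "page_out", lang="fra"))
```

A wrapper could pass such a list to ``subprocess``, then read the text file that tesseract writes next to the output base and print its contents, which is the behavior the paragraph above describes for script use.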
USAGE
-----
::

    try:
        import Image
    except ImportError:
        from PIL import Image
    import pytesseract
    print(pytesseract.image_to_string(Image.open('test.png')))
    print(pytesseract.image_to_string(Image.open('test-european.jpg'), lang='fra'))

INSTALLATION
------------

Prerequisites:

- Python-tesseract requires Python 2.5+ or Python 3.x
- You will need the Python Imaging Library (PIL) (or the Pillow fork).
  Under Debian/Ubuntu, this is the package **python-imaging** or **python3-imaging**.
- Install `Google Tesseract OCR <https://github.com/tesseract-ocr/tesseract>`_
  (the linked page has additional information on how to install the engine on
  Linux, Mac OS X and Windows).
  You must be able to invoke the tesseract command as *tesseract*. If this
  isn't the case, for example because tesseract isn't in your PATH, you will
  have to change the "tesseract_cmd" variable at the top of *tesseract.py*.
  Under Debian/Ubuntu you can use the package **tesseract-ocr**.

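Before installing, it can help to confirm that the *tesseract* command is reachable. The following is a minimal sketch using only the standard library; the helper name ``find_tesseract`` is hypothetical, and ``shutil.which`` assumes Python 3.3 or newer.

```python
import shutil


def find_tesseract(command="tesseract"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(command)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        # In this case the "tesseract_cmd" variable has to point at the binary.
        print("tesseract not found on PATH; set tesseract_cmd to its full path")
    else:
        print("tesseract found at", path)
```

If the helper returns ``None``, the "tesseract_cmd" variable mentioned in the prerequisites needs to be set to the full path of the executable.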
Installing via pip:
See the `pytesseract package page <https://pypi.python.org/pypi/pytesseract>`_.
::

    $ (env)> pip install pytesseract

Installing from source:
::

    $> git clone git@github.com:madmaze/pytesseract.git
    $ (env)> python setup.py install

LICENSE
-------
Python-tesseract is released under the GPL v3.

CONTRIBUTORS
------------
- Originally written by `Samuel Hoffstaetter <https://github.com/hoffstaetter>`_
- `Juarez Bochi <https://github.com/jbochi>`_
- `Matthias Lee <https://github.com/madmaze>`_
- `Lars Kistner <https://github.com/Sr4l>`_
