Host and run OCR as a service within your organisation or community.

The OCR service depends on the following:

  1. Java
  2. Maven
  3. Olena
  4. Tesseract
  5. Tessdata (for Indic scripts support)
  6. Varnam Project (libvarnam); see the Varnam project's install instructions
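
Before building, it can help to confirm that the command-line dependencies are installed. The sketch below is only an illustration: it assumes the executables are named `java`, `mvn` and `tesseract`, and the helper name `missing_tools` is made up for this example. Olena, Tessdata and libvarnam are not checked, because where they are installed varies from system to system.

```shell
#!/bin/sh
# Print each named command that is NOT found on PATH, one per line.
# (missing_tools is a hypothetical helper, not part of the project.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed executable names for the command-line dependencies.
missing=$(missing_tools java mvn tesseract)
if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing"
else
    echo "java, mvn and tesseract were all found on PATH"
fi
```

An empty result from `missing_tools` means every listed command was found.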

Check out the code:

git clone 

To compile and start the server, use the following command:

mvn package && java -jar target/IndicOCR-jar-with-dependencies.jar <path_to_olena>/scribo/src/content_in_doc

On my local system it looks like this

mvn package && java -jar target/IndicOCR-jar-with-dependencies.jar ~/ocr/olena/olena/scribo/src/content_in_doc
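
A wrong Olena path is an easy mistake to make, so one option is to verify the path before building. The following is a minimal sketch, assuming that `content_in_doc` is an executable file produced by the Olena build; the function names `is_executable_file` and `start_ocr_server` are hypothetical and not part of the project.

```shell
#!/bin/sh
# Return 0 if the given path is a regular file with the execute bit set.
is_executable_file() {
    [ -f "$1" ] && [ -x "$1" ]
}

# Build and start the server, but only if the content_in_doc path looks valid.
# $1 = full path to <path_to_olena>/scribo/src/content_in_doc
start_ocr_server() {
    content_in_doc=$1
    if ! is_executable_file "$content_in_doc"; then
        echo "content_in_doc not found or not executable: $content_in_doc" >&2
        return 1
    fi
    # Same command as above, with the path quoted.
    mvn package && java -jar target/IndicOCR-jar-with-dependencies.jar "$content_in_doc"
}
```

After sourcing the functions, calling `start_ocr_server ~/ocr/olena/olena/scribo/src/content_in_doc` from the project directory would run the same build-and-start command, failing early if the path is wrong.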

The server starts on port 8081 and exposes three web service APIs.

An experimental server is available. Images uploaded to it are not stored; all images are removed from the server at least once a day.

Usage Examples


curl -F "dpi=300" -F "lang=eng" -F "myfile=@<path_to_image_file>"


curl -F "tolang=eng" -F "sourcelang=pan" -F "myfile=@<path_to_binarized_image>"


curl -H "Content-Type: application/json" -X POST -d '{"filePath":"<http url or data url >", "sourcelang":"pan","tolang":"eng","operation":"invert","engine":"tesseract"}'
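
The JSON request accepts either an HTTP URL or a data URL in `filePath`. As a rough sketch of how a local image could be turned into a data URL and placed in the request body, the helpers below use only `base64`, `tr` and `printf`. The function names `to_data_url` and `build_request` are made up for this example, and the MIME type is supplied by the caller; which types the server accepts is not stated here.

```shell
#!/bin/sh
# Encode a file as a data URL (base64 without line breaks).
# $1 = file path, $2 = MIME type, for example image/png
to_data_url() {
    printf 'data:%s;base64,%s' "$2" "$(base64 < "$1" | tr -d '\n')"
}

# Build the JSON body with the five fields used in the curl example above.
# $1 = filePath, $2 = sourcelang, $3 = tolang, $4 = operation, $5 = engine
# Values are inserted verbatim, so they must not contain double quotes
# or backslashes (base64 output and plain URLs are safe).
build_request() {
    printf '{"filePath":"%s","sourcelang":"%s","tolang":"%s","operation":"%s","engine":"%s"}' \
        "$1" "$2" "$3" "$4" "$5"
}
```

The output of `build_request "$(to_data_url page.png image/png)" pan eng invert tesseract` could then be passed to the `-d` option of the curl command. For large images, piping the body to curl with `-d @-` avoids command-line length limits.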

Please join the project and help by contributing code or reporting bugs.