iText - OCR Tool

iText is an OCR tool that recognizes text in any image.

You can use iText to extract text from PDFs, paper documents, book pages, and any other images.

1. Easily Select Images

iText supports several convenient ways to select an image.

1.1 Capture Screen

iText has a built-in screen capture tool. Press the shortcut ⇧⌘1 and select any area of the screen to extract the text in it.

Tip: The recognized text is copied to the system clipboard automatically, so you can paste it directly.

1.2 Drag the Image to the Menu Bar Icon

For example, when you see an image on Twitter and want to extract the text or numbers in it, just drag the image to iText’s menu bar icon.

1.3 Choose Image File

Of course, you can also choose an image file to recognize. However, dragging the file to the menu bar icon, as described above, is usually more convenient.

2. Accurately Recognize Text

Have you ever had this experience? You extract the text from a picture and find errors in the recognized text. As a result, correcting the errors by hand takes longer than typing the text yourself would have.

Clearly, recognition accuracy is very important, which is why I work hard on it.

2.1 Powered by Google

First, I ruled out offline recognition libraries, because an offline library is static and cannot improve itself over time. Then, among the many online OCR services, I compared the products of Microsoft, Google, and others.

Finally, I chose Google’s service because it is powerful and can recognize 50+ languages.

  • For ordinary natural language, such as a page of a book or a press release, the recognition result is remarkably accurate, sometimes reaching 100%.
  • For complex typesetting, especially with special characters (e.g., program source code), the recognition result is not as good. You may need to correct the result manually after recognition.
    • E.g., given just a vertical line, the machine cannot tell whether it is a lowercase l or an uppercase I (by the way, can you tell them apart?). To decide, the machine needs to understand the context, but for now it is too hard for a machine to understand non-natural language such as program source code.

You are welcome to try it and see how accurate the recognition results are.

2.2 Optimize the Recognition Results

OCR services can accurately recognize the text in an image, but they are not as good at higher-level structure, e.g., recognizing paragraphs.

So iText includes its own algorithm to optimize the result, e.g.:

  • Automatically identify paragraphs.
  • Remove extra spaces between English words and punctuation characters.
  • Capitalize the first letter for English.
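The three steps above could be sketched roughly as follows. This is only an illustration, not iText's actual algorithm, which is not published here. The function names are hypothetical, and the heuristics are assumptions: a blank line ends a paragraph, spaces before common punctuation are dropped, and the first letter of each sentence is capitalized.

```python
import re


def split_paragraphs(raw: str) -> list:
    """Group OCR lines into paragraphs (assumed rule: a blank line ends a paragraph)."""
    paragraphs, current = [], []
    for line in raw.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


def fix_spacing(text: str) -> str:
    """Collapse runs of spaces and remove spaces placed before punctuation."""
    text = re.sub(r"[ \t]+", " ", text)
    return re.sub(r" +([,.;:!?])", r"\1", text)


def capitalize_sentences(text: str) -> str:
    """Uppercase the first letter of the text and of each sentence after '.', '!' or '?'."""
    return re.sub(
        r"(^|[.!?] )([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )


def clean_ocr_text(raw: str) -> str:
    """Apply all three steps and join paragraphs with a blank line."""
    return "\n\n".join(
        capitalize_sentences(fix_spacing(p)) for p in split_paragraphs(raw)
    )
```

For instance, clean_ocr_text("hello   world ,\nthis is a test .") joins the two lines into one paragraph, tightens the spacing, and capitalizes the first letter. A real implementation would need more careful rules, e.g., for abbreviations such as "e.g." that contain periods.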

If you find that the optimization does not work well, you are welcome to send me the image. I will tune the algorithm for that kind of image. Thanks in advance.

2.3 Preview the Original Image for Proofing

Because current OCR technology cannot always recognize text with 100% accuracy, you need to check the original image when correcting the result. In iText, you can:

  • Drag the result window next to the image.
  • Show the image on the left side of the result window.

This makes it easy to correct the result.

Download

Mac App Store

You can recognize text from images 20 times per month for free, or subscribe to iText Pro for unlimited recognition.

If you find iText helpful, you are welcome to rate it on the Mac App Store and leave a short review.

If you have any problems using iText or any suggestions for improvement, please feel free to contact me.

I’m looking forward to hearing from you.