11. OCR Menu

11.1 OCR Attributes

The OCR attributes determine the settings with which the OCR engine processes an image. The available options depend on the OCR engine used (ABBYY or Tesseract).

The most important settings are explained in the following section.

11.1.1 Selecting and Setting Attributes

Set global OCR attributes: These settings are used by the OCR engine whenever no job-specific or page-specific settings have been made, or when the OCR settings are controlled by the Job Index.

Set OCR attributes for current job: The values set here are used for the OCR processing of the current job and are valid for this job only.

Set OCR attributes for current page: The values set here are used for the OCR processing of the current page and are valid for this page only.

 

11.1.2 Resetting Attributes

Reset global OCR attributes…: The global OCR attributes are reset to the default values set at the time of installation.

Reset OCR attributes of the job…: The OCR attributes of the job are reset to the global settings.

Reset OCR attributes of the image…: The OCR attributes of the image are reset to the job-level or global OCR attribute settings.
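The relationship between global, job and page attributes described above can be pictured as a layered lookup. The following Python sketch is an illustration only and not the application's actual implementation: the class `OcrAttributes`, its methods, and the attribute names and values (`language`, `font`, `"eng"`, `"gothic"`) are hypothetical, and control by the Job Index is not modelled.

```python
from collections import ChainMap

# Hypothetical installation defaults; the real attribute names differ per engine.
INSTALL_DEFAULTS = {"language": "eng", "font": "normal"}


class OcrAttributes:
    """Layered OCR attributes: page overrides job, job overrides global."""

    def __init__(self):
        self.global_attrs = dict(INSTALL_DEFAULTS)
        self.job_attrs = {}   # job id -> overrides for that job
        self.page_attrs = {}  # (job id, page number) -> overrides for that page

    def set_global(self, **attrs):
        self.global_attrs.update(attrs)

    def set_job(self, job, **attrs):
        self.job_attrs.setdefault(job, {}).update(attrs)

    def set_page(self, job, page, **attrs):
        self.page_attrs.setdefault((job, page), {}).update(attrs)

    def effective(self, job, page):
        # ChainMap searches its mappings from left to right,
        # so the most specific level that defines a value wins.
        return dict(ChainMap(
            self.page_attrs.get((job, page), {}),
            self.job_attrs.get(job, {}),
            self.global_attrs,
        ))

    def reset_global(self):
        # Back to the values set at installation time.
        self.global_attrs = dict(INSTALL_DEFAULTS)

    def reset_job(self, job):
        # The job falls back to the global settings.
        self.job_attrs.pop(job, None)

    def reset_page(self, job, page):
        # The page falls back to the job or global settings.
        self.page_attrs.pop((job, page), None)


attrs = OcrAttributes()
attrs.set_job("job-1", language="deu")
attrs.set_page("job-1", 3, font="gothic")
print(attrs.effective("job-1", 3))  # page font, job language
attrs.reset_page("job-1", 3)
print(attrs.effective("job-1", 3))  # job language, global font
```

In this sketch a reset simply removes the overrides of one level, so the next, more general level takes effect again.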

 

11.2 Set OCR Attributes

Both OCR engines offer a large number of settings, not all of which are documented or self-explanatory. A drop-down list below the OCR Attributes dialog therefore lets you limit the settings shown according to an experience level (Beginner, Advanced, Expert).

The attributes for the OCR engines are preconfigured so that the user usually only has to select the language and, in the case of ABBYY, the font.

The most important settings for both OCR engines are language and font. If these are not selected correctly, the recognition result will be correspondingly poor.
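The filtering by experience level can be sketched as follows. This is a hedged illustration: the setting names and their assigned levels are invented for the example, and the sketch assumes that each level also shows the settings of the levels below it, which the text above does not state explicitly.

```python
from enum import IntEnum


class Level(IntEnum):
    BEGINNER = 1
    ADVANCED = 2
    EXPERT = 3


# Hypothetical settings with the minimum level at which each one is shown.
SETTINGS = [
    ("language", Level.BEGINNER),
    ("font", Level.BEGINNER),
    ("page_segmentation", Level.ADVANCED),
    ("engine_variables", Level.EXPERT),
]


def visible_settings(level):
    """Return the names of all settings shown at the given experience level."""
    return [name for name, minimum in SETTINGS if minimum <= level]


print(visible_settings(Level.BEGINNER))
print(visible_settings(Level.EXPERT))
```

With this assignment, a beginner sees only language and font, which matches the remark that these two are usually the only settings that need to be chosen.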

11.2.1 Tesseract

11.2.2 ABBYY

11.3 Run OCR

Perform OCR on the selected area: Text recognition is performed only for the selected area of the image.

Perform OCR on the current page: Text recognition is performed for the entire page.

Perform OCR for current page and display segments…: Text recognition is performed for the entire page, after which the segments (areas) recognized by the OCR engine are displayed.

Edit OCR text of the current page (CTRL+O): The text editor opens, in which the OCR full text can be corrected or copied.

Only the full text is corrected; the errors remain in the other output formats (e.g. PDF, ALTO, IWCOCR-eL).

11.4 OCR Special Features

The additional OCR functions are only available after a successful OCR run.

Highlight OCR blocks: Blocks recognized by OCR are displayed on the image. As soon as the mouse pointer is moved over a marked area, the OCR information (recognized text and font type) appears in a tooltip.

Highlight OCR text lines: Text lines recognized by OCR are displayed. As soon as the mouse pointer is moved over a marked area, the OCR information (recognized text and font) appears in a tooltip.

Highlight OCR words: Words recognized by OCR are displayed. Moving the mouse pointer over a word displays all the features of that word recognized by the OCR engine.

Highlight OCR symbols: All symbols and letters recognized by OCR are highlighted.
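The four highlight functions correspond to four levels of detail in the OCR result. The following Python sketch shows one way such a result could be represented and how the element under the mouse pointer could be found. It is an assumption-laden illustration: the classes `Box` and `Element`, the nesting of symbols in words, words in lines and lines in blocks, and the pixel coordinates are all hypothetical and not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass
class Element:
    kind: str  # "block", "line", "word" or "symbol"
    box: Box
    text: str
    children: list = field(default_factory=list)


def elements_of_kind(root, kind):
    """Yield all elements of one kind below (and including) root, depth first."""
    if root.kind == kind:
        yield root
    for child in root.children:
        yield from elements_of_kind(child, kind)


def element_at(root, kind, x, y):
    """Return the first element of the given kind whose box contains (x, y)."""
    for element in elements_of_kind(root, kind):
        if element.box.contains(x, y):
            return element
    return None


# A block with one text line made up of two words (made-up coordinates).
hello = Element("word", Box(10, 10, 60, 30), "Hello")
world = Element("word", Box(70, 10, 130, 30), "world")
line = Element("line", Box(10, 10, 130, 30), "Hello world", [hello, world])
block = Element("block", Box(5, 5, 135, 35), "Hello world", [line])

hit = element_at(block, "word", 80, 20)
print(hit.text if hit else "nothing under the pointer")
```

Choosing the `kind` argument corresponds to choosing one of the highlight functions: the same pointer position yields a block, a line, a word or a symbol depending on the level requested.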