7. OCR Texts

Special feature: You specify the file name and target folder in the script, not in the “Basic” tab of the workflow.

7.1 OCR Texts per Page

Output one TXT file per page.

// -----------------------------------------------------------------
// Date: 01-09-2018
// Purpose: Generate a text file with the OCR text per page
// -----------------------------------------------------------------


var filename;
var folder = "C:/exports/ocr/";
var page;
var i;

for (i = 0; i < job.numPages; i++) {
  page = job.pages[i];

  // Debug output, disabled:
  // ut.notifyUser('Page ' + (i + 1), '"' + page.ocrText + '"');
  filename = folder + page.pageNumber8 + '.txt';
  // ut.notifyUser('filename', filename);

  // Write the OCR text of this page to its own file
  ut.writeStringToFile(filename, page.ocrText);
} // for i
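
To check the generated file names without running a workflow, the loop above can be tried in any plain JavaScript engine with stand-ins for job and ut. The following is only a sketch under assumptions: makeMockJob, makeMockUt, and the sample texts are hypothetical, and the idea that page.pageNumber8 is the page number padded to eight digits is a guess based on the property name, not something stated in this chapter.

```javascript
// Hypothetical stand-in for the BCS-2 job object, for a dry run only.
function makeMockJob(texts) {
  var pages = [];
  for (var i = 0; i < texts.length; i++) {
    pages.push({
      // Assumption: pageNumber8 is the 1-based page number, zero-padded to 8 digits.
      pageNumber8: ('00000000' + (i + 1)).slice(-8),
      ocrText: texts[i]
    });
  }
  return { numPages: pages.length, pages: pages };
}

// Hypothetical stand-in for ut: records writes instead of touching the disk.
function makeMockUt() {
  var written = [];
  return {
    written: written,
    writeStringToFile: function (name, text) {
      written.push({ name: name, text: text });
    }
  };
}

var job = makeMockJob(['first page', 'second page']);
var ut = makeMockUt();
var folder = 'C:/exports/ocr/';

// Same loop as in the export script, running against the mocks.
for (var i = 0; i < job.numPages; i++) {
  var page = job.pages[i];
  ut.writeStringToFile(folder + page.pageNumber8 + '.txt', page.ocrText);
}
```

After the run, ut.written lists every file the script would have created, together with its content.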

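BCS-2 creates the storage path automatically if the KeyMaps are filled. In this variant the folder name corresponds to the index value, and each file is named after its page number.

  • Name: FOLDER_FOLDER, Value: job.index1
  • Name: LOCAL_MASTER_FOLDER, Value: C:/

The script reads the keys MASTER_FOLDER and FOLDER_NAME via job.getWfKeyValue, so check that the names entered in the KeyMap match the names the script asks for.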

// -----------------------------------------------------------------
// Date: 01-22-2021
// Purpose: Generate a text file with the OCR text per page
// -----------------------------------------------------------------

// FOLDER_NAME holds an expression such as job.index1; eval resolves it to its value
var folder    = job.getWfKeyValue('MASTER_FOLDER') + '/' + eval(job.getWfKeyValue('FOLDER_NAME'));
var file_name;
var page;
var i;

// Create the complete target path before writing
ut.createCompletePath(folder);

for (i = 0; i < job.numPages; i++) {
  page = job.pages[i];

  // Debug output; comment out for production runs
  ut.notifyUser('Page ' + (i + 1), '"' + page.ocrText + '"');
  file_name = folder + '/' + page.pageNumber8 + '.txt';
  ut.notifyUser('file_name', file_name);

  ut.writeStringToFile(file_name, page.ocrText);
} // for i
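
How the two key values turn into a folder path can be tried outside BCS-2 with a small mock. This is a sketch under assumptions: the mock job object, the sample index value 'INV-0001', and the helper joinPath are hypothetical. It uses the key names that the script above reads. With a master folder of C:/ the plain concatenation yields a doubled slash, which Windows usually accepts; joinPath is one possible way to avoid it.

```javascript
// Hypothetical mock: only the members the path logic needs.
var job = {
  index1: 'INV-0001',
  wfKeys: { MASTER_FOLDER: 'C:/', FOLDER_NAME: 'job.index1' },
  getWfKeyValue: function (name) { return this.wfKeys[name]; }
};

// Hypothetical helper: join two path parts with exactly one '/'.
function joinPath(a, b) {
  return String(a).replace(/\/+$/, '') + '/' + String(b).replace(/^\/+/, '');
}

// The key value is the text 'job.index1'; eval turns it into the index value.
var folderName = eval(job.getWfKeyValue('FOLDER_NAME'));

// Concatenation as in the script above: 'C://INV-0001'
var plain = job.getWfKeyValue('MASTER_FOLDER') + '/' + folderName;

// Normalized variant: 'C:/INV-0001'
var joined = joinPath(job.getWfKeyValue('MASTER_FOLDER'), folderName);
```

Because eval executes whatever text the key contains, the FOLDER_NAME value should only ever hold a simple expression such as job.index1.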

7.2 OCR Texts for the Entire Job

Alternative 1: You define the file name and path as usual under “Basic”.

// -----------------------------------------------------------------
// Date: 08-13-2019
// Purpose: Generate a text file with the OCR texts of all pages
// -----------------------------------------------------------------

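var ocr_texts = "";
var i;

// Append the OCR text of every page, one line break after each page
for (i = 0; i < job.numPages; i++) {
  ocr_texts += job.pages[i].ocrText + '\n';
} // for i

// The final expression supplies the text for the file configured under "Basic"
ocr_texts;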

Alternative 2:

// -----------------------------------------------------------------
// Date: 01-09-2018
// Purpose: Generate a text file with the OCR texts of all pages
// -----------------------------------------------------------------

var filename = "C:/exports/ocr/ocr_job.txt";
var page;
var ocr_texts = "";
var i;

for (i = 0; i < job.numPages; i++) {
  page = job.pages[i];
  ocr_texts += page.ocrText + '\n';
} // for i

// Write the collected text of all pages to a single file
ut.writeStringToFile(filename, ocr_texts);
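
The scripts in this section join the page texts with a plain line break, so the page boundaries are no longer visible in the resulting file. If they matter, a marker line can be written before each page. The following sketch shows one way to do this; the mock job object, the sample texts, and the marker format '--- Page n ---' are hypothetical and only serve to make the example runnable outside BCS-2.

```javascript
// Hypothetical mock job with three pages; the second has no recognized text.
var job = {
  numPages: 3,
  pages: [
    { ocrText: 'Invoice 4711' },
    { ocrText: '' },
    { ocrText: 'Total: 12.50' }
  ]
};

var parts = [];
for (var i = 0; i < job.numPages; i++) {
  // Hypothetical marker line so each page can be found again in the file.
  parts.push('--- Page ' + (i + 1) + ' ---');
  parts.push(job.pages[i].ocrText);
}

// One line break between all parts, plus one at the end of the file.
var ocr_texts = parts.join('\n') + '\n';
```

In a workflow script, the resulting ocr_texts would be passed to ut.writeStringToFile exactly as in Alternative 2.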

Alternative 3: BCS-2 creates the storage path automatically if the KeyMaps are filled. The file name corresponds to the index value.

  • Name: FOLDER_FOLDER, Value: job.index1
  • Name: LOCAL_MASTER_FOLDER, Value: C:/

// -----------------------------------------------------------------
// Date: 08-13-2019
// Purpose: Generate a text file with the OCR texts of all pages
// -----------------------------------------------------------------

var ocr_texts = "";
// FOLDER_NAME holds an expression such as job.index1; eval resolves it to its value
var folder    = job.getWfKeyValue('MASTER_FOLDER') + '/' + eval(job.getWfKeyValue('FOLDER_NAME'));
var file_name = folder + '/' + job.index1 + '.txt';
var i;

for (i = 0; i < job.numPages; i++) {
  ocr_texts += job.pages[i].ocrText + '\n';
} // for i

// Create the complete target path, then write all page texts to one file
ut.createCompletePath(folder);
ut.writeStringToFile(file_name, ocr_texts);