3. Modules
3.1 OCR
Integration of Tesseract 4.1 into the OCR Suite, with output of searchable PDFs and of the METS or ALTO formats.
Integration of ABBYY 12 (32-bit and 64-bit) into the OCR Suite.
3.2 Zoned OCR
The following mouse mode can now be selected in the context menu of the viewer: “Zoned OCR and store the OCR text in the ClipBoard”. This makes it easy to create OCR data and assign it to the corresponding objects:
- In the structure tree: Right-click the respective node and select ‘Insert text from ClipBoard’.
- Anywhere, including other applications: Select the element and paste the text with Ctrl+V.
3.3 Text Search in Jobs
If OCR has been performed for all pages in the job and the workflow is configured so that the internal IwOcrDoc objects are created and made available, the job can now be searched for text. Matching text passages are highlighted in yellow.
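The idea of searching the OCR text of every page in a job can be illustrated with a short Python sketch. It assumes, purely for illustration, that the OCR text is available as one plain string per page; the function name find_in_job and this data layout are hypothetical and do not reflect the internal IwOcrDoc objects.

```python
from typing import List, Tuple


def find_in_job(pages: List[str], query: str) -> List[Tuple[int, int]]:
    """Return (page_number, offset) for every case-insensitive hit.

    `pages` is a hypothetical stand-in for the per-page OCR text of a job.
    Page numbers are 1-based; offsets refer to the lower-cased page text.
    """
    hits: List[Tuple[int, int]] = []
    needle = query.lower()
    if not needle:
        return hits  # an empty query matches nothing
    for page_number, text in enumerate(pages, start=1):
        haystack = text.lower()
        start = haystack.find(needle)
        while start != -1:
            hits.append((page_number, start))
            # Continue one character further so overlapping hits are found too.
            start = haystack.find(needle, start + 1)
    return hits
```

A viewer could use the returned page numbers and offsets to decide which passages to highlight.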
3.4 Indexing
Capture of chapters and structure nodes and moving of pages, especially for METS applications such as DFG-Viewer, Goobi, Kitodo or MyBib eL.
3.5 Scripting Extension
The scripting functionality has been extended by the following elements:
- ut.inOpenJob(); Returns true if a job is currently open.
- ut.leaveJob(); Exits the open job and displays the job list.
- ut.leaveJobAndCreateNewJob(); Exits the open job and creates a new job.
- ut.deleteImage(page_number); Removes the page identified by page_number from the job.
- ut.deleteCurrentImage(); Removes the current page from the job.
- ut.detectBarcodeCurrentImage(); Attempts to recognize a barcode on the current page. If successful, the barcode is returned.
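How these calls might be combined can be sketched as follows. The sketch is written in Python against a hypothetical stand-in object (FakeUt) and is not the BCS-2 scripting API itself. Only the method names are taken from the list above. Everything else is an assumption made for illustration, including the idea that a failed barcode detection yields None.

```python
from typing import List, Optional


class FakeUt:
    """Hypothetical stand-in for the `ut` scripting object (not the real API)."""

    def __init__(self, barcodes: List[Optional[str]]):
        # One entry per page: the barcode on that page, or None if there is none.
        self.pages = list(barcodes)
        self.current = 0
        self.job_open = True

    def inOpenJob(self) -> bool:
        return self.job_open

    def detectBarcodeCurrentImage(self) -> Optional[str]:
        # Assumption: an unsuccessful detection yields None.
        if not self.pages:
            return None
        return self.pages[self.current]

    def deleteCurrentImage(self) -> None:
        if self.pages:
            del self.pages[self.current]
            # Keep the current index inside the remaining page range.
            self.current = min(self.current, max(len(self.pages) - 1, 0))

    def leaveJob(self) -> None:
        self.job_open = False


def strip_barcode_sheet(ut) -> Optional[str]:
    """Read a barcode from the current page; remove that page if one is found."""
    if not ut.inOpenJob():
        return None
    barcode = ut.detectBarcodeCurrentImage()
    if barcode:
        ut.deleteCurrentImage()
    return barcode
```

In this sketch a barcode page is treated as a separator sheet that should not remain in the job; whether that suits a given workflow depends on the scan setup.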
3.6 Connection to MyBib eDoc Mode: Order list
If the mode “Workflow uses requested order list from the assigned MyBib eDoc system” is selected in a workflow, the maximum number of suitable orders offered to the user can now be set in the workflow. This prevents too many jobs from being displayed.
3.7 C3xmlconv
RM’2057 c3xmlconv is a program for generating various target formats from the input format C-3 Plus XML.
The results of a C-3 Plus operation are typically converted into the C-3 Plus XML format by means of a script. This is intended as an exchange format for article data.
However, the catalogs usually expect a different target format. These target formats can now be generated by the c3xmlconv program. A Lua script is required for each target format. This approach allows new formats to be supported without changing the c3xmlconv program itself.
When BCS-2 is installed, the program and the current Lua scripts are installed automatically. The scripts are located in the scripts subdirectory of the application directory.
Furthermore, the program can also be used implicitly via scripting in BCS-2.
3.8 C-3 Plus Duplicate List
As part of the order entry in MyBib eDoc, existing article data for an anthology may have been found by a catalog query. MyBib eDoc passes these potential duplicates as a C3XML string to BCS-2 during
BCS-2 accepts this string, extracts the TOC items as a list, and stores them with the job. These duplicates can then be displayed via the menu and within the C3+ result dialog.
3.9 C-3 Plus Special Character Dialog
The special character dialog has been extended. The extension was prompted by the content indexing of ancient Greek and Hebrew texts with C-3 Plus.
3.10 C-3 Plus New Authoring Functionality
For author names that are listed in the table of contents with first names, a parameter can now be set so that the comma between surname and first name is inserted automatically.
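The effect of such a parameter can be sketched in Python. The sketch assumes that the name appears as a single-word surname followed by the first name(s). The function name insert_name_comma and the enabled flag are hypothetical and are not taken from C-3 Plus.

```python
def insert_name_comma(author: str, enabled: bool = True) -> str:
    """Insert a comma after the surname, e.g. 'Goethe Johann' -> 'Goethe, Johann'.

    Assumption: the first word is the surname and all following words are
    first names. Names that already contain a comma, or that consist of a
    single word, are returned unchanged.
    """
    if not enabled or "," in author:
        return author
    parts = author.split()
    if len(parts) < 2:
        return author
    surname, first_names = parts[0], " ".join(parts[1:])
    return f"{surname}, {first_names}"
```

Multi-word surnames (for example with a nobility particle) would need additional rules and are deliberately not handled here.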
3.11 C-3 Plus Correction of Language Codes
The article and review language codes for Romanian, Serbian and Albanian have been corrected so that they are now output as ISO 639-2 codes.
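For reference, the ISO 639-2 codes of the three languages can be written as a small Python lookup table. ISO 639-2 defines both a bibliographic and a terminologic code for Romanian and Albanian; which variant the software outputs is not stated here, so the default chosen in this sketch is an assumption.

```python
# ISO 639-2 codes: bibliographic (B) and terminologic (T) variants.
ISO_639_2 = {
    "Romanian": {"bibliographic": "rum", "terminologic": "ron"},
    "Serbian": {"bibliographic": "srp", "terminologic": "srp"},
    "Albanian": {"bibliographic": "alb", "terminologic": "sqi"},
}


def language_code(language: str, variant: str = "bibliographic") -> str:
    """Look up the ISO 639-2 code for one of the three languages.

    The default variant is an assumption made for this sketch.
    Raises KeyError for an unknown language or variant.
    """
    return ISO_639_2[language][variant]
```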