170 MarkView Solution Components
Recognition Server

Automated Document Recognition and Routing

170 MarkView® Recognition Server provides production-quality optical character recognition (OCR) and optical mark recognition (OMR) to reduce the cost of entering information into the ERP system.

Benefits of the 170 MarkView Recognition Server

  • Using OCR to automatically extract text from document images
  • Automating a step in the document preparation process by automatically extracting document indexing information
  • Reducing manual data entry efforts
  • Using OMR to automatically process standard forms
  • Automatically adjusting and enhancing image quality
  • Automatically processing any document image, regardless of its source (scanned, faxed, and so on)

Primary services provided by the 170 MarkView Recognition Server

OCR – Optical character recognition reads machine-printed text and intelligently extracts indexing field information.

Form Recognition – compares a document with a series of templates to identify a match. The comparison is further refined to match the individual pages of a multi-page document against multi-page templates.

OMR – Optical mark recognition reads checkboxes, ScanTron forms, and more.

Image Enhancement – enhances the image quality of a document to improve the accuracy of form matching, OCR, and OMR. Enhancement options include registration, deskew, auto-rotation, border removal, line removal, inverse text correction, deshading, despeckling, and character smoothing.

Recognition Services

The 170 MarkView Recognition Server can provide several types of recognition services when processing a given document. These include:

Full Text OCR – extracts all the text from the pages of a document. The resulting text can be automatically made available for a variety of purposes, including full text searches of documents.

Structured or Zoned OCR – OCR and OMR work together or separately on one or more zones on a page, rather than on the entire page. This is used to capture distinct pieces of data such as account or serial numbers. The results can then be used by the associated application. For example, a serial number could trigger a workflow that routes the document to a specific service team.
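
The serial-number example above can be sketched in a few lines of Python. This is only an illustration of the idea, not the MarkView API: the `Zone` class, the `ROUTES` table, the queue names, and the `route_document` function are all hypothetical, and the zone text is assumed to arrive as a plain dictionary of zone name to extracted string.

```python
import re
from dataclasses import dataclass


@dataclass
class Zone:
    """A named region of a page plus the mask its text must match (hypothetical)."""
    name: str
    pattern: str


# Hypothetical routing table: serial-number prefix -> workflow queue.
ROUTES = {"SN-A": "service-team-a", "SN-B": "service-team-b"}


def route_document(zone_results, zone, default="manual-review"):
    """Pick a workflow queue from the text extracted for one zone.

    zone_results maps zone names to the text extracted from them.
    Text that is missing or does not match the zone's mask goes to `default`.
    """
    text = zone_results.get(zone.name, "").strip()
    if not re.fullmatch(zone.pattern, text):
        return default
    for prefix, queue in ROUTES.items():
        if text.startswith(prefix):
            return queue
    return default


serial_zone = Zone(name="serial", pattern=r"SN-[AB]\d{4}")
queue = route_document({"serial": "SN-A1234"}, serial_zone)
```

In this sketch a value that fails the mask is never routed automatically, so a misread serial number ends up in front of a person instead of the wrong team.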

Form Recognition – This can be requested independently of OCR, or it can occur as part of an OCR request. The 170 MarkView Recognition Server can identify the type of document, and the associated application can use this information for automatic categorization, to trigger a workflow process, or both.

Features of 170 MarkView Recognition Server

OCR

  • Smart Zones allow you to define zones on a page and then extract relevant data from each zone based on filters and masks
  • Two built-in OCR algorithms provide automatic polling to improve recognition results
  • Alphabets for 16 languages are available to define the OCR character set
  • Add any Unicode character to the OCR character set
  • Spell checking can be used to improve and verify OCR results. Multiple language dictionaries are available.
  • Add custom user dictionaries, which are used by the spell checking subsystem. Custom dictionaries can contain words or UNIX-style regular expressions. Custom dictionaries can also be generated dynamically from SQL statements based on any table in the database. For example, a dictionary can be established to query account numbers.
  • Reliability is predicted based on the quality of the image and other embedded rules. When the predicted reliability of the extracted data falls below configurable thresholds, the imaged document is routed to a QA process in which the operator verifies and corrects the data instead of entering it from scratch.
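
As a rough illustration of how a SQL-generated dictionary, regular-expression entries, and a reliability threshold could work together, consider the Python sketch below. It is not taken from the product: the function names, the `accounts` table, the 0.90 threshold, and the "auto-index" and "qa-review" labels are assumptions made for the example, and an in-memory SQLite database stands in for the real database.

```python
import re
import sqlite3


def load_dictionary(conn, query):
    """Build a set of allowed values from the first column of a SQL query."""
    return {str(row[0]) for row in conn.execute(query)}


def is_known(value, words, patterns):
    """True if the value is a dictionary word or fully matches any regex entry."""
    return value in words or any(re.fullmatch(p, value) for p in patterns)


def next_step(value, confidence, words, patterns, threshold=0.90):
    """Accept a value automatically only if it is reliable and recognized.

    `confidence` is assumed to be a score between 0 and 1 reported by the
    OCR step; anything below `threshold` or absent from the dictionary is
    sent to an operator for verification.
    """
    if confidence >= threshold and is_known(value, words, patterns):
        return "auto-index"
    return "qa-review"


# Hypothetical account table used to generate the dictionary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_number TEXT)")
conn.executemany("INSERT INTO accounts VALUES (?)", [("100234",), ("100877",)])

account_words = load_dictionary(conn, "SELECT account_number FROM accounts")
invoice_patterns = [r"INV-\d{4}"]
```

With this setup, `next_step("100234", 0.97, account_words, invoice_patterns)` is accepted automatically, while a misread such as `"1OO234"` is sent to QA review even at high confidence because it appears in neither the dictionary nor the patterns.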

Form Recognition

  • Automatically identifies the type of form
  • Automatically detects and corrects documents that are incorrectly oriented
  • Automatically scales documents to match templates, regardless of paper size and image resolution
  • Allows a single input document to be split into multiple documents based on form recognition results
  • Provides a two-step form recognition process that allows the user to identify overall document type, as well as individual pages within a document
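
Splitting one input document by form recognition results can be pictured with the Python sketch below. It assumes, purely for illustration, that recognition yields one entry per page: a form type name when the page matches the first page of a template, or `None` for a continuation page. The `split_by_form` function and the form names are hypothetical.

```python
def split_by_form(page_forms):
    """Group page numbers into documents, starting a new one at each recognized form.

    page_forms holds one entry per page: a form type name for a page that
    starts a form, or None for a continuation page.
    """
    documents = []
    for page_number, form in enumerate(page_forms, start=1):
        if form is not None or not documents:
            # A recognized first page (or the very first page) opens a new document.
            documents.append({"form": form, "pages": [page_number]})
        else:
            # Continuation pages stay with the document opened most recently.
            documents[-1]["pages"].append(page_number)
    return documents


batch = ["invoice", None, "credit_memo", "invoice", None, None]
parts = split_by_form(batch)
```

Here a six-page scan becomes three documents: pages 1 to 2 (invoice), page 3 (credit memo), and pages 4 to 6 (invoice).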

OMR

  • Zones can be logically grouped to ensure the integrity of results. Only one zone in each logical group is evaluated as marked.
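
One way to picture the "only one zone per group" rule is the Python sketch below. The fill-ratio input, the threshold values, and the `evaluate_group` function are assumptions for illustration and do not describe how the product measures marks.

```python
def evaluate_group(fill_ratios, mark_threshold=0.35, min_margin=0.15):
    """Return the single marked zone in a logical group, or None.

    fill_ratios maps zone names to the fraction of dark pixels in each zone.
    The darkest zone wins only if it is dark enough to count as marked and
    clearly darker than the runner-up; otherwise the group is left undecided.
    """
    ranked = sorted(fill_ratios.items(), key=lambda item: item[1], reverse=True)
    if not ranked or ranked[0][1] < mark_threshold:
        return None  # nothing in the group is marked
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < min_margin:
        return None  # two zones look marked; treat as ambiguous
    return ranked[0][0]


answer = evaluate_group({"yes": 0.62, "no": 0.08})
```

Returning `None` for an ambiguous group, instead of guessing, keeps a doubtful checkbox from being recorded as a definite answer.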

Image Enhancement

  • Performs registration and deskew to ensure that similar documents will always be aligned in a standard way
  • Provides settings to improve OCR and OMR results: line removal, inverse text correction, deshading, despeckling, cropping, and character smoothing
  • Allows the enhanced image to be saved as a new document

Usability features

  • 170 MarkView Recognition Template Designer can be used to quickly and easily define new templates for zoned OCR and OMR requests
  • OCR and OMR zones are defined using Form Markups™ with the 170 MarkView Viewer
  • Results of OCR and OMR can be displayed on the processed 170 MarkView document as Markups