When working in the capture space, one use case comes up often: capture all instances of [insert the data element name here] from the document. For example, I want all phone numbers, regardless of their location in the document and regardless of the document's format. ABBYY FlexiCapture is very good at creating solutions for well-formatted (that is, predictable) documents and capturing data from them. But sometimes the format of a document is not known before it enters the system. In that case, we are still interested in capturing the data element (or elements) from the OCRed text and assigning them as properties. The solution can then use the captured data as searchable metadata in a content management system, and as a lookup parameter in internal systems to retrieve additional data.
ABBYY FlexiLayout is used to create a document definition that supports semi-structured and unstructured documents. Capturing all the data elements that occur in a document can be done in a few relatively easy steps.
The first step is to create a FlexiLayout and provide a sample document to match against. For this exercise, we’ll be using a simple email that contains a phone number in the text as well as multiple phone numbers in the footer.
Now let's look at the following elements in the FlexiLayout.
Overall, the FlexiLayout structure should look as follows.
Export the FlexiLayout so we can create a document definition in FlexiCapture based on this layout.
Create a new document definition based on the FlexiLayout exported in the previous step.
Once the document definition is created, you can see that the data element (PhoneNumber in our case) has been set up as a multi-entry field.
The final step is to test the document definition. This can be done directly in the document definition editor by selecting Testing → Run Test. In the test, FlexiCapture found three instances of a phone number in the test image. Note that one phone number was not captured because it contains letters (TEAM). This could be remedied by altering the regular expression in the FlexiLayout.
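To illustrate why a number containing letters is missed and how a relaxed expression picks it up, here is a minimal sketch in Python. It is not FlexiLayout syntax: FlexiLayout has its own regular expression notation, and both the patterns and the sample text below are hypothetical stand-ins for the ones used in this exercise.

```python
import re

# Hypothetical OCRed email text (not the original test image).
SAMPLE = """Hi Sam, please call me on 555-201-3344 before Friday.

Regards,
Alex
Office: (555) 201-9000 | Fax: 555.201.9001
Toll free: 1-800-555-TEAM
"""

# Assumed digits-only pattern: optional leading "1", a 3-digit area code
# (optionally in parentheses), then 3 digits and 4 digits.
DIGITS_ONLY = re.compile(
    r"(?<!\w)(?:1[-. ])?\(?\d{3}\)?[-. ]?\d{3}[-. ]\d{4}(?!\w)"
)

# Relaxed variant: the final block may contain digits or capital letters,
# so vanity numbers are captured as well.
VANITY = re.compile(
    r"(?<!\w)(?:1[-. ])?\(?\d{3}\)?[-. ]?\d{3}[-. ][0-9A-Z]{4}(?!\w)"
)

digits_only_hits = DIGITS_ONLY.findall(SAMPLE)
vanity_hits = VANITY.findall(SAMPLE)

print(digits_only_hits)  # three numbers; the vanity number is missed
print(vanity_hits)       # the same three plus the vanity number
```

Allowing letters only in the final block keeps the pattern strict enough to avoid matching ordinary words, at the cost of missing vanity numbers that place letters elsewhere.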
FlexiLayout can be used to extract any number of data elements from a document, namely wherever a field can be represented by a regular expression. It can capture all matching values from the text and assign them to a repeating group block. Finally, once a document definition in FlexiCapture incorporates the FlexiLayout, it can be used to produce not only an OCRed version of the document but also all captured instances of the data element.
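The same idea extends to several data elements at once. The sketch below, again in plain Python rather than FlexiLayout, maps each field name to a regular expression and collects every match into a list, which plays roughly the role of a repeating group. The field names, the simplified patterns, and the `extract_fields` helper are assumptions made for illustration.

```python
import re
from typing import Dict, List

# Hypothetical field names with simplified, illustrative patterns.
FIELD_PATTERNS = {
    "PhoneNumber": re.compile(r"(?<!\w)\(?\d{3}\)?[-. ]?\d{3}[-. ]\d{4}(?!\w)"),
    "EmailAddress": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def extract_fields(text: str) -> Dict[str, List[str]]:
    """Return every match of every pattern, keyed by field name."""
    return {name: pattern.findall(text) for name, pattern in FIELD_PATTERNS.items()}


text = "Contact alex@example.com or 555-201-3344; fax 555.201.9001."
fields = extract_fields(text)
print(fields)
```

Each list can hold zero, one, or many values, so a downstream system can treat every field as multi-entry without knowing the document's layout in advance.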