One of the commonly desired features for a document management system is OCR or intelligent data extraction from documents. These features enable us to speed up the data creation and routing processes within our organization. This type of intelligence can also affect the security and permissions of our data, by catching key pieces of information that need to be appropriately managed - pieces of data that could be missed due to human error.
Of the various intelligence services that M-Files provides, a popular choice for intelligent data extraction is the M-Files Smart Extractor. It consists of two services: M-Files Matcher and M-Files Text Analytics. Combined, these services scan the contents of your documents and provide metadata suggestions. The suggestions include properties that link to other objects in the vault, as well as values found within a document that match regular expressions (for example, dates or codes).
M-Files Matcher scans each newly added document for text or keywords that match the names of existing objects within the vault. For example, if I have a project object in my vault and I add a meeting minutes document that mentions the project name in its contents, M-Files will suggest adding the project to the document on the metadata card, creating intelligent relationships within the system.
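To make the idea concrete, here is a minimal Python sketch of name-based matching. It is not the M-Files implementation or API; the function `suggest_related_objects`, the object names, and the sample text are all hypothetical, and the matching rule (case-insensitive, whole-phrase) is an assumption chosen for illustration.

```python
import re

def suggest_related_objects(document_text, vault_objects):
    """Return the (object_type, name) pairs whose name appears in the text.

    Hypothetical helper for illustration only; not part of M-Files.
    """
    suggestions = []
    for object_type, name in vault_objects:
        # Case-insensitive, whole-phrase match, so "Apollo" does not
        # match inside a longer word such as "Apollonian".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, document_text, flags=re.IGNORECASE):
            suggestions.append((object_type, name))
    return suggestions

# Hypothetical vault contents and document text.
vault_objects = [
    ("Project", "Harbor Bridge Retrofit"),
    ("Customer", "Acme Corp"),
    ("Project", "Apollo"),
]
minutes = (
    "Meeting minutes: the Harbor Bridge Retrofit team "
    "reviewed the Apollonian schedule."
)

print(suggest_related_objects(minutes, vault_objects))
```

In this sketch only the project whose full name appears in the minutes is suggested, which mirrors the kind of suggestion that would show up on the metadata card.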
M-Files Text Analytics scans each newly added document for text or keywords that match regular expressions set in its configuration. Out of the box, Text Analytics can extract information such as dates, email addresses, and phone numbers; however, you can build your own regular expressions to capture project IDs, job site codes, and similar values.
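As a rough illustration of regular-expression extraction, the Python sketch below pulls a date, an email address, and a custom project ID out of a piece of text. The patterns, the `PRJ-0000` ID format, and the function `extract_values` are assumptions made up for this example; they are not the expressions that ship with M-Files, and the M-Files configuration may use different regular expression syntax.

```python
import re

# Hypothetical patterns for illustration; real configurations will differ.
PATTERNS = {
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),           # e.g. 2024-03-15
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "project_id": re.compile(r"\bPRJ-\d{4}\b"),              # assumed ID format
}

def extract_values(text):
    """Return every match for each labeled pattern found in the text."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}

sample = (
    "Kickoff for PRJ-0421 is on 2024-03-15. "
    "Contact jane.doe@example.com with questions."
)

for label, values in extract_values(sample).items():
    print(label, values)
```

The custom `project_id` pattern only works because every ID in this made-up scheme follows the same shape. A document that wrote the same project as "Project 421" would not be captured by it.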
To judge whether M-Files Smart Extractor is viable for you, consider its two parts separately, because M-Files Matcher and M-Files Text Analytics offer different functionality within the vault.
Matcher is not viable if your vault contains only documents. Matcher links different object types to each other (for example, a document object to a project object), so a vault with no other object types gives it nothing to link to.
The viability of Text Analytics depends on the client and how far they want to push customized regular expressions. However, if the data in your documents does not follow consistent patterns, no regular expression will be able to capture it reliably.
It's important to note that you don't have to use both features of the M-Files Smart Extractor for it to be a worthwhile investment. If you know that Matcher will be a great fit for your document management system but Text Analytics would be less useful, you can choose to enable only Matcher in your vault to create those intelligent data relationships. Or perhaps the reverse is true: Text Analytics is the better fit, and Matcher is less so. Either way, both features of the M-Files Smart Extractor are powerful tools for data extraction.
Interested in learning more? You can read more about data intelligence tools in our other Insights, or you can reach out to TEAM IM at www.teamim.com.