DataXstream OMS+

Document Analysis Automation powered by SAP Data Intelligence

From Hours to Minutes 

Manual processing of business documents such as purchase orders and requests for quotes remains a critical element of many businesses in the wholesale distribution industry today. Several characteristics of this manual process make it an ideal candidate for advanced automation using machine learning: 

  • Documents are often lengthy, with hundreds of line items per file 
  • Documents vary in format, layout, and resolution 
  • Documents contain both typed and handwritten text 
  • Data in the documents ranges from structured to semi-structured to unstructured 

These characteristics make the manual processing of business documents a time-consuming task. A brief survey of customer pain points suggests that the time it takes to process one purchase order ranges anywhere from 30 minutes to 2 hours, depending on the size and complexity of the order. A CAPS Research report from 2014 shows that the cost associated with purchase order processing ranges from one hundred to over one thousand dollars per order, depending on the company and industry. The average processing cost comes in at around four hundred dollars per order. 

As part of the 2021 SAP Data Intelligence Content Sprint-at-Home initiative, DataXstream delivers business content that reduces processing time to a few minutes per purchase order, regardless of the size of the order. DataXstream’s document analysis solution for purchase orders can be integrated with a pipeline of intelligent search engines that match customer and line-item information from the document with customers and materials in SAP, delivering a seamless experience for purchase order creation. 

A Modular and Dynamic Solution 

A deployed instance of DataXstream’s document analysis API on SAP Data Intelligence can be consumed from a variety of applications including but not limited to SAP and DataXstream’s OMS+. It allows users to: 

  • Input purchase order documents in DOC, PDF, and image formats 
  • Get text elements for a document 
  • Get predictions for text elements 
  • Prepare for quote or order creation in SAP 

Furthermore, the solution can be adapted for use cases beyond purchase orders, such as requests for quotes, invoices, and receipts. This flexibility comes from the machine learning algorithms’ ability to generalize rules from different datasets. Text-specific attributes extracted using Natural Language Processing, together with spatial (layout) features of the document’s elements, serve as the foundation for training the predictive models. 
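As a rough illustration of how text attributes and layout features might be combined for one text element, here is a minimal Python sketch. The `TextElement` class, the `extract_features` function, and the specific features chosen are hypothetical and are not taken from DataXstream's solution; the bounding box is assumed to be expressed as fractions of the page size.

```python
import re
from dataclasses import dataclass


@dataclass
class TextElement:
    """Hypothetical OCR text element; box values are fractions of the page (0 to 1)."""
    text: str
    left: float
    top: float
    width: float
    height: float


def extract_features(element: TextElement) -> dict:
    """Combine simple text attributes with layout features for one element."""
    text = element.text.strip()
    n_chars = len(text)
    n_digits = sum(ch.isdigit() for ch in text)
    return {
        # Text-specific attributes (illustrative choices only)
        "n_tokens": len(text.split()),
        "digit_ratio": n_digits / n_chars if n_chars else 0.0,
        "has_currency": bool(re.search(r"[$€£]", text)),
        "is_upper": text.isupper(),
        # Layout attributes: where the element sits on the page and how big it is
        "center_x": element.left + element.width / 2,
        "center_y": element.top + element.height / 2,
        "area": element.width * element.height,
    }
```

A feature dictionary like this could be fed to any tabular classifier. Which features are actually informative depends on the documents used for training.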

Text elements are parsed from documents with optical character recognition (OCR) technology. Predictive models trained with machine learning algorithms help classify the relevance and business content of text elements containing customer and line-item information. The prediction results are displayed in a JavaScript-based user interface that allows users to review and modify them before creating an order. As the user adjusts the predictions, user activity is collected as feedback for improving future model accuracy. 
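The feedback step described above can be pictured as a comparison between the labels the model predicted and the labels the user left in place after review. The following Python sketch is only an illustration under that assumption; the `Correction` record, the `collect_feedback` function, and the label names are hypothetical and do not describe the actual implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Correction:
    """One user correction: which element changed, and from what label to what."""
    element_id: str
    predicted: str
    corrected: str


def collect_feedback(predicted: Dict[str, str], reviewed: Dict[str, str]) -> List[Correction]:
    """Return one Correction per element whose label the user changed.

    Elements missing from `reviewed` are treated as accepted without changes.
    """
    corrections = []
    for element_id, predicted_label in predicted.items():
        final_label = reviewed.get(element_id, predicted_label)
        if final_label != predicted_label:
            corrections.append(Correction(element_id, predicted_label, final_label))
    return corrections


# Usage: the user relabels element "e2" from "quantity" to "unit_price".
predicted = {"e1": "material", "e2": "quantity", "e3": "irrelevant"}
reviewed = {"e1": "material", "e2": "unit_price"}
feedback = collect_feedback(predicted, reviewed)
```

Corrections gathered this way could later be added to the training data as labeled examples for retraining.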

Benchmarks show that predictions are returned in seconds from a deployed instance on SAP Data Intelligence. The OCR engine, provided through the Amazon Textract API, typically takes one to five minutes to parse the text elements of a document, depending on document size. 
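A parsing time of minutes suggests a job that is started and then polled. The Python sketch below shows one way to do that with Amazon Textract's asynchronous text-detection operations, assuming the document is already in S3. The source does not state which Textract operations the solution uses, so this is an assumption; the function name `fetch_text_lines`, its parameters, and the returned dictionary layout are illustrative. The `client` argument is expected to behave like `boto3.client("textract")`.

```python
import time


def fetch_text_lines(client, bucket, key, poll_seconds=5.0, timeout_seconds=600.0):
    """Start an asynchronous Textract text-detection job and return its LINE blocks.

    `client` is assumed to expose the boto3 Textract methods
    start_document_text_detection and get_document_text_detection.
    """
    job = client.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    job_id = job["JobId"]

    # Poll until the job leaves the IN_PROGRESS state or the timeout expires.
    deadline = time.monotonic() + timeout_seconds
    while True:
        result = client.get_document_text_detection(JobId=job_id)
        status = result["JobStatus"]
        if status != "IN_PROGRESS":
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Textract job {job_id} did not finish in time")
        time.sleep(poll_seconds)

    if status != "SUCCEEDED":
        raise RuntimeError(f"Textract job {job_id} ended with status {status}")

    # Results may be paginated; follow NextToken until all blocks are read.
    blocks = list(result.get("Blocks", []))
    next_token = result.get("NextToken")
    while next_token:
        page = client.get_document_text_detection(JobId=job_id, NextToken=next_token)
        blocks.extend(page.get("Blocks", []))
        next_token = page.get("NextToken")

    # Keep only text lines, with their page-relative bounding boxes.
    lines = []
    for block in blocks:
        if block.get("BlockType") == "LINE":
            box = block["Geometry"]["BoundingBox"]
            lines.append({
                "text": block["Text"],
                "left": box["Left"],
                "top": box["Top"],
                "width": box["Width"],
                "height": box["Height"],
            })
    return lines
```

With real credentials, a call might look like `fetch_text_lines(boto3.client("textract"), "my-bucket", "orders/po.pdf")`, where the bucket and key are placeholders. Passing the client in as an argument keeps the function easy to exercise with a stub.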

Prerequisites and Information 

Prerequisites for using this business content are: 

  • Access to an AWS account with S3 and the Textract API 
  • Data collected and preprocessed from the user’s own purchase orders 
  • Models trained and exported before deployment 

Please contact the team at DataXstream to learn more about this solution. 
