From Hours to Minutes
Manual processing of business documents such as purchase orders and requests for quotes remains a critical element of many businesses in the wholesale distribution industry today. Several characteristics of this manual process make it an ideal candidate for advanced automation using machine learning:
- Documents are often lengthy, with up to hundreds of line items per file
- Documents vary in format, layout, and resolution
- Documents contain typed and handwritten text
- Data in documents ranges from structured through semi-structured to unstructured
These characteristics make the manual processing of business documents a time-consuming task. A brief survey of customer pain points suggests that the time it takes to process one purchase order ranges anywhere from 30 minutes to 2 hours, depending on the size and complexity of the order. A CAPS Research report from 2014 shows that the cost associated with purchase order processing ranges from one hundred to over one thousand dollars per order, depending on the company and industry. The average processing cost comes in at around four hundred dollars per order.
As part of the 2021 SAP Data Intelligence Content Sprint-at-Home initiative, DataXstream delivers business content that reduces processing time to a few minutes per purchase order, no matter the size of the order. DataXstream’s document analysis solution for purchase orders can be integrated with a pipeline of intelligent search engines that match customer and line-item information from the document with customers and materials in SAP, delivering a seamless experience for purchase order creation. A demo video of the solution is available.
A Modular and Dynamic Solution
A deployed instance of DataXstream’s document analysis API on SAP Data Intelligence can be consumed from a variety of applications, including but not limited to SAP and DataXstream’s OMS+. It allows users to:
- Input purchase order documents in DOC, PDF, and image formats
- Get the text elements of a document
- Get predictions for those text elements
- Prepare for quote or order creation in SAP
Furthermore, the solution can be adapted for use cases outside of purchase orders, such as requests for quotes, invoices, and receipts. The flexibility of the solution comes from the machine learning algorithms’ ability to generalize rules from different datasets. Text-specific attributes extracted using Natural Language Processing, together with spatial (layout) features of the document’s elements, serve as the foundation for training the predictive models.
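To make the idea of combining text attributes with position-on-page attributes concrete, here is a minimal Python sketch. It is not DataXstream’s feature set: the `TextElement` schema, the `extract_features` function, and the individual features are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class TextElement:
    """One OCR text element with a normalized bounding box (hypothetical schema)."""
    text: str
    left: float    # 0.0 = left edge of the page, 1.0 = right edge
    top: float     # 0.0 = top edge of the page, 1.0 = bottom edge
    width: float
    height: float


def extract_features(el: TextElement) -> dict:
    """Combine simple text attributes with layout attributes for one element."""
    text = el.text.strip()
    n_chars = len(text)
    n_digits = sum(ch.isdigit() for ch in text)
    return {
        # Text-specific attributes (illustrative choices only)
        "n_tokens": len(text.split()),
        "digit_ratio": n_digits / n_chars if n_chars else 0.0,
        "has_currency": bool(re.search(r"[$€£]", text)),
        "is_upper": text.isupper(),
        # Layout attributes derived from the bounding box
        "x_center": el.left + el.width / 2,
        "y_center": el.top + el.height / 2,
        "area": el.width * el.height,
    }


# Example: a line-item fragment located in the left half of the page
features = extract_features(TextElement("QTY 12 @ $4.50", 0.10, 0.40, 0.30, 0.02))
```

A feature dictionary of this kind could be passed to any tabular classifier. The real solution may use different or far richer features.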
Text elements are parsed from documents with optical character recognition (OCR) technology. Predictive models trained with machine learning algorithms classify the relevance and business content of text elements containing customer and line-item information. The prediction results are displayed in a JavaScript-based user interface that allows users to review and modify them before creating an order. As the user adjusts the predictions, the corrections are collected as feedback for improving future model accuracy.
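One way to picture the feedback loop is to compare the labels the model predicted with the labels the user confirmed, keeping only the differences. The sketch below assumes a hypothetical `Prediction` record and label names; it does not reflect the data model of the actual solution.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """A predicted or user-confirmed label for one text element (hypothetical)."""
    element_id: str
    text: str
    label: str  # illustrative labels, e.g. "customer", "quantity", "unit_price"


def collect_feedback(predicted, reviewed):
    """Return one feedback record per element whose label the user changed."""
    reviewed_by_id = {p.element_id: p for p in reviewed}
    feedback = []
    for pred in predicted:
        final = reviewed_by_id.get(pred.element_id)
        if final is not None and final.label != pred.label:
            feedback.append({
                "element_id": pred.element_id,
                "text": pred.text,
                "predicted_label": pred.label,
                "corrected_label": final.label,
            })
    return feedback


predicted = [
    Prediction("e1", "ACME Supply Co.", "customer"),
    Prediction("e2", "24", "unit_price"),
]
reviewed = [
    Prediction("e1", "ACME Supply Co.", "customer"),
    Prediction("e2", "24", "quantity"),  # the user corrected this label
]
corrections = collect_feedback(predicted, reviewed)
```

Records like these could later be appended to the training data as corrected examples. How the real system stores and reuses feedback is not described in this article.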
Benchmarks show that predictions are returned in seconds from a deployed instance on SAP Data Intelligence. The OCR step, which uses the Amazon Textract API, typically takes one to five minutes to parse the text elements of a document, depending on document size.
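As an illustration of what parsed text elements can look like, the sketch below flattens the `LINE` blocks of a Textract-style response into plain dictionaries sorted in reading order. The `Blocks`, `BlockType`, `Text`, `Confidence`, `Page`, and `Geometry.BoundingBox` fields follow Textract’s documented response shape. The `lines_from_textract` helper and the sample response are hand-written assumptions, and no AWS call is made here.

```python
def lines_from_textract(response: dict) -> list:
    """Flatten the LINE blocks of a Textract-style response into plain dicts."""
    elements = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE, WORD, and other block types
        box = block["Geometry"]["BoundingBox"]
        elements.append({
            "text": block["Text"],
            "page": block.get("Page", 1),  # Page may be absent for single-page input
            "confidence": block.get("Confidence"),
            "left": box["Left"],
            "top": box["Top"],
            "width": box["Width"],
            "height": box["Height"],
        })
    # Approximate reading order: page, then top-to-bottom, then left-to-right
    elements.sort(key=lambda e: (e["page"], e["top"], e["left"]))
    return elements


# Hand-written sample standing in for a real OCR response
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE", "Page": 1},
        {"BlockType": "LINE", "Page": 1, "Text": "10 EA  Copper elbow 1/2 in",
         "Confidence": 98.2,
         "Geometry": {"BoundingBox": {"Left": 0.08, "Top": 0.42,
                                      "Width": 0.60, "Height": 0.02}}},
        {"BlockType": "LINE", "Page": 1, "Text": "PURCHASE ORDER",
         "Confidence": 99.7,
         "Geometry": {"BoundingBox": {"Left": 0.35, "Top": 0.05,
                                      "Width": 0.30, "Height": 0.03}}},
    ]
}
elements = lines_from_textract(sample_response)
```

Each resulting dictionary carries both the recognized text and its bounding box, which is the information a downstream classifier would need.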
Prerequisites and Information
Prerequisites for using this business content are:
- Access to an AWS account with S3 and the Textract API
- Data collected and preprocessed from users’ own purchase orders
- Models trained and exported before deployment
Please contact the team at DataXstream to learn more about this solution.