Prevent Data Feed Ingestion Errors with IFIF
The Intelligent Feed Ingestion Framework (IFIF) uses machine learning to avoid long delays and reduce the time it takes to get data into critical business systems.
Intelligent Feed Integrator
The Intelligent Feed Integrator (IFI) is configured to be aware of existing data feeds and their types and structures, which are stored in a feed dictionary maintained in the Metadata repository. Each feed is defined by its source and name, its format (CSV, JSON, XML, etc.), and its structure (schema, column layout, etc.). The IFI is responsible for reading feed content as it arrives and requesting evaluation by the Intelligent Feed Evaluator.
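As an illustration of what a feed dictionary entry might hold, the sketch below models a feed by its source, name, format, and column layout. The class names (FeedDefinition, FeedDictionary), the field names, and the sample feed are assumptions made for this example and are not part of the framework.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class FeedDefinition:
    """One hypothetical feed dictionary entry."""
    source: str               # who supplies the feed
    name: str                 # feed name, unique per source
    fmt: str                  # e.g. "csv", "json", "xml"
    columns: Tuple[str, ...]  # expected column layout, in order


class FeedDictionary:
    """In-memory stand-in for the dictionary kept in the Metadata repository."""

    def __init__(self) -> None:
        self._feeds: Dict[Tuple[str, str], FeedDefinition] = {}

    def register(self, feed: FeedDefinition) -> None:
        # A feed is identified by its (source, name) pair.
        self._feeds[(feed.source, feed.name)] = feed

    def lookup(self, source: str, name: str) -> Optional[FeedDefinition]:
        # Returns None for feeds the integrator does not know about.
        return self._feeds.get((source, name))


feeds = FeedDictionary()
feeds.register(
    FeedDefinition("vendor_a", "daily_prices", "csv", ("ticker", "close", "volume"))
)
known = feeds.lookup("vendor_a", "daily_prices")
unknown = feeds.lookup("vendor_b", "positions")
```

Keying entries on the (source, name) pair mirrors the way feeds are identified in the description above; an arriving feed with no entry would simply not be recognized.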
Intelligent Feed Machine Learning Module
The Intelligent Feed Machine Learning Module (IFMLM) evaluates feeds to establish thresholds, ranges, and content features for each of the data elements that make up a feed. The IFMLM is trained with representative good data for each feed to establish the baseline criteria against which future instances of the same feed will be evaluated. It can be trained with as little as a single feed instance, and it supports an iterative workflow in which subsequent valid instances are introduced as new training data. This allows for rapid deployment and use of the system while ensuring that it leverages all available information to improve over time.
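The iterative training workflow can be pictured with a deliberately simple baseline: a running minimum and maximum for one numeric data element. This is only a sketch under that assumption; the source does not specify the learning algorithm, and the ColumnBaseline class and sample values are hypothetical.

```python
class ColumnBaseline:
    """Running min/max for one numeric data element (illustrative only)."""

    def __init__(self):
        self.low = None
        self.high = None
        self.instances = 0  # number of feed instances used for training

    def train(self, values):
        """Fold one valid feed instance into the learned range."""
        values = list(values)
        if not values:
            return
        lo, hi = min(values), max(values)
        # The range can only widen as more valid instances are added.
        self.low = lo if self.low is None else min(self.low, lo)
        self.high = hi if self.high is None else max(self.high, hi)
        self.instances += 1


baseline = ColumnBaseline()
baseline.train([101.5, 99.0, 100.2])  # usable after a single instance
baseline.train([98.4, 102.0])         # a later valid instance refines it
```

A baseline like this is usable after the first call to train, and each additional valid instance adjusts it, which is the property the paragraph above describes.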
Intelligent Feed Evaluator
The Intelligent Feed Evaluator (IFE) is configured to accept feed evaluation requests from the IFI. In response to a request, the IFE inspects the data for the requested feed, applying thresholds, ranges, and models to each property contained in the feed. When inspection is complete, the results are written to the Evaluation Results repository, where data managers and other interested users can review and remediate them.
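A minimal sketch of the inspection step, assuming rows arrive as dictionaries and that learned ranges are stored as (low, high) pairs per property. The function name evaluate_feed, the result layout, and the sample data are hypothetical; threshold and model checks are omitted for brevity.

```python
def evaluate_feed(rows, ranges):
    """Check every property with a learned range and collect violations."""
    violations = []
    for row_number, row in enumerate(rows, start=1):
        for column, (low, high) in ranges.items():
            value = row.get(column)
            if value is None:
                violations.append(
                    {"row": row_number, "column": column, "reason": "missing"}
                )
            elif not low <= value <= high:
                violations.append(
                    {
                        "row": row_number,
                        "column": column,
                        "reason": "out_of_range",
                        "value": value,
                    }
                )
    # The returned record is what would be written to the results repository.
    return {"status": "failed" if violations else "passed", "violations": violations}


learned_ranges = {"close": (90.0, 110.0), "volume": (0, 5_000_000)}
rows = [
    {"close": 101.2, "volume": 120_000},
    {"close": 250.0, "volume": 90_000},  # close is outside the learned range
]
result = evaluate_feed(rows, learned_ranges)
```

Recording the row, property, and reason for each violation gives reviewers enough context to remediate a failed feed without re-running the inspection.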
Intelligent Feed User Interface
The Intelligent Feed User Interface provides a smooth and intuitive user experience for accomplishing the following workflows:
- Identifying and reviewing feeds that contain content that violates a learned threshold, falls outside a learned range, or fails to conform to a discovered model.
- Promoting passing or remediated feeds to training data status so that they are reincorporated into the learning algorithm that drives the dynamic thresholds, ranges, and content features.
- Curating learned models, thresholds, and ranges where domain expertise needs to override discovered knowledge.
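The curation workflow implies that an expert-supplied value takes precedence over a learned one. One way this could be resolved is sketched below; the effective_range function, the dictionary layout, and the sample ranges are assumptions for illustration only.

```python
def effective_range(column, learned, overrides):
    """Prefer a curated range over the learned one when both exist."""
    return overrides.get(column, learned[column])


# Hypothetical learned ranges and one expert override.
learned = {"close": (98.4, 102.0), "volume": (0, 5_000_000)}
overrides = {"close": (0.01, 10_000.0)}  # domain expert widens the range

close_range = effective_range("close", learned, overrides)
volume_range = effective_range("volume", learned, overrides)
```

Keeping overrides separate from learned values means later training can continue to update the learned range without discarding the curated decision.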
Evaluation Result and Metadata Repository
This repository maintains several key data sets that are required for platform operation:
- Feed dictionary
- Feed metadata (types and structures)
- Learned thresholds
- Learned ranges
- Learned content features (NLP)
- Evaluation results and status for remediation, etc.
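To make the relationship between these data sets concrete, the sketch below lays out three of them as tables in an in-memory SQLite database. The table names, column names, and status values are assumptions made for this example; the source does not describe the repository's actual storage technology or schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE feed_dictionary (
        feed_id INTEGER PRIMARY KEY,
        source  TEXT NOT NULL,
        name    TEXT NOT NULL,
        format  TEXT NOT NULL,
        UNIQUE (source, name)
    );
    CREATE TABLE learned_ranges (
        feed_id INTEGER NOT NULL REFERENCES feed_dictionary(feed_id),
        element TEXT NOT NULL,
        low     REAL,
        high    REAL
    );
    CREATE TABLE evaluation_results (
        feed_id INTEGER NOT NULL REFERENCES feed_dictionary(feed_id),
        status  TEXT NOT NULL  -- hypothetical values: passed, failed, remediated
    );
    """
)

conn.execute(
    "INSERT INTO feed_dictionary (source, name, format) VALUES (?, ?, ?)",
    ("vendor_a", "daily_prices", "csv"),
)
conn.execute("INSERT INTO evaluation_results (feed_id, status) VALUES (1, 'failed')")

# Feeds awaiting remediation: evaluation results joined to their definitions.
pending = conn.execute(
    """
    SELECT d.source, d.name
    FROM evaluation_results AS r
    JOIN feed_dictionary AS d USING (feed_id)
    WHERE r.status = 'failed'
    """
).fetchall()
```

Keying learned values and evaluation results on the feed dictionary entry keeps everything known about a feed reachable from a single identifier.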