Decisions and predictions about opportunities for action now draw on higher-volume, faster-changing sources of data, and on data that itself changes faster. With all this change, it is important to understand that the technology supporting analytics needs to keep up with the speed of interaction and transaction.
Traditional data architectures keep transactional applications and analytics applications in separate silos, with data integration pipelines moving data between the two. This architecture cannot support the data load and processing speed required by mission-critical engagement systems that use transactional data for real-time analytics.1
I disagree with that construct of real-time; the belief that the infrastructure and solutions for analytics do not need to be as performant as those that power the transaction itself is no longer reality. Analytics is part of the transaction. Analytics is the input that starts transactions and keeps them alive.
A data streaming architecture gives you the ability to make just-in-time decisions and to put the idea that knowledge is power into practice. Eran Levy has written a free ebook on streaming data called The Architect's Guide to Streaming Data and Data Lakes.
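To make the just-in-time idea concrete, here is a minimal sketch of a consumer acting on each event the moment it arrives rather than waiting for a batch job. The event names, threshold, and the `event_stream` generator are illustrative assumptions standing in for a real stream (Kafka, Kinesis, and the like), not any specific product's API.

```python
import time
from typing import Dict, Iterator


def event_stream() -> Iterator[Dict]:
    """Hypothetical stand-in for a real event stream."""
    sample = [
        {"user": "u1", "action": "add_to_cart", "value": 120.0},
        {"user": "u2", "action": "page_view", "value": 0.0},
        {"user": "u1", "action": "cart_abandon", "value": 120.0},
    ]
    for event in sample:
        yield event
        time.sleep(0.1)  # simulate events arriving over time


def decide(event: Dict) -> str:
    """Make a just-in-time decision as soon as the event is seen."""
    if event["action"] == "cart_abandon" and event["value"] > 100:
        return f"offer_discount_to:{event['user']}"
    return "no_action"


for event in event_stream():
    print(event["action"], "->", decide(event))
```

The point of the sketch is the shape of the loop: the decision happens inside the flow of the transaction, not in a separate analytics silo after the fact.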
The biggest aspect of this architecture is the ability to conform data to standard structures, formats, and values that allow the analytics to work quickly. As your sources of data grow, codifying these rules becomes exponentially harder. It is important that consumers of the data understand not just the data models and what the current data represents, but also how users of the source system use it and what it is capable of. Without this understanding, data workers and everyone else who consumes the data will constantly be chasing their tails as more data is introduced.
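As an illustration of what codifying conformance rules can look like, the sketch below normalizes records from two hypothetical source systems into one standard structure (field names, ISO dates, amounts as whole currency units) before they reach analytics. The source shapes and rules are assumptions for the example, not the author's specific schema.

```python
from datetime import datetime
from typing import Dict


def conform(record: Dict, source: str) -> Dict:
    """Map a source-specific record onto one standard structure for analytics."""
    if source == "orders_v1":
        return {
            "order_id": str(record["id"]),
            "amount": float(record["total"]),  # already in whole currency units
            "ordered_at": datetime.strptime(record["date"], "%m/%d/%Y").date().isoformat(),
        }
    if source == "orders_v2":
        return {
            "order_id": record["order_id"],
            "amount": record["amount_cents"] / 100.0,  # standardize on whole currency units
            "ordered_at": record["ordered_at"][:10],   # ISO timestamp -> ISO date
        }
    raise ValueError(f"No conformance rules defined for source: {source}")


# Two differently shaped source records conformed to one shape.
print(conform({"id": 42, "total": "19.99", "date": "07/04/2024"}, "orders_v1"))
print(conform({"order_id": "A-99", "amount_cents": 1999, "ordered_at": "2024-07-04T12:30:00Z"}, "orders_v2"))
```

Every new source adds another branch of rules like these, which is exactly why codifying them gets harder as sources multiply and why consumers need to understand how the source systems are actually used.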
The owners and users of the transactional systems that power the consumer experience need to collaborate with their downstream partners to make the whole ecosystem better, from infrastructure to integration to the data flowing through all of it; ensuring consistency across all of it allows prediction to keep up with everything else.