Is Your Data Ready for Machine Learning (ML) or AIOps?

How do you gain insight and visibility into your information technology services? You have many options: telemetry, NetFlow collection, syslog collection, network taps (lawful data interception), SNMP polling, ICMP or TCP polling, application APIs, orchestration software APIs, cloud APIs… the list goes on.

My bet is that any organisation with a considerable IT service stack subscribes to several of these collection methods, whether it likes it or not. Vendors still back certain standards, and autonomous departments subscribe to whatever gives them the necessary insights.

If you look at information structures and storage methods, you quickly notice the disparity between all this information.

Our previous CTO, Ludwig Myburgh, always used to say that monitoring challenges are database challenges. What did he mean by that?

Historically, tools are just tools and dashboards are just dashboards – but how you store, collate and summarise your data is where the real differentiator lies.

This has become even more challenging now. The sad reality is that infrastructure technology engineers are not concerned about the back-end, the database or the APIs, when they should be.

What we currently see in the market is a flood of products that expect all data outside of their own to be fed into the “AIOps system”. The vendor needs to consume “your data”, so it is important to know your data, collection types, storage databases and access requirements.

Also, have a look at the APIs. Automation is only possible if you can reach into another system and make a meaningful change (once your ML model has migrated off its training wheels, that is).

If, due to a business system’s high latency (as experienced by its users), you would like to apply more bandwidth or change QoS, then make sure that you have the API access to enforce such a change. (Good-bye, change control.)
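To make that concrete, here is a minimal sketch of what "reaching into another system" could look like. Everything specific in it is an assumption for illustration: the controller URL, the payload fields and the 200 ms threshold are made up, and a real controller will have its own API and authentication. The request is only prepared, not sent.

```python
import json
from statistics import median
from typing import Optional
from urllib.request import Request

# Assumed values for illustration only.
LATENCY_THRESHOLD_MS = 200.0
CONTROLLER_URL = "https://controller.example.com/api/qos-policies"  # hypothetical endpoint


def build_qos_change(service: str, latency_ms: list[float]) -> Optional[Request]:
    """Return a prepared (unsent) change request if median latency is too high."""
    if not latency_ms or median(latency_ms) <= LATENCY_THRESHOLD_MS:
        return None
    # Made-up payload schema; a real controller defines its own.
    payload = {"service": service, "action": "raise_priority"}
    return Request(
        CONTROLLER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Median of these samples is 250 ms, above the assumed threshold.
request = build_qos_change("billing", [180.0, 250.0, 310.0])
```

Sending the request is deliberately left out. If you cannot fill in a real endpoint and credentials where the placeholders are, then the automation step is not available to you yet.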

Be on the lookout for collaboration tools: systems that ingest data and spew out events that require multiple people to collaborate online in order to find the root cause.

To me, that sounds like the traditional critical situation (crit sit) session with a knowledge base. That said, if there are gigabytes of data to review before patterns and anomalies can be consolidated into a single problem, applying ML is still very useful in this instance.

Some reasons why you should care about your visibility data standards include the following:

  • Data used in the context of its own system is useful, whereas data used by third-party systems can be very powerful.
  • Data enrichment is key to understanding the data in the context of your business. Data enrichment can only be applied if you are in control of your data.
  • Seamless integration is required to allow for the bidirectional automation of systems.
  • Most third-party systems require data to be ‘normalised’ before being ingested. Can you normalise your data?
  • We need historical data to train ML models. How much do you have?
  • We need data from multiple data sources for meaningful AIOps implementations. Open standards access rings a bell here.
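As a rough illustration of the normalisation and history questions in the list above, the sketch below maps records from two collection methods onto one common shape and then measures how much history they cover. The input field names (`ts`, `host`, `util_pct`, `time`, `exporter`, `bytes`) and the `Metric` schema are hypothetical; your collectors will emit something different.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Metric:
    """Common record shape; all field names here are illustrative."""
    timestamp: datetime  # always stored in UTC
    source: str          # collection method, e.g. "snmp" or "netflow"
    device: str          # lower-cased device name
    name: str            # metric name
    value: float


def from_snmp(row: dict) -> Metric:
    # Hypothetical SNMP poller output: epoch seconds, utilisation in percent.
    return Metric(
        timestamp=datetime.fromtimestamp(row["ts"], tz=timezone.utc),
        source="snmp",
        device=row["host"].lower(),
        name="interface_utilisation_percent",
        value=float(row["util_pct"]),
    )


def from_netflow(row: dict) -> Metric:
    # Hypothetical flow collector output: ISO 8601 time, byte counts.
    return Metric(
        timestamp=datetime.fromisoformat(row["time"]).astimezone(timezone.utc),
        source="netflow",
        device=row["exporter"].lower(),
        name="flow_bytes",
        value=float(row["bytes"]),
    )


def history_days(metrics: list[Metric]) -> float:
    """Span between the oldest and newest record, in days."""
    if not metrics:
        return 0.0
    stamps = [m.timestamp for m in metrics]
    return (max(stamps) - min(stamps)).total_seconds() / 86400


records = [
    from_snmp({"ts": 1700000000, "host": "CORE-SW1", "util_pct": "73.5"}),
    from_netflow({"time": "2023-11-16T22:13:20+00:00", "exporter": "core-sw1", "bytes": 18432}),
]
```

Once both records share a device name, a UTC timestamp and a numeric value, a third-party system (or your own ML model) can treat them as one data set, and `history_days(records)` gives a first answer to "how much do you have?".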

The diagram above is a great example of a central data repository (API) that can be integrated with control systems to effect change and automation. The only question is, where does the ML sit? Is it on the data set, or is it an API integration ingesting the data?

Either way, if the data is normalised and the control APIs are mature, then this makes for a great foundation for end-to-end, ML-based automation – true AIOps.

Emile Biagio


