Top 4 ways AI can help companies manage IT infrastructure

By Dr Usama Fayyad, Chairman & Founder of Open Insights

December 03, 2022


AI insights from Dr Usama Fayyad, Chairman & Founder of Open Insights and Executive Director of the Institute for Experiential AI, Northeastern University

Over the last decade, as more and more IT resources became available through the cloud, organisations began seeking solutions that were hardware-independent and programmatically flexible. Software-defined infrastructure (SDI) emerged as one of those solutions: an end-to-end means of unifying computing, networking, and storage domains under a single stack. As a virtual technology, SDI provided organisations with diverse, semi-autonomous environments that could scale with growth.

But now IT infrastructures are facing a new challenge. The sheer volume of data they’re tasked with processing has put pressure on the flexibility, security, and adaptability of systems. Contingency planning has to account for a variety of data flows - from apps to sensors to dashboards to edge networks - and static, hard-coded programming controls are increasingly not up to the task.

AI-defined infrastructures (ADIs) inject traditional SDIs with self-learning, self-correcting methodologies. They can allocate resources according to system demands, structure components based on past behaviour, and autonomously respond to data events before an error has even occurred. These capabilities are necessary for settings where the sheer amount of data being collected and events being monitored is so overwhelming that human interpretation becomes a challenge. Algorithms can help reduce overload, focus attention, and automatically triage the majority of metrics that indicate no need for intervention.

What separates AI-defined infrastructures from alternatives is the capacity for self-improvement. Rather than behaving according to narrowly defined inputs and human controls, machine learning (ML) infrastructures can identify patterns in the data and respond dynamically and automatically to new environments. This offers an attractive alternative to traditional “thresholding” approaches, which are prone to missing configurations and false alarms.

AI-based infrastructures could not have arrived at a more critical time for enterprises, which face rising cloud computing costs as well as heightened privacy expectations from the market. In response, IT leaders are looking for systems that can accurately predict resource expenditures and keep them to a minimum. AI-driven IT infrastructures offer one way to do so. Let’s take a look at four specific use cases:

1 Intelligent resource provisioning

The cloud offers access to vital computing resources for enterprises with limited facilities. However, providers typically offer only narrow solutions that, when combined with others, deliver a patchwork infrastructure with variable bandwidths, geographic deviations, and stochastic service requests. The natural remedy would be to load up on cloud services, but such an expensive, wasteful strategy is unlikely to last.

Machine learning models can be used to predict and provision resources in real time. This includes everything from trend analysis to emergency firefighting. Workloads can be managed and allocated according to the availability of resources in a given system. Some IT departments may see fit to fully automate their resource provisioning, but AI experts recommend a more balanced approach. With a “human in the loop”, AI-driven infrastructures achieve the best of both worlds: automation of routine processes with manual control of more context-sensitive engagements. Furthermore, AI algorithms can learn to identify situations where bringing down the infrastructure and re-instantiating it at a later time results in major computing savings.
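
To make the idea of predictive provisioning with a human in the loop concrete, here is a minimal sketch in Python. It is an illustration only: simple exponential smoothing stands in for whatever forecasting model a real system would use, and the names `forecast_next`, `plan_instances`, `headroom`, and `max_auto_step` are assumptions made for this example, not part of any product or library.

```python
import math


def forecast_next(history, alpha=0.5):
    """Simple exponential smoothing: the final smoothed level is the next-step forecast."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def plan_instances(history, capacity_per_instance, current, headroom=0.2, max_auto_step=2):
    """Return (target_instances, needs_review) for the next interval.

    All parameters are hypothetical: `headroom` is spare capacity kept on top of
    the forecast, and `max_auto_step` is the largest change applied without a human.
    """
    demand = forecast_next(history)
    target = max(1, math.ceil(demand * (1 + headroom) / capacity_per_instance))
    # Small adjustments are automated; larger jumps are routed to an operator.
    needs_review = abs(target - current) > max_auto_step
    return target, needs_review


recent_load = [100, 200]  # made-up requests per second
target, needs_review = plan_instances(recent_load, capacity_per_instance=50, current=1)
print(target, needs_review)
```

The point of the `needs_review` flag is the balance described above: routine changes go through automatically, while unusually large ones wait for a person.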

2 Intelligent scalability

IT departments have a mandate to scale their networks, databases, and applications to the needs of the organisation - whatever those needs may be. Critically, infrastructural capacities need to be able to scale with the demand placed on them over time. The “time” aspect here is crucial. After all, a system can only be said to be adaptable if it’s responsive to conditions over time.

Instead of using manually coded rules and configurations, AI-assisted infrastructures can elastically scale themselves according to demand forecasts. Machine learning models monitor and collect historical data for resource demand on various applications, helping to inform future behaviour as system capacities scale up or down. But it’s not all technical; people play a critical part in the scalability of IT infrastructures, too. Experiential AI - that is, AI with a “human in the loop” - offers the best opportunity for IT departments to account for the human variable.
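
As a rough illustration of scaling from historical demand rather than hand-written rules, the Python sketch below builds an hour-of-day load profile from past samples and derives a replica count from it. The averaging is a deliberately simple stand-in for a trained forecasting model, and `hourly_profile`, `replicas_for`, and `per_replica_rps` are hypothetical names chosen for this example.

```python
import math
from collections import defaultdict
from statistics import mean


def hourly_profile(samples):
    """Average observed load per hour of day from (hour, requests_per_second) pairs."""
    buckets = defaultdict(list)
    for hour, load in samples:
        buckets[hour].append(load)
    return {hour: mean(loads) for hour, loads in buckets.items()}


def replicas_for(hour, profile, per_replica_rps, min_replicas=1, max_replicas=20):
    """Replica count for the expected load at `hour`, or None when there is no history."""
    expected = profile.get(hour)
    if expected is None:
        return None  # no data for this hour: leave the decision to an operator
    wanted = math.ceil(expected / per_replica_rps)
    return max(min_replicas, min(max_replicas, wanted))


history = [(9, 400), (9, 600), (3, 40)]  # made-up samples
profile = hourly_profile(history)
print(replicas_for(9, profile, per_replica_rps=100))
print(replicas_for(12, profile, per_replica_rps=100))
```

Returning `None` for hours with no history is one way to account for the human variable: the system scales itself where it has evidence and defers where it does not.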

3 Intelligent storage management

AI emerged as a tool to make sense of an environment where data is generated in constant abundance. Before the era of big data, storage systems were static and hard-programmed - not ideal for the dynamic nature of modern data flows. As a consequence, a lot of important data got missed or left behind.

AI allows IT departments to dynamically monitor and calibrate storage demands. More specifically, AI-driven infrastructures can create tiered storage systems that reflect data lifecycles and input/output (I/O) performance in real time. Such an intelligent system can improve efficiency and reduce storage costs. With predictive modelling, storage can be automatically adjusted based on usage patterns, with stale data off-loaded onto low-cost storage. This could grow over time to enhance traditional multi-layer caching techniques.
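
The tiering decision can be sketched in a few lines of Python. This version uses fixed cut-offs purely for illustration; in an AI-driven system a predictive model would supply the thresholds or a predicted access probability instead. The `ObjectStats` fields, the tier names, and the default values of `hot_reads` and `stale_days` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ObjectStats:
    """Hypothetical usage statistics collected for one stored object."""
    name: str
    days_since_access: int
    reads_last_30d: int


def choose_tier(stats, hot_reads=100, stale_days=90):
    """Pick a storage tier from usage patterns (fixed rules standing in for a model)."""
    if stats.reads_last_30d >= hot_reads:
        return "hot"   # frequently read: keep on fast storage
    if stats.days_since_access >= stale_days:
        return "cold"  # stale: off-load to low-cost storage
    return "warm"


objects = [
    ObjectStats("orders.db", days_since_access=0, reads_last_30d=5000),
    ObjectStats("q1-report.pdf", days_since_access=12, reads_last_30d=3),
    ObjectStats("2019-logs.tar", days_since_access=400, reads_last_30d=0),
]
for obj in objects:
    print(obj.name, choose_tier(obj))
```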

4 Anomaly detection

The flipside of intelligent resource management is the ability to identify, flag, and explain unexpected resource surges. Anomaly detection - whether automated through machine learning or hard programmed - refers to the ability to identify critical events, ideally leading to faster remedies and shorter outages.

With the help of explainable AI, ML systems can produce explanations that reveal why an anomaly has occurred. This can help in real-time root cause analysis, preventing service disruptions as well as false alarms. Models can be trained on enterprise-specific data, resulting in pinpoint-precise detection algorithms.
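
A very small Python sketch shows the difference between a fixed threshold and a baseline learned from an enterprise's own data, along with a crude form of explanation. It scores each metric by how far it sits from its historical mean (a z-score) and reports the metric that deviates most. This is a simple statistical stand-in, not an explainable-AI method in the full sense, and the function names, metric names, and the `limit` of 3.0 are assumptions for illustration.

```python
from statistics import mean, pstdev


def zscores(baseline, current):
    """Deviation of each current metric from its own history, in standard deviations."""
    scores = {}
    for metric, history in baseline.items():
        mu, sigma = mean(history), pstdev(history)
        scores[metric] = 0.0 if sigma == 0 else (current[metric] - mu) / sigma
    return scores


def detect(baseline, current, limit=3.0):
    """Return (is_anomaly, top_metric, top_score); top_metric is the likeliest culprit."""
    scores = zscores(baseline, current)
    top = max(scores, key=lambda metric: abs(scores[metric]))
    return abs(scores[top]) > limit, top, scores[top]


baseline = {"cpu": [50, 52, 48, 50], "disk_io": [10, 12, 8, 10]}  # made-up history
is_anomaly, culprit, score = detect(baseline, {"cpu": 51, "disk_io": 30})
print(is_anomaly, culprit, round(score, 1))
```

Because the baseline comes from each metric's own history, the same code flags a disk I/O reading of 30 as unusual here while tolerating a CPU reading of 51, something a single global threshold could not do.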

Don’t neglect the human element

Data collection, content management, network security, server optimisation, resource planning, even customer relationship management - all these activities are improved with an AI-driven IT infrastructure. But organisations are encouraged to approach AI with some caution. AI systems may be biased or, when left to their own devices, poor at contextualising aberrant data upon deployment. The right approach is for algorithms to look for confirmation or feedback from human operators. Every intervention by an operator is a chance to collect data on the right actions and the context around them, enabling the AI to learn and improve over time.
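
The feedback loop described here can be sketched as follows. When the system is confident, it acts on its own; otherwise it asks an operator and stores the operator's choice, together with its context, as a labelled example for later retraining. The `FeedbackLoop` class, the `confidence_floor` value, and the `ask_operator` callback are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackLoop:
    """Route low-confidence decisions to a human and keep their answers as training data."""
    confidence_floor: float = 0.9
    labelled_examples: list = field(default_factory=list)

    def decide(self, context, proposed_action, confidence, ask_operator):
        if confidence >= self.confidence_floor:
            return proposed_action  # routine case: act automatically
        chosen = ask_operator(context, proposed_action)
        # Every intervention becomes a (context, correct action) example.
        self.labelled_examples.append((context, chosen))
        return chosen


loop = FeedbackLoop()
# A stand-in operator who overrides the proposal; in practice this would be a ticket or prompt.
action = loop.decide({"cpu": 95}, "restart", 0.4, lambda ctx, proposal: "scale_up")
print(action, len(loop.labelled_examples))
```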

For the foreseeable future, human beings will be better at understanding the ecosystem in which an AI is deployed. For that reason, the experiential AI model offers the best of both worlds: human oversight with algorithmic insight.
