![]() |
VOOZH | about |
We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.
Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.
Follow TNS on your favorite social media networks.
Become a TNS follower on LinkedIn.
Check out the latest featured and trending stories while you wait for your first TNS newsletter.
Of all the hoopla around AI, the most misguided part is the insistence on fine-tuned large language models (LLMs). Too many believe specializing a model based on a massive collection of domain-specific data is the only way to build useful AI assistants.
This demand for fine-tuning is even more common in highly specialized or technical fields, such as software development and cloud services. A prime example is the ongoing maintenance and troubleshooting of Kubernetes clusters designed to deliver apps. This situation underscores the critical challenge faced by DevOps and app development leaders: There needs to be a better solution for managing the complexity of cloud native infrastructure. These environments often present intractable challenges that defy experience, wisdom or intuition around troubleshooting.
In response, startups and open source projects claim to have fine-tuned existing models to include specialized knowledge about Kubernetes that generic models, even GPT-4 Turbo, wouldn’t normally ingest or have access to.
However, the challenge is not a problem with the fine-tuning itself, but its inability to mimic the human approach to troubleshooting. No matter how intelligent the model, you’ll get no real value unless it can replicate how you perform troubleshooting: gathering disparate resources, jugging all the critical details you’ve found in logs and kubectl output in your head, leaning on your experience, and distilling it all into a logical next step.
There are only two key areas that make an AI assistant useful in the Kubernetes world. Assistants must be:
kubectl get/logs/describe.Fine-tuning just prioritizes hype over what matters to you most: acting on what’s actually happening with your pods, nodes and apps.
The AI and cloud native spaces are growing simultaneously, so new tools are overlapping in these two domains.
New open source command-line interface (CLI) tools like K8sgpt and KoPylot wrap their operations around kubectl to gain access to your cluster’s state. By running that command on your behalf using the context available in your .kube/config file, these tools can read and process the output directly, rather than forcing you to switch context. They then proxy data to OpenAI’s API to deliver AI-generated responses in your terminal.
It’s a clever workaround, but these CLI tools still require a high level of Kubernetes knowledge, or another CLI tool. You need to know the right commands, not just a question about your cluster’s status, to initiate interaction.
Another open source tool, mico, advances this concept by converting your natural language queries into kubectl commands. You can ask mico to, for example, print the number of times each pod in xyz namespace has restarted, and it will use the jsonpath argument in kubectl to filter output down to just the relevant line.
We love to see how the open source community is leveraging AI, but these tools are limited: They either understand the cluster’s state but can’t handle natural language queries, or they help you write queries but only return kubectl output without the next troubleshooting steps. You could replace the default OpenAI models powering these tools with a specialized alternative, but that won’t help you reduce your troubleshooting time or help your less-experienced peers monitor their apps.
The answer is an AI assistant that excels in understanding cluster state and interpreting natural language — fine-tuning be damned.
Without access to the cluster state, the only way to get help from your AI assistant is to play a game of telephone on your path to resolving issues. Even with a fine-tuned AI, you can expect the conversation to go a little like this:
kubectl get pods when your deployment doesn’t come up immediately.CrashLoopBackOff error.CrashLoopBackOff error include insufficient memory, missing dependencies and container failure due to port conflict. Perhaps it’s smart enough to ask you to run kubectl describe pod POD_NAME for clues about its resource usage and limits … perhaps.Terminated state and last emitted event: Back-off restarting failed container.kubectl get events --field-selector involvedObject.name=POD_NAME to search for other possible causes.kubectl logs POD_NAME --all-containers to search for specific errors with your containerized app or its dependencies from your manifest, like a database or messaging queue.docker-entrypoint.sh saying it couldn’t execute because of a not found argument.You certainly received assistance from your AI tool, but the assistance was not particularly efficient. It might have saved you from Googling each error or running kubectl ... help commands to find the right syntax. But because you were responsible for accurately sharing information about your cluster’s state and understanding each step from your AI assistant, you still carried almost all the cognitive load and did not save very much time.
Access to a cluster’s state is essential. A valuable AI assistant must automatically respond to your original question about CrashLoopBackOff by running kubectl commands itself, parsing the output for clues, bringing in context from the collective Kubernetes troubleshooting knowledge available online and delivering a precise path to remediation — no runbooks or deep dives into documentation required.
A Kubernetes AI assistant that can read outputs or logs and deliver an executive summary of what to think about next is great, but it assumes you have enough Kubernetes knowledge to know what question or specific kubectl command to run. The real added value, especially for application developers with limited knowledge of Kubernetes operations, comes from the ability to ask questions in your natural language:
xyz namespace?”NotReady?”When the AI can translate a question into the relevant command to gather state context (kubectl get pods -n xyz), it can effectively reduce the cognitive load on your team. DevOps engineers can reduce their mean time to resolution (MTTR) by using the AI assistant as a resource to reflect their specialized knowledge, and developers can troubleshoot their apps in a self-service fashion.
When the AI assistant runs on platforms where your team operates, like Slack or Microsoft Teams, this knowledge is more accessible and collaborative. When the next significant incident strikes your app, DevOps engineers and developers can engage your AI assistant in the same channel for more targeted root cause analysis and a remediation plan that goes beyond a temporary fix.
To address these issues, Botkube recently launched AI Assistant, which is designed to operate in both areas of Kubernetes troubleshooting and directly in collaboration platforms.
The assistant works by listening to your natural language questions about your Kubernetes cluster and its apps, converting your queries into the appropriate kubectl get/logs/describe commands and interfacing with an LLM to explore root causes and opportunities. From this, the assistant can deliver insights and recommend next steps on your troubleshooting journey.
This assistant enhances Botkube’s notification, investigation and troubleshooting tools by operating on the most valuable bounds of both areas. Using AI Assistant helps you research why an issue is happening, learn kubectl to perform basic operations, or tap into Kubernetes expertise to seek out root causes and find a workable solution.
Under the hood, Botkube’s AI Assistant uses ChatGPT-4.
We’re not ashamed to admit that we’re using the same model as every open source tool and most new paid platforms. We can’t fine-tune what ChatGPT knows, but we can add nuance to queries and tweak the nature of its responses to provide a better troubleshooting experience.
For example, we layer additional instructions on top of common natural language queries and data about a cluster’s state to “force” ChatGPT to provide more comprehensive answers. We also enrich ChatGPT’s default output with better formatting and organizational structure to help you focus on troubleshooting, not deciphering instructions.
Adding value just before and after interfacing with an LLM can do much more than fine-tuning. We designed AI Assistant to be context-aware and compatible with the questions you genuinely want to ask of your cluster — not the complex kubectl commands you may be used to.
The opportunities are generally bounded only by how much detail kubectl emits and the Kubernetes knowledge built into OpenAI’s latest models … which is quite a lot. You can ask:
xyz namespace are healthy.Botkube’s executor features then let you turn the AI Assistant’s insights into immediate remediation by helping you craft the right kubectl through a drop-down interface (rather than a dozen runs of kubectl ... help).
DevOps engineers can speed up workflows by spending less time in the terminal and more time where collaboration happens. And app developers can fix Kubernetes issues on their own instead of filling out a ticket and waiting for someone to help.
No matter your title or role, you can start using Botkube’s embedded AI assistant today with a new or existing Botkube account. Sign up now for free to enable our Kubernetes AI assistant with a single click, with no configuration required.
Toss it a few questions and you’ll quickly see why cluster awareness and natural language — not fine-tuned LLMs — are the best path forward to manage the complexity of your cloud native infrastructure.