Large Language Models (LLMs) are now widely used in applications such as machine translation, chatbots, text summarization, and sentiment analysis, driving advances in natural language processing (NLP). However, deploying and managing these models in production is difficult, which is where LLMOps comes in. LLMOps refers to the set of practices, tools, and processes used to develop, deploy, and manage LLMs in production environments.
MLflow is an open-source platform that provides a set of tools for tracking experiments, packaging code, and deploying models in production. Its centralized model registry simplifies the management of model versions and makes it easy to share models and collaborate with team members, which has made it a popular choice for data scientists and machine learning engineers who want to streamline their workflow and improve productivity.
This article was published as a part of the Data Science Blogathon.
Several factors make managing and deploying LLMs in a production setting difficult, including resource management, model performance, model versioning, and infrastructure.
MLflow is an open-source platform for managing the machine learning lifecycle. It provides a set of tools and APIs for managing experiments, packaging code, and deploying models, and it can be used to deploy and manage LLMs in production environments.
Hugging Face Transformers is a popular open-source library for building natural language processing models. Thanks to MLflow's built-in support, these models are straightforward to deploy and manage in a production setting. To use Hugging Face Transformers with MLflow, follow these steps:
!pip install transformers
!pip install mlflow
import transformers
import mlflow

chat_pipeline = transformers.pipeline(model="microsoft/DialoGPT-medium")

with mlflow.start_run():
    model_info = mlflow.transformers.log_model(
        transformers_model=chat_pipeline,
        artifact_path="chatbot",
        input_example="Hi there!"
    )

# Load as interactive pyfunc
chatbot = mlflow.pyfunc.load_model(model_info.model_uri)

# Make predictions
chatbot.predict("What is the best way to get to Antarctica?")
>>> 'I think you can get there by boat'
chatbot.predict("What kind of boat should I use?")
>>> 'A boat that can go to Antarctica.'
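Once a model like this is logged, it is typically exposed through MLflow's scoring server (for example with `mlflow models serve`), which accepts JSON on an `/invocations` endpoint. The sketch below only builds such a request and does not send it. The host, port, and the helper name `build_invocation_request` are assumptions for illustration, and the exact payload shape can differ between MLflow versions.

```python
import json
import urllib.request


def build_invocation_request(prompts, url="http://127.0.0.1:5000/invocations"):
    """Build (but do not send) a POST request for an MLflow scoring server.

    Assumes the MLflow 2.x JSON format, where raw inputs go under "inputs".
    """
    payload = json.dumps({"inputs": list(prompts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_invocation_request(["What is the best way to get to Antarctica?"])
print(request.full_url)
print(request.data.decode("utf-8"))

# Sending the request requires a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect before pointing it at a real deployment.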
OpenAI is another popular platform for working with LLMs. MLflow supports OpenAI models, making them easier to deploy and manage in a production environment. To use OpenAI models with MLflow, follow these steps:
!pip install openai
!pip install mlflow
from typing import List
import openai
import mlflow

# Define a functional model with type annotations
def chat_completion(inputs: List[str]) -> List[str]:
    # Model signature is automatically constructed from
    # type annotations. The signature for this model
    # would look like this:
    # ----------
    # signature:
    #   inputs: [{"type": "string"}]
    #   outputs: [{"type": "string"}]
    # ----------
    outputs = []
    for prompt in inputs:
        # Note: openai.ChatCompletion exists only in openai SDK versions before 1.0
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            # Send each input string as the user message
            messages=[{"role": "user", "content": prompt}]
        )
        outputs.append(completion.choices[0].message.content)
    return outputs

# Log the model
mlflow.pyfunc.log_model(
    artifact_path="model",
    python_model=chat_completion,
    pip_requirements=["openai"],
)
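The functional model above calls the OpenAI API directly, so it cannot be exercised without network access and an API key. One way to check the input/output contract offline is to inject the completion call. In the sketch below, `make_chat_completion` and `fake_complete` are hypothetical names introduced for illustration; they are not part of MLflow or the OpenAI SDK. The last line shows that the type annotations are ordinary Python metadata that can be inspected at runtime; it does not reproduce MLflow's own signature inference.

```python
from typing import Callable, List, get_type_hints


def make_chat_completion(complete: Callable[[str], str]) -> Callable[[List[str]], List[str]]:
    """Return a functional model that sends each input through `complete`."""

    def chat_completion(inputs: List[str]) -> List[str]:
        # One completion per input string, in the same order.
        return [complete(prompt) for prompt in inputs]

    return chat_completion


def fake_complete(prompt: str) -> str:
    # Stand-in for a real API call; it just echoes the prompt.
    return f"echo: {prompt}"


chat_completion = make_chat_completion(fake_complete)
print(chat_completion(["Hello", "What is MLflow?"]))

# The annotations on the returned function can be inspected at runtime.
print(get_type_hints(chat_completion))
```

Swapping `fake_complete` for a function that wraps the real API call would leave the model's shape unchanged, which is the point of the injection.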
LangChain is a framework for building LLM applications from modular components. MLflow supports LangChain models, making them easier to deploy and manage in a production environment. To use LangChain models with MLflow, follow these steps:
!pip install langchain
!pip install mlflow
import mlflow
from langchain import PromptTemplate, HuggingFaceHub, LLMChain

template = """Translate everything you see after this into French:
{input}"""
prompt = PromptTemplate(template=template, input_variables=["input"])

# HuggingFaceHub reads the API token from the
# HUGGINGFACEHUB_API_TOKEN environment variable
llm_chain = LLMChain(
    prompt=prompt,
    llm=HuggingFaceHub(
        repo_id="google/flan-t5-small",
        model_kwargs={"temperature":0, "max_length":64}
    ),
)

# Log the chain and register it in the model registry
mlflow.langchain.log_model(
    lc_model=llm_chain,
    artifact_path="model",
    registered_model_name="english-to-french-chain-gpt-3.5-turbo-1"
)
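To see what text the chain would send to the model, the template can be filled in by hand. The sketch below uses plain `str.format` as a stand-in for `PromptTemplate`. This is an assumption for illustration: LangChain's default template format is f-string style, but this code does not call LangChain, and `render_prompt` is a hypothetical helper.

```python
template = """Translate everything you see after this into French:
{input}"""


def render_prompt(text: str) -> str:
    """Fill the single {input} placeholder in the template."""
    return template.format(input=text)


print(render_prompt("What is MLflow?"))
```

Rendering the prompt this way is a quick check that the placeholder name in the template matches the one listed in `input_variables`.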
# Load the LangChain model as a Spark UDF
# (assumes an active SparkSession named `spark`, as in a Spark notebook)
import mlflow.pyfunc

english_to_french_udf = mlflow.pyfunc.spark_udf(
    spark=spark,
    model_uri="models:/english-to-french-chain-gpt-3.5-turbo-1/1",
    result_type="string"
)
english_df = spark.createDataFrame([("What is MLflow?",)], ["english_text"])
french_translated_df = english_df.withColumn(
    "french_text",
    english_to_french_udf("english_text")
)
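The `model_uri` passed to `spark_udf` follows the registry scheme `models:/<name>/<version>`, so the trailing `1` selects version 1 of the registered model. A small helper can make that structure explicit. `registry_uri` and `parse_registry_uri` below are hypothetical helpers written for this article, not MLflow functions, and they only handle the name plus version-or-stage form.

```python
def registry_uri(name: str, version_or_stage) -> str:
    """Build a registry URI such as models:/my-model/1."""
    return f"models:/{name}/{version_or_stage}"


def parse_registry_uri(uri: str):
    """Split models:/<name>/<version-or-stage> into its two parts."""
    prefix = "models:/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a registry URI: {uri!r}")
    name, sep, suffix = uri[len(prefix):].rpartition("/")
    if not sep or not name or not suffix:
        raise ValueError(f"expected models:/<name>/<version-or-stage>: {uri!r}")
    return name, suffix


uri = registry_uri("english-to-french-chain-gpt-3.5-turbo-1", 1)
print(uri)
print(parse_registry_uri(uri))
```

Building the URI from a name and version in one place makes it harder for the name used when registering the model and the name used when loading it to drift apart.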
Deploying and managing LLMs in a production environment can be challenging due to resource management, model performance, model versioning, and infrastructure issues. MLflow's tools and APIs for managing the model lifecycle make LLMs simpler to deploy and administer in production. In this article, we discussed how to use MLflow to deploy and manage LLMs in a production environment, including its support for Hugging Face Transformers, OpenAI, and LangChain models. Using MLflow can also improve collaboration between data scientists, engineers, and other stakeholders in the machine learning lifecycle.
Some of the key takeaways are as follows:
The media shown in this article is not owned by Analytics Vidhya and is used at the Author's discretion.
Gayathri is an aspiring AI leader and a highly skilled data scientist with over 11 years of experience in leveraging data to drive business outcomes. She has deep expertise in NLP, computer vision, machine learning, and AI, and a proven track record of delivering insights and recommendations that have helped organizations make informed decisions and deliver real business value. With a strong background in both technical and business domains, she is adept at communicating complex data-driven findings in a clear and concise manner.
As a data science manager, innovator, and researcher, she has led cross-functional teams of data scientists and engineers to deliver high-quality data-driven insights and solutions to clients. She is an excellent communicator, team player, and mentor, able to translate complex technical concepts into plain language for business stakeholders.
As a technical architect, she has designed, implemented, deployed, and maintained AI solutions that enable organizations to leverage their data effectively.
Her experience has taught her that the most important aspect of data science is not just technical expertise, but the ability to work closely with business stakeholders to understand their needs and deliver solutions that meet their business objectives. She always strives to stay at the forefront of the latest data science and technology advancements, and is always eager to learn and grow as a professional.
In her free time, she enjoys reading about the latest advancements in data science and technology.