MLX
This notebook shows how to get started using MLX LLM's as chat models.
In particular, we will:
- Utilize the MLXPipeline,
- Utilize the
ChatMLXclass to enable any of these LLMs to interface with LangChain's Chat Messages abstraction. - Demonstrate how to use an open-source LLM to power an
ChatAgentpipeline
%pip install --upgrade --quiet mlx-lm transformers huggingface_hub
1. Instantiate an LLM
There are three LLM options to choose from.
from langchain_community.llms.mlx_pipeline import MLXPipeline
llm = MLXPipeline.from_model_id(
"mlx-community/quantized-gemma-2b-it",
pipeline_kwargs={"max_tokens": 10, "temp": 0.1},
)
2. Instantiate the ChatMLX to apply chat templates
Instantiate the chat model and some messages to pass.
from langchain_community.chat_models.mlx import ChatMLX
from langchain_core.messages import HumanMessage
messages = [
HumanMessage(
content="What happens when an unstoppable force meets an immovable object?"
),
]
chat_model = ChatMLX(llm=llm)
API Reference:HumanMessage
Inspect how the chat messages are formatted for the LLM call.
chat_model._to_chat_prompt(messages)
Call the model.
res = chat_model.invoke(messages)
print(res.content)