Instructions to use baa-ai/MiniMax-M2.7-RAM-100GB-MLX with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use baa-ai/MiniMax-M2.7-RAM-100GB-MLX with MLX:
    # Make sure mlx-lm is installed
    # pip install --upgrade mlx-lm

    # Generate text with mlx-lm
    from mlx_lm import load, generate

    model, tokenizer = load("baa-ai/MiniMax-M2.7-RAM-100GB-MLX")

    prompt = "Write a story about Einstein"
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

    text = generate(model, tokenizer, prompt=prompt, verbose=True)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Pi
How to use baa-ai/MiniMax-M2.7-RAM-100GB-MLX with Pi:
Start the MLX server
    # Install MLX LM:
    uv tool install mlx-lm

    # Start a local OpenAI-compatible server:
    mlx_lm.server --model "baa-ai/MiniMax-M2.7-RAM-100GB-MLX"
Configure the model in Pi
    # Install Pi:
    npm install -g @mariozechner/pi-coding-agent

    # Add to ~/.pi/agent/models.json:
    {
      "providers": {
        "mlx-lm": {
          "baseUrl": "http://localhost:8080/v1",
          "api": "openai-completions",
          "apiKey": "none",
          "models": [
            { "id": "baa-ai/MiniMax-M2.7-RAM-100GB-MLX" }
          ]
        }
      }
    }
Run Pi
    # Start Pi in your project directory:
    pi
- Hermes Agent
How to use baa-ai/MiniMax-M2.7-RAM-100GB-MLX with Hermes Agent:
Start the MLX server
    # Install MLX LM:
    uv tool install mlx-lm

    # Start a local OpenAI-compatible server:
    mlx_lm.server --model "baa-ai/MiniMax-M2.7-RAM-100GB-MLX"
Configure Hermes
    # Install Hermes:
    curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
    hermes setup

    # Point Hermes at the local server:
    hermes config set model.provider custom
    hermes config set model.base_url http://127.0.0.1:8080/v1
    hermes config set model.default baa-ai/MiniMax-M2.7-RAM-100GB-MLX
Run Hermes
hermes
- MLX LM
How to use baa-ai/MiniMax-M2.7-RAM-100GB-MLX with MLX LM:
Generate or start a chat session
    # Install MLX LM
    uv tool install mlx-lm

    # Interactive chat REPL
    mlx_lm.chat --model "baa-ai/MiniMax-M2.7-RAM-100GB-MLX"
Run an OpenAI-compatible server
    # Install MLX LM
    uv tool install mlx-lm

    # Start the server (listens on port 8080 unless --port is given)
    mlx_lm.server --model "baa-ai/MiniMax-M2.7-RAM-100GB-MLX"

    # Call the OpenAI-compatible server with curl
    curl -X POST "http://localhost:8080/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "baa-ai/MiniMax-M2.7-RAM-100GB-MLX",
        "messages": [
          {"role": "user", "content": "Hello"}
        ]
      }'
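The same call can be made from Python with only the standard library. This is a minimal sketch, assuming the server was started with its defaults and is reachable at `http://localhost:8080/v1` (the base URL used in the Pi and Hermes configs above). The helper names `build_chat_request`, `extract_reply`, and `chat` are illustrative and not part of mlx-lm.

```python
import json
import urllib.request

MODEL_ID = "baa-ai/MiniMax-M2.7-RAM-100GB-MLX"
# Assumption: default host and port, matching the Pi and Hermes configs.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(prompt, model=MODEL_ID, base_url=BASE_URL):
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_json):
    """Pull the assistant text out of an OpenAI-style chat completion."""
    return response_json["choices"][0]["message"]["content"]


def chat(prompt):
    """Send one user message and return the reply (needs a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        return extract_reply(json.load(resp))
```

With the server running, `print(chat("Hello"))` should print the model's reply. If you started the server with a different `--port`, pass a matching `base_url` to `build_chat_request`.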
Observed issues when using it with coding agents (applies to both 90GB and 100GB versions)
@BAA.AI team, the 90GB version showed this issue far more often, but the 100GB version shows it too, and it would be a deal breaker for coding. The model fails to reproduce file names and directory paths correctly. Here are just a few examples from Claude Code (I observed the same in other coding agents as well):
⏺ Bash(mkdir -p /Users/ s olosouls/claude-test/.claude/task_memory) --- space in the directory name
Write(/Users/olosouls/claude-test/.claude/task_memory/db_connection_verification.md) -- mis-spelt directory name
These errors are fatal: getting paths right is the most basic task for a coding agent, and getting them wrong derails the whole coding workflow.
Other than this issue, I was amazed at the speed I got on my M5 Max with 128 GB. In the chat interface, I asked it to generate a complex HTML page with a planetary view, and it produced 2,400 lines of code in a single session. The file did not load correctly, for other reasons, but none of the frontier models gave me such comprehensive output. It is a pity that it works so well but is not really usable for meaningful coding until these issues are addressed.
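Until the root cause is found, one way to keep malformed paths like the two above from reaching the filesystem is to check each model-emitted path before the tool call runs, for example in a wrapper script or a pre-execution hook if your agent supports one. The sketch below is only an illustration: `path_problems` is a hypothetical helper, and the project root `/Users/solosouls/claude-test` is assumed from the examples in this post.

```python
import posixpath
from pathlib import PurePosixPath


def path_problems(candidate, project_root):
    """Return reasons to distrust a model-emitted path (empty list if none)."""
    problems = []
    path = PurePosixPath(posixpath.normpath(candidate))
    root = PurePosixPath(posixpath.normpath(project_root))

    # Leading or trailing whitespace in a component is almost never intended.
    if any(part != part.strip() for part in path.parts):
        problems.append("whitespace at edge of a path component")

    # A misspelled parent directory puts the path outside the project.
    if path != root and root not in path.parents:
        problems.append("outside project root")

    return problems


# Assumed project root, inferred from the examples in this thread.
ROOT = "/Users/solosouls/claude-test"

for p in (
    "/Users/solosouls/claude-test/.claude/task_memory",
    "/Users/ s olosouls/claude-test/.claude/task_memory",
    "/Users/olosouls/claude-test/.claude/task_memory/db_connection_verification.md",
):
    print(p, "->", path_problems(p, ROOT) or "ok")
```

The check is purely lexical and never touches the disk, so it only catches paths that leave the project root or carry stray whitespace. A typo inside the project tree would still pass.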
Hey Solosouls.
This could just be a limitation of the model. But first, make sure you have configured the settings as described for the 116GB version, which is one of our stronger versions:
https://huggingface.co/baa-ai/MiniMax-M2.7-RAM-116GB-MLX
For these models, getting the right configuration makes a big difference to output.
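If you are driving the model through the local OpenAI-compatible server, sampling settings can be sent with each request. The sketch below only shows where such settings go in the request body. The numeric values are placeholders, not the settings recommended on the 116GB page, and `chat_body` is a hypothetical helper.

```python
import json


def chat_body(prompt, model, temperature=None, top_p=None, max_tokens=None):
    """Build a chat-completions body, including only the settings that are set."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    overrides = {
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }
    body.update({k: v for k, v in overrides.items() if v is not None})
    return body


# Placeholder values for illustration only; substitute the recommended ones.
example = chat_body(
    "Hello",
    "baa-ai/MiniMax-M2.7-RAM-100GB-MLX",
    temperature=1.0,
    top_p=0.95,
)
print(json.dumps(example, indent=2))
```

Settings left as `None` are omitted from the body, so the server falls back to its own defaults for them.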