Instructions to use miqudev/miqu-1-70b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use miqudev/miqu-1-70b with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="miqudev/miqu-1-70b",
    filename="miqu-1-70b.q2_K.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Write a haiku about llamas."},
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use miqudev/miqu-1-70b with llama.cpp:
Install from brew
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf miqudev/miqu-1-70b:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf miqudev/miqu-1-70b:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf miqudev/miqu-1-70b:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf miqudev/miqu-1-70b:Q4_K_M
Use pre-built binary
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf miqudev/miqu-1-70b:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf miqudev/miqu-1-70b:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf miqudev/miqu-1-70b:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf miqudev/miqu-1-70b:Q4_K_M
Use Docker
docker model run hf.co/miqudev/miqu-1-70b:Q4_K_M
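Whichever install path you use, llama-server exposes an OpenAI-compatible HTTP API (on port 8080 by default). A minimal sketch of building a chat request for it in Python; the endpoint path is llama.cpp's standard one, but the host, port, and prompt below are assumptions you should adjust to your setup:

```python
import json

# llama-server's default address; adjust if you passed --host/--port.
SERVER_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(prompt: str, max_tokens: int = 256) -> str:
    """Return a JSON body for the OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": "miqudev/miqu-1-70b",  # informational; llama-server serves the loaded model
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

# To actually send it (requires a running llama-server from the steps above):
# import urllib.request
# req = urllib.request.Request(SERVER_URL, data=build_chat_request("Hello!").encode(),
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```

Because the API mirrors OpenAI's, any OpenAI-compatible client library can also be pointed at this local base URL.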
- LM Studio
- Jan
- Ollama
How to use miqudev/miqu-1-70b with Ollama:
ollama run hf.co/miqudev/miqu-1-70b:Q4_K_M
- Unsloth Studio
How to use miqudev/miqu-1-70b with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# and search for miqudev/miqu-1-70b to start chatting.
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# and search for miqudev/miqu-1-70b to start chatting.
Using HuggingFace Spaces for Unsloth
# No setup required.
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# and search for miqudev/miqu-1-70b to start chatting.
- Docker Model Runner
How to use miqudev/miqu-1-70b with Docker Model Runner:
docker model run hf.co/miqudev/miqu-1-70b:Q4_K_M
- Lemonade
How to use miqudev/miqu-1-70b with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull miqudev/miqu-1-70b:Q4_K_M
Run and chat with the model
lemonade run user.miqu-1-70b-Q4_K_M
List all available models
lemonade list
Hi,
Thank you for releasing this model! Would you mind sharing some details on how it was trained, and what the training data is?
Thanks!
Is this a frankenmerge?
mrfakename, this model is most likely a leak of mistral medium.
> mrfakename, this model is most likely a leak of mistral medium.
Interesting! According to nisten it's a frankenmerge, do you know if that's accurate?
I had not seen this, thanks for the info
> Interesting! According to nisten it's a frankenmerge, do you know if that's accurate?
He initially claimed it was a MoE, so I'd take this with a grain of salt. It outperforms mistral 7B by a mile from my testing though.
> Interesting! According to nisten it's a frankenmerge, do you know if that's accurate?
> He initially claimed it was a MoE, so I'd take this with a grain of salt. It outperforms mistral 7B by a mile from my testing though.
Frankenmerges can be MoEs, right?
> Frankenmerges can be MoEs, right?
Correct
Nisten made all kinds of claims, some rather insane ones in the beginning... yet I tested the model and it's relatively good. If it's a merge, then of what? Who else that uses Mistral's format put a model out recently? I suggest people just try it if they have the memory, at least at Q4.
It chats well and it's not dumb; that's all that matters. I've downloaded tons of disappointments from the leaderboard going by benchmarks.
> I've downloaded tons of disappointments from the leaderboard going by benchmarks.
Yes! So many models are disappointing when evaluated with real world usage.
this looks like an MoE 7x11 fine-tuned on mistral-medium synthetic data. it does mimic mistral's style very closely.
https://twitter.com/teortaxesTex/status/1752459593416847570
Interesting. Not sure if true but seems possible.
Excellent model. Reminds me of Claude. It's willing to consider alternative solutions. It takes advice and will mold its answers to newly prompted insights. Tested it with the difficult Aunt Agatha riddle and it handled it well.
Apparently someone succeeded at dequantizing it to fp16, 70+ MMLU scores
https://huggingface.co/152334H/miqu-1-70b-sf
> Apparently someone succeeded at dequantizing it to fp16, 70+ MMLU scores
> https://huggingface.co/152334H/miqu-1-70b-sf
Hmm. That makes no sense. How can you "add" precision to it? That would be like taking a blurry picture and making it clear again with all the detail.
> Apparently someone succeeded at dequantizing it to fp16, 70+ MMLU scores
> https://huggingface.co/152334H/miqu-1-70b-sf
the model appears to be legit, resembling "mistral-medium" as mentioned on https://twitter.com/teortaxesTex/status/1752459593416847570.
it (mistral-70b-instruct-alpha01) was likely trained on the Llama architecture, possibly for a quick presentation to investors.
this model is fine-tuned and adept at following instructions. based on my experiments, i can confirm that it is also aligned for safety.
The 5bit EXL2 performs OK. It gets 11 perplexity on PTB_NEW. Have to check it vs the q4km I have. So the re-compression wasn't the end of the world.
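For readers unfamiliar with the metric mentioned above: perplexity is the exponential of the mean negative log-likelihood per token, so lower is better. A toy sketch of the calculation (the uniform-probability example is illustrative, not taken from any real evaluation):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Toy example: four tokens, each predicted with probability 0.25.
# A uniform choice among 4 options gives a perplexity of exactly 4.
logprobs = [math.log(0.25)] * 4
print(perplexity(logprobs))  # ~4.0
```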
> That makes no sense. How can you "add" precision to it? That would be like taking a blurry picture and making it clear again with all the detail.
It doesn't add any precision, but fp16 pytorch file format is much more universal and it's easier to work with if you want to do finetuning. It's the same blurry image, but now you have it in digital form and can do stuff to it in Photoshop and you're not limited in what you can do to the physical photo using scissors, markers and other physical tools.
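The point can be illustrated numerically: quantizing weights and then "dequantizing" them back to fp16 restores the convenient data type but not the lost values. A toy round-trip, assuming simple uniform rounding for illustration (real GGUF schemes like Q2_K are far more elaborate, but the information loss is the same in kind):

```python
def quantize(values, step=0.25):
    """Round each value to the nearest multiple of `step` (lossy, like low-bit quantization)."""
    return [round(v / step) * step for v in values]

weights = [0.11, -0.37, 0.92, 0.04]
dequantized = quantize(weights)  # storing these back as fp16 keeps only the rounded values
print(dequantized)               # [0.0, -0.25, 1.0, 0.0] -- the originals are gone
```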
Wow. Crazy.
Well, I guess it is a leak.
It's pretty obvious it was some sort of leak, considering the lack of information about its creation process!
