When Qwen released the Medium Series of Qwen 3.5 models last week, I was immediately hooked.
I replaced the GLM-4.7 Flash model with the Qwen 3.5 Flash model and was very happy with it.
My only real disappointment was coding: I often had to make manual adjustments to its output.
Then I found a recommendation from Unsloth AI: reduce the temperature from 1.0 to 0.6 for coding tasks. That changed everything. Afterwards, the results from Qwen 3.5 Flash were shockingly good.
I can wholeheartedly recommend this model. I use it with opencode and achieve fantastic results on my MacBook.
./llama-server \
--model Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf \
--alias "Qwen3.5" \
--jinja \
--temp 0.6 \
--top-p 0.95 \
--top-k 20 \
--min-p 0.00 \
--repeat-penalty 1.0 \
--ctx-size 131072 \
--port 8001
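
Once the server is up, you can sanity-check it from the command line. A minimal sketch, assuming the command above is running locally: llama-server exposes an OpenAI-compatible API, the model name matches the --alias, and the example prompt is just an illustration. Sampling parameters sent in the request (like temperature) take precedence over the server-side defaults.

```shell
# Query the local llama-server via its OpenAI-compatible endpoint.
# "Qwen3.5" matches the --alias set at launch; port 8001 matches --port.
curl http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.5",
    "messages": [
      {"role": "user", "content": "Write a Python one-liner that reverses a string."}
    ],
    "temperature": 0.6
  }'
```

The same base URL (http://localhost:8001/v1) is what you point opencode at when configuring it to use a local OpenAI-compatible provider.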