<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Qwen on Jeroen Nyckees</title><link>https://jenyckee.github.io/tags/qwen/</link><description>Recent content in Qwen on Jeroen Nyckees</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Thu, 14 May 2026 12:00:00 +0200</lastBuildDate><atom:link href="https://jenyckee.github.io/tags/qwen/index.xml" rel="self" type="application/rss+xml"/><item><title>How to run Qwen3.5-9B with llama.cpp and Pi</title><link>https://jenyckee.github.io/posts/qwen-pi-local-llm/</link><pubDate>Thu, 14 May 2026 12:00:00 +0200</pubDate><guid>https://jenyckee.github.io/posts/qwen-pi-local-llm/</guid><description>&lt;p>I ran &lt;a href="https://huggingface.co/unsloth/Qwen3.5-9B-GGUF">Qwen3.5-9B&lt;/a>, a 4-bit quantized model, locally on my MacBook Pro M4 Pro with 24 GB of RAM, pointed a terminal coding agent at it, and asked it to build a checkout page with the Stripe API. It did. No cloud, no API calls to OpenAI, no token costs. Just a model running on my laptop.&lt;/p>
&lt;p>Here&amp;rsquo;s how.&lt;/p>
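&lt;p>In short, the shape of the setup: download a 4-bit GGUF of the model, serve it with llama.cpp&amp;rsquo;s OpenAI-compatible &lt;code>llama-server&lt;/code>, and point the coding agent at the local endpoint. The sketch below is only that, a sketch; the exact GGUF filename and the context size are illustrative, so check the Hugging Face repo for the files it actually ships.&lt;/p>
&lt;pre>&lt;code># fetch a 4-bit quant from the unsloth repo (filename is illustrative)
huggingface-cli download unsloth/Qwen3.5-9B-GGUF \
  --include "Qwen3.5-9B-Q4_K_M.gguf" --local-dir ./models

# serve it locally; llama-server exposes an OpenAI-compatible API on the given port
llama-server -m ./models/Qwen3.5-9B-Q4_K_M.gguf -c 32768 --port 8080
&lt;/code>&lt;/pre>
&lt;p>Any OpenAI-compatible client, the coding agent included, can then talk to &lt;code>http://localhost:8080/v1&lt;/code>. The rest of this post walks through each piece.&lt;/p>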
&lt;h2 id="qwen-35">Qwen 3.5&lt;/h2>
&lt;p>Qwen3.5 is Alibaba&amp;rsquo;s latest open-weight language model family. The 9B variant sits in a sweet spot: large enough to be genuinely useful for coding tasks, small enough to run on consumer hardware once quantized. It supports a 256K token context window and performs competitively with much larger models on coding benchmarks.&lt;/p></description></item></channel></rss>