The era of the $20-a-month AI tax is officially over for my workflow. For the last few months, Claude Pro was my co-pilot (not the one from Microsoft) for debugging Python scripts, planning a vacation, and giving me ideas about improving my home lab.

I decided to pull the plug on my subscription and migrate my entire development and writing stack to local LLMs through LM Studio. And suddenly the capacity errors vanished, the privacy fears disappeared, and my productivity stayed exactly where it should be.

Qwen3.6-35B-A3B

Cloud power, local speed

The real turning point in my transition to a local-first workflow was the release of Qwen3.6-35B-A3B. When you are used to the ‘it just works’ nature of Claude Pro, you're naturally skeptical of open-source alternatives, but this model changed the math for me. It’s a Mixture-of-Experts model with 35 billion total parameters but activates only 3 billion at any given time. For my setup, that means I get elite-level reasoning without my MacBook Pro fans spinning up all the time.

The standout feature for me is its Agentic Coding capability. Most small models can write a single function, but Qwen3.6 can actually think through a repository. I started with a classic productivity hurdle: automating my messy Downloads folder. I asked it to build a robust organization script using pathlib that could handle file collisions without overwriting my data.

👁 XDA
Quiz
8 Questions · Test Your Knowledge

How much do you know about Claude?
Trivia challenge

Think you know Anthropic's AI assistant? Put your knowledge of Claude to the test.

OriginsCapabilitiesSafetyFeaturesDesign
01 / 8Origins

Which company created Claude?

Correct! Claude was created by Anthropic, an AI safety company founded in 2021. Anthropic was co-founded by Dario Amodei and Daniela Amodei, among others who previously worked at OpenAI.
Not quite. Claude is made by Anthropic, not to be confused with OpenAI, which makes ChatGPT. Anthropic was founded in 2021 with a strong focus on AI safety research.
02 / 8Safety

What is the name of the safety and values framework Anthropic developed to guide Claude's behavior?

Correct! Anthropic developed Constitutional AI (CAI), a technique that trains Claude using a set of principles — a 'constitution' — to guide its responses toward being helpful, harmless, and honest.
Not quite. The framework is called Constitutional AI (CAI). It is a novel training approach pioneered by Anthropic that uses a written set of principles to help the model self-critique and improve its own outputs.
03 / 8Origins

What is the name most commonly associated with inspiring Claude's name?

Correct! Claude Shannon is widely cited as the inspiration behind the name. Shannon founded information theory, which is foundational to all modern computing and digital communication — a fitting namesake for an AI.
Not quite. The name Claude is most commonly associated with Claude Shannon, the mathematician and electrical engineer who founded information theory. His pioneering work laid the groundwork for the digital age.
04 / 8Capabilities

Which of the following best describes Claude's context window capability in its more advanced versions?

Correct! Advanced versions of Claude support context windows of 100,000 tokens or more, allowing it to process entire books, lengthy codebases, or large documents in a single conversation — a standout feature at the time of its release.
Not quite. Claude's advanced versions support context windows of 100,000 tokens or more. This was a significant leap beyond many contemporaries and allows Claude to reason over very large amounts of text in one session.
05 / 8Design

Which of the following principles is NOT part of Anthropic's core goal for Claude?

Correct! Anthropic's guiding principles for Claude are to be Helpful, Harmless, and Honest — often called the 'three H's.' Hierarchical is not part of this framework. The goal is to make AI that is safe and beneficial for everyone.
Not quite. Anthropic's three guiding principles for Claude are Helpful, Harmless, and Honest. 'Hierarchical' is not one of them. These three H's shape how Claude is trained to interact with users responsibly.
06 / 8Features

What was a key distinguishing feature of Claude 2 when it launched compared to many rival models at the time?

Correct! Claude 2 launched with a 100,000-token context window, which was remarkable at the time. This allowed users to feed in entire books or massive codebases for analysis, setting Claude apart from many competing models.
Not quite. The standout feature of Claude 2 was its 100,000-token context window. Claude does not natively generate images, and real-time browsing and built-in voice were not launch features of Claude 2.
07 / 8Safety

Anthropic describes itself primarily as which type of company?

Correct! Anthropic describes itself as an AI safety and research company. Unlike some competitors who lead with products or platforms, Anthropic's founding mission centers on building AI systems that are safe, interpretable, and steerable.
Not quite. Anthropic is primarily an AI safety and research company. Its founding mission is rooted in making AI that is safe and understandable, which is why safety-focused training methods like Constitutional AI are central to its work.
08 / 8Features

Which of the following tasks is Claude specifically designed to handle well?

Correct! Claude excels at long-form writing, summarization, coding assistance, and complex reasoning tasks. Its large context window and nuanced language understanding make it particularly well suited for handling detailed, multi-step text-based work.
Not quite. Claude is designed for text-based tasks like writing, summarization, analysis, and reasoning. It does not render graphics, autonomously execute system commands, or perform live video analysis — it is a large language model at its core.
Challenge Complete

Your Score

/ 8

Thanks for playing!

I was surprised that the code worked on the first try. It even added error handling for edge cases, such as what to do with files that don’t have an extension. It proved that for daily automation tasks, the $20 cloud subscription is now an unnecessary add-on.

Whether I’m debugging Python scripts or drafting a 2000-word deep dive into the latest Android firmware, the latency is virtually zero.

It’s lean enough to run on consumer hardware but has enough brainpower to rival Claude Sonnet 4.5 in intelligence and document understanding. It’s become my go-to for vibe coding sessions.

Gemma 4 E4B

The ‘everyday’ champion

If Qwen3.6 is my heavy-duty engineer, Gemma 4 E4B is my nimble, everyday champion. When it comes to local LLMs, we often think that bigger is smarter, but Google’s latest 4-billion-parameter powerhouse proves that efficiency is the new benchmark for 2026. Because it’s so lightweight, I leave it running in the background of my workstation 24/7; it’s the model I turn to for instant brainstorming, email drafting, and those complex logic puzzles that usually trip up smaller models.

To see if Gemma 4 could actually replicate Claude's reasoning feel, I gave it a complex prompt that would make models struggle. I asked it to describe a square room with specific items on each wall and put conditions on which words to avoid. It didn’t just pass; it excelled. Despite some conditions, like it couldn’t even say ‘bookshelf,’ ‘blue,’ or ‘behind’ words, it managed to describe the room with precision.

It handled the Sound wall reflection perfectly, calculated the room temperature using the given formula, and did all of this while hitting a tight 112-word count. Most local models lose the plot when you stack negative constraints like that, but Gemma’s instruction following felt sharp.

And because it’s the E4B (Edge-4-Billion) variant, I run it directly on my laptop without a dedicated GPU. I can draft sensitive client emails or jewelry business strategies for Asha Jewels without worrying about my data training a future cloud model. Gemma 4 is one of the few models of this size that doesn’t break when you ask it to format a table or calculate a quick conversion. It feels polished in a way that many open-source models don’t. Due to its lightweight nature, I’m already running it on my Pixel 8 via the Google AI Edge Gallery app.

GPT-OSS 20B

Honorable mention

No productivity piece would be complete without an honorable mention for the heavyweight in the room: GPT-OSS 20B. If you are running a workstation with a beefy GPU or a top-tier Mac Studio, this is the model that brings GPT-level polish to your local workflow. Since it only activates 3.6B parameters at a time (despite its 20B size), it remains snappy and responsive.

When I need a model to not just write code, but to actually simulate the output and catch logic errors before I even hit save, GPT-OSS is the one I load up. While the Qwen and Gemma models I mentioned earlier are perfect for my MacBook Pro on the go, I save GPT-OSS 20B exclusively for my Windows workstation.

I fired Claude Pro

The dream of Saas-free professional productivity used to be a compromise, but thanks to the arrival of Gemma 4 and other capable local LLMs, you can actually transition away from Claude Pro without affecting your work output.

Whether you are a developer tired of capacity errors or a writer looking to secure your property, the local LLM ecosystem is finally ready for prime time.

Of course, these are just my preferred LLM models for my workflow. You shouldn’t hold yourself back from exploring other LLMs based on your hardware setup and workflow.