
These projects warm my heart.

More local LLMs please.

Do you use one? Any favorite local model that I could compare against Qwen3.6?

local model that I could compare against Qwen3.6

To compare against Qwen3.6 35B? Gemma4 31B.


Unfortunately I only get 3 TPS on Gemma4 31B. I tried the 26B A4B version, with which I get speeds comparable to Qwen3.6 35B A3B (20-30 TPS).

It looks OK at first, but in my initial tests skill use is significantly worse: it guesses rather than properly reading and understanding the skill. For example, it deleted text in a file that explicitly said not to delete it, and it didn't follow instructions or read things fully.

So for me, Qwen3.6 is still a massive improvement - a game changer, which has made this project possible.


Interesting.

I had problems with Qwen3.6 35B through my opencode integration (which I use in production with GLM-5): it had instruction separation issues, i.e. it had trouble distinguishing between the files it read and the instructions it was given. I recorded an instance of that here: #1483366. This also happens often on things like diff analysis.

Perhaps we ought to tune more on a per-model-family basis? Not sure.

(edit: forgot to mention that it doesn't always finish the task and sometimes just quits. But I have the same issue with Gemma.)

105 sats \ 1 reply \ @rolznz OP 17h

Interesting that we got basically the opposite results :-)

I do see a LOT of errors, yeah. But from my observations it's very good at self-correcting and fixing its mistakes.

But this project is quite simple - not much specific technical knowledge needed. Maybe this is where the difference comes from?

I also saw you used Claude Code and OpenCode. They both use a large system prompt. Did you try Pi Agent?

456 sats \ 0 replies \ @optimism 17h
I do see a LOT of errors, yeah. But from my observations it's very good at self-correcting and fixing its mistakes.

I see this too (it's better trained at this than Gemma), and I do think the self-correction works more often than not (though what a waste of compute!). Running grep -i wait on thinking blocks is still "fun" too.
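For anyone who wants to try the "grep -i wait" check on saved transcripts, here is a rough Python sketch. It assumes the model's reasoning is wrapped in `<think>...</think>` tags, which may not match every runtime's output format, and `count_waits` is a hypothetical helper name. Unlike grep, which reports matching lines, it counts every case-insensitive occurrence of "wait" per thinking block.

```python
import re

# Assumption: thinking blocks are delimited by <think>...</think> tags.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL | re.IGNORECASE)
# Case-insensitive substring match, like the pattern in `grep -i wait`.
WAIT = re.compile(r"wait", re.IGNORECASE)


def count_waits(transcript: str) -> list[int]:
    """Return the number of 'wait' occurrences in each thinking block."""
    return [len(WAIT.findall(block)) for block in THINK_BLOCK.findall(transcript)]


# Made-up sample transcript, for illustration only.
sample = (
    "<think>Delete the line. Wait, the file says not to delete it. "
    "Wait... re-read the skill.</think>Done."
    "<think>Looks fine.</think>OK"
)
print(count_waits(sample))  # [2, 0]
```

A high count in a single block is only a crude signal of how much compute goes into self-correction; it says nothing about whether the correction succeeded.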

But this project is quite simple - not much specific technical knowledge needed.

My main use case nowadays is feeding LLMs file diffs and strace logs, mostly of third-party code from npm/cargo/pubdev/mvn/pypi, to help me make security assessments. Thanks to LLMs this now takes me only a day a week instead of the 4 it took last year, with about 5x the workload and a 100x threat increase.

Did you try Pi Agent?

Nope. The problem I'm running into is that I am swamped, but I'll try to find a moment to test your setup.

15 sats \ 0 replies \ @sime 16h

No, not using any. In my brief interaction with them, they needed internet access.

I'm missing use cases, honestly.
