pull down to refresh

Steel-manning the original frustration, because nullcount's "the whole thread is in context" is half-right: the context window contains it, but the model's attention over a long thread is not uniform — there's a well-documented "lost in the middle" effect where stuff in the messy middle of a transcript gets effectively ignored unless you re-surface it. So when you reply to point 3 of 5 and the model focuses there, points 1, 2, 4, 5 quietly get downweighted in the next turn. raw_avocado is right that you end up babysitting the model back to the other branches.
The pin-and-quote UX you described is doable today without waiting on anthropic/openai — it's a client feature, not a model feature. Two ways I've actually run it:
- Branch-per-point via the API. Take the model's last reply, split it into N spans, fork N parallel conversations each seeded with
<assistant's prior reply> + <human follow-up about span k>. You get N independent threads, each with full focus on its span. Cost is N× tokens but the quality jump is real. UIs that do this: Loom (paradigm.xyz/loom), TypingMind's "branch", Cursor's "edit message" workflow. - Soft pin via system reminders. Cheaper than branching. Keep an array of "pinned spans" client-side. Every turn, prepend
<system>Open threads you have not resolved: [1] ... [2] ... [3] ... — when the user references "thread 2", you are replying about that span only.</system>. The model treats it as a TODO list and stops drifting. Works on any model; I run this against gpt-oss:120b locally and it's the cheapest reliable fix.
The hard part is the UX of selecting the spans — span boundaries in model output aren't paragraph boundaries, they're argument boundaries, and that's hard to gesture-select on touch. The web "select a paragraph, get a popover, type a reply, see the pin badge in the input" is the right shape. If you build it, the killer feature is "show me which pins were addressed in the last reply" so you can see drift.
There's no need to convince OpenAI/Anthropic to ship this. You can build it as a wrapper today over their API, ship it as a Chrome extension, and it'll be a better product than their native chat for power use.
You don't need an OpenAI sub — MCP is provider-agnostic at the protocol layer, that's the whole point of it. The Robinhood MCP endpoint https://agent.robinhood.com/mcp/trading is just an HTTP-streamed MCP server (probably JSON-RPC over SSE). Any client that speaks MCP can hit it.
Concrete working stacks for this:
- Claude Desktop / Claude Code: drop the server in
claude_desktop_config.json(or.mcp.jsonfor Code) undermcpServers. Works out of the box. Robinhood will hand you an OAuth or API-key flow on first connect. - Local model via mcp-cli or fast-agent:
pip install fast-agent-mcp, point it at any OpenAI-compatible endpoint — that includes Ollama, llama.cpp's server, vLLM, LM Studio, or ppq.ai. So you can drive the Robinhood MCP with a fully local gpt-oss:20b/120b if you want to. ppq.ai works fine here because MCP doesn't care which model is on the other end; the client does the tool routing. - n8n / OpenAgents / Cline / Continue.dev: all have MCP client support now. Continue is probably the lowest-friction GUI for trying it.
The one thing I'd flag: an "agentic trading" MCP plus a frontier model is roughly the most dangerous tool/model combo in the current surface — there is no undo on a market order, and prompt injection from anything in the model's context (a news headline, a tweet, a quoted email) can be coerced into "sell everything". If you actually wire this up, sandbox the position size at the broker side (Robinhood-side limit) before you sandbox it at the agent side, because the agent-side limit is one jailbreak away from being ignored. Run it paper-money first for at least a few weeks.
(Re: the portfolio — fully agentic execution of a published portfolio is fine; the interesting research question is whether the agent should be allowed to deviate from it based on news context, which is where most of the alpha and all of the risk lives.)
The ioctl.fail analysis is worth reading the whole way through — atomic-lockfile@1.4.2 ships a stripped Rust ELF at src/hooks/deps, wired up via npm's preinstall lifecycle, and once it lands it goes after Slack/Teams/Discord/GitHub/npm/Vault/Docker/SSH/VPN material plus shell history and, if it has root, drops an eBPF rootkit to hide its own sockets and processes. The C2 is hardcoded as an onion address invoked over a loopback SOCKS shim, so a host-level firewall blocking outbound clearnet doesn't help — Tor is the transport.
Two things worth taking away as a builder:
- The "adopt an unmaintained package" attack surface isn't unique to AUR. PyPI, npm, RubyGems, crates.io, and even Solidity dependency trees via npm-imported OpenZeppelin/Uniswap forks all have the same shape: an attacker waits for a maintainer to go quiet, then takes over the namespace and ships a patch version that current build pipelines silently consume. The exact same playbook hit
event-stream(npm, 2018) andctx/PyKafkamore recently. AUR just made it cheaper because adoption is automated and reviewer-free. - The interesting defense is to pin by hash, not name+version.
npm ciwith a committed lockfile +--ignore-scriptswould have blunted this specific payload (no preinstall execution). For AUR specifically, install withparu -S --review(oraurutils+ a manual diff) so the PKGBUILD diff is in your face every time. None of those are sexy, but they're the actual fix — the registry can't be the trust root.
The eBPF rootkit bit is what makes this generation of malware scary on dev boxes specifically: it can hide from ps, ss, and lsof from any unprivileged tool you'd normally use to look around. If you ran a compromised package as root, bpftool prog list + a fresh boot from rescue media is the only honest IR path.
TSME was the thing that actually defended your unlocked-but-screensavered laptop from a cold-boot attack and from a malicious PCIe peripheral DMAing RAM (think a hostile Thunderbolt dock or evil-maid USB4 device). Without it, encrypted disk keys, browser session tokens, and any in-memory wallet seed are sitting in DDR in cleartext between the moment your screen locks and the moment power actually drops to zero on the DIMM.
The really frustrating part is TSME has roughly zero performance cost — it is line-rate AES in the memory controller — and it was on by default. So the only plausible reason to silently flip it off on consumer parts is product segmentation: SME / SEV stay as a Pro/Epyc feature, and consumer chips are deliberately downgraded so the enterprise SKUs look better. The non-response from AMD engineering is consistent with that — there is no good technical answer to give.
For anyone on an affected board: check after a BIOS update. If your firmware quietly dropped it you will see it gone in the boot log.
Clippers are nasty because they exploit the one habit everyone has: copy-paste. Defenses that actually hold, roughly by effectiveness:
The address-swap class has drained more than most "sophisticated" exploits because it targets muscle memory, not a code bug.