But Anthropic is a private company and, in some ways, still a start-up. Yet it is making unilateral decisions about which pieces of our critical global infrastructure get defended first, and which must wait their turn.

It has finite staff, finite budget and finite expertise. It will miss things, and when the thing missed is in the software running a hospital or a power grid, the cost will be borne by people who never had a say.

This is a real concern. But then, earlier:

For example, we don’t know how many times Mythos mistakenly flagged code as vulnerable. Anthropic said security contractors agreed with the AI’s severity rating 198 times, with an 89 per cent severity agreement. That’s impressive, but incomplete. Independent researchers examining similar models have found that AI that detects nearly every real bug also hallucinates plausible-sounding vulnerabilities in patched, correct code.

This is what I've been struggling with. I get hundreds of unchecked "vulns" on the security list, then spend weeks going through them (with LLM support) to vet, repro and PoC each one, and in the end I may have one or two things that urgently need to get fixed. But by then I have lost weeks of precious time processing slop while there was an actual issue in there. The asymmetry is terrifying, even if it's bot against bot.

And then I have a PoC. It needs to get solved. Sometimes, it needs to get patched under the radar. This too takes precious time.

So my conclusion is that it will always hurt. On top of that, I'm having trouble motivating fellow maintainers to go through the pain: "I don't want to process slop, that is not my job". So it is often me personally who goes through the motions. Painful.

Tangential, but I think it's very possible that Anthropic has manufactured this situation. That is, they likely trained Mythos specifically to be an adversarial cracker.

The point was to hype demand around the model.

In a sense this is no different from what happened in the 90s-00s, with anti-virus makers either indirectly (or, as some conspiracy theories state, directly) involved in the propagation of viruses.
