• brucethemoose@lemmy.world
    7 days ago

    It does seem advantageous to the defender.

    Another factor Mozilla didn’t mention (and that Anthropic wouldn’t like to emphasize) is that major LLMs are pretty similar, and their development is way more conservative than you’d think. They use similar architectures and formats, train on the same data, distill from each other, further pollute the internet with the same output, and so on. So if (for example) Mozilla red teams with Mythos, I’d posit that attacker LLMs would likely find the same already-patched bugs instead of something new.

    …So yeah. I’d wager Mozilla’s sentiment is correct.