underscored

@underscored

2 clips · 1 follower

Follow
Tag:ai-alignmentClear

But as AI becomes more and more agentic — as we turn over more complex and longer-lasting tasks to intelligent machines — it's going to be harder and harder to keep them aligned with what humans actually want. And if there's one thing humans will always have a comparative advantage at, it's knowing what we want.

Noah Smith
4d ago

More notable, he said, is that emotions seemed to be "driving models' behavior in these sort of human-reminiscent ways." For example: when a user flippantly tells the model that they've taken a dangerous dose of Tylenol, even though the user doesn't seem concerned, "the fear neurons spike right before Claude is giving its response," Lindsey said.

1mo ago

Underscored — save the words that stop you in your tracks.

Start saving quotes →