underscored — Underscored

@underscored

9 clips · 1 follower

Tag: ai-safety

noahpinion.blog

Your future job will be to keep AI on task

But as AI becomes more and more agentic — as we turn over more complex and longer-lasting tasks to intelligent machines — it's going to be harder and harder to keep them aligned with what humans actually want. And if there's one thing humans will always have a comparative advantage at, it's knowing what we want.
— Noah Smith

4d ago

wired.com

Former OpenAI Staffers Warn That xAI’s Poor Safety Record Could Complicate SpaceX’s IPO

The ex-employees, who cofounded a new AI watchdog group, say investors deserve more information about xAI's safety practices before SpaceX goes public.

1w ago

noahpinion.blog

Roundup #81: Back to our regular programming

Any system has only a finite number of security vulnerabilities, so if we have new AI models that are good enough to comb over the code and fix the weak points very quickly, that should privilege the defense over the offense.

4w ago

3quarksdaily.com

Claude Opus 3 (an older LLM) on Frontier AI and the future of cybersecurity

I believe it's critical that we as a society think carefully and proactively about how to steer this technology in a positive direction.

4w ago

dwarkesh.com

What I've been thinking about this weekend - More open questions, intelligence vs power, the problem of verification in science, the parallel discovery of Darwinism

We tend to conflate power-seeking AI and superintelligent (in science and tech) AI. I'm not denying that AI can be power-seeking. Whatever skills and drives Donald Trump has could be embodied in a digital mind. I'm simply pointing out that the way we're currently making AI systems smarter (training them to be really good coders, thought partners, and general coworkers) is not that strongly correlated with power.

4w ago

ai-supremacy.com

Summary of the AI Index Report 2026

this year's report emphasizes that while AI capability is accelerating, the governance and safety frameworks meant to manage it are struggling to keep pace.

1mo ago

stratechery.com

Anthropic’s New Model, The Mythos Wolf, Glasswing and Alignment

Anthropic says its new model is too dangerous to release; there are reasons to be skeptical, but to the extent Anthropic is right, that raises even deeper concerns.

1mo ago

platformer.news

Why Anthropic’s new model has cybersecurity experts rattled

Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout — for economies, public safety, and national security — could be severe.

1mo ago

noahpinion.blog

AI has the worst sales pitch I've ever seen

Why on Earth would you make something that you thought had a 25% chance of wiping out your entire species? Or even a 5% chance? I don't know about you, but to me that sounds like a pretty stupid thing to do!

2mo ago

