The part that is actually quite beautiful is that it's not about erasing hallucination or catching mistakes. Instead, it's about including the rigor of a mathematical proof in the process before they happen.
Until recently, you didn't have to know this. The model was the product, the app was the website, and the harness was minimal. You typed, it responded, you typed again. Now the same model can behave very differently depending on what harness it's operating in.
1mo ago
Underscored — save the words that stop you in your tracks.