The other observation is that the precision will almost always be higher in the accumulation step than in the multiplication step. This is specific to AI chips. You're multiplying low-precision numbers, and then when you sum the products, rounding errors build up quickly, so you need more precision there.
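A minimal sketch of that effect, assuming half precision (fp16) for the inputs and using Python's double-precision float as a stand-in for the wider accumulator a chip would provide. The helper names `to_fp16`, `dot_fp16_accumulate`, and `dot_wide_accumulate` are hypothetical and chosen only for illustration; this is not how any particular chip implements it.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (stdlib only)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def dot_fp16_accumulate(a, b):
    """Dot product where the running sum is rounded back to fp16 after every add."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_fp16(acc + to_fp16(x) * to_fp16(y))
    return acc


def dot_wide_accumulate(a, b):
    """Same fp16 inputs, but the running sum is kept in a wider format.

    Assumption: a Python float (double) stands in for the wider accumulator.
    """
    acc = 0.0
    for x, y in zip(a, b):
        acc += to_fp16(x) * to_fp16(y)
    return acc


ones = [1.0] * 4096
narrow = dot_fp16_accumulate(ones, ones)
wide = dot_wide_accumulate(ones, ones)
print(narrow, wide)  # 2048.0 4096.0
```

The fp16 accumulator stalls at 2048 because the gap between adjacent fp16 values there is 2, so adding 1 rounds back to the same number. The multiplications are identical in both versions; only the precision of the running sum differs.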
The part that is actually quite beautiful is that it's not about erasing hallucination or catching mistakes after the fact. Instead, it's about building the rigor of a mathematical proof into the process, before mistakes happen.
Until recently, you didn't have to know this. The model was the product, the app was the website, and the harness was minimal. You typed, it responded, you typed again. Now the same model can behave very differently depending on what harness it's operating in.