underscored

@underscored

2 clips · 1 follower

Follow
Tag:latencyClear

traditional AI models function sequentially—listening and then responding—whereas its new model aims for simultaneous input processing and response generation, akin to a phone call.

Michael Spencer
1w ago
stratechery.com
The Inference Shift

What does matter is how long humans are waiting for an answer, and as products like AI wearables become more of a thing, the speed of interaction, particularly for voice — which will be a function of token generation speed — will have a tangible effect on the user experience.

1w ago

Underscored — save the words that stop you in your tracks.

Start saving quotes →