the failure mode
Keyword search has one failure mode and it's the same one every time. The exact term you typed isn't in the document, even though the document is about the thing you asked for. You get zero matches. You scroll for ten minutes. You give up.
This is worse than it sounds. Your bookmarks are written by hundreds of different people, in hundreds of different ways. Two threads can be about the same topic and share no nouns. Keyword search rewards lexical match. Real recall rewards conceptual match.
what embeddings do
An embedding turns a piece of text into a list of numbers (1,536 of them, in xmark's case). The list of numbers is a coordinate. Two pieces of text that mean similar things land at similar coordinates. The math is dot products and cosines; the experience is “the right thing surfaced even though I didn't use its exact words.”
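To make "similar coordinates" concrete, here is a minimal sketch of cosine similarity in plain Python. The three-number vectors are invented for illustration (real embeddings have 1,536 numbers and come from a model), and the variable names are hypothetical, not anything from xmark's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings", made up for this example.
query = [0.9, 0.1, 0.0]
thread_on_same_topic = [0.8, 0.2, 0.1]
unrelated_thread = [0.0, 0.1, 0.9]

close = cosine_similarity(query, thread_on_same_topic)
far = cosine_similarity(query, unrelated_thread)

# The thread about the same topic scores higher, whatever words it used.
print(close > far)
```

A score near 1 means the two vectors point the same way; a score near 0 means they have little in common. Ranking bookmarks by this score is the whole trick.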
In xmark we use OpenAI's text-embedding-3-small. It's cheap, fast, and good enough that you don't notice it. Storage is pgvector inside Supabase Postgres, so queries are SQL with a distance operator (`<->`).
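A rough sketch of what such a query could look like, built in Python. The table name `bookmarks`, the columns `id`, `title`, and `embedding`, and both helper functions are assumptions for illustration; xmark's real schema isn't shown here. In pgvector, `<->` is Euclidean (L2) distance and `<=>` is cosine distance; for unit-length vectors the two produce the same ordering.

```python
def to_vector_literal(embedding):
    """Format a list of numbers in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


# Hypothetical schema: bookmarks(id, title, embedding vector(1536)).
# Placeholders use the %(name)s style accepted by psycopg.
NEAREST_BOOKMARKS_SQL = """
SELECT id, title
FROM bookmarks
ORDER BY embedding <-> %(query_vec)s::vector
LIMIT %(limit)s
"""


def nearest_bookmarks_query(query_embedding, limit=10):
    """Return (sql, params) for a nearest-neighbour lookup; does not touch a database."""
    params = {"query_vec": to_vector_literal(query_embedding), "limit": limit}
    return NEAREST_BOOKMARKS_SQL, params


sql, params = nearest_bookmarks_query([0.1, 0.2, 0.3], limit=5)
```

The pair can be handed to a Postgres driver as `cursor.execute(sql, params)`. Sorting by distance and taking the top few rows is all "semantic search" means at the database level.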
where keyword still wins
Keyword wins when you know the exact string: URLs, handles, very specific product names. xmark combines both. If your query looks like a URL or an @-handle, we don't bother embedding it; we just match the string. For everything else, embeddings.
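The routing rule can be sketched in a few lines. The two regular expressions and the function name `choose_search_mode` are assumptions about what "looks like a URL or an @-handle" might mean, not xmark's actual checks.

```python
import re

# Assumed heuristics: a single token starting with http(s):// or www.,
# or a single token that is "@" followed by word characters.
URL_RE = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)
HANDLE_RE = re.compile(r"^@\w+$")


def choose_search_mode(query):
    """Return 'keyword' for URL-like or handle-like queries, else 'semantic'."""
    q = query.strip()
    if URL_RE.match(q) or HANDLE_RE.match(q):
        return "keyword"
    return "semantic"


for example in ["https://example.com/post/1", "@some_handle", "how do vector indexes work"]:
    print(example, "->", choose_search_mode(example))
```

Checking the cheap pattern first means exact-string queries never pay for an embedding call, and everything that reads like a question or a topic goes to the vector search.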
the practical upshot
At 50 bookmarks, keyword is fine. At 500, you're scrolling. At 5,000, you have a research library that you can't read. Semantic search is the difference between a saved-things pile and a queryable corpus. It's the only approach that keeps working as the pile grows.