voiper1 2 days ago

I thought embedding large chunks would "dilute" the ideas, since large chunks tend to have multiple disparate ideas?

Does it somehow capture _all_ of the ideas, so that a query for just one of them would still match?

Isn't that the point of breaking down into sentences?
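
To make the "dilution" worry concrete, here is a toy sketch. It assumes, purely for illustration, that a chunk's embedding behaves like the average of per-idea vectors; real embedding models are not known to work this way, and the one-hot "idea" vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def mean_pool(vectors):
    """Average a list of vectors component-wise (assumed stand-in for a chunk embedding)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical "idea" vectors, one axis per idea (not real model output).
idea_a = [1.0, 0.0, 0.0]
idea_b = [0.0, 1.0, 0.0]
idea_c = [0.0, 0.0, 1.0]

sentence_chunk = idea_a                             # a chunk holding one idea
large_chunk = mean_pool([idea_a, idea_b, idea_c])   # a chunk mixing three ideas

query = idea_a  # a query about idea A only
sim_sentence = cosine(query, sentence_chunk)  # 1.0
sim_large = cosine(query, large_chunk)        # about 0.577 (1 / sqrt(3))
```

Under that assumption the large chunk still matches the query, just less strongly than the single-idea chunk, which is roughly what "dilution" would mean here.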

Someone mentioned adding context, but doesn't the model compute the embedding over the whole input? The API docs list `input` but no separate `context` field. https://docs.voyageai.com/reference/embeddings-api
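
One possible reading of "adding context", sketched below, is that the context is simply prepended to each chunk's text before it goes into `input`, so it gets embedded together with the chunk. This is a guess at what was meant, not something the docs confirm; `with_context`, the sample strings, and the model name are hypothetical placeholders, and the sketch only builds the request body without calling the API.

```python
import json

def with_context(context, chunk):
    """Prepend shared context to a chunk so both are embedded as one string."""
    return f"{context}\n\n{chunk}"

# Hypothetical document-level context and chunks.
doc_context = "From a guide on chunking strategies for retrieval."
chunks = [
    "Large chunks can mix several unrelated ideas.",
    "Sentence-level chunks isolate one idea each.",
]

# Request body with only `model` and `input`; there is no separate context field.
payload = {
    "model": "MODEL_NAME_PLACEHOLDER",  # placeholder, not a real model name
    "input": [with_context(doc_context, chunk) for chunk in chunks],
}
body = json.dumps(payload)
```

If that is what was meant, the context is part of what gets embedded, so it shifts the vector rather than being kept apart from it.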