Thoughts (and potential confusion) on embeddings & meilisearch #828
Replies: 1 comment
-
hello, not sure why the discussion is closed, but I wanted to make sure I would answer you 😊
What are your expectations in terms of documentation here? The exact value was chosen to leave some margin for float approximation while making reasonably sure that the same model is used. As a first step, and as a means to prevent user error, we decided to enforce that the same model with the same characteristics is used.
This would be nice, but it only works if the geometric transformation does not alter the relative positions of the vectors; otherwise the similarity check will give poor results.
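To illustrate the point above, here is a small sketch (my own example, not Meilisearch code) showing that an isometry such as a rotation preserves pairwise cosine similarity, while a non-uniform scaling does not, so a similarity check would only survive the first kind of transformation:

```python
import math

def cos_sim(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def rotate2d(v, theta):
    # Rotate a 2-D vector by theta radians (an isometry).
    c, s = math.cos(theta), math.sin(theta)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1]]

a, b = [1.0, 0.0], [1.0, 1.0]
base = cos_sim(a, b)
# Rotation preserves relative positions: similarity is unchanged.
rotated = cos_sim(rotate2d(a, 0.7), rotate2d(b, 0.7))
# Non-uniform scaling distorts relative positions: similarity changes.
scaled = cos_sim([a[0] * 5, a[1]], [b[0] * 5, b[1]])
```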
@meilisearch/docs-team
When declaring a composite embedder, Meilisearch attempts to embed the same texts with both embedders. If that fails, the settings task will fail.
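A minimal sketch of what such a consistency check might look like. The function names, the use of cosine distance, and the tolerance value are my assumptions; only the behaviour described above (embedding the same texts with both embedders and failing the settings task on disagreement) comes from the reply:

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def embedders_agree(embed_a, embed_b, texts, tolerance=0.01):
    # Embed the same texts with both embedders and require every pair
    # of embeddings to stay within the distance tolerance; a settings
    # task would fail when this returns False.
    return all(
        cosine_distance(embed_a(t), embed_b(t)) <= tolerance
        for t in texts
    )
```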
From my experience adding support for the BERT models, supporting a new model architecture with candle involves implementing tensor operations, pooling, architecture detection, etc. in code. Adding new architectures hasn't been done yet because it takes engineering time; we might add some in the future.
We definitely have something in the oven regarding this :-)
-
Less of a bug and more just some general feedback/ideas! (sorry in advance if this is annoying)
on the topic of composite embedders
I initially understood composite embedders in your product as a reranking tool: one model generates a query embedding to quickly fetch semantically similar vectors from the DB, after which a second model reorders the retrieved entries by relevance. I think the docs could do more to distinguish the "composite" embedding approach from the concept of reranking, especially since both techniques involve specifying two embedding models and reranking is quite well known.
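To make the contrast concrete, here is a sketch of the two search shapes. The `index.nearest` API and the scoring functions are hypothetical placeholders of my own, not Meilisearch's actual interfaces; the point is only that reranking is a two-stage scoring pipeline while a composite embedder is a single retrieval pass with different models at index time and query time:

```python
def search_with_rerank(query, index, fast_embed, rerank_score, k=50, top=10):
    # Reranking: a cheap model retrieves k candidates, then a second
    # model re-scores each (query, document) pair and reorders them.
    candidates = index.nearest(fast_embed(query), k)
    return sorted(candidates,
                  key=lambda doc: rerank_score(query, doc),
                  reverse=True)[:top]

def search_composite(query, index, search_embed, top=10):
    # Composite embedder: documents were indexed with one embedder and
    # the query uses another, but both must land in the same vector
    # space, so there is a single nearest-neighbour pass and no second
    # scoring step.
    return index.nearest(search_embed(query), top)
```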
In a Notion page I found on the feature it's also written that (though I haven't checked the code to be sure it's enforced)
How was this tolerance decided, and where is it documented? If the local model is meant to optimize for latency, then supporting quantized or half-precision models would be a nice-to-have. But due to the nature of quantization, we might not respect that maximum distance tolerance of 0.01, even though the local model is derived from the remote one.
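A quick experiment (my own, with toy vectors and toy quantizers) showing why this worry is plausible: a mild scheme like symmetric int8 barely moves cosine distance, but an aggressive one like binary sign quantization easily exceeds a 0.01 tolerance:

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.sqrt(sum(x * x for x in a)) *
                        math.sqrt(sum(x * x for x in b)))

def quantize_int8(vec):
    # Symmetric int8 quantization: snap each value to one of 255 levels.
    scale = max(abs(x) for x in vec) / 127.0
    return [round(x / scale) * scale for x in vec]

def quantize_sign(vec):
    # Binary (sign) quantization: keep only the sign of each value.
    return [1.0 if x >= 0 else -1.0 for x in vec]

v = [0.3, -0.7, 0.2, 0.6, -0.1]
int8_drift = cosine_distance(v, quantize_int8(v))  # stays well under 0.01
sign_drift = cosine_distance(v, quantize_sign(v))  # blows past 0.01
```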
limited support for hugging face models
Okay, the title is a bit of a misnomer since all the top models on the MTEB are BertModels, but if someone passes the Hugging Face ID of a model which is not a BERT architecture (e.g. DebertaV2 or XLM-RoBERTa), then the embedding job may fail (depending on whether the weights can be coerced into a BertModel here in the code).
This is sort of addressed here in the settings doc, but I think the restriction to BertModel architectures for embeddings should be mentioned more explicitly, or on the known limitations page.
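Until the docs spell this out, one way a user could guard against it is to inspect the model's `config.json` (Hugging Face configs do carry an `architectures` field) before pointing Meilisearch at the model. The supported set below is my assumption based on this thread, not an official list:

```python
import json

# Assumption from this thread: only BertModel is supported by the
# local (candle-based) embedder today.
SUPPORTED_ARCHITECTURES = {"BertModel"}

def check_architecture(config_json: str) -> list:
    # Read the `architectures` field of a Hugging Face config.json and
    # raise early, instead of letting the settings task fail server-side.
    archs = json.loads(config_json).get("architectures", [])
    unsupported = [a for a in archs if a not in SUPPORTED_ARCHITECTURES]
    if unsupported:
        raise ValueError(f"unsupported architecture(s): {unsupported}")
    return archs
```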
an aside/thought experiment (or a Gedankenexperiment, as they call it in Germany)
This model:
deepvk/deberta-v1-base
results in a 500 when I try configuring my embedder settings for an index. Now imagine I'm using composite embedders with an XLM-RoBERTa remote endpoint that generates embeddings without issue, while the local model crashes on load: how is this failure mode handled?
looking forwards
Some features related to embeddings I think would be fun (but probably out of scope for right now)
Anyway, I covered a lot of ground here; thanks, and feel free to let me know if any of this is planned and/or relevant!