Combining Multiple Embeddings for Better Retrieval Outcomes
How Qyver Represents Complex Data More Effectively
Retrieving high-quality results from LLM-powered vector searches over complex, embedded data isn’t easy. The traditional approach looks like this (a code sketch follows the list):
Embed all entity data into a single vector using a model (e.g., an open-source Hugging Face model or a proprietary one).
Run a vector search to find the top-k nearest neighbors.
Rerank results using:
Additional contextual filters
Information not captured in the embedding (e.g., filtering out out-of-stock items)
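Here is a minimal sketch of that pipeline using sentence-transformers and numpy. The model names, the toy documents, and the `in_stock` metadata field are illustrative assumptions, not part of the original example:

```python
import numpy as np
from sentence_transformers import CrossEncoder, SentenceTransformer

docs = [
    {"text": "Glorious animals live in the wilderness.", "in_stock": True},
    {"text": "Growing computation power enables advancements in AI.", "in_stock": True},
]

# Step 1: embed everything into a single vector per document.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = encoder.encode([d["text"] for d in docs], normalize_embeddings=True)

query = "How is AI advancing?"
query_vec = encoder.encode([query], normalize_embeddings=True)[0]

# Step 2: vector search, i.e., top-k nearest neighbors by cosine similarity.
scores = doc_vecs @ query_vec
top_k = np.argsort(-scores)[:2]

# Step 3: rerank. Filter on metadata the embedding never saw, then
# re-score each surviving (query, document) pair with a cross-encoder.
candidates = [docs[i] for i in top_k if docs[i]["in_stock"]]
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
rerank_scores = reranker.predict([(query, d["text"]) for d in candidates])
ranked = [d for _, d in sorted(zip(rerank_scores, candidates),
                               key=lambda pair: -pair[0])]
```

The cross-encoder pass in step 3 is where the pairwise, per-result cost and the extra latency come from.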
This method has serious limitations:
Data loss: A single embedding struggles to capture multiple attributes, reducing relevance.
Inefficient reranking: Cross-encoder reranking adds a second model pass, increasing latency in production.
High computational cost: Comparing all results pairwise to refine rankings is resource-intensive.
Is there a way to achieve high-quality retrieval without data loss and latency overhead? Yes!
A Smarter Approach: Multimodal Embeddings with Qyver
Instead of embedding all entity data into a single vector, Qyver Spaces allow you to:
Embed each attribute separately based on its modality.
Concatenate these embeddings into a single multimodal vector (a toy illustration follows this list).
Capture the full complexity of data without losing information.
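Conceptually, the concatenation step is simple: the final vector is just the per-attribute embeddings laid end to end. A toy numpy illustration with made-up values:

```python
import numpy as np

text_vec = np.array([0.12, -0.40, 0.88])  # from a text-similarity embedding
likes_vec = np.array([0.75])              # from a numeric embedding, scaled to 0-1

# The multimodal vector keeps every attribute's information intact.
multimodal_vec = np.concatenate([text_vec, likes_vec])
# -> array([ 0.12, -0.4 ,  0.88,  0.75])
```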
This results in:
Better quality retrieval—more relevant, complete search results.
No need for expensive reranking—eliminates computational overhead.
Faster processing—retrieval is 10x faster, reducing latency from hundreds to tens of milliseconds.
Example: Searching Multimodal Data Efficiently
Consider a simple dataset where each paragraph has a text body and a like count:
Paragraph 1: "Glorious animals live in the wilderness." → likes: 10
Paragraph 2: "Growing computation power enables advancements in AI." → likes: 75
Using Qyver’s Spaces, we represent each attribute individually, starting with a schema that declares the entity’s fields.
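The sketch below assumes Qyver exposes a declarative Python API; the `qyver` module alias and the `Schema`, `IdField`, `String`, and `Integer` names are illustrative assumptions, not confirmed identifiers:

```python
import qyver as qy  # hypothetical import; check the actual package for exact names

class ParagraphSchema(qy.Schema):
    id: qy.IdField          # unique identifier, e.g., "paragraph-1"
    body: qy.String         # unstructured text attribute
    like_count: qy.Integer  # structured numeric attribute

paragraph = ParagraphSchema()
```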
Instead of embedding everything into a single vector, we then define a separate Space for each attribute.
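Continuing the sketch under the same caveat (`TextSimilaritySpace`, `NumberSpace`, and `Mode` are assumed names), each Space embeds one attribute with a method suited to its modality:

```python
# Text gets a sentence-embedding model; the model name is an example choice.
body_space = qy.TextSimilaritySpace(
    text=paragraph.body,
    model="sentence-transformers/all-MiniLM-L6-v2",
)

# Numbers get a scaled numeric embedding; higher like counts rank higher.
like_space = qy.NumberSpace(
    number=paragraph.like_count,
    min_value=0,
    max_value=100,
    mode=qy.Mode.MAXIMUM,
)
```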
These Spaces are then combined into a single index.
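In this sketch, the index is what concatenates the per-attribute vectors into one multimodal vector per paragraph (`Index` is likewise an assumed name):

```python
# Both Spaces feed a single searchable index.
paragraph_index = qy.Index([body_space, like_space])
```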
Efficient Querying Without Reranking
Once indexed, we can query for relevance efficiently.
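A sketch of ingestion and querying under the same assumptions; `InMemorySource`, `InMemoryExecutor`, `Query`, and `Param` are hypothetical names for the kind of in-memory setup such a library typically provides:

```python
# Ingest the two example paragraphs into an in-memory app.
source = qy.InMemorySource(paragraph)
app = qy.InMemoryExecutor(sources=[source], indices=[paragraph_index]).run()
source.put([
    {"id": "paragraph-1",
     "body": "Glorious animals live in the wilderness.",
     "like_count": 10},
    {"id": "paragraph-2",
     "body": "Growing computation power enables advancements in AI.",
     "like_count": 75},
])

# Relevance is expressed as per-Space weights at query time,
# so no post-retrieval reranking pass is needed.
query = (
    qy.Query(paragraph_index, weights={body_space: 1.0, like_space: 1.0})
    .find(paragraph)
    .similar(body_space.text, qy.Param("query_text"))
)
result = app.query(query, query_text="How does AI progress?")
```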
Expected Output:

| body | like_count | id |
| --- | --- | --- |
| Growing computation power enables advancements in AI. | 75 | paragraph-2 |
| Glorious animals live in the wilderness. | 10 | paragraph-1 |
Why is this better?
Structured & unstructured data are embedded separately but stored in a single searchable index.
No need for reranking—Qyver Spaces ensure relevance is captured upfront.
Faster & more accurate—reduces processing overhead while improving retrieval precision.
Scaling Beyond Simple Use Cases
Qyver is designed for real-world complexity, handling:
E-commerce recommendations → Blending product descriptions, user preferences, and purchase history.
Search & discovery → Multimodal indexing for text, images, and metadata.
LLM-powered applications → Enhancing RAG (Retrieval-Augmented Generation) with structured embeddings.
Final Thoughts
Instead of relying on reranking, filtering, or expensive post-processing, Qyver optimizes embeddings upfront—leading to faster, higher-quality retrieval.
Want to try it? Check out the notebook and see the power of multimodal embeddings in action.
Like what we’re doing? Give us a star!