
# Combining Multiple Embeddings for Better Retrieval Outcomes

#### **How Qyver Represents Complex Data More Effectively**

Retrieving **high-quality results** from **LLM-powered vector searches** on **complex, embedded data** isn’t easy. The **traditional approach** looks like this:

1. **Embed all entity data** into a **single vector** using a model (e.g., Hugging Face, proprietary models).
2. **Run a vector search** to find the **top X nearest neighbors**.
3. **Rerank results** using:
   * Additional **contextual filters**
   * Information **not captured in the embedding** (e.g., filtering out out-of-stock items)
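
The three steps above can be sketched in plain Python. This is a minimal illustration only: the vectors, the `in_stock` field, and the `traditional_search` helper are hypothetical and not part of Qyver, and a simple filter stands in for the reranking step.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def traditional_search(query_vec, items, top_x, keep):
    """Steps 2 and 3: nearest-neighbor search, then post-filtering."""
    # Step 2: rank every item by similarity and keep the top X.
    ranked = sorted(
        items, key=lambda item: cosine(query_vec, item["vector"]), reverse=True
    )
    candidates = ranked[:top_x]
    # Step 3: apply information that the embedding does not carry.
    return [item for item in candidates if keep(item)]

# Hypothetical catalogue: these vectors are made up, not produced by a model.
items = [
    {"id": "a", "vector": [0.9, 0.1], "in_stock": False},
    {"id": "b", "vector": [0.8, 0.3], "in_stock": True},
    {"id": "c", "vector": [0.1, 0.9], "in_stock": True},
]

results = traditional_search(
    [1.0, 0.0], items, top_x=2, keep=lambda item: item["in_stock"]
)
print([item["id"] for item in results])  # ['b']
```

Two neighbors were requested, but only one survives the filter, because the stock information was never part of the vector that drove the search.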

This method has **serious limitations**:

* **Data loss:** A single embedding struggles to **capture multiple attributes**, reducing relevance.
* **Inefficient reranking:** Reranking with **cross-encoders** adds **latency**, slowing down production.
* **High computational cost:** Scoring each query-result pair in a separate model pass is resource-intensive, and the cost grows with every candidate that is reranked.

**Is there a way to achieve high-quality retrieval** without **data loss and latency overhead**? **Yes!**

***

### **A Smarter Approach: Multimodal Embeddings with Qyver**

Instead of embedding **all entity data into a single vector**, **Qyver Spaces** allow you to:

* **Embed each attribute separately** based on its modality.
* **Concatenate these embeddings** into a **multimodal vector**.
* **Capture the full complexity** of data **without losing information**.
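
One way to see why concatenation keeps each attribute's contribution intact: the dot product of two concatenated vectors equals the sum of the dot products of their parts. The sketch below checks this with made-up vectors. The helper names and values are assumptions for illustration, not Qyver internals.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so each part contributes comparably."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def concatenate(*parts):
    """Join per-attribute vectors into one multimodal vector."""
    return [x for part in parts for x in part]

# Hypothetical per-attribute embeddings for one item and one query.
text_item, text_query = normalize([0.2, 0.7, 0.1]), normalize([0.3, 0.6, 0.2])
num_item, num_query = normalize([0.6, 0.8]), normalize([1.0, 0.0])

item_vec = concatenate(text_item, num_item)
query_vec = concatenate(text_query, num_query)

combined = dot(item_vec, query_vec)
per_space = dot(text_item, text_query) + dot(num_item, num_query)
print(math.isclose(combined, per_space))  # True
```

Because the combined score is a plain sum, a single nearest-neighbor search over the concatenated vectors accounts for every attribute at once.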

This results in:

* **Better quality retrieval:** more relevant, complete search results.
* **No need for expensive reranking:** the extra computational overhead is eliminated.
* **Faster processing:** retrieval is **10x faster**, reducing latency from **hundreds** to **tens of milliseconds**.

#### **Example: Searching Multimodal Data Efficiently**

Consider a **simple dataset** where each paragraph has a **text body** and a **like count**:

* **Paragraph 1:** *"Glorious animals live in the wilderness."* → **likes: 10**
* **Paragraph 2:** *"Growing computation power enables advancements in AI."* → **likes: 75**

We start by declaring a **schema** with one field per attribute, so that each attribute can be represented **individually**:

```python
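# Declare the entity and the type of each attribute.
@schema
class Paragraph:
    id: IdField
    body: String
    like_count: Integer


# Instantiate the schema; the Spaces and the query below refer to `paragraph`.
paragraph = Paragraph()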

Instead of embedding everything into a **single vector**, we use **separate Spaces**:

```python
body_space = TextSimilaritySpace(
    text=paragraph.body, model="sentence-transformers/all-mpnet-base-v2"
)

like_space = NumberSpace(
    number=paragraph.like_count, min_value=0, max_value=100, mode=Mode.MAXIMUM
)
```
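
To build intuition for what a number Space with a `MAXIMUM` mode could do, here is a toy encoding. It is an assumption made for illustration, not Qyver's actual `NumberSpace` implementation: the value is clamped to the configured range, mapped to a point on a quarter circle, and scored against the encoding of `max_value`, so higher like counts score higher. The function names are hypothetical.

```python
import math

def encode_number(value, min_value, max_value):
    """Toy encoding: clamp to the range, then map to a quarter-circle point."""
    clamped = max(min_value, min(max_value, value))
    fraction = (clamped - min_value) / (max_value - min_value)
    angle = fraction * math.pi / 2
    return [math.sin(angle), math.cos(angle)]

def maximum_mode_score(value, min_value=0, max_value=100):
    """Similarity to the encoding of max_value: larger values score higher."""
    target = encode_number(max_value, min_value, max_value)
    vec = encode_number(value, min_value, max_value)
    return sum(x * y for x, y in zip(vec, target))

# The two like counts from the example dataset.
for likes in (10, 75):
    print(likes, round(maximum_mode_score(likes), 3))
```

Under this toy scheme the paragraph with 75 likes scores higher than the one with 10, and any value above `max_value` scores the same as `max_value` because of the clamping.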

These Spaces are then **combined into a single index**:

```python
paragraph_index = Index([body_space, like_space])
```

***

### **Efficient Querying Without Reranking**

Once indexed, we **query for relevance** efficiently:

```python
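# Build a query that ranks paragraphs by similarity to a text parameter.
query = (
    Query(paragraph_index)
    .find(paragraph)
    .similar(body_space.text, Param("query_text"))
)

# `app` is assumed to be a running Qyver app that serves `paragraph_index`
# and already holds the two paragraphs; its setup is not shown on this page.
result = app.query(
    query,
    query_text="What makes the AI industry go forward?",
)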
```

**Expected Output:**

| body                                                  | like\_count | id          |
| ----------------------------------------------------- | ----------- | ----------- |
| Growing computation power enables advancements in AI. | 75          | paragraph-2 |
| Glorious animals live in the wilderness.              | 10          | paragraph-1 |

**Why is this better?**

1. **Structured & unstructured data are embedded separately** but stored in a **single searchable index**.
2. **No need for reranking:** **Qyver Spaces** ensure relevance is **captured upfront**.
3. **Faster & more accurate:** reduces processing overhead while improving **retrieval precision**.

***

### **Scaling Beyond Simple Use Cases**

Qyver **is designed for real-world complexity**, handling:

* **E-commerce recommendations** → *Blending product descriptions, user preferences, and purchase history.*
* **Search & discovery** → *Multimodal indexing for text, images, and metadata.*
* **LLM-powered applications** → *Enhancing RAG (Retrieval-Augmented Generation) with structured embeddings.*

***

### **Final Thoughts**

Instead of relying on **reranking, filtering, or expensive post-processing**, **Qyver optimizes embeddings upfront**—leading to **faster, higher-quality retrieval**.

**Want to try it?** Check out the notebook and see the power of **multimodal embeddings in action**.

**Like what we’re doing?** Give us a [star](https://github.com/qyverlabs/qyver)!

