🏗️ Basic Building Blocks

Learn the Qyver Lingo

Introduction

Qyver’s framework is built on core components: @schema, Source, Spaces, Index, Query, and Executor. These elements allow you to construct a modular, customizable system tailored to your specific use cases.

You begin by defining how your embeddings should represent your data, guiding the system’s setup and allowing you to fine-tune each module before executing queries. You can dynamically adjust query weights for different scenarios, such as emphasizing user interests or recent items.

This modular approach separates query definition from execution, enabling you to reuse the same query across different environments without reimplementation. By leveraging @schema, Source, Spaces, Index, and Event, you can structure queries that adapt seamlessly to different Executors.

Qyver also simplifies deployment transitions, from in-memory testing to batch processing or real-time pipelines, making it easy to experiment with embeddings and retrieval strategies while retaining full control over index creation.

Let’s explore these building blocks in detail.

Turning Classes into Schemas

Once you’ve loaded your data (via JSON, a pandas DataFrame, etc.), the next step is to define a Schema that describes its structure.

Qyver makes this easy with Schema, which marks a class as a structured entity; the example below does so by subclassing sl.Schema. A Schema declares the fields that Spaces later turn into searchable embeddings, so the rest of the system knows what it can embed and retrieve.

Example:

class ParagraphSchema(sl.Schema):
    body: sl.String
    id: sl.IdField

paragraph = ParagraphSchema()  # instance referenced by the later snippets
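To make concrete what a schema declares, here is a library-free sketch that mirrors the two fields above with a standard dataclass and checks incoming records against them. ParagraphRecord and parse_record are hypothetical names used only for illustration; they are not part of Qyver, and Qyver's own validation may work differently.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ParagraphRecord:
    """Plain-Python stand-in for the two schema fields (hypothetical, not Qyver)."""
    id: str
    body: str


def parse_record(raw: dict) -> ParagraphRecord:
    """Accept a raw dict only if it has exactly the declared fields, all strings."""
    expected = {f.name for f in fields(ParagraphRecord)}
    present = set(raw)
    missing = expected - present
    extra = present - expected
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for name in expected:
        if not isinstance(raw[name], str):
            raise TypeError(f"{name} must be a string")
    return ParagraphRecord(**raw)


record = parse_record({"id": "happy_dog", "body": "That is a happy dog"})
print(record.body)
```

Declaring the fields once and rejecting anything that does not match them is what lets later stages rely on attributes such as body being present.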

With your Schemas set up, you’re ready to move on to embedding and querying—this is where Qyver’s modular approach shines. The framework is designed to let you customize how your system processes and searches data to meet your specific needs.

Defining Embeddings with Spaces

Spaces is a declarative module designed to control how data attributes are embedded.

At both ingestion and query time, Spaces encapsulate vector creation logic, enabling tailored embeddings based on data type and retrieval needs.

Key Dimensions of Spaces:

Input Type: What type of data the Space supports (e.g., text, timestamps, numbers, categories).
Representation Type: Whether the Space represents similarity (e.g., TextSimilaritySpace) or scale (e.g., NumberSpace).

Choosing the Right Space

TextSimilaritySpace / CategoricalSimilaritySpace → for strings
RecencySpace → for timestamps
NumberSpace → for integers/floats
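As a rough illustration of why timestamps and numbers get their own Spaces, the sketch below turns each into a bounded score. It is a toy under stated assumptions: scale_number, recency_score, and their parameters are hypothetical helpers, and the linear mappings are chosen for simplicity rather than taken from how RecencySpace or NumberSpace actually encode values.

```python
from datetime import datetime, timedelta, timezone


def scale_number(value: float, min_value: float, max_value: float) -> float:
    """Clamp value into [min_value, max_value] and map it linearly to [0, 1]."""
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    clamped = min(max(value, min_value), max_value)
    return (clamped - min_value) / (max_value - min_value)


def recency_score(timestamp: datetime, now: datetime, period: timedelta) -> float:
    """1.0 for an item created right now, falling linearly to 0.0 at `period` old."""
    age = now - timestamp
    if age < timedelta(0):
        return 1.0  # assumption: future timestamps count as brand new
    return max(0.0, 1.0 - age / period)


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(scale_number(250, min_value=0, max_value=1000))                    # 0.25
print(recency_score(now - timedelta(days=15), now, timedelta(days=30)))  # 0.5
```

Text needs a learned embedding model to compare meaning, whereas values like these only need a position on a scale, which is why the input type drives the choice of Space.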

Each Space captures a distinct semantic or structural property (e.g., title, review count) and allows weighting before concatenation into a single multimodal vector. This approach ensures better search quality without costly re-ranking or post-processing.

Example:

relevance_space = sl.TextSimilaritySpace(text=paragraph.body, model="Snowflake/snowflake-arctic-embed-s")
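The weighting-before-concatenation idea described above can be sketched without any library. This is a minimal illustration, not Qyver's implementation: combine, rank, the two-dimensional vectors, and the space names "text" and "recency" are all made up for the example. Each space's vector is normalized, scaled by its weight, and appended to one long vector, so a single dot product yields a weighted sum of per-space similarities.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def combine(space_vectors, weights):
    """Normalize each space's vector, scale it by its weight, and concatenate."""
    combined = []
    for name in sorted(space_vectors):  # fixed order keeps vectors aligned
        weight = weights.get(name, 1.0)
        combined.extend(weight * x for x in l2_normalize(space_vectors[name]))
    return combined


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rank(query_vectors, items, weights):
    """Weight the query side only, then order item ids by dot product."""
    query = combine(query_vectors, weights)
    scored = [(dot(query, combine(vectors, {})), item_id)
              for item_id, vectors in items.items()]
    return [item_id for _, item_id in sorted(scored, reverse=True)]


# Hypothetical vectors: one item matches the text, the other matches recency.
query = {"text": [0.6, 0.8], "recency": [1.0, 0.0]}
items = {
    "old_but_on_topic": {"text": [0.6, 0.8], "recency": [0.0, 1.0]},
    "new_but_off_topic": {"text": [1.0, 0.0], "recency": [1.0, 0.0]},
}

print(rank(query, items, {"text": 1.0, "recency": 0.2}))  # on-topic item first
print(rank(query, items, {"text": 0.2, "recency": 1.0}))  # recent item first
```

Because the weights act on the query vector, the stored item vectors stay the same while the ranking shifts, which is what makes query-time weight adjustments cheap.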

Indexing for Efficient Queries

Qyver’s Index module allows you to group Spaces for optimized retrieval.

Example:

paragraph_index = sl.Index(relevance_space)

Structuring Queries

Before execution, you need to define how queries should interact with your index.

Key Query Components:

Query: Defines the index to search and any associated parameters.
.find(): Specifies the data type being retrieved.
.similar(): Defines similarity logic for ranking results.
.with_vector(): (Optional) Allows vector-based search on specific elements.

Example:

query = (
    sl.Query(paragraph_index)
    .find(paragraph)
    .similar(relevance_space.text, sl.Param("query_text"))
)
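The value of a named parameter such as "query_text" is that the query is only a description until something executes it. The toy below sketches that separation in plain Python. ToyQuery, SimilarClause, run_in_memory, and explain are hypothetical stand-ins, and the local Param class only imitates the idea of a placeholder; none of this is Qyver's code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Param:
    """Named placeholder whose value is supplied at execution time."""
    name: str


@dataclass(frozen=True)
class SimilarClause:
    field: str
    value: Any  # a literal value or a Param


class ToyQuery:
    """Immutable query description: it holds no data and runs nothing."""

    def __init__(self, clauses=()):
        self.clauses = tuple(clauses)

    def similar(self, field, value):
        return ToyQuery(self.clauses + (SimilarClause(field, value),))

    def resolve(self, **params):
        """Replace every Param with the value passed in for its name."""
        resolved = {}
        for clause in self.clauses:
            value = clause.value
            if isinstance(value, Param):
                value = params[value.name]  # KeyError if the caller omits it
            resolved[clause.field] = value
        return resolved


def word_overlap(a, b):
    return len(set(a.lower().split()) & set(b.lower().split()))


def run_in_memory(query, records, **params):
    """One possible executor: score records by shared words, best first."""
    wanted = query.resolve(**params)

    def score(record):
        return sum(word_overlap(record[f], text) for f, text in wanted.items())

    return [r["id"] for r in sorted(records, key=score, reverse=True)]


def explain(query, **params):
    """A second executor over the same query: describe it instead of running it."""
    return [f"{f} ~ {text!r}" for f, text in query.resolve(**params).items()]


toy_query = ToyQuery().similar("body", Param("query_text"))
records = [
    {"id": "happy_dog", "body": "That is a happy dog"},
    {"id": "sunny_day", "body": "Today is a sunny day"},
]
print(run_in_memory(toy_query, records, query_text="a sunny day"))
print(explain(toy_query, query_text="a sunny day"))
```

The same toy_query object is handed to two different executors without being rewritten, which is the property the modular design is aiming for.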

Connecting Data with Source & Executor

To integrate your data with Qyver’s framework, use Source to link it to the schema.

source: sl.InMemorySource = sl.InMemorySource(paragraph)

Then, set up an Executor, which connects the indexed data with the defined Spaces and processes queries efficiently.

executor = sl.InMemoryExecutor(sources=[source], indices=[paragraph_index])
app = executor.run()

Experimenting with Sample Data

Now, let’s insert some test data.

source.put([{"id": "happy_person", "body": "That is a very happy person"}])
source.put([{"id": "happy_dog", "body": "That is a happy dog"}])
source.put([{"id": "sunny_day", "body": "Today is a sunny day"}])

Run a query to see relevant results:

result = app.query(query, query_text="This is a happy person")
result.to_pandas()

Sample Output:

body                         id
That is a very happy person  happy_person
That is a happy dog          happy_dog
Today is a sunny day         sunny_day

Changing the query text changes the ranking:

result = app.query(query, query_text="This is a happy dog")
result.to_pandas()

Updated Output:

body                         id
That is a happy dog          happy_dog
That is a very happy person  happy_person
Today is a sunny day         sunny_day
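To build intuition for why the order flips between the two queries, the ranking can be imitated with a much cruder similarity measure. The sketch below uses bag-of-words cosine similarity in place of the embedding model, so it is a swapped-in technique and not what TextSimilaritySpace computes; rank_docs and the other helper names are hypothetical. On these three sentences it happens to produce the same ordering.

```python
import math
from collections import Counter

DOCS = {
    "happy_person": "That is a very happy person",
    "happy_dog": "That is a happy dog",
    "sunny_day": "Today is a sunny day",
}


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    shared = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return shared / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_docs(query_text):
    """Return document ids ordered from most to least similar to the query."""
    query = bag_of_words(query_text)
    scores = {doc_id: cosine(query, bag_of_words(body))
              for doc_id, body in DOCS.items()}
    return sorted(scores, key=scores.get, reverse=True)


print(rank_docs("This is a happy person"))  # ['happy_person', 'happy_dog', 'sunny_day']
print(rank_docs("This is a happy dog"))     # ['happy_dog', 'happy_person', 'sunny_day']
```

A real embedding model captures meaning rather than exact word overlap, so it would also handle paraphrases that this word-counting stand-in misses.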

Final Thoughts

Qyver’s framework lets you build a modular, flexible system that adapts to different use cases and deployments. Separating query logic from execution means the same query can be reused across Executors without reimplementation, and weighting Spaces inside a single vector reduces the need for re-ranking and post-processing, saving both time and resources.

Try it out and experiment with your own data. If you like what Qyver offers, give us a star!

