Contrib Modules¶
The loom.contrib package contains optional integrations that extend Loom's
capabilities. Each module requires its own optional dependency extra.
| Module | Extra | Purpose |
|---|---|---|
| contrib.duckdb | duckdb | Embedded analytics and vector search |
| contrib.lancedb | lancedb | ANN vector search via LanceDB |
| contrib.redis | redis | Production checkpoint persistence |
| contrib.rag | rag | Social media stream RAG pipeline |
See RAG How-To for the RAG pipeline guide.
Valkey/Redis Store¶
Production checkpoint store using Redis/Valkey. Replaces the default in-memory store for persistent orchestrator checkpoints.
store ¶
Valkey-backed checkpoint store.
Production implementation of CheckpointStore using redis.asyncio (redis-py). The redis-py client library works unchanged with Valkey. Install with: pip install loom[redis]
Connection defaults
redis://redis:6379 — matches the Docker Compose / k8s service name. For local dev: redis://localhost:6379
RedisCheckpointStore ¶
Bases: CheckpointStore
Valkey-backed checkpoint store (via redis-py client).
Thin wrapper around redis.asyncio that implements the CheckpointStore interface. Handles connection lifecycle and TTL-based expiry natively. The redis-py client works unchanged with Valkey.
Source code in src/loom/contrib/redis/store.py
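TTL-based expiry means each checkpoint is written with a time-to-live and disappears once it lapses. A toy in-memory sketch of those semantics (purely illustrative; the real store delegates expiry to Redis/Valkey):

```python
import time


class ToyTTLStore:
    """Illustrates TTL-based expiry semantics; not the actual store API."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        # A ttl of None means the entry never expires.
        expires = time.monotonic() + ttl if ttl is not None else None
        self._data[key] = (value, expires)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and time.monotonic() >= expires:
            del self._data[key]  # expired, as Redis/Valkey would evict it
            return None
        return value
```

In the real store this bookkeeping is unnecessary: Redis/Valkey expires keys server-side, which is what "handles TTL-based expiry natively" refers to.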
DuckDB Query Backend¶
Action-dispatch query backend for DuckDB. Supports full-text search, filtering, statistics, single-row get, and vector similarity search.
query_backend ¶
Generic DuckDB query and analytics backend for Loom workflows.
Provides a configurable action-dispatch query backend against any DuckDB table. Supports full-text search (via DuckDB FTS), attribute filtering, aggregate statistics, single-record retrieval, and vector similarity search.
Subclasses configure domain-specific behavior by passing constructor
parameters (table name, columns, filter definitions, etc.) rather than
overriding methods. For advanced customisation, override _get_handlers
to add or replace action handlers.
Example worker config::
processing_backend: "myapp.backends.MyQueryBackend"
backend_config:
db_path: "/tmp/workspace/data.duckdb"
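The configure-by-constructor pattern might look like the following sketch. The base class here is a simplified stand-in for DuckDBQueryBackend, and ArticleBackend with its author filter is purely hypothetical:

```python
class QueryBackend:
    """Simplified stand-in showing the constructor-configuration pattern."""

    def __init__(self, *, table_name="documents", filter_fields=None):
        self.table_name = table_name
        # filter_fields maps payload field names to SQL condition templates.
        self.filter_fields = filter_fields or {}


class ArticleBackend(QueryBackend):
    """Hypothetical subclass: domain behavior comes from constructor
    arguments, not method overrides."""

    def __init__(self):
        super().__init__(
            table_name="articles",
            filter_fields={"author": "author = ?"},
        )
```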
See Also
loom.worker.processor.SyncProcessingBackend -- base class for sync backends
loom.contrib.duckdb.DuckDBViewTool -- LLM-callable view tool
loom.contrib.duckdb.DuckDBVectorTool -- LLM-callable vector search tool
DuckDBQueryError ¶
Bases: BackendError
Raised when a DuckDB query operation fails.
Wraps underlying DuckDB exceptions with a descriptive message
and the original cause attached via __cause__.
DuckDBQueryBackend ¶
DuckDBQueryBackend(db_path: str = '/tmp/workspace/data.duckdb', *, table_name: str = 'documents', result_columns: list[str] | None = None, json_columns: set[str] | None = None, id_column: str = 'id', full_text_column: str | None = 'full_text', fts_fields: str = 'full_text,summary', filter_fields: dict[str, str] | None = None, stats_groups: set[str] | None = None, stats_aggregates: list[str] | None = None, default_order_by: str = 'rowid', embedding_column: str = 'embedding')
Bases: SyncProcessingBackend
Generic action-dispatch query backend for DuckDB tables.
Opens a read-only connection to the DuckDB database and dispatches
to the appropriate query handler based on the action field in
the payload.
All queries use parameterized statements to prevent SQL injection.
Results from search/filter actions exclude large content columns
(configurable via full_text_column) to keep messages small.
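A simplified sketch of that dispatch pattern (the handler names and payload shapes here are illustrative, not the backend's actual handlers):

```python
def handle_search(payload):
    # Would run a parameterized FTS query in the real backend.
    return {"action": "search", "results": []}


def handle_get(payload):
    # Would fetch a single record by the configured id column.
    return {"action": "get", "record": None}


HANDLERS = {"search": handle_search, "get": handle_get}


def process_sync(payload):
    """Dispatch on the payload's action field."""
    action = payload.get("action")
    handler = HANDLERS.get(action)
    if handler is None:
        # Mirrors the documented ValueError for unknown actions.
        raise ValueError(f"Unknown action: {action}")
    return handler(payload)
```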
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| db_path | str | Path to the DuckDB database file. | '/tmp/workspace/data.duckdb' |
| table_name | str | Table to query. | 'documents' |
| result_columns | list[str] \| None | Columns returned in search/filter results. | None |
| json_columns | set[str] \| None | Set of column names containing JSON strings that should be parsed back into Python objects on read. | None |
| id_column | str | Primary key column name for single-record retrieval. | 'id' |
| full_text_column | str \| None | Large content column; excluded from search/filter results and returned only for single-record retrieval. | 'full_text' |
| fts_fields | str | Comma-separated field names for the DuckDB FTS index. | 'full_text,summary' |
| filter_fields | dict[str, str] \| None | Mapping of payload field names to SQL condition templates. | None |
| stats_groups | set[str] \| None | Set of column names allowed as grouping keys in the stats query. | None |
| stats_aggregates | list[str] \| None | SQL aggregate expressions for the stats query. | None |
| default_order_by | str | ORDER BY clause for filter results. | 'rowid' |
| embedding_column | str | Column name for vector embeddings used in vector similarity search. | 'embedding' |
Source code in src/loom/contrib/duckdb/query_backend.py
process_sync ¶
Dispatch a query action against the DuckDB database.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| payload | dict[str, Any] | Must contain an action field selecting the query to run. | required |
| config | dict[str, Any] | Worker config dict. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | A dict with the results of the dispatched action. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the action is unknown. |
| DuckDBQueryError | If the database query fails. |
Source code in src/loom/contrib/duckdb/query_backend.py
DuckDB View Tool¶
Read-only DuckDB view exposed as an LLM-callable tool. Workers can query structured data during processing.
view_tool ¶
DuckDB view tool — exposes a DuckDB view as an LLM-callable tool.
When configured in a worker's knowledge_silos, this tool lets the LLM query a read-only DuckDB view during reasoning. The LLM can search (full-text) or list records from the view.
Example knowledge_silos config::
knowledge_silos:
- name: "catalog"
type: "tool"
provider: "loom.contrib.duckdb.DuckDBViewTool"
config:
db_path: "/tmp/workspace/data.duckdb"
view_name: "summaries"
description: "Search and browse record summaries"
max_results: 20
The tool auto-introspects the view's columns via DESCRIBE to build its JSON Schema definition. Queries use parameterized SQL to prevent injection.
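As a sketch of what that introspection step might produce (the type mapping below is illustrative, not the tool's exact rules):

```python
def duckdb_type_to_json_schema(duck_type: str) -> str:
    """Map a DuckDB column type (as reported by DESCRIBE) to a JSON Schema type."""
    t = duck_type.upper()
    if t in ("TINYINT", "SMALLINT", "INTEGER", "BIGINT", "UBIGINT"):
        return "integer"
    if t in ("REAL", "FLOAT", "DOUBLE", "DECIMAL"):
        return "number"
    if t == "BOOLEAN":
        return "boolean"
    return "string"  # VARCHAR, TIMESTAMP, etc. fall back to string


def build_properties(columns: dict[str, str]) -> dict:
    """Build a JSON Schema properties object from {column: duckdb_type}."""
    return {name: {"type": duckdb_type_to_json_schema(t)} for name, t in columns.items()}
```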
DuckDBViewTool ¶
DuckDBViewTool(db_path: str, view_name: str, description: str = 'Query a database view', max_results: int = 20)
Bases: SyncToolProvider
Expose a DuckDB view as an LLM-callable search/list tool.
The tool dynamically introspects the view's column schema at instantiation time and builds a JSON Schema tool definition that the LLM can call.
Supports two operations:
- search: Full-text ILIKE search across all text columns
- list: List recent records with optional column filters
All queries are parameterized and results are capped at max_results.
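A sketch of how such a parameterized search query could be assembled (names are placeholders; identifiers cannot be bound as parameters, so they come from trusted introspection while only values are parameterized):

```python
def build_search_query(view_name: str, text_columns: list[str], term: str, max_results: int):
    """Build a parameterized ILIKE search across a view's text columns.

    Returns (sql, params) where every user-supplied value is bound via
    a ? placeholder rather than interpolated into the SQL string.
    """
    conditions = " OR ".join(f"{col} ILIKE ?" for col in text_columns)
    sql = f"SELECT * FROM {view_name} WHERE {conditions} LIMIT ?"
    # One %term% pattern per ILIKE placeholder, plus the LIMIT value.
    params = [f"%{term}%"] * len(text_columns) + [max_results]
    return sql, params
```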
Source code in src/loom/contrib/duckdb/view_tool.py
get_definition ¶
Build JSON Schema tool definition from view columns.
Source code in src/loom/contrib/duckdb/view_tool.py
execute_sync ¶
Execute a query against the DuckDB view.
Source code in src/loom/contrib/duckdb/view_tool.py
DuckDB Vector Tool¶
Semantic similarity search via DuckDB embeddings, exposed as an LLM tool.
vector_tool ¶
DuckDB vector similarity search tool for LLM function-calling.
Uses embedding vectors stored in DuckDB to find semantically similar
records. Query text is embedded via Ollama at search time, then
compared against stored vectors using DuckDB's list_cosine_similarity.
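list_cosine_similarity computes ordinary cosine similarity between two vectors; the same quantity in pure Python:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```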
Example knowledge_silos config::
knowledge_silos:
- name: "similar_items"
type: "tool"
provider: "loom.contrib.duckdb.DuckDBVectorTool"
config:
db_path: "/tmp/workspace/data.duckdb"
table_name: "documents"
result_columns: ["id", "title", "summary", "created_at"]
embedding_column: "embedding"
tool_name: "find_similar"
description: "Find records semantically similar to a query"
embedding_model: "nomic-embed-text"
See Also
loom.worker.embeddings -- OllamaEmbeddingProvider
loom.worker.tools -- SyncToolProvider base class
DuckDBVectorTool ¶
DuckDBVectorTool(db_path: str, table_name: str = 'documents', result_columns: list[str] | None = None, embedding_column: str = 'embedding', tool_name: str = 'find_similar', description: str = 'Find semantically similar records', embedding_model: str = 'nomic-embed-text', ollama_url: str | None = None, max_results: int = 10)
Bases: SyncToolProvider
Semantic similarity search over DuckDB vector embeddings.
Generates a query embedding via Ollama, then uses DuckDB's
list_cosine_similarity function to find the most similar
records by their stored embedding vectors.
Only records with non-null embeddings are searched.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| db_path | str | Path to the DuckDB database file. | required |
| table_name | str | Table containing the records and embeddings. | 'documents' |
| result_columns | list[str] \| None | Columns to include in results. If None, introspects the table schema at first use, excluding the embedding column. | None |
| embedding_column | str | Name of the column storing embedding vectors. | 'embedding' |
| tool_name | str | Name exposed in the LLM tool definition. | 'find_similar' |
| description | str | Description exposed in the LLM tool definition. | 'Find semantically similar records' |
| embedding_model | str | Ollama model name for embedding generation. | 'nomic-embed-text' |
| ollama_url | str \| None | Optional custom Ollama server URL. | None |
| max_results | int | Hard cap on returned results. | 10 |
Source code in src/loom/contrib/duckdb/vector_tool.py
result_columns property ¶
Return result columns, introspecting on first access if needed.
get_definition ¶
Return tool definition for LLM function-calling.
Source code in src/loom/contrib/duckdb/vector_tool.py
execute_sync ¶
Embed the query and search for similar records.
Source code in src/loom/contrib/duckdb/vector_tool.py
LanceDB Vector Store¶
ANN vector storage and search via LanceDB. Faster than DuckDB for large
datasets. Implements the VectorStore ABC.
store ¶
LanceDB-backed vector store for embedded text chunks.
Stores EmbeddedChunk records in a LanceDB table with native vector columns. Supports:
- Batch insertion of TextChunk objects (with embedding generation)
- Pre-embedded chunk insertion
- Approximate Nearest Neighbor (ANN) similarity search
- Metadata filtering (e.g. by channel_id)
- Basic CRUD (get, delete by chunk_id)
Uses Loom's OllamaEmbeddingProvider for query embedding generation.
LanceDB provides ANN indexing for faster search over large datasets compared to exact cosine similarity in DuckDB.
LanceDBVectorStore ¶
LanceDBVectorStore(db_path: str = '/tmp/rag-vectors.lance', embedding_model: str = 'nomic-embed-text', ollama_url: str = 'http://localhost:11434')
Bases: VectorStore
Embedded vector store backed by LanceDB.
Usage::
store = LanceDBVectorStore("/tmp/rag-vectors.lance")
store.initialize()
# Embed and store chunks
store.add_chunks(chunks)
# Search
results = store.search("earthquake damage", limit=5)
store.close()
Source code in src/loom/contrib/lancedb/store.py
initialize ¶
Open or create the LanceDB database and table.
Source code in src/loom/contrib/lancedb/store.py
close ¶
add_chunks ¶
Embed and insert TextChunk objects. Returns count of inserted rows.
Source code in src/loom/contrib/lancedb/store.py
add_embedded_chunks ¶
Insert pre-embedded chunks (no embedding generation needed).
Source code in src/loom/contrib/lancedb/store.py
search ¶
search(query: str, limit: int = 10, min_score: float = 0.0, channel_ids: list[int] | None = None) -> list[SimilarityResult]
Semantic similarity search using LanceDB ANN.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query | str | Natural language query (embedded via Ollama). | required |
| limit | int | Maximum results to return. | 10 |
| min_score | float | Minimum cosine similarity threshold. | 0.0 |
| channel_ids | list[int] \| None | Optional filter by source channel. | None |
Returns:
| Type | Description |
|---|---|
| list[SimilarityResult] | List of SimilarityResult sorted by descending similarity. |
Source code in src/loom/contrib/lancedb/store.py
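Conceptually, min_score and the descending sort amount to the following post-processing (a sketch only; LanceDB applies this during the ANN query itself):

```python
def rank_results(scored, min_score=0.0, limit=10):
    """Keep results at or above min_score, sorted by descending similarity.

    `scored` is a list of (chunk_id, similarity) pairs.
    """
    kept = [r for r in scored if r[1] >= min_score]
    kept.sort(key=lambda r: r[1], reverse=True)
    return kept[:limit]
```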
count ¶
get ¶
Retrieve a single embedded chunk by ID.
Source code in src/loom/contrib/lancedb/store.py
delete ¶
Delete a chunk by ID. Returns True if a row was deleted.
Source code in src/loom/contrib/lancedb/store.py
delete_by_source ¶
Delete all chunks for a given source post. Returns count.
Source code in src/loom/contrib/lancedb/store.py
stats ¶
Return summary statistics about the store.
Source code in src/loom/contrib/lancedb/store.py
LanceDB Vector Tool¶
Semantic similarity search via LanceDB, exposed as an LLM tool.
tool ¶
LanceDB vector similarity search tool for LLM function-calling.
Uses embedding vectors stored in LanceDB to find semantically similar records. Query text is embedded via Ollama at search time, then compared against stored vectors using LanceDB's ANN search.
Example knowledge_silos config::
knowledge_silos:
- name: "similar_items"
type: "tool"
provider: "loom.contrib.lancedb.LanceDBVectorTool"
config:
db_path: "/tmp/workspace/rag-vectors.lance"
table_name: "rag_chunks"
tool_name: "find_similar"
description: "Find records semantically similar to a query"
embedding_model: "nomic-embed-text"
See Also
loom.worker.embeddings -- OllamaEmbeddingProvider
loom.worker.tools -- SyncToolProvider base class
LanceDBVectorTool ¶
LanceDBVectorTool(db_path: str, table_name: str = 'rag_chunks', vector_column: str = 'vector', result_columns: list[str] | None = None, tool_name: str = 'find_similar', description: str = 'Find semantically similar records', embedding_model: str = 'nomic-embed-text', ollama_url: str | None = None, max_results: int = 10)
Bases: SyncToolProvider
Semantic similarity search over LanceDB vector embeddings.
Generates a query embedding via Ollama, then uses LanceDB's ANN search to find the most similar records by their stored vectors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| db_path | str | Path to the LanceDB database directory. | required |
| table_name | str | Table containing the records and embeddings. | 'rag_chunks' |
| vector_column | str | Name of the column storing embedding vectors. | 'vector' |
| result_columns | list[str] \| None | Columns to include in results. If None, returns chunk_id, text, source_channel_id, source_global_id. | None |
| tool_name | str | Name exposed in the LLM tool definition. | 'find_similar' |
| description | str | Description exposed in the LLM tool definition. | 'Find semantically similar records' |
| embedding_model | str | Ollama model name for embedding generation. | 'nomic-embed-text' |
| ollama_url | str \| None | Optional custom Ollama server URL. | None |
| max_results | int | Hard cap on returned results. | 10 |
Source code in src/loom/contrib/lancedb/tool.py
get_definition ¶
Return tool definition for LLM function-calling.
Source code in src/loom/contrib/lancedb/tool.py
execute_sync ¶
Embed the query and search for similar records.
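The execute flow reduces to embed-then-search; a stubbed sketch of that shape (the embed and search callables stand in for the Ollama and LanceDB calls the real tool makes):

```python
def find_similar(query: str, embed, search, max_results: int = 10):
    """Embed the query text, then delegate to an ANN search over stored vectors."""
    vector = embed(query)  # Ollama embedding call in the real tool
    return search(vector, limit=max_results)


# Stubbed collaborators for illustration only:
def fake_embed(text):
    return [float(len(text))]


def fake_search(vec, limit):
    return [{"chunk_id": "c1", "score": 0.9}][:limit]
```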