Vector Databases

🧬 pgvector (PostgreSQL Extension)

pgvector is a PostgreSQL extension for storing and querying high-dimensional vector embeddings inside the database. It supports similarity search by L2 distance, cosine distance, or inner product, which makes it useful for AI-powered queries and hybrid search.


🔧 Purpose

To support vector similarity search directly within PostgreSQL, allowing you to compare embeddings using standard SQL queries alongside your relational data.
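As a rough illustration of what these comparisons compute, the sketch below reimplements the three measures in plain Python. In pgvector they correspond to the operators `<->` (L2 distance), `<=>` (cosine distance), and `<#>` (negative inner product). The function names are illustrative only and are not part of pgvector.

```python
import math
from typing import Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, the quantity pgvector's <-> operator returns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def negative_inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """pgvector's <#> returns the inner product negated, so smaller is closer."""
    return -inner_product(a, b)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity, the quantity pgvector's <=> operator returns.

    Undefined for zero-length vectors (this sketch raises ZeroDivisionError).
    """
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)
```

All three operators are defined so that a smaller value means a closer match, which is why queries sort in ascending order.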


βš™οΈ Installation & Activation

pgvector is installed from source inside the custom PostGIS-based PostgreSQL container.

To enable it within any PostgreSQL database (e.g., client_data), run:

CREATE EXTENSION IF NOT EXISTS vector;

🧠 Use Cases

  • Comparing vector embeddings of documents or spatial entities
  • Hybrid filtering: combining vector search with SQL filters
  • Lightweight RAG and retrieval pipelines without needing external vector DBs
  • Use in conjunction with PostGIS for spatial + semantic search

📦 Integration Details

  • Extension Name: vector
  • Database Platform: PostgreSQL 15 + PostGIS
  • Location: Installed in the custom db-pg Docker container
  • Activated for: client_data and any additional per-tenant databases via init_extensions.sql

πŸ“ Example Query

-- Find the 5 items closest to a given vector.
-- <-> returns L2 (Euclidean) distance, so smaller values mean more similar.
SELECT id, embedding <-> '[0.1, 0.2, 0.3]' AS distance
FROM embeddings
ORDER BY distance
LIMIT 5;
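The example query can be extended into the hybrid filtering described under Use Cases by adding an ordinary WHERE clause. The sketch below only builds the parameterized SQL and its parameters; it assumes the `embeddings` table from the example, a hypothetical `source` column, and `%s` placeholders as used by psycopg-style drivers. The helper names are made up for illustration.

```python
from typing import Sequence, Tuple


def to_vector_literal(values: Sequence[float]) -> str:
    """Render a sequence in pgvector's text format, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def build_hybrid_query(
    query_vector: Sequence[float], source: str, limit: int = 5
) -> Tuple[str, tuple]:
    """Return (sql, params) for a vector search restricted by a SQL filter.

    Assumes a hypothetical `source` column on the `embeddings` table.
    """
    sql = (
        "SELECT id, embedding <-> %s::vector AS distance "
        "FROM embeddings "
        "WHERE source = %s "      # relational filter
        "ORDER BY distance "      # nearest vectors first (L2 distance)
        "LIMIT %s"
    )
    return sql, (to_vector_literal(query_vector), source, limit)
```

With a driver such as psycopg, the result would typically be passed as `cursor.execute(sql, params)`, which keeps the vector and the filter value out of the SQL string itself.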

🧠 Qdrant (Vector Search Engine)

Qdrant is an open-source vector similarity search engine designed for AI applications. It provides fast, scalable retrieval of high-dimensional vector embeddings and supports use cases such as semantic search, RAG pipelines, and metadata-filtered search.


🎯 Purpose

To serve as the platform's core vector database for storing and querying embeddings related to documents, tables, assets, and more.


βš™οΈ Setup & Configuration

  • Docker Service Name: citymap-qdrant
  • Host: localhost
  • Port: 6333 (REST API)

Deployed as a Docker container connected to the shared citymap-network.

Important: When accessing Qdrant from other containers within the same Docker network, use the container name citymap-qdrant as the hostname. Example:

from qdrant_client import QdrantClient
client = QdrantClient(host="citymap-qdrant", port=6333)

🧱 Storage Details

  • Persistence: Configured using Docker volumes to retain data between restarts.
  • Collections: Created dynamically via Qdrant API or SDKs.
  • Payload: Used to store metadata (e.g., table name, column examples, source).
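To make the collection and payload model concrete, the following sketch builds (but does not send) the REST requests for creating a collection and upserting one point with metadata. It uses only the standard library against the REST address listed above. The collection name, vector size, distance setting, and payload fields in the usage note are placeholders, not values taken from this deployment.

```python
import json
import urllib.request
from typing import Any, Dict, Sequence

QDRANT_URL = "http://localhost:6333"  # REST address from the setup section


def _json_request(method: str, path: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON request object; nothing is sent until urlopen() is called."""
    return urllib.request.Request(
        url=f"{QDRANT_URL}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def create_collection_request(
    name: str, vector_size: int, distance: str = "Cosine"
) -> urllib.request.Request:
    """PUT /collections/{name}: declare vector size and distance metric."""
    body = {"vectors": {"size": vector_size, "distance": distance}}
    return _json_request("PUT", f"/collections/{name}", body)


def upsert_point_request(
    collection: str, point_id: int, vector: Sequence[float], payload: Dict[str, Any]
) -> urllib.request.Request:
    """PUT /collections/{collection}/points: store one vector with its metadata."""
    body = {"points": [{"id": point_id, "vector": list(vector), "payload": payload}]}
    return _json_request("PUT", f"/collections/{collection}/points", body)
```

A request built this way, for example `create_collection_request("table_docs", 3)`, could then be sent with `urllib.request.urlopen(req)`. From another container, `QDRANT_URL` would use `citymap-qdrant` as the hostname instead of `localhost`.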

📡 Accessing Qdrant

You can interact with Qdrant using REST or gRPC:

  • REST: http://localhost:6333
  • Health check:
curl http://localhost:6333/readyz

🧠 Typical Use Cases

  • Vector search over embeddings from documents, tables, and assets
  • Contextual retrieval for LLM-powered applications
  • Hybrid search: combine vector similarity with metadata filters
  • Foundation for Retrieval-Augmented Generation (RAG) and recommendation engines
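As a sketch of the hybrid search case, the function below builds the JSON body for Qdrant's `POST /collections/{name}/points/search` REST route, combining a query vector with a payload filter. The `source` payload key follows the metadata example given under Storage Details; the function name and default values are assumptions made for illustration.

```python
from typing import Any, Dict, Sequence


def build_filtered_search(
    query_vector: Sequence[float], source: str, limit: int = 5
) -> Dict[str, Any]:
    """Request body for a vector search restricted to points whose
    payload field `source` equals the given value."""
    return {
        "vector": list(query_vector),
        "limit": limit,
        "with_payload": True,  # return stored metadata with each hit
        "filter": {
            "must": [
                {"key": "source", "match": {"value": source}},
            ]
        },
    }
```

Serialized with `json.dumps`, this body would be posted to the search route of an existing collection. Conditions listed under `must` all have to hold, so further metadata constraints can be appended to that list.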