Building a personal knowledge base from AI chats solves a frustrating problem: losing valuable answers to ephemeral conversations. Most search tools rely on keywords, missing the context behind phrases like "blood thinner" or "containerization technology." A new approach combines vector embeddings with PostgreSQL to create semantic search that understands meaning. Here’s how one developer built ChatScroll using Amazon Aurora PostgreSQL with pgvector, ltree, and tsvector—plus Next.js—for a scalable, searchable AI knowledge base.
The frustration behind unsearchable AI conversations
AI assistants deliver precise answers, but those insights often vanish into chat histories that resist recall. Users repeatedly rephrase queries hoping to rediscover past guidance, only to find the same results buried or irrelevant. The core issue isn’t the answers—it’s the lack of structured storage and semantic search. Standard keyword searches fail when context matters more than exact wording, leaving users to manually sift through endless chat logs.
Designing a knowledge system that remembers what words cannot
The solution transforms transient AI responses into persistent, categorized knowledge units called "Scrolls." Instead of relying on exact text matches, the system encodes meaning using 3072-dimensional vector embeddings. When a user saves a Scroll, the answer text is processed by an embedding model, converting semantic content into a numerical representation. This vector is stored alongside the original content in Amazon Aurora PostgreSQL, enabling searches that match intent rather than literal terms.
How pgvector powers semantic understanding in PostgreSQL
Amazon Aurora PostgreSQL with the pgvector extension handles both structured data and vector search efficiently. During content ingestion:
- The AI answer is sent to Google’s gemini-embedding-001 model
- The model returns a 3072-dimension vector embedding
- The vector is stored in Aurora alongside the Scroll’s text and metadata
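The storage step above can be sketched in TypeScript as follows. This is a minimal illustration, not ChatScroll's actual code: the `title`, `content`, and `id` columns and both function names are assumptions, and the embedding is assumed to have already been fetched from the model. pgvector accepts vectors in a bracketed text form such as `[0.1,0.2,0.3]`, which is what the helper produces.

```typescript
// Dimension the article reports for gemini-embedding-001 output.
const EMBEDDING_DIM = 3072;

// Serialize a number array into pgvector's text input format: "[0.1,0.2,...]".
function toPgVectorLiteral(
  embedding: number[],
  expectedDim: number = EMBEDDING_DIM
): string {
  if (embedding.length !== expectedDim) {
    throw new Error(`expected ${expectedDim} dimensions, got ${embedding.length}`);
  }
  if (!embedding.every((x) => Number.isFinite(x))) {
    throw new Error("embedding contains a non-finite value");
  }
  return `[${embedding.join(",")}]`;
}

// Build a parameterized INSERT for a Scroll.
// Column names other than "embedding" are hypothetical.
function buildInsertScroll(
  title: string,
  content: string,
  embedding: number[]
): { text: string; values: unknown[] } {
  return {
    text:
      "INSERT INTO scrolls (title, content, embedding) " +
      "VALUES ($1, $2, $3::vector) RETURNING id",
    values: [title, content, toPgVectorLiteral(embedding)],
  };
}
```

The returned `text` and `values` pair matches the shape most PostgreSQL clients for Node.js accept, so the embedding never has to be concatenated into the SQL string itself.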
When a user searches, the query undergoes the same transformation:
- The search phrase converts to a query vector
- Aurora compares the query vector against stored embeddings using cosine distance
- Results are ranked by semantic similarity, and matches below a similarity threshold are dropped
A sample SQL query for the semantic search step:
SELECT * FROM scrolls
WHERE 1 - (embedding <=> $queryVec) > 0.5
ORDER BY embedding <=> $queryVec
LIMIT 5;
This approach ensures that searching for "containerization" retrieves Docker-related Scrolls even if the word “Docker” never appears in the content.
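To make the threshold in that query concrete, here is a small TypeScript sketch of what pgvector's `<=>` operator computes (cosine distance) and how `1 - distance > 0.5` turns it into a similarity cutoff. The function names are illustrative only; in the real system the database does this arithmetic.

```typescript
// Cosine distance between two vectors: 1 - cos(angle between a and b).
// 0 means same direction, 1 means orthogonal, 2 means opposite.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine distance is undefined for a zero vector");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Mirrors the WHERE clause: keep a row only if similarity exceeds the cutoff.
function passesThreshold(
  query: number[],
  stored: number[],
  minSimilarity: number = 0.5
): boolean {
  return 1 - cosineDistance(query, stored) > minSimilarity;
}
```

Because the measure depends only on direction, two texts with very different lengths can still score as close matches if their embeddings point the same way.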
Combining three PostgreSQL features for full control
Aurora’s flexibility comes from three PostgreSQL features working in concert (pgvector and ltree are extensions, while tsvector is a built-in full-text search type):
- pgvector stores high-dimensional embeddings and computes similarity via cosine distance
- ltree organizes folders as hierarchical paths using dot notation, like development.tools.docker, enabling fast subtree queries without recursive CTEs
- tsvector delivers full-text search with ranking via ts_rank, which can be combined with vector similarity for hybrid queries
Together, these tools enable precise categorization, fast retrieval, and contextual ranking—critical for a personal knowledge base.
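As a rough sketch of the ltree convention described above, the TypeScript below turns folder names into a dot-separated path and builds a subtree query with ltree's `<@` (is-descendant-of) operator. The `folder_path` column, the `id` and `title` columns, and the sanitizing rule are assumptions for illustration; labels are restricted here to lowercase letters, digits, and underscores, which ltree accepts.

```typescript
// Turn one folder name into a single ltree label.
function toLtreeLabel(name: string): string {
  const label = name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_") // collapse anything else into one underscore
    .replace(/^_+|_+$/g, "");    // strip leading/trailing underscores
  if (label.length === 0) {
    throw new Error(`folder name "${name}" has no usable characters`);
  }
  return label;
}

// Join folder names into a dot-notation path such as "development.tools.docker".
function toLtreePath(folders: string[]): string {
  if (folders.length === 0) {
    throw new Error("at least one folder is required");
  }
  return folders.map(toLtreeLabel).join(".");
}

// Select every Scroll stored at or below the given folder.
// "folder_path" is a hypothetical ltree column on the scrolls table.
function buildSubtreeQuery(folders: string[]): { text: string; values: unknown[] } {
  return {
    text: "SELECT id, title FROM scrolls WHERE folder_path <@ $1::ltree",
    values: [toLtreePath(folders)],
  };
}
```

For example, `buildSubtreeQuery(["Development", "Tools"])` would match Scrolls filed under development.tools and under development.tools.docker alike.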
A dual-database strategy for performance and scale
To balance query complexity with chat volume, the architecture separates workloads across two AWS databases:
- Amazon Aurora PostgreSQL hosts Scrolls, folder hierarchies, user profiles, and embeddings
- Amazon DynamoDB stores real-time chat messages with a time-based partition key and 90-day TTL for auto-expiry
Aurora handles structured queries and semantic search, while DynamoDB efficiently streams high-frequency chat events using on-demand billing. This separation prevents Aurora from becoming a performance bottleneck during peak usage.
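The 90-day expiry can be sketched as below. DynamoDB's TTL feature expects a Number attribute holding a Unix timestamp in seconds; everything else here is an assumption, including the attribute names (`dayBucket`, `sentAtMs`, `expiresAt`) and the choice of a UTC day as the time-based partition key, since the article does not spell out the exact key design.

```typescript
const CHAT_TTL_DAYS = 90;
const SECONDS_PER_DAY = 24 * 60 * 60;

// Hypothetical item shape for one chat message.
interface ChatMessageItem {
  dayBucket: string; // partition key: UTC day, e.g. "2024-01-01"
  sentAtMs: number;  // sort key: send time in epoch milliseconds
  text: string;
  expiresAt: number; // TTL attribute: epoch SECONDS, as DynamoDB requires
}

// Build the item, stamping an expiry 90 days after the send time.
function buildChatMessageItem(text: string, sentAt: Date): ChatMessageItem {
  const sentAtMs = sentAt.getTime();
  return {
    dayBucket: sentAt.toISOString().slice(0, 10),
    sentAtMs,
    text,
    expiresAt: Math.floor(sentAtMs / 1000) + CHAT_TTL_DAYS * SECONDS_PER_DAY,
  };
}
```

TTL deletion in DynamoDB runs in the background rather than at the exact expiry second, so readers that must never see stale messages would still need to filter on `expiresAt`.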
Testing semantic search in action
Live examples demonstrate the system’s effectiveness:
- Searching "blood thinner" surfaces a saved Scroll about warfarin, despite the absence of the exact phrase
- Queries like "containerization" return Docker-related content, filtered to relevant categories
- Hybrid search combines keyword relevance with semantic context for higher precision
Folder-scoped searches add another layer of accuracy, ensuring results stay within the user’s intended domain.
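One way a folder-scoped hybrid query might be assembled is sketched below. The article does not show how ChatScroll blends the two signals, so the weighted sum, the 0.7 default weight, and the `search_tsv`, `folder_path`, `id`, and `title` column names are all assumptions. Note that `ts_rank` values are not on the same scale as cosine similarity, so any weights would need tuning against real data.

```typescript
interface HybridSearchOptions {
  vectorWeight?: number; // share of the score given to vector similarity (0..1)
  limit?: number;
}

// Build a parameterized query that blends cosine similarity with ts_rank
// and restricts results to one folder subtree.
function buildHybridSearch(
  queryText: string,
  queryVector: number[],
  folderPath: string,
  options: HybridSearchOptions = {}
): { text: string; values: unknown[] } {
  const vectorWeight = options.vectorWeight ?? 0.7;
  const limit = options.limit ?? 5;
  if (vectorWeight < 0 || vectorWeight > 1) {
    throw new Error("vectorWeight must be between 0 and 1");
  }
  const keywordWeight = 1 - vectorWeight;

  const text = [
    "SELECT id, title,",
    "  $1::float8 * (1 - (embedding <=> $2::vector))",
    "  + $3::float8 * ts_rank(search_tsv, plainto_tsquery('english', $4)) AS score",
    "FROM scrolls",
    "WHERE folder_path <@ $5::ltree",
    "ORDER BY score DESC",
    "LIMIT $6::int",
  ].join("\n");

  return {
    text,
    values: [
      vectorWeight,
      `[${queryVector.join(",")}]`, // pgvector text format
      keywordWeight,
      queryText,
      folderPath,
      limit,
    ],
  };
}
```

A call such as `buildHybridSearch("containerization", queryVec, "development.tools")` would rank only Scrolls under that folder, with the vector score dominating unless `vectorWeight` is lowered.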
What’s next for AI-powered knowledge systems
As vector databases and embedding models evolve, personal knowledge platforms will move beyond keyword matching toward true contextual understanding. Developers can replicate this pattern using open-source tools or managed services, blending relational structure with vector search. The future points toward AI systems that don’t just answer questions—they remember, organize, and surface insights precisely when needed.
For those exploring semantic search, Aurora PostgreSQL with pgvector offers a robust foundation—one that transforms fleeting AI chats into lasting, searchable knowledge.