Technical Overview
Core Technologies
- Language Model: Claude Sonnet for all text generation and responses
- Vector Database: PostgreSQL 16.10 with pgvector extension for semantic search
- Embeddings: SentenceTransformer all-MiniLM-L6-v2 (384 dimensions) with cosine similarity
- Vision Analysis: Claude Vision API for automated image description and metadata extraction (all 2,114 images)
- Voice Generation: ElevenLabs voice cloning for natural speech interface
- Programming Paradigm: Natural Language Programming with Claude Code
- Backend: Python 3.8+ with Flask, psycopg2, and async processing
- Search Strategy: Multi-tier semantic search with contextual learning and weighted precedence
RAG Pipeline Architecture
Pipeline Steps in Detail
Step 1: Message Analysis
The system parses each incoming message to extract meaningful search terms:
- Stop word filtering (removes 'the', 'and', etc.)
- Meaningful word extraction (words > 2 characters)
- Important term identification (words > 4 characters)
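The three extraction rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the production implementation; in particular, `STOP_WORDS` here is a tiny stand-in for the real stop-word list.

```python
# Illustrative sketch of Step 1; STOP_WORDS is a small stand-in list.
STOP_WORDS = {"the", "and", "a", "an", "of", "to", "in", "what", "did"}

def analyze_message(message: str) -> dict:
    # Normalize: strip punctuation and lowercase each token.
    words = [w.strip(".,?!'\"").lower() for w in message.split()]
    # Meaningful words: not a stop word and longer than 2 characters.
    meaningful = [w for w in words if w not in STOP_WORDS and len(w) > 2]
    # Important terms: the subset longer than 4 characters.
    important = [w for w in meaningful if len(w) > 4]
    return {"meaningful": meaningful, "important": important}
```

For a query like "What did Dad say about the contract negotiations?", this yields `meaningful = ["dad", "say", "about", "contract", "negotiations"]` and `important = ["about", "contract", "negotiations"]`.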
Step 2: Multi-Tier Search Strategy
Three-tier approach for comprehensive retrieval:
- Tier 1: Full context search (top 10 meaningful words)
- Tier 2: Two-word combinations for precision
- Tier 3: Individual important terms
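The tier expansion can be sketched as follows; `build_search_tiers` is a hypothetical helper name, not the production API.

```python
from itertools import combinations

def build_search_tiers(meaningful: list, important: list) -> list:
    """Expand one analyzed query into the three search tiers (illustrative)."""
    return [
        [" ".join(meaningful[:10])],                              # Tier 1: top-10 full context
        [" ".join(pair) for pair in combinations(important, 2)],  # Tier 2: two-word combinations
        list(important),                                          # Tier 3: individual terms
    ]
```

Each tier's queries are then run against the vector store, with earlier tiers providing broad context and later tiers adding precision.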
Step 3: Evidence Extraction
Searches across multiple data sources simultaneously:
- Email database (semantic search)
- PDF documents (full-text search)
- Image database (AI vision + metadata)
- User feedback (exact match)
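Searching the four sources simultaneously maps naturally onto `asyncio.gather`. The per-source coroutines below are hypothetical stubs; the real ones would query PostgreSQL, pgvector, and the image store.

```python
import asyncio

# Hypothetical per-source search coroutines (stubs for illustration).
async def search_emails(q):   return [("email", q)]
async def search_pdfs(q):     return [("pdf", q)]
async def search_images(q):   return [("image", q)]
async def search_feedback(q): return [("feedback", q)]

async def gather_evidence(query: str) -> list:
    """Fan out to all four sources concurrently and flatten the hits."""
    per_source = await asyncio.gather(
        search_emails(query), search_pdfs(query),
        search_images(query), search_feedback(query),
    )
    return [hit for hits in per_source for hit in hits]
```

With real I/O-bound searches, the total latency is roughly that of the slowest source rather than the sum of all four.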
Step 4: Context Assembly
Builds comprehensive context with priority ordering:
- User corrections and verified facts
- Complete conversation history (no limits)
- Retrieved email evidence
- Domain-specific context (legal/estate)
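The priority ordering above amounts to a stable sort over source types. The type names in this sketch are illustrative assumptions.

```python
# Priority order from the list above (lower number = placed first).
PRIORITY = {"correction": 1, "history": 2, "email": 3, "domain": 4}

def assemble_context(blocks: list) -> list:
    """Sort context blocks so higher-priority sources come first;
    unknown types sink to the end."""
    return sorted(blocks, key=lambda b: PRIORITY.get(b["type"], 99))
```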
Step 5: Response Generation
The model generates a response while preserving Dad's personality:
- Dad's communication style template
- Context-aware response generation
- Source attribution for transparency
- Feedback integration for improvement
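Assembling those pieces into a single generation prompt might look like this; `build_prompt` and its field names are hypothetical, not the production template.

```python
def build_prompt(style_template: str, context_blocks: list, question: str) -> str:
    """Combine the style template, attributed evidence, and the user's
    question into one prompt string (illustrative sketch)."""
    sources = "\n".join(f"- [{b['source']}] {b['text']}" for b in context_blocks)
    return (
        f"{style_template}\n\n"
        f"Context (attribute sources in the answer):\n{sources}\n\n"
        f"Question: {question}"
    )
```

Keeping the source tag on every evidence line is what lets the model attribute its claims back to specific emails or documents.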
PostgreSQL Database Architecture
🗄️ Production Database Schema
AI-Dad uses PostgreSQL 16.10 with the pgvector extension to store and search 58,547+ documents with 384-dimensional semantic vectors.
Key Database Features
- ✅ pgvector Extension: Native vector similarity search with cosine distance operator (<=>)
- ✅ IVFFlat Index: Inverted file flat index for fast approximate nearest neighbor search
- ✅ JSONB Metadata: Flexible schema for storing email headers, image analysis, Q&A context
- ✅ Concurrent-Safe: MVCC architecture allows parallel builds and real-time queries
- ✅ Contextual Feedback: search_sessions table tracks user feedback per query context
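A minimal illustration of how such a schema and query might look; the table and column names here are assumptions for illustration, not the production schema.

```sql
-- Illustrative schema sketch (names are assumptions).
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    content   text NOT NULL,
    metadata  jsonb,          -- email headers, image analysis, Q&A context
    embedding vector(384)     -- all-MiniLM-L6-v2 output
);

-- IVFFlat index for fast approximate nearest-neighbor search.
CREATE INDEX ON documents
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- Top-10 semantic matches; <=> is pgvector's cosine distance operator.
SELECT id, content, 1 - (embedding <=> '[0.12, -0.45, ...]') AS similarity
FROM documents
ORDER BY embedding <=> '[0.12, -0.45, ...]'
LIMIT 10;
```

Because `<=>` returns cosine *distance*, similarity is recovered as `1 - distance`, and ordering ascending by distance yields the most similar documents first.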
Vectorization & Semantic Search
🎯 How Text Becomes Searchable Vectors
Every piece of text (emails, documents, Q&A pairs) is transformed into a 384-dimensional mathematical representation that captures semantic meaning using the SentenceTransformer model.
The Vectorization Pipeline
"Dad always said to read the fine print in contracts"
[Dad] [always] [said] [read] [fine] [print] [contracts]
all-MiniLM-L6-v2
Transformer Neural Network
[0.12, -0.45, 0.78, ..., 0.23]
Semantic fingerprint
How Vector Search Works
User Query:
"What did Dad say about contract negotiations?"
Semantic Matching Process:
- Query converted to vector
- Cosine similarity calculated against all stored vectors
- Nearest neighbors retrieved (highest similarity)
- Results ranked by relevance + user annotation boost
Why Vectors Are Powerful
Example: Semantic Similarity in Action
When you search for "contract negotiation", the system finds related content even with different wording:
- ✅ "agreement discussions" (0.89 similarity)
- ✅ "deal terms" (0.85 similarity)
- ✅ "legal negotiations" (0.92 similarity)
- ✅ "bargaining position" (0.81 similarity)
The numbers are cosine similarity scores, where 1.0 means identical meaning; cosine similarity ranges from -1 to 1 in general, but scores for these embeddings fall almost entirely between 0 and 1.
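For reference, cosine similarity can be computed directly from two vectors; pgvector's `<=>` operator returns the cosine *distance*, which is 1 minus this value.

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Dot product of a and b divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Identical directions score 1.0, orthogonal (unrelated) vectors score 0.0.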
Socratic RAG Optimization - The "Big-R"
🎓 What Makes It "Socratic"?
The system learns through dialogue analysis, extracting wisdom not just from what was said, but how it was communicated across 49,222 email conversations and 4,838 Q&A pairs.
Implementation Details
Multi-Vector Indexing Strategy
| Vector Type | Purpose | Weight |
|---|---|---|
| Full Thread | Complete conversational context | 1.0 |
| Question Vectors | Query matching | 1.3 |
| Answer Vectors | Response retrieval | 1.5 |
| Combined Q&A | Semantic relationship | 1.2 |
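Combining hits across the four vector types reduces to a weighted sum using the table above; the `weighted_score` helper is an illustrative sketch.

```python
# Weights from the multi-vector indexing table above.
VECTOR_WEIGHTS = {"full_thread": 1.0, "question": 1.3, "answer": 1.5, "combined_qa": 1.2}

def weighted_score(hits: list) -> float:
    """Sum weighted similarities for one document. A document matched by
    several vector types accumulates score from each, so multi-vector
    hits naturally rank higher."""
    return sum(sim * VECTOR_WEIGHTS[vtype] for vtype, sim in hits)
```

For example, a document matched on both its question vector (0.9) and answer vector (0.8) scores 0.9 × 1.3 + 0.8 × 1.5 = 2.37.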
Thread Preservation Benefits
- ✅ Maintains temporal context and conversation flow
- ✅ Preserves Dad's communication patterns and style
- ✅ Enables understanding of evolving topics
- ✅ Captures nuanced legal advice in context
- ✅ Retains emotional tone and personal touches
Learning & Optimization Strategy
Three-Phase Learning Algorithm
Phase 1: Validation-Based Optimization
Phase 2: User Feedback Integration
- Track user reactions (👍/👎) to responses
- Store corrections for future reference
- Increase weights for validated sources
- Decrease weights for contradicting sources
Phase 3: Advanced Pattern Recognition
- Intersection Ranking: Higher scores for multi-vector hits
- Thread Coherence: Complete threads weighted over fragments
- Temporal Proximity: Recent interactions prioritized
- Legal Patterns: Domain-specific boost for legal queries
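The Phase 2 feedback loop can be sketched as a momentum-based weight update, mirroring the `learning_rate` and `momentum` values in the configuration that follows; the class and method names here are illustrative assumptions.

```python
class SourceWeightOptimizer:
    """Illustrative momentum update for per-source retrieval weights."""

    def __init__(self, learning_rate: float = 0.01, momentum: float = 0.9):
        self.lr = learning_rate
        self.momentum = momentum
        self.velocity = {}  # per-source running update

    def feedback(self, source: str, weight: float, thumbs_up: bool) -> float:
        grad = 1.0 if thumbs_up else -1.0  # 👍 raises the weight, 👎 lowers it
        v = self.momentum * self.velocity.get(source, 0.0) + self.lr * grad
        self.velocity[source] = v
        return max(0.1, weight + v)        # floor keeps every source reachable
```

Momentum makes repeated consistent feedback accelerate the adjustment, while one-off reactions only nudge the weight slightly.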
Configuration Management
Dynamic Weight Configuration
```json
{
  "precedence": {
    "dad_emails": {
      "weight": 1.5,
      "priority": 1,
      "description": "Primary source - highest authenticity"
    },
    "qa_pairs": {
      "weight": 1.3,
      "priority": 2,
      "description": "Validated conversation patterns"
    },
    "pdf_documents": {
      "weight": 1.0,
      "priority": 3,
      "description": "Legal and financial documents"
    }
  },
  "learning_parameters": {
    "learning_rate": 0.01,
    "momentum": 0.9,
    "optimization_metric": "retrieval_accuracy"
  }
}
```
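Applying the precedence weights at query time might look like this; the `boost` helper is hypothetical, and the inlined config is a trimmed copy of the structure above.

```python
import json

# Trimmed copy of the precedence section above.
CONFIG = json.loads("""
{"precedence": {"dad_emails":    {"weight": 1.5},
                "qa_pairs":      {"weight": 1.3},
                "pdf_documents": {"weight": 1.0}}}
""")

def boost(source: str, similarity: float) -> float:
    """Scale a raw similarity score by its source's configured weight."""
    return similarity * CONFIG["precedence"][source]["weight"]
```

A 0.8-similarity email hit thus outranks a 0.8-similarity PDF hit (1.2 vs. 0.8 after boosting).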
System Metrics & Performance
Data Source Priority Matrix
| Priority | Data Source | Count | Weight | Search Method | Use Case |
|---|---|---|---|---|---|
| 1 | Context Files | 5 | 4.00 | Domain-Specific | Estate, Family, Professional |
| 2 | User Corrections | 113 | 3.00 | Exact Match | Verified Facts |
| 3 | User Annotations | Growing | 2.50 | Image Description | User-Provided Context |
| 4 | Chat History | 283 | N/A | Complete History | Conversation Context |
| 5 | Email Database | 49,223 | 2.20 | Semantic (pgvector) | Primary Evidence |
| 6 | Vision AI Images | 2,114 | 1.86 | Claude Vision Analysis | Visual Memories |
| 7 | Q&A Pairs | 4,838 | 1.70 | Semantic (pgvector) | Socratic Patterns |
| 8 | Attachments | 521 | 1.25 | Full-Text (PostgreSQL) | Legal/Financial Docs |
| 9 | Referenced URLs | 1,450 | 0.85 | Content Extraction | External Resources |
| 10 | AI Responses | 143 | 0.50 | Previous AI Output | Response History |
Performance Optimizations
- ✅ PostgreSQL pgvector cosine similarity with optimized indexes
- ✅ Intelligent search multipliers (n_results * 2 for diversity)
- ✅ Sub-second queries: 0.8-0.9s typical response time
- ✅ Async processing for parallel searches across content types
- ✅ Complete conversation history (no limits) with context-aware ranking
- ✅ Real-time file watcher for incremental updates
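The `n_results * 2` multiplier pairs naturally with type-aware interleaving: fetch twice as many candidates as needed, then round-robin across content types so one source cannot crowd out the others. This sketch is illustrative, not the production ranker.

```python
from collections import defaultdict
from itertools import chain, zip_longest

def diverse_top_k(candidates: list, k: int) -> list:
    """candidates: (content_type, score) pairs pre-sorted by score,
    typically fetched with limit 2*k. Returns k results with content
    types interleaved for diversity."""
    by_type = defaultdict(list)
    for item in candidates:
        by_type[item[0]].append(item)
    # Round-robin across types, dropping the padding from short lists.
    interleaved = chain.from_iterable(zip_longest(*by_type.values()))
    return [item for item in interleaved if item is not None][:k]
```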
Enhanced Features (October 2025)
- Contextual Learning: Query-context-specific feedback system that learns what's relevant for each type of question (e.g., Helen photos vs. legal documents) without affecting unrelated queries
- Vision AI Integration: Claude Vision API automatically analyzes all 2,114 images, generating searchable descriptions, metadata, and semantic embeddings
- Smart Image Search: Intelligent filename parsing with user annotation priority (+30 score boost) and context-aware ranking
- Annotation Interface: In-chat image annotation with instant PostgreSQL updates and delete/insert strategy for seamless editing
- Review Query Detection: Automatically prioritizes full email content (1.5x boost) over annotations when user asks to "review" or "summarize" documents
- Adaptive Search: Dynamic content-type balancing ensures diverse results (emails, images, Q&A, PDFs) in every response
Future Enhancements
Planned Improvements
- Automated A/B Testing: Continuous weight optimization through experimentation
- Conversational Templates: Pattern mining for common query types
- Temporal Weighting: Dynamic adjustment based on recency and relevance
- Picture Similarity Searching: Advanced image search using visual similarity algorithms
- Multi-User Support: Personalized interactions for different family members
- Voice Integration: Natural speech interface using ElevenLabs voice cloning (work in progress - aging my voice clone towards my father's voice)
- Open Framework Release: After refining the system, will make the general framework available for others to preserve their own family legacies