AI-Dad Technical Architecture

Deep Dive into the RAG Implementation & Socratic Optimization

Technical Overview

  • 49,223 Total Emails
  • 58,547 Total Content Items
  • 4,838 Q&A Pairs
  • 384 Embedding Dimensions

Core Technologies

  • Language Model: Claude Sonnet for all text generation and responses
  • Vector Database: PostgreSQL 16.10 with pgvector extension for semantic search
  • Embeddings: SentenceTransformer all-MiniLM-L6-v2 (384 dimensions) with cosine similarity
  • Vision Analysis: Claude Vision API for automated image description and metadata extraction (all 2,114 images)
  • Voice Generation: ElevenLabs voice cloning for natural speech interface
  • Programming Paradigm: Natural Language Programming with Claude Code
  • Backend: Python 3.8+ with Flask, psycopg2, and async processing
  • Search Strategy: Multi-tier semantic search with contextual learning and weighted precedence

RAG Pipeline Architecture

User Input → Preprocessing → Multi-Search → Ranking → Generation

Pipeline Steps in Detail

Step 1: Message Analysis

The system performs semantic parsing to extract meaningful terms:

  • Stop word filtering (removes 'the', 'and', etc.)
  • Meaningful word extraction (words > 2 characters)
  • Important term identification (words > 4 characters)
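The three filters above can be sketched in a few lines. This is an illustrative reconstruction, not the actual AI-Dad code; the function name and the stop-word list are assumptions.

```python
# Minimal sketch of Step 1 term extraction (illustrative, not the production code).
STOP_WORDS = {"the", "and", "a", "an", "of", "to", "in", "what", "did", "say", "about"}

def analyze_message(message: str) -> dict:
    """Split a message into meaningful and important terms."""
    words = [w.strip(".,?!").lower() for w in message.split()]
    # Stop word filtering
    filtered = [w for w in words if w not in STOP_WORDS]
    # Meaningful words: longer than 2 characters
    meaningful = [w for w in filtered if len(w) > 2]
    # Important terms: longer than 4 characters
    important = [w for w in filtered if len(w) > 4]
    return {"meaningful": meaningful, "important": important}
```

For the query "What did Dad say about contract negotiations?", this yields `meaningful = ['dad', 'contract', 'negotiations']` and `important = ['contract', 'negotiations']`.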

Step 2: Multi-Tier Search Strategy

Three-tier approach for comprehensive retrieval:

  • Tier 1: Full context search (top 10 meaningful words)
  • Tier 2: Two-word combinations for precision
  • Tier 3: Individual important terms
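The three tiers can be expressed as a query-expansion step over the terms produced in Step 1. A sketch, with assumed names; the real tier construction may differ:

```python
from itertools import combinations

def build_search_tiers(meaningful: list, important: list) -> dict:
    """Expand extracted terms into the three search tiers (assumed structure)."""
    top = meaningful[:10]
    return {
        "tier1": [" ".join(top)],                               # full context query
        "tier2": [" ".join(p) for p in combinations(top, 2)],   # two-word combinations
        "tier3": list(important),                               # individual important terms
    }
```

Each tier's queries are then run against the vector store, with earlier tiers weighted more heavily in ranking.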

Step 3: Evidence Extraction

Searches across multiple data sources simultaneously:

  • Email database (semantic search)
  • PDF documents (full-text search)
  • Image database (AI vision + metadata)
  • User feedback (exact match)

Step 4: Context Assembly

Builds comprehensive context with priority ordering:

  1. User corrections and verified facts
  2. Complete conversation history (no limits)
  3. Retrieved email evidence
  4. Domain-specific context (legal/estate)
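The priority ordering above amounts to concatenating context blocks in a fixed sequence, so higher-priority material appears first in the prompt. A minimal sketch (section titles and function name are assumptions):

```python
def assemble_context(corrections, history, emails, domain_context):
    """Concatenate context blocks in priority order: corrections first, domain last."""
    sections = [
        ("VERIFIED FACTS", corrections),
        ("CONVERSATION HISTORY", history),
        ("EMAIL EVIDENCE", emails),
        ("DOMAIN CONTEXT", domain_context),
    ]
    parts = []
    for title, items in sections:
        if items:  # skip empty sections entirely
            parts.append(f"## {title}\n" + "\n".join(items))
    return "\n\n".join(parts)
```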

Step 5: Response Generation

AI model generates response with personality preservation:

  • Dad's communication style template
  • Context-aware response generation
  • Source attribution for transparency
  • Feedback integration for improvement

PostgreSQL Database Architecture

🗄️ Production Database Schema

AI-Dad uses PostgreSQL 16.10 with the pgvector extension to store and search 58,547+ documents with 384-dimensional semantic vectors.

-- Main documents table with pgvector integration
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    document_id VARCHAR(255) UNIQUE NOT NULL,
    content_type VARCHAR(50) NOT NULL,
    source_path TEXT,
    content TEXT NOT NULL,
    embedding vector(384),  -- pgvector extension
    metadata JSONB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Index for fast vector similarity search (cosine distance)
CREATE INDEX documents_embedding_idx
    ON documents USING ivfflat (embedding vector_cosine_ops);

-- Index for content type filtering
CREATE INDEX documents_content_type_idx ON documents(content_type);

-- Search sessions for contextual learning
CREATE TABLE search_sessions (
    session_id VARCHAR(255) PRIMARY KEY,
    username VARCHAR(100) NOT NULL,
    search_query TEXT,
    search_results JSONB,
    feedback_data JSONB,
    context_data JSONB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Key Database Features

  • pgvector Extension: Native vector similarity search with cosine distance operator (<=>)
  • IVFFlat Index: Inverted file flat index for fast approximate nearest neighbor search
  • JSONB Metadata: Flexible schema for storing email headers, image analysis, Q&A context
  • Concurrent-Safe: MVCC architecture allows parallel builds and real-time queries
  • Contextual Feedback: search_sessions table tracks user feedback per query context
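A similarity query against this schema uses the `<=>` cosine distance operator from pgvector. The sketch below builds the parameterized SQL that would be passed to psycopg2; the function name and LIMIT handling are illustrative, not the production query:

```python
def vector_search_sql(limit: int = 10) -> str:
    """Build a parameterized pgvector query using the cosine distance operator (<=>).

    Intended usage (assumed, requires a live database):
        cur.execute(vector_search_sql(10), (query_vec, "email", query_vec))
    """
    return (
        "SELECT document_id, content, "
        "1 - (embedding <=> %s::vector) AS similarity "  # cosine similarity from distance
        "FROM documents "
        "WHERE content_type = %s "
        "ORDER BY embedding <=> %s::vector "             # ivfflat index accelerates this
        f"LIMIT {int(limit)};"
    )
```

Ordering by `embedding <=> query` lets PostgreSQL use the IVFFlat index for approximate nearest-neighbor retrieval rather than scanning all 58,547 rows.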

Vectorization & Semantic Search

🎯 How Text Becomes Searchable Vectors

Every piece of text (emails, documents, Q&A pairs) is transformed into a 384-dimensional mathematical representation that captures semantic meaning using the SentenceTransformer model.

The Vectorization Pipeline

1. Original Text

"Dad always said to read the fine print in contracts"

2. Tokenization

[Dad] [always] [said] [read] [fine] [print] [contracts]

3. Embedding Model

all-MiniLM-L6-v2
Transformer Neural Network

4. 384-D Vector

[0.12, -0.45, 0.78, ..., 0.23]
Semantic fingerprint

How Vector Search Works

User Query:

"What did Dad say about contract negotiations?"

Semantic Matching Process:

  1. Query converted to vector
  2. Cosine similarity calculated against all stored vectors
  3. Nearest neighbors retrieved (highest similarity)
  4. Results ranked by relevance + user annotation boost
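The matching process above reduces to computing cosine similarity between the query vector and every stored vector, then taking the top hits. A self-contained sketch with toy 3-D vectors (production vectors are 384-D):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec, corpus, top_k=3):
    """Rank stored vectors by similarity to the query and return the top_k ids."""
    scored = sorted(corpus.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]
```

In production this brute-force loop is replaced by the pgvector IVFFlat index, but the ranking semantics are the same.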

Why Vectors Are Powerful

🎯
Semantic Understanding
🔍
Finds Related Concepts
Fast Similarity Search
🌐
Language Agnostic

Example: Semantic Similarity in Action

When you search for "contract negotiation", the system finds related content even with different wording:

  • ✅ "agreement discussions" (0.89 similarity)
  • ✅ "deal terms" (0.85 similarity)
  • ✅ "legal negotiations" (0.92 similarity)
  • ✅ "bargaining position" (0.81 similarity)

The numbers represent cosine similarity scores (0-1), where 1 is identical meaning.

Socratic RAG Optimization - The "Big-R"

🎓 What Makes It "Socratic"?

The system learns through dialogue analysis, extracting wisdom not just from what was said but from how it was communicated, across 49,223 email conversations and 4,838 Q&A pairs.

Implementation Details

# Socratic Extraction Process
def extract_socratic_patterns(email_thread):
    """
    Extract Q&A patterns from email conversations,
    preserving full context and communication style.
    """
    # Parse complete thread structure
    thread_context = parse_email_thread(email_thread)

    # Generate multiple vectors for comprehensive search
    vectors = {
        'full_thread': embed(thread_context.full_text),
        'questions': [embed(q) for q in thread_context.questions],
        'answers': [embed(a) for a in thread_context.answers],
        'combined': embed(thread_context.qa_summary)
    }
    return vectors

Multi-Vector Indexing Strategy

  Vector Type        Purpose                           Weight
  Full Thread        Complete conversational context   1.0
  Question Vectors   Query matching                    1.3
  Answer Vectors     Response retrieval                1.5
  Combined Q&A       Semantic relationship             1.2
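When a query matches several vector types for the same thread, the per-type similarities are combined using the weights above. A sketch of that scoring step (the combination rule is assumed; the actual system may also add intersection bonuses):

```python
# Weights from the multi-vector indexing table.
VECTOR_WEIGHTS = {"full_thread": 1.0, "questions": 1.3, "answers": 1.5, "combined": 1.2}

def weighted_score(hits: dict) -> float:
    """Combine per-vector-type cosine similarities for one document.

    `hits` maps vector type -> raw similarity, e.g. {"questions": 0.9, "answers": 0.8}.
    """
    return sum(VECTOR_WEIGHTS[vec_type] * sim for vec_type, sim in hits.items())
```

A document hit on both its question vector (0.9) and answer vector (0.8) scores 1.3 × 0.9 + 1.5 × 0.8 = 2.37, outranking a document that matched only one vector type.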

Thread Preservation Benefits

  • ✅ Maintains temporal context and conversation flow
  • ✅ Preserves Dad's communication patterns and style
  • ✅ Enables understanding of evolving topics
  • ✅ Captures nuanced legal advice in context
  • ✅ Retains emotional tone and personal touches

Learning & Optimization Strategy

Three-Phase Learning Algorithm

Phase 1: Validation-Based Optimization

# Use Q&A pairs as ground truth for optimization
def optimize_weights():
    success_metrics = []
    for qa_pair in socratic_pairs:
        # Search using just the question
        results = search_database(qa_pair.question)

        # Check retrieval accuracy
        if qa_pair.answer in results.top_5:
            success_metrics.append({
                'source': results.source_type,
                'rank': results.rank,
                'score': results.similarity_score
            })

    # Adjust weights based on performance
    update_source_weights(success_metrics)

Phase 2: User Feedback Integration

  • Track user reactions (👍/👎) to responses
  • Store corrections for future reference
  • Increase weights for validated sources
  • Decrease weights for contradicting sources
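The feedback loop above can be sketched as a bounded weight nudge. This is an assumption about the update rule; the bounds (0.1 to 4.0) mirror the extremes of the Data Source Priority Matrix below, and the step size matches the configured learning rate of 0.01:

```python
def update_weight(current: float, feedback: str, learning_rate: float = 0.01) -> float:
    """Nudge a source weight up on a thumbs-up, down on a thumbs-down.

    Weights are clamped to [0.1, 4.0] so one source can never dominate
    or vanish entirely (bounds assumed, not from the production config).
    """
    delta = learning_rate if feedback == "up" else -learning_rate
    return min(4.0, max(0.1, current + delta))
```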

Phase 3: Advanced Pattern Recognition

  • Intersection Ranking: Higher scores for multi-vector hits
  • Thread Coherence: Complete threads weighted over fragments
  • Temporal Proximity: Recent interactions prioritized
  • Legal Patterns: Domain-specific boost for legal queries

Configuration Management

Dynamic Weight Configuration

{
  "precedence": {
    "dad_emails": {
      "weight": 1.5,
      "priority": 1,
      "description": "Primary source - highest authenticity"
    },
    "qa_pairs": {
      "weight": 1.3,
      "priority": 2,
      "description": "Validated conversation patterns"
    },
    "pdf_documents": {
      "weight": 1.0,
      "priority": 3,
      "description": "Legal and financial documents"
    }
  },
  "learning_parameters": {
    "learning_rate": 0.01,
    "momentum": 0.9,
    "optimization_metric": "retrieval_accuracy"
  }
}
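Applying the precedence config at query time means multiplying each result's base similarity by its source weight before the final ranking. A sketch (the function name and result shape are assumptions; the config itself would be loaded with `json.load`):

```python
def apply_precedence(results: list, config: dict) -> list:
    """Scale each result's score by its source weight, then re-rank.

    `results` is a list of dicts like {"source": "dad_emails", "score": 0.7}.
    """
    precedence = config["precedence"]
    for r in results:
        r["score"] *= precedence[r["source"]]["weight"]
    return sorted(results, key=lambda r: r["score"], reverse=True)
```

With the weights above, a Dad email at 0.7 raw similarity (0.7 × 1.5 = 1.05) outranks a PDF at 0.9 (0.9 × 1.0 = 0.90), reflecting the "highest authenticity" priority.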

System Metrics & Performance

  • 49,223 Individual Emails
  • 521 Attachments
  • 2,114 Vision-Analyzed Images
  • 1,450 Referenced URLs

Data Source Priority Matrix

  Priority  Data Source       Count    Weight  Search Method            Use Case
  1         Context Files     5        4.00    Domain-Specific          Estate, Family, Professional
  2         User Corrections  113      3.00    Exact Match              Verified Facts
  3         User Annotations  Growing  2.50    Image Description        User-Provided Context
  4         Chat History      283      N/A     Complete History         Conversation Context
  5         Email Database    49,223   2.20    Semantic (pgvector)      Primary Evidence
  6         Vision AI Images  2,114    1.86    Claude Vision Analysis   Visual Memories
  7         Q&A Pairs         4,838    1.70    Semantic (pgvector)      Socratic Patterns
  8         Attachments       521      1.25    Full-Text (PostgreSQL)   Legal/Financial Docs
  9         Referenced URLs   1,450    0.85    Content Extraction       External Resources
  10        AI Responses      143      0.50    Previous AI Output       Response History

Performance Optimizations

  • ✅ PostgreSQL pgvector cosine similarity with optimized indexes
  • ✅ Intelligent search multipliers (n_results * 2 for diversity)
  • ✅ Sub-second queries: 0.8-0.9s typical response time
  • ✅ Async processing for parallel searches across content types
  • ✅ Complete conversation history (no limits) with context-aware ranking
  • ✅ Real-time file watcher for incremental updates

Enhanced Features (October 2025)

  • Contextual Learning: Query-context-specific feedback system that learns what's relevant for each type of question (e.g., Helen photos vs. legal documents) without affecting unrelated queries
  • Vision AI Integration: Claude Vision API automatically analyzes all 2,114 images, generating searchable descriptions, metadata, and semantic embeddings
  • Smart Image Search: Intelligent filename parsing with user annotation priority (+30 score boost) and context-aware ranking
  • Annotation Interface: In-chat image annotation with instant PostgreSQL updates and delete/insert strategy for seamless editing
  • Review Query Detection: Automatically prioritizes full email content (1.5x boost) over annotations when user asks to "review" or "summarize" documents
  • Adaptive Search: Dynamic content-type balancing ensures diverse results (emails, images, Q&A, PDFs) in every response

Future Enhancements

Planned Improvements

  • Automated A/B Testing: Continuous weight optimization through experimentation
  • Conversational Templates: Pattern mining for common query types
  • Temporal Weighting: Dynamic adjustment based on recency and relevance
  • Picture Similarity Searching: Advanced image search using visual similarity algorithms
  • Multi-User Support: Personalized interactions for different family members
  • Voice Integration: Natural speech interface using ElevenLabs voice cloning (work in progress - aging my voice clone towards my father's voice)
  • Open Framework Release: After refining the system, will make the general framework available for others to preserve their own family legacies