1. Backend Framework
1.1. Overview
The backend is built with FastAPI, providing a RESTful API for the AI-powered pregnancy health assistant. It integrates with Google Gemini AI for natural language processing and intelligent responses.
1.2. Technology Stack
Framework: FastAPI (Python web framework)
AI Service: Google Gemini 2.0 Flash Lite
Embedding Model: Google Gemini text-embedding-004
Server: Uvicorn (ASGI server)
Data Processing: NumPy (for vector similarity calculations)
1.3. Project Structure
backend/
├── main.py                        # Application entry point
├── api/                           # API routes
│   ├── __init__.py                # Router aggregation
│   └── chat.py                    # Chat endpoint
├── core/                          # Core configuration and dependencies
│   ├── config.py                  # Application configuration
│   └── dependencies.py            # Dependency injection
├── services/                      # Business logic services
│   ├── embedding_service.py       # Text embedding generation
│   ├── vector_search.py           # Vector similarity search
│   └── knowledge_base.py          # Knowledge base management
├── schemas/                       # Pydantic models
│   ├── requests.py                # Request models
│   └── responses.py               # Response models
├── data/                          # Data storage
│   └── embeddings.json            # Pre-generated page embeddings
├── scripts/                       # Utility scripts
│   └── generate_embeddings.py     # Embedding generation script
└── static/                        # Static files
    └── images/                    # Image assets
1.4. File Descriptions
- main.py
Application entry point that initializes FastAPI, configures CORS middleware, mounts static files, and registers API routes. It also provides the health check endpoint (GET /).
- api/chat.py
Main chat API endpoint that processes user messages, retrieves relevant context from knowledge base and vector search, generates AI responses using Gemini, and returns structured responses with navigation actions and suggestions.
- core/config.py
Centralized configuration management including:
  - Gemini API credentials and model settings
  - CORS configuration
  - File paths for embeddings and topics
  - System context prompt for AI assistant
- core/dependencies.py
Dependency injection using FastAPI’s dependency system with singleton pattern:
  - AI client initialization
  - Service instances (embedding, vector search, knowledge base)
  - Ensures single instance per service for efficiency
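The singleton pattern described above can be sketched with a cached provider function. This is a minimal illustration, not the project's actual code: `EmbeddingService` here is a hypothetical stand-in, and `get_embedding_service` is an assumed name.

```python
from functools import lru_cache


class EmbeddingService:
    """Hypothetical stand-in for the real embedding service."""

    def __init__(self) -> None:
        # Expensive setup (for example, creating an API client) would go here.
        self.ready = True


@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    # lru_cache on a zero-argument function returns the same object on
    # every call, so the service is constructed once per process.
    return EmbeddingService()
```

In a FastAPI route, such a provider would typically be passed as `Depends(get_embedding_service)`, so every request receives the same instance.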
- services/embedding_service.py
Generates text embeddings using Google Gemini embedding API. Provides methods for single and batch embedding generation, and cosine similarity calculation for vector comparison.
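For reference, cosine similarity between two embedding vectors can be computed with NumPy roughly as follows. The function name and the zero-vector handling are assumptions made for this sketch; the service's real method may differ.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Return the cosine of the angle between vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # Assumption: a zero vector is treated as having no similarity.
        return 0.0
    return float(np.dot(a, b) / denom)
```

Values close to 1 indicate vectors pointing in nearly the same direction, which is what the search treats as "semantically similar".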
- services/vector_search.py
Vector-based semantic search service that:
  - Loads pre-generated page embeddings
  - Generates query embeddings
  - Performs cosine similarity search
  - Returns top-k most relevant pages
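A top-k search over pre-generated embeddings might look like the sketch below. The record shape (dicts with `path` and `embedding` keys) and the function name `top_k_pages` are assumptions for illustration; the actual layout of embeddings.json is not specified here.

```python
import numpy as np


def top_k_pages(query_vec, pages, k=5):
    """Rank pages by cosine similarity to query_vec and return the best k.

    pages: list of dicts with hypothetical 'path' and 'embedding' keys.
    Returns a list of (path, score) tuples, highest score first.
    """
    if not pages:
        return []
    matrix = np.asarray([p["embedding"] for p in pages], dtype=float)
    query = np.asarray(query_vec, dtype=float)
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)
    norms[norms == 0.0] = 1.0  # zero vectors end up with a score of 0
    scores = (matrix @ query) / norms
    order = np.argsort(-scores)[:k]  # indices sorted by descending score
    return [(pages[i]["path"], float(scores[i])) for i in order]
```

Computing all scores with one matrix-vector product avoids a Python-level loop over pages, which matters little for a few dozen pages but scales better.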
- services/knowledge_base.py
Manages topic knowledge from topics.json:
  - Loads and indexes all topics and sub-topics
  - Provides topic context for AI prompts
  - Finds related topics based on keywords
  - Creates intelligent context based on user queries
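Keyword-based topic matching can be illustrated as below. The topic shape (`title`, `keywords`, `summary`) is a hypothetical schema chosen for the example, since the structure of topics.json is not shown, and both function names are assumptions.

```python
import re


def find_related_topics(query, topics):
    """Return topics whose keywords overlap the query, best match first."""
    words = set(re.findall(r"\w+", query.lower()))
    scored = []
    for topic in topics:
        overlap = words & {kw.lower() for kw in topic["keywords"]}
        if overlap:
            scored.append((len(overlap), topic))
    # Sort by number of matching keywords; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [topic for _, topic in scored]


def build_context(query, topics):
    """Format the related topics as a context string for the AI prompt."""
    related = find_related_topics(query, topics)
    return "\n".join(f"- {t['title']}: {t['summary']}" for t in related)
```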
- schemas/requests.py
Pydantic models for API requests:
  - ChatRequest: User message with optional conversation history
  - ChatMessage: Individual message in conversation history
- schemas/responses.py
Pydantic models for API responses:
  - ChatResponse: Main response with text and actions
  - NavigateAction: Page navigation action
  - SuggestionAction: Follow-up question suggestion
- scripts/generate_embeddings.py
Utility script to pre-generate embeddings for all pages:
  - Loads page data from frontend pageData and topics.json
  - Generates embeddings using Gemini API
  - Saves embeddings to data/embeddings.json for fast retrieval
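The pre-generation step could be structured along these lines. This is a sketch under stated assumptions: the page fields (`path`, `title`, `text`) are hypothetical, and `embed` is an injected callable standing in for the Gemini embedding call so the example runs without network access.

```python
import json
from pathlib import Path


def generate_embeddings(pages, embed, out_path):
    """Embed each page's text and write all records to a JSON file.

    pages:    iterable of dicts with hypothetical 'path', 'title', 'text' keys
    embed:    callable mapping a string to a sequence of floats
    out_path: destination file, e.g. data/embeddings.json
    Returns the number of records written.
    """
    records = [
        {
            "path": page["path"],
            "title": page["title"],
            "embedding": list(embed(page["text"])),
        }
        for page in pages
    ]
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(records), encoding="utf-8")
    return len(records)
```

Because embeddings are computed once and stored, the chat endpoint only needs to embed the user query at request time.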
1.5. Data Flow Architecture
Request Flow
  Client Request
         │
         ▼
┌─────────────────┐
│ FastAPI Router  │
│  (api/chat.py)  │
└────────┬────────┘
         │
         ├──────────────────┬───────────────────┐
         ▼                  ▼                   ▼
┌─────────────────┐  ┌──────────────┐  ┌──────────────────┐
│ Knowledge Base  │  │ Vector Search│  │    AI Client     │
│     Service     │  │   Service    │  │     (Gemini)     │
└────────┬────────┘  └──────┬───────┘  └────────┬─────────┘
         │                  │                   │
         │                  │                   │
         ▼                  ▼                   ▼
┌─────────────────┐  ┌──────────────┐  ┌──────────────────┐
│   topics.json   │  │  embeddings  │  │     Generate     │
│ (Topic Context) │  │    .json     │  │     Response     │
└─────────────────┘  └──────────────┘  └────────┬─────────┘
                                                │
                                                ▼
                                         ┌──────────────┐
                                         │ ChatResponse │
                                         │  + Actions   │
                                         └──────┬───────┘
                                                │
                                                ▼
                                         Client Response
Detailed Process Flow
1. User sends message → ChatRequest with message and optional history
2. Knowledge Base Service creates context:
   - Analyzes query keywords
   - Matches relevant topics from topics.json
   - Builds formatted context string for AI
3. Vector Search Service finds relevant pages:
   - Generates embedding for user query
   - Compares with pre-generated page embeddings
   - Returns top 5 most similar pages
4. AI Prompt Construction:
   - System context (role and guidelines)
   - Knowledge base context (relevant topics)
   - Conversation history (last 5 messages)
   - User question
5. Gemini API Call:
   - Generates natural language response
   - Uses exponential backoff retry on failures
   - Falls back to keyword-based suggestions if AI unavailable
6. AI Suggestions Generation:
   - Generates 2-3 follow-up questions
   - Uses same language as user query
   - Creates suggestion actions
7. Response Assembly:
   - Combines AI response text
   - Adds navigation actions (from vector search)
   - Adds suggestion actions (from AI)
   - Returns ChatResponse
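The prompt construction step in the flow above can be sketched as a plain string builder. The section labels, the function name `build_prompt`, and the exact layout are assumptions for illustration; only the four ingredients and the five-message history window come from the description.

```python
def build_prompt(system_context, kb_context, history, question, max_history=5):
    """Combine system context, topic context, recent history, and the question.

    history: list of dicts with 'role' and 'content' keys, oldest first.
    Only the last max_history messages are included.
    """
    recent = history[-max_history:] if max_history > 0 else []
    lines = [system_context, "", "Relevant topics:", kb_context, "", "Conversation:"]
    lines += [f"{m['role']}: {m['content']}" for m in recent]
    lines += ["", f"User question: {question}"]
    return "\n".join(lines)
```

Truncating the history keeps the prompt size bounded regardless of how long the conversation grows.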
1.6. API Endpoints
- POST /chat
Main chat endpoint for user interactions.
Request Body:
{
  "message": "What is gestational diabetes?",
  "history": [
    { "role": "user", "content": "Hello" },
    { "role": "assistant", "content": "Hello! How can I help you?" }
  ]
}
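The request body above maps naturally onto two Pydantic models. The field names follow the JSON example; the types and the empty-list default for `history` are assumptions, and the real definitions in schemas/requests.py may differ.

```python
from typing import List

from pydantic import BaseModel, Field


class ChatMessage(BaseModel):
    role: str      # "user" or "assistant" in the example request
    content: str


class ChatRequest(BaseModel):
    message: str
    # Assumption: history is optional and defaults to an empty list.
    history: List[ChatMessage] = Field(default_factory=list)
```

With models like these, FastAPI validates incoming JSON automatically and rejects requests that lack a `message` field.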
Response:
{
  "response": "Gestational diabetes is...",
  "actions": [
    {
      "type": "navigate",
      "text": "Gestational Diabetes",
      "path": "/complications-gdm",
      "description": "Learn more about GDM..."
    },
    {
      "type": "suggestion",
      "text": "How to manage GDM?",
      "query": "How can I manage gestational diabetes?"
    }
  ]
}
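A response of this shape could be assembled from the search results and the generated suggestions as sketched below. The input dict keys (`title`, `path`, `description`, `text`, `query`) and the function name are assumptions; plain dicts are used instead of the Pydantic response models to keep the example dependency-free.

```python
def assemble_response(text, pages, suggestions):
    """Build the response payload: AI text plus navigate and suggestion actions.

    pages:       dicts with hypothetical 'title', 'path', 'description' keys
    suggestions: dicts with hypothetical 'text' and 'query' keys
    """
    actions = [
        {
            "type": "navigate",
            "text": page["title"],
            "path": page["path"],
            "description": page["description"],
        }
        for page in pages
    ]
    actions += [
        {"type": "suggestion", "text": s["text"], "query": s["query"]}
        for s in suggestions
    ]
    return {"response": text, "actions": actions}
```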
- GET /
Health check endpoint returning API status and statistics.
Response:
{
  "status": "ok",
  "message": "Pregnancy AI Assistant",
  "indexed_pages": 45,
  "topics": 5
}
1.7. Configuration
Environment Variables
GEMINI_API_KEY: Google Gemini API key (required)
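Since the key is required, a configuration module would typically fail fast when it is missing. The helper below is a hypothetical sketch; the `env` parameter exists only to make the function easy to test, and core/config.py may load the value differently (for example via python-dotenv).

```python
import os


def load_gemini_api_key(env=None):
    """Return GEMINI_API_KEY from the environment or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    return key
```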
Configuration File
All configuration is managed in core/config.py:
AI Model: gemini-2.0-flash-lite
Embedding Model: models/text-embedding-004
Retry Settings: 3 attempts with exponential backoff
CORS: Configured for cross-origin requests
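The retry setting can be illustrated with a small helper. The three-attempt limit comes from the configuration above; the one-second base delay, the doubling factor, and the name `call_with_retry` are assumptions for this sketch.

```python
import time


def call_with_retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    The last failure is re-raised once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A caller that wants the keyword-based fallback described earlier would catch the final exception and switch to that path.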
1.8. Deployment
The backend can be deployed using:
Vercel: Configured via vercel.json
Uvicorn: Direct ASGI server deployment
Docker: Containerized deployment (if configured)
Requirements
All dependencies are listed in requirements.txt:
fastapi
uvicorn[standard]
google-genai
python-dotenv
numpy
pydantic