Deploying Your Document Q&A Application

Overview

This tutorial guides you through deploying your Document Q&A application to production. We'll cover deploying both the Next.js frontend and FastAPI backend using Vercel, setting up environment variables, and configuring continuous integration and deployment (CI/CD).

Prerequisites

  • A GitHub repository with your Document Q&A application code
  • A Vercel account (free tier is sufficient)
  • Your Groq API key for the LLM integration
  • Basic understanding of environment variables and CI/CD concepts

Preparing Your Application for Deployment

1. Environment Variables

Create a .env.example file to document required environment variables:

# .env.example

# API Configuration
NEXT_PUBLIC_API_URL=http://localhost:3000/api

# LLM Configuration
GROQ_API_KEY=your-groq-api-key

# Document Storage
DOCUMENT_STORAGE_PATH=./tmp/documents

# Optional: Analytics
NEXT_PUBLIC_ANALYTICS_ID=your-analytics-id

Make sure your application code references these environment variables:

// src/lib/api-client.js
import axios from 'axios';

const apiClient = axios.create({
  baseURL: process.env.NEXT_PUBLIC_API_URL || '/api',
  // ...
});

# api/services/llm_service.py
import os

from groq import Groq

def get_llm_client():
    api_key = os.environ.get("GROQ_API_KEY")
    if not api_key:
        raise ValueError("GROQ_API_KEY environment variable is not set")

    return Groq(api_key=api_key)
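Since several variables are required, a small fail-fast helper can surface missing configuration at startup rather than mid-request. A minimal sketch (the `require_env` name is illustrative, not part of the tutorial code):

```python
import os

def require_env(name, default=None):
    """Return the value of an environment variable, or raise with a clear message."""
    value = os.environ.get(name, default)
    if not value:
        raise ValueError(f"{name} environment variable is not set")
    return value

# Example: validate everything the API needs in one place at import time
# GROQ_API_KEY = require_env("GROQ_API_KEY")
```

Calling this for every required variable at module import means a misconfigured deployment fails immediately with a readable error in the logs.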

2. Update package.json

Ensure your package.json has the correct build and start scripts:

{
  "name": "document-qa-frontend",
  "version": "1.0.0",
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "lint": "next lint"
  }
}

3. Create a Vercel Configuration File

Create a vercel.json file in the root of your project:

{
  "version": 2,
  "buildCommand": "npm run build",
  "devCommand": "npm run dev",
  "installCommand": "npm install",
  "framework": "nextjs",
  "outputDirectory": ".next",
  "functions": {
    "api/**/*": {
      "memory": 1024,
      "maxDuration": 60
    }
  },
  "routes": [
    {
      "src": "/api/(.*)",
      "dest": "/api/$1"
    }
  ]
}

Deploying the Frontend

1. Connect to Vercel

Follow these steps to deploy your Next.js frontend to Vercel:

  1. Sign in to Vercel
  2. Click "Add New" and select "Project"
  3. Import your GitHub repository
  4. Configure the project settings:
    • Framework Preset: Next.js
    • Root Directory: ./ (or the path to your frontend code if in a monorepo)
    • Build Command: npm run build
    • Output Directory: .next
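The same import-and-deploy flow is also available from the command line via the Vercel CLI (shown for illustration; prompts and flags can differ between CLI versions):

```shell
# Install the Vercel CLI and deploy from the project root
npm install -g vercel
vercel           # links the project and creates a preview deployment
vercel --prod    # promotes a production deployment
```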

2. Configure Environment Variables

Add your environment variables in the Vercel project settings:

  1. In your project dashboard, go to "Settings" > "Environment Variables"
  2. Add each environment variable from your .env.example file:
    • NEXT_PUBLIC_API_URL: Set to your API URL (e.g., https://your-app.vercel.app/api)
    • GROQ_API_KEY: Your Groq API key
    • Add any other required variables
  3. Click "Save" to apply the changes

3. Deploy

Click "Deploy" to start the deployment process. Vercel will build and deploy your application.

Deploying the Backend API

1. Serverless Functions with Vercel

To deploy your FastAPI backend as serverless functions on Vercel, create an api directory in your project root:

# Project structure
/
├── api/
│   ├── index.py           # Main API entry point
│   ├── upload.py          # Document upload endpoint
│   ├── ask.py             # Question answering endpoint
│   ├── requirements.txt   # Python dependencies
│   └── services/          # API services
├── src/                   # Frontend code
├── public/
├── package.json
└── vercel.json

2. Create API Entry Points

Create serverless function files for each API endpoint:

# api/index.py
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from mangum import Mangum

app = FastAPI()

# Configure CORS
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # In production, specify your frontend URL
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/api")
async def root():
    return {"message": "Document Q&A API is running"}

# Create handler for AWS Lambda / Vercel
handler = Mangum(app)

# api/upload.py
from fastapi import FastAPI, File, HTTPException, UploadFile
from fastapi.middleware.cors import CORSMiddleware
from mangum import Mangum
import os
import uuid
from tempfile import NamedTemporaryFile

app = FastAPI()

# Configure CORS
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # In production, specify your frontend URL
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.post("/api/upload")
async def upload_document(file: UploadFile = File(...)):
    # Validate file type
    if file.content_type not in ["application/pdf", "text/plain"]:
        raise HTTPException(
            status_code=400,
            detail="Invalid file type. Only PDF and text files are supported.",
        )
    
    # Generate document ID
    document_id = str(uuid.uuid4())
    
    # In a serverless environment, we need to use temporary storage
    # or cloud storage like S3. For this example, we'll use /tmp
    with NamedTemporaryFile(delete=False, dir="/tmp", suffix=f"_{document_id}") as tmp:
        # Write file content
        content = await file.read()
        tmp.write(content)
        tmp_path = tmp.name
    
    # In a real application, you would process the document here
    # and store metadata in a database
    
    return {
        "document_id": document_id,
        "message": "Document uploaded successfully"
    }

# Create handler for AWS Lambda / Vercel
handler = Mangum(app)

# api/ask.py
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from mangum import Mangum
import os

app = FastAPI()

# Configure CORS
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # In production, specify your frontend URL
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.post("/api/ask")
async def ask_question(request: Request):
    data = await request.json()
    document_id = data.get("document_id")
    question = data.get("question")
    
    # Validate inputs
    if not document_id or not question:
        raise HTTPException(status_code=400, detail="Missing required parameters")
    
    # In a real application, you would:
    # 1. Retrieve the document content
    # 2. Call the LLM service
    # 3. Return the answer
    
    # For this example, we'll simulate an LLM response
    answer = f"This is a simulated answer to: '{question}'"
    
    return {"answer": answer}

# Create handler for AWS Lambda / Vercel
handler = Mangum(app)
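The input checks in the `/api/ask` handler are easiest to unit-test when extracted into a pure function. A minimal sketch (the `validate_ask_payload` helper is illustrative, not part of the tutorial code):

```python
def validate_ask_payload(data: dict):
    """Check that an /api/ask request body has the required fields.

    Returns (ok, error_message); error_message is empty when ok is True.
    """
    document_id = data.get("document_id")
    question = data.get("question")
    if not document_id or not question:
        return False, "Missing required parameters"
    if not isinstance(question, str) or not question.strip():
        return False, "Question must be a non-empty string"
    return True, ""
```

The endpoint can then call this helper and raise `HTTPException(status_code=400, detail=error_message)` when validation fails, keeping the HTTP layer thin.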

3. Create requirements.txt

Create a requirements.txt file in the api directory:

fastapi==0.95.0
mangum==0.17.0
python-multipart==0.0.6
pydantic==1.10.7
groq==0.4.0
python-dotenv==1.0.0

Setting Up CI/CD

1. GitHub Actions for CI

Create a GitHub Actions workflow file at .github/workflows/ci.yml:

name: CI

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Node.js
      uses: actions/setup-node@v3
      with:
        node-version: '18'
        cache: 'npm'
    
    - name: Install dependencies
      run: npm ci
    
    - name: Lint
      run: npm run lint
    
    - name: Run tests
      run: npm test
    
    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.11'
    
    - name: Install Python dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r api/requirements.txt
        pip install pytest pytest-asyncio httpx
    
    - name: Run API tests
      run: |
        cd api
        pytest
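One caveat: pytest exits with a non-zero status when it collects no tests, which would fail the workflow above. A minimal placeholder test file keeps CI green until real tests exist (the file name and helper are illustrative):

```python
# api/test_smoke.py -- placeholder so the CI `pytest` step has something to run
def normalize_question(text: str) -> str:
    """Collapse runs of whitespace in a user question before sending it to the LLM."""
    return " ".join(text.split())

def test_normalize_question():
    assert normalize_question("  what   is\nthis? ") == "what is this?"
```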

2. Vercel GitHub Integration

Vercel automatically integrates with GitHub to provide continuous deployment:

  1. Each push to your main branch will trigger a production deployment
  2. Pull requests will create preview deployments
  3. You can configure additional settings in the Vercel project dashboard:
    • Go to "Settings" > "Git"
    • Configure production branch and preview branches
    • Set up build and development settings

Production Considerations

1. Document Storage

For production, consider using a cloud storage solution instead of temporary storage:

# Install the AWS SDK (and add boto3 to api/requirements.txt)
pip install boto3

# api/services/storage_service.py
import boto3
import os
from botocore.exceptions import ClientError

s3_client = boto3.client(
    's3',
    aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'),
    region_name=os.environ.get('AWS_REGION', 'us-east-1')
)

bucket_name = os.environ.get('S3_BUCKET_NAME')

# boto3 calls are synchronous, so a plain function is accurate here
def upload_document_to_s3(file_content, document_id, file_extension):
    """Upload a document to S3 bucket"""
    try:
        file_key = f"documents/{document_id}{file_extension}"
        s3_client.put_object(
            Bucket=bucket_name,
            Key=file_key,
            Body=file_content
        )
        return {
            "bucket": bucket_name,
            "key": file_key
        }
    except ClientError as e:
        print(f"Error uploading to S3: {e}")
        raise

def get_document_from_s3(document_id):
    """Get a document from S3 bucket"""
    try:
        # In a real app, you would store the file key in a database
        # For this example, we'll try to find the file by listing objects
        prefix = f"documents/{document_id}"
        response = s3_client.list_objects_v2(
            Bucket=bucket_name,
            Prefix=prefix
        )
        
        if 'Contents' in response and len(response['Contents']) > 0:
            file_key = response['Contents'][0]['Key']
            obj = s3_client.get_object(
                Bucket=bucket_name,
                Key=file_key
            )
            return obj['Body'].read()
        
        return None
    except ClientError as e:
        print(f"Error retrieving from S3: {e}")
        raise

2. Database Integration

Add a database to store document metadata and user sessions:

# Install database driver
pip install motor

# api/services/database_service.py
import os
import motor.motor_asyncio
from bson import ObjectId

# MongoDB connection
client = motor.motor_asyncio.AsyncIOMotorClient(os.environ.get("MONGODB_URI"))
db = client.document_qa_db

documents_collection = db.documents

async def create_document(document_data):
    """Create a new document record"""
    result = await documents_collection.insert_one(document_data)
    return str(result.inserted_id)

async def get_document(document_id):
    """Get document by ID"""
    if not ObjectId.is_valid(document_id):
        return None
    
    document = await documents_collection.find_one({"_id": ObjectId(document_id)})
    return document

async def update_document(document_id, update_data):
    """Update document data"""
    if not ObjectId.is_valid(document_id):
        return False
    
    result = await documents_collection.update_one(
        {"_id": ObjectId(document_id)},
        {"$set": update_data}
    )
    
    return result.modified_count > 0
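MongoDB's `ObjectId` values are not JSON-serializable, so documents need a small conversion step before they go into an API response. A minimal sketch (the `serialize_document` name is illustrative):

```python
def serialize_document(document):
    """Convert a MongoDB document into a JSON-safe dict for API responses."""
    if document is None:
        return None
    doc = dict(document)
    if "_id" in doc:
        # str() works for ObjectId and for plain ids alike
        doc["id"] = str(doc.pop("_id"))
    return doc
```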

3. Rate Limiting and Security

Implement rate limiting and security measures:

# Install dependencies
pip install fastapi-limiter redis

# api/main.py
from fastapi import FastAPI, Request, Depends
from fastapi.middleware.cors import CORSMiddleware
from fastapi_limiter import FastAPILimiter
from fastapi_limiter.depends import RateLimiter
import redis.asyncio as redis
import os

app = FastAPI()

# Configure CORS with specific origins
app.add_middleware(
    CORSMiddleware,
    allow_origins=[os.environ.get("FRONTEND_URL", "http://localhost:3000")],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Set up Redis for rate limiting
@app.on_event("startup")
async def startup():
    redis_url = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
    redis_client = redis.from_url(redis_url, encoding="utf-8", decode_responses=True)
    await FastAPILimiter.init(redis_client)

# Apply rate limiting to an endpoint via a dependency; repeat for
# /api/upload, /api/documents/{document_id}, and any other endpoint
@app.post("/api/ask", dependencies=[Depends(RateLimiter(times=10, seconds=60))])
async def ask_question(request: Request):
    # Your endpoint logic here; the limiter allows 10 requests per minute
    pass
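fastapi-limiter keeps its counters in Redis so that limits hold across serverless instances. The idea behind `RateLimiter(times=10, seconds=60)` can be sketched in-process with the standard library (illustrative only; an in-memory limiter does not share state between instances, which is exactly why Redis is used above):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `times` calls per `seconds`, like RateLimiter(times, seconds)."""

    def __init__(self, times, seconds):
        self.times = times
        self.seconds = seconds
        self.calls = deque()  # timestamps of accepted calls

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window
        while self.calls and now - self.calls[0] >= self.seconds:
            self.calls.popleft()
        if len(self.calls) >= self.times:
            return False
        self.calls.append(now)
        return True
```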

Monitoring and Logging

1. Vercel Analytics

Enable Vercel Analytics for your project:

  1. Go to your Vercel project dashboard
  2. Navigate to "Analytics" tab
  3. Click "Enable Analytics"
  4. Configure the settings as needed

2. Custom Logging

Implement custom logging for your API:

# api/utils/logger.py
import logging
import json
import os
import sys
from datetime import datetime

class CustomJSONFormatter(logging.Formatter):
    def format(self, record):
        log_record = {
            "timestamp": datetime.utcnow().isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
        }
        
        if hasattr(record, 'request_id'):
            log_record["request_id"] = record.request_id
            
        if record.exc_info:
            log_record["exception"] = self.formatException(record.exc_info)
            
        return json.dumps(log_record)

def setup_logger():
    logger = logging.getLogger("document_qa_api")
    logger.setLevel(logging.INFO)
    
    # Create console handler
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(CustomJSONFormatter())
    
    logger.addHandler(handler)
    
    return logger

logger = setup_logger()

# Usage in API endpoints
from .utils.logger import logger

@app.post("/api/upload")
async def upload_document(file: UploadFile = File(...)):
    request_id = str(uuid.uuid4())
    logger.info(f"Processing upload request", extra={"request_id": request_id})
    
    try:
        # Process upload
        # ...
        logger.info(f"Document uploaded successfully", extra={"request_id": request_id})
        return {"document_id": document_id}
    except Exception as e:
        logger.error(f"Error uploading document: {str(e)}", extra={"request_id": request_id}, exc_info=True)
        raise

3. Error Tracking

Integrate an error tracking service like Sentry:

# Install Sentry SDK
pip install sentry-sdk

# api/main.py
import os

import sentry_sdk
from sentry_sdk.integrations.fastapi import FastApiIntegration

# Initialize Sentry
sentry_sdk.init(
    dsn=os.environ.get("SENTRY_DSN"),
    integrations=[FastApiIntegration()],
    traces_sample_rate=1.0,  # Adjust in production
    environment=os.environ.get("ENVIRONMENT", "development"),
)

For the frontend, install @sentry/nextjs and initialize it:

// src/pages/_app.js
import * as Sentry from '@sentry/nextjs';

Sentry.init({
  dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
  tracesSampleRate: 1.0, // Adjust in production
  environment: process.env.NODE_ENV,
});

Next Steps

Now that you've deployed your Document Q&A application, you can: