Deploying Your Document Q&A Application
Overview
This tutorial guides you through deploying your Document Q&A application to production. We'll cover deploying both the Next.js frontend and FastAPI backend using Vercel, setting up environment variables, and configuring continuous integration and deployment (CI/CD).
Prerequisites
- A GitHub repository with your Document Q&A application code
- A Vercel account (free tier is sufficient)
- Your Groq API key for the LLM integration
- Basic understanding of environment variables and CI/CD concepts
Preparing Your Application for Deployment
1. Environment Variables
Create a .env.example file to document required environment variables:
# .env.example
# API Configuration
NEXT_PUBLIC_API_URL=http://localhost:3000/api
# LLM Configuration
GROQ_API_KEY=your-groq-api-key
# Document Storage
DOCUMENT_STORAGE_PATH=./tmp/documents
# Optional: Analytics
NEXT_PUBLIC_ANALYTICS_ID=your-analytics-id
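A small startup check can catch a missing variable before the first request does. The sketch below is not part of the original application. It assumes the file format shown above (KEY=value lines and # comments), and the helper names parse_env_keys and missing_variables are hypothetical.

```python
import os


def parse_env_keys(example_text):
    """Return the variable names declared in .env.example-style text."""
    keys = []
    for line in example_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        key, sep, _value = line.partition("=")
        if sep:
            keys.append(key.strip())
    return keys


def missing_variables(required, environ=None):
    """Return the required names that are unset or empty in the environment."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


# A shortened copy of the example file, used here only for illustration
EXAMPLE = """\
# API Configuration
NEXT_PUBLIC_API_URL=http://localhost:3000/api
# LLM Configuration
GROQ_API_KEY=your-groq-api-key
"""
```

Calling missing_variables(parse_env_keys(EXAMPLE)) at startup returns the names that still need a value. Whether optional entries such as the analytics ID belong in the required list is a choice left to the application.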
Make sure your application code references these environment variables:
// src/lib/api-client.js
import axios from 'axios';

const apiClient = axios.create({
  baseURL: process.env.NEXT_PUBLIC_API_URL || '/api',
  // ...
});

# api/services/llm_service.py
import os

from groq import Groq


def get_llm_client():
    # Fail fast if the key is missing instead of failing on the first request
    api_key = os.environ.get("GROQ_API_KEY")
    if not api_key:
        raise ValueError("GROQ_API_KEY environment variable is not set")
    return Groq(api_key=api_key)
2. Update package.json
Ensure your package.json has the correct build and start scripts:
{
  "name": "document-qa-frontend",
  "version": "1.0.0",
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "lint": "next lint"
  }
}
3. Create a Vercel Configuration File
Create a vercel.json file in the root of your project:
{
  "version": 2,
  "buildCommand": "npm run build",
  "devCommand": "npm run dev",
  "installCommand": "npm install",
  "framework": "nextjs",
  "outputDirectory": ".next",
  "functions": {
    "api/**/*": {
      "memory": 1024,
      "maxDuration": 60
    }
  },
  "routes": [
    {
      "src": "/api/(.*)",
      "dest": "/api/$1"
    }
  ]
}
Deploying the Frontend
1. Connect to Vercel
Follow these steps to deploy your Next.js frontend to Vercel:
- Sign in to Vercel
- Click "Add New" and select "Project"
- Import your GitHub repository
- Configure the project settings:
  - Framework Preset: Next.js
  - Root Directory: ./ (or the path to your frontend code if in a monorepo)
  - Build Command: npm run build
  - Output Directory: .next
2. Configure Environment Variables
Add your environment variables in the Vercel project settings:
- In your project dashboard, go to "Settings" > "Environment Variables"
- Add each environment variable from your .env.example file:
  - NEXT_PUBLIC_API_URL: Set to your API URL (e.g., https://your-app.vercel.app/api)
  - GROQ_API_KEY: Your Groq API key
  - Add any other required variables
- Click "Save" to apply the changes
3. Deploy
Click "Deploy" to start the deployment process. Vercel will build and deploy your application.
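Once the build finishes, a quick smoke test can confirm that the deployment answers requests. The following is a minimal sketch using only the standard library. The function names are hypothetical, and the "/api" path assumes the root endpoint from the backend section is already part of the repository.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx)
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar
        return None


def check_deployment(base_url, paths=("/", "/api"), get_status=http_status):
    """Map each path to True if it answered with HTTP 200, else False."""
    base = base_url.rstrip("/")
    return {path: get_status(base + path) == 200 for path in paths}
```

For example, check_deployment("https://your-app.vercel.app") returns a dictionary keyed by path. The get_status parameter exists so the check can be exercised without network access.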
Deploying the Backend API
1. Serverless Functions with Vercel
To deploy your FastAPI backend as serverless functions on Vercel, create an api directory in your project root:
# Project structure
/
├── api/
│   ├── index.py          # Main API entry point
│   ├── upload.py         # Document upload endpoint
│   ├── ask.py            # Question answering endpoint
│   ├── requirements.txt  # Python dependencies
│   └── services/         # API services
├── src/                  # Frontend code
├── public/
├── package.json
└── vercel.json
2. Create API Entry Points
Create serverless function files for each API endpoint:
# api/index.py
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from mangum import Mangum

app = FastAPI()

# Configure CORS
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # In production, specify your frontend URL
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/api")
async def root():
    return {"message": "Document Q&A API is running"}

# Create handler for AWS Lambda / Vercel
handler = Mangum(app)
# api/upload.py
from fastapi import FastAPI, File, HTTPException, UploadFile
from fastapi.middleware.cors import CORSMiddleware
from mangum import Mangum
import os
import uuid
from tempfile import NamedTemporaryFile

app = FastAPI()

# Configure CORS
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.post("/api/upload")
async def upload_document(file: UploadFile = File(...)):
    # Validate file type. Raising HTTPException sends a real 400 status;
    # returning a (body, status) tuple would be serialized as JSON with a 200.
    if file.content_type not in ["application/pdf", "text/plain"]:
        raise HTTPException(
            status_code=400,
            detail="Invalid file type. Only PDF and text files are supported.",
        )

    # Generate document ID
    document_id = str(uuid.uuid4())

    # In a serverless environment, we need to use temporary storage
    # or cloud storage like S3. For this example, we'll use /tmp
    with NamedTemporaryFile(delete=False, dir="/tmp", suffix=f"_{document_id}") as tmp:
        # Write file content
        content = await file.read()
        tmp.write(content)
        tmp_path = tmp.name

    # In a real application, you would process the document here
    # and store metadata in a database
    return {
        "document_id": document_id,
        "message": "Document uploaded successfully"
    }

# Create handler for AWS Lambda / Vercel
handler = Mangum(app)
# api/ask.py
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from mangum import Mangum
import os
import json

app = FastAPI()

# Configure CORS
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.post("/api/ask")
async def ask_question(request: Request):
    data = await request.json()
    document_id = data.get("document_id")
    question = data.get("question")

    # Validate inputs. Raise so the client receives a 400 status code.
    if not document_id or not question:
        raise HTTPException(status_code=400, detail="Missing required parameters")

    # In a real application, you would:
    # 1. Retrieve the document content
    # 2. Call the LLM service
    # 3. Return the answer
    # For this example, we'll simulate an LLM response
    answer = f"This is a simulated answer to: '{question}'"
    return {"answer": answer}

# Create handler for AWS Lambda / Vercel
handler = Mangum(app)
3. Create requirements.txt
Create a requirements.txt file in the api directory:
fastapi==0.95.0
mangum==0.17.0
python-multipart==0.0.6
pydantic==1.10.7
groq==0.4.0
python-dotenv==1.0.0
Setting Up CI/CD
1. GitHub Actions for CI
Create a GitHub Actions workflow file at .github/workflows/ci.yml:
name: CI
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '18'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Lint
        run: npm run lint
      - name: Run tests
        run: npm test  # requires a "test" script in package.json
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install Python dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r api/requirements.txt
          pip install pytest pytest-asyncio httpx
      - name: Run API tests
        run: |
          cd api
          pytest
2. Vercel GitHub Integration
Vercel automatically integrates with GitHub to provide continuous deployment:
- Each push to your main branch will trigger a production deployment
- Pull requests will create preview deployments
- You can configure additional settings in the Vercel project dashboard:
- Go to "Settings" > "Git"
- Configure production branch and preview branches
- Set up build and development settings
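Because preview and production deployments run the same code, the API may need to know which one it is running in, for instance to avoid a wildcard CORS origin. The sketch below is one possible approach, not the tutorial's own implementation. It assumes Vercel's system environment variables VERCEL_ENV and VERCEL_URL are exposed to the project and that FRONTEND_URL holds the production origin. The function name allowed_origins is hypothetical.

```python
import os


def allowed_origins(environ=None):
    """Pick CORS origins for the current deployment type."""
    environ = os.environ if environ is None else environ
    vercel_env = environ.get("VERCEL_ENV", "development")

    if vercel_env == "production":
        # Assumption: FRONTEND_URL is set to the public production origin
        frontend = environ.get("FRONTEND_URL")
        return [frontend] if frontend else []

    if vercel_env == "preview":
        # VERCEL_URL is the deployment hostname without a scheme
        host = environ.get("VERCEL_URL")
        return [f"https://{host}"] if host else []

    # Local development fallback
    return ["http://localhost:3000"]
```

The returned list could be passed as allow_origins to CORSMiddleware in place of ["*"]. Returning an empty list when a variable is missing is a deliberately strict default: requests are blocked rather than silently allowed.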
Production Considerations
1. Document Storage
For production, consider using a cloud storage solution instead of temporary storage:
# Install AWS SDK
pip install boto3

# api/services/storage_service.py
import boto3
import os
from botocore.exceptions import ClientError

s3_client = boto3.client(
    's3',
    aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'),
    region_name=os.environ.get('AWS_REGION', 'us-east-1')
)
bucket_name = os.environ.get('S3_BUCKET_NAME')

async def upload_document_to_s3(file_content, document_id, file_extension):
    """Upload a document to S3 bucket"""
    try:
        file_key = f"documents/{document_id}{file_extension}"
        s3_client.put_object(
            Bucket=bucket_name,
            Key=file_key,
            Body=file_content
        )
        return {
            "bucket": bucket_name,
            "key": file_key
        }
    except ClientError as e:
        print(f"Error uploading to S3: {e}")
        raise

async def get_document_from_s3(document_id):
    """Get a document from S3 bucket"""
    try:
        # In a real app, you would store the file key in a database
        # For this example, we'll try to find the file by listing objects
        prefix = f"documents/{document_id}"
        response = s3_client.list_objects_v2(
            Bucket=bucket_name,
            Prefix=prefix
        )
        if 'Contents' in response and len(response['Contents']) > 0:
            file_key = response['Contents'][0]['Key']
            obj = s3_client.get_object(
                Bucket=bucket_name,
                Key=file_key
            )
            return obj['Body'].read()
        return None
    except ClientError as e:
        print(f"Error retrieving from S3: {e}")
        raise
2. Database Integration
Add a database to store document metadata and user sessions:
# Install database driver
pip install motor

# api/services/database_service.py
import os
import motor.motor_asyncio
from bson import ObjectId

# MongoDB connection
client = motor.motor_asyncio.AsyncIOMotorClient(os.environ.get("MONGODB_URI"))
db = client.document_qa_db
documents_collection = db.documents

async def create_document(document_data):
    """Create a new document record"""
    result = await documents_collection.insert_one(document_data)
    return str(result.inserted_id)

async def get_document(document_id):
    """Get document by ID"""
    if not ObjectId.is_valid(document_id):
        return None
    document = await documents_collection.find_one({"_id": ObjectId(document_id)})
    return document

async def update_document(document_id, update_data):
    """Update document data"""
    if not ObjectId.is_valid(document_id):
        return False
    result = await documents_collection.update_one(
        {"_id": ObjectId(document_id)},
        {"$set": update_data}
    )
    return result.modified_count > 0
3. Rate Limiting and Security
Implement rate limiting and security measures:
# Install dependencies
pip install fastapi-limiter redis

# api/main.py
from fastapi import FastAPI, Request, Depends
from fastapi.middleware.cors import CORSMiddleware
from fastapi_limiter import FastAPILimiter
from fastapi_limiter.depends import RateLimiter
import redis.asyncio as redis
import os

app = FastAPI()

# Configure CORS with specific origins
app.add_middleware(
    CORSMiddleware,
    allow_origins=[os.environ.get("FRONTEND_URL", "http://localhost:3000")],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Set up Redis for rate limiting
@app.on_event("startup")
async def startup():
    redis_url = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
    redis_client = redis.from_url(redis_url, encoding="utf-8", decode_responses=True)
    await FastAPILimiter.init(redis_client)

# Apply rate limiting to endpoints
@app.post("/api/upload")
@app.post("/api/ask")
@app.get("/api/documents/{document_id}")
async def rate_limited_endpoints(
    request: Request,
    _=Depends(RateLimiter(times=10, seconds=60))  # 10 requests per minute
):
    # Your endpoint logic here
    pass
Monitoring and Logging
1. Vercel Analytics
Enable Vercel Analytics for your project:
- Go to your Vercel project dashboard
- Navigate to "Analytics" tab
- Click "Enable Analytics"
- Configure the settings as needed
2. Custom Logging
Implement custom logging for your API:
# api/utils/logger.py
import logging
import json
import os
import sys
from datetime import datetime

class CustomJSONFormatter(logging.Formatter):
    def format(self, record):
        log_record = {
            "timestamp": datetime.utcnow().isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
        }
        if hasattr(record, 'request_id'):
            log_record["request_id"] = record.request_id
        if record.exc_info:
            log_record["exception"] = self.formatException(record.exc_info)
        return json.dumps(log_record)

def setup_logger():
    logger = logging.getLogger("document_qa_api")
    logger.setLevel(logging.INFO)
    # Create console handler
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(CustomJSONFormatter())
    logger.addHandler(handler)
    return logger

logger = setup_logger()

# Usage in API endpoints
from .utils.logger import logger

@app.post("/api/upload")
async def upload_document(file: UploadFile = File(...)):
    request_id = str(uuid.uuid4())
    logger.info("Processing upload request", extra={"request_id": request_id})
    try:
        # Process upload
        # ...
        logger.info("Document uploaded successfully", extra={"request_id": request_id})
        return {"document_id": document_id}
    except Exception as e:
        logger.error(f"Error uploading document: {str(e)}", extra={"request_id": request_id}, exc_info=True)
        raise
3. Error Tracking
Integrate an error tracking service like Sentry:
# Install Sentry SDK
pip install sentry-sdk

# api/main.py
import os

import sentry_sdk
from sentry_sdk.integrations.fastapi import FastApiIntegration

# Initialize Sentry
sentry_sdk.init(
    dsn=os.environ.get("SENTRY_DSN"),
    integrations=[FastApiIntegration()],
    traces_sample_rate=1.0,  # Adjust in production
    environment=os.environ.get("ENVIRONMENT", "development"),
)

# Frontend integration
// src/pages/_app.js
import * as Sentry from '@sentry/nextjs';

Sentry.init({
  dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
  tracesSampleRate: 1.0, // Adjust in production
  environment: process.env.NODE_ENV,
});
Next Steps
Now that you've deployed your Document Q&A application, you can: