Deployment Guide
This guide explains how to deploy FR-APIaaS in a self-hosted environment using Docker Compose or Kubernetes. The platform consists of the API service, embedding service, database, cache, and object storage, all of which can run locally or on any cloud platform with container support.
Prerequisites
- Docker 24+ and Docker Compose 2.20+
- PostgreSQL 16 with the pgvector extension (bundled in the provided Docker image)
- Redis 7
- GPU: optional. The embedding service auto-detects a CUDA-capable GPU and falls back to CPU mode when none is found. CPU mode is slower but fully functional.
Quick Start (Docker Compose)
For self-managed deployments, the simplest production baseline is the included docker-compose.prod.yml stack. It starts the required services, applies migrations, and exposes the API on port 8080.
# Clone and configure
cp .env.example .env
# Edit .env with your values
# Start all services
docker compose -f docker-compose.prod.yml up -d
# Run database migrations
docker compose -f docker-compose.prod.yml exec api ./fr-apiaas migrate up
# Verify health
curl http://localhost:8080/health
The API will be available at http://localhost:8080/api/v1. See the Authentication guide to issue your first API key.
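Services can take a few seconds to accept connections after the stack starts, so a single curl may fail on a fresh deployment. The following is a minimal sketch of a retry loop around the health check. The function name wait_for_health and its default attempt count and delay are illustrative assumptions, not part of FR-APIaaS.

```shell
# Hypothetical helper: poll a health URL until it returns a 2xx response.
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors such as 503.
    if curl -fs -o /dev/null --max-time 5 "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    # Skip the pause once the final attempt has failed.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "not healthy after $attempts attempt(s): $url" >&2
  return 1
}
```

For example, wait_for_health http://localhost:8080/health 30 2 keeps retrying for about a minute when connections are refused immediately, then gives up with a non-zero exit status.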
Environment Variables
Copy .env.example to .env and provide the required values before starting the stack.
| Variable | Required | Description |
|---|---|---|
| DATABASE_URL | Yes | PostgreSQL connection string. The target database must have the pgvector extension installed. |
| REDIS_URL | Yes | Redis connection string, e.g. redis://localhost:6379/0 |
| JWT_SECRET | Yes | Secret used to sign JWT tokens. Use 64+ random characters. |
| STRIPE_SECRET_KEY | Billing | Stripe API key. Required only if billing is enabled. |
| STRIPE_WEBHOOK_SECRET | Billing | Stripe webhook signing secret for payment events |
| SMTP_HOST | | SMTP server hostname for transactional email |
| SMTP_USER | | SMTP username / sender address |
| SMTP_PASS | | SMTP password or app-specific password |
| ADMIN_EMAILS | No | Comma-separated list of email addresses that get admin panel access |
| EMBEDDING_SERVICE_URL | No | URL of the embedding service. Default: http://embedding-service:8000 |
| MINIO_ENDPOINT | Storage | MinIO (or S3-compatible) endpoint for image storage |
| MINIO_ACCESS_KEY | Storage | MinIO access key ID |
| MINIO_SECRET_KEY | Storage | MinIO secret access key |
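
A missing or truncated value in .env usually surfaces only as a startup failure. As a rough pre-flight check, the sketch below verifies that the three variables marked Yes in the Required column are present and that JWT_SECRET meets the 64-character recommendation. The function name check_env_file is hypothetical, and the parsing assumes plain NAME=value lines without quotes or the export keyword.

```shell
# Hypothetical helper: report missing required variables in an env file.
# Usage: check_env_file PATH_TO_ENV_FILE
check_env_file() {
  file=$1
  status=0
  for name in DATABASE_URL REDIS_URL JWT_SECRET; do
    # Use the last NAME=value assignment; commented lines do not match.
    value=$(sed -n "s/^${name}=//p" "$file" | tail -n 1)
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      status=1
    fi
  done
  secret=$(sed -n 's/^JWT_SECRET=//p' "$file" | tail -n 1)
  # Only check the length when a secret is present at all.
  if [ -n "$secret" ] && [ "${#secret}" -lt 64 ]; then
    echo "JWT_SECRET is shorter than 64 characters" >&2
    status=1
  fi
  return "$status"
}
```

Running check_env_file .env before starting the stack lists each problem on stderr and returns a non-zero status if any check fails.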
Kubernetes
Kubernetes manifests are available in the k8s/ directory. The embedding service deployment includes a Horizontal Pod Autoscaler (HPA) configured to scale between 2 and 10 replicas based on CPU utilization. Apply the manifests with:
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/
For production Kubernetes environments, prefer managed PostgreSQL and Redis services rather than in-cluster StatefulSets.
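After applying the manifests, it helps to confirm that each Deployment finished rolling out and that the autoscaler was created. The sketch below wraps kubectl rollout status and kubectl get hpa. The helper name verify_rollout is hypothetical, and the namespace and Deployment names are passed as arguments because they depend on the manifests in k8s/.

```shell
# Hypothetical helper: wait for Deployments to become ready, then list HPAs.
# Usage: verify_rollout NAMESPACE DEPLOYMENT [DEPLOYMENT...]
verify_rollout() {
  ns=$1
  shift
  for name in "$@"; do
    # Blocks until the rollout completes or the timeout expires.
    kubectl rollout status "deployment/$name" -n "$ns" --timeout=120s || return 1
  done
  # Shows current and target utilization plus replica bounds for each HPA.
  kubectl get hpa -n "$ns"
}
```

The function stops at the first Deployment that fails to become ready, so a non-zero exit status points at the rollout that needs attention.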
GPU vs CPU Mode
The embedding service detects a CUDA-capable GPU at startup and uses it automatically. To force CPU-only mode, for example on hosts without GPU drivers, set the following environment variable on the embedding service container:
MODEL_DEVICE=cpu
CPU mode increases embedding latency, but recognition accuracy remains unchanged. It is suitable for development, evaluation, and lower-throughput production workloads.
Health Checks
The API exposes two health endpoints. Use them for load balancer checks and liveness or readiness probes.
| Endpoint | Description |
|---|---|
| GET /health | Basic liveness check. Returns 200 if the process is running. |
| GET /health/dependencies | Deep readiness check. Verifies connectivity to PostgreSQL, Redis, and the embedding service. |
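
To see how the two endpoints differ in practice, the sketch below queries both and prints their HTTP status codes. It assumes that /health/dependencies answers with a non-200 status when a dependency is unreachable, which the table above does not state explicitly. The function names probe_status and check_api_health are illustrative.

```shell
# Hypothetical helper: print the HTTP status code for a URL,
# or 000 when no response arrives.
probe_status() {
  code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1") || code=000
  echo "${code:-000}"
}

# Hypothetical helper: succeed only when both endpoints return 200.
# Usage: check_api_health BASE_URL
check_api_health() {
  base=$1
  live=$(probe_status "$base/health")
  ready=$(probe_status "$base/health/dependencies")
  echo "liveness=$live readiness=$ready"
  [ "$live" = "200" ] && [ "$ready" = "200" ]
}
```

A liveness code of 200 combined with any other readiness code suggests the process is up but a dependency is not. In that situation a load balancer should typically withhold traffic rather than restart the container.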
Production Checklist
- Set JWT_SECRET to a randomly generated string of at least 64 characters (e.g. openssl rand -hex 32, which prints 64 hex characters).
- Enable SSL/TLS termination in front of the API. Never expose port 8080 directly to the internet.
- Configure CORS_ALLOWED_ORIGINS to your exact frontend domain(s) rather than *.
- Set up log aggregation (e.g. Loki, CloudWatch, Datadog). The API emits structured JSON logs.
- Configure automated backups for the PostgreSQL volume or managed database. Face embeddings are stored there and cannot be reconstructed without the original source images.
- Review rate limits on API keys via the dashboard to prevent abuse before production rollout.
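
For the first checklist item, here is a minimal sketch of generating the secret, with a fallback for hosts where openssl is not installed. The function name generate_jwt_secret is hypothetical. Both branches emit 32 random bytes as 64 lowercase hex characters.

```shell
# Hypothetical helper: print a 64-character random hex string.
generate_jwt_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback: read 32 bytes from the kernel RNG and hex-encode them,
    # dropping the whitespace that od adds between bytes.
    od -An -N32 -tx1 /dev/urandom | tr -dc '0-9a-f'
    echo
  fi
}
```

The output can be pasted into the JWT_SECRET entry in .env.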