Deployment Guide

This guide explains how to deploy FR-APIaaS in a self-hosted environment using Docker Compose or Kubernetes. The platform consists of the API service, embedding service, database, cache, and object storage, all of which can run locally or on any cloud platform with container support.

Prerequisites

  • Docker 24+ and Docker Compose 2.20+
  • PostgreSQL 16 with the pgvector extension (bundled in the provided Docker image)
  • Redis 7
  • GPU: optional. The embedding service uses a CUDA-capable GPU when it detects one and falls back to CPU mode otherwise. CPU mode is slower but fully functional.

Quick Start (Docker Compose)

For self-managed deployments, the simplest production baseline is the included docker-compose.prod.yml stack. It starts the required services, applies migrations, and exposes the API on port 8080.

# Configure (run from the root of the cloned repository)
cp .env.example .env
# Edit .env with your values

# Start all services
docker compose -f docker-compose.prod.yml up -d

# Run database migrations
docker compose -f docker-compose.prod.yml exec api ./fr-apiaas migrate up

# Verify health
curl http://localhost:8080/health

The API will be available at http://localhost:8080/api/v1. See the Authentication guide to issue your first API key.
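
Services can take a few seconds to come up, so a deployment script may want to poll the liveness endpoint before it continues. The following is a minimal sketch in POSIX shell; wait_for_health, its default timeout, and its polling interval are illustrative assumptions and not part of FR-APIaaS.

```shell
# Poll a health URL until it answers successfully or the timeout expires.
# wait_for_health is a hypothetical helper, not an FR-APIaaS command.
wait_for_health() {
  url="$1"            # e.g. http://localhost:8080/health
  timeout="${2:-60}"  # total seconds to wait (assumed default)
  interval="${3:-2}"  # seconds between attempts (assumed default)
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # -f makes curl exit non-zero on HTTP 4xx/5xx; -s hides the progress meter.
    if curl -fs -o /dev/null "$url"; then
      echo "healthy after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "not healthy after ${timeout}s" >&2
  return 1
}
```

For example, wait_for_health http://localhost:8080/health 120 would wait up to two minutes and return a non-zero status if the API never responds.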

Environment Variables

Copy .env.example to .env and provide the required values before starting the stack.

Variable               Required  Description
---------------------  --------  -----------
DATABASE_URL           Yes       PostgreSQL connection string (must have pgvector installed)
REDIS_URL              Yes       Redis connection string, e.g. redis://localhost:6379/0
JWT_SECRET             Yes       Secret used to sign JWT tokens. Use 64+ random characters.
STRIPE_SECRET_KEY      Billing   Stripe API key. Required only if billing is enabled.
STRIPE_WEBHOOK_SECRET  Billing   Stripe webhook signing secret for payment events
SMTP_HOST              Email     SMTP server hostname for transactional email
SMTP_USER              Email     SMTP username / sender address
SMTP_PASS              Email     SMTP password or app-specific password
ADMIN_EMAILS           No        Comma-separated list of email addresses that get admin panel access
EMBEDDING_SERVICE_URL  No        URL of the embedding service. Default: http://embedding-service:8000
MINIO_ENDPOINT         Storage   MinIO (or S3-compatible) endpoint for image storage
MINIO_ACCESS_KEY       Storage   MinIO access key ID
MINIO_SECRET_KEY       Storage   MinIO secret access key
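
Before starting the stack, it can help to confirm that the always-required variables are present in .env. The sketch below is one possible check, assuming a plain NAME=value file; check_required_env is a hypothetical helper, and it only covers the three variables marked "Yes" above.

```shell
# List required variables that are missing or empty in an env file.
# check_required_env is a hypothetical helper, not an FR-APIaaS command.
# Limitation: a quoted empty value such as NAME="" is treated as set.
check_required_env() {
  env_file="$1"
  missing=""
  for name in DATABASE_URL REDIS_URL JWT_SECRET; do
    # Match "NAME=<at least one character>" at the start of a line.
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      missing="${missing}${missing:+ }${name}"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing: $missing"
    return 1
  fi
  echo "ok"
}
```

Running check_required_env .env prints the missing names and returns a non-zero status, which makes it easy to chain in front of the docker compose command.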

Kubernetes

Kubernetes manifests are available in the k8s/ directory. The embedding service deployment includes a Horizontal Pod Autoscaler (HPA) configured to scale between 2 and 10 replicas based on CPU utilization. Apply the manifests with:

kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/
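
After applying the manifests, you may want to confirm that a deployment has finished rolling out and that its autoscaler exists. The sketch below wraps two standard kubectl commands; verify_rollout is a hypothetical helper, and the namespace and deployment names are arguments because this guide does not state the names used in k8s/.

```shell
# Wait for a deployment to finish rolling out, then show autoscaler state.
# verify_rollout is a hypothetical helper; pass the names defined in your manifests.
verify_rollout() {
  namespace="$1"
  deployment="$2"
  # Blocks until the rollout completes or the (assumed) 120s timeout is reached.
  kubectl -n "$namespace" rollout status "deployment/$deployment" --timeout=120s || return 1
  # Lists HPAs in the namespace with current and target replica counts.
  kubectl -n "$namespace" get hpa
}
```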

For production Kubernetes environments, prefer managed PostgreSQL and Redis services rather than in-cluster StatefulSets.

GPU vs CPU Mode

The embedding service detects a CUDA-capable GPU at startup and uses it automatically. To force CPU-only mode, for example on hosts without GPU drivers, set the following environment variable on the embedding service container:

MODEL_DEVICE=cpu

CPU mode increases embedding latency, but recognition accuracy remains unchanged. It is suitable for development, evaluation, and lower-throughput production workloads.
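
If the same deployment script runs on hosts with and without GPU drivers, the choice to force CPU mode can be automated. The sketch below is one way to do that, assuming the presence of a working nvidia-smi (NVIDIA's driver CLI) is a reasonable proxy for usable GPU drivers; choose_model_device is a hypothetical helper.

```shell
# Print "cpu" when no working NVIDIA driver is found; print nothing otherwise,
# so the embedding service keeps its automatic detection.
# choose_model_device is a hypothetical helper, not an FR-APIaaS command.
choose_model_device() {
  # nvidia-smi -L lists detected GPUs and fails if the driver is not usable.
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
    return 0
  fi
  echo "cpu"
}
```

A script could capture the output and, only when it is non-empty, pass it to the embedding service container as MODEL_DEVICE.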

Health Checks

The API exposes two health endpoints. Use them for load balancer health checks; in Kubernetes, /health suits a liveness probe and /health/dependencies suits a readiness probe.

Endpoint                  Description
------------------------  -----------
GET /health               Basic liveness check. Returns 200 if the process is running.
GET /health/dependencies  Deep readiness check. Verifies connectivity to PostgreSQL, Redis, and the embedding service.
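
To see how the two endpoints differ in practice, a script can query both and report their status codes side by side. This is a minimal sketch; http_status and check_api are hypothetical helpers, and it assumes /health/dependencies answers with a non-200 status when a dependency is unreachable, which this guide does not state explicitly.

```shell
# Print the HTTP status code for a URL ("000" if the connection fails).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1" || true
}

# Report liveness and readiness separately; succeed only if both return 200.
# check_api is a hypothetical helper, not an FR-APIaaS command.
check_api() {
  base="$1"  # e.g. http://localhost:8080
  live="$(http_status "$base/health")"
  ready="$(http_status "$base/health/dependencies")"
  echo "liveness=$live readiness=$ready"
  [ "$live" = "200" ] && [ "$ready" = "200" ]
}
```

A result where liveness is 200 but readiness is not would suggest the API process is running while PostgreSQL, Redis, or the embedding service is unreachable.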

Production Checklist

  • Set JWT_SECRET to a randomly generated string of at least 64 characters, for example with openssl rand -hex 32, which prints 64 hexadecimal characters.
  • Enable SSL/TLS termination in front of the API. Never expose port 8080 directly to the internet.
  • Configure CORS_ALLOWED_ORIGINS to your exact frontend domain(s) rather than *.
  • Set up log aggregation (e.g. Loki, CloudWatch, Datadog). The API emits structured JSON logs.
  • Configure automated backups for the PostgreSQL volume or managed database. Face embeddings are stored there and cannot be reconstructed without the original source images.
  • Review rate limits on API keys via the dashboard to prevent abuse before production rollout.
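
Two of the checklist items can be verified mechanically before rollout. The sketch below checks the JWT_SECRET length and rejects a wildcard CORS setting; preflight is a hypothetical helper, and it assumes both values are available as shell variables in the environment where it runs.

```shell
# Fail if JWT_SECRET is shorter than 64 characters or CORS allows every origin.
# preflight is a hypothetical helper, not an FR-APIaaS command.
preflight() {
  status=0
  secret="${JWT_SECRET:-}"
  if [ "${#secret}" -lt 64 ]; then
    echo "JWT_SECRET is shorter than 64 characters"
    status=1
  fi
  if [ "${CORS_ALLOWED_ORIGINS:-}" = "*" ]; then
    echo "CORS_ALLOWED_ORIGINS must not be *"
    status=1
  fi
  return "$status"
}
```

The remaining items (TLS termination, log aggregation, backups, rate limits) depend on your infrastructure and still need a manual review.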