FR-APIaaS - Face Recognition API as a Service

This guide explains how to deploy FR-APIaaS in a self-hosted environment using Docker Compose or Kubernetes. The platform consists of the API service, embedding service, database, cache, and object storage, all of which can run locally or on any cloud platform with container support.

Prerequisites

Docker 24+ and Docker Compose 2.20+
PostgreSQL 16 with the pgvector extension (bundled in the provided Docker image)
Redis 7
GPU: optional. The embedding service auto-detects a CUDA-capable GPU and falls back to CPU mode automatically. CPU mode is slower but fully functional.

Quick Start (Docker Compose)

For self-managed deployments, the simplest production baseline is the included docker-compose.prod.yml stack. It starts the required services and exposes the API on port 8080. Database migrations are run as a separate step after the containers are up.

# Configure (run from the root of the cloned repository)
cp .env.example .env
# Edit .env with your values

# Start all services
docker compose -f docker-compose.prod.yml up -d

# Run database migrations
docker compose -f docker-compose.prod.yml exec api ./fr-apiaas migrate up

# Verify health
curl http://localhost:8080/health

The API will be available at http://localhost:8080/api/v1. See the Authentication guide to issue your first API key.
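
Containers can take a few seconds to accept requests after "up -d" returns, so it can help to poll the health endpoint before running migrations or issuing keys. The following is a minimal sketch, not part of the FR-APIaaS distribution: the function name wait_for_health and its default attempt count and delay are assumptions chosen for illustration.

```shell
#!/bin/sh
# Poll a health URL until it answers with a successful HTTP status.
# Arguments (hypothetical defaults): URL, max attempts, seconds between attempts.
wait_for_health() {
    wfh_url="${1:-http://localhost:8080/health}"
    wfh_attempts="${2:-30}"
    wfh_delay="${3:-2}"
    wfh_i=1
    while [ "$wfh_i" -le "$wfh_attempts" ]; do
        # -f: exit non-zero on HTTP errors (4xx/5xx); -s: no progress output
        if curl -fs -o /dev/null "$wfh_url"; then
            echo "healthy after $wfh_i attempt(s)"
            return 0
        fi
        wfh_i=$((wfh_i + 1))
        sleep "$wfh_delay"
    done
    echo "not healthy after $wfh_attempts attempts" >&2
    return 1
}
```

Called with no arguments, wait_for_health polls http://localhost:8080/health up to 30 times; a non-zero return can be used to abort a deployment script.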

Environment Variables

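Copy .env.example to .env and provide the required values before starting the stack. In the Required column, Yes means the variable is always required, No means it is optional, and Billing, Email, or Storage means it is needed for that feature.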

Variable	Required	Description
DATABASE_URL	Yes	PostgreSQL connection string (must have pgvector installed)
REDIS_URL	Yes	Redis connection string, e.g. redis://localhost:6379/0
JWT_SECRET	Yes	Secret used to sign JWTs. Use 64+ random characters.
STRIPE_SECRET_KEY	Billing	Stripe API key. Required only if billing is enabled.
STRIPE_WEBHOOK_SECRET	Billing	Stripe webhook signing secret for payment events
SMTP_HOST	Email	SMTP server hostname for transactional email
SMTP_USER	Email	SMTP username / sender address
SMTP_PASS	Email	SMTP password or app-specific password
ADMIN_EMAILS	No	Comma-separated list of email addresses that get admin panel access
EMBEDDING_SERVICE_URL	No	URL of the embedding service. Default: http://embedding-service:8000
MINIO_ENDPOINT	Storage	MinIO (or S3-compatible) endpoint for image storage
MINIO_ACCESS_KEY	Storage	MinIO access key ID
MINIO_SECRET_KEY	Storage	MinIO secret access key
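
A quick pre-flight check can catch a missing value before the stack starts. The sketch below covers only the three variables marked Yes in the table. The function name check_required_env is hypothetical, and the check assumes plain KEY=value lines without quoting or "export" prefixes.

```shell
#!/bin/sh
# Report which always-required variables are missing or empty in an env file.
# Returns 0 if all are present, 1 otherwise.
check_required_env() {
    cre_file="${1:-.env}"
    cre_missing=0
    for cre_name in DATABASE_URL REDIS_URL JWT_SECRET; do
        # Match a line that starts with NAME= followed by at least one character
        if ! grep -Eq "^${cre_name}=.+" "$cre_file" 2>/dev/null; then
            echo "missing or empty: $cre_name" >&2
            cre_missing=1
        fi
    done
    return "$cre_missing"
}
```

Running check_required_env with no argument inspects .env in the current directory. A missing file is reported the same way as missing variables.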

Kubernetes

Kubernetes manifests are available in the k8s/ directory. The embedding service deployment includes a Horizontal Pod Autoscaler (HPA) configured to scale between 2 and 10 replicas based on CPU utilization. Apply the manifests with:

kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/

For production Kubernetes environments, prefer managed PostgreSQL and Redis services rather than in-cluster StatefulSets.
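
Before touching a live cluster, it can be useful to validate the manifests first and then apply them in the order shown above. The following is one possible wrapper, sketched under the assumption that the manifests live in k8s/ as described. The function name apply_manifests is hypothetical, and the final command lists autoscalers in all namespaces because the guide does not state the namespace name.

```shell
#!/bin/sh
# Validate every manifest client-side, then apply in the documented order.
apply_manifests() {
    am_dir="${1:-k8s}"
    # --dry-run=client builds the objects without persisting anything
    kubectl apply --dry-run=client -f "$am_dir/" || return 1
    for am_file in namespace.yaml configmap.yaml secrets.yaml; do
        kubectl apply -f "$am_dir/$am_file" || return 1
    done
    kubectl apply -f "$am_dir/" || return 1
    # List autoscalers across namespaces to confirm the HPA was created
    kubectl get hpa --all-namespaces
}
```

The function stops at the first failing step, so a manifest that does not validate prevents any change from being applied.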

GPU vs CPU Mode

The embedding service detects a CUDA-capable GPU at startup and uses it automatically. To force CPU-only mode, for example on hosts without GPU drivers, set the following environment variable on the embedding service container:

MODEL_DEVICE=cpu

CPU mode increases embedding latency, but recognition accuracy remains unchanged. It is suitable for development, evaluation, and lower-throughput production workloads.
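
On hosts where it is unclear whether a usable GPU is present, a deployment script can decide whether to add the override. This is a rough sketch that treats a working nvidia-smi on the host as the signal, which is an assumption and not how the embedding service itself detects CUDA. The function names are hypothetical.

```shell
#!/bin/sh
# Succeeds only if nvidia-smi is on PATH and lists at least one GPU.
has_nvidia_gpu() {
    command -v nvidia-smi >/dev/null 2>&1 &&
        nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}

# Print the override line for hosts without a usable GPU. Print nothing
# otherwise, so the embedding service keeps its automatic detection.
model_device_override() {
    if ! has_nvidia_gpu; then
        echo "MODEL_DEVICE=cpu"
    fi
}
```

The output of model_device_override can be appended to the environment file used by the embedding service container, for example with model_device_override >> .env.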

Health Checks

The API exposes two health endpoints. Use them for load balancer checks and liveness or readiness probes.

Endpoint	Description
GET /health	Basic liveness check. Returns 200 if the process is running.
GET /health/dependencies	Deep readiness check. Verifies connectivity to PostgreSQL, Redis, and the embedding service.
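
A script can combine both endpoints into a single readiness decision. The sketch below assumes that /health/dependencies also answers 200 when every dependency is reachable. The guide states the 200 status only for /health, so adjust the comparison if the deep check behaves differently. The function names are hypothetical.

```shell
#!/bin/sh
# Print the HTTP status code for a URL ("000" if the request itself failed).
health_status() {
    hs_code="$(curl -s -o /dev/null -w '%{http_code}' "$1")" || hs_code="000"
    echo "$hs_code"
}

# Ready only if both the liveness and the dependency check answer 200.
is_ready() {
    ir_base="${1:-http://localhost:8080}"
    [ "$(health_status "$ir_base/health")" = "200" ] &&
        [ "$(health_status "$ir_base/health/dependencies")" = "200" ]
}
```

For example, is_ready http://localhost:8080 && echo "ready" prints "ready" only when both checks pass.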

Production Checklist

Set JWT_SECRET to a randomly generated string of at least 64 characters (e.g. openssl rand -hex 32, which prints 64 hexadecimal characters).
Terminate SSL/TLS in front of the API, for example at a reverse proxy or load balancer. Never expose port 8080 directly to the internet.
Configure CORS_ALLOWED_ORIGINS to your exact frontend domain(s) rather than *.
Set up log aggregation (e.g. Loki, CloudWatch, Datadog). The API emits structured JSON logs.
Configure automated backups for the PostgreSQL volume or managed database. Face embeddings are stored there and cannot be reconstructed without the original source images.
Before production rollout, review the rate limits on API keys in the dashboard to prevent abuse.
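
The first checklist item can be scripted. The following sketch generates a secret with the openssl command mentioned above and checks a candidate value against the 64-character minimum. The function names are hypothetical.

```shell
#!/bin/sh
# Generate a 64-character hex secret (32 random bytes) suitable for JWT_SECRET.
generate_jwt_secret() {
    openssl rand -hex 32
}

# Succeeds if the given value meets the 64-character minimum.
secret_long_enough() {
    [ "${#1}" -ge 64 ]
}
```

For example, secret_long_enough "$JWT_SECRET" || echo "JWT_SECRET is too short" flags a weak value before the stack is started.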