Confirm runtime health and licensing
Call GET /health once npx riposte-run (or your embedded host) is listening. The response lists queue/cache providers, pub/sub, the rate limiter, and other subsystems alongside onboarding status, license information, and memory usage. Use it as your canary: a production-ready payload shows status: "healthy", every service flag set to true, and onboarding.canProceed still true after launch.
Keep a simple readiness check in place:
curl -sfS http://<riposte-host>/health | jq '.data.status'
Alert whenever status flips to "degraded" or when license.currentAccounts approaches license.effectiveAccountLimit. The endpoint also reports memory.percentage so you can detect slow leaks before the process is recycled.
The same payload surfaces license.plan alongside effectiveAccountLimit and currentAccounts, so when you approach the licensed cap you can raise a ticket or expand capacity before new authorisations start failing.
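A cron-friendly sketch covering those alerts, assuming the license and memory fields sit under .data the way status does in the readiness check above (the 90% and 80% thresholds are illustrative, not recommendations):

HEALTH=$(curl -sfS "http://<riposte-host>/health")
# non-zero jq -e exit status means the check failed
echo "$HEALTH" | jq -e '.data.status == "healthy"' >/dev/null || echo "ALERT: Riposte reports a non-healthy status"
echo "$HEALTH" | jq -e '.data.license.currentAccounts < (.data.license.effectiveAccountLimit * 0.9)' >/dev/null || echo "WARN: nearing the licensed account cap"
echo "$HEALTH" | jq -e '.data.memory.percentage < 80' >/dev/null || echo "WARN: memory usage above 80%"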
Use GET /health/detailed for rollouts. It echoes per-service diagnostics so you can see which dependency is failing before reintroducing a node to traffic.
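A jq filter like the following would list only the failing dependencies before you re-add a node; it assumes the detailed payload nests per-service entries with a healthy flag under .data.services, so adjust the path to match your actual response:

curl -sfS http://<riposte-host>/health/detailed | jq '.data.services | to_entries[] | select(.value.healthy != true)'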
Use the admin portal for live debugging
Keep the bundled admin portal enabled in production—its overview surfaces the same health signals you monitor through the API, and every workspace already authenticates against the ADMIN_USERNAME/ADMIN_PASSWORD pair you provide. The navigation exposes pages for sync logs, webhook deliveries, scheduler configuration, historical backfills, and connected accounts so you can debug without touching the shell.
Start with the Sync Logs screen to filter by severity, provider, operation, or account; it streams from GET /logs and refreshes automatically while you triage incidents. When a team needs to inspect job state, pivot to Historical Syncs or Scheduler to review queued runs, start or pause backfills, and fetch the exact API payloads the UI sends so you can reproduce them in automated tooling.
The Webhooks view lists every endpoint returned by GET /webhooks/endpoints and flags inactive targets so you can spot misconfigurations before events start backing up.
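You can script the same check against the API; this sketch assumes endpoints come back as an array under .data with an active flag, which may not match your payload exactly:

curl -sfS http://<riposte-host>/webhooks/endpoints | jq '.data[] | select(.active != true)'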
Scale horizontally with PostgreSQL and Redis
Production installs should point DATABASE_URL at PostgreSQL so you can run multiple instances without SQLite file locks. Convert an existing workspace with riposte-migrate --to postgresql, then restart one node and confirm services.database stays true in /health.
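A typical conversion might look like this; the connection string is a placeholder, and the final jq path assumes services.database sits under .data as in the readiness check above:

export DATABASE_URL=postgresql://riposte:<password>@db.internal:5432/riposte
riposte-migrate --to postgresql
# restart one node, then confirm the database service is still reported as healthy
curl -sfS http://<riposte-host>/health | jq '.data.services.database'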
Set REDIS_URL to move caching, queues, and pub/sub into Redis. Riposte detects the change and flips cache.provider, queue.provider, and pubsub.provider to "redis" so replicas coordinate work through shared streams. Finish the switchover by:
1. Pointing every replica at the same Redis endpoint.
2. Verifying /health reports Redis-backed services as healthy.
3. Calling GET /health/detailed to confirm queue, rate limiter, and pub/sub services report healthy: true across replicas.
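A verification sketch across two replicas, assuming the provider fields appear under .data alongside status (the hostnames and Redis endpoint are placeholders):

export REDIS_URL=redis://redis.internal:6379   # same endpoint on every replica
for host in riposte-1 riposte-2; do            # substitute your replica hostnames
  curl -sfS "http://$host/health" | jq '{status: .data.status, cache: .data.cache.provider, queue: .data.queue.provider, pubsub: .data.pubsub.provider}'
done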
Riposte applies migrations at startup by default. Override AUTO_APPLY_MIGRATIONS=false (or set riposte.migrations.autoApply to false) when you need to run migrations once per rollout instead of on every replica.
Capture and forward structured logs
Riposte emits structured Pino logs by default. Tune verbosity with LOG_LEVEL (defaults to info) and leave pretty-printing enabled only in development so production sinks stay machine-parsable.
To export logs, enable the built-in OpenTelemetry pipeline:
1. Set OTEL_LOGS_ENABLED=true.
2. Point OTEL_EXPORTER_OTLP_ENDPOINT (or OTEL_EXPORTER_OTLP_LOGS_ENDPOINT) at your collector. Most destinations, including Datadog, Honeycomb, New Relic, and a self-hosted OpenTelemetry Collector, accept OTLP over HTTPS.
3. Provide credentials via OTEL_LOGS_HEADERS (for example api-key=<token>, DD-API-KEY=<token>, or x-honeycomb-team=<team-key>).
4. Optionally tag streams with OTEL_SERVICE_NAME, OTEL_LOGGER_NAME, and OTEL_LOGS_RESOURCE_ATTRIBUTES so downstream dashboards group Riposte separately.
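Put together, an environment file for a hosted OTLP backend might look like this; the endpoint, header, and attribute values are placeholders to adapt to your provider:

OTEL_LOGS_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.example.com:443
OTEL_LOGS_HEADERS=api-key=<token>
OTEL_SERVICE_NAME=riposte
OTEL_LOGS_RESOURCE_ATTRIBUTES=deployment.environment=production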
The logger reconfigures itself on startup using those variables, so container restarts automatically pick up changes checked into your environment repository. When you already run an OpenTelemetry Collector, point Riposte at that internal endpoint and let the collector fan logs out to Cloud Logging, Splunk, or any other OTLP-compatible sink.
The built-in GET /logs endpoint stores up to 2,000 entries in memory for quick debugging. Ship everything else to your OTLP target for retention, alerting, and search.
Set OTEL_LOGS_MAX_QUEUE_SIZE, OTEL_LOGS_MAX_EXPORT_BATCH_SIZE, or OTEL_LOGS_SCHEDULED_DELAY_MS to align with collector limits when you need to throttle exports.
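For example (the numbers are illustrative, not recommendations):

OTEL_LOGS_MAX_QUEUE_SIZE=2048
OTEL_LOGS_MAX_EXPORT_BATCH_SIZE=512
OTEL_LOGS_SCHEDULED_DELAY_MS=5000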
Debug sync issues quickly
Use GET /logs to slice by level, provider, operation, or accountId when you need to chase down a failing sync. Pair that with GET /health or /health/detailed to confirm downstream services are reachable during incidents. Every log emitted through the helpers in product/src/common/logger.ts is mirrored to both the OTLP exporter and the in-process buffer, so what you query locally matches what lands in your external aggregation layer.
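During an incident that can be as simple as the following, assuming the filters map to query parameters of the same names and entries are returned under .data:

curl -sfS "http://<riposte-host>/logs?level=error&provider=<provider>&accountId=<account-id>" | jq '.data'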
Work directly with queues and background jobs
Riposte exposes queue controls so operators can intervene without redeploying. The debounced sync queue supports GET /debounced-queue/status for global metrics, GET /debounced-queue/accounts/:accountId to inspect a single tenant, and POST /debounced-queue/clear when you must flush retries after fixing an upstream dependency. Each replica shares queue state through Redis once you set REDIS_URL, and the admin portal’s historical sync page mirrors the same APIs to pause, resume, or restart backfills when a mailbox looks stuck.
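The same controls are easy to drive from the shell while you triage:

curl -sfS http://<riposte-host>/debounced-queue/status | jq .
curl -sfS http://<riposte-host>/debounced-queue/accounts/<account-id> | jq .
# only after the upstream dependency is fixed: flush pending retries
curl -sfS -X POST http://<riposte-host>/debounced-queue/clear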
Global queue processing logs include the provider, account, and operation so you can trace Pub/Sub-triggered jobs end-to-end before replaying them.
Query PostgreSQL directly from your services
Once DATABASE_URL points at PostgreSQL, every Riposte table is available for read replicas and reporting. Key datasets include RiposteProviderAuthToken for stored OAuth secrets, RiposteMessage for cached message metadata, and RiposteMessageTrackingEvent for engagement trails. Use standard PostgreSQL clients inside your application to join Riposte data with your own domain models—indexes on accountId, providerId, and receivedAt already exist so analytic queries stay fast. Stick to SELECT access unless you are intentionally rotating credentials; the admin API and portal continue to own schema migrations.
Provision read-only roles for downstream services so you can explore Riposte tables in BI tools without exposing write access to operational workloads.
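A sketch of that provisioning plus a sample reporting query; the role name, password, and public schema are placeholders, and only the table and column names called out above are taken from Riposte itself:

psql "$DATABASE_URL" <<'SQL'
-- read-only role for BI and reporting workloads
CREATE ROLE riposte_readonly WITH LOGIN PASSWORD 'change-me';
GRANT USAGE ON SCHEMA public TO riposte_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO riposte_readonly;

-- example reporting query over the indexed columns
SELECT "accountId", count(*) AS messages
FROM "RiposteMessage"
WHERE "receivedAt" > now() - interval '7 days'
GROUP BY "accountId"
ORDER BY messages DESC;
SQL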
Integrate Google Cloud Pub/Sub when needed
If you rely on Gmail push notifications, configure googleCloud.pubsub in sync.config.json (or export GOOGLE_PUBSUB_TOPIC_NAME, GOOGLE_PUBSUB_SUBSCRIPTION_NAME, and GOOGLE_APPLICATION_CREDENTIALS). Riposte validates the topic, subscription, and credentials at startup, surfaces readiness through /health, and routes webhook deduplication through the shared Redis pub/sub service whenever REDIS_URL is present. Leave the fields unset to stick with polling, or tighten pubsub.maxMessages, ackDeadline, and maxRetries to match your project quotas before cutting traffic over.
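The environment-variable route looks like this; the topic and subscription names and the credentials path are placeholders:

export GOOGLE_PUBSUB_TOPIC_NAME=<topic-name>
export GOOGLE_PUBSUB_SUBSCRIPTION_NAME=<subscription-name>
export GOOGLE_APPLICATION_CREDENTIALS=/etc/riposte/gcp-service-account.json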
Point staging at a separate Pub/Sub subscription with the same configuration so you can rehearse upgrades without consuming production notifications.
Deploy, restart, and upgrade safely
Build production bundles with npm run build and launch them via npm start, which invokes the compiled entry point and registers shutdown hooks. When a process receives SIGINT or SIGTERM, Riposte logs the signal, runs server.stop(), drains queues, and only then exits. Send those signals with your process manager (systemd, Kubernetes, Docker) so the runtime can close out work before you scale down.
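Under Docker, for instance, docker stop delivers SIGTERM and only escalates to SIGKILL after the timeout, so leave Riposte enough room to drain (the container name and timeout are placeholders):

docker stop --time 60 riposte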
For rolling deployments:
1. Remove one replica from your load balancer and send it SIGTERM.
2. Wait for its readiness check to stop succeeding (connection refused or non-200) or for the process to exit cleanly.
3. Deploy the new build, run npm run db:migrate (or your curated migration script) if you disabled AUTO_APPLY_MIGRATIONS, and restart the service.
4. Confirm /health and /health/detailed return status: "healthy".
5. Reattach the instance to the balancer and repeat.
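One way to script those steps on the replica being rotated; lb_detach, lb_attach, and the riposte systemd unit are placeholders for whatever your platform provides, and the health paths assume the .data wrapping used in the readiness check:

lb_detach "$(hostname)"             # placeholder: remove this replica from your balancer
systemctl stop riposte              # placeholder unit; systemd sends SIGTERM so Riposte can drain
# ship the new build with your deploy tooling, then apply migrations once if AUTO_APPLY_MIGRATIONS=false
npm run build && npm run db:migrate
systemctl start riposte
until curl -sf http://localhost/health | jq -e '.data.status == "healthy"' >/dev/null; do sleep 2; done
curl -sfS http://localhost/health/detailed | jq '.data.status'
lb_attach "$(hostname)"             # placeholder: reattach once both checks pass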
Keep RIPOSTE_LICENSE_KEY, OAuth secrets, and database URLs in your environment manager so every replica boots with the same configuration. Avoid SIGKILL unless the process is wedged—any forced kill bypasses the graceful shutdown path and can drop in-flight sync work.
Use a staging environment with the same PostgreSQL and Redis backends to test new releases. Run riposte-migrate there first so production changes are predictable.
If you must terminate with SIGKILL, follow up by inspecting GET /logs on surviving replicas for stuck jobs and rerun any missed work manually.