ENTERPRISE LLM GATEWAY

The control plane for LLM traffic.

OpenAI-compatible gateway for budgets, PII guardrails, audit logs, and reliability routing.

Request a demo View security model

POLICY • AUDIT • BUDGETS • ROUTING

OpenAI-compatible Audit-first controls

Cloudflare-native edge

Policy decisions • Request trace timeline • Budget actions

Sentinel Primo Console

PII detected -> blocked

Stop cost spikes

Caps, budgets, alerts

Enforce project budgets and token caps before provider spend runs away.

Prevent leaks + PII

Redact / block before storage

Apply redact or block decisions at ingress so sensitive data does not leave your boundary.

Stay up during provider flaps

Retry + fallback routing

Fail over cleanly with retry and fallback policies when primary routes degrade.

Who this is for

Built for platform, security, and FinOps ownership

Platform teams

Standardize LLM ingress without SDK rewrites across services.

Security teams

Apply policy controls and preserve auditable request evidence.

FinOps teams

Enforce budgets and stop runaway token spend before invoices.

Core capabilities

Signature control-plane bento

How it works

Three steps to enforce policy at the edge

Step 1

Swap base URL

Keep your existing OpenAI SDK flow and point requests to Sentinel Primo.

Step 2

Apply policies

Run allow, block, and redact checks before provider execution.

Step 3

Observe & enforce

Review audit logs, budget actions, route decisions, and reliability outcomes.

Routing diagram

Control-plane proof

Internal product signals, not marketing noise

Requests view

Live request queue with policy and route outcomes.

PII redactedFallback triggeredToken usage logged

Policy decisions

Deterministic policy path with redact and block evidence.

Allow / Block / RedactRule match traceProvider call gated

Budgets and caps

Budget governance with threshold and enforcement signals.

Budget exceeded: blockedSoft threshold alertsPer-project caps

Security & trust

Controls built for high-trust operations

Data handling

Logs store operational metadata, not full payloads by default.
PII detection can redact or block before provider calls.
Provider keys remain scoped per project and environment.

Retention & audit

Retention policies are configurable by workspace or environment.
Audit events are structured for review and export.
Cost, policy, and routing outcomes stay traceable per request.

Isolation & access

Project-level API keys support workload separation.
Workspace boundaries prevent cross-team spillover.
SSO and advanced RBAC are clearly tracked as roadmap items.

FAQ

Do you log raw prompts and raw model responses? +

Not by default. Sentinel Primo is designed around metadata-first logging and configurable retention controls.

How does PII handling work? +

Policies can detect sensitive entities and apply allow, redact, or block actions before provider forwarding.

Can we configure retention windows? +

Yes. Retention profiles are configurable so teams can align operational visibility with internal policy.

Are provider keys exposed to application clients? +

No. Provider credentials stay server-side inside Sentinel Primo policy and routing layers.

Can we export audit records? +

Yes. Structured audit exports are available for compliance, finance, and security workflows.

Resources

Operational guides for rollout teams

Quickstart (5 min)

Swap your base URL, verify request flow, and enable baseline policy checks.

Cost control (budgets)

Define spend guardrails and enforce deterministic budget actions.

Reliability

Design retry and fallback behavior for provider outages and latency spikes.

Security & privacy

Configure PII treatment, metadata logs, and retention posture.

Early-access pricing is available for design partners. Request a demo to discuss fit.

Talk to sales

Bring governance to every model request

Review architecture fit, policy posture, and rollout sequencing with the Sentinel Primo team.

Request a demo Email us