Resilience documentation

Site-deployed instances

What happens at an instance whose reachability has narrowed to a site — the three severance cases, application engagement, and behavior on reconnect.

A site-deployed instance is an Authonomy instance running on customer hardware at a site whose connectivity to external identity providers can be severed: a branch, a store, a facility. The instance behaves exactly as the central case described in Failover behavior — same router, same ladder, same evaluation logic. What changes at a site is which methods the instance can reach at any given moment.

This page describes what the site instance does in normal operation, what it does during each shape of severance, and how it returns to normal on reconnect.

The reachability graph from a site

A site instance typically has three classes of paths:

Public internet — to the cloud identity providers configured on the ladder.

Internal WAN — to on-premises infrastructure such as Active Directory, where the deployment carries an AD rung.

Local — the instance’s own database, session store, and credential material, all co-located at the site.

Native authentication doesn’t appear as a separate path because its dependencies are co-located with the site in any topology compatible with site placement. Three severance cases correspond to three sets of edges in the graph.

Public internet severed, internal WAN intact

The edges to the cloud providers are cut; the edge to AD remains. The router’s ladder falls through the cloud rungs to the AD rung, which continues to serve authentication at the site. This is the case the on-premises AD rung specifically addresses, and it’s the most operationally meaningful severance shape for branch deployments that carry an AD rung.

Internal WAN severed, public internet intact

The edge to AD is cut; the edges to the cloud providers remain. The router’s ladder skips the AD rung; the cloud rungs still serve. This severance shape is less common, because the internal WAN is often the more robust of the two paths from a branch, but the topology treats it as just another edge.

Full WAN severance

Every edge leaving the site is cut. No external rung is reachable; the ladder resolves to native authentication against the instance’s canonical view. The credential store is co-located with the site in either topology, so the instance can verify any factor enrolled by users in the site’s population. This is the case that motivates the native floor — no external rung can serve it.

In all three cases, the router’s decision path is unchanged. Only the input state — which methods the health monitor reports healthy — has narrowed. There is no “severance mode” the instance enters; it keeps doing what it’s been doing, against whichever methods remain reachable.
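The point that only the input state changes can be illustrated with a minimal sketch. Everything named here is hypothetical: the rung names, the ladder order, the path labels, and the `healthy_rungs` stand-in for the health monitor are assumptions made for illustration, not Authonomy's actual configuration or API. The selection function is identical in every case; only the set of healthy rungs passed to it differs.

```python
# Hypothetical ladder, highest-preference rung first. A real ladder is
# whatever the deployment configures.
LADDER = ["cloud_idp_primary", "cloud_idp_secondary", "active_directory", "native"]

# Illustrative mapping from each rung to the path class it depends on.
PATH_FOR_RUNG = {
    "cloud_idp_primary": "public_internet",
    "cloud_idp_secondary": "public_internet",
    "active_directory": "internal_wan",
    "native": "local",
}


def healthy_rungs(paths_up):
    """Stand-in for the health monitor: a rung is healthy if its path is up."""
    return {rung for rung, path in PATH_FOR_RUNG.items() if path in paths_up}


def select_method(ladder, healthy):
    """Return the first rung on the ladder that is reported healthy."""
    for rung in ladder:
        if rung in healthy:
            return rung
    raise RuntimeError("no healthy method on the ladder")


# The same decision path, fed four different input states.
normal = select_method(LADDER, healthy_rungs({"public_internet", "internal_wan", "local"}))
internet_cut = select_method(LADDER, healthy_rungs({"internal_wan", "local"}))
internal_wan_cut = select_method(LADDER, healthy_rungs({"public_internet", "local"}))
full_cut = select_method(LADDER, healthy_rungs({"local"}))

print(normal)            # cloud_idp_primary
print(internet_cut)      # active_directory
print(internal_wan_cut)  # cloud_idp_primary
print(full_cut)          # native
```

Nothing in `select_method` branches on a severance flag, which mirrors the absence of a severance mode: a narrower healthy set simply yields a lower rung.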

Applications point at the site instance, always

Applications at the site authenticate against the site’s instance at all times, not only during severance. The instance is the authentication endpoint for the site’s applications. No application-layer failover is required when the WAN severs, because the application is already pointed at the site-local instance.

This avoids two problems a hot-failover pattern would introduce. First, there’s no decision at the application layer about when to switch; the site-local instance is the endpoint regardless. Second, there’s no ambiguity about which instance is serving a given site’s traffic during severance; traffic at the site is always served by the site-local instance.

Not every application at the site needs to authenticate through Authonomy. The deployment selects the applications whose availability matters during site severance — typically the mission-critical systems required to keep the site operating. Applications outside that set can continue authenticating directly against the customer’s identity provider and inherit that provider’s normal availability characteristics.

What the floor serves locally

Native authentication at a site is served by the instance against credentials available to that site: WebAuthn, TOTP, password, and administrative credentials where configured. Identity in the canonical view is replicated from the customer’s authoritative identity source; Authonomy-native credential material is propagated through the deployment’s credential store. Both are bounded by the drift-window and propagation contracts described in Reconciliation and drift.

The set of subjects for whom native authentication is available at a site is a deployment decision, made through the canonical view’s scope (which subjects it contains) and the enrollment policy (which subjects have a locally-verifiable factor). A deployment with a defined operational population at each site typically scopes the view to that population and enrolls them for a locally-verifiable factor.

A subject whose configured method is an external IdP flow and who has not been enrolled for a locally-verifiable factor cannot authenticate during a full WAN severance — the configured method requires a provider the site cannot reach.
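The two conditions above, view scope and enrollment, can be expressed as a simple predicate. This is a sketch under stated assumptions: the `Subject` type, its field names, and the contents of `LOCALLY_VERIFIABLE` are hypothetical and exist only to show how the two deployment decisions combine.

```python
from dataclasses import dataclass, field

# Assumed set of factor types the site can verify without any external call.
LOCALLY_VERIFIABLE = {"webauthn", "totp", "password"}


@dataclass
class Subject:
    subject_id: str
    in_view_scope: bool  # is the subject present in the site's canonical view?
    enrolled_factors: set = field(default_factory=set)


def can_authenticate_natively(subject):
    """True only if the subject is in scope AND holds a locally-verifiable factor."""
    has_local_factor = bool(subject.enrolled_factors & LOCALLY_VERIFIABLE)
    return subject.in_view_scope and has_local_factor


store_manager = Subject("u-100", in_view_scope=True, enrolled_factors={"webauthn"})
idp_only_user = Subject("u-200", in_view_scope=True, enrolled_factors={"external_idp"})
out_of_scope = Subject("u-300", in_view_scope=False, enrolled_factors={"totp"})

print(can_authenticate_natively(store_manager))  # True
print(can_authenticate_natively(idp_only_user))  # False
print(can_authenticate_natively(out_of_scope))   # False
```

The second subject is the case described above: in the view, but with no factor the site can verify on its own during a full WAN severance.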

State and sync during severance

The instance’s canonical view is kept current through the topology’s configured propagation mechanisms. During severance, the sync stream from the customer’s authoritative source is frozen — no new identity data can arrive. The drift window grows by the duration of the severance for each frozen stream.

A provisioning change at the authoritative source, or a credential revocation in the credential store, doesn’t appear at the site until reconnect. A user terminated at the upstream IdP during a four-hour WAN severance can still authenticate at the site until reconnect, which is up to four hours later. A termination made shortly before the severance began, within the configured drift window, may also not have reached the site, so the worst-case lag is the severance duration plus the drift window.
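The lag arithmetic can be made concrete with a small worked example. The 15-minute drift window below is an assumed value chosen for illustration, and the function name is hypothetical; the four-hour severance matches the example in the text.

```python
from datetime import timedelta


def worst_case_staleness(drift_window, severance):
    """Upper bound on how old the site's view of a frozen stream can be at
    the moment of reconnect: the lag already permitted before the severance
    plus the time the stream spent frozen."""
    return drift_window + severance


drift_window = timedelta(minutes=15)  # assumed configured bound, illustrative only
severance = timedelta(hours=4)

bound = worst_case_staleness(drift_window, severance)
print(bound)  # 4:15:00
```

Under these assumed numbers, an upstream change could go unseen at the site for as long as four hours and fifteen minutes, not counting the time the catchup itself takes after reconnect.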

This is the tradeoff site deployment expresses: continuity at the cost of bounded lag. Deployments whose threat model cannot tolerate the lag have three levers, described in Understand the drift window.

Audit during severance

Every authentication event served by the instance during severance is appended to the instance’s audit trail synchronously with the event. Every state transition (method selected, factor verified, operator override applied) is captured in the same trail. Audit lives at the instance; severance does not change where audit is written.

Where the deployment requires audit aggregation across multiple instances, the aggregation is a separate concern outside the request path. It is not required for the instance to serve authentication during severance, and it does not change the durability of the audit trail at the instance.

Return to normal on reconnect

When the WAN returns, the methods the site can reach widen again. The transition proceeds in three steps:

WAN reachability re-established. External providers become reachable from the site again. The instance’s health monitor begins reflecting that.

Sync and propagation catchup. The sync engine resumes propagating identity changes from the customer’s authoritative source. Credential-store changes propagate according to the selected pattern. Enrollment and recovery surfaces become available again once the credential store is reachable. The staleness window on each affected stream shrinks back to the configured drift bound.

Stabilization. A brief stabilization window applies before the router’s view of each method returns to its pre-severance state. The window exists so that a flapping WAN does not produce repeated transitions; during the window, the instance continues serving against whichever method the health monitor reports healthy.
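One plausible way to read the stabilization step is as a hold-down timer: a method counts as restored only after it has been continuously healthy for the full window, so a flapping link keeps resetting the clock. The sketch below illustrates that reading only; the class name, the 30-second window, and the use of plain numeric timestamps are assumptions, not the product's documented mechanism.

```python
class StabilizedHealth:
    """Treat a method as restored only after an unbroken healthy streak
    that lasts at least `window_seconds`."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.healthy_since = None  # start of the current unbroken healthy streak

    def observe(self, healthy, now):
        """Record one health-monitor observation taken at time `now`."""
        if not healthy:
            self.healthy_since = None  # any failure resets the streak
        elif self.healthy_since is None:
            self.healthy_since = now   # first healthy observation of a new streak

    def is_restored(self, now):
        """True once the current healthy streak has lasted the full window."""
        return self.healthy_since is not None and now - self.healthy_since >= self.window


cloud = StabilizedHealth(window_seconds=30)
cloud.observe(True, now=0)    # WAN comes back
early = cloud.is_restored(now=10)
cloud.observe(False, now=12)  # WAN flaps
cloud.observe(True, now=15)   # WAN comes back again; streak restarts
after_flap = cloud.is_restored(now=40)
settled = cloud.is_restored(now=45)

print(early, after_flap, settled)  # False False True
```

The flap at `now=12` costs only a restart of the timer. The router sees a single transition once the link has actually settled.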

There is no separate restore phase, no manual catchup procedure, no consistency rebuild. The instance was serving authentication continuously throughout the severance; the recovery is reachability widening back to its normal shape.

Hardware and placement footprint

The instance runs on customer-operated hardware at the site. Placement is deployment-dependent — a store back-office server, a branch network appliance, a facility edge host — and the instance’s footprint is modest enough to run on single-site hardware. Dedicated hardware is not required; co-residence with other site infrastructure is supported where the deployment’s isolation requirements allow it.

A site can run a redundant set of instances behind a load balancer, sharing an HA-clustered database at the site. This is the placement that adds continuity through single-instance failure; the choice between per-site and redundant-set-per-site is described in Pick a deployment placement.