Conjur architecture and deployment reference
In this topic, you will learn the basics of a Conjur Enterprise deployment: the reference architecture, hardware and software requirements, and best practices.
Architecture overview
A high availability Conjur Enterprise deployment is configured in a Leader-Standby-Follower architecture. This deployment contains the following components:
- One active Leader
- At least two Standbys
- One or more Followers; we recommend at least two
Replication
The Standbys continuously replicate the Conjur database from the active Leader using PostgreSQL streaming replication.
- Synchronous replication ensures that there is always an up-to-date Standby database.
- Asynchronous replication may lag behind the Leader; we recommend that you set one Standby as synchronous.
- Followers are read-only replicas of the Leader, configured for application authentication and authorization, and for secrets retrieval. They are deployed close to target applications for low latency.
- Leader-to-Follower replication is asynchronous. The Followers connect to the Leader through a load balancer. This avoids having to reconfigure the Followers whenever a Standby becomes the Leader.
Write operations cannot be made against a Follower; attempting one results in an error.
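On the Leader, you can inspect the replication state of each Standby with standard PostgreSQL tooling. This is a sketch, not a Conjur-specific command: the columns come from PostgreSQL's `pg_stat_replication` view, and it assumes you can query the Leader's database:

```sql
-- Run against the Leader's PostgreSQL instance.
-- sync_state is 'sync' for the synchronous Standby and 'async' for the others;
-- replay_lag shows how far an asynchronous Standby is behind the Leader.
SELECT client_addr, state, sync_state, replay_lag
FROM pg_stat_replication;
```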
Conjur cluster
The Conjur cluster consists of the Leader and Standby nodes. You can set up a Conjur cluster to fail over automatically (auto-failover) or manually (manual failover).
The Leader and Standby nodes in an auto-failover cluster share their health state with each other using etcd.
The Leader is defined with a TTL (time to live) value. If the Leader remains unavailable for longer than this period, Conjur uses the Raft consensus algorithm to elect a Standby as the new Leader. To avoid data loss, preference is given to the Standby whose database is most up to date.
The cluster should always contain an odd number of nodes: one Leader and an even number of Standbys. For example, an auto-failover cluster can have 1 Leader, 2 Standbys, and any number of DR Standbys.
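The arithmetic behind this recommendation can be sketched in a few lines of shell (illustrative only; Conjur's etcd-based failover performs this internally, and DR Standbys do not count toward the quorum):

```shell
# quorum N: strict majority of an N-node cluster needed to elect a new Leader
quorum() { echo $(( $1 / 2 + 1 )); }

# tolerated N: node failures the cluster survives while keeping a quorum
tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

quorum 3      # 3-node cluster (1 Leader + 2 Standbys) needs 2 votes
tolerated 3   # and survives 1 node failure
quorum 4      # an even, 4-node cluster needs 3 votes
tolerated 4   # yet still survives only 1 failure -- no gain over 3 nodes
```

This is why an even number of Standbys (giving an odd node total) is the sweet spot: adding a fourth node raises the quorum without improving fault tolerance.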
This flow diagram describes how an auto-failover cluster is set up.
For more information, see Configure auto-failover.
Disaster recovery
In your DR site, use DR Standbys. These instances are not in the auto-failover cluster and have to be manually promoted to the Leader.
For more information, see Site disaster recovery walkthrough.
To perform a manual failover, you promote a Standby and rebase all other Standbys in the cluster to the new Leader.
DR Standbys and Followers do not need to be rebased; they rebase automatically through the Conjur cluster load balancer, which finds the healthy Leader for them.
This flow diagram describes how a manual failover cluster is set up.
In a manual failover cluster, defining DR Standbys is optional, as regular Standbys already act as DR Standbys and are promoted manually.
Conjur Follower
Followers should run in close proximity to the applications that they serve. You can run multiple Followers in the same environment; for better scalability and availability, we recommend placing a load balancer in front of them to distribute traffic evenly.
Followers have the following characteristics:
- Followers replicate from the Leader (on port 5432) and contain the same policies and secrets.
- Followers write audit data and forward it to the Leader (on port 1999).
- Followers are used to achieve high availability. Even if the Leader is temporarily unhealthy, Followers can continue to serve the clients and keep business going.
- Followers communicate with the Leader through the cluster load balancer; they are always routed to the healthy Leader and automatically rebase if a Standby is promoted to Leader.
Best practices and recommendations
To optimize Conjur availability, we recommend the following:
- Use multiple regions and multiple AZs. This ensures that a failing region or AZ does not affect Conjur availability.
- Run the Leader and synchronous Standby in the same region but in different AZs. This ensures that at any given point, two Conjur nodes in different AZs are fully synchronized. They share a region to benefit from low network latency, because a transaction on the Leader is not complete until it is also committed on the synchronous Standby.
- Asynchronous Standbys can run in another AZ. If another AZ is not available, we recommend running them in the same AZ as the synchronous Standby so that if the Leader's AZ fails, there is still a quorum to promote one of the remaining Standbys to Leader.
- Run asynchronous DR Standbys in a different region from the Conjur cluster. If a disaster makes the main region unavailable, Conjur can be manually promoted from the DR site.
- Run Followers as close as possible to the applications they serve. This helps ensure maximum availability and minimal latency for requests.
Deployment options
The Conjur cluster (Leader and Standbys) can be deployed as follows:
Deployment type | Description
---|---
As a container | Node runs as a single container.
Followers can be deployed as follows:
Deployment type | Description
---|---
As a container | Node runs as a single container. Supported containers: Docker and Podman.
Kubernetes | Node runs as a Pod inside OpenShift/Kubernetes.
Accessibility
Leader and Standby
Port | Accessible from | Description
---|---|---
22 | Local machine for setup/management | Required for SSH access
443 | Load balancer | TLS endpoint for Conjur UI and API
444 | Load balancer | HTTP health endpoint; simplifies load balancer setup
1999 | Load balancer | Audit events are streamed from the Followers to the Leader (using syslog-ng)
5432 | Load balancer, other Standby nodes | Required for data replication from the Leader to Standbys and Followers (PostgreSQL)
Follower
Port | Accessible from | Description
---|---|---
22 | Local machine for setup/management | Optional for SSH access
443 | Load balancer | TLS endpoint for Conjur UI and API
444 | Load balancer | HTTP health endpoint; simplifies load balancer setup
Communication between components
This section describes how the Conjur components communicate with each other.
Load balancer considerations
Conjur cluster load balancer
The Conjur cluster load balancer provides a well-known network endpoint that forwards requests to the Leader in a Conjur cluster. The load balancer continuously checks the health of the instances via the /health endpoint and, based on the result, routes traffic to the healthy Leader. The health check can be done via HTTPS on port 443 or HTTP on port 444.
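As an illustration, the HTTP health check on port 444 might look like this in HAProxy. This is a hedged sketch: the hostnames are placeholders, your load balancer's syntax may differ, and it assumes (as in a typical Conjur cluster setup) that only the active Leader reports healthy on /health:

```
# HAProxy sketch: forward TCP traffic on 443 to whichever node is the Leader.
# The /health check runs over plain HTTP on port 444, so the checker does not
# need TLS; traffic follows whichever node currently answers as healthy.
frontend conjur_cluster
    bind *:443
    mode tcp
    default_backend conjur_nodes

backend conjur_nodes
    mode tcp
    option httpchk GET /health
    server node1 conjur-node-1:443 check port 444
    server node2 conjur-node-2:443 check port 444
    server node3 conjur-node-3:443 check port 444
```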
The Conjur cluster load balancer must support the following capabilities:
Follower load balancer
The Follower load balancer is used to balance API requests across two or more Followers in the same location. The following additional capabilities depend on the load balancer you have:
- Keeping the source IP address for IP address restriction and auditing - The load balancer must preserve the source IP address of the incoming request, or be able to add an X-Forwarded-For header with the original source IP address of the request.
- Mutual TLS communication - Follower-Leader communication and Kubernetes authentication rely on mutual TLS. Therefore, the load balancer should not perform TLS termination on its own; it must pass the connection through.
The Follower load balancer only needs to support the capabilities listed above if you require them for the corresponding features.
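For example, TLS passthrough can be expressed in HAProxy as a plain layer 4 proxy (a sketch; server names are placeholders). Because the TLS session passes through untouched, mutual TLS between clients, the Followers, and the Leader keeps working:

```
# HAProxy sketch: layer 4 passthrough so the Followers terminate TLS themselves.
frontend followers_in
    bind *:443
    mode tcp
    default_backend followers

backend followers
    mode tcp
    balance roundrobin
    option httpchk GET /health
    server follower1 follower-1:443 check port 444
    server follower2 follower-2:443 check port 444
```

Note that in pure TCP mode the Follower sees the load balancer's address as the source IP; preserving the original client address then typically requires transparent (source-IP) mode or the PROXY protocol, depending on your load balancer.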
Source IP address preservation
Non-transparent layer 4 and layer 7 proxies can supply the correct client IP address if they are configured to meet the following requirements:
- The first non-transparent proxy a client connects to is a layer 7 (HTTP) proxy.
- All non-transparent proxies are included in the Conjur Trusted Proxies configuration.
- All non-transparent proxies are configured to append the IP address of the request to the X-Forwarded-For HTTP header before forwarding the request.
- Clients can ONLY connect to the first proxy and are unable to bypass it.
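The rules above amount to: walk the X-Forwarded-For list from right to left and take the first address that is not a trusted proxy. A minimal shell sketch (illustrative only; Conjur performs this resolution internally, and the proxy addresses here are hypothetical):

```shell
# client_ip XFF_VALUE TRUSTED_PROXY...
# Scans the comma-separated X-Forwarded-For value right to left and prints
# the first address that is not in the trusted-proxy list.
client_ip() {
  xff=$1; shift
  trusted=" $* "
  # Reverse the list so the proxy nearest to the server is examined first.
  for ip in $(echo "$xff" | tr ',' ' ' | tr -s ' ' '\n' | tac); do
    case "$trusted" in
      *" $ip "*) continue ;;   # a trusted proxy: keep walking left
      *) echo "$ip"; return ;; # first untrusted hop: the real client
    esac
  done
}

client_ip "203.0.113.7, 10.0.0.5, 10.0.0.6" 10.0.0.5 10.0.0.6
# -> 203.0.113.7
```

Addresses appearing to the left of the real client (for example, a spoofed header value sent by the client itself) are ignored, which is why the last requirement - clients must not be able to bypass the first proxy - matters.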
Security considerations
By default, the Conjur server keys are kept inside the Conjur node in cleartext. To improve the security of these keys at rest, we recommend encrypting the server keys with a master key. The master key encrypts and decrypts the server keys, and is intended to be provided at run time, either manually or automatically, from a secure location such as an HSM or AWS KMS. When a Conjur node starts, it can automatically access the master key in its protected store, use it to decrypt the server keys, and start the Conjur services in a healthy state.
AWS KMS
The following image depicts the relationship between the Conjur nodes and the AWS instances, where AWS KMS is used to secure the master key.
HSM
The following image depicts the relationship between the Conjur nodes and the HSM instances, where an HSM is used to secure the master key.
For more information, see Server key encryption methods.
Audit
Conjur keeps audit information on actions that are performed in the system. The audit is written to three destinations:
Destination | Description
---|---
audit.json | Audits in JSON format; located in /var/log/conjur
audit.log | Audits in text format; located in /var/log/conjur
Audit DB | Provides easily accessible audit data for the Conjur UI
Audits are collected from the Followers and sent to the Leader, which adds its own audit to these destinations. The audits collected on the Leader are local only and are not replicated to other Conjur nodes. Therefore, we recommend that you export them from the Leader to a centralized SIEM.
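As an illustration, forwarding the Leader's JSON audit file to a SIEM could look like this in syslog-ng (a hedged sketch; the destination host and port are placeholders, and your collector may expect a different transport or format):

```
# syslog-ng sketch: tail the Leader's JSON audit log and relay it to a SIEM.
source s_conjur_audit {
    file("/var/log/conjur/audit.json" follow-freq(1));
};

destination d_siem {
    network("siem.example.com" port(514) transport("tcp"));
};

log { source(s_conjur_audit); destination(d_siem); };
```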