How to Force Primary Reads for Critical User Transactions
Operational scenario: A write completes successfully, but the immediately following read — served from an asynchronous replica — returns the pre-write state. For payment verification, inventory deduction, or session token validation, this is not an eventual-consistency nuisance; it is a correctness failure.
This page gives you the exact steps to route those reads to the primary, with proxy rules, ORM configuration, session-level hints, a load-verified validation procedure, and a safe rollback path.
Symptom Identification
Look for these signals before concluding you have a stale-read problem on a critical path:
Application log patterns
OptimisticLockingFailureExceptionorDataMismatchExceptionwithin 500 ms of a successful write response.- HTTP
409 Conflictresponses on resources that were just created or updated by the same user session. - Inventory reservation failures where the system reports insufficient stock for items the user just added to cart.
Database metrics
-- PostgreSQL: check current replay lag per replica
SELECT client_addr,
state,
EXTRACT(EPOCH FROM replay_lag) AS lag_seconds
FROM pg_stat_replication;
-- MySQL / MariaDB
SHOW REPLICA STATUS\G
-- Look at: Seconds_Behind_Source, Replica_IO_Running, Replica_SQL_RunningThreshold calibration: When replication lag on any replica serving payment or session paths exceeds 250 ms, route those reads to the primary. For inventory paths a 500 ms threshold is typical. These numbers must be derived from your actual write-to-read interval on critical flows, not set arbitrarily.
Header-propagation signal: If your write responses include an X-Write-Timestamp header, the client can compare it to Date on the subsequent read response. A read served more than max_replication_lag ms after the write timestamp is a candidate for stale data.
Routing Architecture
The diagram below shows where primary-read enforcement can be applied at each layer. Enforcement at the proxy is the most reliable; ORM-level is the most portable; session hints are the last-resort fallback.
Root Cause Analysis
Stale reads on critical paths come from one of four sources:
ORM defaults: Hibernate, Spring Data, Prisma, and most ORMs route SELECT statements to read-only data sources by default. A @Transactional(readOnly=true) annotation — or its absence when a framework infers read-only mode — will direct the query to a replica regardless of the transaction’s business semantics.
Connection pool misconfiguration: HikariCP, R2DBC, and c3p0 apply routing rules at pool selection time. If the routing key is determined before the query text is known, locking queries like SELECT ... FOR UPDATE can still land on a replica, where they will either be rejected (read-only instance error) or silently skip the lock.
Proxy rule gaps: Implementing read/write splitting at the proxy layer requires explicit rules for every query pattern that must land on the primary. A missing rule for SELECT balance FROM accounts WHERE ... (no FOR UPDATE) will route it to a replica even when the application logic requires a fresh value.
Binary log / WAL throughput saturation: During write spikes, the replica’s SQL thread falls behind the I/O thread. Even if application routing is correct, a replica that showed 0 ms lag ten seconds ago may now be 800 ms behind. Routing decisions must be made against real-time lag samples, not cached state.
Step-by-Step Resolution
Step 1 — Tag critical endpoints with consistency metadata
Inject routing directives at the API boundary so every downstream layer can make the same routing decision without inspecting query text.
X-DB-Consistency: STRONG
X-Transaction-Isolation: READ_COMMITTEDFor framework-annotated code, use the transaction attributes directly:
// Java / Spring — forces write datasource, READ_COMMITTED isolation
@Transactional(isolation = Isolation.READ_COMMITTED, readOnly = false)
public PaymentResult verifyPayment(String txId) { ... }Verification: In your access log, confirm requests to /api/payments/* and /api/inventory/deduct carry X-DB-Consistency: STRONG on every call. Missing headers on retry paths are the most common gap.
Step 2 — Configure connection pool middleware
Implement a datasource interceptor that reads the consistency tag and switches the active datasource before the query executes.
// HikariCP / Spring — AbstractRoutingDataSource interceptor
@Component
public class ConsistencyRoutingFilter extends OncePerRequestFilter {
@Override
protected void doFilterInternal(HttpServletRequest req,
HttpServletResponse res, FilterChain chain)
throws ServletException, IOException {
String consistency = req.getHeader("X-DB-Consistency");
boolean txActive = TransactionSynchronizationManager.isActualTransactionActive();
if ("STRONG".equals(consistency) || txActive) {
RoutingContextHolder.set(RoutingKey.PRIMARY); // write datasource
} else {
RoutingContextHolder.set(RoutingKey.REPLICA); // read datasource
}
try {
chain.doFilter(req, res);
} finally {
RoutingContextHolder.clear(); // avoid context leak
}
}
}Verification: After deploying, query the pool stats endpoint (/actuator/metrics/hikaricp.connections.active) and confirm that payment endpoints draw connections from the primaryPool, not replicaPool. Use TransactionSynchronizationManager.getCurrentTransactionName() in a test to assert the correct pool is selected.
Step 3 — Deploy proxy-layer routing rules
Proxy-level enforcement is independent of application code and survives ORM upgrades. Configure ProxySQL (MySQL) or a PgBouncer alias (PostgreSQL) to enforce primary routing for locking queries and high-stakes patterns.
ProxySQL (MySQL / MariaDB):
-- Route SELECT ... FOR UPDATE and balance reads to the primary hostgroup (10)
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES
(101, 1, '^SELECT.*FOR UPDATE', 10, 1),
(102, 1, '^SELECT.*LOCK IN SHARE MODE', 10, 1),
(103, 1, '^SELECT.*balance.*FROM accounts', 10, 1),
(104, 1, '^SELECT.*FROM sessions.*WHERE token', 10, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;Note: the /* FORCE_MASTER */ comment trick works only as a ProxySQL match_digest pattern — it is not a native MySQL SQL hint. ProxySQL intercepts the comment during query routing; the database engine never sees it.
PgBouncer (PostgreSQL): Create a dedicated app_primary database alias that points to the primary’s host and restricts pool_mode to transaction:
[databases]
; normal read alias — can hit any replica
app_read = host=replica1.internal port=5432 dbname=appdb pool_size=40
; primary alias — used only for critical read paths
app_primary = host=primary.internal port=5432 dbname=appdb pool_size=20
; pool_size capped to prevent critical reads from starving writes
[pgbouncer]
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20
server_reset_query = DISCARD ALLApplications connect to app_primary for critical operations and app_read for everything else. This sidesteps PgBouncer’s lack of query-level routing by making the connection string itself the routing signal.
Verification: After applying rules, run a test SELECT ... FOR UPDATE through your proxy and confirm it lands on the primary:
-- PostgreSQL: on the primary, check active connections
SELECT pid, usename, client_addr, state, query
FROM pg_stat_activity
WHERE query LIKE '%FOR UPDATE%';
-- MySQL: on the primary
SHOW PROCESSLIST;
-- client_addr should match your application servers, not replica IPsStep 4 — Add database-level session hints as a fallback
When middleware routing fails or cannot be deployed immediately, enforce primary execution at the session level. This is a last-resort measure — it requires the application to hold a connection to the primary and set session state before executing the critical query.
PostgreSQL:
-- On a connection from the primary pool only
SET SESSION default_transaction_read_only = OFF;
-- Then execute the critical read
SELECT balance, version FROM accounts WHERE id = $1 FOR UPDATE;MySQL / MariaDB:
-- Ensure the connection is from a non-read-only endpoint
-- Use a separate DSN that points to the primary, then:
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
SELECT quantity FROM inventory WHERE sku = ? FOR UPDATE;
-- ... perform deduction ...
COMMIT;Verification: Confirm the session setting took effect before executing:
-- PostgreSQL
SHOW default_transaction_read_only; -- must return 'off'
-- MySQL
SELECT @@transaction_isolation; -- must return 'READ-COMMITTED'Step 5 — Validate routing under simulated load
Before declaring the configuration complete, generate realistic write-then-read workloads and confirm routing holds under concurrency.
# PostgreSQL — pgbench with a custom script that writes then immediately reads
pgbench -c 20 -j 4 -T 60 -f critical_read_test.sql appdb
# critical_read_test.sql
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = :aid;
SELECT balance FROM accounts WHERE id = :aid FOR UPDATE;
COMMIT;While the test runs, observe on the primary:
-- Count connections per client application
SELECT client_addr, count(*) AS conns, state
FROM pg_stat_activity
WHERE datname = 'appdb'
GROUP BY client_addr, state
ORDER BY conns DESC;All FOR UPDATE connections must appear on the primary’s pg_stat_activity. If any appear on a replica’s equivalent view, a routing rule is missing.
Configuration Snippet
The following is the minimal complete change set for a Spring Boot + HikariCP + ProxySQL stack. Annotate and adjust parameters before deploying:
# application.yml — dual datasource configuration
spring:
datasource:
primary:
jdbc-url: jdbc:mysql://proxysql.internal:6033/appdb
username: app_user
password: "${DB_PRIMARY_PASSWORD}"
hikari:
pool-name: primaryPool
maximum-pool-size: 20 # cap: critical reads + writes share this
minimum-idle: 5
connection-timeout: 3000 # fail fast rather than queue behind writes
read-only: false # never mark primary pool as read-only
replica:
jdbc-url: jdbc:mysql://proxysql.internal:6033/appdb
username: app_reader
password: "${DB_REPLICA_PASSWORD}"
hikari:
pool-name: replicaPool
maximum-pool-size: 50
minimum-idle: 10
read-only: true # ProxySQL sees read-only flag and routes to replica HG// RoutingDataSource — selects pool based on RoutingContextHolder
@Configuration
public class DataSourceConfig {
@Bean
@Primary
public DataSource routingDataSource(
@Qualifier("primaryDataSource") DataSource primary,
@Qualifier("replicaDataSource") DataSource replica) {
AbstractRoutingDataSource ds = new AbstractRoutingDataSource() {
@Override
protected Object determineCurrentLookupKey() {
return RoutingContextHolder.get(); // PRIMARY or REPLICA
}
};
ds.setTargetDataSources(Map.of(
RoutingKey.PRIMARY, primary,
RoutingKey.REPLICA, replica
));
ds.setDefaultTargetDataSource(replica); // safe default
ds.afterPropertiesSet();
return ds;
}
}Critical parameters:
maximum-pool-size: 20on the primary pool — deliberately low to prevent critical reads from crowding out writes.connection-timeout: 3000— fail fast if the primary is overloaded; let the circuit breaker trip rather than queue indefinitely.read-only: falseon primary pool — some JDBC drivers and proxies use this flag to enforce routing; setting it incorrectly on the primary pool sends all traffic to replicas.
Verification and Rollback
Confirm the fix is working:
-- PostgreSQL: no critical reads should appear in replica's pg_stat_activity
SELECT count(*) FROM pg_stat_activity
WHERE query LIKE '%FOR UPDATE%'
AND client_addr != '0.0.0.0'; -- replace with primary IP range
-- MySQL: on the replica, this count must be 0 during normal operation
SELECT count(*) FROM information_schema.PROCESSLIST
WHERE STATE LIKE '%Waiting%' AND INFO LIKE '%FOR UPDATE%';Monitor the primary’s connection count and CPU for 15 minutes after the change. If sustained CPU exceeds 85% or max_connections utilisation exceeds 80%, engage the circuit breaker (see below) before replication pressure compounds the issue.
Rolling back safely:
-
Toggle the feature flag: disable
db.force_primary_readsvia Consul, LaunchDarkly, or a KubernetesConfigMap. This reverts ORM routing without a code deployment. -
Verify replicas are caught up before reducing primary traffic:
sql-- PostgreSQL SELECT state, sync_state, (replay_lsn = write_lsn) AS fully_synced FROM pg_stat_replication; -- MySQL SHOW REPLICA STATUS\G -- Seconds_Behind_Source = 0 AND Replica_IO_Running = Yes -
Remove or disable the ProxySQL rules that force primary routing:
sqlUPDATE mysql_query_rules SET active = 0 WHERE rule_id IN (101, 102, 103, 104); LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK; -
Run a table checksum across primary and replica for the critical tables:
sql-- MySQL CHECKSUM TABLE accounts, inventory, sessions; -- Primary and replica sums must match; divergence indicates data loss during the incident
Edge Cases and Gotchas
Multi-region deployments: In a multi-region read replica topology, the “primary” is a single global writer. Forcing primary reads from a distant region adds cross-region round-trip latency (often 80–200 ms) to every critical operation. Model this into your SLA before enabling forced primary reads globally. A common pattern is to accept the latency for financial operations and allow regional stale reads for session data with a short expiry.
Cascading replication (replica-of-replica): If your topology includes replication chains where a replica itself serves as a source for a downstream replica, lag compounds at each hop. Routing to the intermediate replica rather than the true primary may still result in stale reads. Always route to the write node’s direct DSN, not an intermediate replica’s DSN, for locking queries.
Primary overload from forced reads: Routing all critical reads to the primary introduces capacity risk. Deploy Resilience4j or Envoy circuit breakers to monitor primary CPU, IOPS, and connection count. If utilisation exceeds 85% for more than 60 seconds, trip the breaker and temporarily redirect non-financial reads to replicas. Reserve synchronous primary routing for financial, identity, and inventory operations only — route analytics, reporting, and audit trail reads through asynchronous queues (Kafka / SQS) instead. For a structured approach to managing this degradation path, see Fallback Strategies When Replicas Fall Behind.
Logical vs physical replication (PostgreSQL): Physical streaming replication applies all WAL changes in order; logical replication filters by publication and can skip or delay certain operations. If your replica uses logical replication for selective table sync, a primary read for a table not in the publication will fail on the replica regardless of lag — force primary reads for the entire affected transaction, not just the specific table.
FAQ
Does forcing primary reads cause connection pool exhaustion?
It can if the primary pool is sized for write workloads only. Reserve a dedicated primary read pool with a ceiling (e.g. max_size=20) so critical reads compete with writes for a bounded resource — not the entire connection limit. Size the primary pool for peak_critical_read_qps * avg_query_duration_ms / 1000 connections, then add a 20% headroom buffer.
Can SELECT … FOR UPDATE always be sent to a replica?
No. SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE acquire row locks and must execute on the primary. Sending them to a replica will either fail with a read-only instance error or silently return stale data without acquiring the lock — depending on whether the replica enforces read_only = ON. Never rely on replica enforcement; enforce at the proxy or ORM layer.
How do I handle primary reads in a multi-region setup?
Route to the regional primary first. If cross-region latency is acceptable for the operation (e.g. financial writes that already pay the round-trip cost), route critical reads to the same primary. If not, consider a synchronous replica promoted to a regional primary for that region — but this requires a quorum write path, which fundamentally changes your consistency model.
Related
← Back to Fallback Strategies When Replicas Fall Behind
- Detecting and Handling Replication Lag in Real-Time — how to measure
replay_lagandSeconds_Behind_Sourcecontinuously so routing decisions are based on current state, not stale telemetry. - Using Application-Level Timestamps to Bypass Stale Replicas — a complementary technique that lets the application detect staleness at read time and escalate to the primary without proxy changes.
- Implementing Read/Write Splitting at the Proxy Layer — the broader context for ProxySQL and PgBouncer routing rules, including how primary-read rules interact with the default read/write split.
- Routing Queries Based on Data Freshness Requirements — when per-query freshness metadata is a better fit than blanket primary routing for an endpoint.