Routing Queries Based on Data Freshness Requirements

Back to Replication Lag & Consistency Management

Problem framing

Within the broader replication lag and consistency management framework, one of the most operationally risky gaps is the absence of an explicit freshness contract at the query level. Without it, every read is implicitly “best effort” — engineers assume staleness is acceptable until a financial discrepancy, a double-spend, or a vanished inventory item proves otherwise. The SLA risk is asymmetric: reads that tolerate eventual consistency are cheap to serve from replicas, but reads that silently downgrade from a strict freshness requirement cause data correctness incidents that are expensive to debug and may be externally visible.

This page covers how to close that gap: classifying read operations into discrete freshness tiers, annotating queries so routing infrastructure can act on those classifications, and wiring the routing layer to real-time lag signals so decisions remain accurate under changing replica health. Failure to enforce freshness routing also over-provisions primary capacity — unannotated queries frequently default to the primary “to be safe”, defeating the horizontal read scaling that read replicas are meant to deliver.

Concept definition and scope

Freshness-aware query routing is the practice of directing each read query to the database node whose current replication lag satisfies the maximum staleness window the query can tolerate. It is distinct from:

Freshness-aware routing adds a third axis — temporal correctness — to the routing decision. It requires both a classification scheme (tiering queries by staleness tolerance) and a feedback loop (measuring actual lag to determine which nodes satisfy each tier at any given moment).

Freshness tier model

Adopt a three-tier model covering the vast majority of production read patterns:

Tier Annotation Max acceptable lag Typical use cases
Strict FRESHNESS: STRICT 0 ms (primary or sync standby only) Financial ledgers, inventory deduction, user session state, payment idempotency keys
Near-real-time FRESHNESS: NEAR_RT ≤ 500 ms User profiles, activity feeds, dashboard metrics, order status
Eventual FRESHNESS: EVENTUAL Seconds to minutes Historical reporting, search index reads, batch analytics, audit logs

Unannotated queries must not silently route to replicas. Route them to a quarantine pool for review — or fail them with a descriptive error — rather than allowing them to bypass freshness checks. Ambiguous defaults are how staleness violations reach production.

Mechanism deep-dive

Step 1 — Classify data domains at design time

Map each data domain to a tier before writing routing infrastructure. Classify based on the consequence of serving a stale read, not on expected lag:

  • Zero-consequence stalenessEVENTUAL
  • Visible to user, recoverableNEAR_RT
  • Financial, security, or irreversible operationSTRICT

Encode the classification in a data dictionary or service contract, not just in code comments. Teams that skip this step end up with per-developer routing decisions that drift over time.

Step 2 — Inject freshness annotations at the query layer

The most portable annotation mechanism is a SQL comment injected by a middleware layer the application cannot bypass:

python
# Django ORM interceptor — injects freshness hint before execution
from django.db.backends.base.operations import DatabaseOperations

class FreshnessAnnotationMiddleware:
    TIER_HINTS = {
        "strict":   "/* FRESHNESS: STRICT */",
        "near_rt":  "/* FRESHNESS: NEAR_RT */",
        "eventual": "/* FRESHNESS: EVENTUAL */",
    }

    def execute(self, sql, params, many, context):
        tier = context.get("freshness_tier", "strict")  # default to strict
        hint = self.TIER_HINTS[tier]
        annotated_sql = f"{hint} {sql}"
        return super().execute(annotated_sql, params, many, context)

For services that cannot modify the ORM layer, inject hints via a connection string parameter that the proxy interprets, or use a session variable (SET @freshness_tier = 'eventual') that the proxy reads before each statement.

Step 3 — Configure proxy routing rules

ProxySQL evaluates mysql_query_rules in rule_id order. Anchor rules with ^ to prevent partial matches misclassifying writes as reads, and set apply=1 to short-circuit further rule evaluation:

sql
-- Hostgroup layout
-- HG 10 = writer (primary)
-- HG 20 = low-lag replicas (max_replication_lag ≤ 1s)
-- HG 30 = relaxed replicas (max_replication_lag ≤ 10s)

INSERT INTO mysql_query_rules
  (rule_id, active, match_digest, destination_hostgroup, apply, comment)
VALUES
  (10, 1, '^\/\* FRESHNESS: STRICT \*\/',   10, 1, 'Strict → primary'),
  (20, 1, '^\/\* FRESHNESS: NEAR_RT \*\/',  20, 1, 'Near-RT → low-lag replicas'),
  (30, 1, '^\/\* FRESHNESS: EVENTUAL \*\/', 30, 1, 'Eventual → any replica'),
  (99, 1, '^SELECT',                         10, 1, 'Unannotated reads → primary (quarantine)');

LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;

Step 4 — Wire lag detection into routing weights

Static hostgroup assignment is insufficient — a replica can drift from HG 20 eligibility during write bursts. ProxySQL’s max_replication_lag field per server causes automatic weight reduction to zero when lag is exceeded, but this binary approach is too coarse for near-real-time workloads. Pair it with an external lag monitor that adjusts weights dynamically:

python
# Lag monitor — polls every 250ms, adjusts weights
import pymysql, time

THRESHOLDS = {"near_rt": 0.5, "eventual": 10.0}

def adjust_replica_weights(admin_conn, lag_by_host: dict):
    for host, lag_s in lag_by_host.items():
        if lag_s <= THRESHOLDS["near_rt"]:
            weight = 100
        elif lag_s <= THRESHOLDS["eventual"]:
            weight = 30   # still eligible for EVENTUAL
        else:
            weight = 0    # circuit open
        admin_conn.execute(
            "UPDATE mysql_servers SET weight=%s WHERE hostname=%s AND hostgroup_id=20",
            (weight, host)
        )
    admin_conn.execute("LOAD MYSQL SERVERS TO RUNTIME")

Beware of thundering herd effects when multiple replicas simultaneously breach a threshold: implement jittered backoff between polling cycles and stagger weight adjustments across replicas to prevent oscillating routing states during transient spikes.

Step 5 — Define deterministic fallback paths

Every tier must have a declared fallback. When the circuit opens, the routing engine must not silently downgrade — it must execute a predefined action:

Tier Primary action Fallback when lag > threshold
STRICT Route to primary Already at primary; no fallback needed
NEAR_RT Route to low-lag replica 503 to caller, or primary with rate limit cap
EVENTUAL Route to any replica Route to least-lagged replica; only fail if all replicas are unreachable

Architecture overview

The diagram below shows the routing decision flow from an annotated application query through the proxy layer to the appropriate database node.

Freshness-aware query routing architecture An annotated application query enters ProxySQL, which evaluates query rules to extract the freshness tier, then consults the lag monitor to select an eligible replica. STRICT queries go to the primary, NEAR_RT to low-lag replicas, EVENTUAL to any replica. If no eligible replica exists for a tier the circuit breaker returns a 503. Application (annotated query) ProxySQL Query rule evaluation (extract tier) Lag monitor (heartbeat + GTID) STRICT Primary (writer node) NEAR_RT Low-lag replica (lag ≤ 500 ms) EVENTUAL Replica pool (any healthy node) Circuit breaker (503 / rate-limit) lag > SLA

Trade-off comparison table

Routing layer Implementation Zero-code-change Lag visibility Failure surface Best fit
Transparent middleware ProxySQL, HAProxy, Envoy Yes Proxy-level only Routing loops during proxy health-check failures High-throughput analytics, reporting
ORM/datasource switching Django DB Routers, Spring AbstractRoutingDataSource No Full application context Connection pool exhaustion under burst traffic Transactional services with complex routing logic
Service mesh sidecar Istio/Linkerd, Envoy cluster routing Yes Infrastructure metadata Metadata cache staleness during rapid topology changes Multi-service platforms with mTLS requirements
Application-level cursor SQLAlchemy session factories, Prisma client routing No Full query context Developer discipline required; easy to bypass Greenfield services where ORM adoption is uniform

For production systems handling mixed workloads, the most operationally resilient pattern is a hybrid: route bulk analytics and reporting through a transparent proxy, while transactional services use explicit ORM-level datasource switching. Ensure connection pools are isolated per freshness tier to prevent noisy-neighbour effects — a surge in EVENTUAL queries must not exhaust the connection budget for STRICT queries.

Configuration runbook

ProxySQL — full freshness-aware setup

sql
-- 1. Define hostgroups with explicit lag constraints
INSERT INTO mysql_replication_hostgroups
  (writer_hostgroup, reader_hostgroup, comment)
VALUES
  (10, 20, 'Primary → Read Replicas');

-- 2. Register servers with per-server lag thresholds (seconds)
--    weight=0 means the server is registered but not eligible until lag drops
INSERT INTO mysql_servers
  (hostgroup_id, hostname, port, max_replication_lag, weight, comment)
VALUES
  (10, 'primary.db.internal',    3306, 0,  100, 'Writer node'),
  (20, 'replica-01.db.internal', 3306, 1,  100, 'HG20: low-lag replica'),
  (20, 'replica-02.db.internal', 3306, 1,   80, 'HG20: low-lag replica'),
  (30, 'replica-03.db.internal', 3306, 10, 100, 'HG30: relaxed replica'),
  (30, 'replica-04.db.internal', 3306, 10,  60, 'HG30: relaxed replica');

-- 3. Query rules — rule_id order matters; lower = higher priority
--    apply=1 short-circuits further evaluation
INSERT INTO mysql_query_rules
  (rule_id, active, match_digest, destination_hostgroup, apply, comment)
VALUES
  (10, 1, '^\/\* FRESHNESS: STRICT \*\/',   10, 1, 'Strict → primary'),
  (20, 1, '^\/\* FRESHNESS: NEAR_RT \*\/',  20, 1, 'Near-RT → HG20'),
  (30, 1, '^\/\* FRESHNESS: EVENTUAL \*\/', 30, 1, 'Eventual → HG30'),
  (99, 1, '^SELECT',                         10, 1, 'Unannotated reads → primary (quarantine)');

-- 4. Monitoring and timeout tuning
UPDATE global_variables SET variable_value = '250'
  WHERE variable_name = 'mysql-monitor_replication_lag_interval';    -- poll every 250ms
UPDATE global_variables SET variable_value = '200'
  WHERE variable_name = 'mysql-monitor_read_only_timeout';           -- replica health timeout
UPDATE global_variables SET variable_value = '1000'
  WHERE variable_name = 'mysql-connect_timeout_server';              -- connection timeout
UPDATE global_variables SET variable_value = '2'
  WHERE variable_name = 'mysql-query_retries_on_failure';            -- retry budget

LOAD MYSQL VARIABLES TO RUNTIME; SAVE MYSQL VARIABLES TO DISK;
LOAD MYSQL SERVERS TO RUNTIME;   SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;

Failure-state annotation: When max_replication_lag is breached for a server, ProxySQL reduces its effective weight to 0, shifting traffic to the writer hostgroup. To prevent primary overload, cap mysql-max_connections and implement application-level circuit breakers that return cached responses or 503 when the primary connection queue depth exceeds 80%.

PostgreSQL — lag detection query for routing agents

sql
-- Run on each replica to measure true application-level lag
-- Requires a heartbeat table populated by the primary every 250ms
SELECT
  now() - MAX(written_at) AS replica_lag_seconds,
  pg_is_in_recovery()     AS is_replica
FROM replication_heartbeat;

-- Cross-check with streaming replication stats (on primary)
SELECT
  application_name,
  client_addr,
  EXTRACT(EPOCH FROM replay_lag) AS replay_lag_seconds,
  state
FROM pg_stat_replication
ORDER BY replay_lag_seconds DESC;

Monitoring and alerting signals

Instrument the routing layer with distributed tracing spans that capture freshness decision metadata alongside normal latency signals.

Key metrics to export:

Metric Type Alert threshold
replica_lag_seconds{host, tier} Gauge NEAR_RT: > 0.5s for 30s; EVENTUAL: > 30s for 2m
routing_fallback_total{tier} Counter Any STRICT fallback triggers PagerDuty
routing_decision_latency_ms Histogram p99 > 5ms indicates proxy saturation
consistency_violation_rate{service} Counter > 0 for STRICT tier is a P1 incident
circuit_breaker_open{hostgroup} Gauge Alert immediately when HG20 circuit opens

Prometheus alerting rules:

yaml
groups:
  - name: freshness_routing
    rules:
      - alert: NearRTReplicaLagExceeded
        expr: replica_lag_seconds{tier="near_rt"} > 0.5
        for: 30s
        labels:
          severity: warning
        annotations:
          summary: "Replica {{ $labels.host }} exceeds 500ms lag for NEAR_RT tier"
          action: "Verify replication threads; check network latency; prepare manual weight override"

      - alert: StrictRoutingFallbackTriggered
        expr: increase(routing_fallback_total{tier="strict"}[5m]) > 0
        labels:
          severity: critical
        annotations:
          summary: "STRICT-tier query fell back — investigate primary or routing rule misconfiguration"

      - alert: CircuitBreakerOpen
        expr: circuit_breaker_open{hostgroup="20"} == 1
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "HG20 circuit open — NEAR_RT traffic routing to primary"

Heartbeat table setup (MySQL/PostgreSQL):

sql
-- Create heartbeat table on primary
CREATE TABLE replication_heartbeat (
  id          INT PRIMARY KEY DEFAULT 1,
  written_at  TIMESTAMP(6) NOT NULL DEFAULT NOW(),
  source_host VARCHAR(128)
);

-- Populate from a cron / sidecar every 250ms
-- MySQL
INSERT INTO replication_heartbeat (id, written_at, source_host)
VALUES (1, NOW(6), @@hostname)
ON DUPLICATE KEY UPDATE written_at = NOW(6), source_host = @@hostname;

-- PostgreSQL
INSERT INTO replication_heartbeat (id, written_at, source_host)
VALUES (1, clock_timestamp(), inet_server_addr()::text)
ON CONFLICT (id) DO UPDATE
  SET written_at = clock_timestamp(),
      source_host = inet_server_addr()::text;

Heartbeat-based lag measurement captures application-level latency — including cases where Seconds_Behind_Source (MySQL) or replay_lag from pg_stat_replication (PostgreSQL) reports zero during a replication thread stall that has not yet disconnected. Complement both signals with GTID/LSN gap tracking to detect silent halts.

Failure modes and recovery steps

1. Unannotated queries silently routed to replicas

Root cause: A catch-all SELECT rule exists without requiring freshness annotations, so all reads reach replicas regardless of staleness tolerance.

Diagnosis:

sql
-- Identify unannotated read traffic in ProxySQL's query digest
SELECT hostgroup, digest_text, count_star
FROM stats_mysql_query_digest
WHERE digest_text LIKE '%SELECT%'
  AND digest_text NOT LIKE '%FRESHNESS:%'
ORDER BY count_star DESC
LIMIT 20;

Remediation: Change the catch-all rule to route unannotated reads to the writer hostgroup (hostgroup 10) and raise an application-level alert. Do not silently accept them on replicas.


2. Thundering herd when multiple replicas simultaneously exceed the threshold

Root cause: A write spike or network hiccup causes several replicas to breach the lag threshold at the same time, collapsing all replica weight to zero and flooding the primary.

Diagnosis: Sudden spike in primary Threads_running (MySQL) or active_connections (PostgreSQL), combined with circuit_breaker_open firing for multiple hostgroups simultaneously.

Remediation:

python
# Stagger weight reductions with jitter to avoid simultaneous collapse
import random, time

def degrade_replica(admin_conn, host, current_lag):
    delay = random.uniform(0.05, 0.3)  # 50–300ms jitter
    time.sleep(delay)
    new_weight = compute_weight(current_lag)
    admin_conn.execute(
        "UPDATE mysql_servers SET weight=%s WHERE hostname=%s", (new_weight, host)
    )
    admin_conn.execute("LOAD MYSQL SERVERS TO RUNTIME")

Also apply max_connections caps on the primary to prevent cascade overload.


3. Proxy rule regex misclassifying writes as reads

Root cause: Unanchored regex patterns (missing ^) match write queries containing SELECT subqueries, routing them to read replicas.

Diagnosis:

sql
-- Check query rule match counts in ProxySQL
SELECT rule_id, match_digest, hits
FROM stats_mysql_query_rules
ORDER BY hits DESC;

-- Inspect matched queries that were writes
SELECT hostgroup, digest_text
FROM stats_mysql_query_digest
WHERE digest_text LIKE '%INSERT%' OR digest_text LIKE '%UPDATE%'
  AND hostgroup != 10;  -- HG10 = writer

Remediation: Anchor all regex rules with ^. For MySQL queries containing comment hints, ensure the comment appears at the very start of the SQL string before the DML keyword.


4. Statistics lag causing suboptimal execution plans on replicas

Root cause: Replicas receive bulk writes that alter data distribution, but optimizer statistics are not refreshed, causing full-table scans on queries that should use an index.

Diagnosis: High Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests ratio on MySQL, or low heap_blks_hit / (heap_blks_hit + heap_blks_read) in pg_statio_user_tables on PostgreSQL.

Remediation:

sql
-- MySQL: refresh statistics on affected table
ANALYZE TABLE orders;

-- PostgreSQL: refresh without locking
ANALYZE orders;

-- Add an optimizer hint for critical read paths where statistics lag is unavoidable
-- MySQL
SELECT /*+ INDEX(o idx_orders_created_at) */ * FROM orders WHERE created_at > NOW() - INTERVAL 1 HOUR;

-- PostgreSQL
/*+ IndexScan(o idx_orders_created_at) */
SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '1 hour';

Also enforce query complexity limits: max_execution_time in MySQL and statement_timeout in PostgreSQL to reject unbounded scans on replica endpoints.


5. Incident response — freshness guarantees cannot be met

When multiple replicas breach thresholds simultaneously and primary rate limits are hit, execute this sequence:

  1. Open the circuit: Force all NEAR_RT and EVENTUAL traffic to the primary with a strict connection cap (mysql-max_connections or pg_hba.conf limit for replica users).
  2. Drain long-running replica queries: SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle in transaction' AND query_start < NOW() - INTERVAL '30 seconds' (PostgreSQL) or KILL QUERY <id> for blocking threads (MySQL).
  3. Validate replication state: Run heartbeat delta checks and GTID/LSN gap analysis on each replica.
  4. Restore gradually: Increase replica weights at 10% increments, monitoring buffer pool hit ratio and query latency p99 between each increment.

Child pages

Operational checklist

FAQ

How do I prevent strict queries from silently reading stale data?

Tag STRICT queries with a SQL comment annotation and route them exclusively to a writer hostgroup or synchronous standby. Configure ProxySQL query rules with apply=1 so the rule short-circuits before any replica hostgroup is evaluated. Pair this with application-level assertions that verify result freshness against a heartbeat table timestamp.

What is the safest fallback when all replicas exceed the lag threshold?

Open the circuit for NEAR_RT traffic: return a 503 or cached response rather than silently routing to a stale replica. Apply strict rate limiting when falling back to the primary to prevent write amplification. For EVENTUAL traffic, allow continued routing to the least-lagged replica rather than failing outright.

Should I use a proxy or application-layer routing for freshness-aware queries?

Proxies (ProxySQL, PgBouncer, HAProxy) offer zero code changes and centralised control but are opaque to application logic. ORM-level routing (Django DB Routers, Spring AbstractRoutingDataSource) gives explicit developer control and is easier to test in isolation. Most high-throughput systems use a hybrid: transparent proxy for bulk analytics and explicit ORM switching for transactional services.