Clash Meta url-test & fallback: Auto Failover (2026)

What problem are url-test and fallback solving?

A static select group is honest: it does exactly what you click. That is ideal when you want deterministic behavior for debugging or when a provider forbids silent switching. It is less ideal when your subscription contains twenty similar nodes and three of them are having a bad day. url-test exists so the core can periodically measure latency to a small probe endpoint through each candidate and keep traffic on whichever node currently wins the race, subject to a stability rule so you do not oscillate every thirty seconds. fallback exists when you truly care about order: try my expensive dedicated line first, then a cheaper multi-tenant pool, then DIRECT for domestic sites, without manually dragging priorities after each outage.

Both types rely on the same underlying idea: a health check is just an HTTP(S) request the core issues on a timer through each member (or through the active member, depending on options) to decide whether a hop is alive and how fast it answers. That is not the same as a speed-test app saturating your link; it is a lightweight probe that correlates surprisingly well with “will this tab load before I get angry.” If your rules still send traffic to the wrong group, no amount of tuning interval will help—confirm GEOIP/GEOSITE ordering and DNS/FakeIP alignment before you chase proxy-groups math.

url-test versus fallback in one paragraph

Use url-test when candidates are roughly interchangeable and you want the lowest delay to a representative probe URL, for example regional VM clusters where any exit is acceptable. Use fallback when candidates are not interchangeable and you want a strict ladder: primary until it fails health criteria, then secondary, then tertiary. In production profiles people often nest both: a url-test pool for “Asia anycast,” another for “US anycast,” then a fallback that tries the Asia pool first and falls through to the US pool only when the Asia pool fails the fallback group's own health probe. You can also place a fallback under a top-level select if you still want a manual override entry in the UI.
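
As a rough sketch of that nesting, the fragment below wires two url-test pools under a fallback, then exposes the ladder through a manual select. All group and node names (ASIA-POOL, US-POOL, REGION-LADDER, MANUAL, Asia-1 and so on) are placeholders rather than names from any real subscription, and the timers are illustrative only.

proxy-groups:
  # Interchangeable exits: lowest measured delay wins inside each pool.
  - name: ASIA-POOL
    type: url-test
    url: http://www.gstatic.com/generate_204
    interval: 300
    tolerance: 50
    proxies:
      - Asia-1
      - Asia-2

  - name: US-POOL
    type: url-test
    url: http://www.gstatic.com/generate_204
    interval: 300
    tolerance: 50
    proxies:
      - US-1
      - US-2

  # Strict order: stay on ASIA-POOL until it fails this group's probe.
  - name: REGION-LADDER
    type: fallback
    url: http://www.gstatic.com/generate_204
    interval: 120
    proxies:
      - ASIA-POOL
      - US-POOL

  # Manual override entry for the UI.
  - name: MANUAL
    type: select
    proxies:
      - REGION-LADDER
      - ASIA-POOL
      - US-POOL
      - DIRECT

Listing REGION-LADDER first in the select is deliberate: clients commonly start a select group on its first entry, so the automatic ladder is the default and a manual pick stays one click away.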

Remember: automatic groups only rearrange which proxy name is active inside that group. Your rules section still decides whether traffic reaches that group at all.

Health check fields you actually touch

Most working configurations set four knobs: url (what to fetch), interval (seconds between probe rounds), tolerance (in url-test, how many milliseconds faster another node must be before the core moves your selection, which avoids pointless churn), and the probe timeout, which is either left at the default for your core version or set explicitly where the core exposes a field for it. A shorter interval reacts faster but burns more battery on laptops and can annoy strict providers if you hammer probes. A longer interval feels calmer but leaves you on a degraded node until the next sweep. A pragmatic home default in 2026 is often 120–300 seconds for stable residential links, tighter for mobile hotspots where the underlying NAT changes frequently.
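
To make the four knobs concrete, here is a minimal sketch of one group with each field annotated. The timeout line is an assumption: recent Mihomo builds document it for proxy-groups, but older cores may ignore or reject it, so check the release notes for your version before relying on it. Node names are placeholders.

proxy-groups:
  - name: AUTO-BEST
    type: url-test
    url: http://www.gstatic.com/generate_204   # what to fetch
    interval: 180    # seconds between probe rounds
    tolerance: 40    # ms another node must win by before switching
    timeout: 5000    # assumed field: ms before a probe counts as failed
    proxies:
      - Node-A
      - Node-B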

The probe url should return quickly and be reachable from every candidate. Community profiles often use vendor-neutral endpoints such as Google’s generate_204 pattern or equivalent tiny responses; some operators prefer a URL on their own domain so they control rate limits. Avoid probes that redirect through captive portals or return huge bodies—you are measuring TCP/TLS plus HTTP latency, not downloading a film. If a node blocks the probe host entirely, that node will look “dead” even when other sites work; that is not a bug, it is a signal that your health URL is a poor match for that path.
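
If you would rather probe an endpoint you control, the only change is the url line. The host probe.example.com below is hypothetical; substitute a URL on your own domain that answers with a tiny body (ideally an empty 204) and is reachable from every member.

proxy-groups:
  - name: AUTO-BEST
    type: url-test
    # Hypothetical self-hosted probe; must be reachable through every member.
    url: https://probe.example.com/generate_204
    interval: 180
    tolerance: 40
    proxies:
      - Node-A
      - Node-B
      - Node-C

An HTTPS probe adds a TLS handshake to each measurement, so absolute numbers will typically read higher than with a plain HTTP probe. That is fine as long as every member is measured the same way.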

Reading tolerance as a latency threshold

Newcomers misread tolerance as “maximum ping allowed.” In url-test it is better read as a latency threshold for switching: if the currently selected node reports 180 ms and another reports 160 ms, a tolerance of 50 ms means the core stays where it is, because the 20 ms improvement does not clear the 50 ms hysteresis band. That prevents flapping when two nodes statistically tie. If you set tolerance to zero, you chase every micro-optimization and may trigger more policy churn than your applications appreciate. If you set tolerance extremely high, you rarely move even when a clearly better node exists, which is useful when stability beats optimality and counterproductive when you wanted aggressive auto failover.
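
A short worked example may help. The delays in the comments are made up, and the comparison shown (switch only when the current node is slower than the fastest candidate by more than tolerance) is a simplified reading of url-test behavior, not a quote from the core's source.

proxy-groups:
  - name: AUTO-BEST
    type: url-test
    url: http://www.gstatic.com/generate_204
    interval: 180
    tolerance: 50
    proxies:
      - Node-A   # currently selected, last probe 180 ms
      - Node-B   # last probe 160 ms: 180 - 160 = 20 ms gain, below 50, so no switch
      - Node-C   # if it later measures 120 ms: 180 - 120 = 60 ms gain, above 50, so switch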

Step 1: Declare a url-test group (fastest healthy node)

Start from real proxy names exactly as they appear after your subscription merge. The following skeleton declares a pool called AUTO-BEST that tests three nodes every three minutes against a lightweight 204 endpoint, with a 40 ms switching hysteresis. Adjust names to match your file.

proxy-groups:
  - name: AUTO-BEST
    type: url-test
    url: http://www.gstatic.com/generate_204
    interval: 180
    tolerance: 40
    lazy: false
    proxies:
      - Node-A
      - Node-B
      - Node-C

  - name: PROXY
    type: select
    proxies:
      - AUTO-BEST
      - Node-A
      - Node-B
      - DIRECT

Setting lazy: false (when supported) means the core will warm-test members instead of waiting until a group is first selected—helpful when your top MATCH rule points here and you want numbers populated before you open a browser. If you prefer fewer background probes on metered networks, flip lazy to true and accept that the first session after launch might briefly use an unmeasured choice until tests complete.

Wire AUTO-BEST into rules the same way you would wire any other policy name: explicit domain lines above GEOIP, then broader catches. Nothing about url-test exempts you from rule order discipline; see the GEOIP/GEOSITE article if catch-all lines steal traffic.

Step 2: Declare a fallback group (primary / backup)

A fallback group walks the proxies list top to bottom and sticks with the first member that passes the health check. That matches mental models like “corporate egress first, consumer VPN second, direct last.” You can include another proxy-groups entry as a member, not only leaf nodes—handy when your primary is itself a url-test bundle.

proxy-groups:
  - name: AUTO-BEST
    type: url-test
    url: http://www.gstatic.com/generate_204
    interval: 180
    tolerance: 40
    proxies:
      - Node-A
      - Node-B

  - name: RESILIENT
    type: fallback
    url: http://www.gstatic.com/generate_204
    interval: 60
    proxies:
      - AUTO-BEST
      - Node-C
      - DIRECT

Notice the shorter interval on RESILIENT: when the entire AUTO-BEST bundle is unhealthy, you may want faster promotion to Node-C. You can tune independently per group; just document why so future-you does not “simplify” them into one number and reintroduce slow failover.

Step 3: Point rules at the automatic group, not scattered nodes

A common mistake is listing fifteen raw nodes in rules while expecting url-test to save you. The selector only runs among members of the group your rule referenced. Centralize: build one AUTO-BEST or RESILIENT umbrella, reference that name in rules, and keep raw nodes as manual overrides inside a select for debugging. When subscriptions rename nodes after refresh, you update one group instead of fifty lines of DOMAIN-SUFFIX targets.

rules:
  - DOMAIN-SUFFIX,google.com,RESILIENT
  - GEOIP,CN,DIRECT
  - MATCH,RESILIENT

If you also use advanced constructs such as relay chains, treat each chain as just another named outbound: you may place a relay inside fallback as a primary path if your provider expects that topology—then fall back to a simple single-hop node when the chain fails probes.

Step 4: Verify that failover really happens

Theory is cheap; confirmation is worth ten YAML edits. After reload, open your client’s connection panel or attach a dashboard such as the one described in the external-controller / YACD walkthrough so you can see live policy selection and latency columns. Then run a deliberate exercise: pick the currently active member inside AUTO-BEST and block it upstream (or pause that server) so the probe fails. Within roughly one interval window, traffic should migrate to the next healthy candidate without you touching the UI.

1. Reload and parse

Read the core log at info level immediately after import. Typos in proxy names inside proxy-groups fail closed with explicit errors; fix those before interpreting health results.

2. Confirm probes fire

You should see periodic health traffic in logs or metrics. Silence usually means lazy: true with no selection yet, or the group is not referenced by any live rule, so the core never needed to instantiate it.

3. Bisect DNS versus group logic

If domains resolve to unexpected countries, your GEOIP lines may fire before domain rules, making it look like fallback “ignored” a failure. Fix DNS alignment first; automatic groups cannot override a rule that never routes to them.

4. Measure real applications

Synthetic probes are directional hints. After automatic selection stabilizes, open the sites you actually care about. If HTTP is fine but UDP voice fails, pivot to UDP-oriented articles, because health checks here are predominantly TCP-centric unless you have extended your profile with additional tooling.
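
The checks above are easier when the log level and controller are set explicitly. A minimal sketch, assuming the standard top-level keys; the listen address, port, and secret value are placeholders you should change.

# Show probe and selection messages at info level.
log-level: info
# Local API endpoint that a dashboard can attach to.
external-controller: 127.0.0.1:9090
# Placeholder secret; pick your own value.
secret: "change-me"

Binding the controller to 127.0.0.1 keeps the API off the local network; the dashboard then connects to that address using the secret.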

Provider etiquette: extremely short interval values across dozens of nodes can look like abuse on shared infrastructure. Prefer sane timers and local overrides for test labs.

Pitfalls that masquerade as broken health checks

First, captive portals and hotel Wi‑Fi sometimes answer every probe with a redirect HTML page that still returns HTTP 200. Your core may think the node is healthy while real TLS sites fail. Compare behavior on tethered LTE as a control. Second, some nodes block specific probe domains by policy; swap the url instead of assuming the node is offline. Third, mixing system-proxy-only capture with partially excluded apps yields split brains: the core switches groups correctly while the app never traversed the core at all—TUN mode remains the blunt instrument when capture completeness matters.
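
For the captive-portal case, one option is to make the probe stricter about what counts as success. The sketch below assumes your core supports the expected-status field on proxy-groups (it is documented for recent Mihomo builds; treat it as an assumption on anything older). With a 204 endpoint, a portal page that answers 200 would then fail the check instead of passing it.

proxy-groups:
  - name: AUTO-BEST
    type: url-test
    url: http://www.gstatic.com/generate_204
    # Assumed field: only this status code counts as a healthy probe.
    expected-status: 204
    interval: 180
    tolerance: 40
    proxies:
      - Node-A
      - Node-B
      - Node-C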

Fourth, subscription churn renames nodes underneath you. If Node-A disappears after refresh, your url-test list references a ghost; keep a small local patch file for stable aliases or regenerate groups with your converter templates. Fifth, remember that Clash Meta cannot fix an upstream that is rate-limiting your entire account: automatic selection only chooses among the members you listed and cannot invent new capacity.
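
One way to reduce the ghost-name problem, offered here as an alternative to hand-maintained aliases rather than something the steps in this article require, is to let a group pull its members from a provider and match them by pattern. This sketch assumes a core with proxy-providers plus the use and filter group fields; the provider name, subscription URL, cache path, and regular expression are all hypothetical.

proxy-providers:
  my-sub:
    type: http
    url: https://example.com/subscription.yaml   # hypothetical subscription URL
    path: ./providers/my-sub.yaml                # hypothetical local cache path
    interval: 86400                              # refresh once a day

proxy-groups:
  - name: AUTO-BEST
    type: url-test
    use:
      - my-sub
    # Hypothetical pattern: keep only nodes whose names match.
    filter: "(?i)tokyo|osaka"
    url: http://www.gstatic.com/generate_204
    interval: 180
    tolerance: 40

Because membership comes from the provider rather than a hand-written list, a renamed node stays in the group as long as its new name still matches the pattern.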

Tuning cheatsheet for 2026 home labs

Start conservative: interval 180–300 for url-test on wired desktops, tolerance between 30 and 80 ms for residential jitter, lazy: true on laptops when on battery. For fallback ladders that guard interactive work, shorten the outer interval to 60–120 so a dead primary releases quickly, but keep inner url-test pools calmer to avoid nested probe storms. Document two profiles—“aggressive” and “stable”—and snapshot YAML in git so you can roll back when an experimental timer collides with reality.
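
The two profiles could look roughly like the fragments below, kept as separate files. The filenames and node names are placeholders, and the numbers are simply picked from the ranges suggested above, not measured recommendations.

stable.yaml (hypothetical filename):

proxy-groups:
  - name: AUTO-BEST
    type: url-test
    url: http://www.gstatic.com/generate_204
    interval: 300     # calm inner pool
    tolerance: 80     # absorb residential jitter
    lazy: true        # fewer background probes on battery
    proxies:
      - Node-A
      - Node-B

  - name: RESILIENT
    type: fallback
    url: http://www.gstatic.com/generate_204
    interval: 120     # outer ladder still releases a dead primary
    proxies:
      - AUTO-BEST
      - Node-C
      - DIRECT

aggressive.yaml (hypothetical filename):

proxy-groups:
  - name: AUTO-BEST
    type: url-test
    url: http://www.gstatic.com/generate_204
    interval: 180
    tolerance: 30     # switch on smaller gains
    lazy: false       # warm-test members up front
    proxies:
      - Node-A
      - Node-B

  - name: RESILIENT
    type: fallback
    url: http://www.gstatic.com/generate_204
    interval: 60      # faster promotion to the backup
    proxies:
      - AUTO-BEST
      - Node-C
      - DIRECT

Keeping group names identical across both files means the rules section does not change when you swap profiles.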

Documentation, downloads, and source transparency

Mihomo release notes remain the authoritative place for newly introduced group fields. For installers, use the official Clash download page; GitHub is appropriate for reading source, filing issues, and verifying signatures—not the only distribution channel for day-to-day users.

Summary

url-test groups in Clash Meta pick a low-latency healthy member using periodic HTTP probes, controlled by url, interval, and a tolerance hysteresis that acts as a practical latency threshold for switching. fallback groups honor strict priority: first healthy hop wins, making them ideal for auto failover ladders. Declare both under proxy-groups, reference a single umbrella name from rules, align DNS and GEOIP order first, then verify with logs and a dashboard so you see probes and promotions in real time. Compared with endlessly clicking a manual select list, a disciplined automatic profile feels less like magic and more like instrumentation you can trust.

When outages shrink from “restart everything” to “watch the core slide to backup within one interval,” you have earned the boring kind of reliability that actually saves evenings.
