Copilot AI commented Dec 27, 2025

Fix infinite retry when a single host fails with server error

Plan:

  • Understand the issue: when _make_query_plan() sets query_plan to a list [self._host] instead of an iterator, each retry iterates the list from the beginning again, so the same host is retried indefinitely
  • Fix the bug by converting the single-host list to an iterator in the _make_query_plan() method
  • Create a unit test to verify that the fix prevents infinite retries with a single host
  • Run existing tests to ensure no regressions
  • Run code review and security checks
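
The fix in the plan can be sketched as follows. This is only an illustration under stated assumptions: `make_query_plan`, `pinned_host`, and `policy_plan` are hypothetical stand-ins, not the driver's actual `_make_query_plan()` signature. It shows that wrapping the single host in `iter()` makes the plan consumable exactly once.

```python
def make_query_plan(pinned_host=None, policy_plan=()):
    """Hypothetical stand-in for the driver's _make_query_plan()."""
    if pinned_host is not None:
        # Wrap the single host in an iterator so it can be consumed only once.
        # Returning the bare list [pinned_host] would restart on every loop.
        return iter([pinned_host])
    # Assumption: the load-balancing policy's plan is also handed out as an iterator.
    return iter(policy_plan)


plan = make_query_plan(pinned_host="127.0.0.1")
first = next(plan, None)   # the pinned host
second = next(plan, None)  # None, because the plan is exhausted
```

Once the plan is exhausted, a retry that asks for the next host finds none, which is the condition under which the caller can give up instead of trying the same host again.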
Original prompt

This section details the original issue you should resolve

<issue_title>infinite retry when a single host fails with server error</issue_title>
<issue_description>When executing a query on a single host with the default retry policy, if the query fails with a server error, the query is retried infinitely and without delays.

Consider this simple test:

    servers = await manager.servers_add(1, auto_rack_dc="dc1")
    cql, hosts = await manager.get_ready_cql(servers)
    async with new_test_keyspace(manager, "WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 1}") as ks:
        await cql.run_async(f"CREATE TABLE {ks}.t2 (pk int, ck int, v int, PRIMARY KEY (pk, ck))")
        await manager.api.enable_injection(servers[0].ip_addr, "fail_mutate_internal", one_shot=False)
        await cql.run_async(f"INSERT INTO {ks}.t2(pk, ck, v) VALUES(0,0,0)")

The query fails immediately with a NoHostAvailable exception:

>           await cql.run_async(f"INSERT INTO {ks}.t2(pk, ck, v) VALUES(0,0,0)")
E           cassandra.cluster.NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 127.193.50.1:9042 dc1>: <Error from server: code=0000 [Server error] message="Error injection in mutate_internal">})

However, suppose we change the query to run on a specific host with host=hosts[0]:

        await cql.run_async(f"INSERT INTO {ks}.t2(pk, ck, v) VALUES(0,0,0)", host=hosts[0])

Now we see an endless stream of Server error log lines:

11:16:34.820 WARNING>  Host 127.99.46.1:9042 error: Server error.
11:16:34.820 WARNING>  Host 127.99.46.1:9042 error: Server error.
11:16:34.820 WARNING>  Host 127.99.46.1:9042 error: Server error.
11:16:34.821 WARNING>  Host 127.99.46.1:9042 error: Server error.
11:16:34.821 WARNING>  Host 127.99.46.1:9042 error: Server error.
11:16:34.822 WARNING>  Host 127.99.46.1:9042 error: Server error.
11:16:34.822 WARNING>  Host 127.99.46.1:9042 error: Server error.
11:16:34.822 WARNING>  Host 127.99.46.1:9042 error: Server error.
11:16:34.823 WARNING>  Host 127.99.46.1:9042 error: Server error.
11:16:34.823 WARNING>  Host 127.99.46.1:9042 error: Server error.

I suspect the problem is that send_request expects query_plan to be an iterator, but make_query_plan creates a list when there is a single host.</issue_description>
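
The suspected cause can be reproduced without the driver. The sketch below is a simplified simulation, not the driver's real `send_request` logic: `count_attempts` and `max_attempts` are hypothetical names, and every host is assumed to fail with a server error. It contrasts a list plan, which restarts from the first host on every retry, with an iterator plan, which runs out after one attempt.

```python
def count_attempts(query_plan, max_attempts=25):
    """Simulate retrying on the next host when every host returns a server error."""
    attempts = 0
    while attempts < max_attempts:
        # Mimics a retry path that walks query_plan again on each retry.
        for host in query_plan:
            attempts += 1
            break  # this host failed; go around and ask the plan for another host
        else:
            # The plan yielded nothing: the caller would raise NoHostAvailable here.
            return attempts
    # Safety cap reached; without it, the list case would never terminate.
    return attempts


list_attempts = count_attempts(["127.0.0.1"])        # hits the cap: 25
iter_attempts = count_attempts(iter(["127.0.0.1"]))  # stops after 1
```

The list case only stops because of the artificial cap, which matches the unbounded stream of warnings in the log above.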

Comments on the Issue (you are @copilot in this section)

