Conversation
Pull request overview
Adds a new LISA test suite and sample runbook/CSV to validate that provisioned Azure VMs match declared hardware specifications (CPU, memory, NICs, disks, NVMe, and optional perf thresholds), driven by a CSV combinator.
Changes:
- Introduces a `VmSpecValidation` test suite with checks for CPU/memory, NIC counts, disk/NVMe counts, and basic fio/ntttcp perf validation.
- Adds an example runbook using the CSV combinator to iterate VM sizes and pass `expected_*` variables into each test iteration.
- Adds a sample `vm_specs.csv` with one VM size row to demonstrate the format.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| lisa/microsoft/testsuites/vm_spec_validation/vm_spec_validation.py | New VM spec validation test suite (functional + light perf checks). |
| lisa/microsoft/runbook/examples/vm_specs.csv | Sample expected-spec CSV schema + example row. |
| lisa/microsoft/runbook/examples/vm_spec_validation.yml | Example runbook wiring CSV combinator variables into the new suite. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
```python
@TestCaseMetadata(
    description="""
    Verify that the VM exposes the expected number of GPU devices.

    This test is skipped when the ``expected_gpu_count`` variable
    is empty or zero.

    Steps:
    1. Read expected_gpu_count from the runbook variables.
    2. Query GPU PCI devices via ``lspci``.
    3. Assert the GPU device count matches expected_gpu_count.
    """,
    priority=5,
    requirement=simple_requirement(min_gpu_count=1),
)
```
This test case claims it is skipped when expected_gpu_count is empty/zero, but the requirement includes simple_requirement(min_gpu_count=1), which will prevent the case from running on non-GPU SKUs (so it won't actually exercise the skip logic or validate expected_gpu_count=0). Consider removing the min_gpu_count requirement and letting the test skip based on the variable, or alternatively update the description/variables to align with the enforced GPU requirement.
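The skip-based approach this comment suggests can be sketched as a pure check function. This is a minimal sketch, not the suite's actual code: `SkippedException` is a stand-in for LISA's exception of the same name, and `check_gpu_count` is a hypothetical helper isolating the decision logic.

```python
class SkippedException(Exception):
    """Stand-in for LISA's SkippedException (assumption for this sketch)."""


def check_gpu_count(expected_gpu_count, actual_gpu_count):
    # Skip, rather than fail, when no GPU expectation is declared, so the
    # case can run on non-GPU SKUs without a min_gpu_count requirement
    # gating it out of the test pass entirely.
    if not expected_gpu_count:
        raise SkippedException("expected_gpu_count is empty or zero")
    assert actual_gpu_count == expected_gpu_count, (
        f"expected {expected_gpu_count} GPU(s), found {actual_gpu_count}"
    )
```

With this shape, a zero or empty `expected_gpu_count` reliably exercises the documented skip path instead of being filtered out by the requirement.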
```python
# Start ntttcp receiver (server) in the background
server_process = server_ntttcp.run_as_server_async(
    server_nic_name,
    run_time_seconds=60,  # 60s for a stable throughput measurement
    ports_count=ports_count,
)

try:
    # Run ntttcp sender (client)
    client_ntttcp.run_as_client(
        client_nic_name,
        server_ip=server_node.internal_address,
        threads_count=1,
        run_time_seconds=60,  # match the server duration
        ports_count=ports_count,
    )
finally:
    # Stop the server and collect its result
    server_node.tools[Kill].by_name(server_ntttcp.command)

server_executable_result = server_process.wait_result()
server_result = server_ntttcp.create_ntttcp_result(
    server_executable_result
)

# ntttcp reports throughput in Gbps — convert to Mbps to match
# the runbook variable unit.
measured_bw_mbps = int(server_result.throughput_in_gbps * 1000)

# Allow tolerance below the declared max
bw_floor_mbps = int(
    expected_bw_mbps * (100 - _PERF_TOLERANCE_PERCENT) / 100
)
log.info(
    f"VM size: {vm_size} — ntttcp network bandwidth: "
    f"measured {measured_bw_mbps} Mbps, "
    f"expected >= {bw_floor_mbps} Mbps "
    f"(declared: {expected_bw_mbps} Mbps, "
    f"tolerance: {_PERF_TOLERANCE_PERCENT}%)"
)
assert_that(measured_bw_mbps).described_as(
    f"VM size {vm_size}: expected network throughput "
    f">= {bw_floor_mbps} Mbps (declared {expected_bw_mbps} Mbps "
    f"with {_PERF_TOLERANCE_PERCENT}% tolerance) but measured "
    f"only {measured_bw_mbps} Mbps"
```
For bandwidth validation, this uses ntttcp directly without the usual Ntttcp.setup_system(...)/restore_system(...) tuning and without any retry/cleanup logic. Existing ntttcp-based perf tests (e.g. lisa/microsoft/testsuites/performance/common.py) explicitly tune sysctls/firewall and include retries to reduce flakiness. Consider following the same pattern here to improve reliability of the measured throughput vs. declared expectations.
Suggested change:
```python
# Allow tolerance below the declared max
bw_floor_mbps = int(
    expected_bw_mbps * (100 - _PERF_TOLERANCE_PERCENT) / 100
)
# Retry up to 3 times to reduce transient bandwidth variance.
max_attempts = 3
best_measured_bw_mbps = 0
server_ntttcp.setup_system()
client_ntttcp.setup_system()
try:
    for attempt in range(1, max_attempts + 1):
        # Start ntttcp receiver (server) in the background
        server_process = server_ntttcp.run_as_server_async(
            server_nic_name,
            run_time_seconds=60,  # 60s for a stable throughput measurement
            ports_count=ports_count,
        )
        try:
            # Run ntttcp sender (client)
            client_ntttcp.run_as_client(
                client_nic_name,
                server_ip=server_node.internal_address,
                threads_count=1,
                run_time_seconds=60,  # match the server duration
                ports_count=ports_count,
            )
        finally:
            # Stop the server and collect its result
            server_node.tools[Kill].by_name(server_ntttcp.command)
        server_executable_result = server_process.wait_result()
        server_result = server_ntttcp.create_ntttcp_result(
            server_executable_result
        )
        # ntttcp reports throughput in Gbps — convert to Mbps to match
        # the runbook variable unit.
        measured_bw_mbps = int(server_result.throughput_in_gbps * 1000)
        best_measured_bw_mbps = max(
            best_measured_bw_mbps, measured_bw_mbps
        )
        log.info(
            f"VM size: {vm_size} — ntttcp network bandwidth attempt "
            f"{attempt}/{max_attempts}: measured {measured_bw_mbps} Mbps, "
            f"expected >= {bw_floor_mbps} Mbps "
            f"(declared: {expected_bw_mbps} Mbps, "
            f"tolerance: {_PERF_TOLERANCE_PERCENT}%)"
        )
        if measured_bw_mbps >= bw_floor_mbps:
            break
finally:
    server_node.tools[Kill].by_name(server_ntttcp.command)
    server_ntttcp.restore_system()
    client_ntttcp.restore_system()
log.info(
    f"VM size: {vm_size} — best ntttcp network bandwidth: "
    f"measured {best_measured_bw_mbps} Mbps, "
    f"expected >= {bw_floor_mbps} Mbps "
    f"(declared: {expected_bw_mbps} Mbps, "
    f"tolerance: {_PERF_TOLERANCE_PERCENT}%)"
)
assert_that(best_measured_bw_mbps).described_as(
    f"VM size {vm_size}: expected network throughput "
    f">= {bw_floor_mbps} Mbps (declared {expected_bw_mbps} Mbps "
    f"with {_PERF_TOLERANCE_PERCENT}% tolerance) but best measured "
    f"value across {max_attempts} attempt(s) was only "
    f"{best_measured_bw_mbps} Mbps"
```
```yaml
# Usage:
#   lisa -r lisa\microsoft\runbook\examples\vm_spec_validation.yml \
#     --variable "subscription_id:<your_sub_id>" \
#     --variable "csv_file:lisa\microsoft\runbook\examples\vm_specs.csv"
#
# You can also override location, marketplace_image, etc. from the CLI.

test_pass: "adhoc"

name: vm_spec_validation

include:
  - path: ../azure.yml

extension:
  - ../../testsuites

variable:
  # Path to the CSV file with VM specifications (override via CLI if needed)
  - name: csv_file
    value: 'lisa\microsoft\runbook\examples\vm_specs.csv'
```
This runbook example uses Windows-style backslashes in paths (both in the CLI example and the default csv_file value). Other runbook examples in this repo use forward slashes (e.g. lisa/microsoft/runbook/examples/git_bisect.yml), which are more portable across OSes/shells. Consider switching to forward slashes in the example paths to make the sample runbook work out-of-the-box on Linux runners.
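One portable way to present the example paths, sketched with only the standard library: `PureWindowsPath` parses backslash-separated paths on any host OS, and `as_posix()` yields the forward-slash form the comment recommends.

```python
from pathlib import PureWindowsPath

# The Windows-style default from the example runbook.
csv_file = r"lisa\microsoft\runbook\examples\vm_specs.csv"

# PureWindowsPath understands the backslashes even on Linux/macOS;
# as_posix() produces the cross-platform forward-slash form.
portable = PureWindowsPath(csv_file).as_posix()
```

Using the forward-slash form directly in the runbook avoids the issue entirely, since forward slashes also work on Windows.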
```python
) -> None:
    expected_memory_gb = _get_int_var(variables, VAR_EXPECTED_MEMORY_GB)
    expected_memory_mb = expected_memory_gb * 1024
    actual_memory_mb = node.tools[Free].get_free_memory_mb()
```
verify_vm_memory is comparing the expected total RAM against Free.get_free_memory_mb(), which returns free/unused memory and will vary with workload/OS caches. This will cause false failures even when the VM has the correct total RAM. Use total memory (e.g., Free.get_total_memory_gb() / total in MiB) for the comparison instead of free memory.
Suggested change:
```python
actual_memory_mb = int(node.tools[Free].get_total_memory_gb() * 1024)
```
get_free_memory_mb gives better granularity than "get_total_memory_gb() * 1024"
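To get total memory with MiB granularity (avoiding both the free-memory comparison and the GB round-trip), one option is parsing `MemTotal` from `/proc/meminfo` directly. A minimal sketch; `total_memory_mb` is a hypothetical helper, not part of LISA's `Free` tool.

```python
def total_memory_mb(meminfo_text):
    # /proc/meminfo reports MemTotal in KiB; integer-dividing by 1024
    # yields MiB directly, keeping full granularity.
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) // 1024
    raise ValueError("MemTotal not found in meminfo output")
```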
```python
    expected_bw * (100 + _PERF_TOLERANCE_PERCENT) / 100
)
log.info(
    f"VM size: {vm_size} - fio seq read across "
    f"{len(data_disks)} disk(s): "
    f"measured {measured_bw} MiB/s, "
    f"expected {bw_floor}-{bw_ceiling} MiB/s "
    f"(declared: {expected_bw} MBps)"
)
```
The throughput check mixes units: the docstring/CSV variable is labeled as "MBps" but the measurement derives from fio IOPS with block_size=1024K, which is effectively MiB/s. Either define the runbook variable as MiB/s, or convert MB/s to MiB/s (1000/1024) before applying the tolerance and reporting results to avoid confusion and inconsistent expectations.
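The MB/s-to-MiB/s conversion the comment asks for is a fixed scale factor. A small sketch under the stated assumption that the CSV value is decimal megabytes per second; the helper name is illustrative.

```python
# 1 decimal MB expressed in binary MiB: 10^6 / 2^20.
MB_PER_MIB = (1000 ** 2) / (1024 ** 2)


def mb_to_mib_per_s(mb_per_s):
    # Convert a CSV value declared in MB/s into MiB/s, so it can be
    # compared on the same scale as fio's 1024K-block throughput.
    return mb_per_s * MB_PER_MIB
```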
```python
iodepth=64,
numjob=4,
block_size="4K",
size_gb=8192,
```
fio.launch(..., size_gb=8192, ...) is potentially misleading: in LISA's Fio tool, size_gb is passed as --size=<value>M (MiB), so 8192 means ~8 GiB per target, not 8192 GiB. Add an inline comment clarifying the intended unit/size (or adjust the value) so future readers don't assume this is a multi-terabyte workload.
Suggested change:
```python
size_gb=8192,  # LISA Fio passes this as MiB, so this is ~8 GiB per target.
```
```python
filename=filename,
mode="randread",
iodepth=64,
numjob=thread_count,
block_size="4K",
size_gb=8192,
time=120,
overwrite=True,
```
The fio workload parameters here (e.g., iodepth=64, numjob=thread_count, time=120) materially change test duration and load but are currently undocumented magic numbers. Add short inline comments (or named module-level constants) for these parameters so it’s clear why these values were chosen and how to tune them safely.
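The named-constant version could look like the following sketch. The values are copied from the snippet above; the rationale comments are assumptions about why such values are typically chosen, not statements from the PR author.

```python
# Hypothetical module-level constants for the fio workload parameters.
FIO_IODEPTH = 64            # outstanding I/Os per job; deep enough to keep a premium SSD queue busy
FIO_NUM_JOBS = 4            # parallel fio jobs per run
FIO_BLOCK_SIZE_IOPS = "4K"  # small blocks for IOPS-oriented random reads
FIO_SIZE_MIB = 8192         # LISA's Fio passes size as MiB, so ~8 GiB per target
FIO_RUNTIME_SECONDS = 120   # long enough to smooth out burst credits and cache warm-up
```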
```python
class VmSpecValidation(TestSuite):
    """Validate VM hardware against runbook-declared specifications."""

    # ------------------------------------------------------------------
    # CPU validation
    # ------------------------------------------------------------------
    @TestCaseMetadata(
        description="""
        Verify that the VM's vCPU count matches the expected value from
```
The PR description’s listed "Key Test Cases" includes names that don’t match the implemented methods in this suite (e.g., it mentions verify_vm_max_premium_data_disks / verify_vm_premium_ssd_storage_throughput / verify_vm_spec_summary, but the code defines verify_vm_max_premium_ssd_disk_count / verify_vm_premium_ssd_throughput and has no verify_vm_spec_summary). Please update the PR description (or align method names/implement the missing case) so readers can reliably run the referenced tests.
```python
# Strip any non-digit characters (e.g. commas, units, spaces) before parsing.
digits_only = re.sub(r"[^\d]", "", str(raw))
if not digits_only:
    raise SkippedException(
        f"Variable '{name}' contains no digits - skipping check."
    )
value = int(digits_only)
if value <= 0:
    raise SkippedException(
        f"Variable '{name}' is {value} (zero or negative) - skipping check."
    )
```
_get_int_var() removes all non-digit characters before parsing. This drops the sign, so values like "-1" become "1" and will not be skipped as intended. Consider parsing with a regex that preserves an optional leading +/- (or explicitly rejecting any value containing '-' before stripping), so negative values reliably trigger the "zero or negative" skip path.
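A sign-preserving parse can be sketched as below; `parse_expected_int` is a hypothetical replacement for the stripping step inside `_get_int_var`, not the suite's actual code.

```python
import re


def parse_expected_int(raw):
    # Match an optional leading sign plus digits, so "-1" parses as -1
    # (not 1), while still tolerating commas and units like "4,096 MB".
    match = re.search(r"[+-]?\d+", str(raw).replace(",", ""))
    return int(match.group()) if match else None
```

With the sign preserved, the existing `value <= 0` check correctly routes negative inputs to the skip path.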
```python
iodepth=64,
numjob=4,
block_size="4K",
size_gb=8192,
```
The fio wrapper's size_gb argument is passed through as --size=<value>M (MiB) in lisa/tools/fio.py, so size_gb=8192 actually means ~8 GiB. This is easy to misread and accidentally change. Please add an inline comment (or define a named constant) clarifying the unit and intent for 8192 here (and consider using an expression like 8 * 1024 for readability).
Suggested change:
```python
# Fio.size_gb is passed through as MiB (--size=<value>M),
# so 8 * 1024 here intentionally means an 8 GiB fio size.
size_gb=8 * 1024,
```
```python
mode="randread",
iodepth=64,
numjob=4,
block_size="4K",
size_gb=8192,
time=120,
overwrite=True,
```
These fio parameters include several hard-coded values (iodepth=64, numjob=4, runtime=120, size_gb=8192) that materially control test behavior and result stability. Per repo guidelines, please either introduce named constants for them or add brief inline comments explaining why these specific values were chosen.
```python
filename=filename,
mode="read",
iodepth=64,
numjob=thread_count,
block_size="1024K",
size_gb=8192,
time=120,
overwrite=True,
)
```
This throughput test also uses hard-coded fio parameters (iodepth=64, numjob=thread_count, block_size=1024K, runtime=120, size_gb=8192). Please add inline comments or pull them into named constants so future maintainers can safely tune them and understand the tradeoffs.
```python
def verify_vm_max_premium_ssd_disk_count(
    self, node: Node, log: Logger, variables: Dict[str, Any]
) -> None:
    expected_max_disks = _get_int_var(variables, VAR_EXPECTED_MAX_DISKS)
    vm_size = variables.get(VAR_VM_SIZE, "unknown")
```
PR description/test list mentions verify_vm_max_premium_data_disks, but this suite defines verify_vm_max_premium_ssd_disk_count. To avoid confusion for runbook authors and reviewers, please align the method name (or update the PR description/runbook docs) so the documented test case names match the actual ones.
```python
best_iops = 0
for i in range(_PERF_ITERATIONS):
    result = fio.launch(
        name=f"iops_all_disks_{i}",
        filename=filename,
        mode="randread",
        iodepth=64,
        numjob=thread_count,
```
numjob=thread_count with iodepth=64 can create an extremely high total queue depth on large core-count VMs and trigger libaio limits (EAGAIN) or severe perf drop. Consider capping numjob (the repo’s perf helpers cap to 256 for libaio) and/or bounding total qdepth to reduce flakiness.
Suggested change:
```python
fio_numjob = min(
    thread_count, 256
)  # Cap libaio jobs to avoid EAGAIN/perf collapse on large-core VMs.
max_total_queue_depth = (
    256 * 64
)  # Preserve the prior effective upper bound of numjob * iodepth.
fio_iodepth = max(
    1, max_total_queue_depth // fio_numjob
)  # Ensure total queue depth stays bounded while keeping iodepth >= 1.
log.info(
    f"VM size: {vm_size} - fio settings for aggregate IOPS: "
    f"thread_count={thread_count}, numjob={fio_numjob}, "
    f"iodepth={fio_iodepth}, total_queue_depth={fio_numjob * fio_iodepth}"
)
best_iops = 0
for i in range(_PERF_ITERATIONS):
    result = fio.launch(
        name=f"iops_all_disks_{i}",
        filename=filename,
        mode="randread",
        iodepth=fio_iodepth,
        numjob=fio_numjob,
```
```python
fio = node.tools[Fio]
cpu = node.tools[Lscpu]
thread_count = cpu.get_thread_count()
best_bw_mbps = 0
for i in range(_PERF_ITERATIONS):
    result = fio.launch(
        name=f"storage_throughput_all_disks_{i}",
        filename=filename,
        mode="read",
        iodepth=64,
        numjob=thread_count,
        block_size="1024K",
        size_gb=8192,
        time=120,
        overwrite=True,
    )
```
Same fio concurrency concern here: numjob=thread_count (with iodepth=64) can exceed async I/O limits on large VMs. Consider capping numjob similarly to other perf code in the repo to avoid intermittent failures.
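The capping logic can be isolated as a small pure function. A sketch: the 256-job cap follows the repo convention the earlier comment cites, and the 64 iodepth ceiling mirrors the value used throughout this suite; the helper name is hypothetical.

```python
def cap_fio_concurrency(thread_count, max_jobs=256, max_iodepth=64):
    # Cap numjob (256 for libaio, per the repo's perf helpers) and bound
    # iodepth so numjob * iodepth cannot exceed the prior effective
    # maximum, avoiding async I/O limits on large-core VMs.
    numjob = min(max(1, thread_count), max_jobs)
    iodepth = max(1, min(max_iodepth, (max_jobs * max_iodepth) // numjob))
    return numjob, iodepth
```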
```python
bw_ceiling = int(
    expected_bw * (100 + _PERF_TOLERANCE_PERCENT) / 100
)
log.info(
    f"VM size: {vm_size} - fio seq read across "
    f"{len(data_disks)} disk(s): "
    f"measured {measured_bw} MBps, "
    f"expected {bw_floor}-{bw_ceiling} MBps "
    f"(declared: {expected_bw} MBps)"
)
assert_that(measured_bw).described_as(
    f"VM size {vm_size}: expected storage throughput "
    f"between {bw_floor} and {bw_ceiling} MBps "
    f"(declared {expected_bw} MBps with "
    f"{_PERF_TOLERANCE_PERCENT}% tolerance) across "
    f"{len(data_disks)} disk(s) but measured "
    f"{measured_bw} MBps"
).is_between(bw_floor, bw_ceiling)
```
The test description says it validates throughput is “at least” the expected value, but the assertion enforces an upper bound (is_between(bw_floor, bw_ceiling)). If the measured throughput exceeds the declared value by > tolerance, this will fail even though it meets the “at least” requirement. Either update the docstring to match a bounded-range check, or change the assertion to enforce only a minimum.
Suggested change:
```python
log.info(
    f"VM size: {vm_size} - fio seq read across "
    f"{len(data_disks)} disk(s): "
    f"measured {measured_bw} MBps, "
    f"expected at least {bw_floor} MBps "
    f"(declared: {expected_bw} MBps)"
)
assert_that(measured_bw).described_as(
    f"VM size {vm_size}: expected storage throughput "
    f"at least {bw_floor} MBps "
    f"(declared {expected_bw} MBps with "
    f"{_PERF_TOLERANCE_PERCENT}% tolerance) across "
    f"{len(data_disks)} disk(s) but measured "
    f"{measured_bw} MBps"
).is_greater_than_or_equal_to(bw_floor)
```
```python
# Allow tolerance around the declared max
bw_floor_mbps = int(
    expected_bw_mbps
    * (100 - _PERF_TOLERANCE_PERCENT)
    / 100
)
bw_ceiling_mbps = int(
    expected_bw_mbps
    * (100 + _PERF_TOLERANCE_PERCENT)
    / 100
)
log.info(
    f"VM size: {vm_size} — ntttcp network bandwidth: "
    f"measured {measured_bw_mbps} Mbps, "
    f"expected {bw_floor_mbps}-{bw_ceiling_mbps} Mbps "
    f"(declared: {expected_bw_mbps} Mbps, "
    f"tolerance: {_PERF_TOLERANCE_PERCENT}%)"
)
assert_that(measured_bw_mbps).described_as(
    f"VM size {vm_size}: expected network throughput "
    f"between {bw_floor_mbps} and {bw_ceiling_mbps} Mbps "
    f"(declared {expected_bw_mbps} Mbps with "
    f"{_PERF_TOLERANCE_PERCENT}% tolerance) but measured "
    f"{measured_bw_mbps} Mbps"
).is_between(bw_floor_mbps, bw_ceiling_mbps)
```
The network bandwidth docstring says “at least” expected throughput, but the assertion enforces an upper bound (is_between). This can fail if the VM measures higher than the declared spec. Either document that this is a “within tolerance” validation, or assert only a lower bound.
Suggested change:
```python
# Allow tolerance below the declared throughput to account for
# measurement variance while still enforcing a minimum expectation.
bw_floor_mbps = int(
    expected_bw_mbps
    * (100 - _PERF_TOLERANCE_PERCENT)
    / 100
)
log.info(
    f"VM size: {vm_size} — ntttcp network bandwidth: "
    f"measured {measured_bw_mbps} Mbps, "
    f"expected at least {bw_floor_mbps} Mbps "
    f"(declared: {expected_bw_mbps} Mbps, "
    f"tolerance: {_PERF_TOLERANCE_PERCENT}%)"
)
assert_that(measured_bw_mbps).described_as(
    f"VM size {vm_size}: expected network throughput "
    f"at least {bw_floor_mbps} Mbps "
    f"(declared {expected_bw_mbps} Mbps with "
    f"{_PERF_TOLERANCE_PERCENT}% tolerance) but measured "
    f"{measured_bw_mbps} Mbps"
).is_greater_than_or_equal_to(bw_floor_mbps)
```
```python
result = fio.launch(
    name=f"nvme_iops_all_disks_{i}",
    filename=filename,
    mode="randread",
    iodepth=64,
    numjob=4,
    block_size="4K",
    size_gb=8192,
    time=120,
    overwrite=True,
```
Fio.launch(size_gb=...) is passed to fio as --size=<value>M (MiB). Using size_gb=8192 is likely intended as 8192 MiB (~8 GiB), but the code reads like 8192 GB. Consider introducing a clearly named local constant (e.g., size_mb = 8192) and passing that, or add an inline note about units.
```python
def verify_vm_premium_ssd_iops(
    self, node: Node, log: Logger, variables: Dict[str, Any]
) -> None:
```
PR description’s “Key Test Cases” list doesn’t match the implemented method names (e.g., it lists verify_vm_max_premium_data_disks/verify_vm_premium_disk_iops/verify_vm_premium_ssd_storage_throughput, but this suite defines verify_vm_max_premium_ssd_disk_count/verify_vm_premium_ssd_iops/verify_vm_premium_ssd_throughput). Please align the PR description (and any run instructions) with the actual test method names to avoid confusion when selecting cases.
```yaml
# Usage:
#   lisa -r lisa\microsoft\runbook\examples\vm_spec_validation.yml \
#     --variable "subscription_id:<your_sub_id>" \
#     --variable "csv_file:lisa\microsoft\runbook\examples\vm_specs.csv"
#
# You can also override location, marketplace_image, etc. from the CLI.

test_pass: "adhoc"

name: vm_spec_validation

include:
  - path: ../azure.yml

extension:
  - ../../testsuites

variable:
  # Path to the CSV file with VM specifications (override via CLI if needed)
  - name: csv_file
    value: 'lisa\microsoft\runbook\examples\vm_specs.csv'
```
This example runbook uses Windows-style backslashes in paths (e.g., lisa\microsoft\...). On Linux/macOS runners this is usually treated as a literal backslash and can cause “file not found”. Consider switching example paths to forward slashes for cross-platform usability.
Description
This module validates that Azure VMs match their declared hardware specifications.
A sample runbook and CSV file are included in the examples.
The CSV file defines the expected properties for each VM size (CPU count, memory, NIC count, max disks, local NVMe disks, IOPS, etc.), and the LISA CSV combinator in the runbook iterates over each row, provisioning a VM and running the validation tests against it.
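The row-to-variables flow can be illustrated with the standard library's `csv` module. This is illustrative only: the exact column headers of `vm_specs.csv` are assumptions based on the `expected_*` variables described in this PR.

```python
import csv
import io

# A stand-in for the sample CSV; headers are assumed, not taken from the PR.
sample = io.StringIO(
    "vm_size,expected_vcpu_count,expected_memory_gb,expected_nic_count\n"
    "Standard_D4s_v5,4,16,2\n"
)

# Each DictReader row maps a header to a cell value; the CSV combinator
# similarly turns each row into one set of runbook variables per iteration.
rows = list(csv.DictReader(sample))
```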
Related Issue
None
Type of Change
Checklist
Test Validation
Key Test Cases:
Impacted LISA Features:
None
Tested Azure Marketplace Images:
Test Results