Skip to content

Conversation

@Yuvraj-cyborg
Copy link
Contributor

@Yuvraj-cyborg Yuvraj-cyborg commented Dec 25, 2025

Which issue does this PR close?

Rationale for this change

The structopt crate is deprecated and has been superseded by clap with its derive feature.
This migration:

  • Removes a deprecated dependency
  • Potentially improves compilation times for the benchmarks crate
  • Keeps the codebase up-to-date with modern Rust ecosystem practices

What changes are included in this PR?

  • Updated Cargo.toml to use clap = { version = "4.5.53", features = ["derive"] } instead of structopt

  • Migrated all benchmark source files to use clap's derive macros:

  • #[derive(StructOpt)]#[derive(Parser/Args/Subcommand)]

  • #[structopt(...)]#[arg(...)] for fields

  • #[structopt(...)]#[command(...)] for struct/enum-level attributes

  • StructOpt::from_args()Parser::parse()

  • Removed parse(from_os_str) as PathBuf works natively in clap

  • Changed short flag strings to chars ('p' instead of "p")

Are these changes tested?

Yes

Are there any user-facing changes?

No. The CLI interface remains identical - this is a drop-in replacement of the underlying argument parsing library.

@Yuvraj-cyborg
Copy link
Contributor Author

cc: @alamb @Jefffrey @martin-g @timsaucer

@Yuvraj-cyborg
Copy link
Contributor Author

@Jefffrey could u review this one !?

@Jefffrey
Copy link
Contributor

I'll try take a look when I have time

#[structopt(name = "IMDB", about = "IMDB Dataset Processing.")]
#[derive(Debug, Parser)]
#[command(name = "IMDB", about = "IMDB Dataset Processing.")]
enum ImdbOpt {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unconventional - #[derive(Parser)] + enum.

Let's keep it consistent with the other binaries, e.g.:

#[derive(Debug, Parser)]
struct Cli {
    #[command(subcommand)]
    command: ImdbOpt,
}

#[derive(Debug, Subcommand)]
enum ImdbOpt {
    Benchmark(BenchmarkSubCommandOpt),
    Convert(imdb::ConvertOpt),
}
...

#[structopt(name = "Memory Profiling Utility")]
#[derive(Debug, Parser)]
#[command(name = "Memory Profiling Utility")]
struct MemProfileOpt {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Shall it rename it to Cli to make it consistent with the other binaries ?

@Yuvraj-cyborg Yuvraj-cyborg force-pushed the replace-structopt-with-clap branch from 721fd14 to e6d0f01 Compare December 28, 2025 19:23
@Yuvraj-cyborg Yuvraj-cyborg force-pushed the replace-structopt-with-clap branch from e6d0f01 to 70fc238 Compare December 28, 2025 19:24
@martin-g
Copy link
Member

Please don't force-push! It makes it harder to make incremental reviews.
The merge will squash the commits anyway.

@Yuvraj-cyborg
Copy link
Contributor Author

Please don't force-push! It makes it harder to make incremental reviews. The merge will squash the commits anyway.

Ahh sorry bad habit of rebasing and force pushing, would take care of it from next time, apologies !

@alamb
Copy link
Contributor

alamb commented Dec 30, 2025

run benchmarks

@alamb-ghbot
Copy link

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing replace-structopt-with-clap (70fc238) to bb4e0ec diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb
Copy link
Contributor

alamb commented Dec 30, 2025

Thanks @Yuvraj-cyborg and @martin-g
! I am just running the benchmarks to make sure this works well and if so I think it is ready to go

"libloading 0.8.9",
]

[[package]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is really nice to see ❤️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates the datafusion-benchmarks crate from the deprecated structopt library to clap v4 with derive features, removing an unmaintained dependency and modernizing the CLI parsing approach.

Key Changes

  • Replaced structopt dependency with clap 4.5.53 featuring derive macros
  • Migrated all CLI argument parsing code across 16+ source files to use clap's derive API
  • Updated all attribute syntax from structopt conventions to clap v4 patterns

Reviewed changes

Copilot reviewed 13 out of 18 changed files in this pull request and generated no comments.

Show a summary per file
File Description
benchmarks/Cargo.toml Adds clap 4.5.53 with derive features, removes structopt dependency
Cargo.lock Removes structopt and its transitive dependencies (clap 2.x, heck 0.3.3, proc-macro-error, etc.), consolidates on clap 4.5.53
benchmarks/src/util/options.rs Migrates CommonOpt from StructOpt to Args, updates custom parser to use value_parser
benchmarks/src/tpch/run.rs Converts RunOpt to use Args derive with updated attribute syntax
benchmarks/src/tpcds/run.rs Converts RunOpt to use Args derive with updated attribute syntax
benchmarks/src/sort_tpch.rs Converts RunOpt to use Args derive with updated attribute syntax
benchmarks/src/smj.rs Converts RunOpt to use Args derive with updated attribute syntax
benchmarks/src/nlj.rs Converts RunOpt to use Args derive with updated attribute syntax
benchmarks/src/imdb/run.rs Converts RunOpt to use Args derive with updated attribute syntax
benchmarks/src/imdb/convert.rs Converts ConvertOpt to use Args derive with updated attribute syntax
benchmarks/src/hj.rs Converts RunOpt to use Args derive with updated attribute syntax
benchmarks/src/h2o.rs Converts RunOpt to use Args derive with updated attribute syntax
benchmarks/src/clickbench.rs Converts RunOpt to use Args derive with updated attribute syntax
benchmarks/src/cancellation.rs Converts RunOpt to use Args derive with updated attribute syntax
benchmarks/src/bin/mem_profile.rs Restructures CLI with Parser and Subcommand derives, uses Cli::parse()
benchmarks/src/bin/imdb.rs Restructures CLI with Parser and Subcommand derives for nested subcommands, uses Cli::parse()
benchmarks/src/bin/external_aggr.rs Restructures CLI with Parser, Subcommand, and Args derives, uses Cli::parse()
benchmarks/src/bin/dfbench.rs Restructures CLI with Parser and Subcommand derives, uses Cli::parse()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@alamb
Copy link
Contributor

alamb commented Dec 30, 2025

I ran the tests locally and they worked well for me too

@alamb-ghbot
Copy link

🤖: Benchmark completed

Details

Comparing HEAD and replace-structopt-with-clap
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ replace-structopt-with-clap ┃    Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0     │  2583.35 ms │                  2543.75 ms │ no change │
│ QQuery 1     │  1072.99 ms │                  1058.56 ms │ no change │
│ QQuery 2     │  2055.07 ms │                  2021.40 ms │ no change │
│ QQuery 3     │  1130.77 ms │                  1125.91 ms │ no change │
│ QQuery 4     │  2259.95 ms │                  2267.83 ms │ no change │
│ QQuery 5     │ 28316.49 ms │                 27962.12 ms │ no change │
│ QQuery 6     │  3982.06 ms │                  3907.68 ms │ no change │
│ QQuery 7     │  3666.47 ms │                  3617.15 ms │ no change │
└──────────────┴─────────────┴─────────────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 45067.16ms │
│ Total Time (replace-structopt-with-clap)   │ 44504.39ms │
│ Average Time (HEAD)                        │  5633.39ms │
│ Average Time (replace-structopt-with-clap) │  5563.05ms │
│ Queries Faster                             │          0 │
│ Queries Slower                             │          0 │
│ Queries with No Change                     │          8 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ replace-structopt-with-clap ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.00 ms │                     2.03 ms │     no change │
│ QQuery 1     │    51.74 ms │                    51.12 ms │     no change │
│ QQuery 2     │   135.91 ms │                   139.82 ms │     no change │
│ QQuery 3     │   153.33 ms │                   156.13 ms │     no change │
│ QQuery 4     │  1155.09 ms │                  1090.49 ms │ +1.06x faster │
│ QQuery 5     │  1532.12 ms │                  1469.14 ms │     no change │
│ QQuery 6     │     2.25 ms │                     2.08 ms │ +1.08x faster │
│ QQuery 7     │    55.00 ms │                    56.73 ms │     no change │
│ QQuery 8     │  1483.21 ms │                  1423.07 ms │     no change │
│ QQuery 9     │  1920.84 ms │                  1847.45 ms │     no change │
│ QQuery 10    │   343.46 ms │                   358.11 ms │     no change │
│ QQuery 11    │   402.05 ms │                   408.02 ms │     no change │
│ QQuery 12    │  1412.87 ms │                  1343.48 ms │     no change │
│ QQuery 13    │  2055.19 ms │                  2028.70 ms │     no change │
│ QQuery 14    │  1281.13 ms │                  1238.85 ms │     no change │
│ QQuery 15    │  1286.91 ms │                  1245.39 ms │     no change │
│ QQuery 16    │  2546.26 ms │                  2549.46 ms │     no change │
│ QQuery 17    │  2514.32 ms │                  2523.73 ms │     no change │
│ QQuery 18    │  5676.20 ms │                  4864.78 ms │ +1.17x faster │
│ QQuery 19    │   123.09 ms │                   124.89 ms │     no change │
│ QQuery 20    │  1987.85 ms │                  1917.45 ms │     no change │
│ QQuery 21    │  2266.46 ms │                  2206.64 ms │     no change │
│ QQuery 22    │  3965.83 ms │                  3860.30 ms │     no change │
│ QQuery 23    │ 20189.20 ms │                 12276.94 ms │ +1.64x faster │
│ QQuery 24    │   229.18 ms │                   208.93 ms │ +1.10x faster │
│ QQuery 25    │   492.84 ms │                   471.86 ms │     no change │
│ QQuery 26    │   230.49 ms │                   213.80 ms │ +1.08x faster │
│ QQuery 27    │  2966.38 ms │                  2661.44 ms │ +1.11x faster │
│ QQuery 28    │ 22464.10 ms │                 23522.04 ms │     no change │
│ QQuery 29    │   982.43 ms │                   971.10 ms │     no change │
│ QQuery 30    │  1381.72 ms │                  1331.06 ms │     no change │
│ QQuery 31    │  1421.97 ms │                  1345.46 ms │ +1.06x faster │
│ QQuery 32    │  5226.83 ms │                  4623.46 ms │ +1.13x faster │
│ QQuery 33    │  5923.67 ms │                  5687.47 ms │     no change │
│ QQuery 34    │  6230.41 ms │                  5871.76 ms │ +1.06x faster │
│ QQuery 35    │  1979.21 ms │                  1880.03 ms │ +1.05x faster │
│ QQuery 36    │    66.23 ms │                    65.20 ms │     no change │
│ QQuery 37    │    47.44 ms │                    45.35 ms │     no change │
│ QQuery 38    │    66.32 ms │                    68.53 ms │     no change │
│ QQuery 39    │   103.70 ms │                   102.63 ms │     no change │
│ QQuery 40    │    27.43 ms │                    26.43 ms │     no change │
│ QQuery 41    │    23.40 ms │                    22.38 ms │     no change │
│ QQuery 42    │    19.08 ms │                    19.39 ms │     no change │
└──────────────┴─────────────┴─────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 102425.16ms │
│ Total Time (replace-structopt-with-clap)   │  92323.13ms │
│ Average Time (HEAD)                        │   2381.98ms │
│ Average Time (replace-structopt-with-clap) │   2147.05ms │
│ Queries Faster                             │          11 │
│ Queries Slower                             │           0 │
│ Queries with No Change                     │          32 │
│ Queries with Failure                       │           0 │
└────────────────────────────────────────────┴─────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ replace-structopt-with-clap ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 122.00 ms │                   114.23 ms │ +1.07x faster │
│ QQuery 2     │  26.79 ms │                    28.50 ms │  1.06x slower │
│ QQuery 3     │  37.81 ms │                    39.61 ms │     no change │
│ QQuery 4     │  29.06 ms │                    28.96 ms │     no change │
│ QQuery 5     │  88.21 ms │                    88.80 ms │     no change │
│ QQuery 6     │  19.94 ms │                    19.87 ms │     no change │
│ QQuery 7     │ 220.87 ms │                   238.92 ms │  1.08x slower │
│ QQuery 8     │  41.22 ms │                    38.89 ms │ +1.06x faster │
│ QQuery 9     │ 112.81 ms │                   102.50 ms │ +1.10x faster │
│ QQuery 10    │  64.23 ms │                    63.17 ms │     no change │
│ QQuery 11    │  18.07 ms │                    18.09 ms │     no change │
│ QQuery 12    │  52.15 ms │                    50.83 ms │     no change │
│ QQuery 13    │  47.86 ms │                    46.77 ms │     no change │
│ QQuery 14    │  13.71 ms │                    13.53 ms │     no change │
│ QQuery 15    │  24.45 ms │                    23.73 ms │     no change │
│ QQuery 16    │  24.00 ms │                    23.79 ms │     no change │
│ QQuery 17    │ 156.16 ms │                   154.51 ms │     no change │
│ QQuery 18    │ 286.95 ms │                   282.37 ms │     no change │
│ QQuery 19    │  39.35 ms │                    37.17 ms │ +1.06x faster │
│ QQuery 20    │  48.65 ms │                    49.13 ms │     no change │
│ QQuery 21    │ 297.40 ms │                   314.23 ms │  1.06x slower │
│ QQuery 22    │  17.21 ms │                    17.19 ms │     no change │
└──────────────┴───────────┴─────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 1788.90ms │
│ Total Time (replace-structopt-with-clap)   │ 1794.80ms │
│ Average Time (HEAD)                        │   81.31ms │
│ Average Time (replace-structopt-with-clap) │   81.58ms │
│ Queries Faster                             │         4 │
│ Queries Slower                             │         3 │
│ Queries with No Change                     │        15 │
│ Queries with Failure                       │         0 │
└────────────────────────────────────────────┴───────────┘

@alamb
Copy link
Contributor

alamb commented Dec 30, 2025

I'll plan to merge this tomorrow unless anyone else would like time to review

@alamb alamb added this pull request to the merge queue Jan 1, 2026
@alamb
Copy link
Contributor

alamb commented Jan 1, 2026

Thanks again @Yuvraj-cyborg and @martin-g

Merged via the queue into apache:main with commit cd12d51 Jan 1, 2026
38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Replace deprecated structopt with clap in benchmarks crate

5 participants