[EVAL] AI-generated Gson 1.6 instrumentation (blind test) #10940

Draft
jordan-wong wants to merge 1 commit into master from apm-ai-toolkit/java_integration/gson/20260323-115140

@jordan-wong
Contributor

Summary

AI-generated instrumentation for Gson 1.6 using the apm-instrumentation-toolkit. This is a blind test evaluation - the original implementation was deleted before generation to ensure zero contamination.

🎯 Evaluation Context

  • Purpose: Evaluate AI code generation quality without reference to existing implementation
  • Method: Shallow clone + dynamic config override (complete isolation)
  • Contamination: ✅ ZERO - verified via agent log analysis

📊 Generation Metrics

| Metric | Value |
| --- | --- |
| Runtime | 425.3s (7.1 minutes) |
| Agent turns | 96 |
| Cost | $3.29 |

✅ Layer 1 Validation (Automated)

All checks passed:

  • ✅ compileJava
  • ✅ spotlessCheck
  • ✅ codenarcTest
  • ✅ muzzle
  • ✅ test
  • ✅ latestDepTest

💡 Key Innovations

  1. NEW: GsonHelper abstraction - Clean pattern for CallDepthThreadLocalMap
  2. Broader method matchers - Catches all toJson/fromJson overloads
  3. Consistent naming - methodEnter/methodExit throughout
  4. Cleaner structure - Better code organization
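
For readers unfamiliar with the call-depth pattern behind items 1 and 2, here is a minimal, self-contained sketch. It is not the generated GsonHelper and does not use the real CallDepthThreadLocalMap: `CallDepthSketch`, `JsonCallsSketch`, and the `spansStarted` counter are hypothetical stand-ins, and the assumption that the counter returns the depth before incrementing belongs to this sketch only. The point it illustrates is that once every toJson/fromJson overload is matched, overloads that delegate to each other need a guard so that only the outermost call starts a span.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a per-thread call-depth counter keyed by class. */
final class CallDepthSketch {
  private static final ThreadLocal<Map<Class<?>, Integer>> DEPTHS =
      ThreadLocal.withInitial(HashMap::new);

  /** Returns the depth before incrementing, so 0 means "outermost call". */
  static int increment(Class<?> key) {
    Map<Class<?>, Integer> depths = DEPTHS.get();
    int previous = depths.getOrDefault(key, 0);
    depths.put(key, previous + 1);
    return previous;
  }

  /** Clears the counter for this key on the current thread. */
  static void reset(Class<?> key) {
    DEPTHS.get().remove(key);
  }
}

/** Hypothetical serializer whose overloads delegate to each other. */
final class JsonCallsSketch {
  // Counts how many "spans" would have been started (not thread-safe; demo only).
  static int spansStarted = 0;

  static boolean methodEnter() {
    boolean outermost = CallDepthSketch.increment(JsonCallsSketch.class) == 0;
    if (outermost) {
      spansStarted++;
    }
    return outermost;
  }

  static void methodExit(boolean outermost) {
    if (outermost) {
      CallDepthSketch.reset(JsonCallsSketch.class);
    }
  }

  /** Outer overload: delegates to the two-argument overload. */
  static String toJson(Object value) {
    boolean outermost = methodEnter();
    try {
      return toJson(value, value.getClass());
    } finally {
      methodExit(outermost);
    }
  }

  /** Inner overload: also guarded, because it can be called directly. */
  static String toJson(Object value, Class<?> type) {
    boolean outermost = methodEnter();
    try {
      return "\"" + value + "\"";
    } finally {
      methodExit(outermost);
    }
  }
}
```

Calling the one-argument overload enters the guard twice but counts a single span; calling the two-argument overload directly also counts one.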

📉 Known Regressions vs Original

  1. ⚠️ Missing span metadata - No source/target type tags (HIGH severity)
  2. ⚠️ No ClassLoader matcher - Missing version safety check (MEDIUM severity)
  3. ⚠️ Simplified tests - 40% fewer test cases (LOW severity)
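
To make regression 2 concrete, the following is a rough sketch of what a version safety check usually amounts to: the instrumentation applies only when a marker class from the supported library version is visible to the application's class loader. `VersionGuardSketch` and both method names are hypothetical and are not the dd-trace-java matcher API. The marker class name is left as a parameter because this PR does not say which class distinguishes Gson 1.6.

```java
/** Hypothetical class-presence probe; a stand-in, not the real matcher API. */
final class VersionGuardSketch {

  /** True when the named class is visible to the loader; the class is not initialized. */
  static boolean hasClassNamed(ClassLoader loader, String className) {
    String resource = className.replace('.', '/') + ".class";
    return loader.getResource(resource) != null;
  }

  /** Applies the instrumentation only if the version marker class is present. */
  static boolean shouldInstrument(ClassLoader loader, String markerClassName) {
    return hasClassNamed(loader, markerClassName);
  }
}
```

Probing for the class file as a resource avoids loading the class, which matters for an agent that must not trigger application class initialization early.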

📚 Comprehensive Analysis

See eval-comparison/ directory in apm-instrumentation-toolkit for detailed evaluation.

🎓 Evaluation Outcome

Overall Score: Generated: 7.8/10 | Original: 7.5/10

Recommendation: Adopt with modifications - restore span metadata and add ClassLoader matcher.


🤖 Generated with apm-instrumentation-toolkit | Run #4 (Blind Test)

…ate)

Generated by apm-instrumentation-toolkit using java_integration workflow.
This is a BLIND TEST run - gson was deleted from repo before generation.
Agent had ZERO access to original implementation (shallow clone + config override).

**Generation Metrics:**
- Runtime: 425.3s (7.1 minutes)
- Agent turns: 96
- Cost: $3.29

**Layer 1 Validation:** ✅ ALL PASS
- compileJava: ✅ PASS
- spotlessCheck: ✅ PASS
- codenarcTest: ✅ PASS
- muzzle: ✅ PASS
- test: ✅ PASS
- latestDepTest: ✅ PASS

**Key Innovations:**
- NEW: GsonHelper abstraction class for CallDepthThreadLocalMap
- Broader method matchers (catches all toJson/fromJson overloads)
- Cleaner code structure with consistent naming

**Contamination Check:** ✅ ZERO
- Verified agent logs show no git show commands
- All file paths show /tmp/dd-trace-java-gson-clean/
- Agent used jackson-core and hystrix as references (both exist in clean clone)

**Evaluation:** See eval-comparison/ directory for comprehensive analysis

🤖 Generated with apm-instrumentation-toolkit

pr-commenter bot commented Mar 23, 2026

Benchmarks

Startup

Parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| git_branch | master | apm-ai-toolkit/java_integration/gson/20260323-115140 |
| git_commit_date | 1774050014 | 1774284786 |
| git_commit_sha | c00f676 | 668e513 |
| release_version | 1.61.0-SNAPSHOT~c00f676bb9 | 1.61.0-SNAPSHOT~668e51355f |

Matching parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| application | insecure-bank | insecure-bank |
| ci_job_date | 1774286550 | 1774286550 |
| ci_job_id | 1531315758 | 1531315758 |
| ci_pipeline_id | 104030890 | 104030890 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-4gno13ks 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-4gno13ks 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| module | Agent | Agent |
| parent | None | None |

Summary

Found 1 performance improvement and 0 performance regressions. Performance is the same for 60 metrics; 10 metrics are unstable.

| scenario | Δ mean execution_time | candidate mean execution_time | baseline mean execution_time |
| --- | --- | --- | --- |
| scenario:startup:petclinic:iast:Remote Config | better [-32.301µs; -13.265µs] or [-6.003%; -2.465%] | 515.251µs | 538.034µs |

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.61.0-SNAPSHOT~668e51355f, baseline=1.61.0-SNAPSHOT~c00f676bb9

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.053 s) : 0, 1053300
Total [baseline] (10.928 s) : 0, 10927723
Agent [candidate] (1.058 s) : 0, 1057761
Total [candidate] (11.007 s) : 0, 11007414
section appsec
Agent [baseline] (1.246 s) : 0, 1245997
Total [baseline] (11.12 s) : 0, 11119572
Agent [candidate] (1.256 s) : 0, 1256060
Total [candidate] (11.259 s) : 0, 11258982
section iast
Agent [baseline] (1.23 s) : 0, 1229703
Total [baseline] (11.262 s) : 0, 11262227
Agent [candidate] (1.234 s) : 0, 1233566
Total [candidate] (11.376 s) : 0, 11376452
section profiling
Agent [baseline] (1.183 s) : 0, 1182876
Total [baseline] (10.963 s) : 0, 10962994
Agent [candidate] (1.199 s) : 0, 1199409
Total [candidate] (11.055 s) : 0, 11054585
  • baseline results
| Module | Variant | Duration | Δ tracing |
| --- | --- | --- | --- |
| Agent | tracing | 1.053 s | - |
| Agent | appsec | 1.246 s | 192.697 ms (18.3%) |
| Agent | iast | 1.23 s | 176.403 ms (16.7%) |
| Agent | profiling | 1.183 s | 129.576 ms (12.3%) |
| Total | tracing | 10.928 s | - |
| Total | appsec | 11.12 s | 191.849 ms (1.8%) |
| Total | iast | 11.262 s | 334.504 ms (3.1%) |
| Total | profiling | 10.963 s | 35.271 ms (0.3%) |
  • candidate results
| Module | Variant | Duration | Δ tracing |
| --- | --- | --- | --- |
| Agent | tracing | 1.058 s | - |
| Agent | appsec | 1.256 s | 198.299 ms (18.7%) |
| Agent | iast | 1.234 s | 175.805 ms (16.6%) |
| Agent | profiling | 1.199 s | 141.647 ms (13.4%) |
| Total | tracing | 11.007 s | - |
| Total | appsec | 11.259 s | 251.568 ms (2.3%) |
| Total | iast | 11.376 s | 369.038 ms (3.4%) |
| Total | profiling | 11.055 s | 47.171 ms (0.4%) |
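
The "Δ tracing" column appears to be each variant's duration minus the tracing duration for the same module, with the percentage taken relative to tracing. The short sketch below reproduces that arithmetic from the microsecond values in the petclinic gantt source (baseline Agent: tracing 1053300, appsec 1245997). `OverheadSketch` and its method names are hypothetical; the formula is inferred from the reported numbers, not taken from the benchmark tooling.

```java
/** Hypothetical helper reproducing the "Δ tracing" arithmetic in the startup tables. */
final class OverheadSketch {

  /** Absolute overhead of a variant over tracing, converted from microseconds to milliseconds. */
  static double deltaMs(double variantMicros, double tracingMicros) {
    return (variantMicros - tracingMicros) / 1000.0;
  }

  /** Overhead of a variant relative to the tracing duration, in percent. */
  static double deltaPercent(double variantMicros, double tracingMicros) {
    return 100.0 * (variantMicros - tracingMicros) / tracingMicros;
  }
}
```

With those two inputs, `deltaMs` gives 192.697 ms and `deltaPercent` rounds to 18.3%, matching the baseline "Agent appsec" row.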
gantt
    title petclinic - break down per module: candidate=1.61.0-SNAPSHOT~668e51355f, baseline=1.61.0-SNAPSHOT~c00f676bb9

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.21 ms) : 0, 1210
crashtracking [candidate] (1.219 ms) : 0, 1219
BytebuddyAgent [baseline] (626.936 ms) : 0, 626936
BytebuddyAgent [candidate] (629.134 ms) : 0, 629134
AgentMeter [baseline] (29.243 ms) : 0, 29243
AgentMeter [candidate] (29.358 ms) : 0, 29358
GlobalTracer [baseline] (255.94 ms) : 0, 255940
GlobalTracer [candidate] (257.109 ms) : 0, 257109
AppSec [baseline] (31.598 ms) : 0, 31598
AppSec [candidate] (31.768 ms) : 0, 31768
Debugger [baseline] (60.43 ms) : 0, 60430
Debugger [candidate] (60.33 ms) : 0, 60330
Remote Config [baseline] (590.817 µs) : 0, 591
Remote Config [candidate] (590.862 µs) : 0, 591
Telemetry [baseline] (7.989 ms) : 0, 7989
Telemetry [candidate] (8.068 ms) : 0, 8068
Flare Poller [baseline] (3.56 ms) : 0, 3560
Flare Poller [candidate] (4.307 ms) : 0, 4307
section appsec
crashtracking [baseline] (1.205 ms) : 0, 1205
crashtracking [candidate] (1.218 ms) : 0, 1218
BytebuddyAgent [baseline] (658.283 ms) : 0, 658283
BytebuddyAgent [candidate] (661.683 ms) : 0, 661683
AgentMeter [baseline] (12.105 ms) : 0, 12105
AgentMeter [candidate] (12.304 ms) : 0, 12304
GlobalTracer [baseline] (257.853 ms) : 0, 257853
GlobalTracer [candidate] (260.959 ms) : 0, 260959
IAST [baseline] (24.142 ms) : 0, 24142
IAST [candidate] (24.657 ms) : 0, 24657
AppSec [baseline] (177.599 ms) : 0, 177599
AppSec [candidate] (179.484 ms) : 0, 179484
Debugger [baseline] (65.93 ms) : 0, 65930
Debugger [candidate] (66.779 ms) : 0, 66779
Remote Config [baseline] (631.667 µs) : 0, 632
Remote Config [candidate] (624.83 µs) : 0, 625
Telemetry [baseline] (8.365 ms) : 0, 8365
Telemetry [candidate] (8.416 ms) : 0, 8416
Flare Poller [baseline] (3.623 ms) : 0, 3623
Flare Poller [candidate] (3.657 ms) : 0, 3657
section iast
crashtracking [baseline] (1.205 ms) : 0, 1205
crashtracking [candidate] (1.214 ms) : 0, 1214
BytebuddyAgent [baseline] (796.847 ms) : 0, 796847
BytebuddyAgent [candidate] (800.204 ms) : 0, 800204
AgentMeter [baseline] (11.422 ms) : 0, 11422
AgentMeter [candidate] (11.606 ms) : 0, 11606
GlobalTracer [baseline] (247.635 ms) : 0, 247635
GlobalTracer [candidate] (247.98 ms) : 0, 247980
IAST [baseline] (25.431 ms) : 0, 25431
IAST [candidate] (25.433 ms) : 0, 25433
AppSec [baseline] (26.665 ms) : 0, 26665
AppSec [candidate] (26.683 ms) : 0, 26683
Debugger [baseline] (70.52 ms) : 0, 70520
Debugger [candidate] (68.561 ms) : 0, 68561
Remote Config [baseline] (538.034 µs) : 0, 538
Remote Config [candidate] (515.251 µs) : 0, 515
Telemetry [baseline] (9.825 ms) : 0, 9825
Telemetry [candidate] (11.276 ms) : 0, 11276
Flare Poller [baseline] (3.479 ms) : 0, 3479
Flare Poller [candidate] (3.949 ms) : 0, 3949
section profiling
crashtracking [baseline] (1.17 ms) : 0, 1170
crashtracking [candidate] (1.188 ms) : 0, 1188
BytebuddyAgent [baseline] (682.794 ms) : 0, 682794
BytebuddyAgent [candidate] (692.943 ms) : 0, 692943
AgentMeter [baseline] (8.986 ms) : 0, 8986
AgentMeter [candidate] (9.102 ms) : 0, 9102
GlobalTracer [baseline] (215.459 ms) : 0, 215459
GlobalTracer [candidate] (218.223 ms) : 0, 218223
AppSec [baseline] (32.086 ms) : 0, 32086
AppSec [candidate] (32.703 ms) : 0, 32703
Debugger [baseline] (64.47 ms) : 0, 64470
Debugger [candidate] (66.623 ms) : 0, 66623
Remote Config [baseline] (564.797 µs) : 0, 565
Remote Config [candidate] (586.107 µs) : 0, 586
Telemetry [baseline] (8.48 ms) : 0, 8480
Telemetry [candidate] (7.828 ms) : 0, 7828
Flare Poller [baseline] (4.21 ms) : 0, 4210
Flare Poller [candidate] (3.551 ms) : 0, 3551
ProfilingAgent [baseline] (93.724 ms) : 0, 93724
ProfilingAgent [candidate] (94.876 ms) : 0, 94876
Profiling [baseline] (94.285 ms) : 0, 94285
Profiling [candidate] (95.442 ms) : 0, 95442
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.61.0-SNAPSHOT~668e51355f, baseline=1.61.0-SNAPSHOT~c00f676bb9

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.061 s) : 0, 1061263
Total [baseline] (8.836 s) : 0, 8835661
Agent [candidate] (1.058 s) : 0, 1058319
Total [candidate] (8.838 s) : 0, 8837746
section iast
Agent [baseline] (1.222 s) : 0, 1222093
Total [baseline] (9.527 s) : 0, 9527343
Agent [candidate] (1.226 s) : 0, 1225838
Total [candidate] (9.539 s) : 0, 9539038
  • baseline results
| Module | Variant | Duration | Δ tracing |
| --- | --- | --- | --- |
| Agent | tracing | 1.061 s | - |
| Agent | iast | 1.222 s | 160.83 ms (15.2%) |
| Total | tracing | 8.836 s | - |
| Total | iast | 9.527 s | 691.683 ms (7.8%) |
  • candidate results
| Module | Variant | Duration | Δ tracing |
| --- | --- | --- | --- |
| Agent | tracing | 1.058 s | - |
| Agent | iast | 1.226 s | 167.519 ms (15.8%) |
| Total | tracing | 8.838 s | - |
| Total | iast | 9.539 s | 701.293 ms (7.9%) |
gantt
    title insecure-bank - break down per module: candidate=1.61.0-SNAPSHOT~668e51355f, baseline=1.61.0-SNAPSHOT~c00f676bb9

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.23 ms) : 0, 1230
crashtracking [candidate] (1.227 ms) : 0, 1227
BytebuddyAgent [baseline] (632.895 ms) : 0, 632895
BytebuddyAgent [candidate] (629.962 ms) : 0, 629962
AgentMeter [baseline] (29.574 ms) : 0, 29574
AgentMeter [candidate] (29.349 ms) : 0, 29349
GlobalTracer [baseline] (257.364 ms) : 0, 257364
GlobalTracer [candidate] (257.18 ms) : 0, 257180
AppSec [baseline] (31.632 ms) : 0, 31632
AppSec [candidate] (31.756 ms) : 0, 31756
Debugger [baseline] (59.611 ms) : 0, 59611
Debugger [candidate] (59.599 ms) : 0, 59599
Remote Config [baseline] (585.298 µs) : 0, 585
Remote Config [candidate] (592.191 µs) : 0, 592
Telemetry [baseline] (8.034 ms) : 0, 8034
Telemetry [candidate] (8.163 ms) : 0, 8163
Flare Poller [baseline] (4.249 ms) : 0, 4249
Flare Poller [candidate] (4.36 ms) : 0, 4360
section iast
crashtracking [baseline] (1.213 ms) : 0, 1213
crashtracking [candidate] (1.233 ms) : 0, 1233
BytebuddyAgent [baseline] (792.974 ms) : 0, 792974
BytebuddyAgent [candidate] (795.263 ms) : 0, 795263
AgentMeter [baseline] (11.383 ms) : 0, 11383
AgentMeter [candidate] (11.358 ms) : 0, 11358
GlobalTracer [baseline] (245.929 ms) : 0, 245929
GlobalTracer [candidate] (247.186 ms) : 0, 247186
IAST [baseline] (25.28 ms) : 0, 25280
IAST [candidate] (25.379 ms) : 0, 25379
AppSec [baseline] (26.429 ms) : 0, 26429
AppSec [candidate] (26.508 ms) : 0, 26508
Debugger [baseline] (67.166 ms) : 0, 67166
Debugger [candidate] (67.077 ms) : 0, 67077
Remote Config [baseline] (523.851 µs) : 0, 524
Remote Config [candidate] (529.501 µs) : 0, 530
Telemetry [baseline] (11.175 ms) : 0, 11175
Telemetry [candidate] (11.249 ms) : 0, 11249
Flare Poller [baseline] (3.994 ms) : 0, 3994
Flare Poller [candidate] (3.958 ms) : 0, 3958

Load

Parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| git_branch | master | apm-ai-toolkit/java_integration/gson/20260323-115140 |
| git_commit_date | 1774050014 | 1774284786 |
| git_commit_sha | c00f676 | 668e513 |
| release_version | 1.61.0-SNAPSHOT~c00f676bb9 | 1.61.0-SNAPSHOT~668e51355f |

Matching parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| application | insecure-bank | insecure-bank |
| ci_job_date | 1774287031 | 1774287031 |
| ci_job_id | 1531315760 | 1531315760 |
| ci_pipeline_id | 104030890 | 104030890 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-4jg7t0xh 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-4jg7t0xh 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 4 performance improvements and 1 performance regression. Performance is the same for 16 metrics; 15 metrics are unstable.

| scenario | Δ mean agg_http_req_duration_p50 | Δ mean agg_http_req_duration_p95 | Δ mean throughput | candidate mean agg_http_req_duration_p50 | candidate mean agg_http_req_duration_p95 | candidate mean throughput | baseline mean agg_http_req_duration_p50 | baseline mean agg_http_req_duration_p95 | baseline mean throughput |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| scenario:load:insecure-bank:profiling:high_load | better [-256.257µs; -122.447µs] or [-14.126%; -6.750%] | unstable [-1422.000µs; -550.633µs] or [-25.652%; -9.933%] | unstable [+68.813op/s; +576.750op/s] or [+3.522%; +29.523%] | 1.625ms | 4.557ms | 2276.344op/s | 1.814ms | 5.543ms | 1953.562op/s |
| scenario:load:insecure-bank:iast_GLOBAL:high_load | better [-281.158µs; -124.481µs] or [-9.290%; -4.113%] | better [-619.421µs; -225.564µs] or [-7.440%; -2.709%] | unstable [-69.745op/s; +201.557op/s] or [-5.747%; +16.608%] | 2.824ms | 7.903ms | 1279.500op/s | 3.027ms | 8.326ms | 1213.594op/s |
| scenario:load:insecure-bank:iast_FULL:high_load | better [-402.053µs; -119.573µs] or [-7.356%; -2.188%] | same [-543.990µs; +147.851µs] or [-4.242%; +1.153%] | unstable [-50.833op/s; +110.271op/s] or [-6.701%; +14.536%] | 5.205ms | 12.626ms | 788.344op/s | 5.465ms | 12.824ms | 758.625op/s |
| scenario:load:petclinic:appsec:high_load | worse [+0.811ms; +1.875ms] or [+4.409%; +10.200%] | unsure [+0.450ms; +2.200ms] or [+1.478%; +7.227%] | unstable [-35.281op/s; +10.593op/s] or [-14.300%; +4.294%] | 19.729ms | 31.769ms | 234.375op/s | 18.386ms | 30.444ms | 246.719op/s |

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~668e51355f, baseline=1.61.0-SNAPSHOT~c00f676bb9
    dateFormat X
    axisFormat %s
section baseline
no_agent (19.136 ms) : 18944, 19328
.   : milestone, 19136,
appsec (18.913 ms) : 18721, 19105
.   : milestone, 18913,
code_origins (17.642 ms) : 17468, 17815
.   : milestone, 17642,
iast (17.807 ms) : 17630, 17983
.   : milestone, 17807,
profiling (18.568 ms) : 18383, 18754
.   : milestone, 18568,
tracing (17.73 ms) : 17554, 17905
.   : milestone, 17730,
section candidate
no_agent (18.065 ms) : 17880, 18250
.   : milestone, 18065,
appsec (19.922 ms) : 19715, 20129
.   : milestone, 19922,
code_origins (17.657 ms) : 17483, 17831
.   : milestone, 17657,
iast (18.068 ms) : 17888, 18248
.   : milestone, 18068,
profiling (18.512 ms) : 18331, 18694
.   : milestone, 18512,
tracing (17.532 ms) : 17356, 17708
.   : milestone, 17532,
  • baseline results
| Variant | Request duration [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 19.136 ms [18.944 ms, 19.328 ms] | - |
| appsec | 18.913 ms [18.721 ms, 19.105 ms] | -223.052 µs (-1.2%) |
| code_origins | 17.642 ms [17.468 ms, 17.815 ms] | -1.494 ms (-7.8%) |
| iast | 17.807 ms [17.63 ms, 17.983 ms] | -1.329 ms (-6.9%) |
| profiling | 18.568 ms [18.383 ms, 18.754 ms] | -567.544 µs (-3.0%) |
| tracing | 17.73 ms [17.554 ms, 17.905 ms] | -1.406 ms (-7.3%) |
  • candidate results
| Variant | Request duration [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 18.065 ms [17.88 ms, 18.25 ms] | - |
| appsec | 19.922 ms [19.715 ms, 20.129 ms] | 1.857 ms (10.3%) |
| code_origins | 17.657 ms [17.483 ms, 17.831 ms] | -407.898 µs (-2.3%) |
| iast | 18.068 ms [17.888 ms, 18.248 ms] | 3.237 µs (0.0%) |
| profiling | 18.512 ms [18.331 ms, 18.694 ms] | 447.312 µs (2.5%) |
| tracing | 17.532 ms [17.356 ms, 17.708 ms] | -532.608 µs (-2.9%) |

Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~668e51355f, baseline=1.61.0-SNAPSHOT~c00f676bb9
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.182 ms) : 1170, 1194
.   : milestone, 1182,
iast (3.121 ms) : 3080, 3162
.   : milestone, 3121,
iast_FULL (6.096 ms) : 6033, 6159
.   : milestone, 6096,
iast_GLOBAL (3.782 ms) : 3719, 3846
.   : milestone, 3782,
profiling (2.321 ms) : 2297, 2345
.   : milestone, 2321,
tracing (1.774 ms) : 1760, 1789
.   : milestone, 1774,
section candidate
no_agent (1.17 ms) : 1159, 1181
.   : milestone, 1170,
iast (3.207 ms) : 3164, 3249
.   : milestone, 3207,
iast_FULL (5.867 ms) : 5808, 5927
.   : milestone, 5867,
iast_GLOBAL (3.585 ms) : 3526, 3644
.   : milestone, 3585,
profiling (1.981 ms) : 1964, 1999
.   : milestone, 1981,
tracing (1.788 ms) : 1774, 1803
.   : milestone, 1788,
  • baseline results
| Variant | Request duration [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 1.182 ms [1.17 ms, 1.194 ms] | - |
| iast | 3.121 ms [3.08 ms, 3.162 ms] | 1.939 ms (164.0%) |
| iast_FULL | 6.096 ms [6.033 ms, 6.159 ms] | 4.914 ms (415.6%) |
| iast_GLOBAL | 3.782 ms [3.719 ms, 3.846 ms] | 2.6 ms (219.9%) |
| profiling | 2.321 ms [2.297 ms, 2.345 ms] | 1.139 ms (96.3%) |
| tracing | 1.774 ms [1.76 ms, 1.789 ms] | 592.218 µs (50.1%) |
  • candidate results
| Variant | Request duration [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 1.17 ms [1.159 ms, 1.181 ms] | - |
| iast | 3.207 ms [3.164 ms, 3.249 ms] | 2.036 ms (174.0%) |
| iast_FULL | 5.867 ms [5.808 ms, 5.927 ms] | 4.697 ms (401.4%) |
| iast_GLOBAL | 3.585 ms [3.526 ms, 3.644 ms] | 2.415 ms (206.4%) |
| profiling | 1.981 ms [1.964 ms, 1.999 ms] | 811.082 µs (69.3%) |
| tracing | 1.788 ms [1.774 ms, 1.803 ms] | 618.219 µs (52.8%) |

Dacapo

Parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| git_branch | master | apm-ai-toolkit/java_integration/gson/20260323-115140 |
| git_commit_date | 1774050014 | 1774284786 |
| git_commit_sha | c00f676 | 668e513 |
| release_version | 1.61.0-SNAPSHOT~c00f676bb9 | 1.61.0-SNAPSHOT~668e51355f |

Matching parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| application | biojava | biojava |
| ci_job_date | 1774286935 | 1774286935 |
| ci_job_id | 1531315761 | 1531315761 |
| ci_pipeline_id | 104030890 | 104030890 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-1-j2ddva3c 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-1-j2ddva3c 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 0 performance regressions. Performance is the same for 11 metrics; 1 metric is unstable.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~668e51355f, baseline=1.61.0-SNAPSHOT~c00f676bb9
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.847 s) : 14847000, 14847000
.   : milestone, 14847000,
appsec (14.814 s) : 14814000, 14814000
.   : milestone, 14814000,
iast (18.905 s) : 18905000, 18905000
.   : milestone, 18905000,
iast_GLOBAL (17.785 s) : 17785000, 17785000
.   : milestone, 17785000,
profiling (15.011 s) : 15011000, 15011000
.   : milestone, 15011000,
tracing (14.98 s) : 14980000, 14980000
.   : milestone, 14980000,
section candidate
no_agent (15.516 s) : 15516000, 15516000
.   : milestone, 15516000,
appsec (14.521 s) : 14521000, 14521000
.   : milestone, 14521000,
iast (17.835 s) : 17835000, 17835000
.   : milestone, 17835000,
iast_GLOBAL (17.785 s) : 17785000, 17785000
.   : milestone, 17785000,
profiling (15.387 s) : 15387000, 15387000
.   : milestone, 15387000,
tracing (14.812 s) : 14812000, 14812000
.   : milestone, 14812000,
  • baseline results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 14.847 s [14.847 s, 14.847 s] | - |
| appsec | 14.814 s [14.814 s, 14.814 s] | -33.0 ms (-0.2%) |
| iast | 18.905 s [18.905 s, 18.905 s] | 4.058 s (27.3%) |
| iast_GLOBAL | 17.785 s [17.785 s, 17.785 s] | 2.938 s (19.8%) |
| profiling | 15.011 s [15.011 s, 15.011 s] | 164.0 ms (1.1%) |
| tracing | 14.98 s [14.98 s, 14.98 s] | 133.0 ms (0.9%) |
  • candidate results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 15.516 s [15.516 s, 15.516 s] | - |
| appsec | 14.521 s [14.521 s, 14.521 s] | -995.0 ms (-6.4%) |
| iast | 17.835 s [17.835 s, 17.835 s] | 2.319 s (14.9%) |
| iast_GLOBAL | 17.785 s [17.785 s, 17.785 s] | 2.269 s (14.6%) |
| profiling | 15.387 s [15.387 s, 15.387 s] | -129.0 ms (-0.8%) |
| tracing | 14.812 s [14.812 s, 14.812 s] | -704.0 ms (-4.5%) |

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~668e51355f, baseline=1.61.0-SNAPSHOT~c00f676bb9
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.482 ms) : 1470, 1493
.   : milestone, 1482,
appsec (3.79 ms) : 3570, 4009
.   : milestone, 3790,
iast (2.261 ms) : 2192, 2330
.   : milestone, 2261,
iast_GLOBAL (2.309 ms) : 2240, 2379
.   : milestone, 2309,
profiling (2.115 ms) : 2059, 2172
.   : milestone, 2115,
tracing (2.085 ms) : 2031, 2139
.   : milestone, 2085,
section candidate
no_agent (1.479 ms) : 1468, 1491
.   : milestone, 1479,
appsec (3.816 ms) : 3594, 4037
.   : milestone, 3816,
iast (2.267 ms) : 2198, 2335
.   : milestone, 2267,
iast_GLOBAL (2.312 ms) : 2242, 2381
.   : milestone, 2312,
profiling (2.093 ms) : 2038, 2147
.   : milestone, 2093,
tracing (2.08 ms) : 2027, 2134
.   : milestone, 2080,
  • baseline results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 1.482 ms [1.47 ms, 1.493 ms] | - |
| appsec | 3.79 ms [3.57 ms, 4.009 ms] | 2.308 ms (155.7%) |
| iast | 2.261 ms [2.192 ms, 2.33 ms] | 778.982 µs (52.6%) |
| iast_GLOBAL | 2.309 ms [2.24 ms, 2.379 ms] | 827.511 µs (55.8%) |
| profiling | 2.115 ms [2.059 ms, 2.172 ms] | 633.389 µs (42.7%) |
| tracing | 2.085 ms [2.031 ms, 2.139 ms] | 602.856 µs (40.7%) |
  • candidate results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 1.479 ms [1.468 ms, 1.491 ms] | - |
| appsec | 3.816 ms [3.594 ms, 4.037 ms] | 2.336 ms (157.9%) |
| iast | 2.267 ms [2.198 ms, 2.335 ms] | 787.058 µs (53.2%) |
| iast_GLOBAL | 2.312 ms [2.242 ms, 2.381 ms] | 832.395 µs (56.3%) |
| profiling | 2.093 ms [2.038 ms, 2.147 ms] | 613.211 µs (41.4%) |
| tracing | 2.08 ms [2.027 ms, 2.134 ms] | 600.901 µs (40.6%) |

Contributor

@PerfectSlayer left a comment

Feedback from LP about generated instrumentation


@Override
protected String[] instrumentationNames() {
return new String[] {"gson"};

❔ question: Should there be an alias with the version?


import datadog.trace.bootstrap.CallDepthThreadLocalMap;

public class GsonHelper {

❔ question: What are the benefits of such a helper? Only one type is instrumented, so why not use that type directly for the CallDepthThreadLocalMap calls?

import datadog.trace.agent.test.InstrumentationSpecification
import datadog.trace.bootstrap.instrumentation.api.Tags

class GsonTest extends InstrumentationSpecification {

🔨 issue: At the very least, it is missing error/exception handling.
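
One reading of this comment is that the path where the instrumented JSON call throws is not covered. The sketch below shows the behaviour such a test would typically assert: the span records the throwable and is still finished before the exception propagates. `SpanSketch` and `ExitHandlingSketch` are hypothetical stand-ins, not the tracer's span API or the generated advice; the try/catch/finally only mimics what an exit hook that also runs on exceptions would do.

```java
import java.util.function.Supplier;

/** Hypothetical minimal span: only the two facts a test would check. */
final class SpanSketch {
  boolean finished;
  Throwable error;
}

/** Hypothetical wrapper mimicking an exit hook that also runs when the call throws. */
final class ExitHandlingSketch {

  /** Runs the call, records any unchecked throwable on the span, and always finishes the span. */
  static <T> T traced(SpanSketch span, Supplier<T> call) {
    try {
      return call.get();
    } catch (RuntimeException | Error t) {
      span.error = t; // keep the failure visible on the span
      throw t; // do not swallow the application's exception
    } finally {
      span.finished = true; // finish on both the normal and the error path
    }
  }
}
```

A test for the error path would pass malformed input to the instrumented call and assert both that the original exception still reaches the caller and that the finished span carries it.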

@PerfectSlayer PerfectSlayer removed their assignment Mar 26, 2026
Labels

  • tag: do not merge (Do not merge changes)
  • tag: experimental (Experimental changes)
