Speed up simulation hot path (histories + in-place pool update)#50
Open
ythat wants to merge 1 commit into
Open
Speed up simulation hot path (histories + in-place pool update)#50ythat wants to merge 1 commit into
ythat wants to merge 1 commit into
Conversation
Cache time-slice row indices and align source rows by id in the history helpers (lagged/lagavg/cumavg/visit_sum) instead of rebuilding subsets and full-length %in% filters per history variable. Update the time-t slice in simulate() by reference rather than via row subassignment, and fetch the previous time slice once in the survival update. Results are unchanged (all.equal-identical to before); ~1.8x faster on the bundled survival example.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Hi! Thank you for developing and maintaining the
gfoRmulapackage.It has been very helpful both at work as well as the summer courses at CAUSALab. 😊
With the help of Claude, I'm proposing some changes that help with the performance, yielding identical output as the original package.
The short version: Across every scenario I tested, the point estimates and bootstrap CIs come out
all.equal()-identical to the currentmaster. It touches only two files and adds no dependencies (still pure R ondata.table).What changed
R/histories.RThe history helpers (
lagged,lagavg,cumavg,visit_sum) were rebuilding the same time-slice subset several times and constructing a full-lengthget(id_name) %in% current_idsfilter once per history variable per time step. The refactor caches each time slice's row indices once and aligns source rows to the current rows by id withmatch(), then writes by reference withdata.table::set(). The first-creation cumulative-average branches swap the repeated filteredtapply()calls for grouped means viasplit()/vapply()with id-indexed assignment. This relies on the existing invariant that the pool has one row per id per time in a consistent order.R/simulate.RTwo small changes:
tslice was written back withpool[pool[[time_name]] == t] <- newdf, a row-subassignment that round-trips the whole table. It's now updated in place at the precomputed row index withset()(creating any new columns first).prodp0/prodd0/poprisk. It now grabs that slice once and reuses it; the arithmetic is untouched.Benchmark
Using the bundled
basicdata_nocompexample (the documentedgformula_survivalspec; 2,500 ids, 7 time points), so it's fully reproducible without any external data. Same call onmastervs this branch. Machine: Apple Silicon, R 4.3.2,data.table 1.17.8.
nsamples = 0)¹ On a dataset this small the parallel run is dominated by fixed cluster-setup overhead (16 cores give the baseline only ~2.8× over its own serial run), so the per-simulation savings show up less in wall-clock here. They scale up with the number of time points, covariates, and bootstrap samples. For instance, on a larger analysis I work with (~7k subjects, 48 time points, 10 time varying covariates) the bootstrap speedup was around 2.3×, also with identical results.
I also spot-checked that output stays identical across integer / character / factor / non-sequential ids, the
continuous_eofoutcome type, and a competing-event survival analysis.I hope this is helpful 🙂