A faster reconstruction of the request list for the neigborlist#1373
Draft
Iximiel wants to merge 11 commits intoplumed:masterfrom
Draft
A faster reconstruction of the request list for the neigborlist#1373Iximiel wants to merge 11 commits intoplumed:masterfrom
Iximiel wants to merge 11 commits intoplumed:masterfrom
Conversation
ff9e8a1 to
e80f546
Compare
Iximiel
commented
Mar 6, 2026
Comment on lines
+323
to
+343
| for (unsigned i=0 ; i < requestIndexes_.size(); ++i) { | ||
| if(requestIndexes_[i]) { | ||
| requestlist_.push_back(fullatomlist_[i]); | ||
| } | ||
| } | ||
| Tools::removeDuplicates(requestlist_); | ||
| reduced=false; | ||
| //avoid bisecting the list again and again: | ||
| // I exploit the fact that requestlist_ is an ordered vector | ||
| // And I assume that index0 and index1 actually exists in the requestlist_ (see setRequestList()) | ||
| // so I can use lower_bond that uses binary seach instead of find | ||
|
|
||
| for (unsigned i=0 ; i < fullatomlist_.size(); ++i) { | ||
| const auto id=fullatomlist_[i]; | ||
| auto p = std::lower_bound(requestlist_.begin(), requestlist_.end(), id); | ||
| if (*p == id) { | ||
| indexesRemap_[i]=p-requestlist_.begin(); | ||
| } else { | ||
| //this atom is not used | ||
| indexesRemap_[i]=fullatomlist_.size(); | ||
| } | ||
| } |
Member
Author
There was a problem hiding this comment.
I observed that if I change in this way is slightly slower, even if this is slower
Suggested change
| for (unsigned i=0 ; i < requestIndexes_.size(); ++i) { | |
| if(requestIndexes_[i]) { | |
| requestlist_.push_back(fullatomlist_[i]); | |
| } | |
| } | |
| Tools::removeDuplicates(requestlist_); | |
| reduced=false; | |
| //avoid bisecting the list again and again: | |
| // I exploit the fact that requestlist_ is an ordered vector | |
| // And I assume that index0 and index1 actually exists in the requestlist_ (see setRequestList()) | |
| // so I can use lower_bond that uses binary seach instead of find | |
| for (unsigned i=0 ; i < fullatomlist_.size(); ++i) { | |
| const auto id=fullatomlist_[i]; | |
| auto p = std::lower_bound(requestlist_.begin(), requestlist_.end(), id); | |
| if (*p == id) { | |
| indexesRemap_[i]=p-requestlist_.begin(); | |
| } else { | |
| //this atom is not used | |
| indexesRemap_[i]=fullatomlist_.size(); | |
| } | |
| } | |
| usigned idx=0; | |
| for (unsigned i=0 ; i < requestIndexes_.size(); ++i) { | |
| if(requestIndexes_[i]) { | |
| requestlist_.push_back(fullatomlist_[i]); | |
| indexesRemap_[i]=idx; | |
| ++idx; | |
| } | |
| } |
Note that with this change to work I am not removing duplicates
Measuremnets done with:
WO1: GROUP ATOMS=1-5000
WO2: GROUP ATOMS=5001-10000
cpu: COORDINATION ...
GROUPA=WO1 GROUPB=WO2
SWITCH={RATIONAL NN=6 MM=12 D_0=0.6 R_0=1 D_MAX=2.0}
NLIST NL_CUTOFF=dmax+1.5 NL_STRIDE=10
...
using the benchmark with the sc configurations and 10000 atoms, in 10 seconds of tuns I get the following amount of steps:
| DMAX | stride | master | unordered | remap |
|---|---|---|---|---|
| 2 | 10 | 1681 | 1731 | 1821 |
| 2 | 20 | 3081 | 3181 | 3341 |
| 2 | 40 | 5353 | 5441 | 5681 |
| 4 | 10 | 773 | 1085 | 1139 |
| 4 | 20 | 1241 | 1601 | 1641 |
| 4 | 40 | 1768 | 2087 | 2161 |
| 6 | 10 | 311 | 517 | 545 |
| 6 | 20 | 460 | 637 | 665 |
| 6 | 40 | 601 | 713 | 756 |
(FYI the number of step WITHOUT the NL is 93 with dmax=6, 96 with dmax=4 and 98 with dmax=2)
The benchmark here is aimed at running the calculations on the "active" atoms, that should be the slice in the middle of the cube
448f57c to
f4076ac
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Like we discussed, I started with a very naive implementation of the "used list" for building the request list for the next steps.
Naively skipping overbuilding the list by just reordering the atoms that are in the couple list has a small impact:
(with 10000 [up] and 20000 atoms [below])
the nl settings are the usual
NLIST NL_CUTOFF=dmax+1.5 NL_STRIDE=s,with a single group. the benchmark has been run in 30 seconds, and I am showing the number of steps in that timeframeNote
If I remove the removeduplicates from
setRequestList, I break a bunch of tests and the ordered array optimization ingetReducedAtomListI am working on skipping the recalculation of the requestlist in case of an
ALL v ALLsimulation (for example with a single group)Target release
I would like my code to appear in release v2.11
Type of contribution
Copyright
COPYRIGHTfile with the correct license information. Code should be released under an open source license. I also used the commandcd src && ./header.sh mymodulenamein order to make sure the headers of the module are correct.Tests