Skip to content

A faster reconstruction of the request list for the neigborlist#1373

Draft
Iximiel wants to merge 11 commits intoplumed:masterfrom
Iximiel:update/NLrequestlist
Draft

A faster reconstruction of the request list for the neigborlist#1373
Iximiel wants to merge 11 commits intoplumed:masterfrom
Iximiel:update/NLrequestlist

Conversation

@Iximiel
Copy link
Member

@Iximiel Iximiel commented Mar 5, 2026

Description

Like we discussed, I started with a very naive implementation of the "used list" for building the request list for the next steps.

Naively skipping overbuilding the list by just reordering the atoms that are in the couple list has a small impact:
(with 10000 [up] and 20000 atoms [below])

Figure_1

the nl settings are the usual NLIST NL_CUTOFF=dmax+1.5 NL_STRIDE=s,with a single group. the benchmark has been run in 30 seconds, and I am showing the number of steps in that timeframe

Note

If I remove the removeduplicates from setRequestList, I break a bunch of tests and the ordered array optimization in getReducedAtomList

I am working on skipping the recalculation of the requestlist in case of an ALL v ALL simulation (for example with a single group)

Target release

I would like my code to appear in release v2.11

Type of contribution
  • changes to code or doc authored by PLUMED developers, or additions of code in the core or within the default modules
  • changes to a module not authored by you
  • new module contribution or edit of a module authored by you
Copyright
  • I agree to transfer the copyright of the code I have written to the PLUMED developers or to the author of the code I am modifying.
  • the module I added or modified contains a COPYRIGHT file with the correct license information. Code should be released under an open source license. I also used the command cd src && ./header.sh mymodulename in order to make sure the headers of the module are correct.
Tests
  • I added a new regtest or modified an existing regtest to validate my changes.
  • I verified that all regtests are passed successfully on GitHub Actions.

@Iximiel Iximiel force-pushed the update/NLrequestlist branch from ff9e8a1 to e80f546 Compare March 5, 2026 16:21
Comment on lines +323 to +343
for (unsigned i=0 ; i < requestIndexes_.size(); ++i) {
if(requestIndexes_[i]) {
requestlist_.push_back(fullatomlist_[i]);
}
}
Tools::removeDuplicates(requestlist_);
reduced=false;
//avoid bisecting the list again and again:
// I exploit the fact that requestlist_ is an ordered vector
// And I assume that index0 and index1 actually exists in the requestlist_ (see setRequestList())
// so I can use lower_bond that uses binary seach instead of find

for (unsigned i=0 ; i < fullatomlist_.size(); ++i) {
const auto id=fullatomlist_[i];
auto p = std::lower_bound(requestlist_.begin(), requestlist_.end(), id);
if (*p == id) {
indexesRemap_[i]=p-requestlist_.begin();
} else {
//this atom is not used
indexesRemap_[i]=fullatomlist_.size();
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I observed that if I change in this way is slightly slower, even if this is slower

Suggested change
for (unsigned i=0 ; i < requestIndexes_.size(); ++i) {
if(requestIndexes_[i]) {
requestlist_.push_back(fullatomlist_[i]);
}
}
Tools::removeDuplicates(requestlist_);
reduced=false;
//avoid bisecting the list again and again:
// I exploit the fact that requestlist_ is an ordered vector
// And I assume that index0 and index1 actually exists in the requestlist_ (see setRequestList())
// so I can use lower_bond that uses binary seach instead of find
for (unsigned i=0 ; i < fullatomlist_.size(); ++i) {
const auto id=fullatomlist_[i];
auto p = std::lower_bound(requestlist_.begin(), requestlist_.end(), id);
if (*p == id) {
indexesRemap_[i]=p-requestlist_.begin();
} else {
//this atom is not used
indexesRemap_[i]=fullatomlist_.size();
}
}
usigned idx=0;
for (unsigned i=0 ; i < requestIndexes_.size(); ++i) {
if(requestIndexes_[i]) {
requestlist_.push_back(fullatomlist_[i]);
indexesRemap_[i]=idx;
++idx;
}
}

Note that with this change to work I am not removing duplicates

Measuremnets done with:

WO1: GROUP ATOMS=1-5000
WO2: GROUP ATOMS=5001-10000

cpu: COORDINATION ...
 GROUPA=WO1 GROUPB=WO2
 SWITCH={RATIONAL NN=6 MM=12 D_0=0.6 R_0=1 D_MAX=2.0}
 NLIST NL_CUTOFF=dmax+1.5 NL_STRIDE=10
...

using the benchmark with the sc configurations and 10000 atoms, in 10 seconds of tuns I get the following amount of steps:

DMAX stride master unordered remap
2 10 1681 1731 1821
2 20 3081 3181 3341
2 40 5353 5441 5681
4 10 773 1085 1139
4 20 1241 1601 1641
4 40 1768 2087 2161
6 10 311 517 545
6 20 460 637 665
6 40 601 713 756

(FYI the number of step WITHOUT the NL is 93 with dmax=6, 96 with dmax=4 and 98 with dmax=2)

The benchmark here is aimed at running the calculations on the "active" atoms, that should be the slice in the middle of the cube

@Iximiel Iximiel force-pushed the update/NLrequestlist branch from 448f57c to f4076ac Compare March 6, 2026 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant