[GPU] Illegal memory access #2

@0x0v3rlo4d

When I tried to run mpm_test_oreo, it aborted with an illegal memory access error:

❯ ./build/tests/mpm_test_oreo
[Cuda] Initialize device index to 0
[Cuda] Device Property:
        GPU Device: 0
        Global Memory: 12488343552 bytes
        Shared Memory: 49152 bytes
        Register Per SM: 65536
        Multi-processor count: 28
        SM compute capabilities: 8.6.
[Cuda] Created 32 streams for device 0.
[MPM Engine info] [10:43:27 +07:00] [thread 82916] Set Default dt = 1.0673906e-05
[MPM Engine info] [10:43:27 +07:00] [thread 82916] Start initializing particles.
[MPM Engine info] [10:43:27 +07:00] [thread 82916]      Initializing model with particle count 1615062
[MPM Engine info] [10:43:27 +07:00] [thread 82916]      Initializing model with particle count 658581
[MPM Engine info] [10:43:27 +07:00] [thread 82916] Finished initializating particles.
GPUassert: an illegal memory access was encountered /run/media/overload/Transcend/Project/CKMPM/include/mpm_engine.cuh 228

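I also ran compute-sanitizer to check what was wrong. It points at the following:

  1. An invalid __global__ atomic of size 4 bytes in ActivateBlocksWithParticles
  2. The address of that atomic is out of bounds (before the nearest allocation)
  3. Particle initialization logs two different particle counts (1615062 and 658581), which looks like a mismatch to me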

Here is the complete compute-sanitizer output:

❯ compute-sanitizer ./build/tests/mpm_test_oreo
========= COMPUTE-SANITIZER
[Cuda] Initialize device index to 0
[Cuda] Device Property:
        GPU Device: 0
        Global Memory: 12488343552 bytes
        Shared Memory: 49152 bytes
        Register Per SM: 65536
        Multi-processor count: 28
        SM compute capabilities: 8.6.
[Cuda] Created 32 streams for device 0.
[MPM Engine info] [10:52:14 +07:00] [thread 90740] Set Default dt = 1.0673906e-05
[MPM Engine info] [10:52:14 +07:00] [thread 90740] Start initializing particles.
[MPM Engine info] [10:52:14 +07:00] [thread 90740]      Initializing model with particle count 1615062
[MPM Engine info] [10:52:14 +07:00] [thread 90740]      Initializing model with particle count 658581
[MPM Engine info] [10:52:14 +07:00] [thread 90740] Finished initializating particles.
========= Invalid __global__ atomic of size 4 bytes
=========     at void mpm::ActivateBlocksWithParticles<mpm::test::MPMTestScene::MPMTestOreoConfig, mpm::MPMPartition<mpm::MPMGridConfig<mpm::MPMDomain<mpm::MPMDomainRange<(int)64, (int)64, (int)64>, mpm::MPMDomainOffset<(int)0, (int)0, (int)0>, (int)-1>, mpm::meta::Empty>>, mpm::Matrix<float, (unsigned long)3, (unsigned long)1> *>(T1, unsigned int, T3, T2)+0x280
=========     by thread (5,0,0) in block (0,0,0)
=========     Access to 0x7feec87fc530 is out of bounds
=========     and is 15.056 bytes before the nearest allocation at 0x7feec8800000 of size 1.048.576 bytes
=========     Saved host backtrace up to driver entry point at kernel launch time
=========         Host Frame: cudaLaunchKernel [0x101c77] in mpm_test_oreo
=========         Host Frame: void mpm::ActivateBlocksWithParticles<mpm::test::MPMTestScene::MPMTestOreoConfig, mpm::MPMPartition<mpm::MPMGridConfig<mpm::MPMDomain<mpm::MPMDomainRange<64, 64, 64>, mpm::MPMDomainOffset<0, 0, 0>, -1>, mpm::meta::Empty> >, mpm::Matrix<float, 3ul, 1ul>*>(mpm::test::MPMTestScene::MPMTestOreoConfig, unsigned int, mpm::Matrix<float, 3ul, 1ul>*, mpm::MPMPartition<mpm::MPMGridConfig<mpm::MPMDomain<mpm::MPMDomainRange<64, 64, 64>, mpm::MPMDomainOffset<0, 0, 0>, -1>, mpm::meta::Empty> >) [0x16227] in mpm_test_oreo
=========         Host Frame: void mpm::MPMEngine<mpm::MPMGridConfig<mpm::MPMDomain<mpm::MPMDomainRange<64, 64, 64>, mpm::MPMDomainOffset<0, 0, 0>, -1>, mpm::meta::Empty> >::InitialSetup<mpm::test::MPMTestScene::MPMTestOreoConfig>(mpm::test::MPMTestScene::MPMTestOreoConfig const&) [0x838b2] in mpm_test_oreo
=========         Host Frame: main [0x15be4] in mpm_test_oreo
========= 
========= Program hit cudaErrorLaunchFailure (error 719) due to "unspecified launch failure" on CUDA API call to cudaMemcpyAsync.
=========     Saved host backtrace up to driver entry point at error
=========         Host Frame: void mpm::MPMEngine<mpm::MPMGridConfig<mpm::MPMDomain<mpm::MPMDomainRange<64, 64, 64>, mpm::MPMDomainOffset<0, 0, 0>, -1>, mpm::meta::Empty> >::InitialSetup<mpm::test::MPMTestScene::MPMTestOreoConfig>(mpm::test::MPMTestScene::MPMTestOreoConfig const&) [0x83926] in mpm_test_oreo
=========         Host Frame: main [0x15be4] in mpm_test_oreo
========= 
GPUassert: unspecified launch failure /run/media/overload/Transcend/Project/CKMPM/include/mpm_engine.cuh 228
========= Target application returned an error
========= ERROR SUMMARY: 2 errors
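
The report above has enough information to work out which element the atomic tried to touch. Below is a minimal sketch of that arithmetic. It assumes the target array has 4-byte elements (inferred from the atomic size, not confirmed from the source), and the helper name ElementOffset is only for illustration, not something from CKMPM:

```cpp
#include <cassert>
#include <cstdint>

// Signed element offset of a faulting address relative to the start of the
// nearest allocation. A negative result means the access landed before the
// buffer, which is what a negative array index would produce.
constexpr std::int64_t ElementOffset(std::uint64_t fault_addr,
                                     std::uint64_t alloc_base,
                                     std::uint64_t elem_size) {
    const std::int64_t byte_offset = static_cast<std::int64_t>(fault_addr) -
                                     static_cast<std::int64_t>(alloc_base);
    return byte_offset / static_cast<std::int64_t>(elem_size);
}

// Values copied from the compute-sanitizer report.
constexpr std::uint64_t kFaultAddr = 0x7feec87fc530ULL;
constexpr std::uint64_t kAllocBase = 0x7feec8800000ULL;
constexpr std::uint64_t kAtomicSize = 4;  // assumption: element size == atomic size

// 0x7feec8800000 - 0x7feec87fc530 = 0x3AD0 = 15056 bytes, i.e. 3764 elements.
static_assert(ElementOffset(kFaultAddr, kAllocBase, kAtomicSize) == -3764,
              "fault is 3764 four-byte elements before the allocation");
```

The byte distance matches the "15.056 bytes before" line in the report (the dot is a thousands separator). The offset is negative, so the address sits before the buffer rather than past its end. That would be consistent with an index computed from a particle position going negative, but this is a guess: I have not traced it in the kernel.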

I assumed a compilation error was at play, so I recompiled, but to no avail: the issue persists. Is there any way to adjust the simulation parameters to find the cause of the error?
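
One check that might narrow this down is to test, on the host, whether any initial particle position maps outside the 64x64x64 domain before ActivateBlocksWithParticles is launched. The sketch below is hypothetical: it assumes a cell coordinate is floor(position / dx) with the domain starting at 0, which may not be how CKMPM actually maps positions to blocks. Vec3, FindOutOfDomainParticles, dx and cells_per_axis are illustrative names, not part of the project:

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec3 = std::array<float, 3>;

// Returns the indices of particles whose cell coordinate floor(p / dx) falls
// outside [0, cells_per_axis) on any axis. NaN coordinates are reported too,
// because every comparison against NaN is false.
std::vector<std::size_t> FindOutOfDomainParticles(
    const std::vector<Vec3>& positions, float dx, int cells_per_axis) {
    std::vector<std::size_t> bad;
    for (std::size_t i = 0; i < positions.size(); ++i) {
        for (float coord : positions[i]) {
            const float cell = std::floor(coord / dx);
            const bool inside =
                cell >= 0.0f && cell < static_cast<float>(cells_per_axis);
            if (!inside) {
                bad.push_back(i);
                break;  // one bad axis is enough to flag the particle
            }
        }
    }
    return bad;
}
```

If this returns a non-empty list for either model, the scene's particle placement or grid spacing would be the first parameters to look at. If it returns nothing, the negative index presumably comes from somewhere else in the kernel.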
