Direct GPU API
Introduction
The direct GPU API provides direct access to the GPU data of a PhysX scene. This data includes simulation state, such as actors’ poses and velocities, as well as control inputs, such as applied forces and torques. The functions in this API enable batched access to GPU data for PxRigidDynamic, PxArticulationReducedCoordinate and PxShape objects. This allows PhysX to interoperate with GPU pre- and post-processing and to integrate more efficiently into GPU-based applications, most notably end-to-end GPU reinforcement learning pipelines.
Using Direct GPU API
To use this API, the PxSceneFlag::eENABLE_DIRECT_GPU_API flag needs to be raised, in combination with the PxSceneFlag::eENABLE_GPU_DYNAMICS flag and the PxBroadPhaseType::eGPU broad phase type.
PxSceneDesc sceneDesc(gPhysics->getTolerancesScale());
...
sceneDesc.flags |= PxSceneFlag::eENABLE_DIRECT_GPU_API;
sceneDesc.flags |= PxSceneFlag::eENABLE_GPU_DYNAMICS;
sceneDesc.broadPhaseType = PxBroadPhaseType::eGPU;
...
gScene = gPhysics->createScene(sceneDesc);
Note
These options are immutable and cannot be changed after the scene has been created.
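Because these options cannot be changed later, it can help to verify all three prerequisites before creating the scene. The following is a minimal sketch of such a check. It uses stand-in types (SceneFlag, BroadPhase, SceneConfig) instead of the PhysX headers so that it compiles on its own; these names and the flag values are hypothetical, and real code would inspect PxSceneDesc::flags and PxSceneDesc::broadPhaseType instead.

```cpp
#include <cstdint>

// Hypothetical stand-ins for PxSceneFlag and PxBroadPhaseType.
// The numeric values are illustrative and do not match PhysX.
enum SceneFlag : std::uint32_t {
    kEnableGpuDynamics  = 1u << 0,
    kEnableDirectGpuApi = 1u << 1,
};

enum class BroadPhase { Cpu, Gpu };

// Hypothetical stand-in for the relevant part of PxSceneDesc.
struct SceneConfig {
    std::uint32_t flags = 0;
    BroadPhase broadPhase = BroadPhase::Cpu;
};

// Returns true only if both flags are raised and the GPU broad phase is selected.
inline bool supportsDirectGpuApi(const SceneConfig& config) {
    const std::uint32_t required = kEnableGpuDynamics | kEnableDirectGpuApi;
    return (config.flags & required) == required
        && config.broadPhase == BroadPhase::Gpu;
}
```

A check like this is best run once, immediately before the scene is created, since a misconfigured scene has to be recreated rather than patched.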
To access the scene’s GPU data, retrieve the PxDirectGPUAPI instance with the PxScene::getDirectGPUAPI() function, then call one of its member functions, specifying the address of a GPU buffer to read from or write to, the indices of the actors to visit, and the type of data to set, get, or compute.
// allocate memory for the indices of all bodies
CUdeviceptr indexDevice = CUdeviceptr(NULL);
const size_t indexSize = sizeof(PxRigidDynamicGPUIndex) * nbBodies;
cuMemAlloc(&indexDevice, indexSize);
PxRigidDynamicGPUIndex* indexHost = NULL;
cuMemAllocHost((void**)&indexHost, indexSize);
// populate host index data
for(PxU32 i = 0; i < nbBodies; ++i)
    indexHost[i] = bodies[i]->getGPUIndex();
// transfer the indices from host to device
cuMemcpyHtoD(indexDevice, (void*)indexHost, indexSize);
// allocate memory for the bodies' poses
CUdeviceptr poseDataDevice = CUdeviceptr(NULL);
const size_t poseDataSize = nbBodies * sizeof(PxTransform);
cuMemAlloc(&poseDataDevice, poseDataSize);
// retrieve the bodies' poses
gScene->getDirectGPUAPI().getRigidDynamicData((void*)poseDataDevice, (const PxRigidDynamicGPUIndex*)indexDevice,
    PxRigidDynamicGPUAPIReadType::eGLOBAL_POSE, nbBodies);
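The pose buffer filled above normally stays on the GPU for further processing. For debugging, it can be copied back to the host (for example with cuMemcpyDtoH) and inspected there. The sketch below shows one way to interpret such a host copy. It assumes that each element has the layout of PxTransform, that is, a rotation quaternion (x, y, z, w) followed by a position (x, y, z), seven floats in total. HostPose and unpackPoses are hypothetical names introduced for illustration and are not part of PhysX.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side mirror of PxTransform (assumed layout):
// rotation quaternion (x, y, z, w) followed by position (x, y, z).
struct HostPose {
    float qx, qy, qz, qw;
    float px, py, pz;
};
static_assert(sizeof(HostPose) == 7 * sizeof(float), "HostPose must be tightly packed");

// Interprets a raw byte buffer, such as the result of a device-to-host copy,
// as an array of poses. Trailing bytes that do not form a full pose are ignored.
inline std::vector<HostPose> unpackPoses(const std::vector<unsigned char>& bytes) {
    std::vector<HostPose> poses(bytes.size() / sizeof(HostPose));
    if (!poses.empty())
        std::memcpy(poses.data(), bytes.data(), poses.size() * sizeof(HostPose));
    return poses;
}
```

Copying through std::memcpy rather than casting the byte pointer avoids alignment and aliasing problems on the host side.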
Note
Due to the internal architecture of GPU-accelerated PhysX, this GPU API will only work after the first simulation step has been taken, as the simulation first needs to know all the actors it will simulate and set up the sizes of the GPU-based structures. Using this direct GPU API will disable the existing CPU-based API for all the data exposed in the direct GPU API. However, for any CPU API function that does not have a counterpart in this direct GPU API, as well as for setting up actors on the scene before the simulation’s first step, the standard CPU API will continue to work.
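One simple way to respect the first-step requirement is to track completed simulation steps on the application side and to guard every direct GPU API call with that state. The class below is a minimal sketch of this idea. StepTracker is a hypothetical helper, not a PhysX type, and the application is assumed to call onStepCompleted() itself after each completed simulation step.

```cpp
#include <cstdint>

// Hypothetical application-side helper, not part of PhysX.
class StepTracker {
public:
    // Call once after every completed simulation step.
    void onStepCompleted() { ++completedSteps_; }

    // The direct GPU API should only be used once at least one step has completed.
    bool directGpuApiReady() const { return completedSteps_ > 0; }

    std::uint64_t completedSteps() const { return completedSteps_; }

private:
    std::uint64_t completedSteps_ = 0;
};
```

Before the first step, actor setup goes through the standard CPU API as described above; once directGpuApiReady() returns true, reads and writes of the exposed data switch to the direct GPU API.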
Note
Since this API exposes low-level data, it may be changed without deprecation.
Direct GPU API limitations
Because of the internal architecture of the GPU-accelerated parts of PhysX, using this API comes with caveats:
All GPU-CPU copying of data exposed in this API will be disabled. This means that the existing CPU-based API will return outdated data, and any setters for data exposed in the interface will not work. On the other hand, significant speedups can be achieved because of the reduced amount of GPU-CPU memory copying.
Because GPU-CPU copying is disabled, the following features do not work properly and should not be used:
- Rigid Body Dynamics
The simulation output data available through the standard CPU API, like actors’ global poses and velocities, may be outdated and should be accessed through the direct GPU API.
The simulation input data, like initial pose, initial velocity, or applied force, can’t be set through the standard CPU API; the GPU API should be used.
- Articulations
Similarly to rigid bodies, the simulation output data available through the standard CPU API, like links’ poses and velocities, may be outdated and should be accessed through the GPU API.
The simulation input data, like initial pose, initial velocity, or applied force, can’t be set through the standard CPU API; the GPU API should be used.
- Joints
Only the D6 Joint has a GPU implementation. The other joint types are not supported with the GPU API, since CPU-GPU copying is avoided.
- Scene Queries
The current implementation works only on the CPU and relies on actors’ global poses, which are not updated with the GPU API enabled.
- Advanced Collision Detection
GPU narrow phase contact data isn’t copied to the CPU and thus isn’t available in CPU contact reporting.
Contact Modification relies on the CPU narrow phase and isn’t available in GPU contact reporting.
Triggers rely on actors’ global poses on the CPU, which aren’t updated with the GPU API enabled.
Continuous Collision Detection (based on sweep tests) relies on actors’ global poses, so it also doesn’t work. Speculative CCD (based on auto-enlarged contact distances) should still work.
- Custom Geometry (Cone and Cylinder)
The implementation relies on the CPU narrow phase, which in turn relies on actors’ global poses, which aren’t updated with the GPU API enabled.
The contact data generated by the CPU narrow phase isn’t available through the GPU API.
- Vehicles
Rely on scene queries and custom joints.
- Character Controllers
Rely on scene queries.
- Debug Visualization
Relies on actors’ global poses.