OpenCL 3.1.1 Released To Address A Possible Performance Regression
The OpenCL 3.1 specification was released earlier this month with a focus on enhancing AI and HPC workloads for this long-standing Khronos standard. Out today is OpenCL 3.1.1, a point release that primarily addresses a possible performance regression introduced by OpenCL 3.1.
OpenCL 3.1.1 reverts the short-lived OpenCL 3.1 behavior in which clGetEventInfo returning CL_COMPLETE acted as a host synchronization point. Those wanting a host synchronization point should instead call a function that waits on the OpenCL event, such as clWaitForEvents. The pull request argued for changing the behavior back to the OpenCL 3.0 semantics to avoid a performance regression from the cost of host synchronization:
"This PR changes the behavior of clGetEventInfo(CL_EVENT_COMMAND_EXECUTION_STATUS) returning CL_COMPLETE back to the behavior in OpenCL 3.0. This avoids a potential performance regression when the stronger host synchronization point is not needed, for example to determine if the event is CL_COMPLETE to query event profiling data."
OpenCL 3.1.1 also reserves some enum blocks for forthcoming Intel and Qualcomm extensions. There are a few other minor fixes as well, but the main change motivating this quick point release is reverting the clGetEventInfo behavior.
The OpenCL 3.1.1 spec can be found on GitHub.
