Basically, it appears that the lower the settings, the more often the CPU gets used. But I could be wrong.


All config numbers are exponents of 2, so 20 vs 22 for items_per_kernel means 2^20 = 1,048,576 vs 2^22 = 4,194,304 items, or 4 times as much work per GPU kernel call. The CPU double-checks the best results found after every 2^items_per_kernel * 2^kernels_per_reduction numbers have been checked (e.g. 2^22 * 2^9 = 2^31 = 2,147,483,648 numbers using the config above). So lower settings will make the video more responsive, but the CPU needs to stop and do its validation more often. Also, the CPU is __supposed__ to sit idle while the GPU is crunching, but that really depends on the video drivers, since the OpenCL spec doesn't guarantee asynchronous (a.k.a. non-blocking) kernel calls.
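
If you want to play with the numbers yourself, here is a rough Python sketch of the arithmetic. The function name `numbers_per_reduction` is made up for illustration and is not part of the program; only the two setting names and the values 20, 22 and 9 come from the discussion above.

```python
def numbers_per_reduction(items_per_kernel: int, kernels_per_reduction: int) -> int:
    """Numbers checked on the GPU before the CPU validates the best results.

    Both arguments are exponents of 2, as in the config file.
    (Hypothetical helper, written only to illustrate the arithmetic.)
    """
    items = 1 << items_per_kernel          # 2^items_per_kernel items per kernel call
    kernels = 1 << kernels_per_reduction   # 2^kernels_per_reduction kernel calls per reduction
    return items * kernels


if __name__ == "__main__":
    # Work per kernel call: 2^22 vs 2^20 items.
    print((1 << 22) // (1 << 20))            # 4
    # Numbers checked between CPU validations with 22 and 9.
    print(f"{numbers_per_reduction(22, 9):,}")  # 2,147,483,648
    # Same kernels_per_reduction, but items_per_kernel lowered to 20.
    print(f"{numbers_per_reduction(20, 9):,}")  # 536,870,912
```

Dropping items_per_kernel from 22 to 20 means the CPU validation step comes around 4 times as often for the same amount of total work.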