Limitations with cv::gpu convolution kernels (Feature #1134)


Added by Nghia Ho over 13 years ago. Updated almost 13 years ago.


Status: Done
Start date:
Priority: High
Due date:
Assignee: Vladislav Vinogradov
% Done: 0%
Category: gpu (cuda)
Target version: 2.4.0
Difficulty:
Pull request:

Description

I was playing around with cv::gpu::GaussianBlur and noticed it limits the kernel size to 16 in width (from assert errors), which seems like a big limitation that should either be documented or corrected. I'm guessing the smaller kernel size allows for a more efficient implementation on the GPU but maybe it's also useful to include a more general version, albeit slower.


History

Updated by Nicu Stiurca over 13 years ago

I'm seeing the same problem, but it looks like it's much more widespread than cv::gpu::GaussianBlur(). In my case, I tried calling cv::gpu::createSeparableLinearFilter_GPU() with some kernels of size > 16 (e.g., 25), and I got an assertion failure, which I think must be the same one the OP references. Anyway, it looks like this limitation applies to all GPU filter kernels in the current implementation.

Updated by Anatoly Baksheev over 13 years ago

It's inefficient to convolve with a brute-force algorithm at such kernel sizes. Use cv::gpu::convolve to leverage the DFT-based convolution technique.
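
To put rough numbers on that, here is a minimal sketch that counts multiply-adds for a direct (brute-force) 2D convolution versus two separable 1D passes. The function names are hypothetical helpers, not part of OpenCV, and border handling and memory traffic are ignored, so treat the figures as an order-of-magnitude estimate only.

```cpp
#include <cassert>
#include <cstdint>

// Multiply-adds per output pixel for a direct (brute-force) 2D convolution
// with a kw x kh kernel: every kernel tap is visited once per pixel.
std::int64_t directMacsPerPixel(int kw, int kh) {
    assert(kw > 0 && kh > 0);
    return static_cast<std::int64_t>(kw) * kh;
}

// Multiply-adds per output pixel when the kernel is separable and is applied
// as a row pass (kw taps) followed by a column pass (kh taps).
std::int64_t separableMacsPerPixel(int kw, int kh) {
    assert(kw > 0 && kh > 0);
    return static_cast<std::int64_t>(kw) + kh;
}

// Total multiply-adds for a whole image, ignoring border handling.
std::int64_t totalMacs(int width, int height, std::int64_t macsPerPixel) {
    assert(width > 0 && height > 0);
    return static_cast<std::int64_t>(width) * height * macsPerPixel;
}
```

For the 25x25 kernel mentioned above, the direct approach needs 625 multiply-adds per pixel against 50 for the separable one, which is why the brute-force path gets expensive quickly as the kernel grows.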

Updated by Vladislav Vinogradov about 13 years ago

Separable linear filters like GaussianBlur now support kernel sizes up to 32.
Non-separable linear filters support kernel sizes up to 16.
For larger sizes it is better to use cv::gpu::convolve, the DFT-based convolution implementation.

  • Status changed from Open to Done
  • Assignee changed from Anatoly Baksheev to Vladislav Vinogradov

Updated by Andrey Kamaev almost 13 years ago

  • Target version set to 2.4.0

Updated by Andreas Holz almost 13 years ago

I was using the cv::gpu::getMaxFilter_GPU function in the 2.3 version (and also SVN).

cv::Ptr<cv::gpu::BaseFilter_GPU> bf = cv::gpu::getMaxFilter_GPU(CV_8UC1, CV_8UC1, cv::Size(window_sizes.at(i),window_sizes.at(i)));
cv::Ptr<cv::gpu::FilterEngine_GPU> filter = cv::gpu::createFilter2D_GPU(bf, CV_8UC1, CV_8UC1);

But now it won't work any more because of the restriction of the kernel size.

#define FILTER2D_MAX_KERNEL_SIZE 16 in imgproc.cu
and
CV_Assert(ksize.width * ksize.height <= 16 * 16); in filtering.cpp

As far as I remember, the getMaxFilter_GPU function only calls the NPP function nppiFilterMax_8u_C1R,
so does it really make sense to restrict the kernel size in that case?
Or is there any workaround for that particular problem?

Updated by Vladislav Vinogradov almost 13 years ago

I can't reproduce your issue. I've tried this code and it works (2.4 and trunk version):

cv::Ptr<cv::gpu::BaseFilter_GPU> bf = cv::gpu::getMaxFilter_GPU(CV_8UC1, CV_8UC1, cv::Size(32, 32));
cv::Ptr<cv::gpu::FilterEngine_GPU> filter = cv::gpu::createFilter2D_GPU(bf, CV_8UC1, CV_8UC1);
cv::gpu::GpuMat dst;
filter->apply(img, dst);

Could you provide your code sample?

Kernel size is limited only in linear filters (2D and separable).

  • Status changed from Done to Open
  • Target version deleted (2.4.0)

Updated by Andreas Holz almost 13 years ago

Vladislav Vinogradov wrote:

I can't reproduce your issue. I've tried this code and it works (2.4 and trunk version):
[...]
Could you provide your code sample?

Kernel size is limited only in linear filters (2D and separable).

Sorry, it was my mistake. It wasn't the MaxFilter; it was a filter2D in my code.
Were the check and conversion removed? In the CPU filtering, DFT was chosen
automatically if the kernel size was above a threshold.

Cheers,
Andreas

Updated by Vladislav Vinogradov almost 13 years ago

There are two 2D filtering functions in the gpu module, each with its own limitations:
  1. cv::gpu::filter2D - brute-force filter (kernel size is limited to 16).
  2. cv::gpu::convolve - DFT-based convolution (doesn't support border extrapolation).

There is no automatic selection of the appropriate implementation.
You should explicitly choose one of these implementations.
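
Since the choice is left to the caller, the selection can be sketched in user code roughly as follows. This is only an illustration: chooseFilter2D, Filter2DChoice and supportsBorderExtrapolation are hypothetical names, not OpenCV API, and the 16 * 16 area limit is taken from the CV_Assert quoted earlier in this thread, so it may differ in other versions.

```cpp
#include <cassert>

// Limit taken from the assertion quoted in this thread:
//   CV_Assert(ksize.width * ksize.height <= 16 * 16);
// Treat it as an assumption for any other OpenCV version.
const int kMaxFilter2DKernelArea = 16 * 16;

// Hypothetical tag for the two GPU 2D filtering paths discussed above.
enum class Filter2DChoice {
    BruteForceFilter2D,  // cv::gpu::filter2D
    DftConvolve          // cv::gpu::convolve
};

// Hypothetical helper: pick the brute-force path while the kernel fits the
// quoted limit, otherwise fall back to the DFT-based path.
Filter2DChoice chooseFilter2D(int kw, int kh) {
    assert(kw > 0 && kh > 0);
    return (kw * kh <= kMaxFilter2DKernelArea)
               ? Filter2DChoice::BruteForceFilter2D
               : Filter2DChoice::DftConvolve;
}

// Per the list above, only the brute-force path handles border extrapolation;
// callers taking the DFT path have to deal with borders themselves.
bool supportsBorderExtrapolation(Filter2DChoice choice) {
    return choice == Filter2DChoice::BruteForceFilter2D;
}
```

A caller would branch on the returned value and then invoke cv::gpu::filter2D or cv::gpu::convolve itself, keeping in mind that the DFT path does not extrapolate borders.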

  • Status changed from Open to Done

Updated by Andrey Kamaev almost 13 years ago

  • Target version set to 2.4.0
