10% speed improvement in filterSpecklesImpl by switching insertion order of connected components (Feature #3692)
Description
Intro:
filterSpeckles is a method that performs blob detection based on disparity similarity and removes every blob that contains fewer elements than a predefined threshold. filterSpeckles is used by both the BM and SGBM methods, and it is very useful for getting rid of spurious disparities. It is also very well implemented and runs very fast.
Feature:
I found a way to reduce the computation time with a very simple change. I think the improvement of about 10% is due to a reduction in cache misses when iterating over the image.
Description:
The blob detection starts from a seed (a pixel with an unassigned label) and performs a wavefront propagation from that seed, checking for neighbors that are similar to it. Each of those neighbors then becomes the seed in turn, and the propagation continues until no more similar neighbors are found. The neighbors are checked in this order: right pixel, left pixel, pixel down, and pixel up. Each accepted neighbor is pushed onto a stack, so the first one to be processed is the last one inserted.
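The propagation described above can be sketched as a stack-based flood fill. This is a minimal illustration, not OpenCV's implementation: the function name removeSmallBlobs, the Pixel struct, and the parameters invalidVal and minBlobSize are hypothetical, and the image is assumed to be a contiguous row-major buffer of 16-bit disparities.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct Pixel { int x, y; };  // hypothetical helper type

// Hypothetical sketch: labels connected regions of similar disparity and
// overwrites regions with fewer than minBlobSize pixels with invalidVal.
// Returns the number of regions removed.
int removeSmallBlobs(std::vector<int16_t>& disp, int width, int height,
                     int16_t invalidVal, int maxDiff, int minBlobSize)
{
    assert(disp.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> labels(disp.size(), 0);  // 0 means "not visited yet"
    std::vector<Pixel> stack;                 // wavefront, last in first out
    std::vector<int> members;                 // pixel indices of the current blob
    int curLabel = 0, removed = 0;

    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            const int seed = y * width + x;
            if (labels[seed] != 0 || disp[seed] == invalidVal) continue;

            labels[seed] = ++curLabel;
            members.clear();
            stack.push_back({x, y});
            while (!stack.empty()) {
                const Pixel p = stack.back();  // last pushed is processed first
                stack.pop_back();
                const int idx = p.y * width + p.x;
                const int d = disp[idx];
                members.push_back(idx);

                // Push a neighbor if it is inside the image, unlabeled,
                // valid, and similar to the current pixel.
                auto tryPush = [&](int nx, int ny) {
                    if (nx < 0 || nx >= width || ny < 0 || ny >= height) return;
                    const int n = ny * width + nx;
                    if (labels[n] == 0 && disp[n] != invalidVal &&
                        std::abs(d - disp[n]) <= maxDiff) {
                        labels[n] = curLabel;
                        stack.push_back({nx, ny});
                    }
                };
                // Check order as described above: right, left, down, up.
                tryPush(p.x + 1, p.y);
                tryPush(p.x - 1, p.y);
                tryPush(p.x, p.y + 1);
                tryPush(p.x, p.y - 1);
            }
            if (static_cast<int>(members.size()) < minBlobSize) {
                for (int idx : members) disp[idx] = invalidVal;
                ++removed;
            }
        }
    return removed;
}
```

Because the stack is last in, first out, the order of the four tryPush calls decides which neighbor is visited next; the set of pixels that ends up in each blob does not depend on it.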
Proposal:
Change the order in which the neighbors are checked and pushed onto the stack. The order should be pixel down, pixel up, right pixel, left pixel. Basically, it means replacing these lines in the file calib3d/src/stereosgbm.cpp:
if( p.x < width-1 && !lpp[+1] && dpp[+1] != newVal && std::abs(dp - dpp[+1]) <= maxDiff )
{
    lpp[+1] = curlabel;
    *ws++ = Point2s(p.x+1, p.y);
}
if( p.x > 0 && !lpp[-1] && dpp[-1] != newVal && std::abs(dp - dpp[-1]) <= maxDiff )
{
    lpp[-1] = curlabel;
    *ws++ = Point2s(p.x-1, p.y);
}
if( p.y < height-1 && !lpp[+width] && dpp[+dstep] != newVal && std::abs(dp - dpp[+dstep]) <= maxDiff )
{
    lpp[+width] = curlabel;
    *ws++ = Point2s(p.x, p.y+1);
}
if( p.y > 0 && !lpp[-width] && dpp[-dstep] != newVal && std::abs(dp - dpp[-dstep]) <= maxDiff )
{
    lpp[-width] = curlabel;
    *ws++ = Point2s(p.x, p.y-1);
}
with these:
if( p.y < height-1 && !lpp[+width] && dpp[+dstep] != newVal && std::abs(dp - dpp[+dstep]) <= maxDiff )
{
    lpp[+width] = curlabel;
    *ws++ = Point2s(p.x, p.y+1);
}
if( p.y > 0 && !lpp[-width] && dpp[-dstep] != newVal && std::abs(dp - dpp[-dstep]) <= maxDiff )
{
    lpp[-width] = curlabel;
    *ws++ = Point2s(p.x, p.y-1);
}
if( p.x < width-1 && !lpp[+1] && dpp[+1] != newVal && std::abs(dp - dpp[+1]) <= maxDiff )
{
    lpp[+1] = curlabel;
    *ws++ = Point2s(p.x+1, p.y);
}
if( p.x > 0 && !lpp[-1] && dpp[-1] != newVal && std::abs(dp - dpp[-1]) <= maxDiff )
{
    lpp[-1] = curlabel;
    *ws++ = Point2s(p.x-1, p.y);
}
where only the first pair of if statements was interchanged with the second pair of if statements.
Rationale:
The if statements above are inside a loop, and the last element pushed onto ws is evaluated next. In the original code, the next pixel to be evaluated is a vertical neighbor of the current pixel (the last one pushed is the pixel in the row above, when it qualifies). This may require loading a different scanline of the disparity image into the cache, which makes a cache miss more likely. If the lateral neighboring pixels are evaluated next instead, the required memory is most likely already in the cache.
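To make the memory-distance argument concrete, here is a small sketch that maps a pixel to the cache line it falls on. The 64-byte line size, the line-aligned start of the image, and the helper name cacheLineOf are assumptions for illustration; the actual numbers depend on the CPU and on how the image is allocated.

```cpp
#include <cassert>
#include <cstddef>

// Assumed cache line size; typical for recent x86 CPUs but not guaranteed.
constexpr std::size_t kCacheLineBytes = 64;

// Hypothetical helper: index of the cache line holding pixel (x, y) in a
// contiguous row-major image that starts on a cache line boundary.
std::size_t cacheLineOf(int x, int y, int width, std::size_t elemBytes)
{
    assert(x >= 0 && x < width && y >= 0);
    const std::size_t offset =
        (static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
         static_cast<std::size_t>(x)) * elemBytes;
    return offset / kCacheLineBytes;
}
```

For a 720-pixel-wide image of 2-byte disparities, a row spans 1440 bytes, so a vertical neighbor always lies on a different cache line, while a horizontal neighbor is only 2 bytes away and shares the line unless the pair straddles a line boundary.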
Improvements:
Measured as the average of 10 trials of processing 555 stereo images of size 720x480 (i.e., average of 5550 images) with ~80% disparity coverage.
Processor: Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
Compiler: gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
Platform: Ubuntu 12.04.4 LTS
Proposed version: 2.143 (+/- 0.0296816) ms
Original version: 2.367 (+/- 0.0316386) ms
Improvement of roughly 10% (about 9.5% computed from the two means above).
These improvements might be larger on devices with less cache memory than my machine.
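For anyone who wants to repeat the measurement, the summary statistics can be computed along these lines. This is only a sketch: the report does not say how the +/- figure was obtained, so the sample standard deviation used here is an assumption, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean of the per-trial timings (in ms).
double mean(const std::vector<double>& v)
{
    assert(!v.empty());
    double sum = 0.0;
    for (double x : v) sum += x;
    return sum / static_cast<double>(v.size());
}

// Sample standard deviation (n - 1 in the denominator); assumed, not
// confirmed, to correspond to the +/- figure reported above.
double sampleStddev(const std::vector<double>& v)
{
    assert(v.size() > 1);
    const double m = mean(v);
    double sq = 0.0;
    for (double x : v) sq += (x - m) * (x - m);
    return std::sqrt(sq / static_cast<double>(v.size() - 1));
}

// Fraction of the original running time saved by the proposed version.
double relativeImprovement(double originalMs, double proposedMs)
{
    assert(originalMs > 0.0);
    return (originalMs - proposedMs) / originalMs;
}
```

Feeding the ten per-trial averages of each version into mean and sampleStddev, and the two resulting means into relativeImprovement, gives the kind of figures listed above.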
History
Updated by Vladislav Vinogradov almost 11 years ago
Hello Hernan Badino,
Thank you for your report!
Could you create a pull request with your fix (http://code.opencv.org/projects/opencv/wiki/How_to_contribute)? All help to the project is highly appreciated!
- Status changed from New to Open
Updated by Hernan Badino almost 11 years ago
Vladislav,
I did it. I pushed the change to master. Should I push the same change to 2.4?
Hernan
- % Done changed from 0 to 50
Updated by Vladislav Vinogradov almost 11 years ago
- Target version set to 3.0-alpha
- Pull request set to https://github.com/Itseez/opencv/pull/2764
Updated by Vladislav Vinogradov almost 11 years ago
The patch was merged into master branch.
Hernan Badino, thank you for your contribution!
- Status changed from Open to Done
Updated by Hernan Badino almost 11 years ago
Vladislav,
Just so you know, I've created a pull request for 2.4 as well.
Happy to contribute.
Hernan