fast connected components implementation (with patch) (Patch #1236)
Description
Herein contains an implementation for binary image connected components for 4 and 8 way. Its fast and more simple to use rather than find_contours/draw_contours. I wish to contribute it to opencv mainline, svn diff patch attached.
Associated revisions
Merge pull request #1236 from pengx17:2.4_fix_retina_color_param
History
Updated by Kirill Kornyakov about 13 years ago
- Tracker changed from Feature to Patch
Updated by Alexander Shishkov almost 13 years ago
- Target version deleted ()
Updated by Alexander Shishkov almost 13 years ago
- Priority changed from Normal to Low
Updated by Alexander Shishkov almost 13 years ago
- Assignee deleted (
Vadim Pisarevsky)
Updated by Alexander Shishkov almost 13 years ago
- Target version deleted ()
Updated by Jason Newton almost 13 years ago
any reason this has been bumped / ignored?
Updated by Kirill Kornyakov almost 13 years ago
Lack of resources... There are 45 more patches, most likely that they will be integrated in one of future versions. Sorry for that, we'll try to adopt your patch next time.
Updated by Jason Newton almost 13 years ago
Hm, I was going to let this go till a future version but i have a few things worth mentioning:
A friend's work who intended on contributing hysteresis thresholding as a stand alone function waiting on this to make it in.
More importantly, this is still an unsolved problem for most users... cvbloblib et all are in disarray and sometimes broken. Current solutions are too complicated and slow as they are based on floodfill where as this is a constant time 2 pass algorithm. AFAIK it also comes up often on the ML from what I remember.
Take a look at these results, both posted long after I submitted my patch:
http://stackoverflow.com/questions/9291175/opencv-and-python-connected-components-analysis/10052376#10052376
http://nghiaho.com/?p=1102
http://areshopencv.blogspot.com/2011/12/blob-detection-connected-component-pure.html
Is there any thing I can do to expedite inclusion? I'm not against submitting more patches but I honestly think it's time opencv has this functionality built in.
Updated by Kirill Kornyakov almost 13 years ago
Vadim, what is your opinion?
Updated by Jason Newton almost 13 years ago
Hi, one more thread postiing I found today on stackoverflow that I should've posted before,
http://stackoverflow.com/questions/4641817/blob-extraction-in-opencv
You can see other users think its a rather surprising oversight for a library such as opencv.
Updated by Aslak Hellesøy over 12 years ago
I have attached a slightly updated patch that applies cleanly against r8918. I also fixed a few compiler warnings.
- File opencv-connectedcomponents.patch added
Updated by Aslak Hellesøy over 12 years ago
Another updated patch
- File opencv-connectedcomponents.patch added
Updated by Costantino Grana over 12 years ago
+1
Updated by Jason Newton over 12 years ago
updated patch(es) that don't have any warnings (thanks to Aslak for pointing them out, unfortunately his patches introduce sign related bugs).
I took the opportunity to speed it up and have the following benchmarks available:
The format is:
Filename, Connectivity, iterations, time for old method, time for new method, time for new method with statistics (centroid, bounding box, area). Times are in seconds.
These were generated with files from the attached benchmark_kit.zip, the benchmark_cc.cpp program can be dropped into samples/cpp/ and compiled (after running cmake again) with make example_benchmark_cc. The arguments are imagefile iteration connectivity.
Note that when connectivity=4, the old method's benchmark is skipped, so the times are 0. An interesting oddity is that connectivity=4 is slower than 8. The stats computations are also surprisingly expensive.
reaper (i7 3820, x86_64, gcc 4.6.3), -O2
spiral1-big.png 4 1000 0.0000 111.3650 222.5280
spiral1.png 4 1000 0.0000 7.0330 14.1660
spiral1-small.png 4 1000 0.0000 0.2950 0.5980
spiral2-big.jpg 4 1000 0.0000 32.9920 79.4010
spiral2.jpg 4 1000 0.0000 0.5960 1.1360
spiral2-small.jpg 4 1000 0.0000 0.1330 0.2730
stuff-big.jpg 4 1000 0.0000 36.3200 93.7240
stuff.jpg 4 1000 0.0000 0.6390 1.5160
spiral1-big.png 8 1000 103.4610 77.1300 184.3070
spiral1.png 8 1000 5.9270 4.9480 11.5160
spiral1-small.png 8 1000 0.4770 0.2190 0.4740
spiral2-big.jpg 8 1000 49.4940 26.9110 72.6570
spiral2.jpg 8 1000 1.3150 0.3980 0.9550
spiral2-small.jpg 8 1000 0.8260 0.1460 0.3370
stuff-big.jpg 8 1000 48.8870 30.6580 86.3620
stuff.jpg 8 1000 0.8930 0.5070 1.3710
reaper, -O3
spiral1-big.png 4 1000 0.0000 110.7130 215.6300
spiral1.png 4 1000 0.0000 7.0950 13.6260
spiral1-small.png 4 1000 0.0000 0.3540 0.6170
spiral2-big.jpg 4 1000 0.0000 32.9020 77.7780
spiral2.jpg 4 1000 0.0000 0.4570 1.0110
spiral2-small.jpg 4 1000 0.0000 0.1270 0.2880
stuff-big.jpg 4 1000 0.0000 36.1840 91.7420
stuff.jpg 4 1000 0.0000 0.5650 1.5810
spiral1-big.png 8 1000 118.9170 77.0660 183.9130
spiral1.png 8 1000 6.4080 4.9450 11.8380
spiral1-small.png 8 1000 0.5010 0.2190 0.4810
spiral2-big.jpg 8 1000 52.0180 26.8950 72.2840
spiral2.jpg 8 1000 1.4720 0.3970 0.9370
spiral2-small.jpg 8 1000 0.5930 0.1090 0.2430
stuff-big.jpg 8 1000 51.6680 30.6020 86.4960
stuff.jpg 8 1000 0.9620 0.5740 1.3500
spyker (core 2 duo, x86_64, gcc 4.6.3), -O3
spiral1-big.png 4 1000 0.0000 216.9970 358.0390
spiral1.png 4 1000 0.0000 13.5170 22.4210
spiral1-small.png 4 1000 0.0000 0.5000 0.9370
spiral2-big.jpg 4 1000 0.0000 70.7200 129.7390
spiral2.jpg 4 1000 0.0000 0.8100 1.6770
spiral2-small.jpg 4 1000 0.0000 0.2250 0.4490
stuff-big.jpg 4 1000 0.0000 83.6310 155.5480
stuff.jpg 4 1000 0.0000 0.9410 2.3050
spiral1-big.png 8 1000 226.0340 156.2490 291.3120
spiral1.png 8 1000 10.0530 9.5950 18.2160
spiral1-small.png 8 1000 0.7030 0.3310 0.7450
spiral2-big.jpg 8 1000 87.6270 58.3730 116.4080
spiral2.jpg 8 1000 1.8560 0.6220 1.4930
spiral2-small.jpg 8 1000 0.8580 0.1800 0.4010
stuff-big.jpg 8 1000 94.7050 71.7530 142.8330
The main advantage of this algorithm is that it's worst case performance is approximately its best case performance (deterministic) and is better than the average and worst case of the current implementation (generally faster by significant percentages). Sometimes it's quite a bit faster, sometimes not much - depends on the input.... examples include an all zero image - this algorithm will make 2 passes, the old method will make just one. But it's performance will never be less than 2x as slow as the current implementation (because it is 2 pass), and the average case has good savings. The authors of the paper expected that it would scale better to higher resolution images because its memory access patterns are more predictable, I'd say this seems to be true. I also tacked on the statistics computation (stats typically used in blob detection) because it allows me to reuse one double-nested-forloop rather than create another. I got a few requests from this both at work and on stackoverflow since it's a common operation performed right after connected components... It is a big slowdown though but I can't figure out how to do them any faster. The second patch sacrifices some readability for speed, it makes it quite a bit faster though (~35% or so).
I hope this can finally get included... people are still making new threads on stackoverflow for how to do this... who knows where else. Some have actually been "rolling" their own implementations...
- File benchmark_kit.zip added
- File 0002-connectedComponents-peep-hole-optimizations-mostly-s.patch added
- File 0001-connectedComponents-warning-free-version.patch added
Updated by Jason Newton over 12 years ago
Reposting benchmarks with preformatting to preserve readability:
reaper (i7 3820, x86_64, gcc 4.6.3), -O2 spiral1-big.png 4 1000 0.0000 111.3650 222.5280 spiral1.png 4 1000 0.0000 7.0330 14.1660 spiral1-small.png 4 1000 0.0000 0.2950 0.5980 spiral2-big.jpg 4 1000 0.0000 32.9920 79.4010 spiral2.jpg 4 1000 0.0000 0.5960 1.1360 spiral2-small.jpg 4 1000 0.0000 0.1330 0.2730 stuff-big.jpg 4 1000 0.0000 36.3200 93.7240 stuff.jpg 4 1000 0.0000 0.6390 1.5160 spiral1-big.png 8 1000 103.4610 77.1300 184.3070 spiral1.png 8 1000 5.9270 4.9480 11.5160 spiral1-small.png 8 1000 0.4770 0.2190 0.4740 spiral2-big.jpg 8 1000 49.4940 26.9110 72.6570 spiral2.jpg 8 1000 1.3150 0.3980 0.9550 spiral2-small.jpg 8 1000 0.8260 0.1460 0.3370 stuff-big.jpg 8 1000 48.8870 30.6580 86.3620 stuff.jpg 8 1000 0.8930 0.5070 1.3710 reaper, -O3 spiral1-big.png 4 1000 0.0000 110.7130 215.6300 spiral1.png 4 1000 0.0000 7.0950 13.6260 spiral1-small.png 4 1000 0.0000 0.3540 0.6170 spiral2-big.jpg 4 1000 0.0000 32.9020 77.7780 spiral2.jpg 4 1000 0.0000 0.4570 1.0110 spiral2-small.jpg 4 1000 0.0000 0.1270 0.2880 stuff-big.jpg 4 1000 0.0000 36.1840 91.7420 stuff.jpg 4 1000 0.0000 0.5650 1.5810 spiral1-big.png 8 1000 118.9170 77.0660 183.9130 spiral1.png 8 1000 6.4080 4.9450 11.8380 spiral1-small.png 8 1000 0.5010 0.2190 0.4810 spiral2-big.jpg 8 1000 52.0180 26.8950 72.2840 spiral2.jpg 8 1000 1.4720 0.3970 0.9370 spiral2-small.jpg 8 1000 0.5930 0.1090 0.2430 stuff-big.jpg 8 1000 51.6680 30.6020 86.4960 stuff.jpg 8 1000 0.9620 0.5740 1.3500 spyker (core 2 duo, x86_64, gcc 4.6.3), -O3 spiral1-big.png 4 1000 0.0000 216.9970 358.0390 spiral1.png 4 1000 0.0000 13.5170 22.4210 spiral1-small.png 4 1000 0.0000 0.5000 0.9370 spiral2-big.jpg 4 1000 0.0000 70.7200 129.7390 spiral2.jpg 4 1000 0.0000 0.8100 1.6770 spiral2-small.jpg 4 1000 0.0000 0.2250 0.4490 stuff-big.jpg 4 1000 0.0000 83.6310 155.5480 stuff.jpg 4 1000 0.0000 0.9410 2.3050 spiral1-big.png 8 1000 226.0340 156.2490 291.3120 spiral1.png 8 1000 10.0530 9.5950 18.2160 spiral1-small.png 8 1000 0.7030 0.3310 0.7450 spiral2-big.jpg 8 1000 87.6270 58.3730 116.4080 spiral2.jpg 8 1000 1.8560 0.6220 1.4930 spiral2-small.jpg 8 1000 0.8580 0.1800 0.4010 stuff-big.jpg 8 1000 94.7050 71.7530 142.8330
Updated by Marina Kolpakova over 12 years ago
- Assignee set to Marina Kolpakova
Updated by Dustin Spicuzza over 12 years ago
Building this patch on Windows using Visual Studio 2008 (against 2.4.2, not trunk) yields a lot of warnings of the following nature:
..\..\..\OpenCV-2.4.2\modules\imgproc\src\connectedcomponents.cpp(213) : warning C4127: conditional expression is constant
Additionally, the uint32_t, uint64_t, and int64_t types are not defined on Windows, so I would suggest using the OpenCV defined equivalents instead (unsigned int, int64, uint64). The python bindings for OpenCV do not seem to properly handle two functions that differ only by argument type, so you should either drop one of the exported functions, or name them differently. You should mark the statsv vector as CV_OUT as well.
The python bindings do not build properly, one needs to add appropriate conversion functions for your new structure to cv2.cpp.
typedef vector<ConnectedComponentStats> vector_ConnectedComponentStats; static inline PyObject* pyopencv_from(const ConnectedComponentStats& c) { return Py_BuildValue("{s:i,s:i,s:i,s:i,s:d,s:d,s:K,s:K,s:I}", "lower_x", c.lower_x, "lower_y", c.lower_y, "upper_x", c.upper_x, "upper_y", c.upper_y, "centroid_x", c.centroid_x, "centroid_y", c.centroid_y, "integral_x", c.integral_x, "integral_y", c.integral_y, "area", c.area ); } static inline PyObject* pyopencv_from(const vector_ConnectedComponentStats& c) { return pyopencv_from_generic_vec(c); }
Curiously, the python bindings don't properly support converting from uint64/int64, and casts them to a double instead. I'm not sure of the proper solution for that, but adding the following functions to cv2.cpp seems to allow your function to return a value properly:
static PyObject* pyopencv_from(int64 value) { return PyLong_FromLongLong(value); } static PyObject* pyopencv_from(uint64 value) { return PyLong_FromUnsignedLongLong(value); }
Note that I haven't attached these in patch form because I'm not 100% sure of the best way to fix all of these issues, but the solutions detailed above seem to work for me.
Updated by Dustin Spicuzza over 12 years ago
Also, adding some documentation would be useful. At the very minimum, when one builds the documentation for OpenCV on a patched build your function doesn't show up in the docs.
Updated by Jason Newton over 12 years ago
Dustin,
I've fixed most of your issues. The warnings are a non-issue and I suggest you submit a patch to disable the warnings for that compiler if it concerns you.
Updated by Jason Newton over 12 years ago
I've also made a git repo from itsees branch on github and put in a pull-request. I figured it was better than the molasses speed of this issue tracker.
Updated by Jason Newton over 12 years ago
Jason Newton wrote:
I've also made a git repo from itsees branch on github and put in a pull-request. I figured it was better than the molasses speed of this issue tracker.
Sorry long and busy day, url:
https://github.com/nevion/opencv
Updated by Kirill Kornyakov about 12 years ago
Check GitHub for the latest status: https://github.com/Itseez/opencv/pull/135.
- Pull request set to https://github.com/Itseez/opencv/pull/135
- Target version set to 3.0
- Assignee changed from Marina Kolpakova to Jason Newton
Updated by Daniil Osokin about 12 years ago
- Status changed from Open to Done