OpenCV 2.3.1 - very slow calcOpticalFlowPyrLK (Bug #1423)
Description
cvCalcOpticalFlowPyrLK in OpenCV 2.3.1 is still 5-10 times slower than cvCalcOpticalFlowPyrLK from OpenCV 2.3 (win7 release).
I suppose the reason is that the current optimized implementation of calcOpticalFlowPyrLK (since r5985) executes both the optimized code and the old code, e.g. we can see something like this:
...
#if CV_SSE2
for (...) ...
#endif
for (...) ...   // a very similar (scalar) for loop here
...
Probably it should be:
...
#if CV_SSE2
for (...) ...
#else
for (...) ...   // a very similar (scalar) for loop here
#endif
...
History
Updated by Vadim Pisarevsky over 13 years ago
"x" is not reset to 0 after SSE2-optimized loop is finished, therefore scalar code is only executed on the rest of the line.
Without a benchmark it's impossible to reproduce your results. Our tests show that there is a significant performance increase.
- Status changed from Open to Done
- (deleted custom field) set to invalid
Updated by Bartosz Wieloch over 13 years ago
In my program I call cvCalcOpticalFlowPyrLK several times (about 8) for each pair of video frames (to do backtracking and with different values of maxLevel).
As speed is crucial for me (the program must run on an embedded Intel Atom processor), I use few feature points (25) and a small window (5x5).
Below are running times [ms] in my simple benchmark:
- image size = 640x480
- number of feature points: 25 or 100
- maxLevel = 5
- termCriteria = (TermCriteria::COUNT + TermCriteria::EPS, 100, 0.05);
- winSize 3x3 to 29x29
- cvCalcOpticalFlowPyrLK is called 8 times for the same pair of images:
- flags=0 means that cvCalcOpticalFlowPyrLK is always called with flags=0,
- flags=3 means that cvCalcOpticalFlowPyrLK is called with flags=CV_LKFLOW_PYR_A_READY+CV_LKFLOW_PYR_B_READY except the first call when flags=0 (to prepare pyramids)
All tests are performed on Intel Core i7 (2.67GHz).
         |    Number of points: 25     |    Number of points: 100
         |        2.3        |  2.3.1  |        2.3        |  2.3.1
winSize  | flags=0 | flags=3 | flags=0 | flags=0 | flags=3 | flags=0
---------+---------+---------+---------+---------+---------+---------
 3x 3    |  12.983 |   2.593 |  20.906 |  15.752 |   5.290 |  22.379
 4x 4    |  13.434 |   3.128 |  20.904 |  16.690 |   6.450 |  22.003
 5x 5    |  13.907 |-> 3.382 |  21.159 |  18.224 |   8.361 |  22.496
 6x 6    |  14.888 |   4.388 |  21.794 |  20.800 |  10.785 |  23.096
 7x 7    |  14.933 |   4.936 |  21.690 |  22.130 |  11.642 |  23.061
 8x 8    |  15.726 |   4.979 |  21.853 |  25.239 |  14.110 |  22.775
 9x 9    |  16.111 |   5.672 |  22.051 |  28.471 |  18.090 |  22.835
10x10    |  16.973 |   6.676 |  22.658 |  29.339 |  23.768 |  23.922
11x11    |  17.719 |   7.242 |  22.237 |  38.507 |  24.495 |  24.141
12x12    |  18.374 |   7.705 |  22.575 |  34.793 |  24.563 |  25.783
13x13    |  19.269 |   8.902 |  23.183 |  38.724 |  28.137 |  24.910
14x14    |  19.860 |   9.187 |  24.015 |  40.960 |  29.401 |  26.466
15x15    |  21.154 |  10.599 |  22.666 |  44.330 |  33.872 |  26.149
16x16    |  21.852 |  11.398 |  22.254 |  46.278 |  35.714 |  25.039
17x17    |  22.968 |  12.285 |  22.814 |  49.656 |  39.538 |  25.117
18x18    |  23.522 |  13.229 |  30.859 |  53.230 |  44.196 |  27.554
19x19    |  24.715 |  14.339 |  23.529 |  56.953 |  46.666 |  27.128
20x20    |  25.609 |  16.795 |  23.809 |  60.319 |  49.807 |  27.416
21x21    |  27.551 |  16.961 |  27.687 |  64.190 |  54.979 |  28.167
22x22    |  28.657 |  22.299 |  25.413 |  68.734 |  57.941 |  29.583
23x23    |  32.277 |  18.798 |  25.653 |  72.774 |  62.857 |  30.012
24x24    |  30.171 |  19.901 |  30.646 |  78.532 |  71.728 |  28.183
25x25    |  31.535 |  21.267 |  25.984 |  80.725 |  69.991 |  29.350
26x26    |  32.726 |  22.176 |  25.814 |  84.442 |  74.158 |  30.982
27x27    |  33.732 |  23.385 |  25.738 |  89.092 |  78.845 |  31.683
28x28    |  34.571 |  24.308 |  26.061 |  92.948 |  83.907 |  31.610
29x29    |  35.325 |  25.362 |  26.035 |  98.106 |  87.679 |  32.854
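For reference, the call pattern of the benchmark (8 calls per image pair, with the pyramids reused after the first call in the flags=3 variant) can be sketched as below. This is a minimal harness under assumptions, not the exact benchmark code: flagsForCall and timeCalls are hypothetical helpers, the tracker is passed in as a callback so the sketch stays independent of OpenCV, and the two constants only mirror the values of CV_LKFLOW_PYR_A_READY (1) and CV_LKFLOW_PYR_B_READY (2).

```cpp
#include <chrono>
#include <functional>

// Local stand-ins mirroring CV_LKFLOW_PYR_A_READY and CV_LKFLOW_PYR_B_READY.
const int PYR_A_READY = 1;
const int PYR_B_READY = 2;

// Flags for the call with 0-based index `call`.
// reusePyramids == false : always 0 (the "flags=0" variant).
// reusePyramids == true  : 0 on the first call (pyramids get built),
//                          A_READY + B_READY == 3 afterwards ("flags=3").
int flagsForCall(int call, bool reusePyramids)
{
    if (!reusePyramids || call == 0)
        return 0;
    return PYR_A_READY + PYR_B_READY;
}

// Invokes `track` `calls` times and returns the total elapsed time in ms.
// `track` is a hypothetical wrapper around one cvCalcOpticalFlowPyrLK call
// that receives the flags value to use.
double timeCalls(const std::function<void(int)>& track, int calls, bool reusePyramids)
{
    const auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < calls; ++i)
        track(flagsForCall(i, reusePyramids));
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

In the real benchmark the callback would wrap a single cvCalcOpticalFlowPyrLK call on the same pair of images, and timeCalls(track, 8, true) would correspond to one cell in a flags=3 column.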
Maybe my problem is too specific, so I will leave this ticket closed (I will probably use the old implementation from OpenCV 2.3).
Updated by Vadim Pisarevsky over 13 years ago
OK, I now see how the previous version could be faster.
I will reopen the ticket, but set the priority to minor, since it's quite specific application of optical flow.
As a workaround, you may try to downscale the image to 320x240. Probably it will also give sufficiently accurate results at much better speed.
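If the downscaling workaround is used, the feature point coordinates have to be scaled together with the image. A minimal sketch of that bookkeeping, assuming a plain uniform scale factor; Pt and scalePoints are hypothetical names (Pt stands in for CvPoint2D32f), and the image resizing itself is left out:

```cpp
#include <vector>

// Hypothetical stand-in for CvPoint2D32f.
struct Pt { float x, y; };

// Returns a copy of `pts` with both coordinates multiplied by `s`.
// Simplification: pixel-center offsets are ignored.
std::vector<Pt> scalePoints(const std::vector<Pt>& pts, float s)
{
    std::vector<Pt> out;
    out.reserve(pts.size());
    for (const Pt& p : pts)
        out.push_back(Pt{ p.x * s, p.y * s });
    return out;
}
```

Points detected at 640x480 would be passed through scalePoints(pts, 0.5f) before tracking on the 320x240 images, and the tracked positions through scalePoints(tracked, 2.0f) afterwards to get back to full-resolution coordinates.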
- Status changed from Done to Cancelled
- (deleted custom field) deleted (invalid)
Updated by Andrey Kamaev almost 13 years ago
- Target version set to 2.4.0