OpenCV 2.3.1 - very slow calcOpticalFlowPyrLK (Bug #1423)


Added by Bartosz Wieloch over 13 years ago. Updated almost 13 years ago.


Status:Cancelled Start date:
Priority:Normal Due date:
Assignee:Vadim Pisarevsky % Done:0%

Category:imgproc, video
Target version:2.4.0
Affected version: Operating System:
Difficulty: HW Platform:
Pull request:

Description

cvCalcOpticalFlowPyrLK in OpenCV 2.3.1 is still 5-10 times slower than cvCalcOpticalFlowPyrLK from OpenCV 2.3 (win7, release).

I suppose the reason is that the current optimized implementation of calcOpticalFlowPyrLK (since r5985) executes both the optimized code and the old code, e.g. we can see something like this:

...
#if CV_SSE2
for (...) ...
#endif
here very similar for loop
...

Probably it should be:
...
#if CV_SSE2
for (...) ...
#else
here very similar for loop
#endif
...


History

Updated by Vadim Pisarevsky over 13 years ago

"x" is not reset to 0 after the SSE2-optimized loop finishes, so the scalar code is only executed on the rest of the line.
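The pattern described above can be illustrated with a minimal sketch. This is not the OpenCV source: `sum_row` is a hypothetical helper, and the `__SSE2__` check stands in for OpenCV's `CV_SSE2` macro. It only shows why a vector loop followed by a similar-looking scalar loop does not process the data twice when the index is shared.

```cpp
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper (not OpenCV code): sums n floats from src.
float sum_row(const float* src, std::size_t n)
{
    std::size_t x = 0;   // shared by both loops
    float total = 0.f;

#if defined(__SSE2__)
    // Vector loop: consumes 4 floats per iteration while a full group remains.
    __m128 acc = _mm_setzero_ps();
    for (; x + 4 <= n; x += 4)
        acc = _mm_add_ps(acc, _mm_loadu_ps(src + x));
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif

    // Scalar loop: x is deliberately NOT reset, so with SSE2 enabled this
    // handles only the 0-3 leftover elements; without SSE2 it handles all n.
    for (; x < n; ++x)
        total += src[x];

    return total;
}
```

With this layout an `#else` branch is not needed: the scalar loop doubles as the tail handler for the vector loop and as the full fallback when SSE2 is unavailable.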

Without a benchmark it's impossible to reproduce your results. Our tests show a significant performance increase.

  • Status changed from Open to Done
  • (deleted custom field) set to invalid

Updated by Bartosz Wieloch over 13 years ago

In my program I call cvCalcOpticalFlowPyrLK several times (about 8) for each pair of video frames (to do backtracking and with different values of maxLevel).
As speed is crucial for me (the program must run on an embedded Intel Atom processor), I use few feature points (25) and a small window (5x5).
Below are running times [ms] in my simple benchmark:
- image size = 640x480
- number of feature points: 25 or 100
- maxLevel = 5
- termCriteria = (TermCriteria::COUNT + TermCriteria::EPS, 100, 0.05);
- winSize 3x3 to 29x29
- cvCalcOpticalFlowPyrLK is called 8 times for the same pair of images:
- flags=0 means that cvCalcOpticalFlowPyrLK is always called with flags=0,
- flags=3 means that cvCalcOpticalFlowPyrLK is called with flags=CV_LKFLOW_PYR_A_READY+CV_LKFLOW_PYR_B_READY except the first call when flags=0 (to prepare pyramids)

All tests are performed on Intel Core i7 (2.67GHz).
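The timing procedure in the list above could be sketched roughly as follows. Both helpers are hypothetical and are not taken from the original benchmark, which was not posted. `time_calls_ms` times a caller-supplied callable, which in a real run would wrap the cvCalcOpticalFlowPyrLK call. `flags_for_call` reproduces the two flag schedules described above, where 3 is CV_LKFLOW_PYR_A_READY+CV_LKFLOW_PYR_B_READY.

```cpp
#include <chrono>

// Hypothetical helper: runs call(i) for i = 0..repeats-1 and returns the
// total elapsed wall-clock time in milliseconds.
template <typename Fn>
double time_calls_ms(Fn&& call, int repeats = 8)
{
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < repeats; ++i)
        call(i);  // in a real benchmark: the optical flow call for this pass
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count();
}

// Flag schedule from the description: in the "flags=3" variant the first
// call uses 0 (pyramids are built), later calls reuse both pyramids (3).
// In the "flags=0" variant every call uses 0.
inline int flags_for_call(int i, bool reuse_pyramids)
{
    return (reuse_pyramids && i > 0) ? 3 : 0;
}
```

A single timed run is noisy, so averaging several runs per window size is a reasonable refinement; whether the original numbers were averaged is not stated.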

              Number of points: 25            Number of points: 100     
         ------------------------------  -------------------------------
                  2.3            2.3.1            2.3             2.3.1 
-------  --------------------  --------  ---------------------  --------
winSize    flags=0    flags=3   flags=0    flags=0     flags=3   flags=0

=======  =========  =========  ========  =========  ==========  ========

 3x 3       12.983      2.593    20.906     15.752       5.290    22.379
 4x 4       13.434      3.128    20.904     16.690       6.450    22.003
 5x 5       13.907   -> 3.382    21.159     18.224       8.361    22.496
 6x 6       14.888      4.388    21.794     20.800      10.785    23.096
 7x 7       14.933      4.936    21.690     22.130      11.642    23.061
 8x 8       15.726      4.979    21.853     25.239      14.110    22.775
 9x 9       16.111      5.672    22.051     28.471      18.090    22.835
10x10       16.973      6.676    22.658     29.339      23.768    23.922
11x11       17.719      7.242    22.237     38.507      24.495    24.141
12x12       18.374      7.705    22.575     34.793      24.563    25.783
13x13       19.269      8.902    23.183     38.724      28.137    24.910
14x14       19.860      9.187    24.015     40.960      29.401    26.466
15x15       21.154     10.599    22.666     44.330      33.872    26.149
16x16       21.852     11.398    22.254     46.278      35.714    25.039
17x17       22.968     12.285    22.814     49.656      39.538    25.117
18x18       23.522     13.229    30.859     53.230      44.196    27.554
19x19       24.715     14.339    23.529     56.953      46.666    27.128
20x20       25.609     16.795    23.809     60.319      49.807    27.416
21x21       27.551     16.961    27.687     64.190      54.979    28.167
22x22       28.657     22.299    25.413     68.734      57.941    29.583
23x23       32.277     18.798    25.653     72.774      62.857    30.012
24x24       30.171     19.901    30.646     78.532      71.728    28.183
25x25       31.535     21.267    25.984     80.725      69.991    29.350
26x26       32.726     22.176    25.814     84.442      74.158    30.982
27x27       33.732     23.385    25.738     89.092      78.845    31.683
28x28       34.571     24.308    26.061     92.948      83.907    31.610
29x29       35.325     25.362    26.035     98.106      87.679    32.854

=======  =========  =========  ========  =========  ==========  ========

Maybe my problem is too specific, so I will leave this ticket closed (I will probably use the old implementation from OpenCV 2.3).

Updated by Vadim Pisarevsky over 13 years ago

OK, I now see how the previous version could be faster.

I will reopen the ticket, but set the priority to minor, since it's quite a specific application of optical flow.

As a workaround, you may try downscaling the image to 320x240. It will probably still give sufficiently accurate results at a much better speed.
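If the frames are downscaled this way, the feature point coordinates have to be mapped into the smaller image before tracking and back to full resolution afterwards. A minimal sketch follows, assuming a simple proportional mapping (factor 0.5 for 640x480 to 320x240). `Point2f` here is a stand-in struct and `scale_points` is a hypothetical helper, not an OpenCV function.

```cpp
#include <vector>

// Stand-in for a 2D float point type (an assumption for this sketch).
struct Point2f { float x, y; };

// Hypothetical helper: returns a copy of pts with both coordinates
// multiplied by factor (0.5 to go to the half-size image, 2.0 to go back).
std::vector<Point2f> scale_points(const std::vector<Point2f>& pts, float factor)
{
    std::vector<Point2f> out;
    out.reserve(pts.size());
    for (const Point2f& p : pts)
        out.push_back(Point2f{p.x * factor, p.y * factor});
    return out;
}
```

The window size is measured in pixels, so a 5x5 window on the half-size image covers roughly the area of a 10x10 window on the original. Whether that is acceptable depends on the application.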

  • Status changed from Done to Cancelled
  • (deleted custom field) deleted (invalid)

Updated by Andrey Kamaev almost 13 years ago

  • Target version set to 2.4.0
