opencv-2.2 with TBB uses only one CPU-core/opencv-2.0 with openmp uses all four, but seg fault (Bug #1068)


Added by Pavel Reich almost 14 years ago. Updated almost 13 years ago.


Status: Cancelled
Priority: High
Assignee: Vadim Pisarevsky
% Done: 0%
Category: core
Target version: 2.4.0

Description

Hello,

I've downloaded the opencv-2.2 release from willowgarage.com and compiled it this way:
mkdir release && cd release && cmake -D CMAKE_BUILD_TYPE=RELEASE -DWITH_TBB=ON ..
and even got
"-- Use TBB: YES" in the output.

However, when I run time /usr/local/bin/opencv_haartraining -data trainout -vec positives.vec -bg negatives.txt -npos 110 -nneg 1000, it uses only one CPU core. The operating system is Ubuntu 10. By the way, opencv_haartraining from the default opencv-dev package on Ubuntu 10 also uses only one core.

I also tried the version from trunk, with no luck: it still uses only one CPU.
When I compiled opencv-2.0 (with OpenMP), it used all 4 cores, but I got a segmentation fault:

time /usr/local/bin/opencv-haartraining -data trainout -vec positives.vec -bg negatives.txt -npos 110 -nneg 1000
Data dir name: trainout
Vec file name: positives.vec
BG file name: negatives.txt, is a vecfile: no
Num pos: 110
Num neg: 1000
Num stages: 14
Num splits: 1 (stump as weak classifier)
Mem: 200 MB
Symmetric: TRUE
Min hit rate: 0.995000
Max false alarm rate: 0.500000
Weight trimming: 0.950000
Equal weights: FALSE
Mode: BASIC
Width: 24
Height: 24
Applied boosting algorithm: GAB
Error (valid only for Discrete and Real AdaBoost): misclass
Max number of splits in tree cascade: 0
Min number of positive samples per cluster: 500
Required leaf false alarm rate: 6.10352e-05
Stage 0 loaded
Stage 1 loaded
Stage 2 loaded
Stage 3 loaded
Stage 4 loaded
Stage 5 loaded
Stage 6 loaded

Tree Classifier
Stage
+---+---+---+---+---+---+---+
|  0|  1|  2|  3|  4|  5|  6|
+---+---+---+---+---+---+---+

0---1---2---3---4---5---6

Number of features used : 85848

Parent node: 6

*** 1 cluster ***
POS: 46 46 1.000000
NEG: 418 0.0210856
BACKGROUND PROCESSING TIME: 0.00
Precalculation time: 0.00
+----+----+-+---------+---------+---------+---------+
|  N |%SMP|F|  ST.THR |    HR   |    FA   | EXP. ERR|
+----+----+-+---------+---------+---------+---------+
|   1|100%|-|-0.395958| 1.000000| 1.000000| 0.254310|
+----+----+-+---------+---------+---------+---------+
|   2|100%|+|-1.186469| 1.000000| 1.000000| 0.174569|
+----+----+-+---------+---------+---------+---------+
|   3|100%|-|-0.865389| 1.000000| 0.868421| 0.351293|
+----+----+-+---------+---------+---------+---------+
|   4| 88%|+|-1.444954| 1.000000| 0.868421| 0.318966|
+----+----+-+---------+---------+---------+---------+
|   5| 77%|-|-0.444954| 1.000000| 0.746411| 0.594828|
+----+----+-+---------+---------+---------+---------+
|   6| 77%|+|-1.101890| 1.000000| 0.751196| 0.278017|
+----+----+-+---------+---------+---------+---------+
Segmentation fault

real 0m9.415s
user 0m27.270s
sys 0m0.068s

It would be great to run haartraining on many CPUs, for example on a cx1g.large Amazon EC2 instance with 16 cores.

Thanks.


Associated revisions

Revision 1590a1a5
Added by Roman Donchenko over 11 years ago

Merge pull request #1068 from AoD314:webp2

History

Updated by Pavel Reich almost 14 years ago

Some details from gdb:
+----+----+-+---------+---------+---------+---------+
| 112|  8%|+|-11.405313| 1.000000| 0.727273| 0.355603|
+----+----+-+---------+---------+---------+---------+

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb4966b70 (LWP 29567)]
0x08057c6a in icvSortIndexedValArray_32s (array=0x8096340, total=39, aux=0xb4966158) at haartraining/cvboost.cpp:82
82 CV_IMPLEMENT_QSORT_EX( icvSortIndexedValArray_32s, int, CMP_VALUES, CvValArray* )
(gdb)

[Switching to Thread 0xb4966b70 (LWP 29567)]
0x08057c6a in icvSortIndexedValArray_32s (array=0x8096340, total=39, aux=0xb4966158) at haartraining/cvboost.cpp:82
82 CV_IMPLEMENT_QSORT_EX( icvSortIndexedValArray_32s, int, CMP_VALUES, CvValArray* )
(gdb) backtrace
#0 0x08057c6a in icvSortIndexedValArray_32s (array=0x8096340, total=39, aux=0xb4966158) at haartraining/cvboost.cpp:82
#1 0x0805d4e4 in cvCreateMTStumpClassifier (.omp_data_i=0xbfffc7ac) at haartraining/cvboost.cpp:1031
#2 0xb6e1e22c in ?? () from /usr/lib/libgomp.so.1
#3 0xb709bcc9 in start_thread () from /lib/libpthread.so.0
#4 0xb6ef369e in clone () from /lib/libc.so.6
(gdb) frame 0
#0 0x08057c6a in icvSortIndexedValArray_32s (array=0x8096340, total=39, aux=0xb4966158) at haartraining/cvboost.cpp:82
82 CV_IMPLEMENT_QSORT_EX( icvSortIndexedValArray_32s, int, CMP_VALUES, CvValArray* )
(gdb)

Updated by Pavel Reich almost 14 years ago

Is it true that even with TBB/OpenMP it doesn't improve performance, but simply runs the same code on several CPUs? http://tech.groups.yahoo.com/group/OpenCV/message/74117

Updated by Maciej Rumianowski over 13 years ago

I have checked, and haartraining (trunk version and 2.2) doesn't use TBB. It does contain a lot of OpenMP code. I will try to force compilation of haartraining with OpenMP and see what happens.

Updated by Pavel Reich over 13 years ago

I've tried that: I added #define CV_OPENMP HAVE_TBB in cvhaartraining.cpp, and it still uses only one CPU.

Updated by Maciej Rumianowski over 13 years ago

Just find ENABLE_OPENMP in the CMakeLists.txt in the main directory and uncomment it. Additionally, #define CV_OPENMP in haartraining.
After compilation I was successfully using 8 CPUs ;)

Updated by Alexander Shishkov about 13 years ago

  • Description changed from Hello, I've downloaded opencv-2.2 release from willowgarage.com, compile... to Hello, I've downloaded opencv-2.2 release from willowgarage.com, compile...

Updated by Alexander Shishkov almost 13 years ago

We don't support OpenMP any more. Also, this module is not yet parallelized with TBB.

  • Status changed from Open to Cancelled

Updated by Andrey Kamaev almost 13 years ago

  • Target version set to 2.4.0
