cv::divide(..) throws "illegal instruction" in OpenCV build in VS2013 made with CMake 3.3 (Bug #4422)
Description
I used CMake 3.3 to generate both VS2010 and VS2013 OpenCV solutions. The VS2010
build runs cv::divide just fine. The VS2013 build, used from a simple VS2013
example (shown below), raises an "Illegal Instruction" exception when cv::divide
is called on a Mat.
I modified a sample as shown in Example#1. Running it under the debugger gives the
error below; running it as a standalone executable simply crashes the program.
(This does not happen with the VS2010 build of OpenCV.)
Here is the error:
- Unhandled exception at 0x6632facf (ISVision.dll) in InSpec.exe: 0xC000001D: Illegal Instruction.
Breaking into the debugger and inspecting the code leads me to intrin_sse.hpp, line 263:
- OPENCV_HAL_IMPL_SSE_INITVEC(v_float32x4, float, f32, ps, ps, float, _mm_castsi128_ps)
v_setall_f32 seems to be the culprit. Expanding the macro's return statement gives:
_Tps -> float
_Tpvec -> v_float32x4
ssuffix -> ps
_Tpvec(_mm_set1_##ssuffix((_Tps)v)); -> v_float32x4(_mm_set1_ps((float) v));
This appears to construct the struct through its single-parameter constructor,
which assigns the parameter v to val, a member of type __m128. NOTE: the call
stack simply stops at v_setall_f32(float v);
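To make the expansion above concrete, here is a minimal standalone sketch of what v_setall_f32 boils down to. The names sketch_float32x4, sketch_setall_f32 and sketch_store are hypothetical stand-ins, not OpenCV's actual definitions, and the scalar branch exists only so the snippet also compiles on targets without SSE. Note that _mm_set1_ps only asks for a broadcast; which machine instruction implements it is up to the compiler and the instruction sets enabled for the build.

```cpp
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE: __m128, _mm_set1_ps, _mm_storeu_ps
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

// Hypothetical stand-in for v_float32x4: four packed floats.
struct sketch_float32x4 {
#if SKETCH_HAVE_SSE
    __m128 val;
    // Single-parameter constructor: stores the register in val.
    explicit sketch_float32x4(__m128 v) : val(v) {}
#else
    float val[4];  // scalar fallback, assumption for non-SSE targets
#endif
};

// Hypothetical stand-in for v_setall_f32: copy v into all four lanes.
inline sketch_float32x4 sketch_setall_f32(float v) {
#if SKETCH_HAVE_SSE
    return sketch_float32x4(_mm_set1_ps(v));
#else
    sketch_float32x4 r;
    for (int i = 0; i < 4; ++i) r.val[i] = v;
    return r;
#endif
}

// Write the four lanes to memory so they can be inspected.
inline void sketch_store(float* dst, const sketch_float32x4& a) {
#if SKETCH_HAVE_SSE
    _mm_storeu_ps(dst, a.val);
#else
    for (int i = 0; i < 4; ++i) dst[i] = a.val[i];
#endif
}
```

Calling sketch_store(out, sketch_setall_f32(0.5f)) on a float out[4] fills every element with 0.5f.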
Example#1
Sample of code that breaks in Visual Studio 2013 build:
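#include <string>
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/highgui.hpp>
using namespace cv;
static const char* wndname = "divide test"; // window title; the value is a placeholder, not from the original sample
int main() {
    std::string inputName = "full path to image . bmp";
    Mat m1, m2, result;
    m1 = imread(inputName, 1); // 1 = load as a 3-channel color image
    m2 = imread(inputName, 1);
    namedWindow(wndname, WINDOW_AUTOSIZE);
    imshow(wndname, m1);
    waitKey(0);
    cv::divide(0.5, m1, result); // This throws the error and crashes the program
    imshow(wndname, result);
    waitKey(0);
    return 0;
}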
Here are my CMake options:
BUILD_EXAMPLES
BUILD_TESTS
BUILD_WITH_DEBUG_INFO
BUILD_ZLIB
BUILD_opencv_calib3d
BUILD_opencv_core
BUILD_opencv_cudev
BUILD_opencv_features2d
BUILD_opencv_flann
BUILD_opencv_hal
BUILD_opencv_highgui
BUILD_opencv_imgcodecs
BUILD_opencv_imgproc
BUILD_opencv_ml
BUILD_opencv_objdetect
BUILD_opencv_rgbd (in the opencv_contrib/modules)
BUILD_opencv_video
BUILD_opencv_videoio
BUILD_opencv_world
ENABLE_AV*
ENABLE_FMA3
ENABLE_POPCNT
ENABLE_PRECOMPILED_HEADERS
ENABLE_SOLUTION_FOLDERS
ENABLE_SS*
WITH_CUBLAS
WITH_CUFFT
WITH_DIRECTX
WITH_IPP
WITH_OPENCL
WITH_OPENCLAMDBLAS
WITH_OPENCLAMDFFT
WITH_OPENCL_SVM
WITH_WIN32UI
I am sure everyone is building in VS2013 just fine. Does everyone's cv::divide go into the same SIMD branch?
Some Guesses:
- Could I try compiling OpenCV without SSE?
- What kinds of libraries could cause an incompatibility that changes how the instructions are compiled?
- Maybe there is a simple bug in CMake that misconfigures VS2013 projects?
- Maybe something in between: a CMake option that is simply incompatible with VS2013.
History
Updated by Oriah Ulrich over 9 years ago
Here is the call stack:
v_setall_f32
Div_SIMD<float>::operator()
div_f<float>
div32f
arithm_op
divide
----------------------------------------------------------------------------------------
in include\opencv2\hal\intrin.hpp:
A current workaround is to include "opencv2/hal/intrin_cpp.hpp" there instead of intrin_sse.hpp.
With that change, cv::divide no longer crashes in a VS2013 project that uses OpenCV!
It seems that intrin.hpp includes intrin_sse.hpp when CV_SSE2 is defined and intrin_cpp.hpp otherwise.
However, other calls to _mm_set1_ps still cause problems that are not related to cv::divide.
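The workaround above swaps implementations at compile time. A minimal sketch of that pattern follows, assuming a hypothetical SKETCH_FORCE_SCALAR switch and a hypothetical sketch_divide helper (neither exists in OpenCV). The SSE2 path is used when the compiler targets SSE2, and a plain C++ loop otherwise. It computes dst[i] = scale / src[i], the same shape of operation as cv::divide(0.5, m1, result), but it does not reproduce OpenCV's saturation or divide-by-zero handling.

```cpp
#include <cstddef>

// Pick the implementation at compile time, similar in spirit to how
// intrin.hpp chooses between intrin_sse.hpp and intrin_cpp.hpp.
#if !defined(SKETCH_FORCE_SCALAR) && \
    (defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2))
#include <emmintrin.h>  // SSE2 header; also provides the SSE float intrinsics
#define SKETCH_SSE2 1
#else
#define SKETCH_SSE2 0
#endif

// Hypothetical helper: dst[i] = scale / src[i] for i in [0, n).
inline void sketch_divide(float scale, const float* src, float* dst, std::size_t n) {
    std::size_t i = 0;
#if SKETCH_SSE2
    const __m128 vscale = _mm_set1_ps(scale);  // broadcast scale to 4 lanes
    for (; i + 4 <= n; i += 4) {
        const __m128 s = _mm_loadu_ps(src + i);         // unaligned load of 4 floats
        _mm_storeu_ps(dst + i, _mm_div_ps(vscale, s));  // 4 divisions at once
    }
#endif
    // Scalar loop: handles the tail, or everything when SSE2 is not used.
    for (; i < n; ++i) {
        dst[i] = scale / src[i];
    }
}
```

Defining SKETCH_FORCE_SCALAR before this snippet forces the portable loop, which is the analogue of including intrin_cpp.hpp. That alone does not help if the whole build is compiled for an instruction set the CPU lacks, since the compiler may then use those instructions in scalar code as well.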
Updated by Oriah Ulrich over 9 years ago
Something like cv::multiply(m1, m2, result); works fine.
Looking at the disassembled code, I noticed that something is a bit off:
-vshufps xmm0,xmm0,xmm0,0 // visual studio 2010, runs just fine!
+vbroadcastss xmm0,xmm0 // visual studio 2013, crashes here
Updated by Oriah Ulrich over 9 years ago
Problem Solved!
The issue is exactly that the compiler generates vbroadcastss when it's told to, but my computer does not support that instruction. Silly user error. It makes sense that there is no check for a Haswell-or-later processor with AVX2 support, since one might want to build for newer computers on an older machine. (But then how would they test the build?) If it makes sense to add a flag that disables AVX2 even when the user requested it during project generation (via CMake), then CMake would also need to run a check for "Haswell or later" compatibility.
See this article:
https://software.intel.com/en-us/node/405250?language=es&wapkw=avx2+cpuid
And this:
http://blogs.msdn.com/b/vcblog/archive/2014/02/28/avx2-support-in-visual-studio-c-compiler.aspx
I might make a pull request if there is time, but I haven't even figured out where a good place for this addition would be. Maybe in CMake as an option?
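A runtime check along the lines discussed above could look like the following sketch. It is not part of OpenCV or CMake; sketch_cpuid, sketch_xcr0 and sketch_cpu_supports_avx2 are hypothetical names. It reads CPUID directly: leaf 1 for the AVX and OSXSAVE bits, XCR0 to confirm the OS saves the XMM/YMM state, then leaf 7 for the AVX2 bit. On non-x86 targets it simply reports false.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>     // __cpuidex
#include <immintrin.h>  // _xgetbv
#define SKETCH_X86 1
static void sketch_cpuid(int leaf, int subleaf, int regs[4]) {
    __cpuidex(regs, leaf, subleaf);  // regs = {EAX, EBX, ECX, EDX}
}
static std::uint64_t sketch_xcr0() { return _xgetbv(0); }
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>      // __cpuid_count
#define SKETCH_X86 1
static void sketch_cpuid(int leaf, int subleaf, int regs[4]) {
    unsigned a = 0, b = 0, c = 0, d = 0;
    __cpuid_count(leaf, subleaf, a, b, c, d);
    regs[0] = static_cast<int>(a);
    regs[1] = static_cast<int>(b);
    regs[2] = static_cast<int>(c);
    regs[3] = static_cast<int>(d);
}
static std::uint64_t sketch_xcr0() {
    unsigned lo = 0, hi = 0;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
#else
#define SKETCH_X86 0
#endif

// Returns true only if both the CPU and the OS allow AVX2 code to run.
inline bool sketch_cpu_supports_avx2() {
#if SKETCH_X86
    int r[4] = {0, 0, 0, 0};
    sketch_cpuid(0, 0, r);
    if (r[0] < 7) return false;                   // CPUID leaf 7 not available

    sketch_cpuid(1, 0, r);
    const bool osxsave = (r[2] & (1 << 27)) != 0; // OS has enabled XGETBV
    const bool avx     = (r[2] & (1 << 28)) != 0; // CPU has AVX
    if (!osxsave || !avx) return false;

    // XCR0 bits 1 and 2: OS saves and restores XMM and YMM registers.
    if ((sketch_xcr0() & 0x6) != 0x6) return false;

    sketch_cpuid(7, 0, r);
    return (r[1] & (1 << 5)) != 0;                // EBX bit 5: AVX2
#else
    return false;
#endif
}
```

A test program or launcher could call sketch_cpu_supports_avx2() at startup and refuse to load an AVX2 build. The check itself has to live in code compiled without AVX2, otherwise it may hit an illegal instruction before it gets a chance to run.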
- Target version set to 3.0
- Status changed from New to Done
- % Done changed from 0 to 100
Updated by Philip L over 9 years ago
Well, there are the CMake switches you listed above, "ENABLE_AVX" and "ENABLE_AVX2".
I think since these options are disabled by default, you should be 100% sure that your CPU supports such instructions before enabling them.
Maybe such a check would be useful, but if you aren't sure which instructions you can use, you should stick to the defaults. SSE2 is also enabled by default, since most current hardware supports it.
AVX support is not available in OpenCV yet, since there are no build bots to test these cases.