Signficant slow down of SVD in OpenCV 2.3 (Bug #1498)
Description
When using CvInvert with SVD method, OpenCV 2.3 is 3x slower than OpenCV 2.2.
Operating System: Windows
Compiler: MSVC 10
OpenCV Version - All versions of OpenCV 2.3
Same code attached was run using OpenCV 2.2 and OpenCV 2.3.1
Clearly identified CvInverse is slowest component.
Timing tests with 1000 iterations:
OpenCV 2.2 - 9 seconds
OpenCV 2.3.1 - 33 seconds
Test Code below;
1/**
2* Test speed of linear algebra functions in OpenCV
3 *
4 **/
5
6#include <opencv/cv.h>
7#include <cstdio>
8#include <cstdarg>
9#include <cstdlib>
10#include <iostream>
11#include <fstream>
12#include <cmath>
13#include <cassert>
14#include <ctime>
15
16using namespace std;
17
18void CovarMatrix (CvMat* CovMat, CvMat* FeatureVecs);
19
20int main(int argc, char *argv[])
21{
22 bool debug = false;
23
24 int FVLength = 64;
25
26 CvMat* avg = cvCreateMat(FVLength, 1, CV_64FC1);
27 CvMat* FeatVec = cvCreateMat(FVLength, 1, CV_64FC1);
28 CvMat* Anomaly = cvCreateMat(FVLength, 1, CV_64FC1);
29 CvMat* Ki = cvCreateMat(FVLength, FVLength, CV_64FC1);
30 CvMat* CovMat = cvCreateMat(FVLength, FVLength, CV_64FC1);
31 cvSetZero(CovMat);
32 CvMat* TotVecs = cvCreateMat(2,FVLength, CV_64FC1);
33
34 // Initialize the FeatVec and avg to be random
35
36 CvRNG rng_state = cvRNG(0xffffffff);
37 cvRandArr(&rng_state, FeatVec, CV_RAND_NORMAL, cvRealScalar(100), cvRealScalar(30));
38 cvRandArr(&rng_state, avg, CV_RAND_NORMAL, cvRealScalar(60), cvRealScalar(25));
39
40 // Start time
41 time_t start,end;
42 double dif;
43 time (&start);
44 unsigned int it = 10000;
45 for (int p = 0; p < it; p++)
46 {
47
48 cvSub(FeatVec, avg, Anomaly); // Find anomaly vector (feature-average)
49
50 //Ki is now going to be inverse covariance matrix
51
52 for (int k=0; k < FVLength; k++)
53 {
54 cvmSet(TotVecs,0,k,cvmGet(FeatVec,k,0));
55 cvmSet(TotVecs,1,k,cvmGet(avg,k,0));
56 }
57
58 // Next, find the covariance matrix of this.
59 CovarMatrix(CovMat, TotVecs);
60 cvInvert(CovMat, Ki, CV_SVD);
61
62 }
63
64 time(&end);
65 dif = difftime(end,start);
66 cout << "Time Ellapsed : " << dif << endl;
67
68 return 0;
69}
70
71void CovarMatrix (CvMat* CovMat, CvMat* Vecs)
72{
73 int nums = Vecs->cols;
74 int Len = Vecs->rows;
75 int n = Vecs->cols; // step size
76 double *data = Vecs->data.db; // pointer to float with all values in input
77 double sum;
78
79 for (int j = 0; j < nums ; j++)
80 {
81 sum = 0;
82 for (int i = 0; i < Len; i ++)
83 sum += data[i * n + j];
84 sum = sum/Len;
85
86 for (int i =0; i < Len; i++)
87 {
88 data[i * n + j] -= sum;
89 }
90 }
91
92 CvMat* VecsT = cvCreateMat(nums, Len, CV_64FC1); // covariance matrix
93 cvTranspose( Vecs,VecsT );
94 cvMulTransposed( VecsT, CovMat, 0);
95 // scale CovMat by m-1
96 double *CovDat = CovMat->data.db; // pointer to float with all values in input
97 for (int i = 0; i < CovMat->rows; i++) // for each feature vector, find the mean, and subtract it
98 {
99 for (int j = 0; j < CovMat->cols; j ++)
100 {
101 CovDat[i * (CovMat->cols) + j] = CovDat[i * (CovMat->cols) + j] / (Len-1);
102 }
103 }
104 cvReleaseMat(&VecsT);
105}
Related issues
| duplicated by Feature #1435: EM is slower in ver 2.3 | Cancelled |
History
Updated by Vadim Pisarevsky over 1 year ago
While the problem can not be fixed completely, since we moved to our own SVD implementation, which is much faster on small matrices and generally more accurate on large matrices (at the expense of speed), there is possible workaround -
Since the covariation matrix is symmetrical, you can use cvEigenVV to decompose it to V*E*V' (where V is an orthogonal matrix of eigenvectors and E is a diagonal matrix of eigenvalues) and then pass the matrix V twice to cvSVBkSb (since U=V).
In the latest trunk version (r7098 and higher) you can simply call cvInvert() and pass CV_SVD_SYM flag there.
- Status changed from Open to Done
- Pull request set to fixed
Updated by Andrey Kamaev about 1 year ago
- Target version set to 2.4.0
- Description changed from When using [[CvInvert]] with SVD method, [[OpenCV]]2.3 is 3x slower than [[Op... to When using @CvInvert@ with SVD method, OpenCV 2.3 is 3x slower than OpenCV 2.... More
Updated by Ming-Ming Cheng about 1 year ago
I tested this code in my computer where version 2.4 is about 6 times slower than version 2.2. Is there any possibility to supply user some choice to choose the old lapack package or the newer one. The package is been extensively used in many areas in the past ten years and it’s less likely to have bugs. It’s also less likely to develop a new math package which is significantly more efficient than lapack. Personally, if there is not significant performance gain, I prefer the lapack. If there is, why not suggest these changes to lapack maintainer? As a user, I strongly recommend return back to lapack version, considering the performance of current version.
Updated by Vadim Pisarevsky about 1 year ago
could you, please, replace
cvInvert(CovMat, Ki, CV_SVD);
with
cvInvert(CovMat, Ki, CV_SVD_SYM);
and measure it again?
I believe, the difference in speed should become much smaller.
Updated by Ming-Ming Cheng about 1 year ago
I noticed many post elsewhere: http://comments.gmane.org/gmane.comp.lib.opencv/50836 . Since there so many users find Lapack version is better, why not change back? I really prefer other new features in later version of opencv, thus I don’t want to change back to version 2.2. Even if I can directly use lapack to solve my own SVD problem, relative slowdown on many other opencv functions relying on this can’t be solved easily.
In the opencv change note: “LAPACK is not used by OpenCV anymore. The change decreased the library footprint and the compile time. We now use our own implementation of Jacobi SVD. SVD performance on small matrices (2x2 to 10x10) has been greatly improved; on larger matrices it is still pretty good. SVD accuracy on poorly-conditioned matrices has also been improved.” However, even for a 64x64 matrix, it’s 6 times much slower; I can imagine the lower performance for even larger matrix.
Updated by Ming-Ming Cheng about 1 year ago
When people use SVD, they typically don’t have a symmetrical matrix. Even if speed difference of using CV_SVD_SYM is smaller for this special case, it is still slower, and for more general cases, much slower. Performance gain in compile time can is not so important since most users don’t compile it very often. I believe returning this part back to Lapack would make OpenCV much better. I really appreciate your contributions to make opencv better and better in most part. For this specific part, could you please help to return back to Lapack please? Lapack is used in MKL and Matlab, it’s naturally to also use this stable library in OpenCV.
Vadim Pisarevsky wrote:
could you, please, replace
cvInvert(CovMat, Ki, CV_SVD);
with
cvInvert(CovMat, Ki, CV_SVD_SYM);
and measure it again?I believe, the difference in speed should become much smaller.
Updated by Vadim Pisarevsky about 1 year ago
is it possible to supply the matrix on which the current SVD is 6 times slower? according to my experiments, new SVD is only a bit slower than Lapack on matrices of size <100x100. The specification of OS, compiler and flags you use is also appreciated. Until we have valid proof if the problem, which can not be solved easily, it's not a productive argument. I reopen the ticket for now.
- Target version changed from 2.4.0 to 2.4.2
- Priority changed from Blocker to Normal
- Status changed from Done to Open
Updated by Ming-Ming Cheng 12 months ago
Yes of course. I have attached the data file and two versions (2.2 and 2.4) of binary. By just clicking RunTest.bat, you can get the run time used for different versions of opencv.
I run this in more than 10 computers of my friends, and version 2.4 is about 6 times slower. We all use windows 7 and visual studio 2010. Specifically, my computer is ThinkPad T410 with i5 CPU. Here is the source code of this exe:
/*** Test speed of linear algebra functions in OpenCV ***/
#include <opencv/cv.h>
#include <cstdio>
#include <cstdarg>
#include <cstdlib>
#include <iostream>
#include <fstream>
#include <cmath>
#include <cassert>
#include <ctime>
#ifdef _DEBUG
#pragma comment(lib, "opencv_core220d.lib")
#pragma comment(lib, "opencv_highgui220d.lib")
#else
#pragma comment(lib, "opencv_core220.lib")
#endif // _DEBUG
using namespace cv;
using namespace std;
void CovarMatrix (CvMat* CovMat, CvMat* FeatureVecs);
int main(int argc, char *argv[])
{
bool debug = false;
int FVLength = 64;
CvMat* avg = cvCreateMat(FVLength, 1, CV_64FC1);
CvMat* FeatVec = cvCreateMat(FVLength, 1, CV_64FC1);
CvMat* Anomaly = cvCreateMat(FVLength, 1, CV_64FC1);
CvMat* Ki = cvCreateMat(FVLength, FVLength, CV_64FC1);
CvMat* CovMat = cvCreateMat(FVLength, FVLength, CV_64FC1);
cvSetZero(CovMat);
CvMat* TotVecs = cvCreateMat(2,FVLength, CV_64FC1);
// Initialize the FeatVec and avg to be random
CvRNG rng_state = cvRNG(0xffffffff);
cvRandArr(&rng_state, FeatVec, CV_RAND_NORMAL, cvRealScalar(100), cvRealScalar(30));
cvRandArr(&rng_state, avg, CV_RAND_NORMAL, cvRealScalar(60), cvRealScalar(25));
// Start time
clock_t start = ::clock();
unsigned int it = 2000;
for (int p = 0; p < it; p++) {
cvSub(FeatVec, avg, Anomaly); // Find anomaly vector (feature-average)
for (int k=0; k < FVLength; k++) //Ki is now going to be inverse covariance matrix
{
cvmSet(TotVecs,0,k,cvmGet(FeatVec,k,0));
cvmSet(TotVecs,1,k,cvmGet(avg,k,0));
}
// Next, find the covariance matrix of this.
CovarMatrix(CovMat, TotVecs);
cvInvert(CovMat, Ki, CV_SVD);
if (p == 0){
CvFileStorage *fs = cvOpenFileStorage("data.xml", NULL, CV_STORAGE_WRITE);
cvWrite(fs, "CovMat", CovMat);
}
}
clock_t endT = clock();
cout<<"Time elapsed : "<< (double(endT - start))/CLOCKS_PER_SEC <<endl;
return 0;
}
void CovarMatrix (CvMat* CovMat, CvMat* Vecs)
{
int nums = Vecs->cols;
int Len = Vecs->rows;
int n = Vecs->cols; // step size
double *data = Vecs->data.db; // pointer to float with all values in input
double sum;
for (int j = 0; j < nums ; j++) {
sum = 0;
for (int i = 0; i < Len; i ++)
sum += data[i * n + j];
sum = sum/Len;
for (int i =0; i < Len; i++)
data[i * n + j] -= sum;
}
CvMat* VecsT = cvCreateMat(nums, Len, CV_64FC1); // covariance matrix
cvTranspose( Vecs,VecsT );
cvMulTransposed( VecsT, CovMat, 0);
// scale CovMat by m-1
double *CovDat = CovMat->data.db; // pointer to float with all values in input
for (int i = 0; i < CovMat->rows; i++) // for each feature vector, find the mean, and subtract it
for (int j = 0; j < CovMat->cols; j ++)
CovDat[i * (CovMat->cols) + j] = CovDat[i * (CovMat->cols) + j] / (Len-1);
cvReleaseMat(&VecsT);
}
Vadim Pisarevsky wrote:
is it possible to supply the matrix on which the current SVD is 6 times slower? according to my experiments, new SVD is only a bit slower than Lapack on matrices of size <100x100. The specification of OS, compiler and flags you use is also appreciated. Until we have valid proof if the problem, which can not be solved easily, it's not a productive argument. I reopen the ticket for now.
Updated by Ming-Ming Cheng 12 months ago
If you want, I can just start a survey and ask opencv users to give information about their platform and the run-time reported by the exe I just uploaded. Our lab has 40 students using OpenCV. All of them meet this problem. I think I can easily collected more than 100+ from different users, or even 1000+.
Updated by Andrey Kamaev 11 months ago
- Target version deleted (
2.4.2)
Updated by Vadim Pisarevsky 11 months ago
ok; the problem has been reproduced; we will try to solve it soon
- Target version set to 2.4.3
Updated by Vadim Pisarevsky 7 months ago
- Target version changed from 2.4.3 to 3.0
Updated by Wolfgang von Hansen 5 months ago
I would like to draw the attention to a different and more fundamental aspect of the underlying problem. One of my tasks involves solving large systems of several thousand equations, which -- due to the sparse nature of the problem -- can basically be broken down to compute the inverse of a band matrix. cvInvert from OpenCV 2.2 does the job really fine using the (default) LU-decomposition. The new code in OpenCV 2.3 no longer exploits the sparseness of a matrix. One can easily demonstrate this with a trivial example
Mat A = Mat::eye(n, n, CV_32F).inv();
by choosing n large enough. Depending on the speed of your computer some value like 2000 should keep it busy for a minute or so. The times are the same as if it were a random matrix of the same size. If you test different values for n, you will see that the run time is in O(n^3). The reason is most likely, that there are no checks for zero elements during the elimination process, thus blindly computing the inverse by brute force.
For real world problems, the larger a matrix gets, the sparser it is. A good implementation -- like that of LAPACK -- takes the zeros into account, yielding a (linear!) run time in O(n). I sincerely hope that someone can look into this. For me asymptotic behavior for large matrices is much more important than a few saved cycles for small ones (these a fast anyway on modern CPUs).
Updated by Kirill Kornyakov about 1 month ago
Dear community, could somebody work take this issue? Core OpenCV devs are busy with release preparations, so if we want to see this issue fixed, we need a volunteer!
- Affected version set to branch 'master' (2.4.9)