AutoTrain fail (Bug #4030)


Added by mathieu fontaine over 10 years ago. Updated almost 10 years ago.


Status: Done                Start date: 2014-11-26
Priority: Blocker           Due date:
Assignee: Maria Dimashova   % Done: 100%
Category: ml
Target version: 3.0
Affected version: branch 'master' (3.0-dev)    Operating System: Any
Difficulty:                 HW Platform: x64
Pull request:

Description

If I modify the sample to use trainAuto, it fails:

Starting training process
OpenCV Error: Bad argument (While cross-validation one or more of the classes have been fell out of the sample. Try to enlarge <CvSVMParams::k_fold>) in do_train, file /data/mfontaine/opencv/modules/ml/src/svm.cpp, line 1400
terminate called after throwing an instance of 'cv::Exception'
  what():  /data/mfontaine/opencv/modules/ml/src/svm.cpp:1400: error: (-5) While cross-validation one or more of the classes have been fell out of the sample. Try to enlarge <CvSVMParams::k_fold> in function do_train

whatever the k_folds value is.


Related issues

related to Bug #4464: class_labels of SVM::trainAuto is not consistent with tha... Done 2015-07-05

Associated revisions

Revision 6593422c
Added by Sancho McCann almost 10 years ago

Bugfix: #4030 SVM auto-training.

Revision ede4943d
Added by Alexander Smorkalov almost 10 years ago

Merge pull request #4030 from asmorkalov:as/accurate_cuda_arch_aarch64

History

Updated by wooden glider over 10 years ago

mathieu fontaine wrote:

If I modify the sample to use trainAuto, it fails:
[...]
whatever the k_folds value is.

Hi mathieu fontaine!

As you probably already know, k-fold cross-validation partitions the original sample set into exactly k subsets and cross-validates on each of them. The sample size must therefore be divisible by k; otherwise the final subset ends up smaller than the others, which causes the inconsistency. Passing in a k that divides the sample size may solve this problem. Please let us know whether it does.

Hope this helps!

Updated by wooden glider over 10 years ago

  • % Done changed from 0 to 50

Updated by wooden glider over 10 years ago

  • Status changed from New to Cancelled

Updated by mathieu fontaine over 10 years ago

Like I said, it doesn't work no matter what the value of k_fold is. I think you mean that k_fold must be a divisor of the sample size. You can try this:

#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include "opencv2/imgcodecs.hpp" 
#include <opencv2/highgui.hpp>
#include <opencv2/ml.hpp>

#define NTRAINING_SAMPLES  100     // Number of training samples per class
#define FRAC_LINEAR_SEP    0.9f    // Fraction of samples which compose the linear separable part

using namespace cv;
using namespace cv::ml;
using namespace std;

static void help()
{
    cout<< "\n--------------------------------------------------------------------------" << endl
        << "This program shows Support Vector Machines for Non-Linearly Separable Data. " << endl
        << "Usage:"                                                               << endl
        << "./non_linear_svms" << endl
        << "--------------------------------------------------------------------------"   << endl
        << endl;
}

int main()
{
    help();

    // Data for visual representation
    const int WIDTH = 512, HEIGHT = 512;
    Mat I = Mat::zeros(HEIGHT, WIDTH, CV_8UC3);

    //--------------------- 1. Set up training data randomly ---------------------------------------
    Mat trainData(2*NTRAINING_SAMPLES, 2, CV_32FC1);
    Mat labels   (2*NTRAINING_SAMPLES, 1, CV_32SC1);

    RNG rng(100); // Random value generation class

    // Set up the linearly separable part of the training data
    int nLinearSamples = (int) (FRAC_LINEAR_SEP * NTRAINING_SAMPLES);

    // Generate random points for the class 1
    Mat trainClass = trainData.rowRange(0, nLinearSamples);
    // The x coordinate of the points is in [1, 0.4 * WIDTH)
    Mat c = trainClass.colRange(0, 1);
    rng.fill(c, RNG::UNIFORM, Scalar(1), Scalar(0.4 * WIDTH));
    // The y coordinate of the points is in [1, HEIGHT)
    c = trainClass.colRange(1,2);
    rng.fill(c, RNG::UNIFORM, Scalar(1), Scalar(HEIGHT));

    // Generate random points for the class 2
    trainClass = trainData.rowRange(2*NTRAINING_SAMPLES-nLinearSamples, 2*NTRAINING_SAMPLES);
    // The x coordinate of the points is in [0.6 * WIDTH, WIDTH)
    c = trainClass.colRange(0 , 1);
    rng.fill(c, RNG::UNIFORM, Scalar(0.6*WIDTH), Scalar(WIDTH));
    // The y coordinate of the points is in [1, HEIGHT)
    c = trainClass.colRange(1,2);
    rng.fill(c, RNG::UNIFORM, Scalar(1), Scalar(HEIGHT));

    //------------------ Set up the non-linearly separable part of the training data ---------------

    // Generate random points for the classes 1 and 2
    trainClass = trainData.rowRange(  nLinearSamples, 2*NTRAINING_SAMPLES-nLinearSamples);
    // The x coordinate of the points is in [0.4 * WIDTH, 0.6 * WIDTH)
    c = trainClass.colRange(0,1);
    rng.fill(c, RNG::UNIFORM, Scalar(0.4*WIDTH), Scalar(0.6*WIDTH));
    // The y coordinate of the points is in [1, HEIGHT)
    c = trainClass.colRange(1,2);
    rng.fill(c, RNG::UNIFORM, Scalar(1), Scalar(HEIGHT));

    //------------------------- Set up the labels for the classes ---------------------------------
    labels.rowRange(                0,   NTRAINING_SAMPLES).setTo(1);  // Class 1
    labels.rowRange(NTRAINING_SAMPLES, 2*NTRAINING_SAMPLES).setTo(2);  // Class 2

    //------------------------ 2. Set up the support vector machines parameters --------------------
    SVM::Params params;
    params.svmType    = SVM::C_SVC;
    params.C            = 0.1;
    params.kernelType = SVM::LINEAR;
    params.termCrit   = TermCriteria(TermCriteria::MAX_ITER, (int)1e7, 1e-6);

    //------------------------ 3. Train the svm ----------------------------------------------------
    cout << "Starting training process" << endl;
    Ptr<TrainData> td = TrainData::create(trainData,ROW_SAMPLE,labels);

    Ptr<SVM> svm = SVM::create(params);
    svm->trainAuto(td,10);

    cout << "Finished training process" << endl;

    //------------------------ 4. Show the decision regions ----------------------------------------
    Vec3b green(0,100,0), blue (100,0,0);
    for (int i = 0; i < I.rows; ++i)
        for (int j = 0; j < I.cols; ++j)
        {
            Mat sampleMat = (Mat_<float>(1,2) << i, j);
            float response = svm->predict(sampleMat);

            if      (response == 1)    I.at<Vec3b>(j, i)  = green;
            else if (response == 2)    I.at<Vec3b>(j, i)  = blue;
        }

    //----------------------- 5. Show the training data --------------------------------------------
    int thick = -1;
    int lineType = 8;
    float px, py;
    // Class 1
    for (int i = 0; i < NTRAINING_SAMPLES; ++i)
    {
        px = trainData.at<float>(i,0);
        py = trainData.at<float>(i,1);
        circle(I, Point( (int) px,  (int) py ), 3, Scalar(0, 255, 0), thick, lineType);
    }
    // Class 2
    for (int i = NTRAINING_SAMPLES; i <2*NTRAINING_SAMPLES; ++i)
    {
        px = trainData.at<float>(i,0);
        py = trainData.at<float>(i,1);
        circle(I, Point( (int) px, (int) py ), 3, Scalar(255, 0, 0), thick, lineType);
    }

    //------------------------- 6. Show support vectors --------------------------------------------
    thick = 2;
    lineType  = 8;
    Mat sv = svm->getSupportVectors();

    for (int i = 0; i < sv.rows; ++i)
    {
        const float* v = sv.ptr<float>(i);
        circle(    I,  Point( (int) v[0], (int) v[1]), 6, Scalar(128, 128, 128), thick, lineType);
    }

    imwrite("result.png", I);                       // save the Image
    imshow("SVM for Non-Linear Training Data", I); // show it to the user
    waitKey(0);
}

  • Status changed from Cancelled to Incomplete

Updated by Serhiy M about 10 years ago

mathieu fontaine wrote:

If I modify the sample to use trainAuto, it fails:
[...]
whatever the k_folds value is.

Same error here, even with a different (larger) training set and feature count.

Updated by Sancho McCann almost 10 years ago

Is this being worked on? I can volunteer if nobody else is already.

Updated by Sancho McCann almost 10 years ago

  • Status changed from Incomplete to Done
  • % Done changed from 50 to 100
