Add a binary FileStorage format (Feature #1849)


Added by Josh Klontz almost 13 years ago. Updated over 9 years ago.


Status: Open
Start date: 2012-04-25
Priority: Normal
Due date:
Assignee: Vadim Pisarevsky
% Done: 0%
Category: core
Target version: 3.0
Difficulty:
Pull request:

Description

It would be wonderful if FileStorage also supported serialization to a binary format for cases where file size matters more than file readability. If implemented, a great extension to this would be a mechanism for representing a FileStorage as a buffer in memory.

The problem I've run into while writing a gender classification algorithm is one of data persistence for the entire algorithm. Parts of the algorithm, like PCA using the Eigen matrix format, are easy to serialize in a binary format. However, other parts, like OpenCV's SVM classifier, are difficult to serialize, as the only mechanisms for doing so are 1) writing directly to a file or 2) using the FileStorage class.

My goal is to have one (small) file that represents the result of training all the steps in my algorithm, where some steps require serializing data structures from other libraries. The current approach I take is to save the SVM to its own temporary file, read the file from disk, compress the data, and finally write it to a file stream alongside the other portions of the algorithm.

The proposed features, or an alternative approach to serializing CvStatModels and the new Algorithm class to a memory buffer, would be greatly appreciated! Thanks!
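To make the "one file for all steps" goal concrete, here is a minimal sketch of how several already-serialized parts could be packed into a single buffer using length-prefixed chunks. It uses only the C++ standard library and is not part of OpenCV; the names `writeChunk`, `readChunk`, `packSections` and `unpackSections` are hypothetical, and the section payloads are assumed to be byte strings produced elsewhere (for example, the contents of the temporary SVM file, compressed or not).

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: write a 32-bit little-endian length, then the raw bytes.
void writeChunk(std::ostream& out, const std::string& bytes) {
    assert(bytes.size() <= UINT32_MAX);  // the length must fit in the 4-byte prefix
    const std::uint32_t n = static_cast<std::uint32_t>(bytes.size());
    for (int i = 0; i < 4; ++i)
        out.put(static_cast<char>((n >> (8 * i)) & 0xFF));
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Hypothetical helper: read one chunk written by writeChunk.
// Returns false at end of stream or if the chunk is truncated.
bool readChunk(std::istream& in, std::string& bytes) {
    unsigned char len[4];
    if (!in.read(reinterpret_cast<char*>(len), 4)) return false;
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n |= static_cast<std::uint32_t>(len[i]) << (8 * i);
    bytes.resize(n);
    if (n > 0 && !in.read(&bytes[0], static_cast<std::streamsize>(n))) return false;
    return true;
}

// Pack named sections (e.g. "pca", "svm") into one in-memory buffer.
std::string packSections(const std::map<std::string, std::string>& sections) {
    std::ostringstream out(std::ios::binary);
    for (const auto& kv : sections) {
        writeChunk(out, kv.first);   // section name
        writeChunk(out, kv.second);  // section payload
    }
    return out.str();
}

// Recover the named sections from a buffer produced by packSections.
std::map<std::string, std::string> unpackSections(const std::string& blob) {
    std::istringstream in(blob, std::ios::binary);
    std::map<std::string, std::string> sections;
    std::string name, payload;
    while (readChunk(in, name) && readChunk(in, payload))
        sections[name] = payload;
    return sections;
}
```

The packed buffer can then be compressed and written to disk in one step. Note that this only bundles the parts; it does not remove the need for the temporary file when a model can only save itself to a path.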


file_storage.h (1.8 kB) Peter Minin, 2013-12-15 09:35 pm

file_storage.cpp (1.4 kB) Peter Minin, 2013-12-15 09:35 pm


Associated revisions

Revision 10aec14a
Added by Roman Donchenko over 11 years ago

Merge pull request #1849 from StevenPuttemans:feature_3375_documentation

History

Updated by Joel Mckay almost 13 years ago

The compressed XML files contain marshalled data, and free libraries like Xerces can parse them on just about any system (by the way, a symmetric MATLAB XSLT-based translation routine may be handy for transparency). Additionally, human-readable data is nicer to analyze and debug than a serialized binary.

A better explanation of why it is important is here:
http://en.wikipedia.org/wiki/Marshalling_%28computer_science%29

If libxerces is missing on your repo:
http://xerces.apache.org/xerces-c/

Updated by Alexander Smorkalov over 12 years ago

  • Assignee set to Vadim Pisarevsky
  • Target version set to 3.0

Updated by Peter Minin about 11 years ago

I miss this feature too, also mainly for saving classifier data. Instead of compressing XML/YAML files, I went another way: I wrote my own functions for binary storage, and in the case of CvSVM I subclassed it and overloaded write() and read() with my ports of the CvFileStorage-based methods. Unfortunately, I couldn't find a way to save/load a Fisherfaces FaceRecognizer like that, because it doesn't allow setting some of its variables through set*(), and the class isn't even defined in a header file, so subclassing won't work either.
I've attached my binary storage functions here, and I'd like to ask an experienced OpenCV contributor to check whether this approach can find its way into OpenCV and tell me what has to be changed for that.
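For readers who have not opened the attachments, the general idea of hand-written binary storage can be sketched as follows. This is not the attached file_storage.h/file_storage.cpp code; `FloatMatrix`, `writeMatrix` and `readMatrix` are hypothetical names, and the sketch assumes a row-major matrix of floats written in the host's byte order, so the resulting file is only portable between machines with the same endianness and float representation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for a dense matrix of floats (not an OpenCV type).
struct FloatMatrix {
    std::uint32_t rows = 0;
    std::uint32_t cols = 0;
    std::vector<float> data;  // row-major, rows * cols elements
};

// Write a small header (rows, cols) followed by the raw element bytes.
void writeMatrix(std::ostream& out, const FloatMatrix& m) {
    assert(m.data.size() == static_cast<std::size_t>(m.rows) * m.cols);
    out.write(reinterpret_cast<const char*>(&m.rows), sizeof m.rows);
    out.write(reinterpret_cast<const char*>(&m.cols), sizeof m.cols);
    out.write(reinterpret_cast<const char*>(m.data.data()),
              static_cast<std::streamsize>(m.data.size() * sizeof(float)));
}

// Read a matrix written by writeMatrix. Returns false if the stream is truncated.
bool readMatrix(std::istream& in, FloatMatrix& m) {
    if (!in.read(reinterpret_cast<char*>(&m.rows), sizeof m.rows)) return false;
    if (!in.read(reinterpret_cast<char*>(&m.cols), sizeof m.cols)) return false;
    m.data.resize(static_cast<std::size_t>(m.rows) * m.cols);
    in.read(reinterpret_cast<char*>(m.data.data()),
            static_cast<std::streamsize>(m.data.size() * sizeof(float)));
    return static_cast<bool>(in);
}
```

Each element costs exactly sizeof(float) bytes here, with no tags or whitespace around it, which is where the size advantage over a text format comes from. A format meant for inclusion in a library would also need a version marker and a fixed byte order.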

Updated by eric king about 11 years ago

OpenCV data serialization and deserialization: I think this is a very important problem for data storage in the OpenCV data format. Can anybody fix it immediately?

Updated by Maksim Shabunin over 9 years ago

Issue has been transferred to GitHub: https://github.com/Itseez/opencv/issues/4347
