Suggestions for RTree documentation (Patch #1656)


Added by Nghia Ho about 13 years ago. Updated almost 13 years ago.


Status: Done
Start date: 2012-03-07
Priority: Low
Due date:
Assignee: Vadim Pisarevsky
% Done: 0%
Category: documentation
Target version: 2.4.0
Affected version:
Operating System:
Difficulty:
HW Platform:
Pull request:

Description

RTree is lacking documentation for some of its parameters. I would like to suggest the following small additions to help beginners use RTree more effectively:

max_depth - the maximum depth of the tree. A low value will likely underfit and, conversely, a high value will likely overfit. The optimal value can be found using cross-validation or another suitable method.

min_sample_count - the minimum number of samples required at a leaf node for it to be split. A reasonable value is a small percentage of the total data, e.g. 1%.

max_categories - appears to be unused (?), as far as I can tell from searching the code.

max_num_of_trees_in_the_forest - the maximum number of trees in the forest (surprise, surprise). Typically, the more trees you have, the better the accuracy. However, the improvement in accuracy generally diminishes and levels off past a certain number of trees. Also keep in mind that prediction time increases linearly with the number of trees.
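
To make two of the suggestions above concrete, here is a small Python sketch. It is not OpenCV code: both helper names are hypothetical, and the second function rests on a strong simplifying assumption (a two-class problem where every tree votes independently and is correct with the same probability), which real forests do not satisfy because their trees are correlated. It only illustrates the 1% rule of thumb for min_sample_count and why adding trees gives diminishing returns.

```python
from math import comb


def min_sample_count_for(n_samples, fraction=0.01):
    """Hypothetical helper: turn the '1% of the data' rule of thumb into a count."""
    # Never go below 1 sample, even for tiny data sets.
    return max(1, round(n_samples * fraction))


def majority_vote_accuracy(n_trees, p_tree):
    """Accuracy of a majority vote over n_trees trees on a two-class problem.

    Assumption (idealised): trees vote independently and each one is
    correct with probability p_tree.
    """
    if n_trees % 2 == 0:
        raise ValueError("use an odd number of trees to avoid ties")
    needed = n_trees // 2 + 1  # votes required for a correct majority
    # Sum the binomial probabilities of getting at least `needed` correct votes.
    return sum(
        comb(n_trees, k) * p_tree ** k * (1 - p_tree) ** (n_trees - k)
        for k in range(needed, n_trees + 1)
    )


if __name__ == "__main__":
    print("min_sample_count for 5000 samples:", min_sample_count_for(5000))
    for n in (1, 11, 51, 101, 201):
        print(n, "trees ->", round(majority_vote_accuracy(n, 0.6), 4))
```

Under these assumptions, each extra batch of trees adds less accuracy than the previous one, while the prediction cost keeps growing in proportion to the number of trees.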

Maybe in the future the documentation could contain a section on using some of the machine learning algorithms. It doesn't have to be in depth, just short, useful tips and caveats about each algorithm. I think it would be very useful for beginners.


Associated revisions

Revision 168d1936
Added by Vadim Pisarevsky almost 13 years ago

improved description of RTreeParams (ticket #1656; thanks to Nghia Ho)

History

Updated by Alexander Shishkov almost 13 years ago

  • Tracker changed from Feature to Patch
  • Target version deleted ()

Updated by Alexander Shishkov almost 13 years ago

  • Priority changed from Normal to Low
  • Category changed from ml to documentation

Updated by Alexander Shishkov almost 13 years ago

  • Assignee deleted (Maria Dimashova)

Updated by Vadim Pisarevsky almost 13 years ago

Thanks! The description has been added in SVN trunk, r7661.

  • Status changed from Open to Done
  • Assignee set to Vadim Pisarevsky

Updated by Alexander Shishkov almost 13 years ago

  • Target version set to 2.4.0
