About

If you are new to RVKDE, we suggest reading the README first and then this document. The README introduces two wrapper scripts (kde-train.pl and kde-predict.pl) for running RVKDE, a procedure very similar to that of the well-known LIBSVM. An alternative way to use RVKDE is to execute the rvkde binary (rvkde.exe on Windows) directly. Doing so has two benefits:

  1. Using rvkde directly is much more powerful and flexible.
  2. You don't need to install a Perl environment, which can be a little annoying on Windows.

The following sections are written mainly for Linux users. Most operations map trivially to Windows.

Preparation

  • Create a working directory for this walkthrough. Here we use ~/tmp as an example.
    cd ~
    mkdir tmp
  • Download and install RVKDE.
    cd ~/tmp
    wget http://mbi.ee.ncku.edu.tw/rvkde/res/rvkde-current-linux32.tgz
    tar zxvf rvkde-current-linux32.tgz

Classify the satimage dataset

  • Change the current path to ~/tmp. All following steps are meant to be executed in this directory.
    cd ~/tmp
  • Classify the satimage dataset. Notice that your RVKDE version might be different.
    rvkde-0.2.3-final/rvkde --classify --train -v rvkde-0.2.3-final/satimage.scale -m rvkde-0.2.3-final/satimage.scale.model --ks 10 # training
    rvkde-0.2.3-final/rvkde --classify --predict -m rvkde-0.2.3-final/satimage.scale.model -V rvkde-0.2.3-final/satimage.scale.t -a 1 -b 1 --ks 10 --kt 10 # testing
  • Classify the satimage dataset in one step.
    rvkde-0.2.3-final/rvkde --classify --predict -v rvkde-0.2.3-final/satimage.scale -V rvkde-0.2.3-final/satimage.scale.t -a 1 -b 1 --ks 10 --kt 10

Parameter selection

Most machine learning tools expose some parameters to users, for example the k in the k-nearest-neighbor (kNN) classification algorithm and the k in the k-means clustering algorithm. Seen optimistically, these parameters provide flexibility and make the tools more powerful. Seen another way, they exist because the techniques cannot determine (or learn) them automatically, so users must specify them themselves.

RVKDE provides two alternative ways to select its parameters, described in the following two sections.

Cross-validation

  • Change the current path to ~/tmp. All following steps are meant to be executed in this directory.
  • Cross-validation on satimage.scale.
    rvkde-0.2.3-final/rvkde --cv --classify --acc -n 5 -v rvkde-0.2.3-final/satimage.scale -a 1 -b 1,2,0.5 --ks 1,30,1 --kt 1,30

Let's take a look at the command.

--cv Switch rvkde into cross-validation mode.
--classify Tell rvkde we want to do classification rather than regression now.
--acc Use accuracy as the evaluation index.
-n Do n-fold cross-validation.
-v Followed by the dataset for cross-validation.
-a Set the range (begin, end and step) of alpha values of RVKDE. In this example, 1 is the only possible alpha value.
-b Set the range (begin, end and step) of beta values of RVKDE. In this example, the possible beta values are 1, 1.5 and 2.
--ks Set the range (begin, end and step) of ks values of RVKDE. In this example, the possible ks values are 1, 2, … 30.
--kt Set the range (begin, end and step) of kt values of RVKDE. In this example, the possible kt values are also 1, 2, … 30 since the default step is 1.
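
As a rough illustration of the begin, end and step notation, the shell sketch below expands a range and counts the parameter combinations covered by the cross-validation command. The helper names expand_range and count_values are hypothetical and not part of RVKDE. The sketch also assumes that rvkde evaluates every combination of the listed values, which this document does not state explicitly.

```shell
# Hypothetical helper (not part of RVKDE): expand "begin,end,step" into a list.
# Assumes a missing step means 1 and a single value means a one-element range.
expand_range() {
    echo "$1" | awk -F, '{
        lo = $1 + 0
        hi = (NF >= 2) ? $2 + 0 : lo
        st = (NF >= 3) ? $3 + 0 : 1
        n  = int((hi - lo) / st + 1e-9)    # how many whole steps fit in the range
        for (i = 0; i <= n; i++) printf "%s%s", (i ? " " : ""), lo + i * st
        printf "\n"
    }'
}

# Hypothetical helper: count the values in a range.
count_values() {
    set -- $(expand_range "$1")
    echo $#
}

expand_range 1,2,0.5     # beta values: 1 1.5 2

# Grid size for: -a 1 -b 1,2,0.5 --ks 1,30,1 --kt 1,30
total=$(( $(count_values 1) * $(count_values 1,2,0.5) * $(count_values 1,30,1) * $(count_values 1,30) ))
echo "$total combinations"    # 2700 combinations
```

Under that assumption, the example searches 1 x 3 x 30 x 30 = 2700 combinations, each evaluated with 5-fold cross-validation, so widening any range multiplies the running time accordingly.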
  • The result looks like
    [0.918602] a=1 b=1 s=8 t=21...

    This tells us that the best parameter combination is alpha = 1, beta = 1, ks = 8 and kt = 21, and that the accuracy under these parameters is 0.918602.
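
When the search is scripted, the reported parameters can be pulled out of the result line and reused. The following shell sketch is only an illustration: it assumes the line has exactly the shape shown above, ignores whatever follows the kt value, and uses variable names (acc, alpha, beta, ks, kt) chosen here for readability rather than taken from RVKDE.

```shell
# Sample line copied from the result above; the trailing "..." is left as shown.
line='[0.918602] a=1 b=1 s=8 t=21...'

# Extract each field with sed; the patterns assume the "[acc] a= b= s= t=" layout.
acc=$(printf '%s\n' "$line"   | sed -n 's/^\[\([0-9.]*\)].*/\1/p')
alpha=$(printf '%s\n' "$line" | sed -n 's/.* a=\([^ ]*\).*/\1/p')
beta=$(printf '%s\n' "$line"  | sed -n 's/.* b=\([^ ]*\).*/\1/p')
ks=$(printf '%s\n' "$line"    | sed -n 's/.* s=\([0-9]*\).*/\1/p')
kt=$(printf '%s\n' "$line"    | sed -n 's/.* t=\([0-9]*\).*/\1/p')

echo "accuracy: $acc"
# The selected values, written as the -a, -b, --ks and --kt options used in this document.
echo "-a $alpha -b $beta --ks $ks --kt $kt"
```

The last line prints the option string -a 1 -b 1 --ks 8 --kt 21, which can be appended to a prediction command.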

Predict with the selected parameters

Now we have a parameter combination derived from cross-validation.

  • Use these parameters to predict satimage.scale.t.
    rvkde-0.2.3-final/rvkde --predict --classify --acc -v rvkde-0.2.3-final/satimage.scale -V rvkde-0.2.3-final/satimage.scale.t -a 1 -b 1 --ks 8 --kt 21

Let's take a look at the command.

--predict Switch RVKDE into prediction mode (rather than cross-validation).
-v Followed by the training dataset.
-V Followed by the testing dataset.
  • The result looks like
    [0.9175] a=1 b=1 s=8 t=21...

    This indicates that RVKDE yields an accuracy of 0.9175 with this parameter combination when satimage.scale is used as the training set to predict satimage.scale.t.

Train, validate, and then test

Another common procedure for parameter selection is to set aside an independent validation set. For example, you can use satimage.scale.tr and satimage.scale.val to select the parameters, and then check how well they perform when applied to satimage.scale.t.

  • Use satimage.scale.tr (as training set) and satimage.scale.val (as validation set) to select the parameters.
    rvkde-0.2.3-final/rvkde --predict --classify --acc -v rvkde-0.2.3-final/satimage.scale.tr -V rvkde-0.2.3-final/satimage.scale.val -a 1 -b 1,2,0.5 --ks 1,30,1 --kt 1,30,1
  • The result looks like
    [0.913599] a=1 b=1 s=8 t=23...
  • Predict satimage.scale.t with the selected parameters.
    rvkde-0.2.3-final/rvkde --predict --classify --acc -v rvkde-0.2.3-final/satimage.scale.tr -V rvkde-0.2.3-final/satimage.scale.t -a 1 -b 1 --ks 8 --kt 23
  • The result looks like
    [0.917] a=1 b=1 s=8 t=23...

    This shows that the two parameter selection schemes yield very close accuracies on satimage.scale.t: 0.9175 with cross-validation and 0.917 with an independent validation set.

rvkde/usage.txt · Last modified: 2008/08/19 12:21 by dirty