Now, suppose you want to chose k random records from this array, this is how you proceed.
Procedure:
Given n, generate a random permutation of {1, 2, ... n} as {1', 2', ... , n'} and pick first k numbers from the permuted array.
For example, say you want to pick 3 numbers randomly from 8, s.t. n = 8, k = 3.
1. {1, ... 8} ===> {2,3,1,7,6,4,8,5} ===> {2,3,1}
2. {1, ... 8} ===> {8,4,3,1,6,5,7,2} ===> {8,4,3}
Thus, once you have k such indexes, you just need to probe your original array with these indexes and get corresponding k random records.
The Java code snippet is as shown:
The output as expected is :
randomly selecting 3 out of them
1 a
7 g
5 e
randomly selecting 5 out of them
6 f
5 e
7 g
8 h
2 b
I needed this for:
I had to partition dataset containing some @1,000 records, for cross-validation. It is a method of training a classifier when you have small number of training records. In this, you will first decide how many folds of dataset you want to make, say 5-fold, then 1,000 records will be partitioned in the ration 4 : 1 such that every time 4/5 th of the data records will be used for training and remaining 1/5 th for testing (validating in this case). Thus, I needed to randomly pick 4/5 * 1000 of my records every time, for which I wrote above piece of code.
Note 1: Not listing the code, but adding image so that, whoever wants to try it, will actually type in the things, and thus they will never forget the trick! :)
Note 2: Reference - Weka software
No comments:
Post a Comment