on 2018 Sep 27 7:24 AM
Hi,
I am performing regression with the PAL Random Decision Trees algorithm. I have 24 features and 25k rows. The model predicts the same value for every row of the test data: the target values in my test data range from 1.7 to 4, but every prediction is 2.72486. I have tried reducing the maximum tree depth from unlimited to 7 and have also varied the number of trees, but neither made any difference.
These are the settings in my parameter table:
INSERT INTO #PAL_PARAMETER_TBL VALUES ('HAS_ID', 1, null, null);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('TREES_NUM', 100, NULL, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('TRY_NUM', 3, NULL, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('MAX_DEPTH ', 6, null, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('SEED', 0, NULL, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('SPLIT_THRESHOLD', NULL, 1e-5, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('CALCULATE_OOB', 1, NULL, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('NODE_SIZE', 500, NULL, NULL);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('THREAD_RATIO', NULL, 1.0, NULL);
The following image shows the predicted results on the left and the confidence on the right. Note that every row has the same value.

The actual values are:

Why is this happening, and how can I avoid it?
Thank you.