Question: In Python or R. Predict the pdc-80-flag using the following features pre-rx-cost,numofgen,numofbrand,generic-cost,adjust-total-30d, and num-er. Determine the accuracy rate for test set for k = 75

In Python or R.

Predict the pdc-80-flag using the following features pre-rx-cost,numofgen,numofbrand,generic-cost,adjust-total-30d, and num-er. Determine the accuracy rate for test set for k = 75 to 105 with a step size of 2 and report it in a table. Use linear normalization method to normalize the input features and Euclidean distance for distance measure. Note that you must use the training parameters for the normalization of test points.

There are two datasets:

healthcareTest.csv

healthcareTrain.csv

I just need code.

In Python or R. Predict the pdc-80-flag using the following features pre-rx-cost,numofgen,numofbrand,generic-cost,adjust-total-30d,

datal = pd. read_csv ('healthcareTest.csv) datal. head() patindex pdc num_ip_post total_los_post_num_op_postnum_er_postnum_ndc_post_num_gpi6_post adjust_total_30d_post generic_rate_post... 0 2 0.333333 0 0 4 0 15 5 14.466667 0.101382 1 5 0.866667 0 0 5 0 16 4 18.000000 0.888889 2 21 0.500000 0 0 0 0 8 6 8.000000 0.875000 3 22 0.977778 0 0 9 0 40 9 42.533333 0.835423 4 33 0.527778 0 0 6 0 28 7 28.000000 0.964286 5 rows x 96 columns > data2 = pd. read_csv('healthcareTrain.csv') data2. head() 1 patindex pdc num_ip_post total_los_post_num_op_post_num_er_post_num_ndc_post_num_gpi6_post adjust_total_30d_post generic_rate_post... 2393 0.500000 0 0 2. 0 3 2 2.333333 1.0 0 1 2148 0.994444 0 0 3 0 17 5 5 26.500000 1.0 2 1799 0.472222 0 0 3 0 3 1 3.000000 1.0 ... 3 636 0.166667 0 0 0 0 2 2 2.000000 1.0 4 114 0.944444 0 0 3 0 12 2 12.000000 1.0 5 rows x 96 columns datal = pd. read_csv ('healthcareTest.csv) datal. head() patindex pdc num_ip_post total_los_post_num_op_postnum_er_postnum_ndc_post_num_gpi6_post adjust_total_30d_post generic_rate_post... 0 2 0.333333 0 0 4 0 15 5 14.466667 0.101382 1 5 0.866667 0 0 5 0 16 4 18.000000 0.888889 2 21 0.500000 0 0 0 0 8 6 8.000000 0.875000 3 22 0.977778 0 0 9 0 40 9 42.533333 0.835423 4 33 0.527778 0 0 6 0 28 7 28.000000 0.964286 5 rows x 96 columns > data2 = pd. read_csv('healthcareTrain.csv') data2. head() 1 patindex pdc num_ip_post total_los_post_num_op_post_num_er_post_num_ndc_post_num_gpi6_post adjust_total_30d_post generic_rate_post... 2393 0.500000 0 0 2. 0 3 2 2.333333 1.0 0 1 2148 0.994444 0 0 3 0 17 5 5 26.500000 1.0 2 1799 0.472222 0 0 3 0 3 1 3.000000 1.0 ... 3 636 0.166667 0 0 0 0 2 2 2.000000 1.0 4 114 0.944444 0 0 3 0 12 2 12.000000 1.0 5 rows x 96 columns

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!