Question:

Software and Data: The accelerometer file contains approximately 3.5 million rows. Your analysis and investigations should rely on R code. In the raw data file, please notice that the data were collected from 9 users (identified only as a, b, c, d, e, f, g, h, and i) who wore different watches while engaging in the activities for a period of time. Each user wore 4 different watches, and each watch recorded 3 dimensions of accelerometer data while the activities were ongoing.

Because the data are recorded continuously over time, we do not want to randomly select rows for our Training and Test subsamples: consecutive rows from the same user are closely related, so a row-level split would place near-duplicate observations in both partitions. Instead, for this project you should randomly choose 6 of the 9 users (67%) as the Training subjects, and the other three as the Test subjects.
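The user-level split described above can be sketched as follows. This is a minimal illustration on toy data, not the assignment's actual file: the data frame `har` and its column names (`User`, `x`, `y`, `z`) are assumptions, and the seed is arbitrary.

```r
set.seed(42)  # arbitrary seed, used only so the split is reproducible

# Toy stand-in for the accelerometer data (column names are assumptions).
users <- letters[1:9]                       # users a through i
har <- data.frame(
  User = rep(users, each = 4),
  x = rnorm(36), y = rnorm(36), z = rnorm(36)
)

# Sample users, not rows: 6 users for Training, the remaining 3 for Test.
train_users <- sample(users, 6)
test_users  <- setdiff(users, train_users)

train <- har[har$User %in% train_users, ]
test  <- har[har$User %in% test_users, ]
```

Because whole users are assigned to one partition, no user contributes rows to both Training and Test.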

Your model may rely on any combination of the appropriate available variables. In addition, there is a csv file called "HARnew". This file contains 311 time-sequenced observations of one individual engaging in several (fewer than six) activities. After you develop your model, run these new observations through the model to identify the activity corresponding to each row of data. The columns marked "a_x", "a_y", "a_z" are the 3 dimensions of accelerometer data.

The Challenge: Using the observations from the two LG watch models, build a "Human Activity Recognition" (HAR) model that uses some or all of the measurement variables to classify the activities of the person.
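Restricting the data to the LG watches might look like the sketch below. The column name `Device` and the values `gear_1`, `lgwatch_1`, and so on are assumptions made for illustration; check the actual column names and labels in the raw file before filtering.

```r
# Toy stand-in for the raw data (column name and device labels are assumptions).
har <- data.frame(
  Device = c("gear_1", "gear_2", "lgwatch_1", "lgwatch_2", "lgwatch_1"),
  x = c(0.1, 0.2, 0.3, 0.4, 0.5),
  y = c(1.1, 1.2, 1.3, 1.4, 1.5),
  z = c(9.1, 9.2, 9.3, 9.4, 9.5)
)

# Keep only rows whose device label starts with "lgwatch".
lg <- har[grepl("^lgwatch", har$Device), ]
```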

You should develop at least four models and evaluate those four models by creating confusion matrices and calculating the misclassification rates for each; your report should present the performance evaluations and explain why you have chosen the model you selected.
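For any one of the candidate models, the confusion matrix and misclassification rate can be computed in base R roughly as follows. The activity labels and the predicted values here are made up for illustration; in practice `predicted` would come from applying a fitted model to the Test subjects.

```r
# Hypothetical true and predicted labels for 8 test rows.
lv <- c("sit", "stand", "walk")
actual    <- factor(c("walk", "walk", "sit", "sit", "stand", "stand", "walk", "sit"),
                    levels = lv)
predicted <- factor(c("walk", "sit", "sit", "sit", "stand", "walk", "walk", "sit"),
                    levels = lv)

# Confusion matrix: rows are predictions, columns are true activities.
conf_mat <- table(Predicted = predicted, Actual = actual)

# Misclassification rate: share of rows off the diagonal.
misclass_rate <- 1 - sum(diag(conf_mat)) / sum(conf_mat)
```

Giving both factors the same levels keeps the table square, so `diag()` lines up correct predictions even when a model never predicts one of the classes. Repeating this for each of the four models gives the rates to compare.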

Finally, identify the activities that you believe the individual was engaging in, and indicate which of the 311 rows of the new data table correspond to each activity (e.g. "rows 100-150 indicate that the subject was dancing").
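One way to turn row-by-row predictions into row ranges like the example above is run-length encoding. The sketch below uses a short made-up prediction vector `pred` in place of the 311 predictions a chosen model would produce for the new data.

```r
# Hypothetical per-row predictions, in time order.
pred <- c(rep("sit", 3), rep("walk", 4), rep("sit", 2))

# Collapse consecutive identical predictions into runs.
runs   <- rle(pred)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

segments <- data.frame(
  activity  = runs$values,
  first_row = starts,
  last_row  = ends
)
```

Each row of `segments` then reads as "rows first_row to last_row indicate this activity". Very short runs in the real output may be misclassifications rather than genuine activity changes, so they deserve a second look before being reported.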

Deliverable: The R Markdown document should be a technical document that includes the following. (1) The relevant R code that you used, with comments and prose used liberally to annotate both the code and its output. (2) Selected graphs, tables, or statistics summarizing the models you investigated and supporting your final choice, including confusion matrices and a comparison of misclassification rates. (3) Output that shows how you used the model to classify the 311 "new" observations.
