Created an R script called run_analysis.R that does the following:
- Merges the training and the test sets to create one data set.
- Extracts only the measurements on the mean and standard deviation for each measurement.
- Uses descriptive activity names to name the activities in the data set.
- Appropriately labels the data set with descriptive variable names.
- From the data set in step 4, creates a second, independent tidy data set with the average of each variable for each activity and each subject.
Steps:
- Set the local working directory, using the "setwd" command, to the folder that contains the UCI HAR Dataset.
- Install the required packages (data.table, plyr, reshape2).
- Use "read.table" to read the training and test data.
- Merge the files read in the previous step into one data set. The columns are subject, activity (numeric code), and the associated measurement data for each subject and activity.
- Name each column of the data set, using the features data set as the source of the column names for the measurement data.
- Extract the mean and standard deviation measurements from the data set.
- Convert the activity numbers to activity labels.
- Summarize the data by subject and activity, calculating the mean of each variable for every subject and activity pair.
- Write this summary data to a text file (tidy.txt).
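
The steps above can be sketched roughly as follows. This is an illustrative outline, not the contents of run_analysis.R itself. It assumes the standard UCI HAR file layout (features.txt, activity_labels.txt, and train/ and test/ folders holding the subject, y, and X files), takes the data folder as a `data_dir` argument instead of calling "setwd", and uses base R's `aggregate` in place of the plyr/reshape2 packages listed above. The function name, its arguments, and the choice to keep only `mean()` and `std()` features (excluding `meanFreq()`) are assumptions made for this example.

```r
# Illustrative sketch only; names and layout are assumptions, see the text above.
run_analysis <- function(data_dir, out_file = "tidy.txt") {
  # Read the same file (e.g. "X") from train/ and test/ and stack the rows.
  read_both <- function(prefix) {
    read_one <- function(split) {
      read.table(file.path(data_dir, split, paste0(prefix, "_", split, ".txt")))
    }
    rbind(read_one("train"), read_one("test"))
  }

  # Merge: subject | activity (numeric code) | measurements.
  data <- cbind(read_both("subject"), read_both("y"), read_both("X"))

  # Name the columns, taking measurement names from features.txt (column 2).
  features <- read.table(file.path(data_dir, "features.txt"),
                         stringsAsFactors = FALSE)[[2]]
  names(data) <- c("subject", "activity", features)

  # Keep subject, activity, and the mean()/std() measurements only
  # (assumption: meanFreq() columns are left out).
  keep <- grepl("mean\\(\\)|std\\(\\)", features)
  data <- data[c(TRUE, TRUE, keep)]

  # Convert activity codes to descriptive labels from activity_labels.txt.
  labels <- read.table(file.path(data_dir, "activity_labels.txt"),
                       stringsAsFactors = FALSE)
  data$activity <- factor(data$activity, levels = labels[[1]], labels = labels[[2]])

  # Average every remaining variable for each subject and activity pair.
  tidy <- aggregate(data[-(1:2)],
                    by = list(subject = data$subject, activity = data$activity),
                    FUN = mean)

  # Write the summary to a text file and return it invisibly.
  write.table(tidy, out_file, row.names = FALSE)
  invisible(tidy)
}
```

Under these assumptions, calling `run_analysis("UCI HAR Dataset")` from the folder that holds the data would write tidy.txt to the working directory, with one row per subject and activity pair.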