(by E. Valencia, May 2014)
The script run_analysis.R creates 2 tidy datasets from the inertial data files.
To run this script, the uncompressed 'UCI HAR Dataset' folder must be in the same folder as the script.
The output consists of two new tidy datasets, written to separate tab-separated text files named 'dat.txt' and 'dat2.txt', which correspond to the datasets required in steps 4 and 5 of the project.
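As a quick check of the layout described above, the following sketch tests that the script and the data folder sit side by side before running anything. The helper name ready_to_run is hypothetical and not part of run_analysis.R, and the sketch assumes the script writes its output files to the working directory.

```r
# Hypothetical helper (not part of run_analysis.R): TRUE when both the script
# and the uncompressed data folder are found in `dir`.
ready_to_run <- function(dir = ".") {
  file.exists(file.path(dir, "run_analysis.R")) &&
    dir.exists(file.path(dir, "UCI HAR Dataset"))
}

if (ready_to_run()) {
  source("run_analysis.R")
  # Assumption: the outputs land in the working directory.
  print(file.exists(c("dat.txt", "dat2.txt")))
} else {
  message("Place run_analysis.R next to the 'UCI HAR Dataset' folder first.")
}
```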
The script performs the following actions, in order, to produce these datasets:
- Read data files from test observations: X_test.txt (data), subject_test.txt (subject number for each measurement), and y_test.txt (activity number for each measurement).
- Create a 'test' data frame by combining the three data pieces from the previous step.
- Read data files from train observations: X_train.txt (data), subject_train.txt (subject number for each measurement), and y_train.txt (activity number for each measurement).
- Create a 'train' data frame by combining the three data pieces from the previous step.
- Combine the train and test data frames into a single data frame.
- Read the 'features.txt' file to obtain the name of each measurement, and use these names to name the data frame columns.
- Keep only the columns that have 'mean()' or 'std()' in their names.
- Read the 'activity_labels.txt' file, and replace each activity number with its descriptive name (i.e. text).
- Write this first tidy dataset into a tab-separated text file named 'dat.txt'.
- Split the data frame by activities and subjects.
- Compute the average of each measurement (i.e. column) within each group, and combine the resulting rows into a data frame.
- Write the final data frame into another tab-separated text file named 'dat2.txt'.
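The steps above can be sketched on a few rows of made-up data. This is an illustration only, not the code in run_analysis.R: the toy values, the V1..V3 column names, the activity labels table, and the choice of tempdir() as output location are all assumptions. It also uses aggregate() in place of an explicit split-and-average, and row.names = FALSE is a guess about how the files are written.

```r
# Toy stand-ins for features.txt and activity_labels.txt (made-up values).
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-meanFreq()-X")
activity_labels <- data.frame(id = 1:2, name = c("WALKING", "SITTING"),
                              stringsAsFactors = FALSE)

# Test set: subject, activity and measurements combined column-wise.
x_test       <- data.frame(V1 = c(0.1, 0.3), V2 = c(1, 2), V3 = c(5, 6))
subject_test <- data.frame(subject = c(1, 1))
y_test       <- data.frame(activity = c(1, 2))
test <- cbind(subject_test, y_test, x_test)

# Train set: same layout.
x_train       <- data.frame(V1 = c(0.2, 0.4, 0.6), V2 = c(3, 5, 7), V3 = c(7, 8, 9))
subject_train <- data.frame(subject = c(2, 2, 2))
y_train       <- data.frame(activity = c(1, 1, 2))
train <- cbind(subject_train, y_train, x_train)

# Stack both sets and name the measurement columns after the features.
all_data <- rbind(train, test)
names(all_data) <- c("subject", "activity", features)

# Keep only mean() and std() columns; fixed = TRUE matches the parentheses
# literally, so 'meanFreq()' is not selected.
keep <- grepl("mean()", names(all_data), fixed = TRUE) |
  grepl("std()", names(all_data), fixed = TRUE)
dat <- all_data[, c("subject", "activity", names(all_data)[keep])]

# Replace activity numbers with their text labels.
dat$activity <- activity_labels$name[match(dat$activity, activity_labels$id)]

# First tidy dataset (written to tempdir() here to avoid clobbering real files).
write.table(dat, file.path(tempdir(), "dat.txt"), sep = "\t", row.names = FALSE)

# Average every measurement per subject/activity pair.
dat2 <- aggregate(dat[, -(1:2)],
                  by = list(subject = dat$subject, activity = dat$activity),
                  FUN = mean)

# Second tidy dataset.
write.table(dat2, file.path(tempdir(), "dat2.txt"), sep = "\t", row.names = FALSE)
```

With the toy input, dat keeps one row per observation, while dat2 has one row per subject/activity pair.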