HighTechnocracy/GCD_CourseProject

#GCD_CourseProject

The files to complete the Course Project for Coursera's Getting and Cleaning Data course.

The file tidyData contains the means and standard deviations of several measurements taken from 30 subjects as they performed six different activities (walking, walking upstairs, walking downstairs, sitting, standing and laying). The original data was recorded using the accelerometer and gyroscope in a Samsung Galaxy S II smartphone attached to each subject's waist. The original data was transformed and cleaned using math I never could hope to understand, but I'm pretty sure very long equations with fancy names were involved. This data was split across several .txt files (probably to be used as tables in some database somewhere) and placed online at:

http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones

The data was broken into "test" (30% of subjects) and "train" (70% of subjects) sets.

The R file "runanalysis.R" performs the following steps.

  1. Reads in the subject labels from the test and train datasets and combines them (using the rbind() function). It labels the vectors and removes the original variables.
  2. Reads in the activity labels from the test and train datasets and combines them. It labels the new variable and removes the originals.
  3. Reads in the test and train datasets as well as the table with the original variable names. It combines the two datasets with the subject and activity labels to form one large dataset and names the variables. It then removes the original objects.
  4. Coerces the first two variables ("subject" and "activity") into factor variables, subsets out the variables with either "-mean" or "-std" in the name and combines these into a new (smaller) dataset.
  5. Using two for-loops, calculates the mean of each of the "mean" and "std" variables for every subject and activity, and stores the results in a new data.frame called "tidyData".
  6. Renames the variables with descriptive titles, renames the activities with descriptive titles, coerces "activity" and "subject" back into factors and removes all extraneous variables.
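
The steps above can be sketched on a small made-up dataset. This is only an illustration, not the contents of runanalysis.R: the helper `make_set()`, the three feature names, the two activity codes and their labels, and the object name `tidy_sketch` are all assumptions invented for the example. It also uses base R's `aggregate()` in place of the two for-loops described in step 5.

```r
# Synthetic stand-ins for the files the script reads (all values are made up).
set.seed(1)
feature_names <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")

# Hypothetical helper: four rows per subject, two rows for each of two activities.
make_set <- function(subjects) {
  n <- length(subjects) * 4
  list(
    subject  = data.frame(V1 = rep(subjects, each = 4)),
    activity = data.frame(V1 = rep(1:2, times = length(subjects) * 2)),
    x        = as.data.frame(matrix(rnorm(n * 3), ncol = 3))
  )
}
test  <- make_set(1:2)
train <- make_set(3:4)

# Steps 1-3: stack test and train rows, name everything, bind into one dataset.
subject      <- rbind(test$subject, train$subject)
activity     <- rbind(test$activity, train$activity)
measurements <- rbind(test$x, train$x)
names(subject)      <- "subject"
names(activity)     <- "activity"
names(measurements) <- feature_names
full <- cbind(subject, activity, measurements)

# Step 4: keep the label columns plus the "-mean" / "-std" measurements.
measure_cols <- names(full)[grepl("-mean|-std", names(full))]
small <- full[, c("subject", "activity", measure_cols)]

# Step 5: average each kept measurement per subject/activity pair.
tidy_sketch <- aggregate(
  small[measure_cols],
  by  = list(subject = small$subject, activity = small$activity),
  FUN = mean
)

# Step 6: descriptive activity labels (illustrative mapping), factors restored.
tidy_sketch$activity <- factor(tidy_sketch$activity, levels = 1:2,
                               labels = c("walking", "sitting"))
tidy_sketch$subject  <- factor(tidy_sketch$subject)
```

With four made-up subjects and two activities, `tidy_sketch` has one row per subject/activity pair, the same shape rule that gives the real tidyData one row for each of its subject and activity combinations.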

The tidyData data.frame has 180 observations (30 subjects performing each of 6 activities). Each observation contains 81 variables. Two are the subject and activity labels; the remaining 79 are the calculated averages of the mean and standard deviation variables for each reading. Acceleration variables are in standard gravity units and gyroscope measures are in radians/second. The variable names follow:

##CODEBOOK

  1. tBodyAcc-mean()-X
  2. tBodyAcc-mean()-Y
  3. tBodyAcc-mean()-Z
  4. tBodyAcc-std()-X
  5. tBodyAcc-std()-Y
  6. tBodyAcc-std()-Z
  7. tGravityAcc-mean()-X
  8. tGravityAcc-mean()-Y
  9. tGravityAcc-mean()-Z
  10. tGravityAcc-std()-X
  11. tGravityAcc-std()-Y
  12. tGravityAcc-std()-Z
  13. tBodyAccJerk-mean()-X
  14. tBodyAccJerk-mean()-Y
  15. tBodyAccJerk-mean()-Z
  16. tBodyAccJerk-std()-X
  17. tBodyAccJerk-std()-Y
  18. tBodyAccJerk-std()-Z
  19. tBodyGyro-mean()-X
  20. tBodyGyro-mean()-Y
  21. tBodyGyro-mean()-Z
  22. tBodyGyro-std()-X
  23. tBodyGyro-std()-Y
  24. tBodyGyro-std()-Z
  25. tBodyGyroJerk-mean()-X
  26. tBodyGyroJerk-mean()-Y
  27. tBodyGyroJerk-mean()-Z
  28. tBodyGyroJerk-std()-X
  29. tBodyGyroJerk-std()-Y
  30. tBodyGyroJerk-std()-Z
  31. tBodyAccMag-mean()
  32. tBodyAccMag-std()
  33. tGravityAccMag-mean()
  34. tGravityAccMag-std()
  35. tBodyAccJerkMag-mean()
  36. tBodyAccJerkMag-std()
  37. tBodyGyroMag-mean()
  38. tBodyGyroMag-std()
  39. tBodyGyroJerkMag-mean()
  40. tBodyGyroJerkMag-std()
  41. fBodyAcc-mean()-X
  42. fBodyAcc-mean()-Y
  43. fBodyAcc-mean()-Z
  44. fBodyAccJerk-mean()-X
  45. fBodyAccJerk-mean()-Y
  46. fBodyAccJerk-mean()-Z
  47. fBodyAccJerk-std()-X
  48. fBodyAccJerk-std()-Y
  49. fBodyAccJerk-std()-Z
  50. fBodyAccJerk-meanFreq()-X
  51. fBodyAccJerk-meanFreq()-Y
  52. fBodyAccJerk-meanFreq()-Z
  53. fBodyGyro-mean()-X
  54. fBodyGyro-mean()-Y
  55. fBodyGyro-mean()-Z
  56. fBodyGyro-std()-X
  57. fBodyGyro-std()-Y
  58. fBodyGyro-std()-Z
  59. fBodyGyro-meanFreq()-X
  60. fBodyGyro-meanFreq()-Y
  61. fBodyGyro-meanFreq()-Z
  62. fBodyAccMag-mean()
  63. fBodyAccMag-std()
  64. fBodyAccMag-meanFreq()
  65. fBodyBodyAccJerkMag-mean()
  66. fBodyBodyAccJerkMag-std()
  67. fBodyBodyAccJerkMag-meanFreq()
  68. fBodyBodyGyroMag-mean()
  69. fBodyBodyGyroMag-std()
  70. fBodyBodyGyroMag-meanFreq()
  71. fBodyBodyGyroJerkMag-mean()
  72. fBodyBodyGyroJerkMag-std()
  73. fBodyBodyGyroJerkMag-meanFreq()
  74. angle(tBodyAccJerkMean),gravityMean)
  75. angle(tBodyGyroMean,gravityMean)
  76. angle(tBodyGyroJerkMean,gravityMean)
  77. angle(X,gravityMean)
  78. angle(Y,gravityMean)
  79. angle(Z,gravityMean)

License:

Original dataset used with permission by reference to the following citation:

[1] Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz. Human Activity Recognition on Smartphones using a Multiclass Hardware-Friendly Support Vector Machine. International Workshop of Ambient Assisted Living (IWAAL 2012). Vitoria-Gasteiz, Spain. Dec 2012

This dataset is distributed AS-IS and no responsibility implied or explicit can be addressed to the authors or their institutions for its use or misuse. Any commercial use is prohibited.

Jorge L. Reyes-Ortiz, Alessandro Ghio, Luca Oneto, Davide Anguita. November 2012.
