getdata's Introduction

After reading the data and the descriptive labels, I used sapply to label the data.

xtest = read.csv("test/X_test.txt", header=FALSE, sep="")
xtrain = read.csv("train/X_train.txt", header=FALSE, sep="")

ytest = read.csv("test/y_test.txt", col.names=c("activity"), header=FALSE, sep="")
ytrain = read.csv("train/y_train.txt", col.names=c("activity"), header=FALSE, sep="")

subjecttrain = read.csv("train/subject_train.txt", col.names=c("subject"), header=FALSE, sep="")
subjecttest = read.csv("test/subject_test.txt", col.names=c("subject"), header=FALSE, sep="")

xall = rbind(xtrain, xtest)
yall = rbind(ytrain, ytest)
subjectall = rbind(subjecttrain, subjecttest)
#as the extraction simplifies the data set, I decided to merge test and train before adding the subject and the activity to them

#2 Extracts only the measurements on the mean and standard deviation for each measurement.

features = read.csv("features.txt", header=FALSE, sep="")
names(features) = c("id", "feature")

vectExtract = grep("mean\\(\\)|std\\(\\)", features$feature)
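The escaped parentheses in the pattern matter: they restrict the match to names that contain the literal text mean() or std(). Below is a minimal sketch on a few made-up feature names written in the style of features.txt (the vector featureNames and the variables strictKeep and looseKeep are hypothetical, introduced only for illustration).

```r
# Illustrative feature names in the style of features.txt (not read from the file).
featureNames <- c("tBodyAcc-mean()-X",
                  "tBodyAcc-std()-X",
                  "fBodyAcc-meanFreq()-X",
                  "tBodyAcc-max()-X")

# Escaped parentheses: only names containing the literal "mean()" or "std()".
strictKeep <- grep("mean\\(\\)|std\\(\\)", featureNames)

# Without the parentheses the pattern also picks up "meanFreq()".
looseKeep <- grep("mean|std", featureNames)

print(featureNames[strictKeep])
print(featureNames[looseKeep])
```

In an R string literal each backslash has to be doubled, so the regular expression `mean\(\)` is written as "mean\\(\\)".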

library("dplyr")
xallExtract = select(xall, one_of(paste0("V", as.character(vectExtract))))

#3 Uses descriptive activity names to name the activities in the data set

activity_labels = read.csv("activity_labels.txt", col.names=c("id","label"), header=FALSE, sep="")
yall$activity = sapply(yall$activity, function(x){as.character(activity_labels$label[x])})
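The sapply call above looks up each activity id as a row position in activity_labels. As a hedged sketch on toy data (the data frame labels, the vector ids, and their values are made up for illustration), the same lookup can be done with plain vectorised indexing, or with match() if the ids are not guaranteed to equal the row positions.

```r
# Toy stand-ins for activity_labels and yall$activity (values are illustrative).
labels <- data.frame(id = 1:3,
                     label = c("WALKING", "SITTING", "LAYING"),
                     stringsAsFactors = FALSE)
ids <- c(2, 1, 3, 2)

# Element-by-element lookup, as in the script above.
viaSapply <- sapply(ids, function(x) as.character(labels$label[x]))

# Vectorised indexing: relies on id i sitting in row i, like the sapply version.
viaIndex <- as.character(labels$label[ids])

# match() looks the id up explicitly, so it does not depend on row order.
viaMatch <- as.character(labels$label[match(ids, labels$id)])

print(viaSapply)
```

All three give the same result here; match() is the safer choice only when the label table might be reordered or have gaps in its ids.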

xallExtract$activity = yall$activity
xallExtract$subject = subjectall$subject

#4 Appropriately labels the data set with descriptive variable names.

#as the feature labels contain the characters ( ) - and , I remove them using the gsub function.
names(xallExtract) = c(gsub("[-(),]", "", grep("mean\\(\\)|std\\(\\)", features$feature, value=TRUE)), "activity", "subject")

#5 From the data set in step 4, creates a second, independent tidy data set with the average of each variable for each activity and each subject.

tidySet = summarise_each(group_by(xallExtract, activity,subject),funs(mean), vars=-c(activity,subject))

Change the names of the tidy set to reflect the mean computation

names(tidySet)[3:length(tidySet)] = paste(names(tidySet)[3:length(tidySet)], "mean", sep=".")
write.table(tidySet, file="tidyset.txt", row.names=FALSE)
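summarise_each and funs are deprecated in recent dplyr releases. The sketch below shows how the per-activity, per-subject averaging and the renaming step could be written with across(), assuming dplyr 1.0 or later is installed. The data frame toy and its values are made up for illustration and stand in for xallExtract.

```r
library(dplyr)

# Toy stand-in for xallExtract: two measurements plus the grouping columns.
toy <- data.frame(
  tBodyAccmeanX = c(1, 3, 5, 7),
  tBodyAccstdX  = c(2, 4, 6, 8),
  activity      = c("WALKING", "WALKING", "SITTING", "SITTING"),
  subject       = c(1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Average every non-grouping column within each activity/subject pair.
toyTidy <- toy %>%
  group_by(activity, subject) %>%
  summarise(across(everything(), mean), .groups = "drop")

# Same renaming as in the script: append ".mean" to the measurement columns.
names(toyTidy)[3:length(toyTidy)] <- paste(names(toyTidy)[3:length(toyTidy)], "mean", sep = ".")

print(toyTidy)
```

Inside summarise, across(everything(), ...) skips the grouping columns, so activity and subject do not need to be excluded by hand.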
