ctesta01 / qualtricstools

Using R, Shiny, Pandoc, JSON, CSVs and more to automate processing Qualtrics surveys


qualtricstools's Introduction

QualtricsTools

This is no longer the main repository for the QualtricsTools project. Check out the continued work on this project in the emmamorgan-tufts/QualtricsTools repository.

QualtricsTools is an R package that automatically processes Qualtrics survey data into reports breaking down the responses to each question. The package creates reports that summarize the results of closed-ended questions, compiles appendices of open-ended text responses, and generates question dictionaries that describe the details of each survey question. It can also generate reports for subsets of respondents based on their response data. This package uses the R web-application framework Shiny, the universal document converter Pandoc, Roxygen2 documentation, and much more.

This package was developed for Tufts University's Office of Institutional Research and Evaluation. Anyone is welcome to use it.

Installing and Running the Shiny App

Before installing, you must install R (RStudio is optional), devtools, Rtools (if you're on Windows), and Pandoc. On Windows, please make sure Rtools is added to the Path environment variable. If you have not already installed the devtools package, run install.packages('devtools') in R. After installing each of the prerequisites, run the following in R to install QualtricsTools:

devtools::install_github("ctesta01/QualtricsTools")

The QualtricsTools package includes a suite of functions to help you analyze Qualtrics data in R. Most of the package can be used on the command line in R. However, the simplest way to create basic reports of your Qualtrics data is to use the QualtricsTools Shiny app. The app includes an interactive user-friendly interface that lets you select your survey and data file, and generate reports (frequencies for closed-ended questions and compiled text responses for open-ended questions) for the entire respondent group and/or subgroups.

To run the Shiny app, load the package and then call the app() function.

library(QualtricsTools)
app()

The QualtricsTools Shiny app should now be running! Enjoy. To update your version of QualtricsTools to the most recent version, run devtools::install_github("ctesta01/QualtricsTools") again.

Most Useful Functions

Here are some of the highest-level functions in the application. Be sure to run library(QualtricsTools) before trying any of these. Each function takes a series of parameters (e.g. survey .qsf file, response .csv file, output directory). Running these commands without parameters, as shown in the code below, results in interactive prompts for the survey data and other settings, which cuts down on the need to repeatedly type or copy long file paths. For more details about each of these functions and their arguments, check out their documentation: get_setup, make_results_tables, and make_text_appendices.

# Load and Process Survey Data into R
get_setup()

# Start and run the Shiny app
app()

# Create a Report of Question Results Tables
make_results_tables()

# Create a Report of Text Appendices, for each free response part of the survey
make_text_appendices()

Usage and Reference Guides

The functionality of the web application and R package is documented in the following guides. Beyond this, almost all functions have Roxygen2-generated documentation, which means that after running library(QualtricsTools) you can run help(function) or ?function on any function in QualtricsTools to read it.

Frequently Asked Questions

Check out our FAQ for more help.

qualtricstools's People

Forkers

emma-morgan jasonpcasey svancomstock

qualtricstools's Issues

Response Col's Choice Text in Reshaped Responses

When somebody reshapes data to a long-and-lean format using lean_responses(), they have the option to include what we call "panel data". This is our terminology for response columns that should be included with every response, as if they described the respondent rather than recorded the respondent's answer to a question.

When that data gets reshaped, if a response column included as panel data corresponds to a specific choice, that choice's text needs to be included as part of the column name above the "panel data" columns.
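
As a rough illustration of the idea (not the actual lean_responses() implementation), the base-R sketch below stacks question columns into long format, repeats a panel column on every row, and labels that panel column with its choice text. The data frame, the column names, and the choice_text lookup are all hypothetical.

```r
# Hypothetical wide response data: one row per respondent.
responses <- data.frame(
  ResponseID = c("R_1", "R_2"),
  Q5_3 = c(1, NA),   # a multiple-answer column, carried along as panel data
  Q2 = c(3, 4),
  Q3 = c(5, 1),
  stringsAsFactors = FALSE
)

# Hypothetical lookup from response column to the choice it represents.
choice_text <- c(Q5_3 = "Email")

panel_cols <- "Q5_3"
question_cols <- c("Q2", "Q3")

# Stack each question column, repeating the panel columns on every row.
lean <- do.call(rbind, lapply(question_cols, function(col) {
  data.frame(
    responses[c("ResponseID", panel_cols)],
    question = col,
    response = responses[[col]],
    stringsAsFactors = FALSE
  )
}))

# Put the choice text into the panel column's name when one is known.
has_choice <- panel_cols %in% names(choice_text)
labels <- ifelse(has_choice,
                 paste0(panel_cols, ": ", choice_text[panel_cols]),
                 panel_cols)
names(lean)[match(panel_cols, names(lean))] <- labels
```

Panel columns without a known choice keep their original name, so the labelling step is safe to run on any mix of columns.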

Results Tables Document needs a Header

The document that is exported for download (results_tables.docx) should have a header.

It needs to have the following components:

Name of Survey ( with "- [Specific Respondent Group]" if applicable)
Number of Respondents
Date of Generation

This is probably best done only in the docx file for the results_tables, not displayed in the Shiny app.

Ordering of text-appendices and results tables

These parts of the app are both ordered by the survey block sections, but out of order within the blocks.

choice_text_from_question has bugs

Sometimes it mis-categorizes a question type as Multiple Answer (like an "other: please specify" text entry component)
There are multiple question types that it should be able to respond with choice text for, but is failing.

Separating TE in Lean Reshaping

Emma suggested that we might want to split up the qualitative and quantitative responses when exporting sheets / CSVs for Tableau. That would mean separating the columns which are free responses away from the numeric responses in the output CSV data. She suggested this because otherwise the factors can become unwieldy in Tableau.

parameters in split respondents

I think the split respondents reactive blocks don't pass the original_first_rows or flow arguments to the html results functions.

use or don't use flow

since the flow only works on basic surveys, and the dining survey has these randomized survey flow blocks, there really needs to be an option in the app to process the survey without using flow

some functions need more documentation

Unfortunately I'm not a documentation-first writer.

These functions need to be better documented and commented:

split_side_by_sides in reorganizing_survey_data.R
server.R in inst/shiny/
ui.R in inst/shiny/
blocks_header_to_html in helper_functions.R
number_of_blocks in helper_functions.R

response type in qdict is wrong sometimes

Data Subsetting Functionality

Report generation is great, but sometimes a report for only some of the responses is necessary.
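
A minimal sketch of what such subsetting could look like in base R is given here. The subset_responses() helper, the filters list, and the sample data are hypothetical and not part of QualtricsTools; each entry in filters pairs one response column with the responses to keep.

```r
# Hypothetical response data.
responses <- data.frame(
  ResponseID = c("R_1", "R_2", "R_3", "R_4"),
  Q1 = c("Yes", "No", "Yes", "Maybe"),
  Q2 = c("A", "B", "B", "A"),
  stringsAsFactors = FALSE
)

# One entry per response column: the responses that should stay in the subset.
filters <- list(
  Q1 = c("Yes", "Maybe"),
  Q2 = "A"
)

# Hypothetical helper: keep only rows that satisfy every filter.
subset_responses <- function(responses, filters) {
  keep <- rep(TRUE, nrow(responses))
  for (col in names(filters)) {
    keep <- keep & responses[[col]] %in% filters[[col]]
  }
  responses[keep, , drop = FALSE]
}

subsetted <- subset_responses(responses, filters)
```

Combining the filters with a logical AND is one possible choice; an OR across columns would only need a different reduction inside the loop.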

Ideas for how to subset data:

"Subset Responses" button in the File Upload menu
Response Column Dropdown
"Choose which responses should be included in the subset:"
Selectize input for which responses in the response column should be included in the subset data
"Subset with another column" button, appending the above 3 input fields to the subset responses panel again, for use with another response column.

Tabelizing Results Tables' Question Descriptions

The results tables' questions descriptions should be tables, instead of plaintext, so they're easier to format.

What should those tables look like and include?

Question: [DataExportTag]
[Question Text]
[Display Logic]

... what else? question type(s)?

text appendices and results tables need horizontal scrolling

data tables accept an options = list(scrollX = TRUE) argument that enables horizontal scrolling, but for the rest of the Shiny dashboard I haven't found any way to enable horizontal scrolling.

Missing install instruction

Hi,
to be able to install QualtricsTools I need to execute:

devtools::install_github('wleepang/shiny-directory-input')

before I can execute:

devtools::install_github("ctesta01/QualtricsTools")

Maybe you can add that to the tutorial.

Thanks

html_to_docx needs output_file parameter

right now the html_to_docx just returns a filepath to a docx file that's been generated in the computer's temporary folder.

html_to_docx should have the optional ability to move the file after generation to a directory given by a parameter, and if possible, also rename it.
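
One way the move-and-rename step might work is sketched below. The move_output() helper and its output_dir and output_file arguments are hypothetical names chosen for illustration; the sketch does not call html_to_docx itself and uses a placeholder file in place of a generated docx.

```r
# Hypothetical helper: copy a generated file into output_dir, optionally
# renaming it to output_file, then remove the temporary original.
move_output <- function(generated_path, output_dir = NULL, output_file = NULL) {
  if (is.null(output_dir)) return(generated_path)
  if (!dir.exists(output_dir)) dir.create(output_dir, recursive = TRUE)
  if (is.null(output_file)) output_file <- basename(generated_path)
  destination <- file.path(output_dir, output_file)
  # file.copy is used because file.rename can fail across filesystems.
  ok <- file.copy(generated_path, destination, overwrite = TRUE)
  if (!ok) stop("Could not copy ", generated_path, " to ", destination)
  unlink(generated_path)
  destination
}

# Stand-in for a docx generated in the temporary folder.
tmp <- tempfile(fileext = ".docx")
writeLines("placeholder", tmp)

out_dir <- file.path(tempdir(), "reports")
final_path <- move_output(tmp, out_dir, "results_tables.docx")
```

Leaving both arguments optional keeps the current behavior (returning the temporary path) as the default.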

html_2_pandoc needs output_dir

this function really needs an output_dir parameter

it just isn't usable without it

Revamping Logic

Last Wednesday (Feb 8th) Emma and I decided that we should convert the logic into something that resembles the Text Appendices notes on the question.

Idea: Maybe we should rename the display_logic -> Survey Logic.

I am also not sure that the "display_logic" includes everything it should. It needs to include at least: Survey Logic, Skip Logic, Display Logic, Survey Flow, and there's probably more I'm missing. I know that it definitely includes Survey Logic, Skip Logic, and Display Logic, but Flow might not be incorporated right now. That might also be more difficult, because Qualtrics' Flow in the QSF is not very legible.

coding NA-like questions

Sometimes analysts use "-1" as the RecodeValue for an option like "No Opinion" or "None of the Above" in the responses to a question.

The results tables for questions that include these kinds of options should be calculated differently...

multiple choice questions: the denominator should not include the NA-choosing respondents.
matrix questions: "N of those not choosing NA", "Percent", and "Total N" are the minimum columns needed to properly convey the information.
...
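
For the multiple choice case, the calculation could look roughly like the following sketch. The coded vector and the choice of -1 as the only NA-like code are assumptions for illustration.

```r
# Hypothetical coded responses to one multiple choice question;
# -1 is the RecodeValue of a "No Opinion" choice.
coded <- c(1, 2, 2, -1, 3, -1, 2, 1)
na_like <- -1

# Respondents who chose a substantive (non-NA-like) option.
valid <- coded[!is.na(coded) & !(coded %in% na_like)]
counts <- table(valid)

# Percentages use only the valid respondents as the denominator.
results <- data.frame(
  choice = names(counts),
  N = as.vector(counts),
  Percent = round(100 * as.vector(counts) / length(valid), 1),
  stringsAsFactors = FALSE
)

# Total N still counts everyone who answered, including NA-like choices.
total_n <- sum(!is.na(coded))
```

Here six of the eight respondents chose a substantive option, so the percentages are out of six while Total N remains eight.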

don't split side-by-sides before splitting survey

Side-by-Side Questions, my old nemesis.

I believe that in the sequence of functions called by get_setup(), side-by-side questions are split into independent questions which replace the original (for example, a double side-by-side question would be turned into two questions and the original would be removed). I believe that this conflicts with the process for splitting reports across survey respondents, but I am not entirely sure.

"data must not be null" error

The "data must not be null" error appears quite often in the Shiny application. I think it occurs when the generate_results() function, or one of the individual question results generation functions, runs on a question with no responses or completely empty responses.

Will update soon with more details.

Include Original Panel Data Checkbox

In the Reshape Data menu, there should be a checkbox for including all the original panel data from the CSV. How will I determine what the original panel data is?

I think going from the first column up to (but not including) the first column that is recognized as a response column to a question will get me the panel data fairly accurately. I am still ruminating on this.
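
The heuristic described above might be sketched as follows. The sample columns and the response_columns vector (the names already matched to questions) are hypothetical.

```r
# Hypothetical response data with metadata columns before the questions.
responses <- data.frame(
  ResponseID = "R_1", StartDate = "2017-02-08", Finished = 1,
  Q1 = 2, Q2_1 = 1, Q2_2 = NA,
  stringsAsFactors = FALSE
)

# Hypothetical: columns already recognized as responses to questions.
response_columns <- c("Q1", "Q2_1", "Q2_2")

# Position of the first recognized response column (NA if there is none).
first_response <- which(names(responses) %in% response_columns)[1]

# Everything before that position is treated as original panel data.
panel_columns <- if (is.na(first_response)) {
  names(responses)
} else {
  names(responses)[seq_len(first_response - 1)]
}
```

This would miss any metadata columns that Qualtrics places after the question columns, so it is only an approximation.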

More Sample Usage Wiki Pages

How to split respondents by their responses or panel data.
Question Results Generation
How are side-by-side questions handled?
An in-depth walkthrough of the Shiny application sequence of reactive events.
include in How It's Made: Results Tables a section on NA-like questions

numbers aren't rounding up

0.25 is rounding to 0.2 instead of 0.3... This is R's default: round() follows the IEC 60559 rule of rounding halves to the even digit.
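
If rounding halves away from zero is preferred, a small helper along these lines is one option. round_half_up() is a hypothetical name, not an existing QualtricsTools function, and values that are not exactly representable in binary floating point can still round unexpectedly.

```r
# Hypothetical helper: round halves away from zero instead of to even.
round_half_up <- function(x, digits = 0) {
  multiplier <- 10^digits
  sign(x) * floor(abs(x) * multiplier + 0.5) / multiplier
}

round(2.5)               # base R rounds the half to the even digit: 2
round_half_up(2.5)       # 3
round_half_up(0.25, 1)   # 0.3, the case reported in this issue
round_half_up(-0.25, 1)  # -0.3
```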

All / Uncodeable Toggle in Question Dictionary

I think the uncodeable_dictionary function is pretty cool. It helps my development of the application, and in many of the same ways it would help an analyst create a report faster.

The problem is having the uncodeable_dictionary exclusively in R, whereas the analyst might be working in the browser and elsewhere with the Shiny app running.

I think a toggle button between seeing "All Questions" and "Uncodeable Questions" makes a lot of sense for within the Question Dictionary tabPanel in the UI.

Sending Blocks and Responses as args to lean_responses

the lean_responses() function uses the blocks and responses and automatically gets them from global scope.

Ideally, we should be able to pass them inline so that somebody with a lot of different surveys they wanted lean_responses for could generate lean_responses for each of them effectively.

Legacy to Insight: Pluralize orig_header_row -> orig_header_rows

This doesn't necessarily need to be done, it would just be nice style.

When I was originally writing code for Legacy, there was only one extra header row in the CSV data. Therefore, when I stripped it out from the CSV response data, I called it the "original header row" or some variant of that. Now, with "Qualtrics Insights," there are three header rows, and this pluralization should be reflected in the codebase. However, because I only had to transition from handling a 2xN to a 3xN dataframe, and there was no change of data structure in that update, I think the problem is mostly aesthetic.

add block headers

block headers need to be added so that the blocks can be differentiated.

matrix_multiple_answer_get_results makes assumptions about ordering

the matrix_multiple_answer_get_results function assumes that the response columns matched to the question will be ordered such that the columns can be taken in sets with however many choices there are in the question and then laid on top of one another to create the original data frame.

If the columns are not exactly in order, this will not work. At all.

Re-ordering and matching to specific sub-question and choices needs to be implemented, but later.
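
A sketch of the re-ordering step follows, under the loud assumption that response columns are named <question>_<subquestion>_<choice>. Real Qualtrics exports may name columns differently, in which case the matching would need to use the survey's own sub-question and choice metadata.

```r
# Hypothetical column names, deliberately out of order.
cols <- c("Q4_2_1", "Q4_1_2", "Q4_1_1", "Q4_2_2")

# Split each name into question, sub-question, and choice parts.
parts <- do.call(rbind, strsplit(cols, "_", fixed = TRUE))
subquestion <- as.integer(parts[, 2])
choice <- as.integer(parts[, 3])

# Order by sub-question, then by choice within each sub-question.
ordering <- order(subquestion, choice)
ordered_cols <- cols[ordering]

# Group the ordered columns into one set per sub-question.
by_subquestion <- split(ordered_cols, subquestion[ordering])
```

Once the columns are grouped explicitly like this, the sets no longer depend on the order in which they appear in the CSV.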

Exclude/Include Specific Question(s) Capability

Perhaps only in the library of functions, perhaps in the Shiny app as well, excluding or including a specific question could be useful to an analyst that wants to control the output of the application. It would be easier to hit a couple buttons in the Shiny app than it would be to delete an overwhelmingly long output to a question that wasn't even originally wanted...

Some ideas on how to do this visually:

A popup with input, where the user can choose each question they want to define, and then for each question there is a dropdown for "Exclude/Include". (It may be easier if all the questions come up, with "Exclude All" and "Include All" buttons at the top for ease of use.)
A selectize input for each question, with an additional dropdown applying to that input for a choice between their selection representing the "Exclude only the following questions" choice or the "Include only the following questions" choice.

Include Notes in Question Description

The reports' question descriptions need to include, at the bottom, any user-included survey notes.

sometimes text appendices have extra lines

Every text appendix seems to have these extra lines...

Not sure what happened here.

Shiny app Include/Exclude not working for side-by-side questions

The Shiny app's Include/Exclude feature appears not to work for side-by-side questions. I'm not sure if this is a major issue right now, and I will probably wait until there is a need before fixing it.

choice text for question-component text entries

Text appendices that are an appendix for a part of a question (as opposed to an only text entry question) need to include the choice text from the question that they correspond to in their "question text" header row.

SBS Question Result Generation

Of the question types not supported, I heard multiple individuals express that Side-By-Side questions were particularly important. Building out support for SBS questions should be a goal, even though they pose some challenging problems.

"See Appendix" message for choices with text entry

Choices need to be appended with "See Appendix" if they have a text entry component

text_appendices_tables needs rows, not row

original_first_row gets passed to text_appendices_table... that should be rows instead.

the import tags in the third row contain the string "TEXT" more reliably and accurately.

additionally, add a should_use_ofr value and code block in the text_appendices_table

More testing needs to be written

A good order to start in testing:

Skip Logic and Survey Flow need some love

Right now only display logic, the simplest to parse of the three, is included in the output of the Shiny app or any part of the library. Output of all three should be made accessible.

Code Review, Please

Please Review My Documentation and Code

If you have some time to invest in the QualtricsTools project, I would appreciate your help reviewing some of the documentation, code, or guides that I have written.
I am including a sample of suggested material that I would love some feedback on. Please pick a single item, or a couple if you'd like, and let me know how I could improve them.

Specific questions you might keep in mind:

Does the documentation explain why you might need that function/parameter/item?
Do you understand why we need the function, code, logic, or web-app-component that you're reviewing?
Do you see relevant functionalities that are not explained/mentioned somewhere they should be?
Is the code clear, readable, concise, etc?

Extra bonus points:

Could the code be written such that it runs faster?
Can the code be simplified/improved/reduced?

Documentation:

Easier:

The QualtricsTools package README.md
- The README in a package is the recommended first location that new users look when viewing the package on GitHub or downloading it for the first time. It should be concise and informative, with answers to the most easily anticipated questions someone might have downloading the package for the first time.
Frequently Asked Questions
Generating Results Tables
Generating Text Appendices
The Shiny App : Explaining the App Components
- For an easy task, just look at the first section of this page. The "Understanding the Code" section is much more technical, and listed below as a harder task.
Appendix of Qualtrics Terms

Harder:

Code:

Easier (Still Subtle, though):

Harder:

Lean Reshaping cuts off comments

Even though the text appendices don't seem to be cut off, it seems like the contents of the text entry responses are cut off after using the Lean Reshaping script / guide.

Emma suggests that it has 'something to do with commas'...

non-blank factors only for splitting_respondents

When you run the splitting_respondents operation on a survey and respondent set, the factors that the respondents could be split into often include a "" factor, even when the data doesn't.
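
Two base-R ways to keep a "" level out of the split factors are sketched below with hypothetical data; neither is taken from the package's code.

```r
# Case 1: "" is a declared level but never occurs in the data.
split_factor <- factor(c("Yes", "No", "Yes"), levels = c("", "No", "Yes"))
used_levels <- levels(droplevels(split_factor))

# Case 2: blank or missing responses occur and should not become a group.
split_values <- c("Yes", "", "No", NA)
non_blank <- split_values[!is.na(split_values) & nzchar(split_values)]
split_levels <- levels(factor(non_blank))
```

droplevels() only removes levels with no observations, so the second filter is still needed when empty strings are actually present in the response column.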

Add "install perl" Instructions

Perl is requisite for the readxl package, which is depended on in the Tableau reshaping guide. Also, installing Perl on Windows is not the easiest. Similar to getting the Rtools path into the Windows Path environment variable, the user needs to make sure that Perl is also in the Path variable.

Steve's Issue, Appendices not appearing

For a particular survey, a couple of single-answer multiple choice (radio button) questions that had multiple options for text entry produced an error (“question could not be processed automatically because the CSV data does not separate the responses for each text entry component of this question”). And as a result, the text appendices actually start at Appendix V. If you go into Include/Exclude, and just exclude the two problematic questions, the rest of the appendix appears.

all assumed defaults need to be changed for Insights

There are a bunch of defaults in the library, some of which are specific to pre-Insights Qualtrics. Moving to Insights-ready defaults is medium priority.

Some of these defaults are built into the program's conditional logic (for example, which if-case is checked first), and some are actual default values used when the user does not specify them.

For now, this issue will serve as a list of things to be updated later.

headerrows = 2 in the csv loading process
headerrows = 2 in the respondent splitting
headerrows default input in shiny app
headerrows = 2 in the get_setup() function
explanation of headerrows in Splitting Respondents wiki page
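
To show what the headerrows setting controls, here is a small sketch that reads a CSV and separates the extra header rows from the responses. The load_responses() function and the inline CSV are hypothetical, with three header rows standing in for an Insights export and two for a Legacy one.

```r
# Hypothetical Insights-style CSV: three header rows, then responses.
insights_csv <- c(
  "ResponseID,Q1,Q2",
  "Response ID,How satisfied are you?,Any comments?",
  "responseId,QID1,QID2_TEXT",
  "R_1,4,Great",
  "R_2,2,"
)

# Hypothetical loader with an Insights-ready default of headerrows = 3.
load_responses <- function(lines, headerrows = 3) {
  raw <- read.csv(text = lines, header = TRUE, stringsAsFactors = FALSE,
                  check.names = FALSE)
  # The first header row becomes the column names; the rest are data rows
  # that need to be split off from the actual responses.
  is_header <- seq_len(nrow(raw)) <= headerrows - 1
  list(
    original_first_rows = raw[is_header, , drop = FALSE],
    responses = raw[!is_header, , drop = FALSE]
  )
}

insights <- load_responses(insights_csv)
legacy <- load_responses(insights_csv[-3], headerrows = 2)
```

Because the extra header rows are read as data, every column comes back as character, so coded responses would need converting afterwards.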

Improving lean_response Reshaping

The files are too big. To reduce file-size, we will change to exporting three separate data frames that can be merged together as needed. The three dataframes described below by their columns will be exported to an Excel workbook with 3 worksheets.

lean responses

respondent ID
question response column
raw response
coded response

question dictionary

question data export tag
question response column
question text
question type 1
question type 2
question type 3
response type

panel data

respondent ID
optional: original panel data columns, included verbatim from response file
optional: questions -> panel data columns, with a raw response and coded response column
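
The three data frames could be recombined as needed with merge(), as in this sketch. All column names and values here are hypothetical stand-ins for the columns listed above, and writing the frames to an Excel workbook is left out because it needs an additional package.

```r
# Hypothetical lean responses: one row per respondent per response column.
lean_responses <- data.frame(
  respondent_id = c("R_1", "R_1", "R_2"),
  response_column = c("Q1", "Q2", "Q1"),
  raw_response = c("Agree", "Yes", "Disagree"),
  coded_response = c(4, 1, 2),
  stringsAsFactors = FALSE
)

# Hypothetical question dictionary: one row per response column.
question_dictionary <- data.frame(
  data_export_tag = c("Q1", "Q2"),
  response_column = c("Q1", "Q2"),
  question_text = c("I enjoyed the course.", "Would you recommend it?"),
  response_type = c("Single Answer", "Single Answer"),
  stringsAsFactors = FALSE
)

# Hypothetical panel data: one row per respondent.
panel_data <- data.frame(
  respondent_id = c("R_1", "R_2"),
  class_year = c(2018, 2019),
  stringsAsFactors = FALSE
)

# Join question details, then respondent details, onto the lean responses.
merged <- merge(lean_responses, question_dictionary, by = "response_column")
merged <- merge(merged, panel_data, by = "respondent_id")
```

Keeping question text and panel data out of the lean table means each is stored once rather than repeated on every response row, which is where the file-size saving would come from.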

Download All as Zip

This might be a useful button? Need to mull this over more.

text entry questions should be excluded from lean responses

Tableau doesn't handle text very well, so the lean responses shouldn't include text entry questions.

this can be fixed by adding some conditionals to the lean_responses() function in reorganizing_survey_data.R