Overview:
Advanced (previously called “Clean & Define” in QuenchTec’s Survey Design tool) – useful for creating a data map, filtering a dataset, creating new Questions, Splitting data.
The Clean and Define tool can be used for following operations:
- Select and/or create the questions that the new dataset will contain.
- Set values to respondents and fill questions with data based on expressions and optional logical expressions that restrict which respondents the operation will affect
- Perform sample weighting
- Split data
The Advanced Data Operation and its operations are described by XML code where the root element is called cdef and the operations are called structure, groups and weights. Any of the operations can be excluded but, if using more than one operation, they have to appear in the sequence above.
How to access and use Advanced:
Overview
Data Operations are a data workflow tool where you can add to, edit, and manipulate your data so that your data tables and variables display the way you need them to in Analysis & Reporting.
Using Data Operations, you can append, match, merge, and split out your data in a number of different ways.
Accessing Data Operations
Step 1: Data Operations can be accessed by clicking the “Add” button at the top of each individual project folder:
Step 2: Once Data Operations is selected, you’ll need to give the file a name and click “Create”
It will then open a blank workflow template for you to work in.
Step 3: Select the tool that you want to use by clicking on either of the “+” signs:
Step 4: Select the tool type that you want to use (Append, Advanced, Match).
Step 5: Give the tool a name and, if desired, tag it and/or add a description.
Step 6a: Click on “Input” and select the two datasets that you want to work with:
Step 6b: If required you can filter your datasets on cases that meet your required criteria. To do this, click on the “Select” button next to one or both of the Filter rows:
This will open the filter builder where you can set-up your required filter, by entering the required variable(s) (section 1, in the screen shot below). and answer choice(s) (section 2, in the screen shot below). You have the choice of doing this visually in the Visual section (3, in the screen shot below) or you can edit the filter in the Advanced section (4, in the screen shot below). The View total section (5, in the screen shot below) will tell you the number of cases that will be included in your filter. When you’re finished, click “Save”:
Step 7: Click on “Output” select the location where you want to save your files and give the dataset a name and description (optional). Then click “Create”:
Step 8: once the advanced feature is set-up, you will have a blank template to add your XML as required.
An example might look something like this:
When you’re finished, click the “Run workflow” button in the top right.
At the bottom of the screen, you’ll see a message that says: “Processing workflow” and when it’s finished, it will pop-up with a messaged that says: “Finished processing.”
Both your workflow and new dataset will be in your project folder ready for use:
Additional features within Advanced
The section above outlined the basics in terms of how to access and use the Advanced feature in Data Operations. In this section, we’ll cover a lot of the key functionalities within Advanced.
The structure of the Advanced Data Operations:
<cdef> <structure>...</structure> <groups>...</groups> <weights>...</weights> </cdef>
Operations
Structure Operations
A Structure operation consists of two sub operations, Select and Create, which can be combined in any order. These operations will select a subset of all questions and add new questions that later can be filled with data.
Select selects and/or rearranges questions from the input dataset and optionally assigns new identifiers (id/qno), transposes, concatenates or converts questions or combines components (logical OR). Create creates new questions. If any of these operations are used, the output dataset will contain only the selected or created questions and subsequent groups and weights operation will only be able to use these questions. The Structure operation should be omitted if you want to keep all questions.
Examples
All examples assume we have a dataset with questions having labels Q1 to Qn, where n is the number of questions.
- Select one or more questions and rearrange the order of questions.
Select Q1, then Q6, then question Q3 to Q5:
<select a=”\Q1,Q6,Q3,,Q5” />
- Select one or more sub questions and rearrange the order of sub questions.
Select sub question B, then E, then A. All other sub questions in Q1 are removed:
<select a=”\Q1.b,e,a” />
- Select one or more rows and rearrange the order of rows.
Select the third row, then the first and then rows 4 to 8. Rows 2 and 9 are removed if the question has 9 rows.
<select a=”\Q1.3,1,2,4,,8” />
- Select one or more columns of a grid question and rearrange the order of columns.
Select, from row 1 to 5, first column 3, then 2, then 1, then 4 to 6:
<select a=”\Q1.a.1,,5[3,2,1,4,,6]” />
- Select and insert empty rows and columns.
When inserting rows and columns, new code sequences are generated (1, 2, 3…).
Select the first 3 rows, then insert an empty row, then select column 1 to 3 and finally insert an empty column:
<select a=”\Q1.a.1,,3,![1,,3,!]”/>
- Transpose. Create a question with multiple sub questions from a grid question. Each row in the grid will form a new multi or single sub question. Each column in the grid will create a new row.
<select a=”\Q5.a” transpose=”true” />
- Convert. Create a question with multiple multi sub questions from a grid. Each grid column will form a new multi sub question.
<select a=”\Q3.a” convert=”true” />
- Convert. Create a multi grid question from multiple single/multi questions. Each sub question will form a grid column.
<select a=”\Q4.a,b,d” convert=”true” />
- Concatenate rows from many questions into one question.
Create a new question with label Q1 with all rows from Q2 and Q4 appended. New code sequences are generated for multi and single questions (1, 2, 3…).
Rules:
All questions must have the same structure (same type, number of sub questions, and number of columns). qtext and subq are kept from the first question. All rows and row comments are appended from the rest of the concatenated questions
<select a=”\Q1,Q2,Q4” concat=”r” />
- Concatenate sub questions from many questions into one.
Create a new question with all sub questions from Q3 and Q5 added to Q1. Finally set the question label to newId.
Rules:
All questions must have the same number of rows. Row texts are taken from the first question.
<select a=”\Q1,Q3,Q5” concat=”s” id=”newId”/>
- Combine rows.
Select row 1, then 3, then ‘OR’ row 4 to 6 into a new row, then select row 8, then ‘OR’ row 7, 9 and 10 into a new row.
Rules:
Single/Multi (n/m)
with a single question (n) this could lead to multiple answers. Q-bit is left as is, R-bit is set if at least one answer is in final output. A new code sequence is created (1, 2, 3…).
Grid single/multi (rn/rm)
with a single grid question (rn) it is very likely that the result will be multiple answers. Q-bits should be an ‘OR’ of each ‘OR’ed row’s Q-bits. R-bit should be set if at least one answer in the row.
Numeric/Time (f/h)
with a numeric or time question (f/h) values from each row are added. Q-bits should be an ‘OR’ of each ‘OR’ed row’s Q-bits and the same with the R-bits.
Common
text for a combined row is constructed as a string with a copy of the ‘OR’ operation performed. E.g. from the example, texts for the two created rows are ‘4:5’ and ‘7;9;10’
<select a=”\Q1.a.1,3,4:6,8,7;9;10” />
- Combine columns.
Select column 2, then 1, then ‘OR’ column 3 and 4, then ‘OR’ column 7 and 10.
Rules:
text for a combined column is constructed as a string with a copy of the ‘OR’ operation performed. E.g. from the example, texts for the two created columns should are ‘3:4’ and ‘7;10’
<select a=”\Q1.a[2,1,3:4,7;10]” />
- Combine single/multi sub questions.
The new sub question will be based on the first sub question and the type will be set to multi (m). All rows will be joined (version 1).
Q-bits are calculated as an ‘OR’ of the sub questions’ Q-bits. R-bit is set if at least one answer in the new sub question. A-bits are calculated as an ‘OR’ of the A-bits in each row.
Rules:
all texts are from the first sub question. In version 1, all rows will be joined and it is not possible to select a subset of rows.
<select a=”\Q1.b:c” />
- Create new questions
A Create operation can contain one or more <ques> elements (uses the same syntax as qdoc).
<create> <ques id="L1"> <subq type="m"> <stext><text lang="en">Which brands do you know?</text></stext> </subq> <rgroup> <r><text lang="en">Volvo</text></r> <r><text lang="en">Saab</text></r> <r><text lang="en">Ford</text></r> </rgroup> </ques> <ques id="L2"> <subq type="rn"> <stext><text lang="en">How would you rate</text></stext> <hgroup> <h><text lang="en">Good</text></h> <h><text lang="en">Bad</text></h> </hgroup> </subq> <rgroup> <r><text lang="en">Volvo</text></r> <r><text lang="en">Saab</text></r> <r><text lang="en">Ford</text></r> </rgroup> </ques> </create>
Select and Create can be combined in any order. Questions and components can also be part of many operations. For example, a question with two sub questions can be split into two questions and the same question can later be part of another operation:
<structure> <select a=”\Q1.a” /> <select a=”\Q1.b” /> ... <select a=”\Q1,,Q3.a” /> <create> ... </create> </structure>
Group Operations
A Group operation contains one or many expression groups and each group contains one or many Clean/Define expressions. An expression group can have an optional filter describing which respondents that will be affected by the expressions within the group.
Three types of Clean/Define expressions are possible to define within a group:
- Set expression
Sets new values from an assignment expression (action). - Logical define expression
Fills categorical questions with data created from a logical expression (action). - Arithmetical define expression
Fills quantitative questions with data created from an arithmetical expression (action).
Each of these expressions can have an optional filter (condition) that will restrict the operation to respondents that satisfy the expression.
When combining group filters and expression conditions the following table shows the possible logical combinations.
Group filter | Expression condition | Result |
True | True | True |
True | False | False |
False | True | False |
False | False | False |
Not set | True | True |
Not set | False | False |
True | Not set | True |
False | Not set | False |
Group Filter
Filter
[!]\QuesList.SubqList.RowList = ValueList
A normal filter expression including logical operators, parenthesis, intervals, lists, special codes and frequencies.
Set Expression
Condition - If
An optional condition can be set that will restrict the operation to respondents that satisfies the If expression.
Syntax
See Filter
Action - Then
An assignment expression that will set values (codes, quantities or special codes) to each respondent that satisfies the optional group filter and condition.
Syntax
[!]\QuesList.SubqList.Rowlist = [!,]ValueList
Or
\QuesList.SubqList.Rowlist = SpecialCode
Left hand side
A Question/Sub question/Row reference including intervals (,,) and lists (,). An exclamation mark states that all values on the right hand side will be removed instead of set.
Right hand side
A Special code or a value list containing one or many values that will be set. An exclamation mark states that all values (answers) will be removed prior to setting the given values.
Special codes:
!? -> Not asked
- -> Not answered
!¤ -> Non-existence
Value list
v -> Single value (code or quantity)
v,v,…,v -> Multiple values for M/RM.
!,v,v,…,v -> All answers will be removed prior to setting the new values (M/RM). The special case ”!,v” for all other data types is equal to “v” since only one value is possible.
Examples
- Set \Q3.b to 10 for each respondent that satisfies (\Q1.a=1 AND \Q2.a.1=13:19)
<group filter=”\Q1.a=1”> <set condition=”\Q2.a.1=13:19” action=”\Q3.b=10” </group>
- Remove all answers from \Q4.a.1 to \Q4.a.10 and then set \Q4.a.1 to \Q4.a.10 to 2, 3 and 4 for each respondent that satisfies (\Q10:Q12.a:b.1=1 | \Q13.a=5)
<set condition=”\Q10:Q12.a:b.1=1|\Q13.a=5” action=”\Q4.a.1,,10=!,2,3,4” />
- Set \Q1.a as ‘not asked’ for all respondents
<set action=”\Q1.a=!?” />
- Remove code 1 from \Q1.a for all respondents
<set action=”!\Q1.a=1” />
- Set all components in Q1 to Q5 to not answered
<set action=”\Q1,,Q5=-“ />
- Set all components in sub questions a, b, c and f in Q1 to not answered
<set action=”\Q1.a,,c,f=-” />
- Set rows 1, 10 and 12 in all sub questions in Q1 to not answered
<set action=”\Q1.1,10,,12=-“ />
Logical Define
Condition - If
An optional condition can be set that will restrict the operation to respondents that satisfies the If expression.
Syntax
See Filter
Action - Then
A logical expression that fills a categorical question with evaluated data (true/false).
Syntax
See Filter
Location
A reference to the categorical row that will have its data filled or updated.
Examples
- Evaluate \Q1.a=1 and fill the second row in \Q2.a with true/false values
<logic action=”\Q1.a=1” location=”\Q2.a.2” />
- Evaluate \Q3.b=10 for each respondent that satisfies (\Q1.a=1 AND \Q2.a.1=13:19) and fill the first row in \Q10.a with true/false values.
<group filter=”\Q1.a=1”> <logic condition=”\Q2.a.1=13:19” action=”\Q3.b=10” locatation=”\Q10.a.1” /> </group>
Arithmetical Define
Condition - If
An optional condition can be set that will restrict the operation to respondents that satisfies the If expression.
Syntax : See Filter
Action - Then
An arithmetical expression that fills a quantitative question with evaluated data.
Syntax
- N
Numeric value - [<[N]>]Address
Quantitative question: evaluates to a series of numerical answers.
Categorical question: evaluates to a series of codes (default) or to a series of categorical values (0/1).
<>Address means that non-answers should be kept.
<N>Address means that non-answers should be given the value N.
- serial
Fills a numeric question with a respondent index starting from 1. - Function
ABS(expression) absolute value,
NINT(expression) round to nearest integer,
MAX(expression) maximum value,
MIN(expression) minimum value,
INT(expression) convert to integer
Arithmetical operators: +, -, *, /, **
Parentheses can be used to override operator precedence.
Location
A reference to the quantitative row that will have its data filled or updated.
Examples
- Fill the second row in \Q2.a with 50 for all respondents.
<arithmetic action=”(2 + 3) * 10” location=”\Q2.a.2” />
- Fill the first row in \Q1.a with a respondent number ([1..N]).
<arithmetic action=”(2 + 3) * 10” location=”\Q1.a.1” />
- Evaluate INT(<10>\Q20.b.1 – 100) for each respondent that satisfies (\Q1.a=1 AND \Q2.a.1=3:9) and fill the first row in \Q10.a.
<group filter=”\Q1.a=1”> <arithmetic condition=”\Q2.a.1=3:9” action=”INT(<10>\Q20.b.1 – 100)” location=”\Q10.a.1” /> </group>
Weight Operations
A Weight operation contains one or many weight set definitions. The syntax is equal to the syntax used in Research Analyzer weighting documents.
Example
<weights> <wset id="GenRegAge" pop="3547" decimals="0"> <text>Population</text> <rim> <rimcell expr="\1.A=1&\2.A=1&\3.A=1" pop="148"/> <rimcell expr="\1.A=1&\2.A=1&\3.A=2" pop="282"/> <rimcell expr="\1.A=1&\2.A=2&\3.A=1" pop="73"/> <rimcell expr="\1.A=1&\2.A=2&\3.A=2" pop="120"/> <rimcell expr="\1.A=1&\2.A=3&\3.A=1" pop="38"/> <rimcell expr="\1.A=1&\2.A=3&\3.A=2" pop="65"/> <rimcell expr="\1.A=1&\2.A=4&\3.A=1" pop="26"/> <rimcell expr="\1.A=1&\2.A=4&\3.A=2" pop="45"/> <rimcell expr="\1.A=2&\2.A=1&\3.A=1" pop="144"/> <rimcell expr="\1.A=2&\2.A=1&\3.A=2" pop="275"/> <rimcell expr="\1.A=2&\2.A=2&\3.A=1" pop="69"/> <rimcell expr="\1.A=2&\2.A=2&\3.A=2" pop="113"/> <rimcell expr="\1.A=2&\2.A=3&\3.A=1" pop="37"/> <rimcell expr="\1.A=2&\2.A=3&\3.A=2" pop="60"/> <rimcell expr="\1.A=2&\2.A=4&\3.A=1" pop="24"/> <rimcell expr="\1.A=2&\2.A=4&\3.A=2" pop="42"/> </rim> </wset> <wset id="TwoRims" pop="R" decimals="0"> <text>Age/Gender</text> <rim> <rimcell expr="\1.A=1" pop="49"/> <rimcell expr="\1.A=2" pop="51"/> </rim> <rim> <rimcell expr="\2.A=1" pop="73"/> <rimcell expr="\2.A=2" pop="120"/> <rimcell expr="\2.A=3" pop="38"/> <rimcell expr="\2.A=4" pop="65"/> </rim> </wset> </weights>
Split
The Split tool allows a respondent to a questionnaire to be split into a number of sub-respondents. The Spilt function in Data Operations is very similar to SPSS’ VARSTOCASES tool.
Typically, a questionnaire can have two or more levels of data such as a household level and a personal level. All the information on the household level is common to all household members, but each household member has individual data associated with them. A typical questionnaire will then have a set of questions regarding the household (type of house, location, gross income, etc) and then several sets of questions for each household member (age, gender, individual income etc).
If we were interested in performing an analysis at the household member level, we might desire a questionnaire that appears to have been given individually to each household member. This new structure typically copies all the data from the top level (household level) and creates one set of questions for each household member. The original interview data is then split into a series of individual interviews (sub-respondents).
The Split tool is designed to allow experts to manipulate the underlying XML structure of your data sets directly using an XML editing window and tools. For example:
The split element defines a multiple structure of questions, sub questions, rows and columns that we want to split into a single structure. The number of elements in the dimension we split on much match the splitkey number given. Multiple split elements are allowed.
XML Syntax Used
Job
The <job> element defines certain properties of the split job. It wraps the other elements described later.
<job type='split' splitkey='N' consitencycheck='YES|NO' postfilter='FILTER'>
<job type=’split’><job> is exactly the same as <split></split>
<split splitkey='N' consitencycheck='YES|NO' postfilter='FILTER'></split>
Attributes are as shown in the table:
Name | Description |
type | Always ”split” for this use |
splitkey = ”N” | Defines the number of sub-respondents to create. It also determines the count for the various questionnaire structures that the system should ”concatenate” into a single structure. For example, if splitkey=’4’ then there should be 4 questions about gender (one for each household member) that will be concatenated into just one question in the split dataset. |
consitencycheck | If set to ”true”, a check should be performed after the split is run checking that all questions that ‘claim’ to be answered, really have an answer. |
postfilter | Defines an optional filter that will remove records (sub-respondents) after the split has been executed. Typically an original respondent only answered some of the lower level questions. For example, maximum of 10 houshold members is possible, but a given household only had 3 members. Seven sub-respondents can then be removed by specifying the postfilter accordingly. |
The split element defines a multiple structure of questions, sub questions, rows and columns that we want to split into a single structure. The number of elements in the dimension we split on much match the splitkey number given. Multiple split elements are allowed.
<split addr='DIM\ADDRESS_EXPRESSION' qno='QNO'> <text lang='LANGUAGE'>TEXT</text> </split>
DIM specify against which dimension the split should be done.
- Q : the ADDRESS_EXPRESSION lists a series of questions all with the same structure, and with N number of questions total. For example: addr=’Q\1,,4’ or addr=’Q\1,4,7,10’. Sub-questions, rows and columns may be restricted or rearranged if desired. Such as in addr=’Q\1,,4.A.1,,8’ which will split questions 1,2,3,4 but only sub-question A and rows 1 through 8. The expression addr=’Q\1,3,2,4.C,A,B.5,1,2,3,4’ re-arranges the questions, sub-questions and rows.
- S: the ADDRESS_EXPRESSION lists a single question with the same number of sub-questions as the splitkey. All sub-questions must of the same type. As an example, addr=’S\12.A,,D’ will create a new question with only one sub question. Re-arranging or restricting sub-questions, rows or columns is allowed as in addr=’S\12.A,C,B,D.1,,4,8,5,6,7 8[5,4,3,1,2 9]
- R: the ADDRESS_EXPRESSION lists one or more questions, each having the same number of rows. Sub-questions of type single or multiple are not allowed. The number of rows specified must match the splitkey number. Re-arranging and restricting questions, sub questions, rows and columns is allowed.
- H: the ADDRESS_EXPRESSION refers to a single sub-question of a single grid or multi-grid. The number of <h> elements (columns) must correspond to the splitkey number. The generated split question will be multiple choice.
ADDRESS_EXPRESSION lists the questions and/or sub-questions with the same structure that should be split. See description of DIM as to restrictions to the address.
QNO optionally specifies the new question number to be used.
TEXT specifies an optional override text to be used. If DIM is Q|S|H, then this text replaces the question text. If DIM is R this text replaces the row text.
Note: Multiple split elements are allowed.
Duplicate
The duplicate element is used to specify one or more question structures that we want to copy. The same respondent data will be copied onto every sub-respondent.
<duplicate addr='\ADDRESS_EXPRESSION' />
ADDRESS_EXPRESSION specifies a list of any questions that you want to duplicate. Re-arrangement of questions, sub questions, rows and columns is allowed. To select questions with different structures, use multiple duplicate elements.
Detailsplit
The detail split element can be used to individually select question structures for each sub respondent. It must contain the same number of select elements as the splitkey number.
<detailsplit> <template addr='\ADDRESS_EXPRESSION'/> <copy addr='\ADDRESS_EXPRESSION'/> <copy addr='\ADDRESS_EXPRESSION'/> …… </detailsplit>
The <template> address specifies the question(s) that will be used as the basis for creating the sub respondent data structure. Any list of questions and re-arranging of questions, sub-questions, rows and columns is allowed.
All <copy> address expressions will be compared against the template, and the structure must match. In the <copy> expression you may use “!” to signify missing components. For example, use <copy address=’\10.A,1,,4,!,5,,10’ /> if question 10 is missing row 5 compared to the template.
If the question structure varies, use multiple detailsplit elements.
The number of <copy> elements + the one <template> element must match the splitkey number.
Any number of <split>, <duplicate> and <detailsplit> elements in any order is allowed.
Index
You may insert a numeric question that contains two indices of the sub-respondent.
Sequence Index is the ‘loop’ counter of the split, it will show from which iteration of the split the sub-respondent has been generated, values will be 1,2,3… up to the splitkey value.
Record Index will differ if a postfilter has been given. It will indicate which record in sequence that this sub-respondent represents. So if e.g. a splitkey of 3 was given, and for a given original respondent the second sub-respondent was removed by the post filter, the Sequence Index would be 1,3, where as the Record Index would be 1,2.
<index qno='QNO'> <stext><text lang='LANG'>TEXT</text></stext> <rectext><text lang='LANG'>TEXT</text></rectext> <seqtext><text lang='LANG'>TEXT</text></seqtext> </index>
QNO specify the questionnaire number to be used for the index question.
<stext> specify the text to be used as the question text.
<rectext> specify the text to be used for the record text.
<seqtext> specify the text to be used for the sequence text.
Frequency
You may insert a numeric question that contains the total number of records (sub-respondents) “split” from a given input respondent. Unless a postfilter has been specified, this will equal the splitkey value. If a postfilter has been specified, the number will be reduced to indicate the actual number of sub-respondents generated.
<frequency qno='QNO'> <stext><text lang='LANG'>TEXT</text></stext> <rtext><text lang='LANG'>TEXT</text></rtext> </frequency>
QNO specify the questionnaire number to be used for the frequency question.
<stext> specify the text to be used as the question text.
<rtext> specify the text to be used for the row text.
Inverse Frequency
You may insert a numeric question that contains the inverse number of ‘split’ records (sub-respondents) total from a given input respondent. Unless postfilter have been specified, this will equal the inverse of the splitkey value. If however postfilter has been specified, the number will be adjusted to indicate the inverse of the actual number of sub-respondents generated.
<invfrequency qno='QNO'> <stext><text lang='LANG'>TEXT</text></stext> <rtext><text lang='LANG'>TEXT</text></rtext> </invfrequency>
QNO specify the questionnaire number to be used for the inverse frequency question.
<stext> specify the text to be used as the question text.
<rtext> specify the text to be used for the row text.
Serial
You may insert a numeric question that contains the serial of the original respondents. The values will be generated sequentially starting with 1 unless the optional “start” attribute is specified. If the start attribute is specified, numbers will be generated starting with this value.
<serial qno='QNO' start='N'> <stext><text lang='LANG'>TEXT</text></stext> <rtext><text lang='LANG'>TEXT</text></rtext> </serial>
Split syntax – an example
<job type='split' splitkey='3' consitencycheck='YES' postfilter='\t1='> <duplicate addr='\gender,age,region'/> <split addr='Q\p1,,p3'/> <detailsplit> <template addr='\t1,t12,t13'/> <copy addr='\x1,x12,x13'/> <copy addr='\y1,y12,!' /> </detailsplit> <duplicate addr='\house_income,date'/> <index qno='index'> <stext><text lang='en'>Index values</text></stext> <seqtext><text lang='en'>Sub respondent sequence number</text></seqtext> <rectext><text lang='en'>Sub respondent record number</text></rectext> </index> <frequency qno='freq'> <stext><text lang='en'>Frequency</text></stext> <rtext><text lang='en'>Total sub-respondents</text></rtext> </frequency> <invfrequency qno='infreq'> <stext><text lang='en'>Inverse frequency</text></stext> <rtext><text lang='en'>Inverse total sub-respondents</text></rtext> </invfrequency> <serial qno='serial'> <stext><text lang='en'>Master serial</text></stext> <rtext><text lang='en'>Serial</text></rtext> </serial> </job>
If you are familiar with the Clean and Define tool from QuenchTech or are looking for help with specific commands/advanced functionality, please keep reading....
Definition
An Advanced operation contains one or many expression groups and each group contains one or many Clean/Define expressions. An expression group can have an optional filter describing which respondents that will be affected by the expressions within the group.
Three types of Clean/Define expressions are possible to define within a group:
- Clean expression
Assigns new values from an assignment expression (action). - Logical define expression
Fills categorical questions with data created from a logical expression (action). - Arithmetical define expression
Fills quantitative questions with data created from an arithmetical expression (action).
Each of these expressions can have an optional filter (condition) that will restrict the operation to respondents that satisfy the expression.
When combining group filters and expression conditions the following table shows the possible logical combinations.
Group filter and expression condition
Group filter | Expression condition | Result |
True | True | True |
True | False | False |
False | True | False |
False | False | False |
Not set | True | True |
Not set | False | False |
True | Not set | True |
False | Not set | False |
Group expression
Filter
[!]\QuesList.SubqList.RowList = ValueList
A normal logical expression including logical operators, parenthesis, intervals, lists, special codes and frequencies.
Clean/Define expressions
Clean
Condition - If
An optional condition can be set that will restrict the operation to respondents that satisfies the If expression.
Syntax
See 4.1
Action - Then
An assignment expression that will set values (codes, quantities or special codes) to each respondent that satisfies the optional group filter and condition.
Syntax
- [!]\QuesList.SubqList.Rowlist = [!,]ValueList
or
- \QuesList.SubqList.Rowlist = SpecialCode
Left hand side
Question/Sub question/Row reference including intervals (,,) and lists (,). An exclamation mark states that all values on the right hand side will be removed instead of set.
Right hand side
A Special code or a value list containing one or many values that will be set. An exclamation mark states that all values (answers) will be removed prior to setting the given values.
SPECIAL CODES
!? Not asked
- Not answered
!¤ Non-existence
VALUE LIST
v Single value (code or quantity)
v,v,…,v Multiple values for M/RM.
!,v,v,…,v All answers will be removed prior to setting the new values (M/RM).
The special case ”!,v” for all other data types is equal to “v” since only
one value is possible.
Examples
- Group filter: \1.a=1
Condition: \2.a.1=13:19
Action: \3.b=10
Will set \3.b to 10 for each respondent that satisfies (\1.a=1 AND \2.a.1=13:19) - Condition: \10:12.a:b.1=1 | \13.a=5
Action: \4.a.1,,10=!,2,3,4
Will first remove all answers from \4.a.1 to \4.a.10 and then set \4.a.1 to \4.a.10 to 2, 3 and 4 for each respondent that satisfies (\10:12.a:b.1=1 | \13.a=5) - Action: \1.a=!?
Will set \1.a as Not asked for all respondents - Action: !\1.a=1
Will remove code 1 from \1.a for all respondents - Action: \1,,5=-
Will set all components in question 1 to 5 to not answered - Action: \1.a,,c,f=-
Will set all components in sub questions A, B, C and F in question 1 to not answered - Action: \1.1,10,,12=-
Will set rows 1, 10 and 12 in all sub questions in question 1 to not answered
Logical Define
Condition - If
An optional condition can be set that will restrict the operation to respondents that satisfies the If expression.
Syntax
See 4.1
Action - Then
A logical expression that fills a categorical question with evaluated data (true/false).
Syntax
See 4.1
Location
A reference to the categorical row that will have its data filled or updated.
Examples
- Action: \1.a=1
Location: \2.a.2
Evaluates \1.a=1 and fills the second row in \2.a with true/false values. - Group filter: \1.a=1
Condition: \2.a.1=13:19
Action: \3.b=10
Location: \10.a.1
Evaluates \3.b=10 for each respondent that satisfies (\1.a=1 AND \2.a.1=13:19) and fills the first row in \10.a with true/false values.
Arithmetical Define
Condition - If
An optional condition can be set that will restrict the operation to respondents that satisfies the If expression.
Syntax
See 4.1
Action - Then
An arithmetical expression that fills a quantitative question with evaluated data.
Syntax
- N
Numeric value - [<[N]>]Address
Quantitative question: evaluates to a series of numerical answers.
Categorical question: evaluates to a series of codes (default) or to a series of categorical values (0/1).
<>Address means that non-answers should be kept.
<N>Address means that non-answers should be given the value N.
- Function
ABS(expression) absolute value,
NINT(expression) round to nearest integer,
MAX(expression) maximum value,
MIN(expression) minimum value,
INT(expression) convert to integer
Arithmetical operators: +, -, *, /, **
Parentheses can be used to override operator precedence.
Location
A reference to the quantitative row that will have its data filled or updated.
Examples
- Action: (2 + 3) * 10
Location: \2.a.2
Fills the second row in \2.a with 50 for all respondents. - Group filter: \1.a=1
Condition: \2.a.1=13:19
Action: INT(<10>\20.b.1 – 100)
Location: \10.a.1
Evaluates INT(<10>\20.b.1 – 100) for each respondent that satisfies (\1.a=1 AND \2.a.1=13:19) and fills the first row in \10.a. - XML Syntax
Groups and expressions can be expressed as an XML string as follows.
|
| Attributes |
root | <groups> |
|
Group | <group> | filter (optional) |
Clean | <set /> | condition (optional) action (required) |
Logical Define | <logic /> | condition (optional) action (required) location (required) |
Arithmetical Define | <arithmetic /> | condition (optional) action (required) location (required) useCatCodeValues (optional, default is true) |
The root element <groups> is required and must have one or many <group> elements as child elements. Each <group> element must have one or many expression elements (<set>, <logic> and <arithmetic>) as child elements.
Note that all special characters (&, <, >, “, ‘) have to be written as XML entities (&, <, >, ", ')
XML Example
<groups> <group name="Group1"> <arithmetic condition="\1.a=1" action="2008 - <>\2.a" location="\7.a.1" useCatCodeValues="true" /> <set condition="\1.a=1 & \1.a=2" action="\1.a=-" /> <!-- add more expressions here... --> </group> <!-- add more groups here... --> </groups>
Is equal to: