Chapter 5 Changing the order of variables in a dataset
I find the order command useful for rearranging variables, especially when working with panel data. I like to put the cross-sectional unit first (personal id, firm id, etc.) and the time period second, so that our N and T sit next to one another.
Let’s load our survey of graduate students and describe the dataset.
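The commands behind the output that follows are not shown. A minimal sketch, assuming survey6.dta sits in the working directory printed in the output:

```stata
* Assumption: survey6.dta is stored in this folder (path taken from the output)
cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell"

* Load the survey of graduate students, replacing any data in memory
use survey6, clear

* List variable names, storage types, formats, and labels in their current order
describe
```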
/Users/Sam/Desktop/Econ 645/Data/Mitchell
(Survey of graduate students)
Contains data from survey6.dta
  obs:             8                          Survey of graduate students
 vars:             9                          11 Mar 2024 14:40
 size:           416                          (_dta has notes)
----------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
----------------------------------------------------------------------------------------------
id              float   %9.0g                 Unique identification variable
gender          float   %9.0g      mf         Gender of student
race            float   %19.0g     racelab  * Race of student
havechild       float   %18.0g     havelab  * Given birth to a child?
ksex            float   %15.0g     mfkid    * Sex of child
bdays           str10   %10s                  Birthday of student
income          float   %12.2fc               Income of student
kidname         str10   %-10s                 Name of child
kbday           double  %td
                                            * indicated variables have notes
----------------------------------------------------------------------------------------------
Sorted by:
We might want to group similar variables together, for example by listing the student's own characteristics before the variables about the child. This is helpful when you have a large dataset with hundreds of variables, such as the CPS.
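The listing that follows is consistent with moving the student variables to the front with order. A sketch of commands that would produce that ordering from the original one (a reconstruction, not necessarily the exact call used):

```stata
* Move the student-level variables to the front, in this sequence;
* the variables not listed (havechild ksex kidname kbday) keep their relative order
order id gender race bdays income

* Confirm the new variable order
describe
```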
Contains data from survey6.dta
  obs:             8                          Survey of graduate students
 vars:             9                          11 Mar 2024 14:40
 size:           416                          (_dta has notes)
----------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
----------------------------------------------------------------------------------------------
id              float   %9.0g                 Unique identification variable
gender          float   %9.0g      mf         Gender of student
race            float   %19.0g     racelab  * Race of student
bdays           str10   %10s                  Birthday of student
income          float   %12.2fc               Income of student
havechild       float   %18.0g     havelab  * Given birth to a child?
ksex            float   %15.0g     mfkid    * Sex of child
kidname         str10   %-10s                 Name of child
kbday           double  %td
                                            * indicated variables have notes
----------------------------------------------------------------------------------------------
Sorted by:
Any variables we leave off the order command keep their previous relative order and follow the variables we moved to the front.
With the before() option, we can move one or more variables so that they sit immediately before a specified variable. Let’s move kidname before ksex.
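A minimal sketch of that move, assuming the survey data are still in memory:

```stata
* Place kidname immediately before ksex; all other variables stay where they are
order kidname, before(ksex)

* Confirm the new variable order
describe
```

The after() option works the same way, for example `order kidname, after(ksex)` would place kidname immediately after ksex instead.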
Contains data from survey6.dta
  obs:             8                          Survey of graduate students
 vars:             9                          11 Mar 2024 14:40
 size:           416                          (_dta has notes)
----------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
----------------------------------------------------------------------------------------------
id              float   %9.0g                 Unique identification variable
gender          float   %9.0g      mf         Gender of student
race            float   %19.0g     racelab  * Race of student
bdays           str10   %10s                  Birthday of student
income          float   %12.2fc               Income of student
havechild       float   %18.0g     havelab  * Given birth to a child?
kidname         str10   %-10s                 Name of child
ksex            float   %15.0g     mfkid    * Sex of child
kbday           double  %td
                                            * indicated variables have notes
----------------------------------------------------------------------------------------------
Sorted by:
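We can also place newly created variables exactly where we want them by using the before() and after() options of the generate command. Here we add two empty placeholder variables, STUDENTVARS and KIDSVARS, that mark where each group of variables begins.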
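A sketch of generate calls consistent with the output that follows. The explicit double type is an assumption made only to match the storage type shown in the listing; the original commands may have obtained it another way.

```stata
* Create an all-missing marker variable and place it before gender
* (double is assumed here to match the storage type in the listing)
generate double STUDENTVARS = ., before(gender)

* Create a second marker and place it right after havechild
generate double KIDSVARS = ., after(havechild)

* Confirm that both markers landed in the intended positions
describe
```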
(8 missing values generated)
(8 missing values generated)
Contains data from survey6.dta
  obs:             8                          Survey of graduate students
 vars:            11                          11 Mar 2024 14:40
 size:           544                          (_dta has notes)
----------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
----------------------------------------------------------------------------------------------
id              float   %9.0g                 Unique identification variable
STUDENTVARS     double  %10.0g
gender          float   %9.0g      mf         Gender of student
race            float   %19.0g     racelab  * Race of student
bdays           str10   %10s                  Birthday of student
income          float   %12.2fc               Income of student
havechild       float   %18.0g     havelab  * Given birth to a child?
KIDSVARS        double  %10.0g
kidname         str10   %-10s                 Name of child
ksex            float   %15.0g     mfkid    * Sex of child
kbday           double  %td
                                            * indicated variables have notes
----------------------------------------------------------------------------------------------
Sorted by:
     Note: Dataset has changed since last saved.
5.1 Practice
Let’s bring in the CPS: https://www.census.gov/data/datasets/time-series/demo/cps/cps-basic.html
- Generate a new variable from pemlr called employed where employed = 1 if the individual is employed (present or absent) and employed = 0 if the individual is unemployed. The value should be missing if the individual is not in the labor force.
- Label the variable “Currently employed”.
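- Label the values: 0 “Not employed”, 1 “Employed”, and the missing category “Not in the Labor Force”. Value labels can be attached only to integers and extended missing values such as .n, not to the system missing value ., so use an extended missing value for this category.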
- Move the variable after pemlr.
- Generate a date variable by combining hrmonth (month of interview), the string “12”, and hryear4 (year of interview). We use 12 because the week containing the 12th is the reference period.
- Now format the date so that it displays like 07/12/2025.
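
One possible approach to the steps above, offered as a sketch rather than a definitive answer. It assumes a CPS basic monthly extract is already in memory with lowercase variable names, and that pemlr is coded 1-2 for employed, 3-4 for unemployed, and 5-7 for not in the labor force. The names employedlab, intdate_str, and intdate are hypothetical.

```stata
* Assumed pemlr coding: 1-2 employed, 3-4 unemployed, 5-7 not in the labor force
generate employed = .
replace employed = 1 if inlist(pemlr, 1, 2)
replace employed = 0 if inlist(pemlr, 3, 4)
* Extended missing value so that the category can carry a value label
replace employed = .n if inlist(pemlr, 5, 6, 7)

* Variable label and value labels (employedlab is a hypothetical label name)
label variable employed "Currently employed"
label define employedlab 0 "Not employed" 1 "Employed" .n "Not in the Labor Force"
label values employed employedlab

* Place the new variable right after pemlr
order employed, after(pemlr)

* Build a month/12/year string, then convert it to a Stata daily date
generate intdate_str = string(hrmonth) + "/12/" + string(hryear4)
generate intdate = date(intdate_str, "MDY")

* Display as MM/DD/YYYY with leading zeros
format intdate %tdNN/DD/CCYY
```

Using mdy(hrmonth, 12, hryear4) would build the same date without the intermediate string; the string route is shown because the exercise asks for appending.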