Chapter 2 Ingesting Data

Mitchell Chapter 2

Note when your working directory, I recommend using “/” instead of “" Stata will use both, and the backslash”" often runs into string issues.

2.1 Getting our data

We’ll get our data that we will use for the course.

pwd
*Set your working directory
cd "/Users/Sam/Desktop/Econ 645/Data/"

*Use:
net from https://www.stata-press.com/data/dmus2
net get dmus1
net get dmus2

*All commands:
net install dmus1

2.2 Reading Stata data

We use the use command to read Stata data. Use the clear option to clear the data from memory so you can read the data file. Otherwise, you will get an error

We’ll add the clear option to clear our memory to provide an error

cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
use "dentists.dta", clear
list  
/Users/Sam/Desktop/Econ 645/Data/Mitchell

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

We can read Stata data from websources. From Stata Press website:

use "https://www.stata-press.com/data/dmus2/dentists.dta", clear
list
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

From NBER website

use "https://data.nber.org/morg/annual/morg20.dta", clear
list lfsr94 stfips sex in f/10
     |                      lfsr94   stfips   sex |
     |--------------------------------------------|
  1. |            Employed-At Work       AL     2 |
  2. |            Employed-At Work       AL     1 |
  3. |  Retired-Not In Labor Force       AL     1 |
  4. |  Retired-Not In Labor Force       AL     2 |
  5. |    Other-Not In Labor Force       AL     2 |
     |--------------------------------------------|
  6. | Disabled-Not In Labor Force       AL     1 |
  7. |  Retired-Not In Labor Force       AL     1 |
  8. |  Retired-Not In Labor Force       AL     2 |
  9. |  Retired-Not In Labor Force       AL     2 |
 10. |  Retired-Not In Labor Force       AL     1 |
     +--------------------------------------------+

2.3 Reading Stata data with subsets

Subsetting can be a helpful method to read in extremely large data sets, such as American Community Survey (ACS) public use file. We can subset by variables, observations, or both.

2.3.1 Subsetting by columns (variables)

You can read in a subset of Stata data by columns. This can be helpful when you have a large dataset but only need a few columns of data. However, you need to know the variable names. We will need to have a data dictionary in order to do this. The January CPS data dictionary can be found on Census’s website. https://www2.census.gov/programs-surveys/cps/datasets/2025/basic/2025_Basic_CPS_Public_Use_Record_Layout_plus_IO_Code_list.txt

quietly cd "/Users/Sam/Desktop/Data/CPS/"
use gestfips pemlr pternwa using "smalljul25pub.dta", clear
list in 1/3
     | gestfips   pemlr   pternwa |
     |----------------------------|
  1. |        1       .         . |
  2. |        1       5        -1 |
  3. |        1       6        -1 |
     +----------------------------+

If it is a large dataset, I don’t recommend using list by itself. Use in 1/N

2.3.2 Subsetting by rows (observations)

You can read in a subset of data by rows. In my opinion, this is a better method to subset than by columns. You are able to see the entire dataset. You can use a qualifiers if or in

quietly cd "/Users/Sam/Desktop/Data/CPS/"
use "smalljul25pub.dta" if gestfip == 24 & prtage == 30, clear
list in 1/10
  1. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     7 |       7 |    2025 |       24 |     5 |        3 |    -1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    49 |     1 |        6 |        2 |      40  |        2  |        2  |        2  |     30  |       12  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       12 |       -1 |     1 |     6380 |        4 |       -1 |        6 |       -1 |        5 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       23 |       -1 |       17 |       -1 |      1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  99149562  |       -1  |      5510  |  9.401e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  2. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     6 |       7 |    2025 |       24 |     5 |        2 |    -1 |       1 |       2 |       1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    51 |     1 |        1 |        2 |      39  |        2  |        4  |        2  |     30  |       52  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       44 |        8 |     1 |      770 |        4 |       -1 |        3 |       -1 |        8 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |        4 |       -1 |       20 |       -1 |      2 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  48285071  |       -1  |      7315  |  4.700e+13  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  3. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17012 |     8 |       7 |    2025 |       24 |     5 |        1 |    -1 |       1 |       2 |       1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    41 |     1 |        6 |        2 |      42  |        2  |        2  |        2  |     30  |       40  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       40 |       -1 |     1 |     7280 |        4 |       -1 |        9 |       -1 |        5 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       36 |       -1 |       17 |       -1 |      1 |        2 |       -1 |        2 |       1 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  | 113138434  |   183000  |      5120  |  3.411e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  4. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   19011 |     3 |       7 |    2025 |       24 |     5 |        4 |    -1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    49 |     1 |        6 |        2 |      39  |        2  |        1  |        2  |     30  |       32  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       32 |       -1 |     1 |      770 |        4 |       -1 |        3 |       -1 |       10 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |        4 |       -1 |       22 |       -1 |      1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  44333596  |       -1  |      9130  |  9.007e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  5. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   19011 |     2 |       7 |    2025 |       24 |     5 |        1 |    -1 |       1 |       2 |       1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    40 |     1 |        1 |        2 |      39  |        2  |        1  |        2  |     30  |       50  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       50 |       -1 |     1 |      170 |        4 |       -1 |        1 |       -1 |        6 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |        1 |       -1 |       18 |       -1 |      1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  31435370  |       -1  |      6050  |  9.450e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  6. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     5 |       7 |    2025 |       24 |     5 |        1 |    -1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    41 |     1 |        6 |        1 |      45  |        2  |        4  |        2  |     30  |       40  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       40 |       -1 |     1 |     7980 |        4 |       -1 |       10 |       -1 |        2 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       42 |       -1 |       10 |       -1 |      1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       1  |       1  |       1  |  75007444  |       -1  |      3010  |  4.260e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  7. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   19011 |     3 |       7 |    2025 |       24 |     5 |        2 |     1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    58 |     1 |        6 |        2 |      39  |        2  |        2  |        2  |     30  |       -1  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       -1 |       -1 |     6 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       -1 |       -1 |       -1 |       -1 |     -1 |       -1 |       -1 |       -1 |       0 |        1 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  | 106522944  |       -1  |        -1  |  6.604e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  8. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     6 |       7 |    2025 |       24 |     5 |        1 |    -1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    40 |     1 |        6 |        2 |      43  |        2  |        1  |        2  |     30  |       43  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       43 |       -1 |     1 |     6481 |        4 |       -1 |        7 |       -1 |        4 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       25 |       -1 |       16 |       -1 |      1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  34698967  |       -1  |      4850  |  4.223e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  9. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   19011 |     3 |       7 |    2025 |       24 |     5 |        1 |    -1 |       1 |       2 |       1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    41 |     1 |        6 |        1 |      43  |        2  |        1  |        2  |     30  |       50  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       50 |       -1 |     1 |     7071 |        4 |       -1 |        8 |       -1 |        1 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       34 |       -1 |        2 |       -1 |      1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  50143408  |       -1  |       800  |  1.913e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
 10. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     5 |       7 |    2025 |       24 |     5 |        1 |    -1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    40 |     2 |        1 |        2 |      44  |        2  |        1  |        2  |     30  |        5  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |        5 |       -1 |     1 |     7460 |        5 |       -1 |        9 |       -1 |        2 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       36 |       -1 |        3 |       -1 |      1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  87833706  |       -1  |      1006  |  2.911e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+
quietly cd "/Users/Sam/Desktop/Data/CPS/"
use "smalljul25pub.dta" in 1/10, clear
list 
  1. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     8 |       7 |    2025 |        1 |     6 |        . |     . |       . |       . |       .  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |     . |     . |        . |        . |       .  |        .  |        .  |        .  |      .  |        .  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |        . |        . |     . |        . |        . |        . |        . |        . |        . |        . |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |        . |        . |        . |        . |      . |        . |        . |        . |       . |        . |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       .  |       .  |       .  |         .  |        .  |         .  |  1.617e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  2. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     8 |       7 |    2025 |        1 |     6 |        1 |    -1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    40 |     2 |        1 |        2 |      43  |        2  |        1  |        2  |     61  |       -1  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       -1 |       -1 |     5 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       -1 |       -1 |       -1 |       -1 |     -1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  18467177  |       -1  |        -1  |  1.090e+14  |     0  |         2  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  3. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     8 |       7 |    2025 |        1 |     6 |        2 |     1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    42 |     1 |        1 |        2 |      43  |        1  |        1  |        2  |     61  |       -1  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       -1 |       -1 |     6 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       -1 |       -1 |       -1 |       -1 |     -1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  27264237  |       -1  |        -1  |  1.090e+14  |     0  |         2  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  4. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     8 |       7 |    2025 |        1 |     6 |        1 |    -1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    40 |     1 |        1 |        2 |      43  |        1  |        1  |        2  |     66  |       -1  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       -1 |       -1 |     5 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       -1 |       -1 |       -1 |       -1 |     -1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  24488360  |       -1  |        -1  |  6.101e+14  |     0  |         2  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  5. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     8 |       7 |    2025 |        1 |     6 |        2 |    -1 |       1 |       2 |       1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    42 |     2 |        1 |        2 |      39  |        2  |        1  |        2  |     64  |        4  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |        4 |       -1 |     1 |     7390 |        4 |       -1 |        9 |       -1 |        1 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       36 |       -1 |        2 |       -1 |      1 |        1 |        4 |        2 |       1 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  18946917  |     8800  |       710  |  6.101e+14  |     0  |         2  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  6. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     7 |       7 |    2025 |        1 |     6 |        1 |    -1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    41 |     2 |        3 |        2 |      36  |        2  |        1  |        2  |     63  |       -1  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       -1 |       -1 |     5 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       -1 |       -1 |       -1 |       -1 |     -1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  20570322  |       -1  |        -1  |  9.101e+14  |     0  |         2  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  7. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     8 |       7 |    2025 |        1 |     6 |        . |     . |       . |       . |       .  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |     . |     . |        . |        . |       .  |        .  |        .  |        .  |      .  |        .  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |        . |        . |     . |        . |        . |        . |        . |        . |        . |        . |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |        . |        . |        . |        . |      . |        . |        . |        . |       . |        . |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       .  |       .  |       .  |         .  |        .  |         .  |  4.524e+13  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  8. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     8 |       7 |    2025 |        1 |     6 |        . |     . |       . |       . |       .  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |     . |     . |        . |        . |       .  |        .  |        .  |        .  |      .  |        .  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |        . |        . |     . |        . |        . |        . |        . |        . |        . |        . |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |        . |        . |        . |        . |      . |        . |        . |        . |       . |        . |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       .  |       .  |       .  |         .  |        .  |         .  |  1.101e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
  9. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     8 |       7 |    2025 |        1 |     6 |        1 |    -1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    40 |     2 |        1 |        2 |      44  |        2  |        1  |        2  |     67  |       -1  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       -1 |       -1 |     5 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       -1 |       -1 |       -1 |       -1 |     -1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  19775533  |       -1  |        -1  |  5.314e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

     +----------------------------------------------------------------------------------------------------------+
 10. | hrhhid2 | hrmis | hrmonth | hryear4 | gestfips | gediv | pulineno | pudis | puiodp1 | puiodp2 | puiodp3  |
     |   17011 |     8 |       7 |    2025 |        1 |     6 |        2 |    -1 |      -1 |      -1 |      -1  |
     |-------------------------------------+----------+---------------------------------------------------------|
     | perrp | pesex | pemaritl | pehspnon | peeduca  | peafever  | ptdtrace  | prpertyp  | prtage  | pehractt  |
     |    42 |     1 |        1 |        2 |      43  |        2  |        1  |        2  |     68  |       -1  |
     |------------------------------------------------------------------------------------+---------------------|
     | pehract1 | pehract2 | pemlr | peio1icd | peio1cow | peio2cow | prmjind1 | prmjind2 | prmjocc1 | prmjocc2 |
     |       -1 |       -1 |     5 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |       -1 |
     |----------+----------+-------------------------------------------------------------------------+----------|
     | prdtind1 | prdtind2 | prdtocc1 | prdtocc2 | prsjmj | peernhry | peernhro | peernlab | prerelg | prdisflg |
     |       -1 |       -1 |       -1 |       -1 |     -1 |       -1 |       -1 |       -1 |       0 |        2 |
     |----------+----------+----------+-------------------------------------------------------------------------|
     | pecert1  | pecert2  | pecert3  |  pwcmpwgt  |  pternwa  |  ptio1ocd  |     hrhhid  |  ptwk  |  gtmetsta  |
     |       2  |      -1  |      -1  |  24488360  |       -1  |        -1  |  5.314e+14  |     0  |         1  |
     +----------------------------------------------------------------------------------------------------------+

Subsetting by row and columns

quietly cd "/Users/Sam/Desktop/Data/CPS/"
use gestfips pemlr pternwa prerelg using "smalljul25pub.dta" if gestfip == 24 & prerelg==1, clear
list in 60/69
     | gestfips   pemlr   prerelg   pternwa |
     |--------------------------------------|
 60. |       24       1         1    132000 |
 61. |       24       1         1     84000 |
 62. |       24       1         1    108000 |
 63. |       24       1         1     80000 |
 64. |       24       1         1    163000 |
     |--------------------------------------|
 65. |       24       1         1    170000 |
 66. |       24       1         1     70000 |
 67. |       24       1         1     54000 |
 68. |       24       1         1    252000 |
 69. |       24       1         1     60000 |
     +--------------------------------------+

2.3.3 System data

There are data sets that are included in Stata and use the sysuse command.

sysuse auto, clear
list make price mpg rep78 foreign, sep(5)
(1978 Automobile Data)

     +-----------------------------------------------------+
     | make                 price   mpg   rep78    foreign |
     |-----------------------------------------------------|
  1. | AMC Concord          4,099    22       3   Domestic |
  2. | AMC Pacer            4,749    17       3   Domestic |
  3. | AMC Spirit           3,799    22       .   Domestic |
  4. | Buick Century        4,816    20       3   Domestic |
  5. | Buick Electra        7,827    15       4   Domestic |
     |-----------------------------------------------------|
  6. | Buick LeSabre        5,788    18       3   Domestic |
  7. | Buick Opel           4,453    26       .   Domestic |
  8. | Buick Regal          5,189    20       3   Domestic |
  9. | Buick Riviera       10,372    16       3   Domestic |
 10. | Buick Skylark        4,082    19       3   Domestic |
     |-----------------------------------------------------|
 11. | Cad. Deville        11,385    14       3   Domestic |
 12. | Cad. Eldorado       14,500    14       2   Domestic |
 13. | Cad. Seville        15,906    21       3   Domestic |
 14. | Chev. Chevette       3,299    29       3   Domestic |
 15. | Chev. Impala         5,705    16       4   Domestic |
     |-----------------------------------------------------|
 16. | Chev. Malibu         4,504    22       3   Domestic |
 17. | Chev. Monte Carlo    5,104    22       2   Domestic |
 18. | Chev. Monza          3,667    24       2   Domestic |
 19. | Chev. Nova           3,955    19       3   Domestic |
 20. | Dodge Colt           3,984    30       5   Domestic |
     |-----------------------------------------------------|
 21. | Dodge Diplomat       4,010    18       2   Domestic |
 22. | Dodge Magnum         5,886    16       2   Domestic |
 23. | Dodge St. Regis      6,342    17       2   Domestic |
 24. | Ford Fiesta          4,389    28       4   Domestic |
 25. | Ford Mustang         4,187    21       3   Domestic |
     |-----------------------------------------------------|
 26. | Linc. Continental   11,497    12       3   Domestic |
 27. | Linc. Mark V        13,594    12       3   Domestic |
 28. | Linc. Versailles    13,466    14       3   Domestic |
 29. | Merc. Bobcat         3,829    22       4   Domestic |
 30. | Merc. Cougar         5,379    14       4   Domestic |
     |-----------------------------------------------------|
 31. | Merc. Marquis        6,165    15       3   Domestic |
 32. | Merc. Monarch        4,516    18       3   Domestic |
 33. | Merc. XR-7           6,303    14       4   Domestic |
 34. | Merc. Zephyr         3,291    20       3   Domestic |
 35. | Olds 98              8,814    21       4   Domestic |
     |-----------------------------------------------------|
 36. | Olds Cutl Supr       5,172    19       3   Domestic |
 37. | Olds Cutlass         4,733    19       3   Domestic |
 38. | Olds Delta 88        4,890    18       4   Domestic |
 39. | Olds Omega           4,181    19       3   Domestic |
 40. | Olds Starfire        4,195    24       1   Domestic |
     |-----------------------------------------------------|
 41. | Olds Toronado       10,371    16       3   Domestic |
 42. | Plym. Arrow          4,647    28       3   Domestic |
 43. | Plym. Champ          4,425    34       5   Domestic |
 44. | Plym. Horizon        4,482    25       3   Domestic |
 45. | Plym. Sapporo        6,486    26       .   Domestic |
     |-----------------------------------------------------|
 46. | Plym. Volare         4,060    18       2   Domestic |
 47. | Pont. Catalina       5,798    18       4   Domestic |
 48. | Pont. Firebird       4,934    18       1   Domestic |
 49. | Pont. Grand Prix     5,222    19       3   Domestic |
 50. | Pont. Le Mans        4,723    19       3   Domestic |
     |-----------------------------------------------------|
 51. | Pont. Phoenix        4,424    19       .   Domestic |
 52. | Pont. Sunbird        4,172    24       2   Domestic |
 53. | Audi 5000            9,690    17       5    Foreign |
 54. | Audi Fox             6,295    23       3    Foreign |
 55. | BMW 320i             9,735    25       4    Foreign |
     |-----------------------------------------------------|
 56. | Datsun 200           6,229    23       4    Foreign |
 57. | Datsun 210           4,589    35       5    Foreign |
 58. | Datsun 510           5,079    24       4    Foreign |
 59. | Datsun 810           8,129    21       4    Foreign |
 60. | Fiat Strada          4,296    21       3    Foreign |
     |-----------------------------------------------------|
 61. | Honda Accord         5,799    25       5    Foreign |
 62. | Honda Civic          4,499    28       4    Foreign |
 63. | Mazda GLC            3,995    30       4    Foreign |
 64. | Peugeot 604         12,990    14       .    Foreign |
 65. | Renault Le Car       3,895    26       3    Foreign |
     |-----------------------------------------------------|
 66. | Subaru               3,798    35       5    Foreign |
 67. | Toyota Celica        5,899    18       5    Foreign |
 68. | Toyota Corolla       3,748    31       5    Foreign |
 69. | Toyota Corona        5,719    18       5    Foreign |
 70. | VW Dasher            7,140    23       4    Foreign |
     |-----------------------------------------------------|
 71. | VW Diesel            5,397    41       5    Foreign |
 72. | VW Rabbit            4,697    25       4    Foreign |
 73. | VW Scirocco          6,850    25       4    Foreign |
 74. | Volvo 260           11,995    17       5    Foreign |
     +-----------------------------------------------------+

You can use Stata files that are available, but didn’t ship with Stata and they use the webuse command

webuse fullauto, clear
list make price mpg rep78 foreign in 1/5
(Automobile Models)

     +-------------------------------------------+
     | make   price   mpg       rep78    foreign |
     |-------------------------------------------|
  1. |  AMC    4099    22     Average   Domestic |
  2. |  AMC    4749    17     Average   Domestic |
  3. |  AMC    3799    22           .   Domestic |
  4. | Audi    6295    23     Average    Foreign |
  5. | Audi    9690    17   Excellent    Foreign |
     +-------------------------------------------+

2.4 Importing Excel files

Many times you will need to grab data from a website and these data may be in an Excel format such as “.xls” or “.xlsx”. Our main command is import excel. We need to specify the firstrow option if we want the headers as variable names. Be careful if the headers are numerics, such as years, in the firstrow. This will throw an error and we’ll need to rename them in Excel before importing these files. For example, 2015 becomes y2015.

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
import excel using "dentists.xls", firstrow clear
list 
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

We need to take caution of the sheet names of the excel files we are importing. If you don’t pay attention you may import the wrong sheet.

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
import excel using "dentists2.xls", firstrow clear
list

import excel using "dentists2.xls", firstrow clear sheet("dentists")
list
     |       name   years |
     |--------------------|
  1. | I. Sue Yoo       3 |
  2. |   A. Dewey       8 |
  3. | B. Cheetem      11 |
  4. |    C. Howe      21 |
     +--------------------+

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

2.5 Importing SAS, SPSS, dbase files

In my opinion, importing files from other statistical packages is a pain. If you are using another statistical software packages like R, SAS, Python, or heaven-forbid SPSS, then export the data from those software package as CSV files. CSV files are fairly universal. There maybe some labeling issues, but if labels are not a concern then use CSV file formats.

Luckily, SAS data files are becoming less common. Even the Census Bureau has seen the light and provides their data in a csv format.

2.5.1 Importing .sas7bdat

For me, it is unavailable in Stata 14, but if you have Stata 16 or higher you can use this command.

Import sas is our main command to import SAS from version 7 or higher.

import sas "dentists.sas7bdat", clear
list

2.5.2 Importing SAS XPORT Version 5

import sasxport5 "dentists.xpt", clear
list

2.5.3 Importing SAS XPORT Version 8

import sasxport8 "dentists.xpt", clear
list

2.5.4 Importing SPSS files

For me, it is unavailable in Stata 14, but if you have Stata 16 or higher you can use this command.

import spss "dentlab.sav", clear
list

2.5.5 Importing dBase files

For me, it is unavailable in Stata 14, but if you have Stata 16 or higher you can use this command.

import dbase "dentlab.dbf", clear
list

2.6 Importing Raw Data files

We use the import delimited command to read in csv, txt, dat, etc. files.

2.6.1 Import Delimited

Importing csv files will likely be more common than other types of files so far with the exception of Stata and Excel files. Note: there are other types of delimited files, but comman-delimited are the most common. Other types of delimiters include “tab”, “space”, or “|” delimited.

CSV file If you read the manual you’ll noticed that “,” and “ are the default delimiters Using”,” delimiter:

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell"
import delimited using "dentists.csv", clear
list

quietly cd "/Users/Sam/Desktop/Data/CPS/"
import delimited using "jul25pub.csv", clear
list gestfips pemlr pternwa in 1/10
(4 vars, 5 obs)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+


(393 vars, 122,252 obs)

     +----------------------------+
     | gestfips   pemlr   pternwa |
     |----------------------------|
  1. |        1       .         . |
  2. |        1       5        -1 |
  3. |        1       6        -1 |
  4. |        1       5        -1 |
  5. |        1       1      8800 |
     |----------------------------|
  6. |        1       5        -1 |
  7. |        1       .         . |
  8. |        1       .         . |
  9. |        1       5        -1 |
 10. |        1       5        -1 |
     +----------------------------+

CSV files may come in .txt files or .csv files using “,” delimiter

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell"
import delimited using "dentists1.txt", clear
list
(4 vars, 5 obs)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

There may be “tab”-delimited files that use “ delimiter

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell"
import delimited using "dentists2.txt", clear
list
(4 vars, 5 obs)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

This will produce the same results as above since “ is a default

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell"
import delimited using "dentists2.txt", clear delimiters("\t")
list
(4 vars, 5 obs)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

If we set the delimiter to “:”, we can import a “:”-delimited file

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell"
import delimited using "dentists4.txt", clear delimiters(":") 
list
(4 vars, 5 obs)

     +-------------------------------------+
     |                v1      v2   v3   v4 |
     |-------------------------------------|
  1. | Y. Don Uflossmore    7.25    0    1 |
  2. |    Olive Tu'Drill   10.25    1    1 |
  3. | Isaac O'Yerbreath   32.75    1    1 |
  4. |      Ruth Canaale      22    1    1 |
  5. |        Mike Avity     8.5    0    0 |
     +-------------------------------------+

We can subset columns and rows with import delimited, which can be helpful for large files. Variable names will be defaulted for the first few rows, so when we use rowrange(1:3) this will only include 2 rows of data.

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell"
import delimited using "dentists2.txt", clear rowrange(1:3)
list
(4 vars, 2 obs)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
     +----------------------------------------------+

Import delimited from the web

import delimited using "https://www2.census.gov/programs-surveys/cps/datasets/2023/basic/jun23pub.csv", clear
pwd
save "jun23pub.dta", replace

2.6.2 Import delimited with subset of rows

You can use the rowrange option to subset large dataset. For example, the NPPES admin data is over 10 GB. If we want to inspect the data, we can just use rowrange. Another example would be the American Community Survey Public Use File.

cd "/Users/Sam/Desktop/Data/NPPES/"
import delimited using npidata_pfile_20050523-20241208.csv, varnames(1) rowrange(1:10) clear

2.6.3 Import space-separated files

Sometimes files have space-delimited files (which is a bad idea when strings are involved). Dentists5.txt is an example of such a file. We can use the infile command to import these data, but we need to specify that the name variable is a string at least 17 characters wide.

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell"
infile str17 name years full rec using "dentists5.txt", clear
list
(5 observations read)

     +----------------------------------------+
     |              name   years   full   rec |
     |----------------------------------------|
  1. | Y. Don Uflossmore    7.25      0     1 |
  2. |    Olive Tu'Drill   10.25      1     1 |
  3. | Isaac O'Yerbreath   32.75      1     1 |
  4. |      Ruth Canaale      22      1     1 |
  5. |        Mike Avity     8.5      0     0 |
     +----------------------------------------+

Infile does not read in variable names in the first row.

We can also use import delimited with delimiters(” “)

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell"
import delimited using "dentists6.txt", delimiters(" ") clear
list
(4 vars, 5 obs)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

2.6.4 Importing fixed-column files

Unfortunately, fixed-column files are more common than I prefer. They are a pain, since you need to specify each column length. I have found this with files with a “.dat” file extension.

We can use the infix command to read in fixed-column files. Let’s look at our fixed-column width files.

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
type "dentists7.txt"
Y. Don Uflossmore 7.2501
Olive Tu'Drill   10.2511
Isaac O'Yerbreath32.7511
Ruth Canaale     22.0011
Mike Avity        8.5000

Our name variable is between 1 and 17, years are between 18 and 22. Our fulltime and recommend binaries are only length of 1 with 23 and 24.

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
infix str name 1-17 years 18-22 fulltime 23 recom 24 using "dentists7.txt", clear
list
(5 observations read)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

This is a real pain, but luckily Stata has dictionary files to help if one of these files are available. They have a .dct file extension. We already have a fixed-column dictionary with dentists1.dct and we can use it to open a fixed-column width file dentists7.txt

Let’s look at our dictionary file.

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
type "dentists1.dct"
infix dictionary {
  str name 1-17 years 18-22 fulltime 23 recom 24
}

Let’s look at our fixed-column width file

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
type "dentists7.txt"
Y. Don Uflossmore 7.2501
Olive Tu'Drill   10.2511
Isaac O'Yerbreath32.7511
Ruth Canaale     22.0011
Mike Avity        8.5000

Let’s use the two together to import the data file

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
infix using "dentists1.dct", using(dentists7.txt) clear
list
infix dictionary {
  str name 1-17 years 18-22 fulltime 23 recom 24
}
(5 observations read)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

We can combine infile command with a dictionary file.

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
type "dentists3.dct"
infile dictionary using dentists7.txt {
  str17 name     %17s  "Name of dentist"
        years    %5f   "Years in practice"
        fulltime %1f   "Full time?"
        recom    %1f   "Recommend Quaddent?"
}

Notice that the first line of the dictionary file is infile dictionary using dentists7.txt and then the column widths

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
infile using "dentists3.dct", clear
list
infile dictionary using dentists7.txt {
  str17 name     %17s  "Name of dentist"
        years    %5f   "Years in practice"
        fulltime %1f   "Full time?"
        recom    %1f   "Recommend Quaddent?"
}

(5 observations read)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
  4. |      Ruth Canaale      22          1       1 |
  5. |        Mike Avity     8.5          0       0 |
     +----------------------------------------------+

Dictionary .dct files are very helpful when they are available with fixed-column width files.

2.6.5 Subsetting fixed-column width files

We can use the “in” qualifier to subset the number of rows to import with either infix or infile.

quietly cd "/Users/Sam/Desktop/Econ 645/Data/Mitchell/"
infix str name 1-17 years 18-22 fulltime 23 recom 24 using "dentists7.txt" in 1/3, clear
list

infix using "dentists1.dct" in 1/3, using(dentists7.txt) clear
list
(3 observations read)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
     +----------------------------------------------+

infix dictionary {
  str name 1-17 years 18-22 fulltime 23 recom 24
}
(3 observations read)

     +----------------------------------------------+
     |              name   years   fulltime   recom |
     |----------------------------------------------|
  1. | Y. Don Uflossmore    7.25          0       1 |
  2. |    Olive Tu'Drill   10.25          1       1 |
  3. | Isaac O'Yerbreath   32.75          1       1 |
     +----------------------------------------------+

2.6.6 2.8 Common errors when reading and importing files

Tip: Don’t forget to clear your data. Either the line before or add it as an option , clear

Error 1 If you do not clear your data with the use command, then you will get the message “No; dataset in memory has changed since last saved” - Use clear in the line before ingesting the data, or use clear as an option.

Error 2 If you do not clear your data with the import, infix, or infile commands, then you will get the message: “You must start with an empty dataset”

2.6.7 Entering data directly into Stata Data Editor

My opinion: Please do not do this. If you need to manually enter data, I don’t recommend manually entering data into STATA. I recommend using a spreadsheet, then import the spreadsheet.

If for some reason you don’t have access to Excel or a spreadsheet application like OSX Numbers or Google Sheets, then read pages 43-50.