
  • Import multiple raw data with unique dictionary

    Hello forum,

    I need to build a single dataset from a couple thousand raw data files. The data are public precipitation records captured by thousands of rain stations, each with its own file, and I created a dictionary from the codebook. The dictionary works fine, and I managed to generate the 2000+ dictionaries with Python so I can call them from Stata, but I am not happy with that solution. There must be a way to use a single dictionary to import more than one data file when they all share the same structure. The problem is that Stata only uses the line 'quietly infile using dictionary_name', and the data file to read has to be specified inside the dictionary itself, so I do not know how to change the file the dictionary points to inside a loop. I hope my question is clear enough. I would appreciate it if someone could help me sort this out.

    The dictionary (and thus, the data), looks like this:

    Code:
    dictionary using dictionary_name.hly {
    _column(1) str11 ID %11s
    _column(12) int YEAR %4f
    _column(16) int MONTH %2f
    _column(18) int DAY %2f
    _column(20) str4 ELEMENT %4s
    _column(24) int VALUE1 %5f
    _column(29) str1 MFLAG1 %1s
    _column(30) str1 QFLAG1 %1s
    _column(31) str1 SFLAG1 %1s
    _column(32) str1 S2FLAG1 %1s
    _column(33) int VALUE2 %5f
    _column(38) str1 MFLAG2 %1s
    _column(39) str1 QFLAG2 %1s
    _column(40) str1 SFLAG2 %1s
    _column(41) str1 S2FLAG2 %1s
    ...
    ....
    _column(231) int VALUE24 %5f
    _column(236) str1 MFLAG24 %1s
    _column(237) str1 QFLAG24 %1s
    _column(238) str1 SFLAG24 %1s
    _column(239) str1 S2FLAG24 %1s
    }
    Thanks!
    Last edited by Jonathan Moreno; 22 Jul 2017, 16:20.

  • #2
    Welcome to Statalist.

    Along with the using clause that names the dictionary, infile accepts a separate using() option that names the text file to read. See the output of help infile2, which describes the use of the infile command in conjunction with a dictionary. You will want something like
    Code:
    infile using dictionary_name, using(datafile_name)
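
    To read all the station files with that one dictionary, the command can go inside a loop. The following is only a sketch under a few assumptions that are not in the original post: the raw files sit in a hypothetical folder called rawdata, they all end in .hly, and the first line of the dictionary has been changed to plain "dictionary {" so that the data file is named only through the using() option.

    ```stata
    * Sketch only: "rawdata" and the *.hly pattern are assumed names, adjust to your setup.
    * Assumes dictionary_name.dct starts with "dictionary {" (no data file named inside it).
    clear
    tempfile combined
    save `combined', emptyok            // start from an empty dataset

    * Collect the names of all raw files in the folder
    local files : dir "rawdata" files "*.hly"

    foreach f of local files {
        * Read one station file using the shared dictionary
        quietly infile using dictionary_name, using("rawdata/`f'") clear
        * Stack it on top of everything read so far
        append using `combined'
        save `combined', replace
    }

    use `combined', clear               // the combined data from all files
    ```

    Appending to a growing temporary file is simple but can get slow with a couple thousand files; saving each station as its own .dta and appending them all at the end is a reasonable alternative.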
