  • variable names

    Is there any essential reason for
    - import delimited to use option varnames(1) and
    - import excel to use option firstrow
    for essentially the same thing?

    Similarly with the data type options: the two commands provide different syntaxes for saying which variables are strings and which are numeric, though these could be unified.
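
    For concreteness, here is a minimal sketch of the two forms side by side. The file names mydata.csv and mydata.xlsx and the column positions are hypothetical, chosen only for illustration, and the remark about allstring reflects my reading of the help files:

    ```stata
    * Hypothetical files; both have variable names in the first row.
    * CSV: the header row is given by number via varnames()
    import delimited using "mydata.csv", varnames(1) clear

    * Excel: the same request is a switch called firstrow
    import excel using "mydata.xlsx", firstrow clear

    * Data types: import delimited accepts lists of column numbers...
    import delimited using "mydata.csv", varnames(1) stringcols(1 3) numericcols(2) clear

    * ...while import excel has allstring, which applies to every column
    import excel using "mydata.xlsx", firstrow allstring clear
    ```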

    Everything works as documented, but since both commands come from the same source, perhaps their syntax could follow the same naming rules and provide similar control capabilities?

    (I just convinced my colleague to change the storage format of our input files from Excel to the more transparent and lightweight CSV, and now I have to go through my code to make the adjustments...)

  • #2
    It used to be more common for csv and dat files (typically ASCII-encoded) to have nonstandard header information, like comments and variable definitions. Having a way to state on which lines the header and data are located is a time-saver and prevents data from being imported as the wrong type. You can still find such files among long-running government public datasets.
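
    As a sketch of what that looks like in practice, suppose a hypothetical file survey.csv has four comment lines, variable names on line 5, and data from line 6 onward (the file names and line numbers are assumptions for illustration only):

    ```stata
    * Names are on line 5; the lines before it are not imported
    import delimited using "survey.csv", varnames(5) clear

    * Rough Excel counterpart: start the cell range at the header row
    import excel using "survey.xlsx", cellrange(A5) firstrow clear
    ```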
    Last edited by Leonardo Guizzetti; 29 Jan 2024, 17:54. Reason: edit: grammar

    • #3
      Hello Leonardo,

      I understand that. And there are equally many Excel files with various complexities in the header.
      The question is why import delimited specifically uses the word varnames for this, while import excel uses firstrow.

      Thank you, Sergiy

      • #4
        I'm inclined to agree with Sergiy. In fact, I have brought up this issue in the Wishlist thread previously in a slightly different context: why do we specify the display width for a variable name as -varwidth()- when running -ds- but as -abbreviate()- when running -list-? There are some other instances of this in Stata, too. I think the language would be improved by using consistent names for things that are the same across commands. Evidently, the old names would need to be preserved under version control, but going forward consistency would be a virtue.
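
        A minimal illustration of that pair, using the auto dataset shipped with Stata (the width of 16 is an arbitrary choice for the example):

        ```stata
        sysuse auto, clear

        * Both options control how much room a variable name gets,
        * yet they are spelled differently in each command
        ds, varwidth(16)
        list make price in 1/5, abbreviate(16)
        ```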
