
  • First row from CSV file shows up as var1, var2 and var3

    After using either of the commands below, my data loaded into Stata look like the following: the intended variable names show up in row 1, and the actual variable names come up as var1, var2, var3, and so on.

    What can be done to get rid of this issue? Any suggestion is appreciated!

    Code:
    insheet using "est16all.csv", clear names
    
    import delimited using est16all, varnames(1) clear
    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str16 var2
    "County FIPS Code"
    "000"             
    "000"             
    "001"             
    "003"             
    "005"             
    "007"             
    "009"             
    "011"             
    "013"             
    "015"             
    "017"             
    "019"             
    "021"             
    "023"             
    "025"        
    "290"             
    end

  • #2
    A better example of my data after loading the CSV file into Stata:


    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str15 Tablewithcolumnheadersinrows3and str33 v4 str16 v2
    "State FIPS Code" "Name"                              "County FIPS Code"
    "00"              "United States"                     "000"             
    "01"              "Alabama"                           "000"             
    "01"              "Autauga County"                    "001"             
    "01"              "Baldwin County"                    "003"             
    "01"              "Barbour County"                    "005"             
    "01"              "Bibb County"                       "007"             
    "01"              "Blount County"                     "009"      
    end



    • #3
      What do the first few lines of your CSV file look like? We would need to see those to work out how to import it. If your CSV file is well behaved, with variable names on the first row and data on all following rows, the import command should work fine.
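
      One way to answer that question from inside Stata is to read the first few rows with no header handling, and then point -varnames()- at whichever row actually holds the column headers. The following is only a sketch on a made-up file: the file name header_demo.csv, its contents, and the assumption that the headers sit in row 3 are all hypothetical and are not taken from est16all.csv.

      ```stata
      * Build a small demonstration CSV (hypothetical contents): a title line,
      * an empty line, the real headers in row 3, and data from row 4 onward.
      local csv "header_demo.csv"
      tempname fh
      file open `fh' using "`csv'", write text replace
      file write `fh' "Table with headers in row 3" _n
      file write `fh' ",," _n
      file write `fh' "State FIPS Code,County FIPS Code,Name" _n
      file write `fh' "01,000,Alabama" _n
      file write `fh' "01,001,Autauga County" _n
      file close `fh'

      * Step 1: look at the first three rows as plain data, with no row
      * treated as variable names, to see where the headers really are.
      import delimited using "`csv'", delimiter(",") varnames(nonames) ///
          rowrange(1:3) stringcols(_all) clear
      list

      * Step 2: re-import, telling Stata that row 3 holds the variable names.
      * stringcols(_all) keeps leading zeros in codes such as "01" and "000".
      import delimited using "`csv'", delimiter(",") varnames(3) ///
          stringcols(_all) clear
      describe

      * Remove the demonstration file.
      erase "`csv'"
      ```

      If the real file turns out to have its headers on some other row, only the number inside -varnames()- would need to change.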



      • #4
        The reason that Stata is ignoring your -varnames(1)- option and giving you var1-var3 as the variable names is that the contents of that first row in the CSV file are not legal variable names. Legal variable names contain only letters, digits, and the underscore (_) character; in particular, blank spaces are not allowed. Moreover, the first character may not be a digit, and the total length cannot exceed 32 characters. Except for the length limitation, Stata's -strtoname()- function will convert strings so that they satisfy the requirements.
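
        To see what -strtoname()- does to a header, it can be tried directly with -display-. This is a small illustration only: the first string is a header shown earlier in this thread, while the second is a guess at what a header beginning with a digit might look like, not something copied from the CSV file.

        ```stata
        * Blanks become underscores.
        * Displays: County_FIPS_Code
        display strtoname("County FIPS Code")

        * Illegal characters such as % also become underscores, and a name
        * that would start with a digit gets a leading underscore.
        * Displays: _90__CI_Lower_Bound
        display strtoname("90% CI Lower Bound")
        ```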

        Here's how you can fix up the imported file:

        Code:
        foreach v of varlist _all {
            replace `v' = strtoname(substr(`v', 1, 32)) in 1
            rename `v' `=`v'[1]'
        }
        drop in 1
        Note: A problem can arise if two or more of the intended variable names are identical in their first 32 characters. In that case, the above code will attempt to give two variables the same new name, and Stata will not do that. It will break and give you an error message. In this situation, you will have to manually review the original variable names and edit them in some way that shortens them to 32 characters or less but preserves their distinctiveness. How you do that in a way that preserves other desirable attributes of variable names, such as ease of typing, understandability, and representativeness of what the variable is, is a non-automatable task that must be done by a human (or, I suppose, artificial) intelligence.
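
        If meaningful names are not needed right away, one possible stopgap is to let the loop detect a clash and append the column position to the name. This is a sketch under stated assumptions, not a replacement for reviewing the names by hand: the toy data, the local macro names, and the "_#" suffix scheme are all invented for illustration.

        ```stata
        * Toy data (hypothetical): row 1 holds the headers, two of them identical.
        clear
        input str20 v1 str20 v2 str20 v3
        "County FIPS Code" "90% CI Lower Bound" "90% CI Lower Bound"
        "001" "10.1" "4.2"
        end

        local i = 0
        foreach v of varlist _all {
            local ++i
            * Candidate name built from the header stored in row 1.
            local new = strtoname(substr(`v'[1], 1, 32))
            * -confirm new variable- fails if that name is already in use.
            capture confirm new variable `new'
            if _rc {
                * Trim to leave room for the suffix, then add the column position.
                local new = substr("`new'", 1, 29) + "_`i'"
            }
            rename `v' `new'
        }
        drop in 1
        describe
        ```

        On the toy data the third column ends up as _90__CI_Lower_Bound_3, which avoids the clash but says nothing about what the column means, so the manual review described above is still needed for names you intend to keep.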

        Added: Crossed with #3.
        Last edited by Clyde Schechter; 21 Oct 2023, 10:33.



        • #5
          When I try to rename the variables, some commands succeed, but some give me the following error:

          Code:
          
          . rename v2 county 
          
          . 
          . rename v3 poverty estimate all 
          syntax error
              Syntax is
                  rename  oldname    newname   [, renumber[(#)] addnumber[(#)] sort ...]
                  rename (oldnames) (newnames) [, renumber[(#)] addnumber[(#)] sort ...]
                  rename  oldnames              , {upper|lower|proper}
          r(198);



          • #6
            After using your code, everything went well through variable 8, but it broke somewhere in variables 9 to 32 with the following error:


            Code:
             foreach v of varlist _all {
              2.     replace `v' = strtoname(substr(`v', 1, 32)) in 1
              3.     rename `v' `=`v'[1]'
              4. }
            (1 real change made)
            (1 real change made)
            (1 real change made)
            (0 real changes made)
            (1 real change made)
            variable v6 was str18 now str19
            (1 real change made)
            variable v7 was str18 now str19
            (1 real change made)
            (1 real change made)
            variable v9 was str18 now str19
            (1 real change made)
            variable _90__CI_Lower_Bound already defined
            r(110);
            
            end of do-file
            
            r(110);



            • #7
              I found a way around it, since I don't need all the variables. Thanks again for helping me with the problem for the nth time.

              Your mentoring is always appreciated! Much obliged.



              • #8
                -rename- does not know how to parse -rename v3 poverty estimate all-. Do you want to rename v3 to poverty and estimate to all? Or do you want to rename v3 and poverty to estimate and all, respectively? If the latter, it's -rename (v3 poverty) (estimate all)-. If the former, it's -rename (v3 estimate) (poverty all)-.
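
                A third possibility, which is only a guess at the intent, is that "poverty estimate all" was meant to be a single name. Since blanks are not allowed in variable names, that would need underscores or similar. The sketch below uses a made-up one-observation dataset whose variable names echo the ones in #5; the new names are illustrative.

                ```stata
                * Toy dataset (hypothetical) with the variables being discussed.
                clear
                set obs 1
                generate v2 = 1
                generate v3 = 2
                generate estimate = 3

                * One old name, one new name: the basic form.
                rename v2 county

                * A multi-word label has to become one legal name.
                rename v3 poverty_estimate_all

                * Several variables at once: parenthesized lists of equal length.
                rename (county estimate) (county_fips est)
                describe
                ```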



                • #9
                  Alright, Mr Schechter! Understood and duly noted.
