First row from CSV file shows up as var1, var2 and var3

Tariq Abdullah

Join Date: Apr 2021
Posts: 366

21 Oct 2023, 09:46

After running either of the commands below, my data in Stata look like the following: the intended variable names show up as data in row 1, while the actual variable names come up as var1, var2, var3 and so on.

What can be done to get rid of this issue? Any suggestions are appreciated!

Code:

insheet using "est16all.csv", clear names

import delimited using est16all, varnames(1) clear

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str16 var2
"County FIPS Code"
"000"             
"000"             
"001"             
"003"             
"005"             
"007"             
"009"             
"011"             
"013"             
"015"             
"017"             
"019"             
"021"             
"023"             
"025"        
"290"             
end


Tariq Abdullah

Join Date: Apr 2021
Posts: 366

21 Oct 2023, 10:16

A better example of my data after importing the CSV file into Stata:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str15 Tablewithcolumnheadersinrows3and str33 v4 str16 v2
"State FIPS Code" "Name"                              "County FIPS Code"
"00"              "United States"                     "000"             
"01"              "Alabama"                           "000"             
"01"              "Autauga County"                    "001"             
"01"              "Baldwin County"                    "003"             
"01"              "Barbour County"                    "005"             
"01"              "Bibb County"                       "007"             
"01"              "Blount County"                     "009"      
end


Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2389
#3

21 Oct 2023, 10:27

What do the first few lines of your CSV file look like? We would need those to diagnose a method of import. If your csv file is well behaved, with variable names on the first row and data on all following rows, the import command should work fine.
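
If it helps, here is a minimal sketch of how one might look at the raw file and then point -import delimited- at the real header row. The row number 4 below is only a placeholder assumed for illustration; substitute whichever row -type- shows the column headers on.

```stata
* Show the first few raw lines of the file to see where the header row sits.
type "est16all.csv", lines(6)

* Assumption: the column headers are on row 4 (placeholder; adjust as needed).
* With varnames(#), rows before row # are not imported.
import delimited using "est16all.csv", varnames(4) clear

* Check which variable names Stata ended up assigning.
describe
```

If the file has a title line or blank lines above the headers, varnames(1) would pick up the wrong row, which is one possible reason header text ends up as data.
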
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#4

21 Oct 2023, 10:27

The reason that Stata is ignoring your -varnames(1)- option and giving you var1-var3 as the variable names is because the contents of that first row in the CSV file are not legal variable names. Legal variable names contain only letters, digits, and the underscore (_) character; in particular, blank spaces are not allowed. Moreover the first character may not be a digit, and the total length cannot exceed 32 characters. Except for the length limitation, Stata's -strtoname()- function will convert strings so that they satisfy the requirements.

Here's how you can fix up the imported file:

Code:

foreach v of varlist _all {
    replace `v' = strtoname(substr(`v', 1, 32)) in 1
    rename `v' `=`v'[1]'
}
drop in 1

Note: A problem can arise if two or more of the intended variable names are identical in their first 32 characters. In that case, the above code will attempt to give two variables the same new name, and Stata will not do that. It will break and give you an error message. In this situation, you will have to manually review the original variable names and edit them in some way that shortens them to 32 characters or less but preserves their distinctiveness. How you do that in a way that preserves other desirable attributes of variable names, such as ease of typing, understandability, and representativeness of what the variable is, is a non-automatable task that must be done by a human (or, I suppose, artificial) intelligence.
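
One possible way to sidestep the collision automatically, as a sketch only: build the cleaned name first and, if that name is already taken, fall back to a numbered variant. The toy data, the local macros i and new, and the numeric suffix scheme are all assumptions made for illustration, and the suffixed names would still want a human review afterwards.

```stata
* Toy data standing in for an import whose header row landed in observation 1.
clear
input str20 v1 str20 v2 str20 v3
"State FIPS Code" "90% CI Lower Bound" "90% CI Lower Bound"
"01" "10" "20"
end

local i = 0
foreach v of varlist _all {
    local ++i
    * Clean the header text held in observation 1 into a legal name.
    local new = strtoname(substr(`v'[1], 1, 32))
    * If that name cannot be used for a new variable, append the column number.
    capture confirm new variable `new'
    if _rc {
        local new = substr("`new'", 1, 28) + "_`i'"
    }
    rename `v' `new'
}
drop in 1
describe
```

Truncating to 28 characters before adding the suffix keeps the result within the 32-character limit for up to 999 columns. Like the original loop, this assumes every variable was imported as a string.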

Added: Crossed with #3.

Last edited by Clyde Schechter; 21 Oct 2023, 10:33.

Tariq Abdullah

Join Date: Apr 2021
Posts: 366

21 Oct 2023, 10:29

While I'm trying to rename the variables, some renames succeed, but some give me the following error:

Code:


. rename v2 county 

. 
. rename v3 poverty estimate all 
syntax error
    Syntax is
        rename  oldname    newname   [, renumber[(#)] addnumber[(#)] sort ...]
        rename (oldnames) (newnames) [, renumber[(#)] addnumber[(#)] sort ...]
        rename  oldnames              , {upper|lower|proper}
r(198);


Tariq Abdullah

Join Date: Apr 2021

Posts: 366
#6

21 Oct 2023, 10:33

Originally posted by Clyde Schechter (#4)

After running your code, everything went well through variable 8, but it broke somewhere in variables 9 to 32 with the following error:

Code:

foreach v of varlist _all {
  2.     replace `v' = strtoname(substr(`v', 1, 32)) in 1
  3.     rename `v' `=`v'[1]'
  4. }
(1 real change made)
(1 real change made)
(1 real change made)
(0 real changes made)
(1 real change made)
variable v6 was str18 now str19
(1 real change made)
variable v7 was str18 now str19
(1 real change made)
(1 real change made)
variable v9 was str18 now str19
(1 real change made)
variable _90__CI_Lower_Bound already defined
r(110);

end of do-file

r(110);
Tariq Abdullah

Join Date: Apr 2021

Posts: 366
#7

21 Oct 2023, 10:39

I found a way around it, since I don't need all the variables. Thanks again for helping me with the problem for the nth time.

Your mentoring is always appreciated! Much obliged.
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#8

21 Oct 2023, 10:45

-rename- does not know how to parse -rename v3 poverty estimate all-. Do you want to rename v3 to poverty and estimate to all? Or do you want to rename v3 and poverty to estimate and all, respectively? If the latter, it's -rename (v3 poverty) (estimate all)-. If the former, it's -rename (v3 estimate) (poverty all)-.
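
For illustration, a small self-contained sketch of the paired-list syntax on toy data. It assumes the goal of the failing command was a single variable called poverty_estimate_all (a guess, not something stated in the thread); the toy values and the label text are likewise made up.

```stata
* Toy data with the two generic names from the failing session.
clear
set obs 1
generate str3 v2 = "001"
generate str5 v3 = "12345"

* Paired lists: v2 becomes county, v3 becomes poverty_estimate_all.
rename (v2 v3) (county poverty_estimate_all)

* Spaces are allowed in a variable label, just not in the name.
label variable poverty_estimate_all "poverty estimate all"
describe
```

Joining the words with underscores gives -rename- exactly one new name per old name, which is what its syntax diagram requires.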
Tariq Abdullah

Join Date: Apr 2021

Posts: 366
#9

21 Oct 2023, 19:13

Alright, Mr. Schechter! Understood and duly noted.
