  • test

    Thank you so much for your detailed advice. I will keep that in mind.

    This is the sample data structure of two county groups and their municipal subdivisions. I want county names such as "Allegany County" and "Anne Arundel County" to appear beside all the municipal subdivisions below them as a separate column.
    I also want to rename the first "Totals" in each county group to "Totals_incorporated".

    I am working on how to modify our code for these two tasks.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str27 municipal_division str17 Number_Returns_Filed
    "Allegany County"             ""          
    "Barton"                      "209"        
    "Bel Air"                     "683"        
    "Bowling Green-Roberts Place" "488"        
    "Cresaptown"                  "711"        
    "Cumberland"                  "8272"      
    "Ellerslie"                   "233"        
    "Frostburg"                   "2844"      
    "LaVale"                      "2816"      
    "Lonaconing"                  "422"        
    "Luke"                        "45"        
    "McCoole"                     "139"        
    "Midland"                     "203"        
    "Mt. Savage"                  "359"        
    "Potomac Park Addition"       "396"        
    "Westernport"                 "751"        
    "Totals"                      "18571"      
    "Rural and Unincorporated"    "10171"      
    "Totals"                      "28742"      
    ""                            ""          
    "Anne Arundel County"         ""          
    "Annapolis"                   "21414"      
    "Highland Beach"              "91"        
    "Totals"                      "21505"      
    "Rural and Unincorporated"    "263550"    
    "Totals"                      "285055"    
    ""                            ""        
    end
    Thank you once again!
    Last edited by Chinmay Korgaonkar; 04 Oct 2023, 17:48.
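One possible way to approach the two tasks in this post, as a minimal sketch rather than a definitive solution. It assumes that header rows can be recognised because the name contains "County" and the returns column is empty, and that the first "Totals" within each county group is the incorporated subtotal. The names is_header, county, obs_order and totals_seq are made up for illustration, and the data are a shortened copy of the example above.

```stata
clear
input str27 municipal_division str17 Number_Returns_Filed
"Allegany County"          ""
"Barton"                   "209"
"Totals"                   "18571"
"Rural and Unincorporated" "10171"
"Totals"                   "28742"
""                         ""
"Anne Arundel County"      ""
"Annapolis"                "21414"
"Totals"                   "21505"
"Rural and Unincorporated" "263550"
"Totals"                   "285055"
end

* Assumption: a header row contains "County" and has no count recorded
gen byte is_header = strpos(municipal_division, "County") > 0 & missing(Number_Returns_Filed)

* Copy the header name into a new column and carry it down the group
gen county = municipal_division if is_header
replace county = county[_n-1] if missing(county) & _n > 1

* Count the "Totals" rows within each county and rename the first one
gen long obs_order = _n
bysort county (obs_order): gen totals_seq = sum(municipal_division == "Totals")
replace municipal_division = "Totals_incorporated" if municipal_division == "Totals" & totals_seq == 1
sort obs_order

* Drop header rows and blank separator rows once the county column exists
drop if is_header | missing(municipal_division)
drop is_header totals_seq obs_order
```

The carry-down relies on replace working through the observations in order, so each row can pick up the value just written to the row above it. The obs_order variable is only there to restore the original row order after bysort.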

  • #2
    Hello everyone,
    I am facing two issues while cleaning the data for other years. Both are due to the formatting of the Excel files, which I converted from PDF files.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str29 Municipal_division str13 Number_Returns_Filed str17(Net_State_Tax Net_Local_Tax Net_Total_Tax) double G
    "Calvert"                    ""       ""                  ""             ""                             .
    "Chesapeake Beach"           "2571"   "6731199.41"        "3983224.57"   "10714423.98"                  .
    "North Beach"                "1004"   "1944394.88"        "1171853.44"   "3116248.32"                   .
    "Totals"                     "3575"   "8675594.289999999" "5155078.01"   "13830672.3"                   .
    " Rural and Unincorporated " "37878"  "98603342.31999999" "58553325.56"  "157156667.88"                 .
    "Totals"                     "41453"  "107278936.61"      "63708403.57"  "170987340.18"                 .
    "Caroline"                   ""       ""                  ""             ""                             .
    "Denton"                     "1687"   "2143260.67"        "1275887.03"   "3419147.7"                    .
    "Federalsburg"               "1186"   "797918.3"          "496939.62"    "1294857.92"                   .
    "Goldsboro"                  "124"    "99374.85000000001" "66705.64"     "166080.49"                    .
    "Greensboro"                 "768"    "736489.26"         "450834.62"    "1187323.88"                   .
    "Henderson"                  "56"     "30126"             "20624.84"     "50750.84"                     .
    "Hillsboro"                  "89"     "64075.18"          "38285.98"     "102361.16"                    .
    "Marydel"                    "88"     "53539.35"          "37393.83"     "90933.17999999999"            .
    "Preston"                    "329"    "455103.3"          "263709.39"    "718812.6899999999"            .
    "Ridgely"                    "696"    "901211.84"         "530353.13"    "1431564.97"                   .
    "Templeville"                "12"     "3974"              "2929.06"      "6903.06"                      .
    "Totals"                     "5035"   "5285072.75"        "3183663.14"   "8468735.890000001"            .
    " Rural and Unincorporated " "9369"   "13495233.97"       "8070272.44"   "21565506.41"                  .
    "Totals"                     "14404"  "18780306.72"       "11253935.58"  "30034242.3"                   .
    ""                           ""       ""                  ""             ""                             .
    ""                           ""       ""                  ""             ""                             .
    ""                           ""       ""                  ""             ""                             .
    "Carroll"                    ""       ""                  ""             ""                             .
    "Hampstead"                  ""       "3014"              "6087945.03"   "3977285.9"          10065230.93
    "Manchester"                 ""       "2123"              "4242961.55"   "2776010.67"          7018972.22
    "Mt. Airy"                   ""       "2305"              "7008405.53"   "4521873.2"          11530278.73
    "New Windsor"                ""       "604"               "1104205.52"   "723073.86"           1827279.38
    "Sykesville"                 ""       "2105"              "5372577.96"   "3507678.14"           8880256.1
    "Taneytown"                  ""       "2989"              "4410801.97"   "2902180.52"          7312982.49
    "Union Bridge"               ""       "381"               "487966.08"    "322711.6"             810677.68
    "Westminster"                ""       "8777"              "15246812.22"  "9955907.35"         25202719.57
    "Totals"                     "TOTALS" "22298"             "43961675.86"  "28686721.24"         72648397.1
    " Rural and Unincorporated " ""       "57765"             "156351597.97" "100549786.85"      256901384.82
    "Totals"                     "TOTALS" "80063"             "200313273.83" "129236508.09"      329549781.92
    end
    1. What should I do when a header cannot be clearly identified because there is no blank row between two county groups? For example, Calvert and Caroline are two different counties, but as there is no blank row between them, Stata does not identify Caroline as a header. If I use insobs 1, after(1), the issue is resolved for one case, but I would have to repeat this for every such case in the data.
    2. For Carroll County, the values are shifted one column to the right. The G variable should not exist. Using replace to change the value of each cell by hand is not feasible.
    Thanks in advance!
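A minimal sketch of one way to handle both issues without inserting rows or editing cells by hand. It rests on assumptions that go beyond the post: a row is treated as shifted whenever G is non-missing, and a row is treated as a county header whenever it has a name but every value column is empty. The names shifted, is_header and county are made up for illustration, the data are a shortened copy of the example above, and strtrim() and strofreal() assume Stata 14 or newer.

```stata
clear
input str29 Municipal_division str13 Number_Returns_Filed str17(Net_State_Tax Net_Local_Tax Net_Total_Tax) double G
"Calvert"                    ""       ""            ""            ""            .
"Chesapeake Beach"           "2571"   "6731199.41"  "3983224.57"  "10714423.98" .
" Rural and Unincorporated " "37878"  "98603342.32" "58553325.56" "157156667.88" .
"Caroline"                   ""       ""            ""            ""            .
"Denton"                     "1687"   "2143260.67"  "1275887.03"  "3419147.7"   .
"Carroll"                    ""       ""            ""            ""            .
"Hampstead"                  ""       "3014"        "6087945.03"  "3977285.9"   10065230.93
"Totals"                     "TOTALS" "22298"       "43961675.86" "28686721.24" 72648397.1
end

* Remove the stray leading and trailing blanks left by the PDF conversion
replace Municipal_division = strtrim(Municipal_division)

* Issue 2. Assumption: a non-missing G marks a row shifted one column right
gen byte shifted = !missing(G)
replace Number_Returns_Filed = Net_State_Tax if shifted
replace Net_State_Tax = Net_Local_Tax if shifted
replace Net_Local_Tax = Net_Total_Tax if shifted
replace Net_Total_Tax = strtrim(strofreal(G, "%15.2f")) if shifted
drop G shifted

* Issue 1. Assumption: a header has a name but no values in any column
gen byte is_header = !missing(Municipal_division) & missing(Number_Returns_Filed) ///
    & missing(Net_State_Tax) & missing(Net_Local_Tax) & missing(Net_Total_Tax)

* Carry the header name down to the rows that belong to it
gen county = Municipal_division if is_header
replace county = county[_n-1] if missing(county) & _n > 1
```

The shift is done left to right so that each column is overwritten only after its old contents have been copied out. It also overwrites the stray "TOTALS" text in Number_Returns_Filed. Identifying headers by their empty value columns does not depend on blank rows between groups, so insobs would not be needed.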



    • #3
      Hello everyone,

      I want to generate a new indicator variable that flags observations in which a string variable contains a specific word. For example, I want var1 = 1 for every observation whose "Municipal_division" contains the word "County". I tried split and the substring functions but was not able to get what I wanted. Thanks in advance.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str35 Municipal_division str17 Real_property_tax_municipal
      "Allegany County"           ""      
      "Barton"                    ".225"  
      "Cumberland"                "1.0595"
      "Frostburg"                 ".7"    
      "Lonaconing"                ".3084" 
      "Luke - Residential"        "1.5"   
      "Luke - Commercial"         "2.5"   
      "Luke - Rented Residential" "2.07"  
      "Midland"                   ".28"   
      "Westernport"               ".6"    
      "Anne Arundel County"       ""      
      "Annapolis"                 ".738"  
      "Highland Beach"            ".1505" 
      "Baltimore City"            ""      
      "Baltimore County"          ""      
      end
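A minimal sketch of one way to build such an indicator, using a shortened copy of the example data. strpos() returns the position of the first match, or 0 when the text is absent, so comparing it with 0 gives a 0/1 variable. The second variable, var1_end (a made-up name), is a stricter variant that assumes "County" should only count when it ends the name.

```stata
clear
input str35 Municipal_division str17 Real_property_tax_municipal
"Allegany County"     ""
"Barton"              ".225"
"Anne Arundel County" ""
"Annapolis"           ".738"
"Baltimore City"      ""
"Baltimore County"    ""
end

* 1 if "County" appears anywhere in the name, 0 otherwise
gen byte var1 = strpos(Municipal_division, "County") > 0

* Variant: 1 only if the name ends in "County"
gen byte var1_end = regexm(Municipal_division, "County$")
```

Rows without the word get 0 rather than missing. If missing is preferred, gen byte var1 = 1 if strpos(Municipal_division, "County") > 0 would do that instead. "Baltimore City" does not contain the word, so it gets 0 and would need its own condition if it should be flagged too.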
