Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to Sum or add Across Column with if command

    Hello Fellow Statalist,

    I have a dataset that contains the following vatiables: Date, CIF, FOB, Country and Descriptions (store as string) in excel. I have imported the data into Stata and converted the string vatiable(Description) to numeric using the encode command:

    encode Description, gen(Des)

    The encode command generates a numeric variable called Des. However, Des(new variable) is a categorical variable so I went on to create a Dummy variable (with multiple indicators) DumDes to allow me sum or total across columns excluding certain indicators in DumDes(another New variable).

    gen DumDes=Des

    The indicators for DumDes are Exclusion1, Exclusion2, CommonRegime, SpecialRegime1, SpecialRegime2 and they are coded 1, 2, 3, 4, and 5, respectively.

    Now with the above Dummy variable (with multiple indicators) I want to fine the total or sum across Column say for CIF. But when computing the sum of CIF I need not to include both Exclusion1 and Exclusion2. In this case how do I obtain the CIF that excludes the two indicators under DumDes?

    When I type the codes:

    total CIF I get the sum of CIF
    total CIF if DumDes==1 I get the sum of CIF less Exclusion1
    total CIF if DumDes==2 I get the sum of CIF less Exclusion2
    total CIF if DumDes==1&2 It doesn't give me the value I get when I manually add the two Exclusions(1&2) above
    total CIF if DumDes !=1&2 Does not give me the actual CIF value.

    NB: CIF = Total CIF - indicators Exclusion 1 and Exclusion2.

    In short kindly help me with the Stata command for summing across column using the if option.

    Many thanks!
    Last edited by Amo Lewis; 09 Oct 2020, 18:59.

  • #2
    Code:
    DumDes==1&2
    is imaginary Stata syntax not supported by the documentation. The logical expression
    Code:
    1&2
    evaluates to 1.

    What you want is
    Code:
    total CIF if DumDes==1 | DumDes==2 
    total CIF if DumDes!=1 & DumDes!=2
    which demonstrates correct syntax for Stata logical expressions. A more compact version would be
    Code:
    total CIF if inlist(DumDes,1,2)
    total CIF if !inlist(DumDes,1,2)

    Comment


    • #3
      William Lisowski gave an excellent answer, and this is just a footnote.

      if as used here is best thought of as a qualifier to a command. There is in fact an if command which is -- despite the name -- different, and typically used in programs rather than interactively.

      The if qualifier means do something "for the subset of observations I specify" -- and the extreme cases are not exceptions: the subset might be empty (in which case nothing much happens) or it might be the entire dataset. It is always part of a command.


      The if command means "do what I say next if a condition is true". It is often, but not necessarily, paired with else

      Comment


      • #4
        Hello,stata;
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input int year str5 exporter str17 importer str8 HS4 long Import_value
        2008 "CHINA" "AUSTRALIA" " '0101 "  1364
        2008 "CHINA" "AUSTRALIA" " '8906 "     0
        2008 "CHINA" "AUSTRALIA" " '2706 "     0
        2008 "CHINA" "AUSTRALIA" " '8711 "   274
        2008 "CHINA" "AUSTRALIA" " '0702 "     0
        2008 "CHINA" "AUSTRALIA" " '2914 "   254
        2008 "CHINA" "AUSTRALIA" " '8467 "  1728
        2008 "CHINA" "AUSTRALIA" " '7208 "     4
        2008 "CHINA" "AUSTRALIA" " '9112 "     8
        2008 "CHINA" "AUSTRALIA" " '0808 "     0
        2008 "CHINA" "AUSTRALIA" " '8533 "   114
        2008 "CHINA" "AUSTRALIA" " '2923 "     0
        2008 "CHINA" "AUSTRALIA" " '6603 "     3
        2008 "CHINA" "AUSTRALIA" " '9007 "    86
        2008 "CHINA" "AUSTRALIA" " '5303 "     0
        2008 "CHINA" "AUSTRALIA" " '7314 "   673
        2008 "CHINA" "AUSTRALIA" " '5506 "     0
        2008 "CHINA" "AUSTRALIA" " '2621 "     0
        2008 "CHINA" "AUSTRALIA" " '9605 "   888
        2008 "CHINA" "AUSTRALIA" " '9620 "     0
        2008 "CHINA" "AUSTRALIA" " '9113 "   550
        2008 "CHINA" "AUSTRALIA" " '4004 "     0
        2008 "CHINA" "AUSTRALIA" " '5501 "     0
        2008 "CHINA" "AUSTRALIA" " '0505 "     1
        2008 "CHINA" "AUSTRALIA" " '8508 "  1707
        2008 "CHINA" "AUSTRALIA" " '1205 "     0
        2008 "CHINA" "AUSTRALIA" " '5408 "    47
        2008 "CHINA" "AUSTRALIA" " '2819 "     0
        2008 "CHINA" "AUSTRALIA" " '6501 "     0
        2008 "CHINA" "AUSTRALIA" " '8710 "     0
        2008 "CHINA" "AUSTRALIA" " '4107 "    59
        2008 "CHINA" "AUSTRALIA" " '2830 "     0
        2008 "CHINA" "AUSTRALIA" " '3823 "     0
        2008 "CHINA" "AUSTRALIA" " '1804 "     0
        2008 "CHINA" "AUSTRALIA" " '2838 "     0
        2008 "CHINA" "AUSTRALIA" " '8460 "    25
        2008 "CHINA" "AUSTRALIA" " '5112 "    73
        2008 "CHINA" "AUSTRALIA" " '9201 "     0
        2008 "CHINA" "AUSTRALIA" " '2513 "     0
        2008 "CHINA" "AUSTRALIA" " '2910 "    82
        2008 "CHINA" "AUSTRALIA" " '9033 "    24
        2008 "CHINA" "AUSTRALIA" " '2001 "     6
        2008 "CHINA" "AUSTRALIA" " '4406 "     0
        2008 "CHINA" "AUSTRALIA" " '7613 "     0
        2008 "CHINA" "AUSTRALIA" " '8525 " 13472
        2008 "CHINA" "AUSTRALIA" " '7504 "     0
        2008 "CHINA" "AUSTRALIA" " '7008 "     0
        2008 "CHINA" "AUSTRALIA" " '5004 "     0
        2008 "CHINA" "AUSTRALIA" " '4104 "     0
        2008 "CHINA" "AUSTRALIA" " '7508 "    21
        2008 "CHINA" "AUSTRALIA" " '7019 "   160
        2008 "CHINA" "AUSTRALIA" " '6115 "   363
        2008 "CHINA" "AUSTRALIA" " '8214 "   151
        2008 "CHINA" "AUSTRALIA" " '4805 "    72
        2008 "CHINA" "AUSTRALIA" " '9704 "   276
        2008 "CHINA" "AUSTRALIA" " '8704 "     0
        2008 "CHINA" "AUSTRALIA" " '7324 "  2113
        2008 "CHINA" "AUSTRALIA" " '4416 "     0
        2008 "CHINA" "AUSTRALIA" " '8484 "    19
        2008 "CHINA" "AUSTRALIA" " '5113 "     0
        2008 "CHINA" "AUSTRALIA" " '7418 "   304
        2008 "CHINA" "AUSTRALIA" " '5507 "     0
        2008 "CHINA" "AUSTRALIA" " '6210 "  1463
        2008 "CHINA" "AUSTRALIA" " '7407 "   185
        2008 "CHINA" "AUSTRALIA" " '6117 "   524
        2008 "CHINA" "AUSTRALIA" " '9504 "  6268
        2008 "CHINA" "AUSTRALIA" " '6214 "   383
        2008 "CHINA" "AUSTRALIA" " '4101 "     0
        2008 "CHINA" "AUSTRALIA" " '7013 "  1369
        2008 "CHINA" "AUSTRALIA" " '3801 "     0
        2008 "CHINA" "AUSTRALIA" " '0813 "    31
        2008 "CHINA" "AUSTRALIA" " '8112 "     0
        2008 "CHINA" "AUSTRALIA" " '4907 "     6
        2008 "CHINA" "AUSTRALIA" " '7906 "     0
        2008 "CHINA" "AUSTRALIA" " '2608 "     0
        2008 "CHINA" "AUSTRALIA" " '8548 "     8
        2008 "CHINA" "AUSTRALIA" " '8603 "     0
        2008 "CHINA" "AUSTRALIA" " '5106 "     3
        2008 "CHINA" "AUSTRALIA" " '7903 "     0
        2008 "CHINA" "AUSTRALIA" " '7607 "    67
        2008 "CHINA" "AUSTRALIA" " '3911 "   197
        2008 "CHINA" "AUSTRALIA" " '7107 "     0
        2008 "CHINA" "AUSTRALIA" " '3917 "  2527
        2008 "CHINA" "AUSTRALIA" " '1301 "     0
        2008 "CHINA" "AUSTRALIA" " '5005 "     0
        2008 "CHINA" "AUSTRALIA" " '7117 "  5242
        2008 "CHINA" "AUSTRALIA" " '2812 "     0
        2008 "CHINA" "AUSTRALIA" " '6201 "   636
        2008 "CHINA" "AUSTRALIA" " '4203 "  4047
        2008 "CHINA" "AUSTRALIA" " '1207 "     5
        2008 "CHINA" "AUSTRALIA" " '6106 "  2512
        2008 "CHINA" "AUSTRALIA" " '3704 "     0
        2008 "CHINA" "AUSTRALIA" " '5107 "    11
        2008 "CHINA" "AUSTRALIA" " '5006 "     2
        2008 "CHINA" "AUSTRALIA" " '5210 "   106
        2008 "CHINA" "AUSTRALIA" " '3304 "  2607
        2008 "CHINA" "AUSTRALIA" " '7309 "     0
        2008 "CHINA" "AUSTRALIA" " '6004 "   503
        2008 "CHINA" "AUSTRALIA" " '6107 "  2804
        2008 "CHINA" "AUSTRALIA" " '5110 "     0
        end
        the full dataset ranges from 2008-2022 with 13 more importing countries other than Australia. I need to create a new numerical variable "sector" with 10 different sectors coded from 0 to 9 such that sector 0 contains the added value of imports of all those "HS4" values that starts with 0 and in some cases to add up import values of HS4 level that start with 1 and 2.
        please help me with the command.
        I am familiar with the one containing string like
        Code:
        by year iso3_o , sort: egen mnmanganese = total( tradeflow_wto_d ) if (strpos(lower( ProductSector)), "manganese")
        but i don't know how to deal with numerals.
        thank you

        Comment


        • #5
          You don't say if you want this sum to be disaggregated by exporter or importer, or both, and whether by year or not. I'll assume you want it by exporter and year--modify the code accordingly if I have guessed wrong.

          Code:
          gen sector = substr(HS4, 3, 1)
          by year exporter sector, sort: egen wanted = total(Import_value)
          That said, I don't understand when you say "such that sector 0 contains the added value of imports of all those "HS4" values that starts with 0 and in some cases to add up import values of HS4 level that start with 1 and 2." The above code ignores that part. To implement it, you have to say explicitly how to decide, based on the data, which cases where the HS4 level starts with 1 or 2 are to be included with sector 0.

          Comment


          • #6
            thank you Clyde Schechter for your prompt response.
            1. i want to aggregate it by importer exporter year so the code should be
            Code:
             
             gen sector = substr(HS4, 3, 1) by year importer exporter sector, sort: egen wanted = total(Import_value)
            2. let me put it this way. i want to create a variable name "sector_code". this variable shall have code 0 that should aggregate import value of all HS4 level codes that start with 0 (0 at like 0101, 0702 etc. after 0 this variable shall have code 1 that should aggregate import value of all HS4 level codes starting with 1 like 1205, 1804 etc and this shall continue till code 9. something like this
            Code:
             
            year exporter importer HS4 Import_value sector_code sector_level_import
            2008 "CHINA" "AUSTRALIA" '0101 1364 0 1401
            2008 "CHINA" "AUSTRALIA" '0505 1 0 1401
            2008 "CHINA" "AUSTRALIA" '0702 0 0 1401
            2008 "CHINA" "AUSTRALIA" '0808 0 0 1401
            2008 "CHINA" "AUSTRALIA" '0813 31 0 1401
            2008 "CHINA" "AUSTRALIA" '1205 0 1 1401
            2008 "CHINA" "AUSTRALIA" '1207 5 1 1401
            2008 "CHINA" "AUSTRALIA" '1301 0 1 1401
            2008 "CHINA" "AUSTRALIA" '1804 0 1 1401
            2008 "CHINA" "AUSTRALIA" '2001 6 2 342
            2008 "CHINA" "AUSTRALIA" '2513 0 2 342
            2008 "CHINA" "AUSTRALIA" '2608 0 2 342
            2008 "CHINA" "AUSTRALIA" '2621 0 2 342
            2008 "CHINA" "AUSTRALIA" '2706 0 2 342
            2008 "CHINA" "AUSTRALIA" '2812 0 2 342
            2008 "CHINA" "AUSTRALIA" '2819 0 2 342
            2008 "CHINA" "AUSTRALIA" '2830 0 2 342
            2008 "CHINA" "AUSTRALIA" '2838 0 2 342
            2008 "CHINA" "AUSTRALIA" '2910 82 2 342
            2008 "CHINA" "AUSTRALIA" '2914 254 2 342
            2008 "CHINA" "AUSTRALIA" '2923 0 2 342
            2008 "CHINA" "AUSTRALIA" '3304 2607 3 5331
            2008 "CHINA" "AUSTRALIA" '3704 0 3 5331
            2008 "CHINA" "AUSTRALIA" '3801 0 3 5331
            2008 "CHINA" "AUSTRALIA" '3823 0 3 5331
            2008 "CHINA" "AUSTRALIA" '3911 197 3 5331
            2008 "CHINA" "AUSTRALIA" '3917 2527 3 5331
            2008 "CHINA" "AUSTRALIA" '4004 0 4 4184
            2008 "CHINA" "AUSTRALIA" '4101 0 4 4184
            2008 "CHINA" "AUSTRALIA" '4104 0 4 4184
            2008 "CHINA" "AUSTRALIA" '4107 59 4 4184
            2008 "CHINA" "AUSTRALIA" '4203 4047 4 4184
            2008 "CHINA" "AUSTRALIA" '4406 0 4 4184
            2008 "CHINA" "AUSTRALIA" '4416 0 4 4184
            2008 "CHINA" "AUSTRALIA" '4805 72 4 4184
            2008 "CHINA" "AUSTRALIA" '4907 6 4 4184
            2008 "CHINA" "AUSTRALIA" '5004 0 5 242
            2008 "CHINA" "AUSTRALIA" '5005 0 5 242
            2008 "CHINA" "AUSTRALIA" '5006 2 5 242
            2008 "CHINA" "AUSTRALIA" '5106 3 5 242
            2008 "CHINA" "AUSTRALIA" '5107 11 5 242
            2008 "CHINA" "AUSTRALIA" '5110 0 5 242
            2008 "CHINA" "AUSTRALIA" '5112 73 5 242
            2008 "CHINA" "AUSTRALIA" '5113 0 5 242
            2008 "CHINA" "AUSTRALIA" '5210 106 5 242
            2008 "CHINA" "AUSTRALIA" '5303 0 5 242
            2008 "CHINA" "AUSTRALIA" '5408 47 5 242
            2008 "CHINA" "AUSTRALIA" '5501 0 5 242
            2008 "CHINA" "AUSTRALIA" '5506 0 5 242
            2008 "CHINA" "AUSTRALIA" '5507 0 5 242
            2008 "CHINA" "AUSTRALIA" '6004 503 6 9191
            2008 "CHINA" "AUSTRALIA" '6106 2512 6 9191
            2008 "CHINA" "AUSTRALIA" '6107 2804 6 9191
            2008 "CHINA" "AUSTRALIA" '6115 363 6 9191
            2008 "CHINA" "AUSTRALIA" '6117 524 6 9191
            2008 "CHINA" "AUSTRALIA" '6201 636 6 9191
            2008 "CHINA" "AUSTRALIA" '6210 1463 6 9191
            2008 "CHINA" "AUSTRALIA" '6214 383 6 9191
            2008 "CHINA" "AUSTRALIA" '6501 0 6 9191
            2008 "CHINA" "AUSTRALIA" '6603 3 6 9191
            2008 "CHINA" "AUSTRALIA" '7008 0 7 10138
            2008 "CHINA" "AUSTRALIA" '7013 1369 7 10138
            2008 "CHINA" "AUSTRALIA" '7019 160 7 10138
            2008 "CHINA" "AUSTRALIA" '7107 0 7 10138
            2008 "CHINA" "AUSTRALIA" '7117 5242 7 10138
            2008 "CHINA" "AUSTRALIA" '7208 4 7 10138
            2008 "CHINA" "AUSTRALIA" '7309 0 7 10138
            2008 "CHINA" "AUSTRALIA" '7314 673 7 10138
            2008 "CHINA" "AUSTRALIA" '7324 2113 7 10138
            2008 "CHINA" "AUSTRALIA" '7407 185 7 10138
            2008 "CHINA" "AUSTRALIA" '7418 304 7 10138
            2008 "CHINA" "AUSTRALIA" '7504 0 7 10138
            2008 "CHINA" "AUSTRALIA" '7508 21 7 10138
            2008 "CHINA" "AUSTRALIA" '7607 67 7 10138
            2008 "CHINA" "AUSTRALIA" '7613 0 7 10138
            2008 "CHINA" "AUSTRALIA" '7903 0 7 10138
            2008 "CHINA" "AUSTRALIA" '7906 0 7 10138
            2008 "CHINA" "AUSTRALIA" '8112 0 8 17498
            2008 "CHINA" "AUSTRALIA" '8214 151 8 17498
            2008 "CHINA" "AUSTRALIA" '8460 25 8 17498
            2008 "CHINA" "AUSTRALIA" '8467 1728 8 17498
            2008 "CHINA" "AUSTRALIA" '8484 19 8 17498
            2008 "CHINA" "AUSTRALIA" '8508 1707 8 17498
            2008 "CHINA" "AUSTRALIA" '8525 13472 8 17498
            2008 "CHINA" "AUSTRALIA" '8533 114 8 17498
            2008 "CHINA" "AUSTRALIA" '8548 8 8 17498
            2008 "CHINA" "AUSTRALIA" '8603 0 8 17498
            2008 "CHINA" "AUSTRALIA" '8704 0 8 17498
            2008 "CHINA" "AUSTRALIA" '8710 0 8 17498
            2008 "CHINA" "AUSTRALIA" '8711 274 8 17498
            2008 "CHINA" "AUSTRALIA" '8906 0 8 17498
            2008 "CHINA" "AUSTRALIA" '9007 86 9 8100
            2008 "CHINA" "AUSTRALIA" '9033 24 9 8100
            2008 "CHINA" "AUSTRALIA" '9112 8 9 8100
            2008 "CHINA" "AUSTRALIA" '9113 550 9 8100
            2008 "CHINA" "AUSTRALIA" '9201 0 9 8100
            2008 "CHINA" "AUSTRALIA" '9504 6268 9 8100
            2008 "CHINA" "AUSTRALIA" '9605 888 9 8100
            2008 "CHINA" "AUSTRALIA" '9620 0 9 8100
            2008 "CHINA" "AUSTRALIA" '9704 276 9 8100

            Comment


            • #7
              Thank you for clarifying. Based on that understanding, the code you show in #2 is correct.

              Comment


              • #8
                thank you again Clyde Schechter but for the sake of clarity and to understand the #12 code well, may you please tell me what this 3 and 1 mean in (HS4, 3, 1)?

                Comment


                • #9
                  All of the HS4 values in the example data look like " '####", where # stands for any digit. So the first character in HS4 is always a blank space, the second one is always ', and the third one is the first digit. So to pull out the first digit from this kind of string, we need a substring that starts at the 3rd character and is 1 character long.

                  Comment


                  • #10
                    Clyde Schechter job done.....thank you so much.......

                    Comment

                    Working...
                    X