  • #16
    Good point; and surely, adding the qualifier

    Code:
     
     if strpos(upper(test), "X")
    will select only positive finds.
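
As a minimal sketch of that qualifier (the variable name test comes from the post, the four values are made up for illustration): strpos() returns the position of the first match and 0 when there is none, so the condition is true only where the letter is found.

```stata
clear
input str10 test
"abc"
"xyz"
"aXb"
"def"
end

* upper() makes the search case-insensitive; strpos() is 0 (false) when "X" is absent
list if strpos(upper(test), "X")
count if strpos(upper(test), "X")
```

Writing the condition as strpos(upper(test), "X") > 0 is equivalent and may read more clearly.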


    • #17
      Dear Stata members,

      Sorry for revisiting this again, but I have two questions:

      QUESTION ONE:

      The variable NAME in my dataset has the form LAST NAME, FIRST NAME, sometimes followed by a middle name with no comma before it.

      How can I extract only the last and first names so that they are consistent across my data, as in this example?

      NAME                     WANTED
      QAZI, MOHAMED            QAZI, MOHAMED
      QAZI, MOHAMED            QAZI, MOHAMED
      QAZI, MOHAMED ASHRAF     QAZI, MOHAMED
      RADOW, NORMAN            RADOW, NORMAN
      RADOW, NORMAN            RADOW, NORMAN
      RADOW, NORMAN            RADOW, NORMAN
      RADOW, NORMAN J          RADOW, NORMAN
      RAESE, JOHN              RAESE, JOHN
      RAESE, JOHN R MR         RAESE, JOHN

      QUESTION TWO:

      I want to merge this with a second dataset, which holds the last name and the first name in separate variables, each with the first letter in uppercase and the rest in lowercase.

      How can I combine the last name and the first name into an uppercase full name, with and without a comma in between, as below?

      LAST NAME   FIRST NAME   WANTED WITH COMMA   WANTED WITHOUT COMMA
      Qazi        Mohmed       QAZI, MOHAMED       QAZI MOHAMED
      Radow       Norman       RADOW, NORMAN       RADOW NORMAN


      • #18
        #1

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str20 name
        "QAZI, MOHAMED"       
        "QAZI, MOHAMED"       
        "QAZI, MOHAMED ASHRAF"
        "RADOW, NORMAN"       
        "RADOW, NORMAN"       
        "RADOW, NORMAN"       
        "RADOW, NORMAN J"     
        "RAESE, JOHN"         
        "RAESE, JOHN R MR"    
        end
        
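        * Keep "LAST, FIRST" and drop everything after the first name;
        * names with no third word do not match and are left unchanged
        gen wanted= ustrregexra(upper(name), "(\w+\,)\s+(\w+)\s+(.*)", "$1 $2")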
        Res.:

        Code:
        . l, sep(0)
        
             +--------------------------------------+
             |                 name          wanted |
             |--------------------------------------|
          1. |        QAZI, MOHAMED   QAZI, MOHAMED |
          2. |        QAZI, MOHAMED   QAZI, MOHAMED |
          3. | QAZI, MOHAMED ASHRAF   QAZI, MOHAMED |
          4. |        RADOW, NORMAN   RADOW, NORMAN |
          5. |        RADOW, NORMAN   RADOW, NORMAN |
          6. |        RADOW, NORMAN   RADOW, NORMAN |
          7. |      RADOW, NORMAN J   RADOW, NORMAN |
          8. |          RAESE, JOHN     RAESE, JOHN |
          9. |     RAESE, JOHN R MR     RAESE, JOHN |
             +--------------------------------------+
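
A possible alternative without regular expressions, sketched under the assumption that the last name (with its comma) and the first name are each a single blank-delimited word. The variable name wanted_alt is made up for illustration.

```stata
clear
input str20 name
"QAZI, MOHAMED"
"QAZI, MOHAMED ASHRAF"
"RAESE, JOHN R MR"
end

* word(s, n) returns the n-th blank-delimited word of s,
* so word 1 is "LAST," and word 2 is the first name
gen wanted_alt = upper(word(name, 1) + " " + word(name, 2))
list, sep(0)
```

Neither this sketch nor the regular expression handles last names that contain spaces, so it is worth checking the data for such cases first.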
        #2

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str5 lastname str6 firstname
        "Qazi"  "Mohmed"
        "Radow" "Norman"
        end
        
        g wanted1= upper(lastname)+ ", "+ upper(firstname)
        g wanted2 = upper(lastname)+ " "+ upper(firstname)
        Res.:

        Code:
        . l
        
             +----------------------------------------------------+
             | lastname   firstn~e         wanted1        wanted2 |
             |----------------------------------------------------|
          1. |     Qazi     Mohmed    QAZI, MOHMED    QAZI MOHMED |
          2. |    Radow     Norman   RADOW, NORMAN   RADOW NORMAN |
             +----------------------------------------------------+


        • #19
          Many thanks, Professor Andrew, it works very well. One more question, please.

          This is my current dataset:

          Name            year   Amount   groups
          MOHAMED QAZI    2013      500   A
          MOHAMED QAZI    2013      500   B
          MOHAMED QAZI    2013      100   A
          MOHAMED QAZI    2014     1000   A
          MOHAMED QAZI    2014     1000   A
          MOHAMED QAZI    2014     1000   A
          ANDREW MOT      2013     5000   B

          I need to sum the amount for each person in each year for each group, in the following format:

          Name            year      A      B
          MOHAMED QAZI    2013    600    500
          MOHAMED QAZI    2014   3000      0
          ANDREW MOT      2013      0   5000
          I tried the collapse command, but it did not work because I lost some other variables.

          I also tried this code:

          Code:
          bysort Name year: egen A = sum(Amount) if groups=="A"

          but it ended up with lots of duplicates.

          Do you have any idea how I can solve this problem?


          • #20
            This is a different question not directly related to this thread. For any such follow-ups, please start a new thread and title it appropriately to reflect the problem. Also, use dataex in future to present data examples, as it matters in the code below whether groups is a string variable or a numeric variable with value labels. See FAQ Advice #12 for details.

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input str12 name int(year amount) str1 groups
            "MOHAMED QAZI" 2013  500 "A"
            "MOHAMED QAZI" 2013  500 "B"
            "MOHAMED QAZI" 2013  100 "A"
            "MOHAMED QAZI" 2014 1000 "A"
            "MOHAMED QAZI" 2014 1000 "A"
            "MOHAMED QAZI" 2014 1000 "A"
            "ANDREW MOT"   2013 5000 "B"
            end
            
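            * Number the observations within name-year so that each row is uniquely identified
            bys name year: g which=_n
            * One amount column per group (amountA, amountB); the other column is missing in each row
            reshape wide amount, i(name year which) j(groups) string
            rename amount* *
            * Running sums within name-year; sum() treats missing values as 0
            bys name year (which): replace A= sum(A)
            bys name year (which): replace B= sum(B)
            * The last row of each name-year holds the full totals
            by name year: keep if _n==_N
            drop which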
            Res.:

            Code:
            . l
            
                 +-----------------------------------+
                 |         name   year      A      B |
                 |-----------------------------------|
              1. |   ANDREW MOT   2013      0   5000 |
              2. | MOHAMED QAZI   2013    600    500 |
              3. | MOHAMED QAZI   2014   3000      0 |
                 +-----------------------------------+
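
A possible alternative that avoids reshape, given as a sketch on the same example data and assuming groups is a string variable. The attempt in #19 yields repeated rows because egen does not reduce the number of observations: it writes the group total on every row that satisfies the if qualifier and leaves the rest missing. Here each row contributes its amount only when it belongs to the group in question, and one observation per name and year is kept afterwards.

```stata
clear
input str12 name int(year amount) str1 groups
"MOHAMED QAZI" 2013  500 "A"
"MOHAMED QAZI" 2013  500 "B"
"MOHAMED QAZI" 2013  100 "A"
"MOHAMED QAZI" 2014 1000 "A"
"MOHAMED QAZI" 2014 1000 "A"
"MOHAMED QAZI" 2014 1000 "A"
"ANDREW MOT"   2013 5000 "B"
end

* (groups == "A") is 1 for group A rows and 0 otherwise,
* so the total counts only the amounts of that group
egen A = total(amount * (groups == "A")), by(name year)
egen B = total(amount * (groups == "B")), by(name year)

* Keep one observation per name-year; other variables in the dataset are retained
bysort name year: keep if _n == 1
drop amount groups
list, sep(0)
```

Unlike collapse, this keeps any other variables, although only those that are constant within name and year remain meaningful after the first observation is kept.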
