Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reordering string variables

    Hi all,

    I have a data set that looks as follows:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str32(productstype productstype2 productstype3 productstype4) str21 productstype5
    "DIFM Enhanced"       "Deploy"              "Google Ads"                       "Marketer"   "Originator Everywhere"
    "Advanced local PLUS" "Advanced Local PLUS" "ConversionPro Blog"               "Local PLUS" "Presence"             
    "DIFM Enhanced"       "DIFM Starter"        "Marketer"                         "SuperCalc"  "Website"              
    "Advanced Local PLUS" "ConversionPro Blog"  "Marketer"                         "SuperCalc"  "Website"              
    "DIFM Starter"        "Marketer Pro"        "Listings + Reputation Management" "Presence"   "Website"              
    end
    My goal is to re-order the observations in a specific way, such that productstype1 is "Deploy PLUS" if it exists, otherwise it is "Deploy" if it exists, otherwise it is "Advanced Local PLUS" if it exists...
    And then productypes2 follows the same logic. I am having a hard time wrapping my brain around how to do this. Any help would be greatly appreciated.

    Thanks,

    Nick

  • #2
    productstype1 is "Deploy PLUS" if it exists, otherwise it is "Deploy" if it exists, otherwise it is "Advanced Local PLUS" if it exists...
    Sorry, but if there is some regularity or pattern that ... is intended to suggest, it has gone right over my head. Even if I assume that after "Advanced Local PLUS" comes "Advanced Local", what is supposed to come after that? Better still, what is the rule for deciding what comes next after any of them? If you can spell that out, coding it in Stata won't be particularly difficult.

    Comment


    • #3
      Sorry, Clyde. here is the order I need to put them in:

      Deploy Plus
      Deploy
      Advanced Local PLUS
      Local PLUS
      Presence
      Website
      DIFM Enhanced
      DIFM Starter
      DIFM Ultimate
      Google Ads
      Originator Everywhere
      ConversionPro Blog
      Excellence Website Theme
      Listings + Reputation Management
      Reputation Management (Vendasta)
      Marketer
      Marketer Pro
      SuperCalc

      So if "Deploy Plus" exists in any column, it should be under productstype. Essentially, for a given observation, the highest product type on the above list should be under productstype, the second highest on the list should be productstype2, etc. I apologize for not being clear.

      Comment


      • #4
        To further clarify, the desired output for the first row above should be:

        Deploy; DIFM Enhanced; Originator Everywhere; Google Ads; Marketer

        Comment


        • #5
          This should start you in a useful direction to what you requested, assuming you're using Stata 16 or later.
          Code:
          rename productstype productstype1
          generate obsid = _n, before(productstype1)
          reshape long productstype, i(obsid) j(seq)
          
          frame create reference
          frame change reference
          
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str32 productstype
          "Deploy Plus"                    
          "Deploy"                          
          "Advanced Local PLUS"            
          "Local PLUS"                      
          "Presence"                        
          "Website"                        
          "DIFM Enhanced"                  
          "DIFM Starter"                    
          "DIFM Ultimate"                  
          "Google Ads"                      
          "Originator Everywhere"          
          "ConversionPro Blog"              
          "Excellence Website Theme"        
          "Listings + Reputation Management"
          "Reputation Management (Vendasta)"
          "Marketer"                        
          "Marketer Pro"                    
          "SuperCalc"                      
          end
          generate order = _n
          
          frame change default
          
          frlink m:1 productstype, frame(reference) generate(reflink)
          frget order, from(reflink)
          drop reflink
          list if missing(order), clean
          by obsid (order), sort: replace seq = _n
          drop order
          reshape wide productstype, i(obsid) j(seq)
          list, clean abbreviate(20)
          Code:
          . list if missing(order), clean
          
                 obsid   seq          productstype   order  
            6.       2     1   Advanced local PLUS       .
          Code:
          . list, clean abbreviate(20)
          
                 obsid         productstype1   productstype2        productstype3                      productstype4         productstype5  
            1.       1                Deploy   DIFM Enhanced           Google Ads              Originator Everywhere              Marketer  
            2.       2   Advanced Local PLUS      Local PLUS             Presence                 ConversionPro Blog   Advanced local PLUS  
            3.       3               Website   DIFM Enhanced         DIFM Starter                           Marketer             SuperCalc  
            4.       4   Advanced Local PLUS         Website   ConversionPro Blog                           Marketer             SuperCalc  
            5.       5              Presence         Website         DIFM Starter   Listings + Reputation Management          Marketer Pro
          Note that observation 2 demonstrates what happens if there is an value of productstype not included in your reference list. Spelling, spacing, punctuation, and case must match exactly.
          Last edited by William Lisowski; 03 Mar 2022, 17:32.

          Comment

          Working...
          X