  • Is there a way to format variable?

    Hello friends,

    I am working with data in which one variable stores a date as a number. For example, a variable named effective_date holds the value 20120105, meaning January 5, 2012. For my analysis I only need the year, because I want to create arrival cohorts based on year, not on the exact date.

    Right now I recode it with rules like (20120000/20129999 = 2012). It works, but I am curious: is there a way to format the variable so that it shows only the first four digits, instead of recoding it?

    Please advise

    Thanks

  • #2
    Formatting the date variable to present only the first four digits will not solve your analytic problem, because a display format changes only how values are shown, not the values themselves. Stata will still treat 20120106 as different from 20120107.

    The easy way to get the year from your numeric variable is to use mathematics.
    Code:
    generate year = floor(date/10000)
    Code:
    . * Example generated by -dataex-. For more info, type help dataex
    . clear
    
    . input long date
    
                 date
      1. 20120105
      2. 20211231
      3. end
    
    . generate year = floor(date/10000)
    
    . list, clean
    
               date   year  
      1.   20120105   2012  
      2.   20211231   2021  
    
    .
    For documentation
    Code:
    help floor()
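
    The same arithmetic extends to the other two components, in case the month or day is ever needed for the cohorts. What follows is only a sketch, assuming the variable is named date as in the example above and always holds eight digits in YYYYMMDD order. The names month, day, and edate are made up for illustration.
    Code:
    * drop the last two digits, then keep the last two of what remains
    generate month = mod(floor(date/100), 100)
    * the last two digits are the day
    generate day = mod(date, 100)
    * combine into a Stata daily date; mdy() returns missing for impossible dates
    generate long edate = mdy(month, day, floor(date/10000))
    format edate %td
    For 20120105 this gives month 1 and day 5. See help mod() and help mdy() for documentation.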


    • #3
      Originally posted by William Lisowski (#2)
      Thank you so much for your help William!


      • #4
        William Lisowski gave a fine solution. Note that date functions will work here too, with a twist: the number has to be converted to a string first so that daily() can parse it.

        Code:
        . clear
        
        . set obs 1
        Number of observations (_N) was 0, now 1.
        
        . gen long problem = 20120105
        
        . l
        
             +----------+
             |  problem |
             |----------|
          1. | 20120105 |
             +----------+
        
        . gen year = year(daily(strofreal(problem, "%8.0f"), "YMD"))
        
        . l
        
             +-----------------+
             |  problem   year |
             |-----------------|
          1. | 20120105   2012 |
             +-----------------+
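
        The one-liner above nests three function calls. Unpacked one step at a time, it might look like the sketch below, assuming the same variable problem. The names datestr, edate, and year2 are hypothetical, chosen only so that nothing clashes with year.
        Code:
        * step 1: number to string, so 20120105 becomes "20120105"
        generate str8 datestr = strofreal(problem, "%8.0f")
        * step 2: parse the string as year-month-day into a Stata daily date
        generate long edate = daily(datestr, "YMD")
        format edate %td
        * step 3: extract the year from the daily date
        generate year2 = year(edate)
        * daily() returns missing for values that are not valid dates
        count if missing(edate) & !missing(problem)
        One possible reason to prefer this route over pure arithmetic is the last line: any value that is not a real calendar date ends up missing in edate, so the count acts as a quick check on the data.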
