Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Transforming a time string type variable

    Hello Stata people ;

    I'm working with this data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str8 mois float confiancedesmnages
    "janv.-05" 47.4
    "févr.-05" 47.6
    "mars-05"  45.2
    "avr.-05"  47.4
    "mai-05"   48.3
    "juin-05"  46.6
    "juil.-05" 48.1
    "août-05"  48.4
    "sept.-05" 45.5
    "oct.-05"  47.9
    "nov.-05"  48.2
    "déc.-05"  46.5
    "janv.-06" 49.5
    "févr.-06" 49.8
    "mars-06"  47.9
    "avr.-06"    50
    "mai-06"   49.8
    "juin-06"  47.2
    "juil.-06" 48.6
    "août-06"  47.6
    "sept.-06" 46.3
    "oct.-06"  48.2
    "nov.-06"  48.7
    "déc.-06"  45.9
    "janv.-07" 48.1
    "févr.-07" 48.4
    "mars-07"  46.8
    "avr.-07"  47.4
    "mai-07"   47.3
    "juin-07"    45
    "juil.-07" 44.4
    "août-07"    44
    "sept.-07" 44.1
    "oct.-07"  42.8
    "nov.-07"  39.8
    "déc.-07"    38
    "janv.-08" 37.5
    "févr.-08" 36.1
    "mars-08"  36.7
    "avr.-08"  35.2
    "mai-08"   33.9
    "juin-08"  32.6
    "juil.-08" 31.4
    "août-08"  30.1
    "sept.-08" 31.4
    "oct.-08"  29.4
    "nov.-08"  28.4
    "déc.-08"  26.2
    "janv.-09" 26.4
    "févr.-09" 26.7
    "mars-09"  28.9
    "avr.-09"  32.4
    "mai-09"   35.7
    "juin-09"  37.6
    "juil.-09" 39.4
    "août-09"  40.1
    "sept.-09" 40.5
    "oct.-09"  40.5
    "nov.-09"  39.5
    "déc.-09"  37.6
    "janv.-10"   39
    "févr.-10" 39.8
    "mars-10"  40.9
    "avr.-10"    42
    "mai-10"   42.8
    "juin-10"  43.5
    "juil.-10" 43.3
    "août-10"  42.4
    "sept.-10" 41.2
    "oct.-10"  40.9
    "nov.-10"  40.4
    "déc.-10"  40.1
    "janv.-11" 41.1
    "févr.-11" 40.6
    "mars-11"  38.3
    "avr.-11"  33.4
    "mai-11"   34.8
    "juin-11"  36.2
    "juil.-11" 37.7
    "août-11"  37.4
    "sept.-11" 38.5
    "oct.-11"  38.6
    "nov.-11"  37.5
    "déc.-11"  38.1
    "janv.-12" 39.6
    "févr.-12" 39.1
    "mars-12"  40.1
    "avr.-12"  40.1
    "mai-12"   40.7
    "juin-12"  40.8
    "juil.-12" 40.5
    "août-12"  40.6
    "sept.-12" 40.4
    "oct.-12"  39.7
    "nov.-12"    39
    "déc.-12"    39
    "janv.-13" 43.1
    "févr.-13" 44.2
    "mars-13"    45
    "avr.-13"  44.5
    end
    It is about the evolution of the confidence of a population by month, from Januare 2005 to October 2024. My goal is to transform the "mois" variable in a way that I could preserve the month and the year in one variable, but to have it set in a way to apply a tsset command on it and draw a tsling too. I wanna see the name of the month and the year on the final graph (if that's possible too).

    Any help please? With thanks!

  • #2
    wanna is not good English! Otherwise this needs schoolchild French and some fooling around with Stata macros and functions. Here's my attempt. I needed to read in the date variable as str9,by the way.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str9 mois float confiancedesmnages
    "janv.-05" 47.4
    "févr.-05" 47.6
    "mars-05"  45.2
    "avr.-05"  47.4
    "mai-05"   48.3
    "juin-05"  46.6
    "juil.-05" 48.1
    "août-05"  48.4
    "sept.-05" 45.5
    "oct.-05"  47.9
    "nov.-05"  48.2
    "déc.-05"  46.5
    "janv.-06" 49.5
    "févr.-06" 49.8
    "mars-06"  47.9
    "avr.-06"    50
    "mai-06"   49.8
    "juin-06"  47.2
    "juil.-06" 48.6
    "août-06"  47.6
    "sept.-06" 46.3
    "oct.-06"  48.2
    "nov.-06"  48.7
    "déc.-06"  45.9
    "janv.-07" 48.1
    "févr.-07" 48.4
    "mars-07"  46.8
    "avr.-07"  47.4
    "mai-07"   47.3
    "juin-07"    45
    "juil.-07" 44.4
    "août-07"    44
    "sept.-07" 44.1
    "oct.-07"  42.8
    "nov.-07"  39.8
    "déc.-07"    38
    "janv.-08" 37.5
    "févr.-08" 36.1
    "mars-08"  36.7
    "avr.-08"  35.2
    "mai-08"   33.9
    "juin-08"  32.6
    "juil.-08" 31.4
    "août-08"  30.1
    "sept.-08" 31.4
    "oct.-08"  29.4
    "nov.-08"  28.4
    "déc.-08"  26.2
    "janv.-09" 26.4
    "févr.-09" 26.7
    "mars-09"  28.9
    "avr.-09"  32.4
    "mai-09"   35.7
    "juin-09"  37.6
    "juil.-09" 39.4
    "août-09"  40.1
    "sept.-09" 40.5
    "oct.-09"  40.5
    "nov.-09"  39.5
    "déc.-09"  37.6
    "janv.-10"   39
    "févr.-10" 39.8
    "mars-10"  40.9
    "avr.-10"    42
    "mai-10"   42.8
    "juin-10"  43.5
    "juil.-10" 43.3
    "août-10"  42.4
    "sept.-10" 41.2
    "oct.-10"  40.9
    "nov.-10"  40.4
    "déc.-10"  40.1
    "janv.-11" 41.1
    "févr.-11" 40.6
    "mars-11"  38.3
    "avr.-11"  33.4
    "mai-11"   34.8
    "juin-11"  36.2
    "juil.-11" 37.7
    "août-11"  37.4
    "sept.-11" 38.5
    "oct.-11"  38.6
    "nov.-11"  37.5
    "déc.-11"  38.1
    "janv.-12" 39.6
    "févr.-12" 39.1
    "mars-12"  40.1
    "avr.-12"  40.1
    "mai-12"   40.7
    "juin-12"  40.8
    "juil.-12" 40.5
    "août-12"  40.6
    "sept.-12" 40.4
    "oct.-12"  39.7
    "nov.-12"    39
    "déc.-12"    39
    "janv.-13" 43.1
    "févr.-13" 44.2
    "mars-13"    45
    "avr.-13"  44.5
    end
    
    tokenize "janv. févr. mars avr. mai juin juil.  août sept. oct. nov. déc." 
    
    clonevar work = mois 
    
    forval m = 1/12 { 
        replace work = subinstr(work, "``m''", "`m'", .)
    }
    
    replace work = subinstr(work, "-", " ", .)
    
    gen wanted = monthly(word(work, 1) + " 20" + word(work, 2), "MY")
    
    format wanted %tm 
    
    list, sep(12)
    
         +---------------------------------------+
         |     mois   confia~s    work    wanted |
         |---------------------------------------|
      1. | janv.-05       47.4    1 05    2005m1 |
      2. | févr.-05       47.6    2 05    2005m2 |
      3. |  mars-05       45.2    3 05    2005m3 |
      4. |  avr.-05       47.4    4 05    2005m4 |
      5. |   mai-05       48.3    5 05    2005m5 |
      6. |  juin-05       46.6    6 05    2005m6 |
      7. | juil.-05       48.1    7 05    2005m7 |
      8. |  août-05       48.4    8 05    2005m8 |
      9. | sept.-05       45.5    9 05    2005m9 |
     10. |  oct.-05       47.9   10 05   2005m10 |
     11. |  nov.-05       48.2   11 05   2005m11 |
     12. |  déc.-05       46.5   12 05   2005m12 |
         |---------------------------------------|
     13. | janv.-06       49.5    1 06    2006m1 |
     14. | févr.-06       49.8    2 06    2006m2 |
     15. |  mars-06       47.9    3 06    2006m3 |
     16. |  avr.-06         50    4 06    2006m4 |
     17. |   mai-06       49.8    5 06    2006m5 |
     18. |  juin-06       47.2    6 06    2006m6 |
     19. | juil.-06       48.6    7 06    2006m7 |
     20. |  août-06       47.6    8 06    2006m8 |
     21. | sept.-06       46.3    9 06    2006m9 |
     22. |  oct.-06       48.2   10 06   2006m10 |
     23. |  nov.-06       48.7   11 06   2006m11 |
     24. |  déc.-06       45.9   12 06   2006m12 |
         |---------------------------------------|
     25. | janv.-07       48.1    1 07    2007m1 |
     26. | févr.-07       48.4    2 07    2007m2 |
     27. |  mars-07       46.8    3 07    2007m3 |
     28. |  avr.-07       47.4    4 07    2007m4 |
     29. |   mai-07       47.3    5 07    2007m5 |
     30. |  juin-07         45    6 07    2007m6 |
     31. | juil.-07       44.4    7 07    2007m7 |
     32. |  août-07         44    8 07    2007m8 |
     33. | sept.-07       44.1    9 07    2007m9 |
     34. |  oct.-07       42.8   10 07   2007m10 |
     35. |  nov.-07       39.8   11 07   2007m11 |
     36. |  déc.-07         38   12 07   2007m12 |
         |---------------------------------------|
     37. | janv.-08       37.5    1 08    2008m1 |
     38. | févr.-08       36.1    2 08    2008m2 |
     39. |  mars-08       36.7    3 08    2008m3 |
     40. |  avr.-08       35.2    4 08    2008m4 |
     41. |   mai-08       33.9    5 08    2008m5 |
     42. |  juin-08       32.6    6 08    2008m6 |
     43. | juil.-08       31.4    7 08    2008m7 |
     44. |  août-08       30.1    8 08    2008m8 |
     45. | sept.-08       31.4    9 08    2008m9 |
     46. |  oct.-08       29.4   10 08   2008m10 |
     47. |  nov.-08       28.4   11 08   2008m11 |
     48. |  déc.-08       26.2   12 08   2008m12 |
         |---------------------------------------|
     49. | janv.-09       26.4    1 09    2009m1 |
     50. | févr.-09       26.7    2 09    2009m2 |
     51. |  mars-09       28.9    3 09    2009m3 |
     52. |  avr.-09       32.4    4 09    2009m4 |
     53. |   mai-09       35.7    5 09    2009m5 |
     54. |  juin-09       37.6    6 09    2009m6 |
     55. | juil.-09       39.4    7 09    2009m7 |
     56. |  août-09       40.1    8 09    2009m8 |
     57. | sept.-09       40.5    9 09    2009m9 |
     58. |  oct.-09       40.5   10 09   2009m10 |
     59. |  nov.-09       39.5   11 09   2009m11 |
     60. |  déc.-09       37.6   12 09   2009m12 |
         |---------------------------------------|
     61. | janv.-10         39    1 10    2010m1 |
     62. | févr.-10       39.8    2 10    2010m2 |
     63. |  mars-10       40.9    3 10    2010m3 |
     64. |  avr.-10         42    4 10    2010m4 |
     65. |   mai-10       42.8    5 10    2010m5 |
     66. |  juin-10       43.5    6 10    2010m6 |
     67. | juil.-10       43.3    7 10    2010m7 |
     68. |  août-10       42.4    8 10    2010m8 |
     69. | sept.-10       41.2    9 10    2010m9 |
     70. |  oct.-10       40.9   10 10   2010m10 |
     71. |  nov.-10       40.4   11 10   2010m11 |
     72. |  déc.-10       40.1   12 10   2010m12 |
         |---------------------------------------|
     73. | janv.-11       41.1    1 11    2011m1 |
     74. | févr.-11       40.6    2 11    2011m2 |
     75. |  mars-11       38.3    3 11    2011m3 |
     76. |  avr.-11       33.4    4 11    2011m4 |
     77. |   mai-11       34.8    5 11    2011m5 |
     78. |  juin-11       36.2    6 11    2011m6 |
     79. | juil.-11       37.7    7 11    2011m7 |
     80. |  août-11       37.4    8 11    2011m8 |
     81. | sept.-11       38.5    9 11    2011m9 |
     82. |  oct.-11       38.6   10 11   2011m10 |
     83. |  nov.-11       37.5   11 11   2011m11 |
     84. |  déc.-11       38.1   12 11   2011m12 |
         |---------------------------------------|
     85. | janv.-12       39.6    1 12    2012m1 |
     86. | févr.-12       39.1    2 12    2012m2 |
     87. |  mars-12       40.1    3 12    2012m3 |
     88. |  avr.-12       40.1    4 12    2012m4 |
     89. |   mai-12       40.7    5 12    2012m5 |
     90. |  juin-12       40.8    6 12    2012m6 |
     91. | juil.-12       40.5    7 12    2012m7 |
     92. |  août-12       40.6    8 12    2012m8 |
     93. | sept.-12       40.4    9 12    2012m9 |
     94. |  oct.-12       39.7   10 12   2012m10 |
     95. |  nov.-12         39   11 12   2012m11 |
     96. |  déc.-12         39   12 12   2012m12 |
         |---------------------------------------|
     97. | janv.-13       43.1    1 13    2013m1 |
     98. | févr.-13       44.2    2 13    2013m2 |
     99. |  mars-13         45    3 13    2013m3 |
    100. |  avr.-13       44.5    4 13    2013m4 |
         +---------------------------------------+

    Comment


    • #3
      To add a little explanation for those who want it:

      tokenize maps the distinct words of the string presented to local macros 1 up, here 1 to 12.

      So local macro 1 contains the string "janv." (short for janvier, January in English), and so on.

      We replace a clone of the variable with literal 1 to 12 in place of the contents of local macros 1 to 12.

      I always (try to remember to) work with a clone if I am going to change the variable. I often don't get the syntax right first time, and if I mess up my only copy of the variable, I need to read in the data again.

      monthly() is not terribly smart so we need to remove the hyphens too.

      Code:
      help monthly() 
      shows another way to spell out the century too. You would need that, or something else, if you had 20th century dates as well.

      Here is another way to do it without tokenize.


      Code:
      local months "janv. févr. mars avr. mai juin juil.  août sept. oct. nov. déc." 
      
      clonevar work = mois 
      
      forval m = 1/12 { 
          local M : word `m' of `months'
          replace work = subinstr(work, "`M'", "`m'", .)
      }
      
      replace work = subinstr(work, "-", " ", .)
      
      gen wanted = monthly(word(work, 1) + " 20" + word(work, 2), "MY")
      
      format wanted %tm 
      
      list, sep(12)

      Comment

      Working...
      X