Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to create variable of relative change?

    Dear experts,

    I have a dataset including monthly observations from June 2017 - June 2019, where I would like to calculate the 12-month relative change in price over a year for a specific product (i.e. pit=(pit/pit-12)). Whit the first possible number to get will then be June 2018.

    Variable y_m is year and month such as 1804 corresponds to April 2018.

    Any good advice on how to do this in STATA?

    This is how the data looks like:

    Click image for larger version

Name:	Skärmklipp statalist3.PNG
Views:	1
Size:	25.7 KB
ID:	1514559



    Thank you,
    Sofie

  • #2
    So, first, you need a real Stata date variable, y_m as you have it will not be helpful in Stata.

    Code:
    gen int year = 2000 + real(substr(y_m, 1, 2))
    gen int month = real(substr(y_m, 3, 2))
    gen mdate = ym(year, month)
    format mdate %tm
    Once you have that, it's pretty simple:

    Code:
    xtset product mdate
    gen rel_change = price/L12.price
    Note: You have more than one price variable in your data set: replace "price" in the above by the name of the actual variable whose relative change you want to compute.

    In the future, please do not use screenshots to show example data. You were lucky in this case: yours was readable. Mostly they are not. Even when readable, if the problem is complicated enough to require developing and testing some code, a screenshot cannot be imported into Stata to work with. The helpful way to show example data is with the -dataex- command. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Note also that, as a consequence of their being no usable example data in your post, this code is not tested and may contain typographical errors.

    Comment


    • #3
      Many thanks Clyde, I will make sure to use -dataex- in the future.

      However, I get
      Code:
      . xtset product_ mdate
      repeated time values within panel
      when running
      Code:
       
       xtset product mdate gen rel_change = price/L12.price
      Any advice on this that can be helpfull?


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str9 period str8 product long product_ str14 valueQUANTITY_IN_100KG long quantity_in_100kg str14 valueVALUE_IN_EUROS double value_in_euros str4 y_m float(dummy tariff unit_price unit_p_tariffinc) int(year month) float mdate
      "Jun. 2017" "01011010" 1011010 ":" . ":" . "1706" 0 1 . . 2017  6 689
      "Jul. 2017" "01011010" 1011010 ":" . ":" . "1707" 0 1 . . 2017  7 690
      "Aug. 2017" "01011010" 1011010 ":" . ":" . "1708" 0 1 . . 2017  8 691
      "Sep. 2017" "01011010" 1011010 ":" . ":" . "1709" 0 1 . . 2017  9 692
      "Oct. 2017" "01011010" 1011010 ":" . ":" . "1710" 0 1 . . 2017 10 693
      "Nov. 2017" "01011010" 1011010 ":" . ":" . "1711" 0 1 . . 2017 11 694
      "Dec. 2017" "01011010" 1011010 ":" . ":" . "1712" 0 1 . . 2017 12 695
      "Jan. 2018" "01011010" 1011010 ":" . ":" . "1801" 0 1 . . 2018  1 696
      "Feb. 2018" "01011010" 1011010 ":" . ":" . "1802" 0 1 . . 2018  2 697
      "Mar. 2018" "01011010" 1011010 ":" . ":" . "1803" 0 1 . . 2018  3 698
      "1804"      "01011010" 1011010 ":" . ":" . "1804" 0 1 . . 2018  4 699
      "May. 2018" "01011010" 1011010 ":" . ":" . "1805" 0 1 . . 2018  5 700
      "Jun. 2018" "01011010" 1011010 ":" . ":" . "1806" 0 1 . . 2018  6 701
      "Jul. 2018" "01011010" 1011010 ":" . ":" . "1807" 0 1 . . 2018  7 702
      "Aug. 2018" "01011010" 1011010 ":" . ":" . "1808" 0 1 . . 2018  8 703
      "Sep. 2018" "01011010" 1011010 ":" . ":" . "1809" 0 1 . . 2018  9 704
      "Oct. 2018" "01011010" 1011010 ":" . ":" . "1810" 0 1 . . 2018 10 705
      "Nov. 2018" "01011010" 1011010 ":" . ":" . "1811" 0 1 . . 2018 11 706
      "Dec. 2018" "01011010" 1011010 ":" . ":" . "1812" 0 1 . . 2018 12 707
      "Jan. 2019" "01011010" 1011010 ":" . ":" . "1901" 0 1 . . 2019  1 708
      "Feb. 2019" "01011010" 1011010 ":" . ":" . "1902" 0 1 . . 2019  2 709
      "Mar. 2019" "01011010" 1011010 ":" . ":" . "1903" 0 1 . . 2019  3 710
      "Apr. 2019" "01011010" 1011010 ":" . ":" . "1904" 0 1 . . 2019  4 711
      "May. 2019" "01011010" 1011010 ":" . ":" . "1905" 0 1 . . 2019  5 712
      "Jun. 2019" "01011010" 1011010 ":" . ":" . "1906" 0 1 . . 2019  6 713
      "Jun. 2017" "01011090" 1011090 ":" . ":" . "1706" 0 1 . . 2017  6 689
      "Jul. 2017" "01011090" 1011090 ":" . ":" . "1707" 0 1 . . 2017  7 690
      "Aug. 2017" "01011090" 1011090 ":" . ":" . "1708" 0 1 . . 2017  8 691
      "Sep. 2017" "01011090" 1011090 ":" . ":" . "1709" 0 1 . . 2017  9 692
      "Oct. 2017" "01011090" 1011090 ":" . ":" . "1710" 0 1 . . 2017 10 693
      "Nov. 2017" "01011090" 1011090 ":" . ":" . "1711" 0 1 . . 2017 11 694
      "Dec. 2017" "01011090" 1011090 ":" . ":" . "1712" 0 1 . . 2017 12 695
      "Jan. 2018" "01011090" 1011090 ":" . ":" . "1801" 0 1 . . 2018  1 696
      "Feb. 2018" "01011090" 1011090 ":" . ":" . "1802" 0 1 . . 2018  2 697
      "Mar. 2018" "01011090" 1011090 ":" . ":" . "1803" 0 1 . . 2018  3 698
      "1804"      "01011090" 1011090 ":" . ":" . "1804" 0 1 . . 2018  4 699
      "May. 2018" "01011090" 1011090 ":" . ":" . "1805" 0 1 . . 2018  5 700
      "Jun. 2018" "01011090" 1011090 ":" . ":" . "1806" 0 1 . . 2018  6 701
      "Jul. 2018" "01011090" 1011090 ":" . ":" . "1807" 0 1 . . 2018  7 702
      "Aug. 2018" "01011090" 1011090 ":" . ":" . "1808" 0 1 . . 2018  8 703
      "Sep. 2018" "01011090" 1011090 ":" . ":" . "1809" 0 1 . . 2018  9 704
      "Oct. 2018" "01011090" 1011090 ":" . ":" . "1810" 0 1 . . 2018 10 705
      "Nov. 2018" "01011090" 1011090 ":" . ":" . "1811" 0 1 . . 2018 11 706
      "Dec. 2018" "01011090" 1011090 ":" . ":" . "1812" 0 1 . . 2018 12 707
      "Jan. 2019" "01011090" 1011090 ":" . ":" . "1901" 0 1 . . 2019  1 708
      "Feb. 2019" "01011090" 1011090 ":" . ":" . "1902" 0 1 . . 2019  2 709
      "Mar. 2019" "01011090" 1011090 ":" . ":" . "1903" 0 1 . . 2019  3 710
      "Apr. 2019" "01011090" 1011090 ":" . ":" . "1904" 0 1 . . 2019  4 711
      "May. 2019" "01011090" 1011090 ":" . ":" . "1905" 0 1 . . 2019  5 712
      "Jun. 2019" "01011090" 1011090 ":" . ":" . "1906" 0 1 . . 2019  6 713
      "Jun. 2017" "01011100" 1011100 ":" . ":" . "1706" 0 1 . . 2017  6 689
      "Jul. 2017" "01011100" 1011100 ":" . ":" . "1707" 0 1 . . 2017  7 690
      "Aug. 2017" "01011100" 1011100 ":" . ":" . "1708" 0 1 . . 2017  8 691
      "Sep. 2017" "01011100" 1011100 ":" . ":" . "1709" 0 1 . . 2017  9 692
      "Oct. 2017" "01011100" 1011100 ":" . ":" . "1710" 0 1 . . 2017 10 693
      "Nov. 2017" "01011100" 1011100 ":" . ":" . "1711" 0 1 . . 2017 11 694
      "Dec. 2017" "01011100" 1011100 ":" . ":" . "1712" 0 1 . . 2017 12 695
      "Jan. 2018" "01011100" 1011100 ":" . ":" . "1801" 0 1 . . 2018  1 696
      "Feb. 2018" "01011100" 1011100 ":" . ":" . "1802" 0 1 . . 2018  2 697
      "Mar. 2018" "01011100" 1011100 ":" . ":" . "1803" 0 1 . . 2018  3 698
      "1804"      "01011100" 1011100 ":" . ":" . "1804" 0 1 . . 2018  4 699
      "May. 2018" "01011100" 1011100 ":" . ":" . "1805" 0 1 . . 2018  5 700
      "Jun. 2018" "01011100" 1011100 ":" . ":" . "1806" 0 1 . . 2018  6 701
      "Jul. 2018" "01011100" 1011100 ":" . ":" . "1807" 0 1 . . 2018  7 702
      "Aug. 2018" "01011100" 1011100 ":" . ":" . "1808" 0 1 . . 2018  8 703
      "Sep. 2018" "01011100" 1011100 ":" . ":" . "1809" 0 1 . . 2018  9 704
      "Oct. 2018" "01011100" 1011100 ":" . ":" . "1810" 0 1 . . 2018 10 705
      "Nov. 2018" "01011100" 1011100 ":" . ":" . "1811" 0 1 . . 2018 11 706
      "Dec. 2018" "01011100" 1011100 ":" . ":" . "1812" 0 1 . . 2018 12 707
      "Jan. 2019" "01011100" 1011100 ":" . ":" . "1901" 0 1 . . 2019  1 708
      "Feb. 2019" "01011100" 1011100 ":" . ":" . "1902" 0 1 . . 2019  2 709
      "Mar. 2019" "01011100" 1011100 ":" . ":" . "1903" 0 1 . . 2019  3 710
      "Apr. 2019" "01011100" 1011100 ":" . ":" . "1904" 0 1 . . 2019  4 711
      "May. 2019" "01011100" 1011100 ":" . ":" . "1905" 0 1 . . 2019  5 712
      "Jun. 2019" "01011100" 1011100 ":" . ":" . "1906" 0 1 . . 2019  6 713
      "Jun. 2017" "01011910" 1011910 ":" . ":" . "1706" 0 1 . . 2017  6 689
      "Jul. 2017" "01011910" 1011910 ":" . ":" . "1707" 0 1 . . 2017  7 690
      "Aug. 2017" "01011910" 1011910 ":" . ":" . "1708" 0 1 . . 2017  8 691
      "Sep. 2017" "01011910" 1011910 ":" . ":" . "1709" 0 1 . . 2017  9 692
      "Oct. 2017" "01011910" 1011910 ":" . ":" . "1710" 0 1 . . 2017 10 693
      "Nov. 2017" "01011910" 1011910 ":" . ":" . "1711" 0 1 . . 2017 11 694
      "Dec. 2017" "01011910" 1011910 ":" . ":" . "1712" 0 1 . . 2017 12 695
      "Jan. 2018" "01011910" 1011910 ":" . ":" . "1801" 0 1 . . 2018  1 696
      "Feb. 2018" "01011910" 1011910 ":" . ":" . "1802" 0 1 . . 2018  2 697
      "Mar. 2018" "01011910" 1011910 ":" . ":" . "1803" 0 1 . . 2018  3 698
      "1804"      "01011910" 1011910 ":" . ":" . "1804" 0 1 . . 2018  4 699
      "May. 2018" "01011910" 1011910 ":" . ":" . "1805" 0 1 . . 2018  5 700
      "Jun. 2018" "01011910" 1011910 ":" . ":" . "1806" 0 1 . . 2018  6 701
      "Jul. 2018" "01011910" 1011910 ":" . ":" . "1807" 0 1 . . 2018  7 702
      "Aug. 2018" "01011910" 1011910 ":" . ":" . "1808" 0 1 . . 2018  8 703
      "Sep. 2018" "01011910" 1011910 ":" . ":" . "1809" 0 1 . . 2018  9 704
      "Oct. 2018" "01011910" 1011910 ":" . ":" . "1810" 0 1 . . 2018 10 705
      "Nov. 2018" "01011910" 1011910 ":" . ":" . "1811" 0 1 . . 2018 11 706
      "Dec. 2018" "01011910" 1011910 ":" . ":" . "1812" 0 1 . . 2018 12 707
      "Jan. 2019" "01011910" 1011910 ":" . ":" . "1901" 0 1 . . 2019  1 708
      "Feb. 2019" "01011910" 1011910 ":" . ":" . "1902" 0 1 . . 2019  2 709
      "Mar. 2019" "01011910" 1011910 ":" . ":" . "1903" 0 1 . . 2019  3 710
      "Apr. 2019" "01011910" 1011910 ":" . ":" . "1904" 0 1 . . 2019  4 711
      "May. 2019" "01011910" 1011910 ":" . ":" . "1905" 0 1 . . 2019  5 712
      "Jun. 2019" "01011910" 1011910 ":" . ":" . "1906" 0 1 . . 2019  6 713
      end
      format %tm mdate

      Comment


      • #4
        Well, thank you for using -dataex-. Unfortunately, the example data you show does not present the problem you want addressed: running -xtset product_ mdate- on this data produces no error message. However, this is a commonly encountered situation and it is simple enough to describe what you need to do about it.

        First, what it means. It means your data set is not suitable for your purposes. Either your purposes are misguided, or more likely, the data set contains errors that need to be fixed. The error message is more or less self explanatory: there are situations where two or more observations have both the same product and the same mdate. Data like that cannot be used to do calculations involving lagged observations: if there are two observations (or more) for a given product and mdate, when trying to calculate relative price change for the observation one year later, there are going to be two (or more) possible values of pit-12, and there is no basis to decide which one if any is the correct one to use.

        So, there are two possibilities here. Either these multiple observations with same date and product are data errors or they are not errors. If they are not data errors, then it means that pit-12 is undefinable and your goal is unachievable and must be rethought.

        More commonly, however, these observations represent data errors. So you must find them and fix them. Finding them is simple:
        Code:
        duplicates tag product_ mdate, gen(flag)
        order product_ mdate, first
        browse if flag
        Fixing them might be easy or hard depending on what you see. It may be that these duplicated observations are fully duplicates in every single variable. In that case, somehow in data management, you got extra copies of some variables creeping in (or maybe the original data you started from contained duplicates). In this case, -duplicates drop- will eliminate them with no loss of information. If, however, these observations differ in the values of some other variables then you must figure out which among the duplicates is correct. Or perhaps none of them is correct and you need to take a mean, or a max, or whatever. The particular fix will depend on your specific context and you will have to figure it out.

        One more thing, even if the fix turns out to be as simple as -duplicates drop-, you should not rest easy about your data. The inclusion of these duplicates in your data set indicates that the data management that produced the data set is flawed. Either the source data you were provided with contains errors, or they were introduced by programming errors during data management. Programming errors are like cockroaches: if you see one it is likely there are others, perhaps many others. And if the original source data contained these errors, you should consult with whoever provided it to alert them to this problem and ask them to review the process that created it looking for both the source of this error and possible other errors.

        So before plunging ahead, it would be prudent to investigate the entire data set creation process to see if there are other problems in your data set. It is, at best, a waste of time to analyze wrong data. At worst, you will get misleading results that, acted upon, do harm. So before proceeding you should satisfy yourself that you have made every possible effort to discover and fix all the problems in your data.

        Finally, I note that the code -gen rel_price = price/L12.price- will fail because the data set contains no variable named price, nor even any whose name begins with price. So you will need to change that line of code to use the name of the actual variable you are interested in here. Perhaps it's unit_price, or perhaps unit_p_tarriffinc, or maybe one of the value* variables?

        Comment

        Working...
        X