  • Issue with panel dataset and xtline

    Good afternoon everyone,

    I am currently working on a research project in labor economics. My dataset is an unbalanced panel of individuals listing all their official contract endings from 2018 to 2021. The observations are daily, and sometimes the same worker has, for instance, 4 contracts concluded in one month because he is a temporary worker. My goal is to draw a graph depicting all the layoffs of workers across these years, and I wanted to do so using xtline. I have managed to set up the dataset as panel data, with the worker id as the panel variable (unbalanced) and monthly observations as the time variable (STATA says there are gaps in the data, even if this is not possible, since there are like 4 observations for each day of each year). When I try to plot it, however, an error appears:
    "macro substitution results in line that is too long
    The line resulting from substituting macros would be longer than allowed. The maximum allowed length is
    645,216 characters, which is calculated on the basis of set maxvar."
    I am at a loss. Am I doing something wrong in managing my data?

    Thanks in advance to anyone who answers.


  • #2
    For such a plot, each line represents an individual. What are you trying to accomplish with such a graph? Even with as few as six overlaid line graphs, you can get into a tangled mess, or the so-called spaghetti problem. See references in the links below:

    https://journals.sagepub.com/doi/abs...36867X19893641
    https://journals.sagepub.com/doi/ful...6867X211025838

    If you want a matrix display, graph subsamples of the data in several smaller matrices rather than plotting everything at once.
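
    A minimal sketch of the subsample approach, assuming the data are already xtset. The variable names worker_id and n_endings are hypothetical and stand in for the panel identifier and the variable being plotted; the batch size of 12 is arbitrary.

    ```stata
    * Sketch only: worker_id and n_endings are hypothetical variable names,
    * and the data are assumed to be xtset already.
    * Number the panels consecutively so they can be selected in batches.
    egen long panel_no = group(worker_id)

    * One small graph per worker, 12 workers per matrix.
    xtline n_endings if panel_no <= 12
    xtline n_endings if inrange(panel_no, 13, 24)
    ```

    Each call draws only the selected panels, so the command that xtline builds internally stays far shorter than it would with every worker included.
    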
    Last edited by Andrew Musau; 19 Mar 2022, 06:50.



    • #3
      STATA says there are gaps in the data, even if this is not possible, since there are like 4 observations for each day of each year
      Stata is correct in what it says; it is your understanding of the message that is lacking.

      There are "gaps in the data" because individuals - your panels - do not have observations on every month between their first observation and their last observation. Or, your dates are expressed as a Stata daily date (e.g. 1jan2020) rather than as a Stata monthly date (e.g. 2020m1) and so there are 30 missing days between January 1 and February 1.
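
      As a rough sketch of the second case, the commands below convert a daily date to a monthly date and reduce the data to one observation per worker and month before declaring the panel. The names worker_id and end_date are hypothetical, and end_date is assumed to be a numeric Stata daily date; adapt them to the actual data.

      ```stata
      * Sketch only: worker_id and end_date are hypothetical names;
      * end_date is assumed to be a numeric Stata daily date (%td).
      generate month = mofd(end_date)
      format month %tm

      * Count contract endings per worker and month, so that each
      * worker-month pair appears only once, as xtset requires.
      collapse (count) n_endings = end_date, by(worker_id month)

      xtset worker_id month

      * Optional: add the months in which a worker has no contract ending,
      * and record a count of zero for them.
      tsfill
      replace n_endings = 0 if missing(n_endings)
      ```

      Note that tsfill only fills months between each worker's first and last observation; it does not make the panel balanced unless the full option is specified.
      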

      The output of the xtset command might shed light on this, as perhaps would an example of your data.
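
      For example, the following commands would show the current panel settings and produce a data example suitable for posting. The variable names are hypothetical placeholders.

      ```stata
      * Sketch only: worker_id and end_date are hypothetical variable names.
      * With no arguments, xtset displays the current panel settings.
      xtset

      * Summarize the pattern of observations across panels.
      xtdescribe

      * Produce a copy-and-paste data example for the first 20 observations.
      dataex worker_id end_date in 1/20
      ```
      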

      Stata's "date and time" variables are complicated and there is a lot to learn. If you have not already read the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF, do so now. If you have, it's time for a refresher. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. An investment of time that will be amply repaid.

      All Stata manuals are included as PDFs in the Stata installation and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.
