I have a dataset that includes information on consumer-, year-, and product-specific spending. Basically summing over different stores in which a particular consumer has purchased the same product in a given year, I collapse the data as follows:
Code:
collapse (sum) spending, by(consumer product year)
I then define an index/label for each consumer-product combination which is supposed to serve as the cross-sectional index in a panel structure:
Code:
egen consumer_product = group(consumer product)
xtset consumer_product year
When I run this, I get the following error message: repeated time values within panel; r(451)
I do not quite understand how that can possibly be given that I have collapsed the data beforehand. I interpret this error message to mean that for a given consumer-product combination there are multiple spending observations in a given year.
Following my first collapse of the data I have roughly 270,000,000 observations. When I subsequently collapse the data via
Code:
collapse (sum) spending, by(consumer_product year)
I end up with about 130,000,000 observations. But I don't see along which dimension I am summing here to end up with the lower number of observations.
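One way to test the interpretation of the error would be to check directly whether the duplicates come from the original identifiers or from the generated index. The following is only a diagnostic sketch using the variable names above, run after the first collapse and the egen step. It assumes, without confirmation, that consumer_product may have been stored as a float; a float cannot represent every integer above 16,777,216 exactly, so distinct group numbers could be rounded to the same value if there are more groups than that. The final three lines are a hypothetical fix under that assumption, not a confirmed solution.

```stata
* Do the original identifiers uniquely identify observations?
* (isid exits with an error if they do not; capture lets the do-file continue)
capture noisily isid consumer product year

* How many (consumer_product, year) pairs appear more than once?
duplicates report consumer_product year

* Inspect the storage type of the generated index
describe consumer_product

* Hypothetical fix, assuming the index was stored as float:
* recreate it with an explicit long storage type and declare the panel again
drop consumer_product
egen long consumer_product = group(consumer product)
xtset consumer_product year
```

If isid passes on consumer, product, and year while duplicates report still finds repeated consumer_product-year pairs, the repeats would have to come from the index variable itself rather than from the spending data.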