
  • Observation count using _n is wrong

    I'm using a dataset with 75 million observations, and `_n' gives me the wrong observation numbers once the count exceeds several million. I'm using Stata 17.0 MP (updated today) on a Windows 10 64-bit system. The observation count is correct for the first 16,777,215 observations, but it is off beyond that. See the screenshot from the data editor below. `V1' is a string variable from the original data set. The other two variables were generated with gen obs_num = _n and with gen count = 1 followed by replace count = count[_n - 1] + 1 if _n > 1. Note that the problem is not related to the particular data set I'm using; it also occurs when I create a dummy data set. Any ideas what the issue is or how to fix it?


    [Attached screenshot: Capture.PNG]

  • #2
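    Not a bug, but rather a precision problem. By default, gen stores new variables as float, and a float can hold consecutive integers exactly only up to 16,777,216 (2^24).
    Try
    gen double count = _n

    HTH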
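
    For anyone who wants to reproduce this without the original data, here is a minimal sketch on a dummy data set. The variable names obs_float and obs_double are made up for illustration, and the float type is written out explicitly so the result does not depend on the set type setting.

    ```stata
    clear
    set obs 16777220                     // a few observations past 2^24 = 16,777,216

    gen float  obs_float  = _n           // same storage type that gen uses by default
    gen double obs_double = _n           // double holds these integers exactly

    format obs_float obs_double %12.0f   // show every digit instead of 1.68e+07

    count if obs_float != obs_double     // number of observations that were rounded
    list obs_float obs_double in -4/l    // inspect the last four observations
    ```

    The mismatches should be the odd-numbered observations past 16,777,216, because floats in that range are spaced 2 apart and an odd value gets rounded to a neighboring even one.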



    • #3
      More generally, let Stata choose the optimal storage type for holding _n:

      Code:
      generate `c(obs_t)' newvar = _n
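
      A small sketch of what that macro does, assuming c(obs_t) tracks the current number of observations; the variable names id_small and id_large are hypothetical.

      ```stata
      clear
      set obs 100
      display "`c(obs_t)'"                 // type suggested for 100 observations
      generate `c(obs_t)' id_small = _n

      set obs 20000000                     // grow the data set past 2^24
      display "`c(obs_t)'"                 // type suggested for 20 million observations
      generate `c(obs_t)' id_large = _n

      describe id_small id_large           // compare the two storage types
      ```

      The point of writing `c(obs_t)' rather than a hard-coded double is that small data sets get a compact integer type while large ones still get a type wide enough to store every observation number exactly.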



      • #4
        Thanks, Fernando and Daniel, that resolves the issue.
