  • Counting distinct values for a variable by other variable

    Hi,
    I have a problem counting distinct values. Is it possible to count the number of unique values of one variable within groups defined by another variable? Specifically, I want to count the number of distinct companies within each geographical area. I have a categorical variable containing the geographical area and a variable identifying the company, and I want to know how many different companies there are in each area.
    Thanks in advance.
    David

  • #2
    The question of counting distinct (not unique) values was reviewed in nauseating detail in

    SJ-8-4 dm0042 . . . . . . . . . . . . Speaking Stata: Distinct observations
    (help distinct if installed) . . . . . . N. J. Cox and G. M. Longton
    Q4/08 SJ 8(4):557--568
    shows how to answer questions about distinct observations
    from first principles; provides a convenience command

    .pdf at http://www.stata-journal.com/sjpdf.h...iclenum=dm0042

    Many ways to do it: here is one.

    Code:
    egen tag = tag(company area) 
    egen count = total(tag), by(area)
    Here is another:

    Code:
    bysort area company: gen count = _n == 1 
    by area: replace count = sum(count) 
    by area: replace count = count[_N]
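
    To check the logic on a small case, here is a minimal sketch of the first approach on made-up data. The area and company values and the variable name n_companies are invented for illustration. Area A holds two distinct companies (x and y) and area B holds one (z), so n_companies should be 2 for every observation in A and 1 for every observation in B.

    Code:
    clear
    input str1 area str1 company
    "A" "x"
    "A" "x"
    "A" "y"
    "B" "z"
    "B" "z"
    end
    * tag exactly one observation for each distinct company-area pair
    egen tag = tag(company area)
    * sum the tags within each area to get the number of distinct companies
    egen n_companies = total(tag), by(area)
    list, sepby(area)
    Either approach stores the count on every observation of the area, which is convenient if you later want to select or summarize areas by how many companies they contain.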
