  • Cell Extraction and reading data


    Hello,

    I have an extraction issue in Stata.
    I want to extract a specific cell from Stata data based on the value of another variable. Conceptually, I need Stata to read down a column of data and keep reading for as long as the values are decreasing. At the point where the value in the column increases, I need Stata to select the value in a cell of another variable.

    To clarify, here is what I need Stata to do in three steps. Step 1: read the data in variable "sco". Step 2: when Stata reaches data point 153 in sco, select the number 3 in the group variable. Step 3: put that number 3 from the group variable into a local to be used elsewhere. I hope it is obvious, but the syntax needs to work for any values, as this is just an example. Does anyone have any idea how to do this?

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(sco female group)
    147 1 1
    108 0 1
     18 0 3
    153 0 1
     50 0 2
     51 1 2
    102 0 1
     57 1 1
    160 1 1
    136 0 1
    end

    Thank you in advance.

  • #2
    A caveat here is that what you are describing doesn't fit very well with the way Stata "thinks" ("cells", implicit order of observations, and moving values of variables to locals). This inclines me to think that the goal for which you want to do what you describe might be better accomplished by some other means. You'd probably do well to describe the context/goal for why you want to do this and maybe we can offer you a nicer approach to the whole situation.

    Anyway, to clarify for myself, I'd describe the thing you want as: Given a set of observations sorted according to some pre-determined order, find the first observation with a value greater than the previous observation, with "first" and "previous" defined by the given sort order.
    Beyond that, I'll assume that your rule for the value of interest is to take the value of "group" from the observation immediately before the observation fulfilling the stated condition. (Note that the value 3 of "group" also happens to sit in observation number 3 in your example, something I'll assume to be a coincidence.)

    All that being said, here's some code that I think does what you want. I have *not* carefully considered the situation in which values of interest are in observation 1 or observation _N, so there could be a problem there.

    Code:
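    * Running count of rises: increments whenever sco exceeds the previous observation
    gen byte gt = sum(sco > sco[_n-1])
    * Flag the first rise (the stray closing parenthesis from the original post is removed)
    gen byte first = gt & gt[_n-1] == 0
    * Take group from the observation just before the flagged one
    summ group if first[_n+1] == 1
    local TheValue = r(mean)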
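
    The edge cases mentioned above (no increase anywhere in the data, or values at the ends) can be guarded against explicitly. What follows is only a sketch of one possible variant, not a replacement for the code above. It assumes the example data from the first post are in memory in their original order and that sco has no missing values of interest; obsno and rise_obs are names made up for this illustration.

    Code:
    * Sketch under the assumptions stated above; obsno and rise_obs are illustrative names
    gen long obsno = _n
    * Smallest observation number at which sco exceeds the previous observation
    summarize obsno if _n > 1 & !missing(sco) & sco > sco[_n-1], meanonly
    if r(N) > 0 {
        local rise_obs = r(min)
        * Value of group in the observation just before the first rise
        local TheValue = group[`rise_obs' - 1]
        display "first rise at observation `rise_obs'; group just before it = `TheValue'"
    }
    else {
        display as text "sco never increases, so no value was stored"
    }
    drop obsno

    With the example data the first rise is at observation 4 (153 after 18), so TheValue should hold 3. Checking r(N) before using the result avoids silently storing a missing value when the condition is never met.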



    • #3
      Hello Mike,
      Yes, this is exactly what I am seeking. For others who may need this code: the second line of Mike's code as originally posted has an extra ")" at the end, which causes a syntax error. Removing that ")" lets the code run flawlessly. Thank you Mike.

