
  • Listing and storing values of observations nearest to specified value

    Hi all,

    I would very much value some help with the below!


    Based on this example dataset
    v1 v2 v3
    19 1.5 1.7
    19 1.5 1.7
    20 1.6 1.8
    21 0.5 0.6
    22 1.2 1.4
    22 1.2 1.4
    22 1.2 1.4
    23 1.9 2.2
    I would like to store v2 and v3 for each of the following 2 values of v1: 19.5 and 22.3.
    Specifically, since v1 only consists of integers, I want to find the nearest value of v1 (rounded up); in this case these would be 20 and 22.
    There are also repetitions, as shown in the above table; I only want one of these values.

    I would like to then export into an excel file the below desired output table:
    v1 v2 v3
    20 1.6 1.8
    22 1.2 1.4
    I would like the code not to destroy the data, i.e. it will need to store the values and then output them.

    Many thanks in advance for any help!

  • #2
    Your question is unclear in some respects, so I am making a few assumptions to fill in the blanks. If these assumptions are not in accord with what you want to do, please post back with additional information to clarify.

    Assumption 1: You say you want to "round up," yet you associate 22.3 with 22, which is rounding down. On the other hand, associating 19.5 with 20 is rounding up. I assume what you really mean is you just want to round to the nearest integer. By convention, x.5 rounded to the nearest integer is x+1.
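
    As a quick check of that convention, here is a minimal sketch using Stata's round() function with a single argument, applied to the two target values from the question:

    ```stata
    * round() with one argument rounds to the nearest integer
    display round(19.5)
    * displays 20
    display round(22.3)
    * displays 22
    ```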

    Assumption 2: There are three observations with v1 = 22. As it happens they all agree on the values of v2 and v3, and you have retained only one observation with their common values. It is unclear whether this fortunate agreement on v2 and v3 always occurs in the full data set. (And if it does, it's unclear why you even have multiple such observations that provide no new information.) I will assume that this fortunate agreement does not necessarily always occur. Consequently, my code will retain all observations associated with those nearest integers, but will reduce any groups of identical observations to a single one.

    Assumption 3: Your question is stated in terms of two target values, 19.5 and 22.3. With your full data set and real problem, there could be more than two such values you are concerned with. The code below will work with up to 249 such target values. If you need to handle more than that, a slightly different approach is needed: post back with a good estimate of how many target values you need to handle, and I can suggest specific changes.
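
    For a longer list of targets, one possible variant is sketched here. It is an assumption on my part about what could work, not necessarily the approach alluded to above: the rounded targets are kept in their own temporary data set and matched with -merge- instead of inlist(). It assumes the data with v1, v2 and v3 are already in memory; the variable name target and the tempfile name wanted are hypothetical, and the commands should be run together from a do-file so the tempfile stays defined.

    ```stata
    * Assumes the data with v1, v2 and v3 are already in memory.
    * Hypothetical: targets are typed in here; in practice they
    * might instead be read from a file.
    preserve
    clear
    input double target
    19.5
    22.3
    end
    * Round each target to the nearest integer, one row per integer
    generate v1 = round(target)
    keep v1
    duplicates drop
    tempfile wanted
    save `wanted'
    restore

    * Keep only observations whose v1 matches a rounded target,
    * then reduce identical observations to a single one
    merge m:1 v1 using `wanted', keep(match) nogenerate
    duplicates drop
    ```

    Because the targets sit in a data set rather than in an argument list, this sketch is not bound by the limit on the number of arguments to inlist().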

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte v1 float(v2 v3)
    19 1.5 1.7
    19 1.5 1.7
    20 1.6 1.8
    21  .5  .6
    22 1.2 1.4
    22 1.2 1.4
    22 1.2 1.4
    23 1.9 2.2
    end
    
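    * Target values to look up (up to 249 fit in one inlist() call)
    local targets 19.5 22.3
    
    * Round each target to the nearest integer and collect the results
    local nearest_integers
    foreach t of local targets {
        local nearest_integers `nearest_integers' `=round(`t')'
    }
    * Turn the space-separated list into a comma-separated one for inlist()
    local nearest_integers: subinstr local nearest_integers " " ", ", all
    
    * Keep only the matching observations (this changes the data in memory),
    * then reduce groups of identical observations to a single one
    keep if inlist(v1, `nearest_integers')
    duplicates drop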
    Added: The code creates a Stata data set with the information you requested. To export it to Excel, use the -export excel- command. If you are not familiar with it, read -help export excel-.
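
    Since the original post asks that the data not be destroyed, one way to handle that is to wrap the last two commands of the code in -preserve- and -restore- and export in between. This is a minimal sketch: the file name nearest_values.xlsx is hypothetical, and it assumes the local macro nearest_integers has just been built as in the code above, in the same do-file run, with the full data set still in memory.

    ```stata
    * Assumes local macro nearest_integers is defined as in the code above
    * and that the full data set is in memory.
    preserve
    keep if inlist(v1, `nearest_integers')
    duplicates drop
    * "nearest_values.xlsx" is a hypothetical file name;
    * firstrow(variables) writes v1 v2 v3 as the header row
    export excel using "nearest_values.xlsx", firstrow(variables) replace
    restore
    ```

    After -restore-, the full original data set is back in memory, and the reduced table exists only in the Excel file.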

    In the future, when showing data examples, please use the -dataex- command to do so, as I have done here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
    Last edited by Clyde Schechter; 22 Dec 2024, 14:03.
