  • Replace Sample vs Population

    Hello, I need help. I have 4 variables, i.e.
    VRegS, VRegP, VRegPmake and VRegSmake.
    Capital S stands for sample, while P stands for population.

    VRegS=vehicle registration for sample

    VRegP= vehicle registration for population

    VRegPmake=...make of a vehicle from the population

    VRegSmake=...make of a vehicle from sample
    Sample size is 100
    Population is 1000

    So, I have 100 observations of VRegS, 1000 observations of VRegP, and 1000 observations of VRegPmake.

    And VRegSmake is missing for all 100 sample observations.

    So, I want to fill in VRegSmake.
    How?

    Here is the catch:
    The 100 makes for S can be found in the 1000 makes for P.

    I can't use

    . replace VRegSmake = VRegPmake if VRegS == VRegP

    Because S is smaller than P, the values are not matched one to one in the same rows, even if I sort. You getting the point?

    So, what should I do?

  • #2
    hellooo someone help me please

    • #3
      you getting the point?
      No. In fact, I'm totally confused about what you want to do. I suspect others are as well, which is probably why you got no response. I suggest you post a small representative sample of your data, and then show us a hand-worked solution for that sample so we can see what you want. Please use -dataex- to post your data sample so that those who might want to help you can quickly, easily, and completely faithfully reproduce your example and experiment with it. Run -ssc install dataex- to get the -dataex- command if you don't already have it. Simple instructions for using it are in -help dataex-. I also recommend you read the FAQ for some good suggestions about how to ask questions clearly.

      • #4
        All thanks. I have pasted an example of what I am dealing with below.
        idS  VRegS    VRegS_make  idP  VRegP    VRegP_make
        1    BAB2010  .           1    BAB2010  TOYOTA
        2    AAB300   .           2    G0123BT  FORD
        3    ACK4500  .           3    AAB300   SCANIA
        4    BAC200   .           4    BAA290   MAZDA
        5    ADD4205  .           5    ACK4500  BMW
                                  6    ALM2077  LEXUS
                                  7    BAC200   MITSUBISHI
                                  8    ACF500   LOTUS
                                  9    ADD4205  AUDI
                                  10   BBA2240  OPEL
        What I want is to fill in VRegS_make using the makes in VRegP_make. How? Every VRegS value is found in VRegP, and VRegP has the VRegP_make (needed to populate VRegS_make).

        The challenge is I have 3,054 VRegS observations and 647,521 VRegP observations. Thus, it's not as easy as simply getting rid of the "5" extra observations in the pasted example.

        The pasted dataset will thus only be used as a guide for me to work on the larger dataset.

        Please help me.

        • #5
          Please find attached dataset in stata format. Thanks.

          • #6
            You would be better off keeping separate info in separate datasets in this case, and merging them only in a way that keeps the info in each row with the observation it belongs to.

            Code:
            * move the population data to a separate dataset
            preserve
            drop ids vregs vregs_make
            * rename the key so it matches the sample's key variable
            ren vregp vregs
            save "C:\XYZ\P.dta", replace
            restore
            * remove the population data from the sample set
            drop vregs_make idp vregp vregp_make
            * get rid of the empty rows left over in the sample set
            drop if ids == .
            save "C:\XYZ\S.dta", replace
            * reintroduce the vreg info properly, through merging on the registration number
            merge 1:1 vregs using "C:\XYZ\P.dta"
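
For anyone who wants to try the split-and-merge idea above without the attachment, here is a minimal self-contained sketch built from the ten rows posted in #4. It is only an illustration under some assumptions: the variable names are lowercased, the registration numbers are stored as strings, each plate appears at most once in both the sample and the population, and the tempfile name `P` is made up for this example. Run it as a do-file.

```stata
* Population data: the ten rows from #4 (lowercase names are an assumption)
clear
input idp str8 vregp str12 vregp_make
1  "BAB2010" "TOYOTA"
2  "G0123BT" "FORD"
3  "AAB300"  "SCANIA"
4  "BAA290"  "MAZDA"
5  "ACK4500" "BMW"
6  "ALM2077" "LEXUS"
7  "BAC200"  "MITSUBISHI"
8  "ACF500"  "LOTUS"
9  "ADD4205" "AUDI"
10 "BBA2240" "OPEL"
end
* Rename the key so it matches the key variable in the sample
rename vregp vregs
tempfile P
save `P'

* Sample data: the five sample rows from #4, without the empty make column
clear
input ids str8 vregs
1 "BAB2010"
2 "AAB300"
3 "ACK4500"
4 "BAC200"
5 "ADD4205"
end

* Look up each sample plate in the population; keep only the sample rows
merge 1:1 vregs using `P', keepusing(vregp_make) keep(master match)

* Stop with an error if any sample plate was not found in the population
assert _merge == 3

* The looked-up make becomes the sample's make
rename vregp_make vregs_make
drop _merge
sort ids
list ids vregs vregs_make
```

Working it out by hand from the rows in #4, the five sample plates should pick up TOYOTA, SCANIA, BMW, MITSUBISHI and AUDI respectively. If the same plate could appear more than once in the sample, `merge m:1` in place of `merge 1:1` would be the thing to try, as long as plates remain unique in the population file.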
