
  • How to go from a string variable to date, and a specific duplicate

    Hello,

    I have two problems. The first is with the Date variable below, which is a string variable. When I try to use the replace command, Stata says it cannot do it. I understand this is because Stata reads it as a string but cannot interpret what is in the string. How can I make Stata create a date variable from my Date below?

    Also, for the specific duplicate: I have a company that might have been copied twice: CKXE. Unfortunately I cannot show it to you in this dataex, because it's in the middle of the big dataset (unless you can tell me a way of getting the specific range of observations that contains CKXE).
    Once I find the duplicated TIC, I need to delete it.

    How would I go about doing this?

    Thanks!


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str7 TIC str10 CIK str4 Date str72 Company_Conformed_Name str9 Form_Type str40 BA_Street1 str30 BA_City
    "AIR"  "0000001750" "2015" "AAR CORP"                          "10-K"   "1100 N WOOD DALE RD"           "WOOD DALE"      
    "AIR"  "0000001750" "2014" "AAR CORP"                          "10-K"   "1100 N WOOD DALE RD"           "WOOD DALE"      
    "AIR"  "0000001750" "2013" "AAR CORP"                          "10-K"   "1100 N WOOD DALE RD"           "WOOD DALE"      
    "AIR"  "0000001750" "2012" "AAR CORP"                          "10-K"   "1100 N WOOD DALE RD"           "WOOD DALE"      
    "AIR"  "0000001750" "2011" "AAR CORP"                          "10-K"   "1100 N WOOD DALE RD"           "WOOD DALE"      
    "AIR"  "0000001750" "2010" "AAR CORP"                          "10-K"   "1100 N WOOD DALE RD"           "WOOD DALE"      
    "AIR"  "0000001750" "2009" "AAR CORP"                          "10-K"   "1100 N WOOD DALE RD"           "WOOD DALE"      
    "AAL"  "0000006201" "2015" "AMERICAN AIRLINES INC"             "10-K"   "4333 AMON CARTER BLVD"         "FORT WORTH"     
    "AAL"  "0000006201" "2014" "AMERICAN AIRLINES INC"             "10-K"   "4333 AMON CARTER BLVD"         "FORT WORTH"     
    "AAL"  "0000006201" "2013" "AMR CORP"                          "10-K/A" "4333 AMON CARTER BLVD"         "FORT WORTH"     
    "AAL"  "0000006201" "2013" "AMR CORP"                          "10-K"   "4333 AMON CARTER BLVD"         "FORT WORTH"     
    "AAL"  "0000006201" "2012" "AMR CORP"                          "10-K"   "4333 AMON CARTER BLVD"         "FORT WORTH"     
    "AAL"  "0000006201" "2011" "AMR CORP"                          "10-K"   "4333 AMON CARTER BLVD"         "FORT WORTH"     
    "AAL"  "0000006201" "2010" "AMR CORP"                          "10-K"   "4333 AMON CARTER BLVD"         "FORT WORTH"     
    "AAL"  "0000006201" "2009" "AMR CORP"                          "10-K"   "4333 AMON CARTER BLVD"         "FORT WORTH"     
    "CECE" "0000003197" "2015" "CECO ENVIRONMENTAL CORP"           "10-K"   "4625 RED BANK ROAD, SUITE 200" "CINCINNATI"     
    "CECE" "0000003197" "2014" "CECO ENVIRONMENTAL CORP"           "10-K"   "4625 RED BANK ROAD, SUITE 200" "CINCINNATI"     
    "CECE" "0000003197" "2013" "CECO ENVIRONMENTAL CORP"           "10-K"   "4625 RED BANK ROAD, SUITE 200" "CINCINNATI"     
    "CECE" "0000003197" "2012" "CECO ENVIRONMENTAL CORP"           "10-K"   "4625 RED BANK ROAD, SUITE 200" "CINCINNATI"     
    "CECE" "0000003197" "2011" "CECO ENVIRONMENTAL CORP"           "10-K"   "3120 FORRER STREET"            "CINCINNATI"     
    "CECE" "0000003197" "2010" "CECO ENVIRONMENTAL CORP"           "10-K"   "3120 FORRER STREET"            "CINCINNATI"     
    "CECE" "0000003197" "2009" "CECO ENVIRONMENTAL CORP"           "10-K/A" "3120 FORRER STREET"            "CINCINNATI"     
    "CECE" "0000003197" "2009" "CECO ENVIRONMENTAL CORP"           "10-K"   "3120 FORRER STREET"            "CINCINNATI"     
    "AVX"  "0000859163" "2015" "AVX Corp"                          "10-K"   "1 AVX BOULEVARD"               "FOUNTAIN INN"   
    "AVX"  "0000859163" "2014" "AVX Corp"                          "10-K/A" "1 AVX BOULEVARD"               "FOUNTAIN INN"   
    "AVX"  "0000859163" "2014" "AVX Corp"                          "10-K"   "1 AVX BOULEVARD"               "FOUNTAIN INN"   
    "AVX"  "0000859163" "2013" "AVX Corp"                          "10-K/A" "1 AVX BOULEVARD"               "FOUNTAIN INN"   
    "AVX"  "0000859163" "2013" "AVX Corp"                          "10-K"   "1 AVX BOULEVARD"               "FOUNTAIN INN"   
    "AVX"  "0000859163" "2012" "AVX Corp"                          "10-K"   "1 AVX BOULEVARD"               "FOUNTAIN INN"   
    "AVX"  "0000859163" "2011" "AVX Corp"                          "10-K"   "1 AVX BOULEVARD"               "FOUNTAIN INN"   
    "AVX"  "0000859163" "2010" "AVX CORP"                          "10-K"   "801 17TH AVE S"                "MYRTLE BEACH"   
    "AVX"  "0000859163" "2009" "AVX CORP"                          "10-K"   "801 17TH AVE S"                "MYRTLE BEACH"   
    "PNW"  "0000764622" "2015" "ARIZONA PUBLIC SERVICE CO"         "10-K"   "400 NORTH FIFTH STREET"        "PHOENIX"        
    "PNW"  "0000764622" "2014" "ARIZONA PUBLIC SERVICE CO"         "10-K"   "400 NORTH FIFTH STREET"        "PHOENIX"        
    "PNW"  "0000764622" "2013" "ARIZONA PUBLIC SERVICE CO"         "10-K"   "400 NORTH FIFTH STREET"        "PHOENIX"        
    "PNW"  "0000764622" "2012" "PINNACLE WEST CAPITAL CORP"        "10-K"   "400 N FIFTH ST"                "PHOENIX"        
    "PNW"  "0000764622" "2011" "PINNACLE WEST CAPITAL CORP"        "10-K"   "400 N FIFTH ST"                "PHOENIX"        
    "PNW"  "0000764622" "2010" "PINNACLE WEST CAPITAL CORP"        "10-K"   "400 N FIFTH ST"                "PHOENIX"        
    "PNW"  "0000764622" "2009" "PINNACLE WEST CAPITAL CORP"        "10-K"   "400 N FIFTH ST"                "PHOENIX"        
    "AAN"  "0000706688" "2015" "AARON'S INC"                       "10-K"   "309 E. PACES FERRY ROAD, N.E." "ATLANTA"        
    "AAN"  "0000706688" "2014" "AARON'S INC"                       "10-K/A" "309 E. PACES FERRY ROAD, N.E." "ATLANTA"        
    "AAN"  "0000706688" "2014" "AARON'S INC"                       "10-K"   "309 E. PACES FERRY ROAD, N.E." "ATLANTA"        
    "AAN"  "0000706688" "2013" "AARON'S INC"                       "10-K"   "309 E. PACES FERRY ROAD, N.E." "ATLANTA"        
    "AAN"  "0000706688" "2012" "AARON'S INC"                       "10-K"   "309 E. PACES FERRY ROAD, N.E." "ATLANTA"        
    "AAN"  "0000706688" "2011" "AARON'S INC"                       "10-K"   "309 E. PACES FERRY ROAD, N.E." "ATLANTA"        
    "AAN"  "0000706688" "2010" "AARON'S INC"                       "10-K"   "309 E. PACES FERRY ROAD, N.E." "ATLANTA"        
    "AAN"  "0000706688" "2009" "AARON RENTS INC"                   "10-K"   "309 E. PACES FERRY ROAD, N.E." "ATLANTA"        
    "ABT"  "0000001800" "2015" "ABBOTT LABORATORIES"               "10-K"   "100 ABBOTT PARK ROAD"          "ABBOTT PARK"    
    "ABT"  "0000001800" "2014" "ABBOTT LABORATORIES"               "10-K"   "100 ABBOTT PARK ROAD"          "ABBOTT PARK"    
    "ABT"  "0000001800" "2013" "ABBOTT LABORATORIES"               "10-K/A" "100 ABBOTT PARK ROAD"          "ABBOTT PARK"    
    "ABT"  "0000001800" "2013" "ABBOTT LABORATORIES"               "10-K"   "100 ABBOTT PARK ROAD"          "ABBOTT PARK"    
    "ABT"  "0000001800" "2012" "ABBOTT LABORATORIES"               "10-K"   "100 ABBOTT PARK ROAD"          "ABBOTT PARK"    
    "ABT"  "0000001800" "2011" "ABBOTT LABORATORIES"               "10-K"   "100 ABBOTT PARK ROAD"          "ABBOTT PARK"    
    "ABT"  "0000001800" "2010" "ABBOTT LABORATORIES"               "10-K"   "100 ABBOTT PARK ROAD"          "ABBOTT PARK"    
    "ABT"  "0000001800" "2009" "ABBOTT LABORATORIES"               "10-K"   "100 ABBOTT PARK ROAD"          "ABBOTT PARK"    
    "ACET" "0000002034" "2015" "ACETO CORP"                        "10-K"   "4 TRI HARBOR COURT"            "PORT WASHINGTON"
    "ACET" "0000002034" "2014" "ACETO CORP"                        "10-K"   "4 TRI HARBOR COURT"            "PORT WASHINGTON"
    "ACET" "0000002034" "2013" "ACETO CORP"                        "10-K"   "4 TRI HARBOR COURT"            "PORT WASHINGTON"
    "ACET" "0000002034" "2012" "ACETO CORP"                        "10-K"   "4 TRI HARBOR COURT"            "PORT WASHINGTON"
    "ACET" "0000002034" "2011" "ACETO CORP"                        "10-K"   "ONE HOLLOW LANE"               "LAKE SUCCESS"   
    "ACET" "0000002034" "2010" "ACETO CORP"                        "10-K"   "ONE HOLLOW LANE"               "LAKE SUCCESS"   
    "ACET" "0000002034" "2009" "ACETO CORP"                        "10-K"   "ONE HOLLOW LANE"               "LAKE SUCCESS"   
    "AE"   "0000002178" "2015" "ADAMS RESOURCES & ENERGY, INC."    "10-K"   "17 S. BRIAR HOLLOW LN."        "HOUSTON"        
    "AE"   "0000002178" "2014" "ADAMS RESOURCES & ENERGY, INC."    "10-K"   "17 S. BRIAR HOLLOW LN."        "HOUSTON"        
    "AE"   "0000002178" "2013" "ADAMS RESOURCES & ENERGY, INC."    "10-K"   "17 S. BRIAR HOLLOW LN."        "HOUSTON"        
    "AE"   "0000002178" "2012" "ADAMS RESOURCES & ENERGY, INC."    "10-K"   "4400 POST OAK PKWY STE 2700"   "HOUSTON"        
    "AE"   "0000002178" "2011" "ADAMS RESOURCES & ENERGY, INC."    "10-K"   "4400 POST OAK PKWY STE 2700"   "HOUSTON"        
    "AE"   "0000002178" "2010" "ADAMS RESOURCES & ENERGY, INC."    "10-K"   "4400 POST OAK PKWY STE 2700"   "HOUSTON"        
    "AE"   "0000002178" "2009" "ADAMS RESOURCES & ENERGY, INC."    "10-K"   "4400 POST OAK PKWY STE 2700"   "HOUSTON"        
    "AMD"  "0000002488" "2015" "ADVANCED MICRO DEVICES INC"        "10-K"   "ONE AMD PL"                    "SUNNYVALE"      
    "AMD"  "0000002488" "2014" "ADVANCED MICRO DEVICES INC"        "10-K"   "ONE AMD PL"                    "SUNNYVALE"      
    "AMD"  "0000002488" "2013" "ADVANCED MICRO DEVICES INC"        "10-K"   "ONE AMD PL"                    "SUNNYVALE"      
    "AMD"  "0000002488" "2012" "ADVANCED MICRO DEVICES INC"        "10-K"   "ONE AMD PL"                    "SUNNYVALE"      
    "AMD"  "0000002488" "2011" "ADVANCED MICRO DEVICES INC"        "10-K"   "ONE AMD PL"                    "SUNNYVALE"      
    "AMD"  "0000002488" "2010" "ADVANCED MICRO DEVICES INC"        "10-K"   "ONE AMD PL"                    "SUNNYVALE"      
    "AMD"  "0000002488" "2009" "ADVANCED MICRO DEVICES INC"        "10-K"   "ONE AMD PL"                    "SUNNYVALE"      
    "AIM"  "0000109471" "2013" "AEROSONIC CORP /DE/"               "10-K/A" "1212 N HERCULES AVE"           "CLEARWATER"     
    "AIM"  "0000109471" "2013" "AEROSONIC CORP /DE/"               "10-K"   "1212 N HERCULES AVE"           "CLEARWATER"     
    "AIM"  "0000109471" "2012" "AEROSONIC CORP /DE/"               "10-K"   "1212 N HERCULES AVE"           "CLEARWATER"     
    "AIM"  "0000109471" "2011" "AEROSONIC CORP /DE/"               "10-K"   "1212 N HERCULES AVE"           "CLEARWATER"     
    "AIM"  "0000109471" "2010" "AEROSONIC CORP /DE/"               "10-K"   "1212 N HERCULES AVE"           "CLEARWATER"     
    "AIM"  "0000109471" "2009" "AEROSONIC CORP /DE/"               "10-K"   "1212 N HERCULES AVE"           "CLEARWATER"     
    "APD"  "0000002969" "2015" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K"   "7201 HAMILTON BLVD"            "ALLENTOWN"      
    "APD"  "0000002969" "2014" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K"   "7201 HAMILTON BLVD"            "ALLENTOWN"      
    "APD"  "0000002969" "2013" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K"   "7201 HAMILTON BLVD"            "ALLENTOWN"      
    "APD"  "0000002969" "2012" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K"   "7201 HAMILTON BLVD"            "ALLENTOWN"      
    "APD"  "0000002969" "2011" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K"   "7201 HAMILTON BLVD"            "ALLENTOWN"      
    "APD"  "0000002969" "2010" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K"   "7201 HAMILTON BLVD"            "ALLENTOWN"      
    "APD"  "0000002969" "2009" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K"   "7201 HAMILTON BLVD"            "ALLENTOWN"      
    "ALAN" "0000098618" "2015" "ALANCO TECHNOLOGIES INC"           "10-K"   "7950 E. ACOMA DRIVE"           "SCOTTSDALE"     
    "ALAN" "0000098618" "2014" "ALANCO TECHNOLOGIES INC"           "10-K"   "7950 E. ACOMA DRIVE"           "SCOTTSDALE"     
    "ALAN" "0000098618" "2013" "ALANCO TECHNOLOGIES INC"           "10-K"   "7950 E. ACOMA DRIVE"           "SCOTTSDALE"     
    "ALAN" "0000098618" "2012" "ALANCO TECHNOLOGIES INC"           "10-K/A" "7950 E. ACOMA DRIVE"           "SCOTTSDALE"     
    "ALAN" "0000098618" "2012" "ALANCO TECHNOLOGIES INC"           "10-K"   "7950 E. ACOMA DRIVE"           "SCOTTSDALE"     
    "ALAN" "0000098618" "2012" "ALANCO TECHNOLOGIES INC"           "10-K/A" "7950 E. ACOMA DRIVE"           "SCOTTSDALE"     
    "ALAN" "0000098618" "2011" "ALANCO TECHNOLOGIES INC"           "10-K/A" "7950 E. ACOMA DRIVE"           "SCOTTSDALE"     
    "ALAN" "0000098618" "2011" "ALANCO TECHNOLOGIES INC"           "10-K"   "7950 E. ACOMA DRIVE"           "SCOTTSDALE"     
    "ALAN" "0000098618" "2010" "ALANCO TECHNOLOGIES INC"           "10-K"   "15575 N 83RD WAY"              "SCOTTSDALE"     
    "ALAN" "0000098618" "2009" "ALANCO TECHNOLOGIES INC"           "10-K"   "15575 N 83RD WAY"              "SCOTTSDALE"     
    "ALK"  "0000766421" "2015" "ALASKA AIR GROUP, INC."            "10-K"   "19300 INTERNATIONAL BOULEVARD" "SEATTLE"        
    end

  • #2
    Code:
    help datetime
    help duplicates



    • #3
      For the yearly Date, just destring the variable.

      Code:
      . destring Date, replace
      Date: all characters numeric; replaced as int
      For the duplicates:

      Code:
      . duplicates report
      
      Duplicates in terms of all variables
      
      --------------------------------------
         copies | observations       surplus
      ----------+---------------------------
              1 |           98             0
              2 |            2             1
      --------------------------------------
      
      . duplicates list
      
      Duplicates in terms of all variables
      
        +--------------------------------------------------------------------------------------------+
        | obs: |  TIC |        CIK | Date |  Company_Conformed_Name | Form_T~e |          BA_Street1 |
        |   46 | ALAN | 0000098618 | 2012 | ALANCO TECHNOLOGIES INC |   10-K/A | 7950 E. ACOMA DRIVE |
        |--------------------------------------------------------------------------------------------|
        |                                            BA_City                                         |
        |                                         SCOTTSDALE                                         |
        +--------------------------------------------------------------------------------------------+
      
        +--------------------------------------------------------------------------------------------+
        | obs: |  TIC |        CIK | Date |  Company_Conformed_Name | Form_T~e |          BA_Street1 |
        |   55 | ALAN | 0000098618 | 2012 | ALANCO TECHNOLOGIES INC |   10-K/A | 7950 E. ACOMA DRIVE |
        |--------------------------------------------------------------------------------------------|
        |                                            BA_City                                         |
        |                                         SCOTTSDALE                                         |
        +--------------------------------------------------------------------------------------------+
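
      Once the report confirms which observations are exact copies, deleting the surplus takes one more command. The following is a minimal sketch on a four-row toy extract of the ALAN rows from the example data; keeping only three variables and naming the flag dup are assumptions made to keep it short. On the full dataset the same commands would compare all variables.

      ```stata
      * Toy extract of the ALAN rows (three variables only, for illustration)
      clear
      input str7 TIC str4 Date str9 Form_Type
      "ALAN" "2012" "10-K/A"
      "ALAN" "2012" "10-K"
      "ALAN" "2012" "10-K/A"
      "ALAN" "2011" "10-K/A"
      end

      * dup counts how many other observations are identical on all variables
      duplicates tag, generate(dup)
      list if dup > 0

      * Remove the helper variable, then keep one copy of each identical group
      drop dup
      duplicates drop
      ```

      If observations should count as duplicates whenever they agree on a key such as TIC, Date and Form_Type, duplicates drop TIC Date Form_Type, force would do that, but it discards observations that differ on the remaining variables, so it is worth listing them first.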



      • #4
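        "Unfortunately I cannot show it to you in this dataex, because it's in the middle of the big dataset"

        It's actually easy: just create a variable to_share equal to 1 if TIC=="CKXE", then restrict dataex to it. The example below uses AMD and APD, which appear in your posted data.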

        Code:
        gen to_share=1 if inlist(TIC, "AMD", "APD")
        dataex TIC CIK Date Company_Conformed_Name Form_Type BA_Street1 BA_City if to_share==1
        
        * Output from dataex
        clear
        input str7 TIC str10 CIK str4 Date str72 Company_Conformed_Name str9 Form_Type str40 BA_Street1 str30 BA_City
        "AMD" "0000002488" "2015" "ADVANCED MICRO DEVICES INC"        "10-K" "ONE AMD PL"         "SUNNYVALE"
        "AMD" "0000002488" "2014" "ADVANCED MICRO DEVICES INC"        "10-K" "ONE AMD PL"         "SUNNYVALE"
        "AMD" "0000002488" "2013" "ADVANCED MICRO DEVICES INC"        "10-K" "ONE AMD PL"         "SUNNYVALE"
        "AMD" "0000002488" "2012" "ADVANCED MICRO DEVICES INC"        "10-K" "ONE AMD PL"         "SUNNYVALE"
        "AMD" "0000002488" "2011" "ADVANCED MICRO DEVICES INC"        "10-K" "ONE AMD PL"         "SUNNYVALE"
        "AMD" "0000002488" "2010" "ADVANCED MICRO DEVICES INC"        "10-K" "ONE AMD PL"         "SUNNYVALE"
        "AMD" "0000002488" "2009" "ADVANCED MICRO DEVICES INC"        "10-K" "ONE AMD PL"         "SUNNYVALE"
        "APD" "0000002969" "2015" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K" "7201 HAMILTON BLVD" "ALLENTOWN"
        "APD" "0000002969" "2014" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K" "7201 HAMILTON BLVD" "ALLENTOWN"
        "APD" "0000002969" "2013" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K" "7201 HAMILTON BLVD" "ALLENTOWN"
        "APD" "0000002969" "2012" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K" "7201 HAMILTON BLVD" "ALLENTOWN"
        "APD" "0000002969" "2011" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K" "7201 HAMILTON BLVD" "ALLENTOWN"
        "APD" "0000002969" "2010" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K" "7201 HAMILTON BLVD" "ALLENTOWN"
        "APD" "0000002969" "2009" "AIR PRODUCTS & CHEMICALS INC /DE/" "10-K" "7201 HAMILTON BLVD" "ALLENTOWN"
        end
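
        To answer the "specific range of observations" part of the question directly, the position of a ticker in the dataset can be read off with summarize. This is a sketch under assumptions: the helper variable obs_no and the locals first and last are hypothetical names, the five rows are a small extract of the posted data, and AMD stands in for CKXE, which does not appear in the posted data.

        ```stata
        * Small extract of the posted data; AMD stands in for CKXE
        clear
        input str7 TIC str10 CIK str4 Date
        "AE"  "0000002178" "2010"
        "AE"  "0000002178" "2009"
        "AMD" "0000002488" "2015"
        "AMD" "0000002488" "2014"
        "AIM" "0000109471" "2013"
        end

        * Record each observation's position, then find where the ticker sits
        generate long obs_no = _n
        summarize obs_no if TIC == "AMD", meanonly
        local first = r(min)
        local last  = r(max)
        display "AMD occupies observations `first' to `last'"

        * Show just that block of observations
        list in `first'/`last'
        ```

        The if qualifier also works on dataex itself, so dataex TIC CIK Date if TIC == "CKXE" would share those rows without creating an extra variable.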



        • #5
          Kolev,

          When I run the command it returns:

          . destring Date, replace
          Date: contains nonnumeric characters; no replace

          Perhaps the numbers become readable by Stata when they are imported from dataex. In my full dataset, however, I cannot destring Date and replace it.


          I tried using:

          gen date2 = date(Date, "DMY", 2007)

          However, it only returns missing observations.
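
          A likely reason for the missing values: date() with the mask "DMY" expects a day, a month and a year in the string, while Date holds only a four-digit year, so nothing matches. The third argument (2007) only affects how two-digit years are expanded. For year-only strings a plain numeric conversion is enough. A minimal sketch, using three values from the example data; the names bad_date and year are made up for illustration.

          ```stata
          clear
          input str4 Date
          "2015"
          "2014"
          "2013"
          end

          * The mask asks for day, month and year, so a bare year yields missing
          generate bad_date = date(Date, "DMY")

          * A year on its own is just a number; real() converts the string
          generate int year = real(Date)

          list
          ```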
          Last edited by Falco Wolf; 03 Feb 2019, 14:10.



          • #6
            It looks like your Date variable has a non-numeric character somewhere in your dataset. You can use the force option of destring, but you'll probably want to find and fix the problem first rather than resort to force.

            To get a feel for where the problem lies I would start with

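            Code:
            * contract replaces the data in memory, so work on a temporary copy
            preserve
            contract Date
            drop _freq
            
            gen date_check = Date
            destring date_check, replace force
            
            * values that could not be converted
            tab Date if date_check == .
            restore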
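
            Once the offending values are known, they can be blanked out so that destring succeeds without force. A minimal sketch, assuming the only bad value is the text "nan" (which is what turned out to be the case in this thread) and using a three-row toy variable:

            ```stata
            clear
            input str4 Date
            "2015"
            "nan"
            "2013"
            end

            * Replace the non-numeric text with an empty string
            replace Date = "" if Date == "nan"

            * Empty strings become numeric missing, so no force option is needed
            destring Date, replace

            list
            ```

            Fixing the values explicitly, rather than relying on force, keeps a record of exactly which text was treated as missing.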





            • #7
              Hey! The destringing is now solved. Justin, you were right: I had some "nan" values there. Thanks!

              I'll open a new thread for the duplicates, because I have a slightly different question.
