Hi all,
I am currently working on this dataset and got a couple of queries please regarding rearranging the data. My initial dataset looks as below which has date of events by State-LGAs-Year, with the final column containing the date of events.
I would like to convert the duplicate rows by State-LGA-Year into columns. For example, rows 1 and 2 from the above data has event information for the state New South Wales, LGA Blue Mountains for the year 2011. Date of the duplicate event have to be rearranged as a column for all the duplicate rows and the resulting output should look as below:
Finally, I need to find the most recent date of the last three columns (that provides the date of each event happened in a year), for which the resulting dataset would look as below:
Thanks for your help.
Regards
Vignesh.
I am currently working on this dataset and got a couple of queries please regarding rearranging the data. My initial dataset looks as below which has date of events by State-LGAs-Year, with the final column containing the date of events.
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str28 state str46 LGA float year int Date_Start "New South Wales" "Blue Mountains" 2011 18890 "New South Wales" "Blue Mountains" 2011 18924 "New South Wales" "Blue Mountains" 2013 19611 "New South Wales" "Blue Mountains" 2013 19644 "New South Wales" "Blue Mountains" 2013 19647 "New South Wales" "Blue Mountains" 2014 20028 "New South Wales" "Blue Mountains" 2014 20041 "New South Wales" "Blue Mountains" 2015 20302 "New South Wales" "Blue Mountains" 2019 21792 "Victoria" "Blue Mountains" 2015 20159 "Victoria" "Blue Mountains" 2015 20376 "Victoria" "Blue Mountains" 2017 20862 "Victoria" "Peyton" 2017 21386 end format %tdnn/dd/CCYY Date_Start
I would like to convert the duplicate rows by State-LGA-Year into columns. For example, rows 1 and 2 from the above data has event information for the state New South Wales, LGA Blue Mountains for the year 2011. Date of the duplicate event have to be rearranged as a column for all the duplicate rows and the resulting output should look as below:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str28 state str46 LGA float year int(Date_Start Date_Start_01 Date_Start_02) "New South Wales" "Blue Mountains" 2011 18890 18924 . "New South Wales" "Blue Mountains" 2013 19611 19644 19647 "New South Wales" "Blue Mountains" 2014 20028 20041 . "New South Wales" "Blue Mountains" 2015 20302 . . "New South Wales" "Blue Mountains" 2019 21792 . . "Victoria" "Blue Mountains" 2015 20159 20376 . "Victoria" "Blue Mountains" 2017 20862 . . "Victoria" "Peyton" 2017 21386 . . end format %tdnn/dd/CCYY Date_Start format %tdnn/dd/CCYY Date_Start_01 format %tdnn/dd/CCYY Date_Start_02
Finally, I need to find the most recent date of the last three columns (that provides the date of each event happened in a year), for which the resulting dataset would look as below:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str28 state str46 LGA float year int(Date_Start Date_Start_01 Date_Start_02 recentdate) "New South Wales" "Blue Mountains" 2011 18890 18924 . 18924 "New South Wales" "Blue Mountains" 2013 19611 19644 19647 19644 "New South Wales" "Blue Mountains" 2014 20028 20041 . 20041 "New South Wales" "Blue Mountains" 2015 20302 . . . "New South Wales" "Blue Mountains" 2019 21792 . . . "Victoria" "Blue Mountains" 2015 20159 20376 . 20376 "Victoria" "Blue Mountains" 2017 20862 . . . "Victoria" "Peyton" 2017 21386 . . 21386 end format %tdnn/dd/CCYY Date_Start format %tdnn/dd/CCYY Date_Start_01 format %tdnn/dd/CCYY Date_Start_02 format %tdnn/dd/CCYY recentdate
Thanks for your help.
Regards
Vignesh.
Comment