oracle - Finding the group of 3 minutes for each ID in Hive SQL
I have data such as:
id  time
1   9/6/2016 00:01:00
1   9/6/2016 00:01:30
1   9/6/2016 00:02:00
1   9/6/2016 00:04:30
1   9/6/2016 00:05:30
1   9/6/2016 01:05:30
1   9/6/2016 05:05:30
1   9/6/2016 05:06:30
2   9/6/2016 01:55:00
2   9/6/2016 01:56:29
2   9/6/2016 01:57:31
2   9/6/2016 03:55:00
2   9/6/2016 04:13:00
2   9/6/2016 04:15:21
For each id, I want to set a new variable called flag to 1 and check the first value of time. From that first value of time, I want to check the entries that fall within 3 minutes of the first entry and set the flag to 1 for all of them. Once the time entries go above 3 minutes, I want to set the flag variable to 2 and again check the entries within 3 minutes of that time, and this needs to go on for each id. Basically, I want to find the 3-minute groups for each id, so that I can form sets for each id.
The output I want is:
id  time               flag
1   9/6/2016 00:01:00  1
1   9/6/2016 00:01:30  1
1   9/6/2016 00:02:00  1
1   9/6/2016 00:04:30  2
1   9/6/2016 00:05:30  2
1   9/6/2016 01:05:30  2
1   9/6/2016 05:05:30  2
1   9/6/2016 05:06:30  2
2   9/6/2016 01:55:00  1
2   9/6/2016 01:56:29  1
2   9/6/2016 01:57:31  1
2   9/6/2016 03:55:00  2
2   9/6/2016 04:13:00  3
2   9/6/2016 04:15:21  3
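Here, for id 1, the flag is set to 1 and the check for entries within 3 minutes continues until the 3rd row; once an entry is more than 3 minutes later, the flag is set to 2 and the check for entries within 3 minutes starts again. The same goes for id 2.
The following is what I tried: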
select id, time, rank() over (order by time) as rank from table_name;
This one ranks the entire table. I am thinking I can rank within each id, take the first value, subtract it from the remaining values, and write a subquery here.
Is there a better, more efficient way to do this? I am using Hive queries here. Any help would be appreciated.
Please note, your sample output is incorrect; for id = 1, time = 01:05:30 is a full hour after 00:05:30, yet you have the same flag for both.
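Here is a solution using exclusively Oracle SQL. Check the "flag changeover" condition; the way I wrote it, a new flag starts when strictly more than 3 minutes have passed. If you want to start a new count when exactly 3 minutes have passed, change the first inequality to non-strict and the second to strict (time_diff > 0 becomes time_diff >= 0, and time_diff <= 0 becomes time_diff < 0).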
The solution* uses a recursive factored subquery, which requires Oracle 11.2 or above.
* Note: Logically I shouldn't have to subtract 3/(24*60) as I do; instead, the inequalities should compare against 3/(24*60). That works in Oracle 12, but it crashes the database on Oracle 11.2 with ORA-00600, which I am told is the marker of a known bug in the Oracle implementation of recursive queries in 11.2 (which was fixed in 12.1). As a non-paying customer, I don't have access to the bug info myself. I tested the query posted below on Oracle 11.2 and it works, while the simplified version crashes. Both versions worked fine on 12.1.
with
     inputs ( id, time ) as (
       select 1, to_date('9/6/2016 00:01:00', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 1, to_date('9/6/2016 00:01:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 1, to_date('9/6/2016 00:02:00', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 1, to_date('9/6/2016 00:04:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 1, to_date('9/6/2016 00:05:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 1, to_date('9/6/2016 01:05:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 1, to_date('9/6/2016 05:05:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 1, to_date('9/6/2016 05:06:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 2, to_date('9/6/2016 01:55:00', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 2, to_date('9/6/2016 01:56:29', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 2, to_date('9/6/2016 01:57:31', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 2, to_date('9/6/2016 03:55:00', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 2, to_date('9/6/2016 04:13:00', 'mm/dd/yyyy hh24:mi:ss') from dual union all
       select 2, to_date('9/6/2016 04:15:21', 'mm/dd/yyyy hh24:mi:ss') from dual
     ),
     -- time_diff = time since the earliest remaining row of the id, minus 3 minutes
     rec ( id, time, flag, time_diff ) as (
       select id, time, 1,
              time - min(time) over (partition by id order by time) - 3/(24*60)
         from inputs
       union all
       -- rows more than 3 minutes past the current start move on to the next flag
       select id, time, flag + 1,
              time - min(time) over (partition by id order by time) - 3/(24*60)
         from rec
        where time_diff > 0
     )
select id, time, flag
  from rec
 where time_diff <= 0
 order by id, time
;
output:
  ID TIME                      FLAG
---- ------------------- ----------
   1 06/09/2016 00:01:00          1
   1 06/09/2016 00:01:30          1
   1 06/09/2016 00:02:00          1
   1 06/09/2016 00:04:30          2
   1 06/09/2016 00:05:30          2
   1 06/09/2016 01:05:30          3
   1 06/09/2016 05:05:30          4
   1 06/09/2016 05:06:30          4
   2 06/09/2016 01:55:00          1
   2 06/09/2016 01:56:29          1
   2 06/09/2016 01:57:31          1
   2 06/09/2016 03:55:00          2
   2 06/09/2016 04:13:00          3
   2 06/09/2016 04:15:21          3

14 rows selected
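The flags in the output above can be cross-checked outside the database. What follows is only a minimal sketch in Python, not part of the Oracle solution and not a Hive answer; the names assign_flags, parse and SAMPLE are made up for illustration. It walks the rows of each id in time order, remembers the time that started the current group, and starts a new flag once a row is more than 3 minutes past that start. The strict parameter is an assumption added to mirror the strict versus non-strict choice discussed with the query.

```python
from datetime import datetime, timedelta


def assign_flags(rows, window=timedelta(minutes=3), strict=True):
    """Return (id, time, flag) tuples sorted by id and time.

    rows is an iterable of (id, datetime) pairs. A new flag starts when a
    row is more than `window` after the first row of the current group
    (or at least `window` after it when strict=False).
    """
    current = {}  # id -> (start time of current group, current flag)
    result = []
    for id_, t in sorted(rows):
        if id_ not in current:
            current[id_] = (t, 1)
        else:
            start, flag = current[id_]
            elapsed = t - start
            if elapsed > window if strict else elapsed >= window:
                # This row becomes the start of the next group.
                current[id_] = (t, flag + 1)
        result.append((id_, t, current[id_][1]))
    return result


def parse(text):
    # Same layout as the question's data: month/day/year hours:minutes:seconds.
    return datetime.strptime(text, "%m/%d/%Y %H:%M:%S")


SAMPLE = [
    (1, "9/6/2016 00:01:00"), (1, "9/6/2016 00:01:30"),
    (1, "9/6/2016 00:02:00"), (1, "9/6/2016 00:04:30"),
    (1, "9/6/2016 00:05:30"), (1, "9/6/2016 01:05:30"),
    (1, "9/6/2016 05:05:30"), (1, "9/6/2016 05:06:30"),
    (2, "9/6/2016 01:55:00"), (2, "9/6/2016 01:56:29"),
    (2, "9/6/2016 01:57:31"), (2, "9/6/2016 03:55:00"),
    (2, "9/6/2016 04:13:00"), (2, "9/6/2016 04:15:21"),
]

flags = [flag for _, _, flag in assign_flags((i, parse(t)) for i, t in SAMPLE)]
print(flags)  # [1, 1, 1, 2, 2, 3, 4, 4, 1, 1, 1, 2, 3, 3]
```

The printed flags agree with the FLAG column produced by the query, including flags 3 and 4 for id 1 where the question's expected output had 2.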