oracle - Finding the group of 3 minutes for each ID in Hive SQL


I have data such as this:

    id  time
    1   9/6/2016 00:01:00
    1   9/6/2016 00:01:30
    1   9/6/2016 00:02:00
    1   9/6/2016 00:04:30
    1   9/6/2016 00:05:30
    1   9/6/2016 01:05:30
    1   9/6/2016 05:05:30
    1   9/6/2016 05:06:30
    2   9/6/2016 01:55:00
    2   9/6/2016 01:56:29
    2   9/6/2016 01:57:31
    2   9/6/2016 03:55:00
    2   9/6/2016 04:13:00
    2   9/6/2016 04:15:21

For each id, I want to set a new variable called flag to 1 and look at the first value of time. Starting from that first value, I want to find the entries that fall within 3 minutes of the first entry and set flag to 1 for all of them. Once the time entries go above 3 minutes, I want to set the flag variable to 2 and again check for entries within 3 minutes of that time, and this needs to go on for each id. I want to find the 3-minute groups for each id, so that I can form sets for each id.

The output I want is:

    id  time                flag
    1   9/6/2016 00:01:00   1
    1   9/6/2016 00:01:30   1
    1   9/6/2016 00:02:00   1
    1   9/6/2016 00:04:30   2
    1   9/6/2016 00:05:30   2
    1   9/6/2016 01:05:30   2
    1   9/6/2016 05:05:30   2
    1   9/6/2016 05:06:30   2
    2   9/6/2016 01:55:00   1
    2   9/6/2016 01:56:29   1
    2   9/6/2016 01:57:31   1
    2   9/6/2016 03:55:00   2
    2   9/6/2016 04:13:00   3
    2   9/6/2016 04:15:21   3

Here, for id 1, flag is set to 1 and it keeps checking for entries within 3 minutes until the 3rd row; once the time goes above 3 minutes, flag is set to 2 and it starts checking for 3-minute entries again. The same goes for id 2 as well.

The following is what I tried:

    select id, time, rank() over (order by time) as rank from table_name;

This one ranks the entire table. I am thinking that I can rank within each id, take the first value, subtract it from the remaining values, and write a subquery here.

Is there a better, more efficient way to do this? I am using Hive queries here. Any help is appreciated.

Please note, your sample output is incorrect; for id = 1, time = 01:05:30 is a full hour after 00:05:30, yet you have the same flag for both.

Here is a solution using exclusively Oracle SQL. Check the "flag changeover" condition; the way I wrote it, a new flag starts when strictly more than 3 minutes have passed. If you want to start a new count when exactly 3 minutes have passed, change the first inequality to non-strict and the second to strict.
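To make the changeover rule concrete outside the database, here is a minimal Python sketch of the same grouping logic. It is not the Oracle solution itself, only an illustration of the rule; the names `assign_flags`, `SAMPLE`, and the `strict` parameter are hypothetical and do not come from the original post.

```python
from datetime import datetime, timedelta


def assign_flags(rows, window=timedelta(minutes=3), strict=True):
    """Return (id, time, flag) tuples sorted by id and time.

    A group is anchored at its first row. A later row opens a new group
    when it is more than `window` after the anchor (strict=True), or at
    least `window` after the anchor (strict=False).
    """
    anchor = {}  # id -> time of the first row in the current group
    flag = {}    # id -> flag of the current group
    result = []
    for id_, t in sorted(rows):
        if id_ not in anchor:
            anchor[id_], flag[id_] = t, 1
        else:
            elapsed = t - anchor[id_]
            starts_new = elapsed > window if strict else elapsed >= window
            if starts_new:
                anchor[id_], flag[id_] = t, flag[id_] + 1
        result.append((id_, t, flag[id_]))
    return result


def _t(clock):
    # Helper for the sample data: every row is on 9/6/2016 (mm/dd/yyyy).
    return datetime.strptime("9/6/2016 " + clock, "%m/%d/%Y %H:%M:%S")


SAMPLE = [
    (1, _t("00:01:00")), (1, _t("00:01:30")), (1, _t("00:02:00")),
    (1, _t("00:04:30")), (1, _t("00:05:30")), (1, _t("01:05:30")),
    (1, _t("05:05:30")), (1, _t("05:06:30")),
    (2, _t("01:55:00")), (2, _t("01:56:29")), (2, _t("01:57:31")),
    (2, _t("03:55:00")), (2, _t("04:13:00")), (2, _t("04:15:21")),
]

flags = assign_flags(SAMPLE)
for row in flags:
    print(row)
```

With `strict=True`, a row exactly 3 minutes after the anchor stays in the current group; with `strict=False`, it opens a new one. That is the difference between the two choices of inequalities mentioned above.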

The solution* uses a recursive factored subquery, which requires Oracle 11.2 or above.

* Note: Logically I shouldn't have to subtract 3/(24*60) the way I do; instead, the inequalities should compare against 3/(24*60). That works in Oracle 12, but it crashes the database on Oracle 11.2 with ORA-00600, which I am told is a marker of a known bug in the Oracle implementation of recursive queries in 11.2 (and which was fixed in 12.1). As a non-paying customer, I don't have access to the bug info myself. I tested the query posted below on Oracle 11.2 and it works, while the simplified version crashes. Both versions worked fine on 12.1.
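For readers unfamiliar with the constant: subtracting two Oracle DATE values gives a difference in days, so 3/(24*60) is 3 minutes expressed as a fraction of a day. The short Python sketch below only illustrates that arithmetic and the equivalence of the two ways of writing the comparison; the names `diff_in_days`, `posted_form`, and `simple_form` are made up for the example.

```python
from datetime import datetime

# 3 minutes as a fraction of a day, mirroring the 3/(24*60) in the query.
THREE_MINUTES_IN_DAYS = 3 / (24 * 60)


def diff_in_days(later, earlier):
    """Difference between two timestamps, in days."""
    return (later - earlier).total_seconds() / 86400


first = datetime(2016, 9, 6, 0, 1, 0)
row = datetime(2016, 9, 6, 0, 4, 30)  # 3 min 30 s after `first`
d = diff_in_days(row, first)

# Form used in the posted query: subtract the window, then compare with 0.
posted_form = d - THREE_MINUTES_IN_DAYS > 0
# Simpler form mentioned in the note: compare directly against the window.
simple_form = d > THREE_MINUTES_IN_DAYS
```

Both expressions give the same answer for any pair of timestamps, so the subtraction only changes how the condition is written, not which rows start a new flag.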

    with
         inputs ( id, time ) as (
           select 1, to_date('9/6/2016 00:01:00', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 1, to_date('9/6/2016 00:01:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 1, to_date('9/6/2016 00:02:00', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 1, to_date('9/6/2016 00:04:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 1, to_date('9/6/2016 00:05:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 1, to_date('9/6/2016 01:05:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 1, to_date('9/6/2016 05:05:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 1, to_date('9/6/2016 05:06:30', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 2, to_date('9/6/2016 01:55:00', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 2, to_date('9/6/2016 01:56:29', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 2, to_date('9/6/2016 01:57:31', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 2, to_date('9/6/2016 03:55:00', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 2, to_date('9/6/2016 04:13:00', 'mm/dd/yyyy hh24:mi:ss') from dual union all
           select 2, to_date('9/6/2016 04:15:21', 'mm/dd/yyyy hh24:mi:ss') from dual
         ),
         -- Each pass keeps the rows within 3 minutes of the earliest remaining
         -- time per id (time_diff <= 0) and sends the rest on with flag + 1.
         rec ( id, time, flag, time_diff ) as (
           select  id, time, 1,
                   time - min(time) over (partition by id order by time) - 3/(24*60)
             from  inputs
           union all
           select  id, time, flag + 1,
                   time - min(time) over (partition by id order by time) - 3/(24*60)
             from  rec
             where time_diff > 0
         )
    select   id, time, flag
      from   rec
      where  time_diff <= 0
    order by id, time
    ;

Output:

      id time                      flag
    ---- ------------------- ----------
       1 06/09/2016 00:01:00          1
       1 06/09/2016 00:01:30          1
       1 06/09/2016 00:02:00          1
       1 06/09/2016 00:04:30          2
       1 06/09/2016 00:05:30          2
       1 06/09/2016 01:05:30          3
       1 06/09/2016 05:05:30          4
       1 06/09/2016 05:06:30          4
       2 06/09/2016 01:55:00          1
       2 06/09/2016 01:56:29          1
       2 06/09/2016 01:57:31          1
       2 06/09/2016 03:55:00          2
       2 06/09/2016 04:13:00          3
       2 06/09/2016 04:15:21          3

    14 rows selected
