Tuesday, November 10, 2009

ORACLE Analytics functions

In one of the projects i have worked on recently, we had a requirement to group the user information based on the latest updated user and display the results. The usage of native SQL like ORDER BY & GROUP BY was cumbersome and leads to a poor query performance.
Upon research we discovered that the ORACLE Analytic functions can do whatever the native SQL function does with a better performance.

The requirement was to filter the unique user id, user number from the user profile table with the latest user entry and the result should be group by User Number. For Ex; the query should return the Mohan4 & Mohan 3.

UserIdUser_NumberLatest_EntryOptions
Mohan11Nov-06-2009Y
Mohan21Nov-07-2009N
Mohan32Nov-09-2009Y
Mohan41Nov-10-2009N


The result can be obtained by GROUP BY Clause, but the problem is we can only retrieve GROUP BY Columns. The non-group by columns will not be retrieved in the GROUP BY clause.

We utilized the ORACLE Analytics functions to overcome these difficulties with performance, Group by issue etc, So our query looked like,

(SELECT USER_NUMBER, USER_ID, LATEST_ENTRY, OPTIONS
FROM
(SELECT USRINF.*, ROW_NUMBER() OVER (PARTITION BY USER_NUMBER ORDER BY LATEST_ENTRY DESC) RANK
From DB_SCHEMA.USERINFO USRINF WHERE LATEST_ENTRY IS NOT NULL
ORDER BY LATEST_ENTRY DESC
)
WHERE
1=1
AND RANK=1
);

In the above query, the clause “PARTITION BY” groups the results based on the column name indicated next to it. In this case, it is acted on the column USER_NUMBER. The subquery in the above query
SELECT USRINF.*, ROW_NUMBER() OVER (PARTITION BY USER_NUMBER ORDER BY LATEST_ENTRY DESC) RANK
From DB_SCHEMA.USERINFO USRINF WHERE LATEST_ENTRY IS NOT NULL
ORDER BY LATEST_ENTRY DESC

Will display the result as


UserIdUser_NumberLatest_EntryOptionsROW_NUMBER()
Mohan11Nov-06-2009Y3
Mohan21Nov-07-2009N3
Mohan32Nov-09-2009Y1
Mohan41Nov-10-2009N3


The outer query filters the unique first record from the result set described above ordered by descending of the Latest_Entry column. So the result set of the final query will be


UserIdUser_NumberLatest_EntryOptionsROW_NUMBER()
Mohan32Nov-09-2009Y1
Mohan41Nov-10-2009N3

No comments: