We all know how the GROUP BY clause works and how to use it. Group function (SUM, AVG, MIN, MAX, COUNT, …) are often used and well known too.
There’s a specific “grouping problem” that we often have to manage and that doesn’t find an answer in those functions: Using the EMP table (ENAME is the employee name, DEPTNO is the department number) we need to extract a record for each DEPTNO associated to the list of the names of the employees that work in that department.
Here are all the employees:
SQL> select deptno, ename 2 from emp 3* order by deptno DEPTNO ENAME ---------- ---------- 10 CLARK 10 KING 10 MILLER 20 JONES 20 FORD 20 ADAMS 20 SMITH 20 SCOTT 30 WARD 30 TURNER 30 ALLEN 30 JAMES 30 BLAKE 30 MARTIN
And we want to get something similar to the following:
DEPTNO ENAMES ---------- ---------- 10 CLARK,KING,MILLER 20 JONES,FORD,ADAMS,SMITH,SCOTT 30 WARD,TURNER,ALLEN,JAMES,BLAKE,MARTIN
The most natural approach is probably a PL/SQL function:
create or replace function getNames(p_deptno in number) return varchar2 is retval varchar2(32000); begin for r in (select ename from emp where deptno = p_deptno) loop retval := retval||r.ename||','; end loop; return rtrim(retval,','); end; / SQL> select deptno,getNames(deptno) enames 2 from emp 3* group by deptno DEPTNO ENAMES ---------- ------------------------------------------- 30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES 20 SMITH,JONES,SCOTT,ADAMS,FORD 10 CLARK,KING,MILLER
The second approach I want to show is based on an undocumented group function: WM_CONCAT.
This function is defined in the WMSYS schema. That schema contains system object used by the “Workspace Manager” feature (more info here.
It’s an undocumented function, used by Oracle for internal purposes only. Hence you shouldn’t use it in production environments.
But it exists and works perfectly:
SQL> select deptno, wm_concat(ename) enames 2 from emp 3 group by deptno; DEPTNO ENAMES ---------- ------------------------------------------- 10 CLARK,KING,MILLER 20 SMITH,FORD,ADAMS,SCOTT,JONES 30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD
The third approach I’m going to show uses the SYS_CONNECT_BY_PATH function.
This function can be used in hierarchical queries only (CONNECT BY) takes as input a column and a character and gives as output the full path of each element starting from the root of the hierarchy, using the given character as a separator.
An example will clarify:
SELECT deptno, ltrim(SYS_CONNECT_BY_PATH(ename, ','),',') enames FROM (select deptno, ename, rank() over(partition by deptno order by rownum) num from emp) where connect_by_isleaf=1 START WITH num=1 CONNECT BY PRIOR num+1 = num and prior deptno=deptno; DEPTNO ENAMES ---------- --------------------------------------------- 10 CLARK,KING,MILLER 20 SMITH,JONES,SCOTT,ADAMS,FORD 30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
Not clear yet? Ok, let’s see step by step:
First of all I started from a query that shows the list of the employees sorted using a number (num) incremented at each row and reset when the department changes.
SQL> select deptno, ename, rank() over(partition by deptno order by rownum) num 2 from emp 3 ; DEPTNO ENAME NUM ---------- ---------- ---------- 10 CLARK 1 10 KING 2 10 MILLER 3 20 SMITH 1 20 JONES 2 20 SCOTT 3 20 ADAMS 4 20 FORD 5 30 ALLEN 1 30 WARD 2 30 MARTIN 3 30 BLAKE 4 30 TURNER 5 30 JAMES 6
To get this result I used analytic functions, available in Oracle since Oracle9i.
Second step: I introduced the hierarchical query: who has no experience with the CONNECT BY clause can see the docs.
I connect each ename to the other having the same deptno and the next num using the following clause
CONNECT BY PRIOR num+1 = num and prior deptno=deptno
I drop incomplete paths using the following
WHERE connect_by_isleaf=1
that allows to select only the leafs of the tree.
Finally I get only the paths that start from the first ename
START WITH num=1
Try to use this approach yourself, you will understand its power playing with it.
Forth approach: an XML query.
SQL> select deptno, rtrim( 2 xmlagg( 3 xmlelement("x",ename||',') 4 ).extract('//text()'),',') enames 5 from emp 6 group by deptno; DEPTNO ENAMES ---------- -------------------------------------------------- 10 CLARK,KING,MILLER 20 SMITH,FORD,ADAMS,SCOTT,JONES 30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD
For each department I get, using the XMLAGG function, an XML document that aggregate all the employees that work in that department. Then I extract only the element’s content using the EXTRACT function.
Fifth approach: the LISTAGG function.
It’s a new function available in Oracle11g release 2.
Here is documentation.
SQL >SELECT deptno, LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename) enames 2 FROM emp 3 group by deptno; DEPTNO ENAMES ---------- -------------------------------------------------- 10 CLARK,KING,MILLER 20 SMITH,JONES,SCOTT,ADAMS,FORD 30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
Works like a charm!!
Massimo
Tags: LISTAGG, Oracle, PL/SQL, SQL, string concatenation, WM_CONCAT, XMLAGG
10/17/2012 at 21:46 |
Reblogged this on Sutoprise Avenue, A SutoCom Source.
10/19/2012 at 14:19 |
[…] The first one has been already showed in another post: About string concatenation across different records. […]