About string concatenation across different records

We all know how the GROUP BY clause works and how to use it. Group function (SUM, AVG, MIN, MAX, COUNT, …) are often used and well known too.
There’s a specific “grouping problem” that we often have to manage and that doesn’t find an answer in those functions: Using the EMP table (ENAME is the employee name, DEPTNO is the department number) we need to extract a record for each DEPTNO associated to the list of the names of the employees that work in that department.
Here are all the employees:

SQL> select deptno, ename
  2  from emp
  3* order by deptno

    DEPTNO ENAME
---------- ----------
        10 CLARK
        10 KING
        10 MILLER
        20 JONES
        20 FORD
        20 ADAMS
        20 SMITH
        20 SCOTT
        30 WARD
        30 TURNER
        30 ALLEN
        30 JAMES
        30 BLAKE
        30 MARTIN

And we want to get something similar to the following:

    DEPTNO ENAMES
---------- ----------
        10 CLARK,KING,MILLER
        20 JONES,FORD,ADAMS,SMITH,SCOTT
        30 WARD,TURNER,ALLEN,JAMES,BLAKE,MARTIN

The most natural approach is probably a PL/SQL function:

create or replace function getNames(p_deptno in number) return varchar2 is
retval varchar2(32000);
begin
  for r in (select ename from emp where deptno = p_deptno) loop
    retval := retval||r.ename||',';
  end loop;
  return rtrim(retval,',');
end;
/

SQL> select deptno,getNames(deptno) enames
  2  from emp
  3* group by deptno

    DEPTNO ENAMES
---------- -------------------------------------------
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        10 CLARK,KING,MILLER

The second approach I want to show is based on an undocumented group function: WM_CONCAT.
This function is defined in the WMSYS schema. That schema contains system object used by the “Workspace Manager” feature (more info here.
It’s an undocumented function, used by Oracle for internal purposes only. Hence you shouldn’t use it in production environments.
But it exists and works perfectly:

SQL> select deptno, wm_concat(ename) enames
  2  from emp
  3  group by deptno;

    DEPTNO ENAMES
---------- -------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,FORD,ADAMS,SCOTT,JONES
        30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD

The third approach I’m going to show uses the SYS_CONNECT_BY_PATH function.
This function can be used in hierarchical queries only (CONNECT BY) takes as input a column and a character and gives as output the full path of each element starting from the root of the hierarchy, using the given character as a separator.
An example will clarify:

 SELECT deptno, ltrim(SYS_CONNECT_BY_PATH(ename, ','),',') enames
  FROM (select deptno, ename, rank() over(partition by deptno order by rownum) num from emp)
where connect_by_isleaf=1
START WITH num=1
CONNECT BY PRIOR num+1 = num and prior deptno=deptno;

    DEPTNO ENAMES
---------- ---------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

Not clear yet? Ok, let’s see step by step:
First of all I started from a query that shows the list of the employees sorted using a number (num) incremented at each row and reset when the department changes.

SQL> select deptno, ename, rank() over(partition by deptno order by rownum) num
  2         from emp
  3  ;

    DEPTNO ENAME             NUM 
---------- ---------- ---------- 
        10 CLARK               1 
        10 KING                2 
        10 MILLER              3 
        20 SMITH               1 
        20 JONES               2 
        20 SCOTT               3 
        20 ADAMS               4 
        20 FORD                5 
        30 ALLEN               1 
        30 WARD                2 
        30 MARTIN              3 
        30 BLAKE               4 
        30 TURNER              5 
        30 JAMES               6 

To get this result I used analytic functions, available in Oracle since Oracle9i.

Second step: I introduced the hierarchical query: who has no experience with the CONNECT BY clause can see the docs.

I connect each ename to the other having the same deptno and the next num using the following clause

CONNECT BY PRIOR num+1 = num and prior deptno=deptno

I drop incomplete paths using the following

WHERE connect_by_isleaf=1

that allows to select only the leafs of the tree.

Finally I get only the paths that start from the first ename


START WITH num=1

Try to use this approach yourself, you will understand its power playing with it.

Forth approach: an XML query.

SQL> select deptno, rtrim(
  2                   xmlagg(
  3                     xmlelement("x",ename||',')
  4                          ).extract('//text()'),',') enames
  5    from emp
  6  group by deptno;

    DEPTNO ENAMES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,FORD,ADAMS,SCOTT,JONES
        30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD

For each department I get, using the XMLAGG function, an XML document that aggregate all the employees that work in that department. Then I extract only the element’s content using the EXTRACT function.

Fifth approach: the LISTAGG function.
It’s a new function available in Oracle11g release 2.
Here is documentation.


SQL >SELECT deptno, LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename) enames
  2    FROM emp
  3  group by deptno;

    DEPTNO ENAMES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

Works like a charm!!

Massimo

Advertisements

Tags: , , , , , ,

2 Responses to “About string concatenation across different records”

  1. SutoCom Says:

    Reblogged this on Sutoprise Avenue, A SutoCom Source.

  2. Hierarchical queries in Oracle – The CONNECT BY clause « Welcome To Oracle database by Massimo Ruocchio Says:

    […] The first one has been already showed in another post: About string concatenation across different records. […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s


%d bloggers like this: