Here we describe a way to do string concatenation on a data column.
Say we have a table like the following:
table name: t1
ID | Value |
17 | v1 |
17 | v2 |
17 | v3 |
28 | v4 |
28 | v5 |
Sometimes we may want our output be like this instead:
ID | Value |
17 | v1,v2,v3 |
28 | v4,v5 |
We can use string concatenation technique to achieve the effect.
First we create a temporary storage table called #tmp, with columns tmp_ID, tmp_value, tmp_list. Then we insert data from t1 into #tmp, with the tmp_list column set to NULL, ordered by ID column. The order by criteria is crutial here, for all the data has to be in order for this algorithm to work. Usually we pick the column that serves as the identity column to be in the order by list. And the reason will be elaborated latter.
create table #tmp(tmp_ID varchar(2), tmp_value varchar(10), tmp_list varchar(100))
insert into #tmp select ID, Value, NULL from t1 order by ID
Now our table looks like the following:
table name: #tmp
tmp_ID | tmp_value | tmp_list |
17 | v1 | NULL |
17 | v2 | NULL |
17 | v3 | NULL |
28 | v4 | NULL |
28 | v5 | NULL |
Then we do the following:
declare @list varchar(100)
declare @last varchar(10)
select @list = ''
select @last = ''
update #tmp
set @list = tmp_list = (CASE WHEN @last <> tmp_ID THEN tmp_value ELSE @list + ',' + tmp_value END), @last = tmp_value
In doing this, we set the values in tmp_list column to be the strings concatenated by all the string in tmp_value column in all the preceding rows. Note that whenever the tmp_ID column changes it value, we start the concatenation process all over again. Hence our table looks like the following now:
tmp_ID | tmp_value | tmp_list |
17 | v1 | v1 |
17 | v2 | v1,v2 |
17 | v3 | v1,v2,v3 |
28 | v4 | v4 |
28 | v5 | v4,v5 |
Now all we have to do is to select the values:
select tmp_ID, max(tmp_list) as tmp_list from #tmp group by tmp_ID
And we'll get the following result:
tmp_ID | tmp_list |
17 | v1,v2,v3 |
28 | v4,v5 |
As a side note, there is another way to do the string concatenation on a column.
declare @list varchar(100)
declare @list2 varchar(100)
select @list = ''
select @list2 = ''
select @list = @list+ Value + ',' from t1 where ID = '17'
select @list2 = @list2+ Value + ',' from t1 where ID = '28'
This will create the same results, too. But there is an obvious drawback. Imagine if you have many different ID values in t1, you have to hardcode it many times or put it in a loop, which will make the solution less concise and less desirable. Performance wise, these two are pretty much the same. Space wise, the first solution definitely requires more space.
沒有留言:
張貼留言