Actuarial Outpost
 
Go Back   Actuarial Outpost > Actuarial Discussion Forum > Software & Technology
FlashChat Actuarial Discussion Preliminary Exams CAS/SOA Exams Cyberchat Around the World Suggestions


Reply
 
Thread Tools Search this Thread Display Modes
  #1  
Old 05-29-2014, 05:12 PM
Clara Oswald's Avatar
Clara Oswald Clara Oswald is offline
Member
Non-Actuary
 
Join Date: Jan 2013
Posts: 4,756
Default SQL Help

ok, Here is my table and desired result example:

Code:
Table:			
key	code1	code2	code3
1	707	899	923
1	707	452	
1	707	321	
1	707		
2	493	507	679
2	493	507	679
2	493	507	679
3	719	893	
3	719		
4	562	859	
			
Answer:			
key	count		
1	5		
2	3		
3	2		
4	2
So I want for each key the number of unique code-values that appear. The actual table is very large and the actual number of codeX fields is 10. The number of lines per key is arbitrary. If field codeX is blank than all code(X+N) will be blank for that line if that helps.

Any help would be great. Thanks!
__________________
Run, you clever boy, and remember.
Reply With Quote
  #2  
Old 05-29-2014, 05:29 PM
erosewater's Avatar
erosewater erosewater is offline
Member
 
Join Date: Aug 2003
Location: my mom's basement
Studying for your mom
Favorite beer: Schlitz
Posts: 66,685
Default

I thought about this for 10 seconds, so maybe not the most efficient, but 10 nested subqueries, one for each codeX field, that returns the unique values, then combine those and count em up

select key, count(distinct code) as count

from
(
select key, code1 as code
from table
where code1 is not null
group by key, code1
union
select key, code2 as code
from table
where code 2 is not null
group by key, code2
)
group by key
Reply With Quote
  #3  
Old 05-29-2014, 05:51 PM
GreekSquared GreekSquared is offline
Member
 
Join Date: May 2003
Location: Houston
Posts: 154
Default

This is basically the same answer as above. I never use UNPIVOT but that might be an option.

select key, count(1)
from (
select key, code1 from table
union
select key, code2 from table
union
select key, code3 from table
) x
where code1 is not null
group by key
Reply With Quote
  #4  
Old 05-29-2014, 05:51 PM
Clara Oswald's Avatar
Clara Oswald Clara Oswald is offline
Member
Non-Actuary
 
Join Date: Jan 2013
Posts: 4,756
Default

Yeah, I think that would work. Creates an implicit table that is some 5 Billion rows which I was trying to avoid but that may be the only way. Thanks!
__________________
Run, you clever boy, and remember.
Reply With Quote
  #5  
Old 05-29-2014, 06:20 PM
GreekSquared GreekSquared is offline
Member
 
Join Date: May 2003
Location: Houston
Posts: 154
Default

Just used UNPIVOT for first time and this is straight from MS help. Might be faster than union

Code:
CREATE TABLE #pvt ([Key] int, code1 int, code2 int,
    code3 int, code4 int, code5 int);
GO
INSERT INTO #pvt VALUES (1,4,3,5,NULL,NULL);
INSERT INTO #pvt VALUES (1,4,1,NULL, NULL, NULL);
INSERT INTO #pvt VALUES (2,4,NULL, NULL, NULL, NULL);
INSERT INTO #pvt VALUES (2,4,2,5,5,4);
INSERT INTO #pvt VALUES (3,5,1,NULL, NULL, NULL);



SELECT distinct [KEY], CODE
from
(SELECT * 
   FROM #pvt) p
UNPIVOT
   (code FOR codes IN 
      (code1, code2, code3, code4, code5)) x
Reply With Quote
  #6  
Old 05-29-2014, 06:56 PM
Scherzo Scherzo is offline
Member
SOA
 
Join Date: Oct 2011
Posts: 12,969
Default

Quote:
Originally Posted by Clara Oswald View Post
Yeah, I think that would work. Creates an implicit table that is some 5 Billion rows which I was trying to avoid but that may be the only way. Thanks!
I was thinking of making a new table with one code column. or simply append to existing table the values in columns 2 and 3 as entries in column 1. And then you can query/count the unique entries of the longer column 1. Maybe not elegant, but it will give the answer.
Reply With Quote
  #7  
Old 05-29-2014, 10:35 PM
JMO's Avatar
JMO JMO is offline
Carol Marler
Non-Actuary
 
Join Date: Sep 2001
Location: Back home again in Indiana
Studying for Nothing actuarial.
Posts: 37,606
Default

It's more difficult to write good queries when the data is unnormalized.

I think a variation of Scherzo's approach looks good. Make a table of all the unique entries from col. 1 & col 2. Append a similar query on col. 1 & 3, and again on col 1 &4.
Then it's obvious to do a query on the combined table.
__________________
Carol Marler, "Just My Opinion"

Pluto is no longer a planet and I am no longer an actuary. Please take my opinions as non-actuarial.


My latest favorite quotes, updated Apr 5, 2018.

Spoiler:
I should keep these four permanently.
Quote:
Originally Posted by rekrap View Post
JMO is right
Quote:
Originally Posted by campbell View Post
I agree with JMO.
Quote:
Originally Posted by Westley View Post
And def agree w/ JMO.
Quote:
Originally Posted by MG View Post
This. And everything else JMO wrote.
And this all purpose permanent quote:
Quote:
Originally Posted by Dr T Non-Fan View Post
Yup, it is always someone else's fault.
MORE:
All purpose response for careers forum:
Quote:
Originally Posted by DoctorNo View Post
Depends upon the employer and the situation.
Quote:
Originally Posted by Sredni Vashtar View Post
I feel like ERM is 90% buzzwords, and that the underlying agenda is to make sure at least one of your Corporate Officers is not dumb.
Reply With Quote
  #8  
Old 05-30-2014, 10:54 AM
Abused Student's Avatar
Abused Student Abused Student is offline
Member
SOA
 
Join Date: Feb 2007
Location: The Eighth Circle of Hell
Favorite beer: Cold and lots of it
Posts: 42,202
Default

Would this be easier to do through a couple nested queries?

For example, and just typing it here so not proofing it but
select key, count(distinct table_Value) from
(Select key, table_value from
(select key, code1 as table_value, from table group by key)
[do the rest of the columns the same way]
)


Going to take some manipulation but I would think that would be better. Group each column down, combine them all, count distinct values.
__________________
GAME ON!!!!!!! Let your ness show. Join the D&D fun. Started but applications still accepted


Officially assigned the role of Dictator, 9/30/09. Bow to my whims.
Reply With Quote
  #9  
Old 05-30-2014, 12:13 PM
Clara Oswald's Avatar
Clara Oswald Clara Oswald is offline
Member
Non-Actuary
 
Join Date: Jan 2013
Posts: 4,756
Default

Quote:
Originally Posted by GreekSquared View Post
Just used UNPIVOT for first time and this is straight from MS help. Might be faster than union

Code:
CREATE TABLE #pvt ([Key] int, code1 int, code2 int,
    code3 int, code4 int, code5 int);
GO
INSERT INTO #pvt VALUES (1,4,3,5,NULL,NULL);
INSERT INTO #pvt VALUES (1,4,1,NULL, NULL, NULL);
INSERT INTO #pvt VALUES (2,4,NULL, NULL, NULL, NULL);
INSERT INTO #pvt VALUES (2,4,2,5,5,4);
INSERT INTO #pvt VALUES (3,5,1,NULL, NULL, NULL);



SELECT distinct [KEY], CODE
from
(SELECT * 
   FROM #pvt) p
UNPIVOT
   (code FOR codes IN 
      (code1, code2, code3, code4, code5)) x
This is cool. I didn't know about this method. We are converting some code from SAS to pure SQL and SAS doesn't have this method so I can't test it.

I'm using erose's solution (everyone else's seem to be variants of his, and I thank you all for the ideas) but I will likely revisit this once we get full access to the new platform.
__________________
Run, you clever boy, and remember.
Reply With Quote
  #10  
Old 05-30-2014, 12:54 PM
Steve Grondin Steve Grondin is offline
Member
SOA AAA
 
Join Date: Nov 2001
Posts: 6,415
Default

One of our SQL proficient actuaries gave a summary of what SQL actually does versus what you tell it. I forget the compsci term for the type of language that doesn't do exactly what you tell it. I can't find the summary, so my suggestion might not actually improve erose's.

Put "distinct" into each of the subqueries and skip the group by.

These might be equivalent instructions.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off


All times are GMT -4. The time now is 10:59 AM.


Powered by vBulletin®
Copyright ©2000 - 2018, Jelsoft Enterprises Ltd.
*PLEASE NOTE: Posts are not checked for accuracy, and do not
represent the views of the Actuarial Outpost or its sponsors.
Page generated in 0.24469 seconds with 9 queries