OraFAQ Forum: SQL & PL/SQL » percentages of data population

Home » SQL & PL/SQL » SQL & PL/SQL » percentages of data population (Oracle 11.2.0.3)

Show: Today's Messages :: Polls :: Message Navigator
E-mail to friend

percentages of data population [message #652027]

Tue, 31 May 2016 07:50

pointers
Messages: 451
Registered: May 2008

Senior Member

Hi,

I have two tables 1. master 2. lookup

CREATE TABLE SCOTT.MASTER
(
  ID      NUMBER,
  NAME    VARCHAR2(20 BYTE),
  LOC_ID  VARCHAR2(20 BYTE),
  REC_ID  NUMBER
)
/

ID and NAME are key columns in the master table i.e. id and name forms a unique record and to get rid of duplicates.

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(1, 's', '1000', 100);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(2, NULL, '2000', NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(NULL, NULL, NULL, NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(NULL, 'i', NULL, NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(NULL, 'k', NULL, 200);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(6, NULL, NULL, NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(7, NULL, '4000', 400);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(8, NULL, NULL, NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(1, 's', '1000', 100);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(9, 'z', NULL, NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(9, 'z', NULL, NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(9, 'z', NULL, NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(9, 'z', NULL, NULL);

COMMIT;

  
CREATE TABLE SCOTT.LOOKUP
(
  RECORDID     NUMBER,
  COLUMN_NAME  VARCHAR2(20 BYTE)
)
/

Insert into SCOTT.LOOKUP(RECORDID, COLUMN_NAME)
 Values(100, 'id');

Insert into SCOTT.LOOKUP(RECORDID, COLUMN_NAME)
 Values(100, 'name');

Insert into SCOTT.LOOKUP(RECORDID, COLUMN_NAME)
 Values(200, 'name');

Insert into SCOTT.LOOKUP(RECORDID, COLUMN_NAME)
 Values(400, 'id');

Below query gives the unique records of master table -

SELECT m.*
  FROM (SELECT   ID, NAME, MAX (ROWID) rid
            FROM MASTER
        GROUP BY ID, NAME) i,
       MASTER m
 WHERE i.rid = m.ROWID

Now I would like to write a query to pull pecentage for each column of master table where the percentage is calculated
(number of not null values for a column) / (total number of unique rows)

ID column got total 9 records (after above query using rowid) and it got 6 rows populated and 3 rows are null
so the percentage for id is 6/9*100

I could achieve this by using -

SELECT count(m.id)/count(*) "%id", count(m.name)/count(*) "%name",count(m.loc_id)/count(*) "%loc_id"
  FROM (SELECT   ID, NAME, MAX (ROWID) rid
            FROM MASTER
        GROUP BY ID, NAME) i,
       MASTER m
 WHERE i.rid = m.ROWID

but now, I would like calculate the percentages after the below change -
I would like to consider the "master" table column value as NULL for a record if the rec_id of master table is mapped to any value in the "lookup" table.
and lookup table contains the column for which it need to be considered as NULL.
for e.g. rec_id=100 of master table is mapped to 2 columns in the lookup table i.e. id and name so id and name for this record need to be considered as null.
where as rec_id=200 of master table is mapped to only one column i.e. name. so name need to be considered as null.

Thank you in advance.

Regards,
Pointers

Report message to a moderator

Re: percentages of data population [message #652052 is a reply to message #652027]

Tue, 31 May 2016 09:39

Michel Cadot
Messages: 68693
Registered: March 2007
Location: Nanterre, France, http://...

Senior Member
Account Moderator

Quote:

ID and NAME are key columns in the master table i.e. id and name forms a unique record and to get rid of duplicates.

So how this is possible:

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(9, 'z', NULL, NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(9, 'z', NULL, NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(9, 'z', NULL, NULL);

Insert into SCOTT.MASTER(ID, NAME, LOC_ID, REC_ID)
 Values(9, 'z', NULL, NULL);

1/ Do NOT post your statement with a schema, tablespace or other specific stuff like that, why should be put garbage in my SCOTT schema?
2/ Create your tables with all the constraints, this will prevent your from posting wring stuff
3/ What does MAX(ROWID) (and ROWID itself) has to do with the question?
4/ What is the purpose of LOC_ID and REC_ID in the problem? They are mentioned nowhere.

[Updated on: Tue, 31 May 2016 09:40]

Report message to a moderator

Re: percentages of data population [message #652053 is a reply to message #652052]

Tue, 31 May 2016 10:29

pointers
Messages: 451
Registered: May 2008

Senior Member

- Sorry for the schema names.
- Its a warehouse environment. There are no constraints but as per the business ID and NAME forms a unique record.
-loc_id is just another column for which we calculate percentage.
- ROWID is used to pick the latest record incase of duplicate records w.r.t ID and NAME.
- REC_ID is a column which is used to join lookup table (recordid column)

Please let me know your suggestion.

Thank you.

Regards,
Pointers

Report message to a moderator

Re: percentages of data population [message #652054 is a reply to message #652053]

Tue, 31 May 2016 10:42

cookiemonster
Messages: 13943
Registered: September 2008
Location: Rainy Manchester

Senior Member

You can't use rowid to pick the latest record. Indeed nothing can be read into its order.

Report message to a moderator

Re: percentages of data population [message #652055 is a reply to message #652053]

Tue, 31 May 2016 10:54

Michel Cadot
Messages: 68693
Registered: March 2007
Location: Nanterre, France, http://...

Senior Member
Account Moderator

Quote:

There are no constraints but as per the business ID and NAME forms a unique record

This is wrong in your test case as I showed it. So post a valid test case.

Quote:

Please let me know your suggestion.

Just build the row source using both tables then count in an outer query (it is not mandatory but it is clearer code and can't degrade the performances).

Report message to a moderator

Re: percentages of data population [message #652057 is a reply to message #652053]

Tue, 31 May 2016 11:41

John Watson
Messages: 8951
Registered: January 2010
Location: Global Village

Senior Member

This is off topic for your question, but this statement
Quote:

- Its a warehouse environment. There are no constraints but as per the business ID and NAME forms a unique record

is awful. If you do not define constraints, you are crippling the optimizer. Constraints are not just about data integrity, you know: they give the optimizer the information it needs to develop good execution plans. In a warehouse, this is even more important than in TP systems.

Report message to a moderator

Re: percentages of data population [message #652059 is a reply to message #652057]

Tue, 31 May 2016 12:01

pointers
Messages: 451
Registered: May 2008

Senior Member

Yes I agree. But that is how the system is defined.
@micheal: I have no clue, can you pls help me.
Thank you in advance.

Regards,
Pointers

Report message to a moderator

Re: percentages of data population [message #652060 is a reply to message #652059]

Tue, 31 May 2016 12:04

Michel Cadot
Messages: 68693
Registered: March 2007
Location: Nanterre, France, http://...

Senior Member
Account Moderator

Quote:

So post a valid test case.

Report message to a moderator

Re: percentages of data population [message #652061 is a reply to message #652059]

Tue, 31 May 2016 12:23

Michel Cadot
Messages: 68693
Registered: March 2007
Location: Nanterre, France, http://...

Senior Member
Account Moderator

Ignoring your "key", assuming there is none, and ignoring for a first step the lookup table, the percentages query is:

SQL> select round(100*count(distinct id)/count(*),2) "%id",
  2         round(100*count(distinct name)/count(*),2) "%name",
  3         round(100*count(distinct loc_id)/count(*),2) "%loc_id"
  4  from master
  5  /
       %id      %name    %loc_id
---------- ---------- ----------
     46.15      30.77      23.08

Now your lookup table is just a way to dynamically modify the master table, so replace in the query the master table by an inline view reflecting these changes).
First you have to pivot the lookup table:

SQL> select recordid, id, name, loc_id
  2  from lookup
  3  pivot (max(column_name)
  4         for column_name in ('id' "ID", 'name' "NAME", 'loc_id' "LOC_ID"))
  5  /
  RECORDID ID                   NAME                 LOC_ID
---------- -------------------- -------------------- --------------------
       100 id                   name
       400 id
       200                      name

Then join this to master table get the new master table:

SQL> select nvl2(l.id,     null, m.id) id,
  2         nvl2(l.name,   null, m.name) name,
  3         nvl2(l.loc_id, null, m.loc_id) loc_id,
  4         rec_id
  5  from master m,
  6       ( select recordid, id, name, loc_id
  7         from lookup
  8         pivot (max(column_name)
  9                for column_name in ('id' "ID", 'name' "NAME", 'loc_id' "LOC_ID")) ) l
 10  where l.recordid (+) = m.rec_id
 11  /
        ID NAME                 LOC_ID                   REC_ID
---------- -------------------- -------------------- ----------
                                1000                        100
                                1000                        100
                                4000                        400
                                                            200
         9 z
         9 z
         9 z
         9 z
         8
         6
           i

         2                      2000

compare to:

SQL> select * from master;
        ID NAME                 LOC_ID                   REC_ID
---------- -------------------- -------------------- ----------
         1 s                    1000                        100
         2                      2000

           i
           k                                                200
         6
         7                      4000                        400
         8
         1 s                    1000                        100
         9 z
         9 z
         9 z
         9 z

Is this correct?

Report message to a moderator

Re: percentages of data population [message #652062 is a reply to message #652061]

Tue, 31 May 2016 12:57

pointers
Messages: 451
Registered: May 2008

Senior Member

I cant stop myself thanking you Micheal.

This should really solve the issue.

I would test it tomorrow and let you know.
But as I believe this should help me.

Thank you very much again.

Just curious is there any other way to achieve the same (Just to learn other way. I am learning new techniques from you )

Regards,
Pointers

Report message to a moderator

Re: percentages of data population [message #652148 is a reply to message #652062]

Thu, 02 June 2016 14:32

Bill B
Messages: 1971
Registered: December 2004

Senior Member

Just curious, why is the ID number in master allowed to be null? Shouldn't it be the primary key, not null, and filled with a sequence.

Report message to a moderator

Previous Topic:	Data not inserted after MView creation
Next Topic:	transpose and insert the data

Goto Forum:

-=] Back to Top [=-

[ Syndicate this forum (XML) ] [

]

Current Time: Sat Sep 28 13:51:20 CDT 2024