Tuning analytical function query (Oracle 10g)
Tuning analytical function query [message #388873] Thu, 26 February 2009 08:08
ajitpal.s
Messages: 204
Registered: November 2006
Senior Member
Hi guys,

I have a query that, for some reason, is taking more than 5 minutes to run, which is not reasonable.

explain plan for
select distinct
       a.line
      ,a.item_id
      ,first_value (b.std) over (partition by b.item_id
                                   order by case when b.r_date <= (select add_months(start_date,-4) from rel_tim where time_id = 5) then b.r_date
                                                 else null end desc nulls last
                                            ,case when b.r_date > (select add_months(start_date,-4) from rel_tim where time_id = 5) then b.r_date
                                                 else null end asc nulls last) val
from 
dm_v a, std_c b
where a.item_id = b.item_id
and b.std <> 0
and b.id = 105
and a.TRX >= '1-Jul-2008'




-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                                |  4584K|   144M|       |   106K  (2)| 00:21:14 |
|   1 |  TABLE ACCESS BY INDEX ROWID   | rel_tim                        |     1 |    12 |       |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN            | rel_tim_PK                     |     1 |       |       |     0   (0)| 00:00:01 |
|   3 |   TABLE ACCESS BY INDEX ROWID  | rel_tim                        |     1 |    12 |       |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN           | rel_tim_PK                     |     1 |       |       |     0   (0)| 00:00:01 |
|   5 |  HASH UNIQUE                   |                                |  4584K|   144M|   421M|   106K  (2)| 00:21:14 |
|   6 |   WINDOW SORT                  |                                |  4584K|   144M|   421M|   106K  (2)| 00:21:14 |
|*  7 |    HASH JOIN                   |                                |  4584K|   144M|  6560K| 23300   (2)| 00:04:40 |
|   8 |     TABLE ACCESS BY INDEX ROWID| dm_v                           |   239K|  3744K|       |  7622   (1)| 00:01:32 |
|*  9 |      INDEX RANGE SCAN          | dm_v_N11                       |   239K|       |       |   649   (2)| 00:00:08 |
|* 10 |     TABLE ACCESS FULL          | std_c                          |   406K|  6745K|       | 14725   (3)| 00:02:57 |
-------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("TIME_ID"=5)
   4 - access("TIME_ID"=5)
   7 - access("A"."ITEM_ID"="B"."ITEM_ID")
   9 - access("A"."TRX">=TO_DATE('2008-07-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
  10 - filter("B"."ID"=105 AND "B"."std"<>0)




Note:
1) Both tables, dm_v and std_c, have an index created on item_id
2) I have analyzed both tables


ANALYZE TABLE dm_v ESTIMATE STATISTICS SAMPLE 30 PERCENT;

ANALYZE TABLE std_c  ESTIMATE STATISTICS SAMPLE 30 PERCENT;


I would appreciate suggestions on how to go about tuning this. I see a full table scan, but I'm not sure how to get rid of it.
Re: Tuning analytical function query [message #388884 is a reply to message #388873] Thu, 26 February 2009 08:52
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
Are there indexes on the columns used in

where a.item_id = b.item_id
and b.std <> 0
and b.id = 105

and are statistics current on all indexes (not just the tables)?
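
On 10g you can also gather table and index statistics in one pass with DBMS_STATS and cascade; a minimal sketch (owner assumed to be the current user, table names taken from your query):

BEGIN
   -- gathers table stats and, via cascade, stats on each table's indexes
   DBMS_STATS.GATHER_TABLE_STATS(
      ownname          => USER,
      tabname          => 'DM_V',
      estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
      cascade          => TRUE);
   DBMS_STATS.GATHER_TABLE_STATS(
      ownname          => USER,
      tabname          => 'STD_C',
      estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
      cascade          => TRUE);
END;
/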
Re: Tuning analytical function query [message #388889 is a reply to message #388884] Thu, 26 February 2009 09:08
ajitpal.s
Messages: 204
Registered: November 2006
Senior Member
Thanks for the reply.

I noticed that b.std <> 0 has no index, but it's a number field, so I'm not sure it's a good idea to create an index on it.

After analyzing the rest of the indexes using the syntax below, I noticed the explain plan is still the same. Can I tell the optimizer to use an index for this column?

ANALYZE index <name> ESTIMATE STATISTICS SAMPLE 30 PERCENT; 



Is there any guideline as to when to analyze an index/table, what sample percentage to estimate, and whether to analyze the table or the index first?
Re: Tuning analytical function query [message #388890 is a reply to message #388873] Thu, 26 February 2009 09:13
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
How many total rows in std_c b?
How many rows a.item_id = b.item_id?
How many rows b.std <> 0?
How many rows b.id = 105?
Re: Tuning analytical function query [message #388893 is a reply to message #388890] Thu, 26 February 2009 09:32
ajitpal.s
Messages: 204
Registered: November 2006
Senior Member
I did a count; the figures are below. I did not apply the filter on TRX (and a.TRX >= '1-Jul-2008').


How many total rows in std_c b?
select count(*) from std_c -- = 4848914


How many rows a.item_id = b.item_id?
select count(*) from dm_v a, std_c b where a.item_id = b.item_id -- = 3709761000 



How many rows b.std <> 0?
select count(*) from std_c  where std <> 0 -- 3280043


How many rows b.id = 105?
select count(*) from std_c  where  id = 105 -- 728215
Re: Tuning analytical function query [message #388895 is a reply to message #388873] Thu, 26 February 2009 09:37
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
How does
select count(*) from dm_v a, std_c b where a.item_id = b.item_id -- = 3709761000
return MORE rows than exist in std_c?
select count(*) from std_c -- = 4848914

I give up.
Re: Tuning analytical function query [message #388896 is a reply to message #388895] Thu, 26 February 2009 09:44
ajitpal.s
Messages: 204
Registered: November 2006
Senior Member
Yes. Please don't give up, I have high hopes for you.

If you give up, I end up getting sacked.

It's a many-to-many join.
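
To illustrate with made-up rows (not my real data): two rows on each side sharing the same item_id join to four rows.

with x as (select 1 item_id from dual union all select 1 item_id from dual)
    ,y as (select 1 item_id from dual union all select 1 item_id from dual)
select count(*) from x, y where x.item_id = y.item_id;  -- returns 4, more than either side's 2 rows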
Re: Tuning analytical function query [message #388902 is a reply to message #388896] Thu, 26 February 2009 09:53
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Try this: instead of querying rel_tim twice per row returned, query it once per query.

select distinct
       a.line
      ,a.item_id
      ,first_value (b.std) over (partition by b.item_id
                                 order by case when b.r_date <= add_months(t.start_date,-4) 
                                               then b.r_date end desc nulls last
                                         ,case when b.r_date > add_months(t.start_date,-4) 
                                               then b.r_date end asc nulls last) val
from dm_v a
    ,std_c b
    ,rel_tim t
where a.item_id = b.item_id
and b.std <> 0
and b.id = 105
and a.TRX >= to_date('1-Jul-2008','dd-mon-yyyy')
and t.time_id = 5
Re: Tuning analytical function query [message #388906 is a reply to message #388902] Thu, 26 February 2009 10:10
ajitpal.s
Messages: 204
Registered: November 2006
Senior Member
Thanks for building up the query initially and tuning it. It now seems to take 4 min 21 sec, so some improvement there. The explain plan is reduced by one step.

Just curious: is it possible to avoid the full table scan on the std_c table (step 10 in the explain plan)? I'm not sure whether accessing it via an index would be faster, since an index does exist.

Are there any specific hints I can specify to get a much faster join?
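
For instance, would something along these lines make sense? (The index name std_c_item_idx is just my placeholder for whatever the item_id index is actually called.)

select /*+ index(b std_c_item_idx) */ distinct
       a.line
      ,a.item_id
      ,first_value (b.std) over (partition by b.item_id
                                 order by case when b.r_date <= add_months(t.start_date,-4)
                                               then b.r_date end desc nulls last
                                         ,case when b.r_date > add_months(t.start_date,-4)
                                               then b.r_date end asc nulls last) val
from dm_v a
    ,std_c b
    ,rel_tim t
where a.item_id = b.item_id
and b.std <> 0
and b.id = 105
and a.TRX >= to_date('1-Jul-2008','dd-mon-yyyy')
and t.time_id = 5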
Re: Tuning analytical function query [message #388952 is a reply to message #388906] Thu, 26 February 2009 14:13
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Seems to me like the nasty analytic function can be converted into a slightly less nasty aggregate function. If we do this and aggregate down to the join key before the mini-cartesian join to A, we can sort fewer rows and reduce the size of the joins.
SELECT a.line
,      a.item_id
,      max(d.val)
FROM dm_v a
,(
    SELECT b.item_id,
           MAX(b.std) KEEP (DENSE_RANK FIRST ORDER BY
                            CASE
                              WHEN b.r_date <= add_months(t.start_date,-4) 
                              THEN b.r_date 
                            END DESC NULLS LAST
                           ,CASE WHEN b.r_date > add_months(t.start_date,-4) 
                            THEN b.r_date 
                            END ASC NULLS LAST
                           ) AS val
    FROM   std_c b
    ,      rel_tim t
    WHERE  b.std <> 0
    AND    b.id = 105
    AND    t.time_id = 5
    GROUP BY b.item_id
) d
WHERE a.item_id = d.item_id
AND   a.TRX >= to_date('1-Jul-2008','dd-mon-yyyy')
GROUP BY a.line
,      a.item_id


Ross Leishman
Re: Tuning analytical function query [message #388972 is a reply to message #388952] Thu, 26 February 2009 17:11
ajitpal.s
Messages: 204
Registered: November 2006
Senior Member
I was trying to understand the changes you have made.

DENSE_RANK assigns a rank; if a value is the same as the previous one, it gets the same rank, and the next distinct value gets the subsequent rank.

But can you explain the purpose of the aggregate, i.e. how the MAX(b.std) before DENSE_RANK works?

I'm just trying to understand how the SQL in the 'd' inline view affects the join.


[Updated on: Thu, 26 February 2009 17:14]

Re: Tuning analytical function query [message #389129 is a reply to message #388972] Fri, 27 February 2009 06:11
Alessandro Rossi
Messages: 166
Registered: September 2008
Location: Rome
Senior Member
This one should probably do the same job.

select a.line
      ,a.item_id
      ,d.std as val
from dm_v a
    left outer join (
            select b.item_id,
                   max(b.std) keep (
                       dense_rank first
                       order by abs(b.r_date - add_months(c.start_date,-4))
                   ) as std
            from std_c b
                join rel_tim c on ( c.time_id = 5 )
            where b.r_date is not null
              and b.std <> 0
              and b.id = 105
            group by b.item_id
        ) d on (a.item_id = d.item_id)
where a.TRX >= date '2008-07-01'



As Ross Leishman said, it can be converted into a direct aggregate function (instead of using DISTINCT over an analytic).

Here I tried to reduce everything:

1. Instead of using ORDER BY ... NULLS LAST, I used an outer join against non-null first values.

2. Instead of aggregating the final (and bigger) data set, I applied the aggregate function to the smallest one (the table std_c).

3. The ORDER BY clause using CASE can probably be reduced to a mathematical one that is a bit easier to evaluate.


PS:

The MAX before DENSE_RANK is there because, in general, multiple rows may tie for first (or last) under a particular sort order. On those (multiple) rows you need an aggregate function, because the result must end up in a single row.
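
A tiny self-contained illustration with made-up rows (not your data), where two rows tie for first:

with sample_rows as (
        select 1 item_id, 10 std, date '2009-01-01' r_date from dual union all
        select 1 item_id, 20 std, date '2009-01-01' r_date from dual union all -- ties with the row above
        select 1 item_id, 30 std, date '2009-02-01' r_date from dual
)
select item_id
      ,max(std) keep (dense_rank first order by r_date) as val
from sample_rows
group by item_id;
-- the two rows dated 2009-01-01 tie for first, so MAX returns 20 among the tied rows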


Bye Alessandro

[Updated on: Fri, 27 February 2009 06:18]

Re: Tuning analytical function query [message #389133 is a reply to message #389129] Fri, 27 February 2009 06:28
ajitpal.s
Messages: 204
Registered: November 2006
Senior Member
Alessandro, thanks a lot for tuning the code, I appreciate it.

I can't imagine how you guys come up with such good SQL scripts in so many different ways.

Thanks all for the contributions.