You are not logged in or registered. Please login or register to use the full functionality of this Website SybaseTeam.Com...
Hello There, Guest! (LoginRegister) Remember Me? Current time: 10-01-2014, 05:15 PM
   Home  |  About Us  |  Sybase Training  |  Synergy  |  Consulting  |  Job Openings  |  Tech Videos  |  Rules and Disclaimer  |  Search
Post Reply 
Forum Tools
Which query is better to find min & max value
11-10-2011, 01:25 AM
Post: #1
Quote this message in a reply
Which query is better to find min & max value


Hi,

I have to find min & max value of a numeric column (say order_number) of size 12 from a table (say ORDER).

Table statistics are given below.

Total number of rows : 20615728

Two indexes 1) clustered, unique and 2 ) nonclustered (Column ORDER_NO is the only column in this index).

Now i wrote the usual query to find the min & max value as

Code:
select
min(ORDER_NO)
from
ORDERS
go
select
max(ORDER_NO)
from
ORDERS
go


Now one of my colleague suggested the following approach

Code:
set rowcount 1
go
select
ORDER_NO
from
ORDERS
order by
ORDER_NO asc
go
select
ORDER_NO
from
ORDERS
order by
ORDER_NO desc
go


This is a fast growing table in production. In the second approach aggregate functions min/max can be avoided. i don't know how much cpu time will be saved. With 'set rowcount 1' query plan is same for both approaches. So, my question is which approach is better?

Please share your thought in this context.



BRs,
Pradyut
Find all posts by this user
11-10-2011, 09:00 AM
Post: #2
Quote this message in a reply
RE: Which query is better to find min & max value


1. I think that second code segment has four lines missing.

2. You already know enough Sybase and SQL to figure out the answer yourself. In addition to SHOWPLAN, use:
    SET STATISTICS IO ON
    SET STATISTICS TIME ON
and run both queries. If the answer is not completely obvious, post to that effect, and I will respond.

3. As you have determined from the SHOWPLAN, ASE is smart enough to figure out that the queries are the same, and thus the plans are the same.

4. The "problem", if there is one, is in the mind of the beholder (I have no problem with you trying to understand). The first query is not MIN() and MAX(); it does not invoke the aggregate functions; the column is indexed, so the MIN() is actually the first entry, and the MAX() is the last entry; of the index; the query is a Covered Query, serviced by the index; "base table will not be read". MIN() and MAX() need not be aggregate functions.
    To see an aggregate at work, try MAX(Order.Amount) or some non-leading column or SUM(OrderNo). Then you will get a tablescan, invoking APF & Large I/O.
5. The second query understands (4); the first part obtains the first value of the column; if the second part was
    Code:
    SELECT  OrderNo
        FROM ORDERS
        ORDER BY OrderNo DESC
it would obtain the last value of the column. That happens to be the min and max.

6. Since (4) and (5) resolve to the same plan, therefore neither one is "better" than the other, technically or in terms of performance. Think about that. The second does not "avoid aggregate functions" because that implies the first did invoke aggregate functions; and it did not. (Certainly, you coded them, but the Optimiser resolved them into something else.)

7. What remains, is to label the activity accurately, name its intent, and to write code for what you actually want to do, regardless of the efficiency; then make it efficient. From where I sit, given the limited knowledge of your example, since we are dealing with a single -column Key, you are really seeking First and Last, so the second code segment is more clear in its intent. That does not make the first code segment wrong. If you were seeking a true SUM() or MAX() of a non-leading column, then the first would be more clear in intent.

If an user goes up to two developers and asks "what is the earliest Order date you have on file", in the mind of one developer, that means MIN(Order.Date); in the mind of the other, that means ORDER BY Order.Date DESC. Neither is wrong. Depending on whther Order.Date is indexed or not, and other factors, one may be more efficient.



Ashirvad to my Shishyas, Cheers to the others
Derek Asirvadem
Information Architect / Sr Sybase DBA
Website
Selection of Useful Documents for the Sybase DBA
Visit this user's website Find all posts by this user
11-10-2011, 09:49 AM
Post: #3
Quote this message in a reply
RE: Which query is better to find min & max value


Hi,
Yes, my intent is to find the first and last value.
As you have suggested i ran all queries with SET STATISTICS IO ON & SET STATISTICS TIME ON. Second approach is taking less IO and cpu time than first approach. Query plan is attached for references. I'm still not able to figure out why second approach is taing less io & cpu time?



Attached File(s)
.txt  MaxORDER_NO_Plan.txt (Size: 4.71 KB / Downloads: 22)
.txt  MinORDER_NO_Plan.txt (Size: 4.71 KB / Downloads: 13)


BRs,
Pradyut
Find all posts by this user
11-10-2011, 11:05 AM
Post: #4
Quote this message in a reply
RE: Which query is better to find min & max value


That's easy. but we are moving into completely different territory now.

1. As long as the server is busy, the I/O is going to fluctuate from the execution to execution. Here, it is almost idle, there is some work going on elsewhere; checkpoints; cache movements; etc. But even so, you can't count on absolute values to the nth degree. On a 200 million row table, if the I/O counts were say 10,000 vs 20,000, we have something worth discussing; if they were 10,000 vs 10,500 we have very little worth discussing. Here even the I/O counts "(regular=4 apf=0 total=4)" are identical.

If we move to the estimated vs actual I/O cost, and elapsed time, we have:
Max:
108/108_36
108/8____0
Min:
108/27__53
108/2____3
as far as the original question is concerned (posts #1 and #2), it is well and truly answered, and now confirmed. The estimates and plans are identical.

So we are now discussing understanding only.

2. We move to the actuals. This is a run-time, currency issue. First, the diff between 108 vs 8 I/O cost or 36 vs 0 ms is too small to discuss. In real life, on an active server, that may be 108 vs 108 and 500 vs 500 ms. There would be no substantial difference to inspect further. I can't get any 10 executions (and my lab is sterile) to come within 50 ms of each other.

Do not bother doing this, but for a real test, you need (a) a reasonable I/O, say 10,000, (b) run each pair 10 times, capture 10 figures, take the average © clear the cache between each test. The 10,000 I/O is important threshold becasue that is the absolute best I/O per sec you can get from a SAN (real world test) and testing less than one seconds worth is too small a sample size.

Again, here, the sample is too small to make a reasonable comparison. (But we continue for explanatory purposes.)

3. Yes, you have a valid lab space (idle server), but no, it is not sterile. The old gotcha: even the act of testing, affects the test; even the act of observing, affects the test. We know that from the Vedas. Scientists proved that at the turn of the 20th century.

Do this: Reverse the order. Do the ORDER BY first, the aggregate second. Then post back.



Ashirvad to my Shishyas, Cheers to the others
Derek Asirvadem
Information Architect / Sr Sybase DBA
Website
Selection of Useful Documents for the Sybase DBA
Visit this user's website Find all posts by this user
Post Reply 


Possibly Related Threads...
Thread: Author Replies: Views: Last Post
  How to Optimize Sybase Union query sureshk_85 1 1,009 07-05-2014 12:26 AM
Last Post: krishna.tadepalli5@gmail.com
  Query Plans dicomal 1 2,063 01-14-2013 11:21 PM
Last Post: tsadmin
  SQL (sybase) query using TOP N performs very badly when inserted into table tmatharu 2 2,157 01-14-2013 11:19 AM
Last Post: Bharathvajan
  Query taking 30 hours after upgrading the server to 15.0 from 12.5 preetha16 11 3,781 07-22-2012 08:46 AM
Last Post: sarosh
  Making sense of query performance inarius 1 2,409 04-15-2012 05:28 PM
Last Post: DerekAsirvadem
  How will you tune a query? jomyluke@rediffmail.com 1 4,171 04-12-2012 05:49 AM
Last Post: DerekAsirvadem
  Stopping query execution dicomal 0 2,224 02-10-2012 04:28 PM
Last Post: dicomal
  Query performance issue after upgrade help needed awahidt3 0 2,358 01-11-2012 05:43 PM
Last Post: awahidt3
  Index choice dependant on who running query?? ncosgrove 1 2,757 10-19-2010 12:58 AM
Last Post: DerekAsirvadem
  Fixing queries using abstract query plans in Sybase ASE Nagendra 0 5,839 12-10-2009 04:29 PM
Last Post: Nagendra

Options:
Forum Jump:


Contact Us | SybaseTeam | Disclaimer & Rules | Return to Top | Return to Content | Lite (Archive) Mode | RSS Syndication