| Home | About Us | Sybase Training | Synergy | Consulting | Job Openings | Tech Videos | Rules and Disclaimer | Search |
![]() |
| Home | About Us | Sybase Training | Synergy | Consulting | Job Openings | Tech Videos | Rules and Disclaimer | Search |
|
Which query is better to find min & max value
|
|
11-10-2011, 01:25 AM
Post: #1
|
|||
|
|||
|
Which query is better to find min & max value
Hi, I have to find min & max value of a numeric column (say order_number) of size 12 from a table (say ORDER). Table statistics are given below. Total number of rows : 20615728 Two indexes 1) clustered, unique and 2 ) nonclustered (Column ORDER_NO is the only column in this index). Now i wrote the usual query to find the min & max value as Code: select Now one of my colleague suggested the following approach Code: set rowcount 1 This is a fast growing table in production. In the second approach aggregate functions min/max can be avoided. i don't know how much cpu time will be saved. With 'set rowcount 1' query plan is same for both approaches. So, my question is which approach is better? Please share your thought in this context. BRs, Pradyut |
|||
|
11-10-2011, 09:00 AM
Post: #2
|
|||
|
|||
|
RE: Which query is better to find min & max value
1. I think that second code segment has four lines missing. 2. You already know enough Sybase and SQL to figure out the answer yourself. In addition to SHOWPLAN, use:
SET STATISTICS TIME ON 3. As you have determined from the SHOWPLAN, ASE is smart enough to figure out that the queries are the same, and thus the plans are the same. 4. The "problem", if there is one, is in the mind of the beholder (I have no problem with you trying to understand). The first query is not MIN() and MAX(); it does not invoke the aggregate functions; the column is indexed, so the MIN() is actually the first entry, and the MAX() is the last entry; of the index; the query is a Covered Query, serviced by the index; "base table will not be read". MIN() and MAX() need not be aggregate functions.
Code: SELECT OrderNo6. Since (4) and (5) resolve to the same plan, therefore neither one is "better" than the other, technically or in terms of performance. Think about that. The second does not "avoid aggregate functions" because that implies the first did invoke aggregate functions; and it did not. (Certainly, you coded them, but the Optimiser resolved them into something else.) 7. What remains, is to label the activity accurately, name its intent, and to write code for what you actually want to do, regardless of the efficiency; then make it efficient. From where I sit, given the limited knowledge of your example, since we are dealing with a single -column Key, you are really seeking First and Last, so the second code segment is more clear in its intent. That does not make the first code segment wrong. If you were seeking a true SUM() or MAX() of a non-leading column, then the first would be more clear in intent. If an user goes up to two developers and asks "what is the earliest Order date you have on file", in the mind of one developer, that means MIN(Order.Date); in the mind of the other, that means ORDER BY Order.Date DESC. Neither is wrong. Depending on whther Order.Date is indexed or not, and other factors, one may be more efficient. Ashirvad to my Shishyas, Cheers to the others Derek Asirvadem Information Architect / Sr Sybase DBA Website Selection of Useful Documents for the Sybase DBA |
|||
|
11-10-2011, 09:49 AM
Post: #3
|
|||
|
|||
|
RE: Which query is better to find min & max value
Hi, Yes, my intent is to find the first and last value. As you have suggested i ran all queries with SET STATISTICS IO ON & SET STATISTICS TIME ON. Second approach is taking less IO and cpu time than first approach. Query plan is attached for references. I'm still not able to figure out why second approach is taing less io & cpu time? BRs, Pradyut |
|||
|
11-10-2011, 11:05 AM
Post: #4
|
|||
|
|||
|
RE: Which query is better to find min & max value
That's easy. but we are moving into completely different territory now. 1. As long as the server is busy, the I/O is going to fluctuate from the execution to execution. Here, it is almost idle, there is some work going on elsewhere; checkpoints; cache movements; etc. But even so, you can't count on absolute values to the nth degree. On a 200 million row table, if the I/O counts were say 10,000 vs 20,000, we have something worth discussing; if they were 10,000 vs 10,500 we have very little worth discussing. Here even the I/O counts "(regular=4 apf=0 total=4)" are identical. If we move to the estimated vs actual I/O cost, and elapsed time, we have: Max: 108/108_36 108/8____0 Min: 108/27__53 108/2____3 as far as the original question is concerned (posts #1 and #2), it is well and truly answered, and now confirmed. The estimates and plans are identical. So we are now discussing understanding only. 2. We move to the actuals. This is a run-time, currency issue. First, the diff between 108 vs 8 I/O cost or 36 vs 0 ms is too small to discuss. In real life, on an active server, that may be 108 vs 108 and 500 vs 500 ms. There would be no substantial difference to inspect further. I can't get any 10 executions (and my lab is sterile) to come within 50 ms of each other. Do not bother doing this, but for a real test, you need (a) a reasonable I/O, say 10,000, (b) run each pair 10 times, capture 10 figures, take the average © clear the cache between each test. The 10,000 I/O is important threshold becasue that is the absolute best I/O per sec you can get from a SAN (real world test) and testing less than one seconds worth is too small a sample size. Again, here, the sample is too small to make a reasonable comparison. (But we continue for explanatory purposes.) 3. Yes, you have a valid lab space (idle server), but no, it is not sterile. The old gotcha: even the act of testing, affects the test; even the act of observing, affects the test. We know that from the Vedas. Scientists proved that at the turn of the 20th century. Do this: Reverse the order. Do the ORDER BY first, the aggregate second. Then post back. Ashirvad to my Shishyas, Cheers to the others Derek Asirvadem Information Architect / Sr Sybase DBA Website Selection of Useful Documents for the Sybase DBA |
|||
|
« Next Oldest · Next Newest »
|