How will you tune a query?
01-11-2012, 08:08 AM
Post: #1
How will you tune a query?


The performance of an application's SQL queries often plays a big role in the overall performance of the application itself. Response times can be really irritating for end users if the application doesn't have fine-tuned SQL queries. There are several ways of tuning SQL statements, a few of which are:

• Understanding the Data, Business, and Application - it's almost impossible to fine-tune SQL statements without a proper understanding of the data managed by the application and the business it handles. Understanding the application itself is of course of utmost importance. Knowing these things well, we may identify instances where data retrieval/modification by a SQL query can simply be avoided, because the same data is already available somewhere else (maybe in the session of some other integrating application) and can be reused. This understanding also helps identify queries which could be written better, either by changing the tables involved or by establishing relationships among the available tables.
• Using realistic test data - if the application is not tested in the development/testing environments with the volume and type of data it will eventually face in production, we can't be sure how its SQL queries will really perform in actual business scenarios. Therefore, it's important to have realistic data for development/testing purposes as well.
• Using Bind Variables, Stored Procs, and Packages - using identical SQL statements (wherever applicable) greatly improves performance, as the parsing step is eliminated. So we should use bind variables, stored procedures, and packages wherever possible to re-use the same parsed SQL statements (see the stored procedure sketch after this list).
• Using indexes carefully - indexing columns is the most common method of enhancing performance, but having too many indexes may degrade performance as well. So it's critical to decide wisely which columns of a table to index. A few common guidelines are: create indexes on columns that are frequently used in the WHERE clause or to join tables; avoid creating indexes on columns that are only ever used inside functions or operators; avoid creating indexes on columns that change frequently; etc.
• Making the access path available - the optimizer will not use an access path through an index merely because we have created the index. We may need to explicitly make that access path available to the optimizer, for example with SQL hints.
• Using EXPLAIN PLAN and TKPROF - these tools can fine-tune SQL queries to a great extent. EXPLAIN PLAN shows the complete access path that a particular SQL statement will use during execution, and TKPROF displays the actual performance statistics. Used in combination, they are really useful for seeing, changing, and in turn fine-tuning SQL statements.
• Optimizing the WHERE clause - there are many cases where the index access path of a column in the WHERE clause is not used even though an index on that column exists. Avoid such cases to make the best use of your indexes, which will ultimately improve performance. Some of these cases are: COLUMN_NAME IS NOT NULL (the ROWID of a null is not stored in an index), COLUMN_NAME NOT IN (value1, value2, value3, ...), COLUMN_NAME != expression, and COLUMN_NAME LIKE '%pattern' (whereas COLUMN_NAME LIKE 'pattern%' does use the index access path). Using expressions or functions on indexed columns also prevents the index access path from being used, so use them wisely (see the WHERE-clause sketch after this list).
• Using WHERE instead of HAVING - a WHERE clause can take advantage of an index defined on the column(s) it references, and it filters rows before grouping; HAVING filters only after the groups are formed.
• Using the leading index columns in the WHERE clause - the WHERE clause can use a composite index's access path only if it specifies the leading column(s) of that index; otherwise the indexed access path won't be used.
• Indexed Scan vs Full Table Scan - an indexed scan is faster only if we are selecting a few rows of a table; otherwise a full table scan should be preferred. It's estimated that an indexed scan is slower than a full table scan if the SQL statement selects more than 15% of the rows of the table. So, in all such cases, use SQL hints to force a full table scan and suppress the use of pre-defined indexes. Okay... any guesses why a full table scan is faster when a large percentage of rows is accessed? Because an indexed scan causes multiple reads per row accessed, whereas a full table scan can read all rows contained in a block in a single logical read operation.
• Using ORDER BY for an indexed scan - the optimizer uses an indexed scan if the column specified in the ORDER BY clause has an index defined on it, even if that column doesn't appear in the WHERE clause (or the WHERE clause is missing altogether). So consider whether you really want an indexed scan or a full table scan, and if the latter is preferred in a particular scenario, use the 'FULL' SQL hint to force the full table scan.
• Minimizing table passes - fewer passes over a table normally result in better performance, for obvious reasons.
• Joining tables in the proper order - the order in which tables are joined normally affects the number of rows processed by the JOIN, so proper ordering may result in fewer rows being processed, which in turn improves performance. The key to deciding the proper order is to apply the most restrictive filtering conditions in the early phases of a multi-table JOIN. For example, with a master table and a details table, it's better to join from the master table first; joining the details table first may result in a larger number of rows being joined.
• Simple is usually faster - yeah... instead of writing one very complex SQL statement, if we break it into multiple simple SQL statements, the chances are quite high that performance will improve.
• Finally, check the query plan of the stored procedure before moving to production.
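
As a rough illustration of the stored-procedure point, here is a minimal Transact-SQL sketch (the table, column, and procedure names are hypothetical): a parameterised procedure compiles once, and the plan is reused on each call instead of re-parsing ad-hoc SQL every time.

Code:
    -- Hypothetical example: the procedure's compiled plan is
    -- reused on every call, so the parse/compile step is not
    -- repeated for each execution.
    create procedure get_orders_by_customer
        @customer_id int
    as
        select order_id, order_date, amount
        from   orders
        where  customer_id = @customer_id
    go

    -- Each call reuses the stored plan:
    exec get_orders_by_customer @customer_id = 42
    go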
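
And a sketch of the WHERE-clause and WHERE-vs-HAVING points (again, the orders table, its columns, and its indexes are hypothetical): the first statement of each pair can use an index, while the second defeats it.

Code:
    -- Assumes indexes on orders(cust_name) and orders(customer_id).

    -- A leading-literal LIKE can use the index; a leading wildcard cannot:
    select * from orders where cust_name like 'Smi%'
    select * from orders where cust_name like '%mith'

    -- Comparing the bare column can use the index; wrapping it in an
    -- expression or function prevents the index access path:
    select * from orders where customer_id = 42
    select * from orders where customer_id + 0 = 42

    -- WHERE filters rows before grouping and can use an index:
    select customer_id, sum(amount)
    from   orders
    where  customer_id = 42
    group by customer_id

    -- HAVING applies the same filter only after all groups are built:
    select customer_id, sum(amount)
    from   orders
    group by customer_id
    having customer_id = 42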



Jomy Luke Xavier
Sr.System Analyst
04-12-2012, 05:49 AM
Post: #2
RE: How will you tune a query?


Your intention is good, and you make many valuable points. But ...
  1. A small number of the statements you make re the operation of Sybase are valid, and probably valid for any SQL platform.
  2. Most of the Sybase-specific statements are not correct (they are valid for Oracle, and you use their terms).
    • We do not have those problems/needs and
    • we have quite different features and defaults.
    The Oracle optimiser is very primitive compared to the Sybase optimiser.
  3. There are a lot more features (not mentioned in your post) for identifying performance issues in SQL in Sybase, and methods for addressing them in Sybase, that Oracle does not have. Assuming the database design is correct, that is where most Sybase SQL developers should spend their time.

As for the advice you give (as opposed to the stated operation of Sybase), it may be true for Oracle, but it is incorrect or incomplete for Sybase. If you had a good handle on the operation of Sybase, perhaps you would give different advice. But to counter each point properly in a forum like this is going to take a lot of explanation. You make many points (good), but most of them need to be corrected or clarified (bad). I cannot address all of them; I will address just two, to provide an example of what I mean.
    (01-11-2012 08:08 AM)jomyluke@rediffmail.com Wrote:  Indexed Scan vs Full Table Scan - an indexed scan is faster only if we are selecting a few rows of a table
Correct
    Quote:otherwise a full table scan should be preferred.
Well, that depends. I have a problem with "preferred" and the simplicity of the advice. Reading 16 million rows via a tablescan is faster than via an index, yes, but the issue that really needs to be addressed is: why are you reading so many rows in an online transaction processing system? Point being, in OLTP systems we should be reading a few rows, in which case tablescans are to be avoided at all costs; find out why it is not using an index and fix that problem.
    Quote:It's estimated that an indexed scan is slower than a full table scan if the SQL statement selects more than 15% of the rows of the table.
I agree in general. No comment on the 15% figure, but it is easy enough to run a benchmark, on your particular populated table, forcing both an index scan and a tablescan, and see for yourself.
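For anyone who wants to try it, a minimal sketch of such a benchmark in ASE (the table and index names are hypothetical; (index 0) forces the tablescan, (index name) forces the named index):

Code:
    set statistics io on
    set statistics time on
    go

    -- Force the tablescan:
    select count(*) from orders (index 0)
    where order_date >= '20120101'
    go

    -- Force the candidate index:
    select count(*) from orders (index idx_orders_date)
    where order_date >= '20120101'
    go

    set statistics io off
    set statistics time off
    go

Compare the logical/physical reads and elapsed times of the two runs on your own populated table.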
    Quote:So, in all such cases, use SQL hints to force a full table scan and suppress the use of pre-defined indexes.
Definitely not. Never, ever include tablescan hints in the code. That is like forcing a car to run in first gear only. The only valid place to code a tablescan is in a benchmark, where you want to load the cache, collect timings, etc.
    (Separately, ASE does choose tablescans over index scans correctly, it does all that thinking for you, and reaches the same conclusion, therefore there is almost never a need for having to code a tablescan.)
The Sybase Optimiser is far more intelligent, and flexible, than you realise. Sybase operates on the basis of statistics for the columns in each table. While a tablescan may be faster in one instance, a year later, with the new population of the table, and up-to-date stats, an index scan may well be faster. Or vice versa. If you have forced hints, you will never know, you will always [force] execute at the old known speeds.

As for providing index hints (as opposed to tablescan hints) in SQL code, that is allowed, but do so as a very last resort, after you have tried everything else and failed. Update Stats first. Develop the query (fine-tune, rework, re-sequence, change joins to subqueries, etc), second, so that it uses the indices you have, and chooses a good query plan: that is where you will learn the problems, and where you will get the best results anyway.
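
A sketch of that sequence, with a hypothetical table name (Update Stats first, then recompile anything that depends on the table, before even thinking about a hint):

Code:
    -- Refresh the statistics on the table's indexed columns:
    update statistics orders
    go
    -- Or, more thoroughly, on all columns of all its indexes:
    update index statistics orders
    go
    -- Mark dependent stored procedures for recompilation, so they
    -- are re-optimised with the fresh statistics on next execution:
    exec sp_recompile orders
    go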

Since ASE 12.5.4, we have Abstract Query Plans, which are really a much more effective method of addressing the entire QP, join order, join type, etc. You can specify as little as you need (and let the Optimiser choose the rest), or the full QP.
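For readers who have not seen one, a minimal sketch of the PLAN clause, using the pre-15 g_join operator per the 12.5 documentation (the tables and index names are hypothetical; you can also supply only a fragment and let the Optimiser choose the rest):

Code:
    -- Abstract Query Plan: dictate the join and both index scans.
    select orders.order_id, customers.cust_name
    from   orders, customers
    where  orders.customer_id = customers.customer_id
    plan  "( g_join ( i_scan idx_cust customers ) ( i_scan idx_ord orders ) )"
    go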
    Quote:Okay... any guesses why a full table scan is faster when a large percentage of rows is accessed? Because an indexed scan causes multiple reads per row accessed, whereas a full table scan can read all rows contained in a block in a single logical read operation.
Maybe in Oracle, but that is still in the dark ages. Not so for Sybase. While it is true that an index scan reads the index first and the data rows second, it is not correct that it causes "multiple reads per row". No, we have caches (which Oracle does not have), and very advanced cache management. A page (containing many rows) is read once, and timestamped; if the page is read again, it is a logical read, not a physical read, and the timestamp is updated, and the page "moves" to the Most Recently Used end of the cache [in the manuals, the cache is discussed in terms of two "ends", the MRU and the Least Recently Used ends; the pages do not actually move; the pointers to the pages are changed, so that the pages appear to "move" within those two "ends" of the cache]. Anyway, the point is, it is a single read per row, and additionally many of them will be logical reads, not physical reads, because the pages are cached and have not aged out during the period of the query.

Tablescan or index scan, we can read a Page, multiple Pages up to an Extent (8 Pages), or multiple Extents up to an Allocation Unit (32 Extents, which is 256 Pages), in a single I/O operation. ASE has a feature called Asynchronous Pre-Fetch, which fetches as many pages as possible in anticipation of the query requirement; that is, far ahead of the current set of pages being processed. It, too, is very mature, and has a built-in governing mechanism, called the Look-Ahead Set, which ASE keeps up-to-date based on actual hits. This feature makes ASE very, very fast with both index scans and tablescans.
    However, when the DataStructures become fragmented (as in, you as the DBA have not been de-fragmenting regularly as prescribed), APF is ineffective: the Look-Ahead Set gets reined back very early. It needs to be mentioned that DPL/DRL tables have a third level of fragmentation, and they get fragmented at that level very fast.
To close that point, tablescans range from very fast to blindingly fast, depending on how good your administration is; a tablescan is the smallest no of logical and physical reads. Covered index scans are blindingly fast, period. Index scans (of the data) are slower because they scan the index first and then, for each qualifying index entry, read the page containing the row; that is many more logical reads (which result in many more physical reads) than a tablescan, and it jumps all over the place.
    Quote:Joining tables in the proper order ...
In previous versions, join order was part of the method. Join order is completely irrelevant since the Optimiser was re-written in ASE 15.0. It now chooses Query Plans from (a) a much larger set of possibilities, and (b) many more join types, and (c) this results in the joins being executed in quite a different way. It actually normalises the Query Tree; in so doing, it produces a much better Query Tree than the SQL that you wrote. What is more important is the scan operators, which are the new method; the joins are plugged into the scans. E.g. table2 in your query's join order may appear as element 3 of scan operator 4. Further, different executions of the same SQL code may well use a different Query Plan or a different "join order". If you really think that you can identify a faster query by specifying join order, join types (that deserves a full chapter), or scan type, then you really should write Abstract Query Plans.

There is a lot more that should be said in response to yours, but I do not have the time. In summary, I would say that it appears you are applying Oracle methods to Sybase, and that simply will not work; it will in fact produce negative performance (compared to the default Sybase operation). You really need to read up on the Sybase features and gain some experience using them, changing their operation, etc; following that, you can identify how to use them and how to approach SQL performance.

However, your general advice on that point is quite correct: join tables from smallest-no-of-rows to largest.

----

Oh yeah. We do not have EXPLAIN PLAN, TKPROF, Bind Variables or Packages.

We have SHOWPLAN, which details the Query Tree, either as text or graphically, and we have STATISTICS IO, which gives us exact scans and I/O by type.

These should not be "checked before moving to production". No, they should be used all the time during development of the SQL code; they are essential for understanding what the code is actually doing vs its intent, and for trying to improve and fine-tune the code (that's why it is called the development stage, not UAT, not production). Developing SQL code without them is stupid in the extreme, like coding with your monitor turned off. More important, if you are not using SHOWPLAN all the time (which you should) and you code an index hint or force a tablescan, at least then you must use SHOWPLAN to check that the Optimiser is doing what you intend, and that there are no subsequent deleterious effects.
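
For the newcomers, a minimal session sketch (the query and names are hypothetical); this is what should be running the whole time you develop SQL:

Code:
    -- Turn the diagnostics on for the session:
    set showplan on
    set statistics io on
    go

    -- Run the statement under development; ASE reports the query
    -- tree, plus the logical and physical reads per table:
    select customer_id, sum(amount)
    from   orders
    group by customer_id
    go

    -- Turn them off when finished:
    set showplan off
    set statistics io off
    go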
    Quote:Simple is usually faster - yeah... instead of writing one very complex SQL statement, if we break it into multiple simple SQL statements, the chances are quite high that performance will improve.
Actually the reverse is true. Sure, twenty years ago it was necessary to break code up, but not after System X (1990). The optimiser in fact handles complex SQL very well. While simple may well be fast, it is unfair to compare a simple statement with a complex statement (they produce entirely different output); what is fair to compare is a complex statement written well, and the same complex statement (same output) written badly, and the performance of each. First and foremost, the Optimiser is very mature and handles complex statements well, but it is not an AI system: it cannot unravel badly written complex statements; it handles those badly and makes poor decisions. So I advise as follows:
  1. Write complex statements. Doing many operations in one statement, or collecting all the info required in a single SELECT, is very efficient. Breaking that one statement up into several smaller, simpler statements will be much slower (more calls to ASE, going over the same data over and over again), and much slower again if #tables are required (see the sketch after this list). Going to the supermarket once per week, with a carefully constructed shopping list and knowledge of the aisles, is much, much more efficient than going several times per week, or not having a good list, or not knowing the store.
  2. Again, while using SHOWPLAN, develop the statement code. Make it crystal clear, block it out carefully; ensure that it is doing what you intend. If necessary, change your approach or the sequence of code blocks. From hundreds of assignments, I can confirm that the optimiser will handle this well.
  3. If the code is badly constructed, not only will it be hard to understand for any human who has the task of maintaining it, it will also be hard for the optimiser to understand and therefore to optimise. So it can't remain in that state for long; it is unacceptable (if your colleagues do not scream, the users eventually will).
  4. So it now depends on your SQL coding ability. If you can, you should strive to achieve [2]. If you can't, then you have no choice: break it up into smaller queries, while avoiding #tables. Very much a second choice.
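
A sketch of the contrast in [1], with hypothetical tables: the first form is one call and one pass; the second is three calls, a #table, and the same data handled more than once.

Code:
    -- One complex statement: one call, one pass, no #table:
    select c.cust_name, sum(o.amount) as total
    from   customers c, orders o
    where  c.customer_id = o.customer_id
      and  o.order_date >= '20120101'
    group by c.cust_name

    -- The broken-up equivalent: extra calls to ASE, plus the
    -- qualifying rows written to tempdb and read back again:
    select customer_id, amount
    into   #recent
    from   orders
    where  order_date >= '20120101'

    select c.cust_name, sum(r.amount) as total
    from   customers c, #recent r
    where  c.customer_id = r.customer_id
    group by c.cust_name

    drop table #recent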

Oracle cacks itself on Subqueries, which is just one reason you have to break your statements up on that platform, change them into Derived tables, jump through hoops, etc. Oracle handles Derived Tables, but much slower than Sybase. Yes, I have benchmarked, head to head with a few Oracle big names.

Subqueries (producing both scalars and tables or vectors) are an ANSI/ISO/IEC SQL 89 requirement. IIRC Derived Tables are SQL 92. Sybase and DB2 handle Subqueries and Derived Tables brilliantly. If you do not learn both those techniques properly, you simply cannot code SQL effectively. There have been many instances where I have coded a report in a single SELECT with derived tables and subqueries, where a less experienced developer would have broken it up into multiple SELECTs (acceptable), and where an Oracle developer cannot even understand it and therefore codes several SELECT INTOs (unacceptable).
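
By way of example, a sketch of such a single-SELECT report against a hypothetical schema: a Derived Table produces the per-customer totals, and a correlated Subquery produces a scalar (the latest order date) for each row.

Code:
    -- One SELECT: a derived table plus a correlated scalar subquery.
    select t.cust_name,
           t.total,
           ( select max(o2.order_date)
             from   orders o2
             where  o2.customer_id = t.customer_id ) as last_order
    from   ( select c.customer_id, c.cust_name, sum(o.amount) as total
             from   customers c, orders o
             where  c.customer_id = o.customer_id
             group by c.customer_id, c.cust_name ) t
    order by t.total desc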

Feel free to post an example, it will be good for everyone to work through it.

Oracle's gross inability to handle Subqueries and ANSI SQL requirements (among others) is the reason Oracle databases are unnormalised or denormalised: it is demanded in order to allow the queries that Oracle is capable of without cacking itself. No such problem on the Sybase/DB2 side of the universe, which is the reason our databases tend to be more normalised, and "denormalisation for performance" is absolutely not necessary. But note, a normalised database will have more, smaller tables (far fewer indices), and the queries will each address more tables. I would not call that "complex", but you do need to be able to code SQL effectively.



Ashirvad to my Shishyas, Cheers to the others
Derek Asirvadem
Information Architect / Sr Sybase DBA