Understanding Different Parameter Types in Database Cost Functions

Different Parameter Types in Database Cost Functions and Their Information Storage

The cost function is part of the query cost optimization module of the database management system. Elmasri & Navathe (2015) indicate that the optimizer heuristic rules or query transformations are not the only dependencies used for cost optimization.

The cost-based optimizer will estimate and compare the cost of executing a query by taking advantage of different execution strategies and algorithms, and it then chooses the strategy with the lowest cost estimate.

From a developer or database administrator perspective, one can affect optimizer results by analyzing the table or adding or deleting indexes on the table(s). To find out how the optimizer has defined the path for the execution at any one time, an EXPLAIN PLAN SQL statement can be placed in front of the SQL code to find how the optimizer interprets the fast query execution as described by SQL Reference: (2016). A few examples of ways the optimizer is convinced to use a different path or index is by using ANALYZE to update the row count statistics, adding or dropping an index on a table(s), and offering an optimizer hint in the case of the Oracle database management system as described by Optimizer Influence (2016).

The ANALYZE SQL command will collect statistics on the table indexes. Some databases will automatically collect statistics; others require the ANALYZE command to be run regularly to keep the optimizer statistics in sync with what is currently physically stored in the table. Adding or dropping an index(s) may change the optimizer path and can be seen when one drops or adds an index(s) and re-running the EXPLAIN PLAN to see the new optimization path chosen. Lastly, the Oracle DBMS allows one to supply a hint to the optimizer to alter the optimization chosen. Optimizer hints are placed in the SQL to assist the optimizer in deciding the index or optimization path that may be the fastest but not selected by the optimizer.

In estimating the optimizer costs of various execution strategies, the database must keep track of any information or parameters needed for the cost functions of the optimizer, as described by Elmasri & Navathe (2015) . Cost function information is stored in the DBMS catalog, where the DBMS query optimizer can easily access the cost functions. The DBMS collects information. It collects the size of each file. For files where the records are the same type, the record count, the average record size, and a count or estimated number of file blocks are needed. The DBMS may also collect the blocking factor in some cases. The primary file organization for each file is also collected and stored.

Elmasri & Navathe (2015) goes on to discuss that the main file organization records may be unordered, an attribute could order them and have a primary or clustering index defined. There also could be a hash on a key attribute. Other information kept would be primary, secondary, or clustering indexes and their indexing attributes. The number of levels of a multilevel index will be needed for cost functions that estimate how many block accesses occur during query execution. In some cost functions, the number of first-level index blocks (bI1) is needed.

The cost functions, which are part of the cost-based query optimization process in the DBMS, use traditional optimization methods that search the solution space to a problem for a solution that minimizes the object cost function, as pointed out by Elmasri & Navathe (2015). It is important to remember that the cost functions used in query optimization are estimated and not precise cost functions. The optimizer can decide on a query execution strategy that may not end up as the best or fastest performing, which is why some query tuning with EXPLAIN PLAN and ANALYZE is best during query design in the application development process.

References

Elmasri, R., Navathe, S. B. (2015). Fundamentals of Database Systems, 7th Edition [VitalSource Bookshelf version]. Retrieved from (https://bookshelf.vitalsource.com/books/9781323139899)

Optimizer Influence. (2016). Database Tuning Guide: Influencing the Optimizer. Retrieved From: https://docs.oracle.com/database/121/TGSQL/tgsql_influence.htm#TGSQL246

SQL Reference. (2016). Database Language SQL Reference . Retrieved from https://docs.oracle.com/database/121/SQLRF/toc.htm

Posts in this series