paint-brush
Optimizing Oracle Database Queries Using Execution Plans: A Step-by-Step Guideby@arvindtoorpu
138 reads

Optimizing Oracle Database Queries Using Execution Plans: A Step-by-Step Guide

by Arvind Toorpu10mFebruary 3rd, 2025
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

Execution plans are indispensable tools for diagnosing and improving query performance in Oracle databases. This guide bridges theory and practice, offering a systematic approach to tuning queries using execution plans. Use the visual plan tree to analyze operations.
featured image - Optimizing Oracle Database Queries Using Execution Plans: A Step-by-Step Guide
Arvind Toorpu HackerNoon profile picture
0-item


Optimizing Oracle Database Queries Using Execution Plans: A Step-by-Step Guide


Abstract: Execution plans are indispensable tools for diagnosing and improving SQL query performance in Oracle databases. This guide provides a comprehensive workflow to generate, interpret, and optimize execution plans, with practical examples, actionable insights, and advanced tuning strategies for large-scale datasets.


Table of Contents

  1. Introduction
  2. Generating Execution Plans
  3. Interpreting Execution Plans
  4. Identifying Performance Bottlenecks
  5. Query Tuning Strategies
  6. Practical Optimization Example
  7. Advanced Tips and Best Practices
  8. Conclusion

Introduction

Execution plans in Oracle databases reveal how the SQL optimizer processes queries, exposing inefficiencies like full table scans, expensive joins, or sorting operations. With the rise of big data, optimizing these plans is critical for reducing latency, improving scalability, and minimizing resource consumption. This guide bridges theory and practice, offering a systematic approach to tuning queries using execution plans.


Generating Execution Plans

Method 1: Using EXPLAIN PLAN

Generate an estimated plan without executing the query:

EXPLAIN PLAN FOR  
SELECT c.customer_id, c.city, SUM(o.amount)  
FROM customers c  
JOIN orders o ON c.customer_id = o.customer_id  
WHERE c.city = 'New York'  
GROUP BY c.customer_id, c.city  
ORDER BY SUM(o.amount) DESC;  

-- Retrieve the plan  
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);  

Method 2: Oracle SQL Developer (GUI)

  1. Open SQL Developer.
  2. Write your query and click Explain Plan (or press F10).
  3. Use the visual plan tree to analyze operations.

Method 3: Capture Actual Execution Statistics

Include runtime metrics using GATHER_PLAN_STATISTICS:

SELECT /*+ GATHER_PLAN_STATISTICS */  
    c.customer_id, c.city, SUM(o.amount) AS total_amount  
FROM customers c  
JOIN orders o ON c.customer_id = o.customer_id  
WHERE c.city = 'New York'  
GROUP BY c.customer_id, c.city  
ORDER BY total_amount DESC;  

-- Display actual execution data  
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'ALLSTATS LAST'));  

Interpreting Execution Plans

Plan Structure and Hierarchy

Execution plans are tree structures where child operations feed data to parent operations.

  • Reading Direction: Right-to-left and bottom-to-top.
  • Indentation = Depth in the tree.
  • Root Node: The top-left operation (SELECT STATEMENT).

Sample Execution Plan

For the query:

SELECT c.customer_id, c.city, SUM(o.amount)  
FROM customers c  
JOIN orders o ON c.customer_id = o.customer_id  
WHERE c.city = 'New York'  
GROUP BY c.customer_id, c.city  
ORDER BY SUM(o.amount) DESC;  

Resulting Plan:

-------------------------------------------------------------------------------  
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time  |  
-------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT     |            | 10000 | 450K  | 25468   (2)| 00:00:01 |  
|   1 |  SORT ORDER BY       |            | 10000 | 450K  | 25468   (2)| 00:00:01 |  
|   2 |   HASH GROUP BY      |            | 10000 | 450K  | 25467   (2)| 00:00:01 |  
|*  3 |    HASH JOIN         |            | 5000K | 214M  | 24472   (1)| 00:00:01 |  
|*  4 |     TABLE ACCESS FULL| CUSTOMERS  | 10000 | 300K  |   423   (1)| 00:00:01 |  
|   5 |     TABLE ACCESS FULL| ORDERS     | 5000K | 71M   | 19742   (1)| 00:00:01 |  
-------------------------------------------------------------------------------  

Visual Representation of Data Flow:

SELECT STATEMENT (Id 0)  
└── SORT ORDER BY (Id 1)  
    └── HASH GROUP BY (Id 2)  
        └── HASH JOIN (Id 3)  
            ├── TABLE ACCESS FULL CUSTOMERS (Id 4)  
            └── TABLE ACCESS FULL ORDERS (Id 5)  

Execution Order:

  1. Leaf Nodes First (Deepest/Rightmost Operations):
    • Id 4: Full scan of CUSTOMERS (filtered by city = 'New York').
    • Id 5: Full scan of ORDERS.
  2. Parent Operations:
    • Id 3: Combines results from Id 4 and Id 5 via a HASH JOIN.
    • Id 2: Groups data by customer_id and city.
    • Id 1: Sorts the grouped results by total_amount DESC.
  3. Root Node:
    • Id 0: Returns the final sorted result.

Key Metrics and Cost Model

Column

Description

Id

Step identifier. Child operations execute before parents (e.g., Id 4 → Id 3).

Operation

Action performed (e.g., HASH JOIN, TABLE ACCESS FULL).

Name

Object involved (table/index name).

Rows

Estimated rows processed at this step (cardinality).

Bytes

Estimated data size processed.

Cost (%CPU)

Optimizer’s cost estimate (lower = better). Includes CPU and I/O overhead.

Time

Estimated runtime for the operation.

How Oracle Calculates Cost

The optimizer estimates cost based on:

  • I/O: Reading data from disk (e.g., full table scans).
  • CPU: Processing data (joins, sorting, aggregations).
  • Memory: Storing intermediate results (e.g., hash tables for joins).

Critical Red Flags in the Sample Plan

  1. Full Table Scans (TABLE ACCESS FULL):
    • Id 4 and Id 5 scan entire tables. For large tables, this is slow.
    • Fix: Add indexes on filtered (city) or joined (customer_id) columns.
  2. High Cost Operations:
    • The HASH JOIN (Id 3) contributes 24,472 to the total cost (96% of total).
    • Why? Joining 5 million rows from ORDERS with 10,000 rows from CUSTOMERS.
  3. Expensive Sorting:
    • SORT ORDER BY (Id 1) adds overhead.
    • Fix: Indexes on SUM(o.amount) or materialized views.

Identifying Performance Bottlenecks

  1. Full Table Scans
    • Cause: Missing indexes on filtered or joined columns.
    • Fix: Create composite or function-based indexes.
  2. High Join Costs
    • Example: HASH JOIN on tables with 10M+ rows.
    • Fix: Use NESTED LOOPS for indexed small tables.
  3. Sorting Overheads
    • Indicators: SORT GROUP BY, SORT ORDER BY.
    • Fix: Add indexes on GROUP BY/ORDER BY columns.
  4. Cardinality Mismatches
    • Example: Optimizer estimates 100 rows, but 100K are processed.
    • Fix: Refresh statistics with DBMS_STATS.GATHER_TABLE_STATS.

Query Tuning Strategies

1. Index Optimization

  • Covering Indexes: Include frequently accessed columns.

    CREATE INDEX idx_customers_city ON customers(city) INCLUDE (customer_id, name);  
    
  • Partitioning: Split tables by range, hash, or list (e.g., by order_date).

2. SQL Rewrites

  • Replace SELECT * with explicit columns.
  • Use EXISTS instead of IN for subqueries.
  • Avoid functions on indexed columns (e.g., WHERE UPPER(name) = 'JOHN').

3. Optimizer Hints

Force specific behaviors (use sparingly):

SELECT /*+ INDEX(orders idx_orders_customer_id) */  
    customer_id, SUM(amount)  
FROM orders  
GROUP BY customer_id;  

Practical Optimization Example

Step 1: Create Sample Tables

-- Customers table with 1M rows  
CREATE TABLE customers (  
    customer_id NUMBER PRIMARY KEY,  
    name VARCHAR2(100),  
    city VARCHAR2(50)  
);  

-- Orders table with 5M rows  
CREATE TABLE orders (  
    order_id NUMBER PRIMARY KEY,  
    customer_id NUMBER,  
    order_date DATE,  
    amount NUMBER  
);  

Step 2: Baseline Execution Plan

SELECT /*+ GATHER_PLAN_STATISTICS */  
    c.customer_id, c.city, SUM(o.amount) AS total_amount  
FROM customers c  
JOIN orders o ON c.customer_id = o.customer_id  
WHERE c.city = 'New York'  
GROUP BY c.customer_id, c.city  
ORDER BY total_amount DESC;  

Initial Plan Analysis

Initial Execution Plan

-------------------------------------------------------------------------------  
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time  |  
-------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT     |            | 10000 | 450K  | 25468   (2)| 00:00:01 |  
|   1 |  SORT ORDER BY       |            | 10000 | 450K  | 25468   (2)| 00:00:01 |  
|   2 |   HASH GROUP BY      |            | 10000 | 450K  | 25467   (2)| 00:00:01 |  
|*  3 |    HASH JOIN         |            | 5000K | 214M  | 24472   (1)| 00:00:01 |  
|*  4 |     TABLE ACCESS FULL| CUSTOMERS  | 10000 | 300K  |   423   (1)| 00:00:01 |  
|   5 |     TABLE ACCESS FULL| ORDERS     | 5000K | 71M   | 19742   (1)| 00:00:01 |  
-------------------------------------------------------------------------------  
  • FULL TABLE SCAN on both tables.
  • Cost: 25,468.
  • Bottlenecks: No indexes on city or customer_id.

Step 3: Apply Optimizations

  1. Add Indexes:

    CREATE INDEX idx_customers_city ON customers(city);  
    CREATE INDEX idx_orders_customer_id ON orders(customer_id, amount);  
    
  2. Refresh Statistics:

    -- Update statistics for CUSTOMERS  
    EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'CUSTOMERS');  
    
    -- Update statistics for ORDERS  
    EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'ORDERS');  
    


Step 4: Optimized Execution Plan

Re-run the Query:

SELECT /*+ GATHER_PLAN_STATISTICS */  
    c.customer_id, c.city, SUM(o.amount) AS total_amount  
FROM customers c  
JOIN orders o ON c.customer_id = o.customer_id  
WHERE c.city = 'New York'  
GROUP BY c.customer_id, c.city  
ORDER BY total_amount DESC;  


Optimized Execution Plan Output:

-------------------------------------------------------------------------------  
| Id  | Operation             | Name               | Cost (%CPU)| Time        |  
-------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT      |                    | 1502 (100)| 00:00:01    |  
|   1 |  SORT ORDER BY        |                    | 1502   (3)| 00:00:01    |  
|   2 |   HASH GROUP BY       |                    | 1498   (3)| 00:00:01    |  
|*  3 |    HASH JOIN          |                    | 1412   (1)| 00:00:01    |  
|*  4 |     INDEX RANGE SCAN  | IDX_CUSTOMERS_CITY |    40   (0)| 00:00:01    |  
|   5 |     INDEX FAST FULL SCAN| IDX_ORDERS_CUSTOMER_ID | 1363 (1)| 00:00:01 |  
-------------------------------------------------------------------------------  

Optimized Plan Results

  • INDEX RANGE SCAN on CUSTOMERS:

Uses idx_customers_city to filter city = 'New York' efficiently.

  • INDEX FAST FULL SCAN on ORDERS:

    • Leverages idx_orders_customer_id to retrieve customer_id and amount without scanning the entire table.
  • Cost Reduction:

    • Before: 25,468
    • After1,502 (94% reduction).

Advanced Tips and Best Practices

  1. Adaptive Query Optimization (Oracle 12c+):
    • Let Oracle dynamically adjust execution plans.
  2. SQL Tuning Advisor:
    • Use Oracle’s built-in tool for automated recommendations.
  3. Monitor Historical Plans:
    • Query DBA_HIST_SQL_PLAN for plan regressions.

Conclusion

Execution plans are the cornerstone of Oracle query optimization. By systematically analyzing operations like full scans, inefficient joins, and sorting steps, developers can achieve dramatic performance improvements. Key takeaways:

  • Index Strategically: Align indexes with query predicates.
  • Update Statistics Regularly: Ensure accurate cardinality estimates.
  • Test Incrementally: Validate changes with A/B plan comparisons.



Further Reading