ExchangeDEX+

Buy Crypto Markets Spot Futures500X Earn Events

The "bad" correlated subquery outperformed the "good" JOIN. The subquery triggered a Nested Loop plan with fast index lookups (25 quick searches)The "bad" correlated subquery outperformed the "good" JOIN. The subquery triggered a Nested Loop plan with fast index lookups (25 quick searches)

Is JOIN Faster Than Correlated Subqueries? Taking a Look and Subsequently Debunking the Myth

By: Hackernoon

2025/11/20 02:55

FASTER$0,0005901+13,48%

LOOK$0,02233-13,07%

MYTH$0,0431-10,02%

BAD$0,00000000186-11,42%

LOOP$0,01004-4,47%

Hey there, fellow developers! If you've ever dabbled in SQL, you've probably heard the golden rule: "Never use correlated subqueries in SELECT—they're a recipe for N+1 disasters!" Instead, we're told to always opt for JOINs because they're set-based, efficient, and lightning-fast.

\ But is this rule set in stone? I decided to put it to the test across four popular database systems: MySQL 8.0, Oracle 23c, PostgreSQL 16, and SQLite 3.45. Spoiler alert: The results were eye-opening. Sometimes, the "bad" correlated subquery outperformed the "good" JOIN. Let's dive in and see why.

The Test Setup: Customers and Orders

To keep things fair, I used a simple schema with two tables:

Customers: A small table with 25 rows of customer data.

Orders: A larger table with 1,000 rows of orders, linked via a foreign key. The goal? Count the number of orders per customer, including those with zero orders.

\ Here's the schema (using MySQL syntax for reference):

-- Table of customers CREATE TABLE customers ( customer_id INT PRIMARY KEY, name VARCHAR(255) NOT NULL ); -- Table of orders CREATE TABLE orders ( order_id INT PRIMARY KEY, customer_id INT, order_date DATETIME, FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ON DELETE CASCADE );

Data was populated with random values to simulate real-world scenarios.

The Two Queries: JOIN vs. Correlated Subquery

I compared two approaches to achieve the same result.

The "Good" Way – JOIN + GROUP BY - This is the set-based, relational approach everyone loves:

\

SELECT c.customer_id, COUNT(o.order_id) AS orders_count FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id;

Pros: Handles all customers, even those without orders.
Theory: One optimized operation to join and aggregate.

The "Bad" Way – Correlated Subquery This is the row-by-row method we're warned against:

\

SELECT c.customer_id, (SELECT COUNT(o.order_id) FROM orders o WHERE o.customer_id = c.customer_id) AS orders_count FROM customers c;

Pros: Also includes customers with zero orders.
Theory: Executes a subquery for each customer—classic N+1 problem.

Testing Across Databases: The Results

I ran both queries on online SQL testers (links provided below) and analyzed execution times and plans using EXPLAIN. Here's what happened.

MySQL 8.0: Subquery Wins!

Execution Times: Subquery ~14 ms vs. JOIN ~16 ms.

\ Why? The subquery triggered a Nested Loop plan with fast index lookups (25 quick searches). JOIN used Hash Join + Aggregate, which was overkill for small data.

\ Key Insight: With an index on orders.customer_id, the subquery wasn't N+1—it was efficient Nested Loops.

\ Test Link: MySQL Tester

Oracle 23c: Subquery Dominates!

Execution Times: Subquery ~2.4 ms vs. JOIN ~15 ms.

\ Why? Similar to MySQL—Nested Loop for subquery vs. Hash Join for JOIN. The subquery avoided heavy aggregation overhead.

\ Key Insight: Indexes are crucial; without them, Oracle falls back to full scans.

\ Test Link: Oracle Tester

PostgreSQL 16: JOIN Takes the Lead

Execution Times: JOIN ~0.6 ms vs. Subquery ~1.9 ms.

\ Why? PostgreSQL's optimizer rewrote the subquery into a JOIN-like plan, but the explicit JOIN was slightly faster. Subquery showed 25 sub-plan executions (mild N+1).

\ Key Insight: PostgreSQL is smart—indexes level the playing field.

\ Test Link: PostgreSQL Tester

SQLite 3.45: A Tie!

Execution Times: Both ~1 ms.

\ Why? Plans were nearly identical: SCAN on customers + SEARCH on orders via index. No N+1 effect.

\ Key Insight: SQLite's simplicity made both queries efficient; choose based on readability.

\ Test Link: SQLite Tester

Key Takeaways: No Silver Bullet

The "JOIN is always faster" myth crumbles because performance depends on:

Database Optimizer: PostgreSQL rewrites queries; MySQL/Oracle follow your syntax more literally.
Data Size: Small outer tables (like our 25 customers) favor Nested Loops; large ones benefit from Hash Joins.Indexes: Without an index on orders.customer_id, subqueries tank. With it, they shine.
Bottom Line: Don't blindly follow rules. Always run EXPLAIN (or EXPLAIN ANALYZE) to see the actual execution plan. Test with your data!

\ What are your experiences with JOINs vs. subqueries? Drop a comment below!

\ This article is based on real testing and analysis. Links to testers are provided for you to verify the results.

Disclaimer: The articles reposted on this site are sourced from public platforms and are provided for informational purposes only. They do not necessarily reflect the views of MEXC. All rights remain with the original authors. If you believe any content infringes on third-party rights, please contact service@support.mexc.com for removal. MEXC makes no guarantees regarding the accuracy, completeness, or timeliness of the content and is not responsible for any actions taken based on the information provided. The content does not constitute financial, legal, or other professional advice, nor should it be considered a recommendation or endorsement by MEXC.