Thiết kế website giá rẻ

Question

I am trying to run this query and it takes around 76 seconds without any index, now I added an index in it and now it takes around 80 seconds.

This table has 24k records as of now.

Here is the DDL :

CREATE TABLE personal_details (
    pno VARCHAR(20) PRIMARY KEY,   -- Personal number or unique ID
    first_name VARCHAR(20),        -- First name
    middle_name VARCHAR(20),       -- Middle name (nullable if not always used)
    last_name VARCHAR(20),         -- Last name
    dob DATE,                      -- Date of birth (YYYY-MM-DD format)
    gender CHAR(1),                -- Gender (M/F/O)
    email VARCHAR(50),             -- Email address
    phone_number VARCHAR(20),      -- Phone number (variable format, max 20 chars)
    address VARCHAR(255),          -- Address (full address string)
    city VARCHAR(50),              -- City
    state VARCHAR(50),             -- State
    zip_code VARCHAR(10),          -- Postal or zip code
    country VARCHAR(50),           -- Country
    marital_status VARCHAR(20),    -- Marital status (Single/Married/etc.)
    nationality VARCHAR(50),       -- Nationality
    occupation VARCHAR(50),        -- Job title or occupation
    salary DECIMAL(10, 2),         -- Salary (up to 10 digits, 2 decimal places)
    hire_date DATE,                -- Hire date (YYYY-MM-DD format)
    department VARCHAR(50),        -- Department name
    is_active BOOLEAN              -- Status flag for active/inactive (True/False)
);

Here is the query that takes 76 seconds:


SELECT
    pd1.pno AS pno1,
    pd1.first_name AS first_name1,
    pd1.last_name AS last_name1,
    pd2.pno AS pno2,
    pd2.first_name AS first_name2,
    pd2.last_name AS last_name2,
    pd3.pno AS pno3,
    pd3.first_name AS first_name3,
    pd3.last_name AS last_name3
FROM
    personal_details pd1
JOIN
    personal_details pd2 ON pd1.city = pd2.city
JOIN
    personal_details pd3 ON pd2.state = pd3.state
WHERE
    pd1.dob BETWEEN '1980-01-01' AND '1990-12-31'
    AND pd2.gender = 'M'
    AND pd3.salary > 50000
ORDER BY
    pd1.pno, pd2.pno, pd3.pno;

Here are the indexes that I created (tried and tested with few )

CREATE INDEX idx_pno ON personal_details (pno);

CREATE INDEX idx_composite_query ON personal_details (city, state, dob, gender, salary);

CREATE INDEX idx_city ON personal_details (city);

CREATE INDEX idx_state ON personal_details (state);

CREATE INDEX idx_dob ON personal_details (dob);

CREATE INDEX idx_gender ON personal_details (gender);

CREATE INDEX idx_salary ON personal_details (salary);

Even after creating these indexes the query took a lot of time( around 80 seconds which is more time than the normal query)

I am trying to understand how indexes work, that why I created this query which self-joins itself, I was expecting the execution time to go down but it increased.

I tried creating single indexes, and composite indexes but all gave me almost the same results.

Tried reading a few articles and made changes according to it but to no avail, Kind of lost here.

I also tried to Analyze the query using EXPLAIN ANALYZE

EXPLAIN
"-> Sort: pd1.pno, pd2.pno, pd3.pno  (actual time=19130..19461 rows=2.25e+6 loops=1)
    -> Stream results  (cost=346436 rows=105833) (actual time=129..1905 rows=2.25e+6 loops=1)
        -> Inner hash join (pd3.state = pd2.state)  (cost=346436 rows=105833) (actual time=129..634 rows=2.25e+6 loops=1)
            -> Filter: (pd3.salary > 50000)  (cost=50.4 rows=813) (actual time=0.0498..23.2 rows=22872 loops=1)
                -> Table scan on pd3  (cost=50.4 rows=24394) (actual time=0.0433..19.1 rows=24000 loops=1)
            -> Hash
                -> Nested loop inner join  (cost=7124 rows=3905) (actual time=0.105..62.5 rows=4916 loops=1)
                    -> Filter: ((pd2.gender = 'M') and (pd2.city is not null))  (cost=2633 rows=2439) (actual time=0.058..20.3 rows=7968 loops=1)
                        -> Table scan on pd2  (cost=2633 rows=24394) (actual time=0.0541..17.6 rows=24000 loops=1)
                    -> Index lookup on pd1 using idx_composite_query (city=pd2.city), with index condition: (pd1.dob between '1980-01-01' and '1990-12-31')  (cost=0.4 rows=1.6) (actual time=0.00432..0.00515 rows=0.617 loops=7968)
"

This is what was shown to me. This query is just for educational purpose for my self learning

Thiết kế website giá rẻ

Danh mục

DB Index running slower than intended