Why is my page load time so closely correlated with number of database queries?

Whenever I’m doing web development, and a page takes longer than half a second to be generated, I know that somewhere my code is hitting the DB too many times. The normal way to fix this situation is to ask the DB for all the information all at once instead, by doing JOINs and the like.

My question is: Why do many database queries make a page slow? There must be considerable overhead to each query, but what is it?

EDIT: Alright, let’s take an example (it’s a bit silly and small, but it’ll do)

people table:

| name | football_team_id | 
+------+------------------+
| jim  | 1                |
| mike | 3                |
| carl | 2                |

football_team table:

| id | color |
+----+-------+
| 1  | red   |
| 2  | blue  |
| 3  | green |

We all know that this is slow:

SELECT name, football_team_id FROM people;
-- start rendering the page, realise we need colors
SELECT color FROM football_team WHERE id = 1;
-- oops, need mike's color
SELECT color FROM football_team WHERE id = 3;
-- oh, and carl's
SELECT color FROM football_team WHERE id = 2;

This is a bit better:

SELECT name, football_team_id FROM people;
SELECT id, color FROM football_team WHERE id IN (1, 3, 2);

This is best:

SELECT name, football_team_id, color FROM people JOIN football_team ON people.football_team_id = football_team.id;

In each example we’re getting the same amount of data, but the last one is easily the fastest.
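To make the comparison concrete, here is a minimal sketch of the three approaches using Python's built-in sqlite3 module and an in-memory database. The function names are made up for illustration, and an in-memory SQLite database has no network hop at all, so this only counts the queries issued; it does not reproduce the timings you would see against a real database server.

```python
import sqlite3

def make_db():
    """Build an in-memory database holding the two example tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE football_team (id INTEGER PRIMARY KEY, color TEXT);
        CREATE TABLE people (name TEXT, football_team_id INTEGER);
        INSERT INTO football_team VALUES (1, 'red'), (2, 'blue'), (3, 'green');
        INSERT INTO people VALUES ('jim', 1), ('mike', 3), ('carl', 2);
    """)
    return conn

def one_query_per_person(conn):
    """Slow pattern: one query for the people, then one more per person."""
    people = conn.execute("SELECT name, football_team_id FROM people").fetchall()
    queries = 1
    result = []
    for name, team_id in people:
        row = conn.execute(
            "SELECT color FROM football_team WHERE id = ?", (team_id,)
        ).fetchone()
        queries += 1
        result.append((name, row[0]))
    return result, queries

def two_queries(conn):
    """Better: fetch the people, then all needed colors with one IN query."""
    people = conn.execute("SELECT name, football_team_id FROM people").fetchall()
    ids = sorted({team_id for _, team_id in people})
    placeholders = ",".join("?" for _ in ids)
    colors = dict(conn.execute(
        f"SELECT id, color FROM football_team WHERE id IN ({placeholders})", ids
    ).fetchall())
    return [(name, colors[team_id]) for name, team_id in people], 2

def single_join(conn):
    """Best: let the database combine both tables in a single query."""
    rows = conn.execute(
        "SELECT name, color FROM people "
        "JOIN football_team ON people.football_team_id = football_team.id"
    ).fetchall()
    return rows, 1
```

All three return the same name and color pairs; the only difference is the number of queries issued (4, 2 and 1 for the three-person table), and the first count grows with every extra person.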

You wouldn’t expect the same behaviour if you were reading from a file descriptor, for example.

I’ve profiled a number of applications and I have found that:

Creating a database connection is usually the most expensive operation (roughly 700 to 1500+ ms on many major databases).
On the database server, most simple queries like the ones you listed in your question take very little time to execute (roughly 1 to 20 ms, measured on the server).
A good portion of the time is spent transferring data from the database to the web page (about 100 to 300 ms per simple query).

Armed with this information, if you aren’t currently caching connections then now is a great time to start. You can see that the actual time to execute a query really is negligible. The problem is the time to actually get the data back to your web app.
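As a rough illustration of what caching connections means, here is a minimal pool sketch in Python. The ConnectionPool class and the connect helper are hypothetical names for this example, and SQLite stands in for a real database server; in practice you would normally use the pooling that your driver or framework already provides.

```python
import queue
import sqlite3
from contextlib import contextmanager

class ConnectionPool:
    """Open a fixed number of connections once and hand them out on demand."""

    def __init__(self, size, connect):
        self._idle = queue.Queue()
        self.created = 0
        for _ in range(size):
            self._idle.put(connect())  # the expensive step happens only here
            self.created += 1

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing it

def connect():
    # Assumption: SQLite stands in for a real server connection.
    # check_same_thread=False lets a pooled connection be used by other threads.
    return sqlite3.connect(":memory:", check_same_thread=False)
```

A request handler would then wrap its queries in `with pool.connection() as conn:`, so the connection cost is paid a fixed number of times at startup rather than once per page.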

So what’s going on?

You’ll find that most database protocols are very “chatty”. Basically, they send bytes back and forth so that the database and the client know they are still present, that the client has proper permissions, etc. In some cases there is also overhead when cursors are shared between server and client.

Your database server returns results in chunks, and the driver may have to send acknowledgements to let the server know that each chunk was received properly. The driver then needs to take these chunks and represent them in a way your application can use. All this processing takes time.

All communications have a couple of properties that affect transmission time:

Latency: the delay between the time a packet is sent and the time it is received.
Transmission Speed: the number of bits/bytes per second that the wire can support.

The more firewalls, routers, and other infrastructure devices you have between your app and database, the higher the latency. Transmission speed is something we are more familiar with, because we know our servers are connected with 10baseT, 100baseT, or 1000baseT Ethernet (10, 100, and 1000 million bits per second respectively).

If you have high bandwidth, then once the data moves, it moves very quickly. High latency, however, can make communication with the database much slower than it should be, because of the many small packets moving back and forth between the database and the application.
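A back-of-the-envelope model shows why latency dominates when there are many small exchanges. The function below is a deliberate simplification with made-up parameter names: it assumes each round trip costs one fixed round-trip delay (roughly twice the one-way latency), that the payload moves at the full link speed, and it ignores protocol overhead and server execution time.

```python
def estimated_transfer_seconds(round_trips, round_trip_latency_s,
                               payload_bytes, bandwidth_bits_per_s):
    """Time spent waiting on round trips plus time spent moving the payload."""
    waiting = round_trips * round_trip_latency_s
    moving = (payload_bytes * 8) / bandwidth_bits_per_s
    return waiting + moving

# Illustrative numbers only: 100 kB in total over a 100 Mbit/s link with a
# 1 ms round trip, fetched either in 100 round trips or in a single one.
many_small = estimated_transfer_seconds(100, 0.001, 100_000, 100_000_000)
one_big = estimated_transfer_seconds(1, 0.001, 100_000, 100_000_000)
```

With these made-up numbers the payload itself takes 8 ms either way; the 100-round-trip version spends a further 100 ms just waiting, against 1 ms for the single round trip.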

How do you deal with it?

One of the best ways to minimize the cost of dealing with the database is to minimize the number of times you call the database. Additionally, you’ll want to make sure you are only getting the data you actually need to display.

In some cases you can use some intelligent caching so you just don’t have to hit the database at all for some parts of the pages you have to render.
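As one possible form of such caching, here is a small sketch that memoizes a lookup with Python's functools.lru_cache. The team_color function and the in-memory SQLite table are assumptions for the example, and a real application also has to decide when cached values become stale (lru_cache only offers cache_clear() for that).

```python
import functools
import sqlite3

# Assumption: an in-memory SQLite table stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE football_team (id INTEGER PRIMARY KEY, color TEXT);
    INSERT INTO football_team VALUES (1, 'red'), (2, 'blue'), (3, 'green');
""")

@functools.lru_cache(maxsize=128)
def team_color(team_id):
    """Query the database only the first time a given team_id is requested."""
    row = conn.execute(
        "SELECT color FROM football_team WHERE id = ?", (team_id,)
    ).fetchone()
    return row[0] if row else None

# Six lookups, but only three of them reach the database.
colors = [team_color(team_id) for team_id in (1, 3, 2, 1, 3, 2)]
info = team_color.cache_info()  # reports hits and misses so far
```

Here `info.misses` counts the lookups that actually went to the database and `info.hits` counts the ones answered from memory.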

Why do many database queries make a page slow?
Why does a large number of anything make a page slow?

Do something once and it takes “some amount” of time.

Do the same thing a thousand times and yes; it’s going to take [roughly] a thousand times as long. There’s no magic here. Unless you start parallelising and multi-threading your programs, everything’s going to get done “one thing after another”.

Yes; getting a database connection and using it does have an overhead, and things like connection pooling serve to diminish the impact, but the more times you go to the database, the longer things are going to take.

Also, watch out for the amount of data you’re pulling back. “select *” seems to be making something of a comeback in the “Newbie” coding communities at the moment. Great if your table has three columns and you want all three of them. Not so good if you only want three columns but your table has “somehow” acquired another twelve, all of them massive text fields!
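To put a rough number on that, here is a small sketch comparing how much text comes back from “select *” versus naming the columns you need. The biography and notes columns are invented for the example, and the function counts characters in the fetched rows as a crude stand-in for bytes on the wire.

```python
import sqlite3

def make_wide_table():
    """A people table that has acquired two large text columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (name TEXT, football_team_id INTEGER, "
        "biography TEXT, notes TEXT)"
    )
    big_text = "x" * 10_000  # hypothetical 'massive text field'
    conn.executemany(
        "INSERT INTO people VALUES (?, ?, ?, ?)",
        [("jim", 1, big_text, big_text),
         ("mike", 3, big_text, big_text),
         ("carl", 2, big_text, big_text)],
    )
    return conn

def characters_fetched(conn, sql):
    """Total number of characters across every value the query returns."""
    rows = conn.execute(sql).fetchall()
    return sum(len(str(value)) for row in rows for value in row)

conn = make_wide_table()
everything = characters_fetched(conn, "SELECT * FROM people")
needed = characters_fetched(conn, "SELECT name, football_team_id FROM people")
```

For these three rows, the “select *” version pulls back tens of thousands of characters, while the two named columns amount to a handful.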

(Remember; you’re not the only user of “your” database).

Generally it’s what we call “on-the-wire overhead”. In a lot of servers, the database is located on a different machine than the server app. (This has scalability benefits, among other things.) That means that any database call has to go over a network connection, and all results have to be pushed back over the network. The cost of that overhead, even if the machine is only sitting a few feet away from the one hosting the server app, can add up very quickly.


Filed under: softwareengineering - @ 10:11

Tags: database, web-applications
