Why is polling accepted in web programming?

I am currently working on a Ruby on Rails project which shows a list of images.

A must-have for this project is that it shows new posts in realtime without the need to refresh the web page. After searching for a while, I’ve stumbled upon some JavaScript solutions and services such as PubNub; however, none of the provided solutions made sense to me.

In the JavaScript solution (polling) the following happens:

User 1 views the list of photos.
In the background the JavaScript code is polling an endpoint every second to see if there is a new post.
User 2 adds a new photo.
There is a delay of 50 ms before the new cycle is triggered and fetches the new data.
The new content is loaded in the DOM.
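
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the code of any particular library: `createPoller`, `fetchNewPosts` and `onNewPosts` are made-up names, and the fetch function is passed in so the sketch does not assume a specific endpoint.

```javascript
// Simple polling: ask the server for new posts at a fixed interval.
// `fetchNewPosts(sinceId)` is a hypothetical function that returns a promise
// of an array of posts ({ id, url }) newer than `sinceId`.
function createPoller(fetchNewPosts, onNewPosts) {
  let lastSeenId = 0;

  // One polling cycle: fetch, hand over anything new, remember the newest id.
  async function pollOnce() {
    const posts = await fetchNewPosts(lastSeenId);
    if (posts.length > 0) {
      lastSeenId = Math.max(...posts.map((p) => p.id));
      onNewPosts(posts);
    }
    return posts.length;
  }

  // Start polling every `intervalMs`; returns a function that stops it.
  function start(intervalMs) {
    const timer = setInterval(() => {
      pollOnce().catch((err) => console.error('poll failed:', err));
    }, intervalMs);
    return () => clearInterval(timer);
  }

  return { pollOnce, start };
}
```

In a browser, `fetchNewPosts` might be something like `(id) => fetch('/photos?since=' + id).then((r) => r.json())`, where the `/photos?since=` endpoint is an assumption about the Rails side, and `poller.start(1000)` would give the once-per-second cycle described above. Note that most cycles return an empty list and do nothing.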

This seems odd when translated to a real-world example:

User 1 holds a pile of pictures on his/her desk.
He/she walks to the photographer every second and asks if he has a new one.
The photographer makes a new photo.
The next second, when he/she walks in, he/she can take the picture and put it on the pile.

In my opinion the solution should be as follows:

User 1 holds a pile of pictures on his/her desk.
The photographer takes a new picture.
The photographer walks to the pile and puts it with the rest.

The PubNub solution is basically the same; however, this time there is an intern walking between the parties to share the data.

Needless to say, both solutions are very energy-consuming, as they are triggered even when there is no data to load.

As far as I know, there is no (logical) explanation for why this kind of implementation is used in almost every realtime application.

Pushing works well for one user, or a limited number of users.

Now change the scenario to one photographer and 1000 users who all want a copy of the picture. The photographer will have to walk to 1000 piles. Some of them might be in a locked office, or spread all over the floor. Or their user might be on vacation and not interested in new pictures at the moment.

The photographer would be busy walking all the time and would never take new pictures.

Fundamentally: a pull/poll model scales better to lots of unreliable readers with loose realtime requirements (if a picture takes 10 seconds longer to arrive on a pile, what’s the big deal?).

That said, a push model is still better in a lot of situations. If you need low latency (you need that new photo 5 seconds after it’s taken), or updates are rare while requests are frequent and predictable (you keep asking the photographer every 10 seconds when he produces one new picture a day), then pulling is inappropriate. It depends on what you’re trying to do. NASDAQ: push. Weather service: pull. Wedding photographer: probably pull. News photo agency: probably push.
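
To put rough numbers on the "rare updates, frequent requests" case, here is a back-of-the-envelope sketch. The figures (1000 users, a poll every 10 seconds, one picture a day) come from the scenario above; the function names are made up, and real traffic would also include keep-alives, retries and so on.

```javascript
// Rough message counts per day for polling versus pushing.
const SECONDS_PER_DAY = 24 * 60 * 60;

// Each client sends one request per interval, whether or not anything changed.
function pollRequestsPerDay(clients, intervalSeconds) {
  return clients * Math.floor(SECONDS_PER_DAY / intervalSeconds);
}

// The server sends one notification per client per update.
function pushMessagesPerDay(clients, updatesPerDay) {
  return clients * updatesPerDay;
}

console.log(pollRequestsPerDay(1000, 10)); // 8640000
console.log(pushMessagesPerDay(1000, 1));  // 1000
```

With these assumptions, polling generates several thousand times more messages than pushing, almost all of them answered with "nothing new". With frequent updates and a long polling interval the ratio shifts the other way.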

I’m really surprised that only one person has mentioned WebSockets. Support is implemented in basically every major browser.

In fact PubNub uses them. For your application the browser would probably subscribe to a socket that would broadcast whenever a new photo is available.
The socket wouldn’t send the photo, mind you, but just a link so the browser could download it asynchronously.

In your example imagine something like:

The user lets the photographer know that he/she wants to hear about all future photos.
The photographer announces over a loudspeaker that a new photo is available.
The user asks the photographer for the photo.

This is somewhat like your original example solution. It’s more efficient than polling because the client doesn’t have to send any data to the server (except maybe heartbeats).
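
A minimal sketch of the client side of that loudspeaker model, assuming the server announces new photos as JSON messages of the form `{ "type": "new_photo", "url": "..." }`. That message shape and the `subscribeToPhotos` name are assumptions for illustration; only `addEventListener('message', ...)` and `event.data` come from the browser WebSocket interface.

```javascript
// Push via a WebSocket: the server announces a new photo, the client fetches it.
// `socket` is anything with the browser WebSocket's
// addEventListener('message', ...) interface.
function subscribeToPhotos(socket, onPhotoUrl) {
  socket.addEventListener('message', (event) => {
    let message;
    try {
      message = JSON.parse(event.data);
    } catch (err) {
      return; // ignore anything that is not JSON
    }
    // Assumed message shape: { type: 'new_photo', url: '...' }
    if (message && message.type === 'new_photo' && typeof message.url === 'string') {
      onPhotoUrl(message.url); // the photo itself is downloaded separately
    }
  });
}
```

In a browser this might be used as `subscribeToPhotos(new WebSocket('wss://example.com/photos'), (url) => { /* add an <img> for url */ })`, where the `wss://` address is a placeholder.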

Also, as others have mentioned, there are other methods that are better than simple polling and that work in older browsers (long polling, etc.).

Sometimes good enough is good enough.

Of all the possible ways to implement a “real-time” communications process, polling is perhaps the simplest way. Polling can be used effectively when the polling interval is relatively long (i.e. seconds, minutes or hours rather than instantaneous), and the clock cycles consumed by checking the connection or resource don’t really matter.

The HTTP protocol is limited in that the client MUST be the one to initiate the request. The server cannot communicate with the client unless responding to a client’s request.

So to adjust your real-world example, add the following constraint:

User 2 can ONLY respond to User 1’s questions with a single-sentence reply, after which User 1 must leave. User 2 has no other way of communicating.

With this new constraint, how would you do it other than by polling?

Why is polling accepted? Because in reality every solution is actually low-level polling!

If the server should update you as soon as new pictures are available, it usually has to hold a connection to you, because IP addresses change often and the server never knows whether a client has lost interest. So the client has to send some form of keep-alive signal, for example, “I’m still here, I’m not offline.”

All stateful connections (for example, TCP/IP) work the same way: since you can only send individual data packets over the Internet, you never know if the other party is still there.

So every protocol has a timeout. If an entity doesn’t answer within X seconds, it is presumed to be dead. So even if you only have an open connection between server and client, without sending any data, the server and client have to send regular keep-alive packets (this is handled at a low level once you open a connection between them). How is this, in the end, any different from polling?
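
A minimal sketch of that timeout bookkeeping, written at the application level for illustration. Real protocol stacks do this differently and at a lower level; the class and method names here are made up, and timestamps are passed in explicitly so the logic can be followed without real timers.

```javascript
// Keep-alive bookkeeping: a peer that has been silent for longer than
// `timeoutMs` is presumed dead.
class KeepAliveTracker {
  constructor(timeoutMs) {
    this.timeoutMs = timeoutMs;
    this.lastSeen = new Map(); // clientId -> timestamp of last message
  }

  // Call whenever any message (data or keep-alive) arrives from a client.
  touch(clientId, now) {
    this.lastSeen.set(clientId, now);
  }

  // Remove and return the clients that have been silent for too long.
  expire(now) {
    const dead = [];
    for (const [clientId, seenAt] of this.lastSeen) {
      if (now - seenAt > this.timeoutMs) {
        dead.push(clientId);
        this.lastSeen.delete(clientId);
      }
    }
    return dead;
  }
}
```

Seen this way, every "I'm still here" packet plays the same role as a poll that happens to carry no question.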

So the best approach would probably be long polling:

The client sends a request immediately after loading the site (for example, telling the photographer “Tell me if there are any new pictures”), but the server doesn’t answer if there aren’t any new pictures. As soon as the request times out, the client asks again.

If the server now has any new pictures, it can immediately answer all the clients that are waiting in line for new pictures. So your reaction time after a new picture is even shorter than with push, since the client is already waiting in an open connection for a reply and you don’t have to build up a connection to the client. And the polling requests from the client are not much more traffic than a constant connection between client and server!
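
The client side of long polling described above might look roughly like this. It is a sketch under assumptions: `waitForPhotos(sinceId)` stands for a hypothetical request that the server holds open until there are new photos or its timeout fires (in which case it resolves to an empty array), and `shouldContinue` is an illustrative hook for stopping the loop.

```javascript
// Long polling: send a request, wait for the server to answer, ask again.
async function longPoll(waitForPhotos, onPhotos, shouldContinue) {
  let lastSeenId = 0;
  while (shouldContinue()) {
    let photos;
    try {
      photos = await waitForPhotos(lastSeenId);
    } catch (err) {
      // Network error: pause briefly before asking again to avoid a tight loop.
      await new Promise((resolve) => setTimeout(resolve, 1000));
      continue;
    }
    if (photos.length > 0) {
      lastSeenId = Math.max(...photos.map((p) => p.id));
      onPhotos(photos);
    }
    // No delay here: the next request goes out as soon as this one returns.
  }
}
```

The difference from simple polling is that the waiting happens on the server while the request is open, so an idle period costs one request per server timeout instead of one per polling interval.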

One advantage of polling is that it limits the harm that can be caused if a message goes missing or the state of something gets glitched. If X asks Y for its state once every five seconds, then the loss of a request or a reply will merely result in X’s information being ten seconds out of date rather than five. If Y gets rebooted, X can find out about it the next time Y is able to respond to one of X’s messages. If X gets rebooted, it might never bother asking Y for anything afterward, but whoever is observing the status of X should recognize that it has been rebooted.

If instead of X polling Y, X relied upon Y to inform it whenever its state changed, then if Y’s state changed and it sent a message to X, but for whatever reason that message was not received, X might never become aware of the change. Likewise if Y gets rebooted and never has any reason to send X a message about anything.

In some cases it may be helpful for X to request that Y autonomously send messages with its status, either periodically or when it changes, and only have X poll if it goes too long without hearing anything from Y. Such a design may eliminate the need for X to send most of its messages (typically, X should at least occasionally inform Y that it’s still interested in receiving messages, and Y should stop sending messages if it goes too long without any indication of interest). Such a design would, however, require Y to persistently maintain information about X, rather than being able to simply send a reply to whoever polled it and then immediately forget about who that was. If Y is an embedded system, being able to forget its pollers in this way may reduce memory requirements enough to allow the use of a smaller and cheaper controller.
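
X's side of such a hybrid design could be sketched like this. The names (`StatusWatcher`, `sendPoll`, `maxSilenceMs`) are invented for illustration, and timestamps are passed in explicitly rather than read from a clock.

```javascript
// Hybrid model: X normally listens for Y's unsolicited reports and only
// polls when Y has been silent for longer than `maxSilenceMs`.
class StatusWatcher {
  constructor(maxSilenceMs, sendPoll, now) {
    this.maxSilenceMs = maxSilenceMs;
    this.sendPoll = sendPoll; // hypothetical function that sends a status request to Y
    this.lastHeardAt = now;
    this.status = undefined;
  }

  // Y sent a report, either on its own or in reply to a poll.
  onReport(status, now) {
    this.status = status;
    this.lastHeardAt = now;
  }

  // Called periodically; polls only if Y has been quiet too long.
  // Returns true if a poll was sent.
  check(now) {
    if (now - this.lastHeardAt > this.maxSilenceMs) {
      this.sendPoll();
      return true;
    }
    return false;
  }
}
```

If a poll goes unanswered, the next `check` simply sends another one, so a lost message costs at most one extra round.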

Polling can have an additional advantage when using a potentially unreliable communications medium (e.g. UDP or radio): it can largely eliminate the need for link-layer acknowledgments. If X sends Y a status request Q, Y responds with a status report R, and X hears R, X won’t need to hear any sort of link-layer acknowledgment for Q to know that it was received. Conversely, once Y sends R, it doesn’t need to know or care if X received it. If X sends a status request and gets no response, it can send another. If Y sends a report and X doesn’t hear it, X will send another request. If each request goes out once and either yields a response or doesn’t, neither party needs to know or care whether any particular message was received. Since sending an acknowledgment may consume almost as much bandwidth as a status request or report, using a round-trip of request-report doesn’t cost much more than would an unsolicited report and acknowledgment. If X sends a few requests without getting replies, it may on some dynamically routed networks need to enable link-level acknowledgments (and ask in its request that Y do likewise) so that the underlying protocol stack can recognize the delivery problem and search for a new route, but when things are working a request-report model will be more efficient than using link-level acknowledgments.

The question is how to balance the number of unnecessary polls against the number of unnecessary pushes.

If you poll:

You get an answer at this very moment. Good if you ask only occasionally or need a data set this very moment.
You might get a “no content” answer, causing pointless load on the line.
You put load on the line only when you poll, but always when you poll.

If you push:

You deliver the answer right when it is available, which allows immediate processing on the client side.
You might deliver data to clients that are not interested in this data, causing pointless load on the line.
You put load on the line every time there is new data, but only when there is new data.

There are several ways to deal with the various scenarios and their disadvantages: for example, a minimum time between polls, poll-only proxies to take the load off the main system, or, for pushes, a rule that clients register and specify the data they want, then unregister on log-off. Which one fits best cannot be said in general; it depends on the system.
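
The register/unregister idea for pushes can be sketched as a small in-memory registry. This is an illustration only: `SubscriptionRegistry` and its methods are made-up names, and the `deliver` callback stands in for whatever actually sends data to a client.

```javascript
// Push with registration: clients register for the topics they want and
// unregister on log-off, so data is only pushed to interested clients.
class SubscriptionRegistry {
  constructor() {
    this.topics = new Map(); // topic -> Map(clientId -> deliver callback)
  }

  register(clientId, topic, deliver) {
    if (!this.topics.has(topic)) {
      this.topics.set(topic, new Map());
    }
    this.topics.get(topic).set(clientId, deliver);
  }

  // Log-off: forget the client everywhere.
  unregister(clientId) {
    for (const clients of this.topics.values()) {
      clients.delete(clientId);
    }
  }

  // Push `data` to every client registered for `topic`;
  // returns how many clients were notified.
  publish(topic, data) {
    const clients = this.topics.get(topic);
    if (!clients) return 0;
    for (const deliver of clients.values()) {
      deliver(data);
    }
    return clients.size;
  }
}
```

The registry only stays accurate as long as clients actually unregister; a client that vanishes without doing so remains listed until something else, such as a timeout, removes it.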

In your example polling is not the most efficient solution, but the most practical one. It is very easy to write a polling system in JavaScript, and it is very easy to implement it on the delivery side as well. A server made to deliver image data should be able to handle the extra requests, and if not, it can be scaled linearly, as the data is mostly static and can therefore be easily cached.

A push method implementing a log-in, a description of the wanted data, and finally a log-off would be most efficient, but it is probably too complex for the average “script kiddie”, and it has to deal with the question: what if the user just closes the browser and the log-off cannot be performed?

Maybe it is better to have more users (as accessing is easy) than to save some bucks on another cache-server?

For some reason, these days, all the younger web developers seem to have forgotten the lessons of the past, and why some things have evolved the way they did.

Bandwidth was an issue.
Connections might be intermittent.
Browsers did not have as much computing power.
There were other methods of accessing content. The web is not w3.

In the face of these constraints, you might not have constant two-way communication. And if you look at the OSI model, you’ll find that most of its considerations are meant to decouple persistence from the underlying connection.

With that in mind, a polling method of pulling information is a great way to reduce bandwidth and computation on the client side. What is called push is, for the most part, really just the client doing constant polling, or WebSockets. Personally, if I were everyone else out there, I’d appreciate the regularity of polling as a means of traffic analysis, where an out-of-time GET/POST request would signal a man-in-the-middle situation of some sort.

Filed under: softwareengineering - @ 01:24

Tags: logic, loops, polling
