HTTP/2 server push lets you send clients the assets you anticipate they’ll need in order to render the page. For example, you can push your CSS/JS/images to the client at the same time as you respond to a request for an HTML page, so the client receives those assets sooner.
But how do you avoid wasting bandwidth on clients that already have the assets cached?
Does the server push mechanism handle this situation automatically, and if so, how? Or does my server-side app need to track which clients have already been served a response, and avoid pushing extra assets to those clients?
Read the spec: http://http2.github.io/http2-spec/#PushResources. It notes that pushed responses that are cacheable can be stored by the client, while pushed responses that are not cacheable must not be cached.
If an asset is already in its cache, the client can refuse the duplicate by sending a RST_STREAM frame that references the promised stream: http://http2.github.io/http2-spec/#RST_STREAM.
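To make the refusal concrete, here is a minimal sketch of the client-side decision. It is a toy model, not a real HTTP/2 library: the `PushPromise`, `RstStream`, and `Client` names are hypothetical, and the cache is just a set of paths. Only the two error codes come from the spec, which allows either CANCEL or REFUSED_STREAM when rejecting a pushed stream.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Error codes defined by the HTTP/2 spec; either may be used to refuse a push.
REFUSED_STREAM = 0x7
CANCEL = 0x8


@dataclass
class PushPromise:
    """Hypothetical stand-in for a received PUSH_PROMISE frame."""
    promised_stream_id: int
    path: str


@dataclass
class RstStream:
    """Hypothetical stand-in for an outgoing RST_STREAM frame."""
    stream_id: int
    error_code: int


@dataclass
class Client:
    # Assumption: the cache is modelled as a set of already-stored paths.
    cache: Set[str] = field(default_factory=set)

    def on_push_promise(self, promise: PushPromise) -> Optional[RstStream]:
        """Return an RstStream to refuse the push, or None to accept it."""
        if promise.path in self.cache:
            # Already cached: reset the promised stream instead of receiving it again.
            return RstStream(promise.promised_stream_id, CANCEL)
        return None


client = Client(cache={"/style.css"})
refusal = client.on_push_promise(PushPromise(2, "/style.css"))   # refused with CANCEL
accepted = client.on_push_promise(PushPromise(4, "/app.js"))     # None, push accepted
```

Note that the server may already have started sending the pushed response by the time the RST_STREAM arrives, so refusing a push probably does not save all of the bandwidth.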
See Mark Nottingham’s blog for more details:
https://www.mnot.net/blog/2014/01/30/http2_expectations
He also indicates that server push will allow the server to proactively invalidate the client’s cache, i.e. push a new copy of an asset when it has changed on the server, which is a very useful feature.