I am working on a project, and after arguing with people at work for more than an hour, I decided to find out what people on Stack Exchange might say.
We’re writing an API for a system. There is a query that should return a tree of Organization or a tree of Goals.
The Organization tree is the organization in which the user is present; in other words, this tree should always exist. Within the organization, a tree of goals should always be present (that’s where the argument started). In the case where the tree doesn’t exist, my co-worker decided that it would be right to respond with status code 200, and then started asking me to fix my code because the application was falling apart when there is no tree.
I’ll try to spare flames and fury.
I suggested raising a 404 error when there is no tree. It would at least let me know that something is wrong. When using 200, I have to add a special check to my response in the success callback to handle errors. I’m expecting to receive an object, but I may actually receive an empty response because nothing is found. It sounds totally fair to mark the response as a 404. And then war started and I got the message that I didn’t understand the HTTP status code scheme. So I’m here asking: what’s wrong with 404 in this case? I even got the argument “It found nothing, so it’s right to return 200”. I believe that it’s wrong since the tree should always be present. If we found nothing and we are expecting something, it should be a 404.
More info,
I forgot to add the urls that are fetched.
Organizations
/OrgTree/Get
Goals
/GoalTree/GetByDate?versionDate=...
/GoalTree/GetById?versionId=...
My mistake, both parameters are required. If any versionDate that can be parsed to a date is provided, it will return the closest revision. If you enter something in the past, it will return the first revision. If you query by Id with an id that doesn’t exist, I suspect it’s going to return an empty response with 200.
Extra
Also, I believe the best answer to the problem is to create default objects when organizations are created. Having no tree shouldn’t be a valid case and should be seen as undefined behavior. There is no way an account can be used without both trees. For that reason, they should always be present.
Also, I got linked this (and a similar one that I can’t find):
http://viswaug.files.wordpress.com/2008/11/http-headers-status1.png
12
When in doubt, consult the documentation. Reviewing the W3C definitions for HTTP status codes gives us this:
200 OK – The request has succeeded. The information returned with the response is dependent on the method used in the request.
404 Not Found – The server has not found anything matching the Request-URI.
In the context of your API, it very much depends on how queries are created and how objects are retrieved. But, my interpretation has always been that:
- If I ask for a particular object and it exists, return a 200 code; if it doesn’t exist, return the correct 404 code.
- But, if I ask for a set of objects that match a query, a null set is a valid response and I want that returned with a 200 code. The rationale for this is that the query was valid, it succeeded and the query returned nothing.
So in this case you are correct: the service isn’t searching for “a specific thing”, it is requesting a particular thing, and if that thing isn’t found it should say so clearly.
I think Wikipedia puts it best:
200 OK – … The actual response will depend on the request method used. In a GET request, the response will contain an entity corresponding to the requested resource.
404 Not Found – The requested resource could not be found but may be available again in the future. Subsequent requests by the client are permissible.
Seems pretty clear to me.
Regarding the example requests
/GoalTree/GetByDate?versionDate=...
/GoalTree/GetById?versionId=...
For the former, you said you always return the nearest revision to that date. It will never not return an object, so it should always be returning 200 OK. Even if this were able to take a date range, and the logic were to return all objects within that timeframe, returning 200 OK – 0 Results is ok, as that is what the request was for: the set of things that met those criteria.
However, the latter is different, as you are asking for a specific object, presumably unique, with that identity. Returning 200 OK in this case is wrong, as the requested resource doesn’t exist and is not found.
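The difference between the two requests can be sketched as follows. This is only an illustration under stated assumptions, not the asker's implementation: the `REVISIONS` list, its dates, and the function names `get_by_date` and `get_by_id` are hypothetical, and each function returns a `(status, body)` tuple rather than a real HTTP response.

```python
from datetime import date
from http import HTTPStatus

# Hypothetical revision history; the ids and dates are made up for illustration.
REVISIONS = [
    {"versionId": 1, "versionDate": date(2012, 1, 1)},
    {"versionId": 2, "versionDate": date(2012, 6, 1)},
]


def get_by_date(version_date):
    """Return the revision closest to the date, so there is always a result."""
    closest = min(REVISIONS, key=lambda r: abs(r["versionDate"] - version_date))
    return HTTPStatus.OK, closest


def get_by_id(version_id):
    """Return one specific revision; an unknown id is reported as 404."""
    for revision in REVISIONS:
        if revision["versionId"] == version_id:
            return HTTPStatus.OK, revision
    return HTTPStatus.NOT_FOUND, {"error": f"no revision with id {version_id}"}
```

A date earlier than every revision resolves to the first revision, matching the behaviour described in the question, while an unknown id has nothing sensible to fall back to.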
Regarding choosing status codes
- 2xx codes Tell a User Agent (UA) that it did the right thing, the request worked. It can keep doing this in the future.
- 3xx codes Tell a UA what you asked probably used to work, but that thing is now elsewhere. In future the UA might consider just going to the redirect.
- 4xx codes Tell a UA it did something wrong, the request it constructed isn’t proper and shouldn’t try it again, without at least some modification.
- 5xx codes Tell a UA the server is broken somehow. But hey that query could work in the future, so there is no reason not to try it again. (except for 501, which is more of a 400 issue).
You mentioned in a comment using a 5xx code, but your system is working. It was asked a query that doesn’t work and needs to communicate that to the UA. No matter how you slice it, this is 4xx territory.
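As a rough sketch of how a client might act on these classes, the function below maps a status code to the advice described above. The function name and the returned labels are hypothetical, and the 1xx entry is not discussed in this answer; it is included only so the mapping covers every class.

```python
def advise_user_agent(status: int) -> str:
    """Return the rough advice each status class gives a user agent (UA)."""
    if not 100 <= status <= 599:
        raise ValueError(f"not an HTTP status code: {status}")
    if status == 501:
        # This answer treats 501 as closer to a client problem than a server fault.
        return "change-request"
    advice_by_class = {
        1: "continue",         # assumption: 1xx is not covered in the text above
        2: "keep-going",       # the request worked
        3: "follow-redirect",  # the thing is now elsewhere
        4: "change-request",   # do not retry without modification
        5: "retry-later",      # the server is broken, the query may work later
    }
    return advice_by_class[status // 100]
```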
Consider an alien querying our solar system
Alien: Computer, please tell me all planets that humans inhabit.
Computer: 1 result found. Earth
Alien: Computer, please tell me about Earth.
Computer: Earth – Mostly Harmless.
Alien: Computer, please tell me about all planets humans inhabit, outside the asteroid belt.
Computer: 0 results found.
Alien: Computer, please destroy Earth.
Computer: 200 OK.
Alien: Computer, please tell me about Earth.
Computer: 404 – Not Found
Alien: Computer, please tell me all planets that humans inhabit.
Computer: 0 results found.
Alien: Victory for the mighty Irken Empire!
15
Ignoring the fact that /GoalTree/Get* looks like a verb, not a resource, you should always return 200, because the URIs /GoalTree/Get* represent resources that are always available for access, and it’s not a client error if there’s no tree as a result of a request. Just return 200 with an empty set when there’s no entity to be returned.
You use 404 if the resource is not found, not when there’s no entity.
Put another way, if you want to return 404 for your objects, then give them their own URIs.
5
This is an interesting question, because it’s all about the system’s specification.
imel96’s response has convinced me a 404 wouldn’t be a proper response, since the 4xx family of codes is mainly for user/client errors, and this isn’t one. The URL is well-formed and the tree must be there; if it’s not, the system is in an inconsistent state!
Therefore this is a server error, i.e. something in the 5xx family. Possibly a generic 500 Internal Server Error or a 503 Service Unavailable (the service being “fetch me the tree that must be there”).
9
I’d say that either a 200 or a 404 response code can be valid, depending on how you look at the situation.
The thing is that HTTP response codes are defined in the context of a server, which can deliver various resources based on their URL. In this context, the meanings of 200 OK and 404 Not Found are perfectly unambiguous: the former says “here’s the resource you asked for”, while the latter says “sorry, I don’t have any resource like that”.
However, in your situation, you have an additional application layer between the HTTP server and the actual resources (trees) that are being requested. The application occupies a sort of an intermediate space that is not well addressed in the HTTP spec.
From the webserver’s viewpoint, the application looks kind of like a resource: it’s typically a file on the server, identified by (a part of) the URL, just like other resources (e.g. static files) the server might serve. On the other hand, it’s a weird kind of resource, since it consists of executable code that dynamically determines the content, and indeed potentially even the status code, of the response, making it behave in some ways more like a mini-server.
In particular, in your example case, the webserver can locate the application just fine, but the application then fails to locate the subresource (tree) that has been requested. Now, if you consider the application to be just an extension of the server, and the subitem (tree) to be the actual resource, then a 404 response is appropriate: the server has merely delegated the task of finding the actual resource to the application, which in turn has failed to do so.
On the other hand, if your viewpoint is that the application is the resource being requested, then obviously the webserver should return a 200 response; after all, the application was found and executed correctly. Obviously, in this case, the application should actually return a valid response body in the expected format, indicating (using whatever higher-level protocol that format encodes) that no actual data matching the query was found.
Both of these viewpoints can make sense. In most cases, at least for applications intended to be directly accessed over HTTP with an ordinary web browser, I would favor the former view: the user generally doesn’t care about internal details like the difference between the server and the application, they just care about whether the data they wanted is there or not.
However, in the specific case of an application designed to communicate with other computer programs using a custom high-level API protocol, using HTTP only as a low-level transport layer, there’s an argument to be made in favor of the latter view: for clients interfacing with such an application, all they really care about, at the HTTP level, is whether they managed to successfully contact the application or not. Everything else is, in such cases, often more naturally communicated using the higher-level protocol.
In any case, regardless of which of the above views you prefer, there are a few details you should keep in mind. One is that, in many cases, there may be a meaningful distinction between an (essentially) empty resource and a nonexistent one.
On the HTTP level, an empty resource would simply be indicated by a 200 response code and an empty response body, while a nonexistent resource would be indicated by a 404 response and a response body explaining the absence of the resource. In a higher-level API protocol, one would typically indicate a nonexistent resource by an error response, containing a suitable protocol-specific error code/message, while an empty response would simply be a normal response structure with no data items.
(Note that a resource need not be literally zero bytes long to be “empty” in the sense I mean above. For example, a search result with no matching items would count as empty in the broad sense, as would an SQL query result with no rows or an XML document containing no actual data.)
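To illustrate the higher-level-protocol variant, here is a minimal sketch of a JSON envelope that keeps "empty" and "nonexistent" apart without relying on the HTTP status code. The envelope fields (`ok`, `items`, `error`), the `NOT_FOUND` code, and all function names are assumptions made for this example, not part of any existing API.

```python
import json


class ApiError(Exception):
    """Raised by the client when the envelope reports a protocol-level error."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def empty_result():
    # An empty resource: a normal response structure with no data items.
    return json.dumps({"ok": True, "items": []})


def missing_resource(name):
    # A nonexistent resource: an error response with a protocol-specific code.
    error = {"code": "NOT_FOUND", "message": f"{name} does not exist"}
    return json.dumps({"ok": False, "error": error})


def read_items(raw):
    """Client side: return the items, or raise ApiError for an error envelope."""
    payload = json.loads(raw)
    if payload["ok"]:
        return payload["items"]
    raise ApiError(payload["error"]["code"], payload["error"]["message"])
```

With this shape, the client's success path can always expect a list of items, and the missing-tree case surfaces as an exception instead of an unexpectedly empty body.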
Also, of course, if the application really does believe that the requested subresource should be there, but can’t find it, then a third possible response code exists: 500 Internal Server Error. Such a response makes sense if the existence of the resource is an assumed precondition for the application, such that its absence necessarily indicates an internal malfunction.
Finally, you should always keep in mind Postel’s law:
“Be conservative in what you send, and liberal in what you receive.”
Whether the server should respond in a particular situation with a 200 or a 404 response, that doesn’t excuse you as the client implementor from handling either response appropriately and in the manner that maximizes robust interoperability. Of course, what “appropriate” handling means in different situations can be argued, but it certainly shouldn’t normally include crashing or otherwise “falling apart”.
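As a sketch of what tolerant client-side handling could look like, the function below accepts any of the "no tree" signals (a 404, a 204, or a 200 with an empty body) and turns them into `None` instead of falling apart. The function and exception names are hypothetical, and the body is assumed to be JSON text.

```python
import json


class UnexpectedResponse(Exception):
    """Raised for status codes this client does not know how to interpret."""


def parse_tree_response(status, body):
    """Return the decoded tree, or None when the server reports no tree."""
    if status in (204, 404):
        return None
    if status == 200:
        if not body or not body.strip():
            # A 200 with an empty body is treated the same as "no tree".
            return None
        return json.loads(body)
    raise UnexpectedResponse(f"unhandled status {status}")
```

The caller then needs only one check (`tree is None`) regardless of which convention the server team settles on.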
7
How about a 204 No Content? It would suggest that your request was processed successfully but is returning nothing. It’s still a “success” but allows you to see if you have results based on status code alone.
1
If the URL represents a resource that never existed, return 404 Not Found.
If the URL represents a resource that is an empty list, return an empty list and 200 OK.
Example:
{
total: 0,
items: []
}
If the URL represents a resource that used to exist return 410 Gone.
Regarding Lego Stormtrooper’s dialog:
Alien: Computer, please tell me all planets that humans inhabit. GET /planets?inhabitedBy=humans
Computer: 200 OK. { total: 1, items:[{name:'Earth'}] }
Alien: Computer, please tell me about Earth. GET /planets/earth
Computer: 200 OK. {name:'Earth', status: 'Mostly Harmless'}
Alien: Computer, please tell me about all planets humans inhabit, outside the asteroid belt. GET /planets?inhabitedBy=humans&distanceFromSun=lots
Computer: 200 OK. {total:0, items:[] }
Alien: Computer, please destroy Earth. DELETE /planets/earth
Computer: 204 No Content. (or 202 Accepted if it takes some time to destroy Earth)
Alien: Computer, please tell me about Earth. GET /planets/earth
Computer: 410 Gone
Alien: Computer, please tell me all planets that humans inhabit. GET /planets?inhabitedBy=humans
Computer: 200 OK. {total: 0, items:[] }
Alien: Victory for the mighty Irken Empire!
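The dialog can be replayed against a small in-memory store. This is a sketch only: the `PlanetStore` class, its method names, and the `inhabitedBy` field layout are hypothetical, and each method returns a `(status, body)` tuple instead of a real HTTP response.

```python
from http import HTTPStatus


class PlanetStore:
    """Toy resource store that remembers which planets used to exist."""

    def __init__(self, planets):
        self._planets = dict(planets)  # name -> planet record
        self._deleted = set()          # names that existed and were removed

    def get(self, name):
        if name in self._planets:
            return HTTPStatus.OK, self._planets[name]
        if name in self._deleted:
            return HTTPStatus.GONE, None       # used to exist
        return HTTPStatus.NOT_FOUND, None      # never existed

    def delete(self, name):
        if name not in self._planets:
            return self.get(name)[0], None     # 410 or 404, as for GET
        del self._planets[name]
        self._deleted.add(name)
        return HTTPStatus.NO_CONTENT, None

    def query(self, inhabited_by):
        items = [p for p in self._planets.values()
                 if inhabited_by in p["inhabitedBy"]]
        # An empty collection is still a successful answer to the query.
        return HTTPStatus.OK, {"total": len(items), "items": items}
```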
From the sound of it, this is an API for internal use. This gives you the freedom to use whichever scheme gives the most benefit, regardless of whether it is by-the-book (specification) or not. This doesn’t mean you should completely invent your own status codes, but it’s OK to ‘bend’ the rules a little bit if it is beneficial.
I agree with your stand that you should get a status code that shows something went wrong. This is after all what status codes are for. Also, you get the benefit of libraries that throw exceptions etc. on non-200 status codes, so you do not have to check explicitly (or you can write your own wrapper that does this).
I also agree with Andres F.’s point of view that 500 is appropriate since the tree should exist. In practice though, I like to split server errors into two categories.
One is that something unexpected went wrong; the other is that something I can practically check for went wrong. This results in the following status codes:
- 200 – Everything is good
- 404 – Wrong url
- 409 – Something went wrong
- 500 – An unexpected error occurred on the server
In your particular case, you can check whether the tree exists or not on the server side, and if it is not there then return a 409. It is an expected error (you know it can happen, you can check for it, etc.). 409 Conflict is just my personal preference; a 5xx may also be appropriate, as long as you can sit down and decide this with your team.
Categorizing codes like this helps you identify the type of error more quickly, but it can have benefits beyond organization. Often with website errors you do not want the client to see unexpected errors, as this can be a security concern and reveal vulnerabilities, so you return a generic 500 “An error occurred.” and log the full error on the server. But if an expected error occurs as a 409, you know that it would be safe to show the error to the client, and you don’t have to leave them in the dark as to what happened. This is just one practical use I can recount, but there are lots of possibilities.
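A minimal sketch of this split, assuming a hypothetical `ExpectedError` exception class and a dictionary standing in for the real data store: expected errors are returned as 409 with their message, while anything else is logged in full and reported as a generic 500.

```python
import logging

logger = logging.getLogger("api")


class ExpectedError(Exception):
    """An error the server checked for; its message is safe to show."""


def load_goal_tree(store, org_id):
    # Hypothetical lookup: the missing tree is a case we can check for.
    if org_id not in store:
        raise ExpectedError(f"organization {org_id} has no goal tree")
    return store[org_id]


def handle_request(store, org_id):
    """Return (status, body), separating expected from unexpected failures."""
    try:
        return 200, load_goal_tree(store, org_id)
    except ExpectedError as exc:
        return 409, {"error": str(exc)}
    except Exception:
        # Keep the details on the server; the client only sees a generic message.
        logger.exception("unexpected failure while loading goal tree")
        return 500, {"error": "An error occurred."}
```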
This is a bit of a catch-22 because you are posting this due to not being able to agree with your co-workers, but it sounds like you guys are arguing more about semantics and who is politically correct. It really doesn’t matter who is more proper, as long as you can come up with a system which most benefits the company.
On the other hand, if this is a public API, following the specifications as closely as possible would be more important, so as to avoid confusion among the community.
Taking a tangential stab at this:
If a human is eventually using the API (through a GUI), I would suggest doing whatever makes life easy for the end user.
The non existence of the tree when it should exist is a “Domain model inconsistency” error. A system error is when you ran out of memory or had some other systemic failure.
So returning 5xx is inappropriate.
As mentioned by several folks above, 4xx might be appropriate if the tree itself had its own URI, which is not the case here. But here’s what 404 tells the client: you can try again and again till you get something back.
If you returned 200, you could return sufficient diagnostics to the user or user agent so that the user agent can display a message telling the user to stop retrying and just contact support.
On the other hand, if this API is intended for systems only, an “exception message” must be a part of the API so that you don’t have to rely on the inherent vagueness of HTTP error codes, and can return something meaningful which can be logged by the consumer and escalated.
2
404 == The client has pointed to an “individual”/“concrete” resource that does not exist. That’s a client error.
200 == The client has requested to construct a set of “concrete” resources. A set can always be empty too. So even if the collection (query result) is empty, the client hasn’t made an error.