We have a REST API with an endpoint accepting JSON data from the client. One of the JSON fields is a URL that will be rendered to other users as a hyperlink to a web page associated with the resource. Somewhere in the pipeline we needed to ensure the URL is valid (starts with http(s)://, contains a domain from a whitelist, etc.).
So we designed the API such that it would accept only valid URLs, and return an error (400) when the URL is considered invalid. On the UI side, the user has to correct the URL until it is valid, with the error message adapted to the error case (missing value, invalid domain, invalid format…).
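For illustration, the strict validation described above might look roughly like the following sketch. The function name `validate_url`, the error codes, and the contents of `ALLOWED_DOMAINS` are assumptions made for this example, not the actual implementation; the API would map any non-None result to a 400 response.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist; the real list of allowed domains is not given here.
ALLOWED_DOMAINS = {"facebook.com", "example.com"}


def validate_url(value):
    """Return None if the URL is acceptable, otherwise a short error code."""
    if not value or not value.strip():
        return "missing_value"
    try:
        parts = urlsplit(value.strip())
        host = (parts.hostname or "").lower()
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. unbalanced IPv6 brackets.
        return "invalid_format"
    if parts.scheme not in ("http", "https") or not host:
        return "invalid_format"
    # Accept a whitelisted domain itself or any of its subdomains.
    if not any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS):
        return "invalid_domain"
    return None
```

With this sketch, "facebook.com/foobar" is rejected as "invalid_format" because it has no scheme, which is exactly the case that surprised the product owner.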
Our product owner tested our implementation before going live, and had trouble with this simple approach. He typed in “facebook.com/foobar” and was expecting the URL to be valid. The error message was something along the lines of “Please enter a valid URL like https://www.example.com/xxxx”. He was expecting (quite rightly) that the input field would accept anything a browser address bar would accept. The error message could have been clearer (“the URL should start with http(s)://”), but we agreed he was right and that in this case, user input should be fixed by our application before being saved.
Here we had 2 ideas:
- Either let the API correct the URL (prepend a default protocol) upon saving;
- Or prepend the protocol on the client side, and don’t touch the API validation.
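Whichever side does it, the correction itself is small. A minimal sketch, assuming https as the default protocol (the choice of default and the name `with_default_scheme` are assumptions for this example):

```python
import re

# Matches a leading URI scheme such as "http://" or "ftp://".
_SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def with_default_scheme(value, default="https"):
    """Prepend a default scheme when the input does not already carry one."""
    value = value.strip()
    if _SCHEME_RE.match(value):
        return value  # already has a scheme: leave it untouched
    return f"{default}://{value}"
```

The result would still have to pass the normal validation afterwards, since prepending a scheme does not make arbitrary text a valid URL.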
I have a strong preference for the client-side method, because I believe a REST API should never alter user input silently (you never know what kind of client will consume your API, and silently modifying user input could have unexpected side-effects). The problem is I couldn’t find any real-life example to back up my point of view.
On the contrary, one of my teammates (the one responsible for the fix) couldn’t find any good reason to prefer one method over the other, and went for the API fix (mainly because it’s much faster to implement, and you don’t have to implement this behavior for each client using your API).
What do you think?
1
I agree with not silently modifying the user input: it could cause confusion, and the stored value would no longer reflect the JSON data the end user submitted.
Although client-side input verification can be circumvented, it’s still a wise move to show users that their URL is going to be modified before it is consumed.
In any case, always verify the input server-side. You might assume the data (in this case the URL) is well-formed, but the user has complete control over the UI and could submit faulty data, resulting in a nasty server error.
For me there is a clear winner.
You should always ask yourself: who is responsible for what? For me, a REST API (like any API) should be kept simple and never add behavior beyond what’s expected of it. In this scenario, the API’s role is to store data for later use, so no, do not fix it silently.
Also, if you add this responsibility to the API, where do you draw the line of what should be fixed and what not?
If you need consistency on the client side, add a class to your API consumer code that handles this kind of issue, and reuse it everywhere you consume the API.
1
There’s one option which hasn’t been mentioned, and that is to change the definition of what constitutes correct input to the API. Make “facebook.com” an acceptable field value (note: the field is now NOT a URL), and document that. Then no correction is necessary. The API then uses this field to construct a hyperlink when generating a subsequent response, if necessary, just like it would use a date of birth to compute a user’s age.
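As a rough sketch of that idea, the response builder could expose the stored value unchanged next to a derived hyperlink. The field names `website` and `website_href`, the dict-based resource, and the https default are all assumptions for this example:

```python
def to_response(resource):
    """Build the API response for a stored resource (a plain dict here)."""
    site = resource["website"]  # stored exactly as submitted
    has_scheme = site.lower().startswith(("http://", "https://"))
    return {
        "name": resource["name"],
        "website": site,
        # Derived on the way out, like an age computed from a date of birth.
        "website_href": site if has_scheme else "https://" + site,
    }
```

The stored field stays a documented "site reference" rather than a URL, and only the derived field promises to be one.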
The rest of my point of view follows what Dan Wilson wrote.
I don’t see an issue in this specific case since the URL reference is well-known. Your service should have no problem accepting facebook.com/foobar or https://www.facebook.com/foobar as valid input.
You know the URL references a Facebook page, Facebook pages can always use HTTPS, and Facebook will perform redirection if www is not specified.
I would not modify the user input; I would simply:
- validate that it matches the format of a URL
- add a missing scheme only in your test for validation so that truly invalid data (facebook=comfoobar) can be rejected
- store whatever the user gave you as-is
- format it appropriately (add missing https://) when rendering it
Should the user need to modify the URL in the future, they can be presented with exactly what they entered and will not be surprised by any modifications.
You can account for user expectations while not modifying the data.
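The steps listed above can be sketched as follows. This is only an illustration under stated assumptions: the helper names, the dict used as storage, and the deliberately crude host-name check are all made up for the example.

```python
import re
from html import escape
from urllib.parse import urlsplit

_LABEL_RE = re.compile(r"[a-z0-9-]+")


def _candidate(raw):
    """Add https:// only to a temporary copy; the raw input is never changed."""
    raw = raw.strip()
    if raw.lower().startswith(("http://", "https://")):
        return raw
    return "https://" + raw


def is_url_like(raw):
    """Crude format check performed on the candidate, not on the raw input."""
    try:
        host = urlsplit(_candidate(raw)).hostname or ""
    except ValueError:
        return False
    labels = host.split(".")
    # Require a dotted host name whose labels use only letters, digits, hyphens.
    return len(labels) >= 2 and all(_LABEL_RE.fullmatch(l) for l in labels)


def save(store, key, raw):
    """Reject truly invalid data, otherwise store the input as-is."""
    if not is_url_like(raw):
        raise ValueError("invalid URL")
    store[key] = raw


def render_link(raw):
    """Add the missing scheme only at rendering time."""
    href = escape(_candidate(raw), quote=True)
    return f'<a href="{href}">{escape(raw)}</a>'
```

Here "facebook.com/foobar" is stored untouched and rendered with an https:// href, while "facebook=comfoobar" is rejected at save time.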
5
There doesn’t seem to be any real reason to prefer one over the other.
Essentially you have a method you want to expose (it would be nice to have some real names here)
service.Method(urlLikeString)
and some code which changes a URL-like string into a URL. There are three possible places to put it:
1. the application
app.Method(urlLikeString)
{
    url = getUrl(urlLikeString);
    client.Method(url);
}
2. the client
app.Method(urlLikeString)
{
    client.Method(urlLikeString);
}
client.Method(urlLikeString)
{
    url = getUrl(urlLikeString);
    httpclient.post(url);
}
3. the server
client.Method(urlLikeString)
{
    httpclient.post(urlLikeString);
}
server.Method(urlLikeString)
{
    url = getUrl(urlLikeString);
    logic.Method(url);
}
Unless there is a scenario where you don’t want the urlLikeString converted, or the conversion requires information that only one of the layers has, it doesn’t make any difference where you put the conversion.
Obviously you still have to verify that the urlLikeString is URL-like, whatever you do.