I want to generate the following HTTP header in a safe, standards-respecting way:
Link: <https://example.com>; rel="preconnect"
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Link
According to the Mozilla Developer Network (MDN) documentation:
The URI (absolute or relative) must encode char codes greater than 255:
Question:
How would you “encode char codes greater than 255” in PHP?
I’m not looking for something that merely “works”; I would like something that respects the standard (since on the Internet “we” try to be permissive about what we accept and strict about what we send).
Thanks for any help with doing this in PHP.
Counter examples
rawurlencode() (which follows RFC 3986) is incorrect here because it also escapes : and /:
echo rawurlencode('https://example.com/苗条');
Result (wrong):
https%3A%2F%2Fexample.com%2F%E8%8B%97%E6%9D%A1
urlencode() is also incorrect, with the same result (as expected):
echo urlencode('https://example.com/苗条');
Result (wrong):
https%3A%2F%2Fexample.com%2F%E8%8B%97%E6%9D%A1
Workaround
I could just use rawurlencode() and then restore some characters. Something like:
echo str_replace(['%3A', '%2F', '%3C', '%3E'], [':', '/', '<', '>'], rawurlencode('https://example.com/苗条'));
Result (correct):
https://example.com/%E8%8B%97%E6%9D%A1
But, to respect the standard, I should probably loop chr() from 0 to 255, collect every rawurlencode(chr(0..255)) and restore them all… so this is becoming a bit hacky and overkill.
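To make the “restore” idea concrete, here is a minimal sketch. The function name escape_by_restoring_ascii() is hypothetical, and it deliberately deviates from the 0 to 255 range: rawurlencode() works on bytes, so for UTF-8 input every byte of a multibyte character is itself in the range 128 to 255, and restoring the whole range would undo all of the escaping. The sketch therefore assumes UTF-8 input and restores only 0 to 127.

```php
<?php
// Hypothetical helper (not part of the original post): percent-encode
// everything with rawurlencode(), then restore the ASCII characters.
function escape_by_restoring_ascii(string $s): string
{
    $restore = [];
    // Assumption: stop at 127, not 255. The bytes of a UTF-8 multibyte
    // character are all >= 0x80 and must stay percent-encoded.
    for ($i = 0; $i <= 127; $i++) {
        $restore[rawurlencode(chr($i))] = chr($i);
    }
    // strtr() does not rescan replaced text, so a restored "%" is never
    // misread as the start of another escape sequence.
    return strtr(rawurlencode($s), $restore);
}

echo escape_by_restoring_ascii('https://example.com/苗条'), "\n";
// prints: https://example.com/%E8%8B%97%E6%9D%A1
```

Note that this variant escapes every non-ASCII character, including code points 128 to 255, so it is stricter than “greater than 255”. It also restores ASCII control characters, which may not be what you want inside a header value.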
Current Approach
So, to be strict about “encode char codes greater than 255”, I tried looping over every single character (code point, not byte) and running rawurlencode() only if its code point is greater than 255, and it works well:
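<?php
/**
 * Encode char codes greater than 255.
 * Requires PHP 7.4+ with the mbstring extension (mb_str_split(), mb_ord()).
 * @param string $s
 * @return string
 */
function escape_link_header($s) {
    $result = '';
    // Walk the string one character (code point) at a time, not byte by byte.
    foreach (mb_str_split($s) as $c) {
        // Strictly "greater than 255": code point 255 itself is left as-is.
        if (mb_ord($c) > 255) {
            $c = rawurlencode($c);
        }
        $result .= $c;
    }
    return $result;
}
echo escape_link_header('https://example.com/苗条');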
So the code is strict and the result seems correct to me:
https://example.com/%E8%8B%97%E6%9D%A1
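For comparison, the same rule can probably be expressed without an explicit loop. The following is only a sketch: the function name escape_above_255() is hypothetical, and it assumes the input is valid UTF-8. It relies on the /u modifier, under which the character class [^\x00-\xFF] matches code points above U+00FF.

```php
<?php
// Hypothetical alternative to the loop: let PCRE find code points above 255.
function escape_above_255(string $s): string
{
    $out = preg_replace_callback(
        // With /u, \xFF means code point U+00FF, so the negated class
        // matches runs of characters whose code point is greater than 255.
        '/[^\x00-\xFF]+/u',
        function (array $m): string {
            return rawurlencode($m[0]);
        },
        $s
    );
    if ($out === null) {
        // preg_replace_callback() returns null on error, e.g. invalid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $out;
}

echo escape_above_255('https://example.com/苗条'), "\n";
// prints: https://example.com/%E8%8B%97%E6%9D%A1
echo escape_above_255("caf\u{E9}/\u{100}"), "\n";
// prints: café/%C4%80  (U+00E9 is <= 255 and stays raw, U+0100 is encoded)
```

The second call shows the consequence of taking “greater than 255” literally: code points 128 to 255 are sent as raw non-ASCII bytes.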
This all feels a bit esoteric to me. Is this a correct approach in PHP?
Is there any other approach to “encode char codes greater than 255” in order to get a safe <...> value for the Link HTTP header?
Again, the goal is “just” to strictly sanitize the <...> in this part:
header('Link: <https://example.com>; rel="preconnect"');
Thanks!