I noticed my dev team is using some XHTML links that appear to pull something from w3.org while the application runs. Is this practice standard, and is it secure?
They are referencing the following URLs:
w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
w3.org/1999/xhtml
First off, yes, http://www.w3.org/ is a completely upstanding site. It is the repository of the documentation and specifications for HTML, XML, XHTML, and the like.
Next, you have a significant misunderstanding about what is actually in the xhtml files.
From the documentation on xhtml (notice the link):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Virtual Library</title>
</head>
<body>
<p>Moved to <a href="http://example.org/">example.org</a>.</p>
</body>
</html>
The xmlns attribute declares a namespace. It is the same mechanism as namespaces in XML (again, notice the link): a way of grouping the tags and showing which definition they follow. The URL serves only as a unique identifier; nothing is looked up at that address.
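To make the namespace point concrete, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree` (my choice of parser for illustration, not something from the question). The markup is a trimmed copy of the example above, without the DOCTYPE. It shows only that this parser treats the namespace URL as a plain string attached to each tag name.

```python
import xml.etree.ElementTree as ET

# The namespace URL from the xmlns attribute, used purely as an identifier.
XHTML_NS = "http://www.w3.org/1999/xhtml"

doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">'
    "<head><title>Virtual Library</title></head>"
    '<body><p>Moved to <a href="http://example.org/">example.org</a>.</p></body>'
    "</html>"
)

root = ET.fromstring(doc)

# The parser folds the namespace URL into the tag name as a string.
print(root.tag)

# Lookups qualify element names with that same string.
title = root.find(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title")
print(title.text)
```

The tag comes back qualified by the URL, and the lookup has to use the same string to match, which is all the namespace does here.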
This brings us to the DTD. A DTD is a document type definition. It says “if you want to verify that this document is formed correctly, check it against this definition.” As XHTML is a standard maintained by the World Wide Web Consortium (W3C), the DTD is hosted on their servers. You could host a copy on your own servers, but browsers never actually fetch it; the important parts are already built into them.
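A browser does not pull anything because of these statements. You can verify this by watching all of the traffic with a debugging proxy such as Fiddler or a packet sniffer such as Wireshark while requesting the page, and checking whether anything is fetched from http://www.w3.org/ (it isn’t). A generic XML parser that is configured to load external DTDs is a separate case and can request the file, so the same traffic check is worth running against the application itself.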
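A similar check can be sketched in code for one particular parser. The example below assumes Python's standard-library `xml.etree.ElementTree`, and uses `unittest.mock` to replace `socket.socket.connect` so that any outbound connection attempt from Python code would be counted and refused. It says nothing about browsers or about parsers in other languages, which may be configured to load external DTDs; for those, a traffic capture remains the real test.

```python
import socket
import xml.etree.ElementTree as ET
from unittest import mock

# Same DOCTYPE and namespace as the XHTML example, with a shortened body.
doc = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Virtual Library</title></head>"
    "<body><p>Moved.</p></body>"
    "</html>"
)

# While patched, any socket connect made from Python code raises OSError
# and is recorded on the mock.
with mock.patch.object(
    socket.socket, "connect", side_effect=OSError("blocked for this check")
) as fake_connect:
    root = ET.fromstring(doc)

print(root.tag)                 # the document parsed despite the blocked network
print(fake_connect.call_count)  # number of connection attempts observed
```

If the parser had tried to download the DTD through Python's socket layer, the parse would have failed or the counter would be non-zero.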
The second URL is a reference to an XML namespace; the first points to the DTD, which defines what tags are allowed in the document. A strictly conforming XHTML document has to declare this namespace on its root element and include a DOCTYPE. When you are using a standardized XML format, like XHTML 1.0 Transitional, you should reference the namespace and DTD of the standardization body which defined them, in this case the World Wide Web Consortium.
Most HTML parsers will not download the standard definitions from the provided URLs, because they usually already have them. A parser that does fetch DTDs is more likely to do so for a URL it does not recognize, so moving the definitions to your own domain would be a much worse solution.