HTTP Overload
An article (http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic) posted on the W3C team
blog by Ted Guild on February 8, 2008 describes a significant problem that developers can easily prevent by exercising
due diligence. Guild explains that software applications needlessly fetch http URIs that are meant only as identifiers,
placing excessive and unnecessary load on W3C servers. Guild gives the following two examples:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" ...>
Guild points out that "these are not hyperlinks...." and further that "software does not usually need to fetch these
resources." He offers several suggestions to developers on how to prevent this problem, but does not give any code
examples.
.NET developers using the XmlReader class to parse XML need only add the following two lines to their code to
prevent the XmlReader from accessing URIs referenced in a DTD declaration:
settings.ProhibitDtd = false;
settings.XmlResolver = null;
where settings is an instance of the XmlReaderSettings class. Setting ProhibitDtd to false prevents the XmlReader
from throwing an exception when it encounters a DTD reference; setting it to true makes the reader throw, which
aborts further parsing. Setting the XmlResolver to null causes the reader to ignore the externally referenced DTD.
Together, the two settings let the reader parse the entire XML document without accessing an externally referenced DTD.
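For context, here is a minimal sketch of how the two settings might sit inside a complete program. The class name DtdSafeReaderDemo, the method CountElements, and the sample document are hypothetical and exist only for illustration. The sketch also assumes .NET Framework 4.0 or later, where ProhibitDtd is marked obsolete and DtdProcessing.Parse is the counterpart of ProhibitDtd = false; on earlier versions, the two lines shown above apply as written.

```csharp
using System;
using System.IO;
using System.Xml;

public static class DtdSafeReaderDemo
{
    // Hypothetical sample input: an XHTML document whose DOCTYPE names the W3C-hosted DTD.
    public const string SampleXhtml =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" " +
        "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">" +
        "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
        "<head><title>Demo</title></head>" +
        "<body><p>Hello</p></body></html>";

    // Reads the whole document and returns the number of element nodes seen.
    public static int CountElements(string xml)
    {
        var settings = new XmlReaderSettings();

        // Counterpart of "settings.ProhibitDtd = false" on .NET 4.0 and later:
        // a DOCTYPE declaration is accepted instead of raising an exception.
        settings.DtdProcessing = DtdProcessing.Parse;

        // With no resolver, the reader does not fetch the external DTD.
        settings.XmlResolver = null;

        int count = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    count++;
                }
            }
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(DtdSafeReaderDemo.CountElements(SampleXhtml));
    }
}
```

One caveat with this approach: because the external DTD is never loaded, entities declared only in that DTD (for example, XHTML's &nbsp;) are not defined, so a document that uses them may fail to parse. Documents that rely on such entities would likely need a locally stored copy of the DTD instead.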
Guild states "Yet we receive a surprisingly large number of requests for such resources: up to 130 million requests per
day, with periods of sustained bandwidth usage of 350Mbps, for resources that haven't changed in years." In a
follow-up comment on his own article, dated June 15, 2009, Guild adds: "Java based applications and libraries are
presently accounting for nearly 1/4th of our DTD traffic (in the hundred of millions a day). There is also another more
substantial source of traffic which the vendor is working to correct in the hopefully near future."
Visit the blog post referenced above for the latest developments on this issue.
Submitted by Bill Conniff, Founder of Xponent, on October 23, 2009