distributed.net Faq-O-Matic: Why is arbitrary HTML no longer allowed in team or participant mottos?

In January 2002, we had to institute a new policy prohibiting arbitrary HTML in the customized team and participant "motto" sections of our website. See: http://cgi.distributed.net/cgi/planarc.cgi?user=bovine&plan=2002-01-22.20:52

We did not want to do this, but without restrictions on content it was possible for malicious JavaScript to execute and use your logged-in team and participant password to alter your membership, including switching you onto a different team. Allowing arbitrary HTML or JavaScript to execute from within the *.distributed.net domain could also compromise the secure cookies that the stats site uses to remember your login. This is a significant vulnerability, and we need to treat security issues seriously.

Filtering general HTML is what a number of web-mail providers (such as Hotmail and Yahoo! Mail) attempt to do today, but if you have kept up with security vulnerability mailing lists (such as BugTraq or Vuln-Dev), you are probably aware of incident after incident in which new tags were discovered that also needed to be filtered. These discoveries have continued to occur every few months for the last several years, ever since web-based mail became popular. eBay also allows sellers to use HTML in their product descriptions, but it too must filter malicious techniques that can be used to automatically submit bids and the like. Many other websites need to employ similar tactics.

There are many ways to include malicious JavaScript in arbitrary HTML. Event handlers such as OnLoad, OnMouseOver, or OnFocus can be attached to most HTML elements. Cascading style sheets can be used to add extended behaviors to other element types. The "javascript:" protocol can be used anywhere a URL can be referenced. Inline <script ...> blocks can be used. Plugins and ActiveX controls can be invoked through a variety of techniques, including the <embed ...> or <object ...> tags, potentially exposing holes that exist in those plugins. Frames can be created with <iframe ...>, <frame ...>, <layer ...>, or <span ...> that reference other websites and load unsafe code, which can then manipulate the main page's namespace through web browser bugs. <meta http...> can be used to trigger refreshes to other locations or to submit automatic responses to other pages on our website. Other URL protocols, such as "file:", "money:", or "telnet:", can in some cases be used to reference local resources through certain browser bugs.
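To make the point concrete, here is a minimal sketch in Python of a filter that handles only one of the vectors listed above. The function name strip_script_blocks and the sample mottos are hypothetical and are not taken from the stats site's code; the sketch only illustrates that removing inline <script> blocks leaves the other techniques untouched.

```python
import re

# Hypothetical "known unsafe" filter: it knows about inline <script> blocks only.
SCRIPT_BLOCK = re.compile(
    r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL
)


def strip_script_blocks(html_text: str) -> str:
    """Remove inline <script>...</script> blocks and nothing else."""
    return SCRIPT_BLOCK.sub("", html_text)


# Illustrative mottos, one per technique described in the paragraph above.
samples = {
    "inline script": "<script>alert(1)</script>",
    "event handler": '<img src="x.gif" onload="alert(1)">',
    "javascript: URL": '<a href="javascript:alert(1)">stats</a>',
    "meta refresh": '<meta http-equiv="refresh" content="0;url=http://example.com/">',
}

if __name__ == "__main__":
    for name, motto in samples.items():
        cleaned = strip_script_blocks(motto)
        status = "removed" if cleaned == "" else "passed through unchanged"
        print(f"{name}: {status}")
```

Only the first sample is removed. Each of the other techniques would need its own rule, and every newly discovered technique would need another.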

Part of the difficulty is that web browsers are designed to be extremely tolerant of the HTML they parse, in the interest of maximum compatibility, but each browser is tolerant in different ways. Different browsers also accept different encodings throughout their documents. For example, "<script%20language= javascript>", "<script >", and "< script>", along with hex-encoded, mixed-case, and Unicode or foreign-character-set forms, can all be equivalent ways of expressing the same thing.
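The effect of this tolerance on a filter can be sketched in a few lines of Python. The naive_filter function below is a hypothetical example, not anything the stats site used: it deletes only the exact lowercase strings "<script>" and "</script>", and the variants are modeled on the spellings mentioned above.

```python
def naive_filter(html_text: str) -> str:
    """Hypothetical filter: delete the exact strings "<script>" and "</script>"."""
    return html_text.replace("<script>", "").replace("</script>", "")


def still_has_script_tag(html_text: str) -> bool:
    """Rough check: does something that looks like a script tag remain?"""
    return "<script" in html_text.lower()


# Spellings that a tolerant browser may treat as the same tag.
variants = [
    "<script>alert(1)</script>",                      # the one form the filter expects
    "<SCRIPT>alert(1)</SCRIPT>",                      # mixed or upper case
    "<script >alert(1)</script >",                    # extra whitespace
    "<script language=javascript>alert(1)</script>",  # attributes inside the tag
]

if __name__ == "__main__":
    for variant in variants:
        result = naive_filter(variant)
        verdict = "still present" if still_has_script_tag(result) else "removed"
        print(f"{variant!r}: script tag {verdict}")
```

Only the first variant is neutralized. A filter that tries to enumerate spellings has to anticipate every form that every browser will accept.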

Two well-written security advisories on this type of exploit are available from CERT:
http://www.cert.org/advisories/CA-2000-02.html
http://www.cert.org/tech_tips/malicious_code_mitigation.html

In summary, maintaining filters that attempt to explicitly exclude "known unsafe" activities is an ongoing job, and we have neither the resources nor the desire to maintain such filtering technology. It is far easier to do the opposite and create a filter that allows only "known safe" activities. Although it would be possible to construct a filter that accepts HTML but is very aggressive and allows only "known safe" forms, it is a little easier to create our own variant that carries none of the operational expectations that HTML has. Filtering HTML also risks inadvertently breaking "safe" HTML that simply does not conform to the strict filtering rules that would be needed (after all, there are innumerable ways of expressing the same effect).

By creating our own markup language that is extremely limited and extremely strict in what it allows, there is far less ambiguity over why something isn't working or what is allowed. In fact, the markup language we chose is intentionally similar to the one used by UBB, phpBB, vBulletin, and a number of other bulletin-board packages available today, which reduces the "non-standardness" of our technique. The available tags are shown on the team and participant configuration pages.
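The "known safe" approach can be sketched as follows. This is an illustration under stated assumptions, not the stats site's implementation: the tag set ([b], [i], [u], [url=...]) is a hypothetical bulletin-board-style subset (the real list is on the configuration pages), and render_motto is an invented name. The idea is to escape all input first and then re-introduce only a fixed set of constructs.

```python
import html
import re

# Hypothetical tag set, for illustration only.
SIMPLE_TAGS = {"b": "b", "i": "i", "u": "u"}

# Links must start with http:// or https:// and use a deliberately narrow
# character set; anything else (such as "javascript:") is not converted.
URL_TAG = re.compile(
    r"\[url=(https?://[A-Za-z0-9./_~%?=+-]+)\](.*?)\[/url\]",
    re.IGNORECASE | re.DOTALL,
)


def render_motto(text: str) -> str:
    """Render a motto written in the limited markup as HTML."""
    # Step 1: escape everything, so no user-supplied <, >, & or quote
    # character reaches the page as markup.
    out = html.escape(text, quote=True)
    # Step 2: re-introduce only the known-safe constructs.
    for tag, element in SIMPLE_TAGS.items():
        pattern = re.compile(
            r"\[%s\](.*?)\[/%s\]" % (tag, tag), re.IGNORECASE | re.DOTALL
        )
        out = pattern.sub(r"<%s>\1</%s>" % (element, element), out)
    out = URL_TAG.sub(r'<a href="\1">\2</a>', out)
    return out


if __name__ == "__main__":
    print(render_motto("[b]Go team![/b]"))
    print(render_motto("<script>alert(1)</script>"))
    print(render_motto("[url=javascript:alert(1)]click[/url]"))
```

Anything the renderer does not recognize stays as inert, escaped text, so a rejected construct shows up literally in the motto instead of being silently altered.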

This document is: http://faq.distributed.net/?file=268
