Comment Avoiding XSS and Drive-by scripting (Score 1) 82
The original article is right that there aren't any really new problems with AJAX apps, but they are different, so some mistakes are easier to make and some mistakes are harder.
Two of the mistakes that I've found easier to make with Ajax apps are XSS and drive-by-scripting attacks.
XSS attacks arise when a user can put text on your site that runs as script with the domain privileges of your site. Once they can do that, it's easy to steal other users' information, cookies, etc. and send it back to the attacker, or to make modifications to user data on the victim's behalf.
The way to avoid XSS attacks is by properly escaping all text converted to html or javascript. The difficulty arises for two reasons:
1) DOM manipulation is just too slow, so you end up having to generate html. Generating html safely requires keeping a lot of escaping conventions in mind.
2) Users often want to enter rich text, so you end up passing around some strings that are plain text and some that are (hopefully properly sanitized) html.
The first can be addressed by templating -- use something PHP- or JSP-like to generate your html. You can make it fast, and make it obvious what's escaped and what's not. It's not a silver bullet, but it makes the problem manageable. If you want to get fancy, you can make it smart enough to know that substitutions in HREFs should be passed through encodeURIComponent, and substitutions in onClick handlers should be escaped as javascript strings. Just make sure that people don't have to work around the system for those rare but necessary cases where they know they've got safe content. It's much better for there to be clear, easily-audited exceptions to the rule than for people to have to bypass your system entirely to "fix" a critical bug late on Friday night.
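To make the templating idea concrete, here's a minimal sketch, assuming a modern javascript engine with tagged template literals. The names escapeHtml, SafeHtml, trustedHtml and html are made up for illustration, and the sketch only handles the plain html text/attribute context, not the fancier HREF or onClick cases.

```javascript
// Hypothetical helper names; none of these come from an existing library.

// Escape the five characters that are significant in html text and attributes.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// The one easily-audited exception: a wrapper for content already known to be safe html.
function SafeHtml(html) {
  this.html = html;
}
function trustedHtml(html) {
  return new SafeHtml(html);
}

// Tagged template: every substitution is escaped unless it is wrapped in SafeHtml.
function html(strings, ...values) {
  let out = strings[0];
  values.forEach(function (value, i) {
    out += (value instanceof SafeHtml) ? value.html : escapeHtml(value);
    out += strings[i + 1];
  });
  return out;
}

const userName = '<script>alert(1)</script>';
const greeting = html`<div>Hello, ${userName}</div>`;
// greeting is '<div>Hello, &lt;script&gt;alert(1)&lt;/script&gt;</div>'
```

Grepping for trustedHtml then gives you the full list of places where someone decided escaping wasn't needed.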
The second problem is harder. A good static type system for javascript might allow different types for Strings that contain plain text vs. html vs. CSS vs. javascript. Then some simple static analysis could warn you when you add plain text to html to compose a string. Unfortunately, there aren't many such tools, so my recommendation is to use naming conventions. I know everyone hates anything that smacks of Hungarian notation, but do you find it easier to spot the bug in
var name = document.forms.myform.elements.myinput.value; ...
nameNode.innerHTML = '<div>' + name + '</div>';
or
var nameAsText = document.forms.myform.elements.myinput.value; ...
nameNode.innerHTML = '<div>' + nameAsText + '</div>';
In the second case, you're documenting what a variable contains, so someone reviewing the code has a hope of knowing what escaping schemes need to be applied when.
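For what it's worth, here's roughly how the fixed version might look under that convention. This is only a sketch: htmlFromText is a hypothetical helper, and the string literal stands in for the form field value so the snippet runs outside a browser.

```javascript
// Hypothetical helper: the name says what goes in (text) and what comes out (html).
function htmlFromText(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Stand-in for the value read from the form field.
var nameAsText = '<img src=x onerror=alert(1)>';

// The suffix changes only where the escaping happens.
var nameAsHtml = htmlFromText(nameAsText);

// Only ...AsHtml values get concatenated into markup.
var divAsHtml = '<div>' + nameAsHtml + '</div>';

// In a browser you would then assign: nameNode.innerHTML = divAsHtml;
```

A reviewer can now check one rule: anything concatenated into markup or assigned to innerHTML has to end in AsHtml.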
Drive-by-scripting attacks occur when someone malicious finds that you're using some neat JSON-like javascript to communicate between the browser and the server. It seems like a good idea and it is, because eval is fast, and you need every bit of speed because javascript interpreters are dog-slow.
And the cross-domain policies ensure that no one can do an XMLHttpRequest from another site and get back the javascript, and even if they could, it would be safe because evaluating it has no side effects.
Unfortunately, anyone can write a script tag, and they can replace the Array constructor. So if your message contains an array, and they can trick your user into visiting their site, then you're screwed.
What do you do? Instead of sending over
[valuable, user, data]
send over
while(1); [valuable, user, data]
That way, the constructor for the data they were trying to steal never gets executed, and no other javascript on their site executes, so hopefully the user realizes that that's not a good site to visit.
How does that help you? Well, since you're using XHR, and obeying the cross-domain policy like the fine upstanding citizen you are, you can just get the response text, and strip the "while(1);" from the front and go about your merry way.
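The client side of that might look something like the sketch below. The function name parseGuardedResponse and the GUARD_PREFIX constant are made up for illustration, and I've swapped in JSON.parse for eval, which assumes the payload is strict JSON rather than arbitrary javascript.

```javascript
// Assumed prefix; it must match exactly what your server prepends.
var GUARD_PREFIX = 'while(1);';

// Strip the guard from an XHR responseText and parse what is left.
function parseGuardedResponse(responseText) {
  if (responseText.indexOf(GUARD_PREFIX) !== 0) {
    throw new Error('Response is missing the expected guard prefix');
  }
  // JSON.parse stands in for eval here; it tolerates the leading space
  // but rejects anything that is not strict JSON.
  return JSON.parse(responseText.slice(GUARD_PREFIX.length));
}

var data = parseGuardedResponse('while(1); ["valuable", "user", "data"]');
// data is now the array ["valuable", "user", "data"]
```

Refusing responses that lack the prefix is a design choice: it catches the case where some server endpoint forgot to add the guard.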
Finally, a malicious site can try to make changes on your user's behalf, again by embedding script or image tags, which cause the browser to send requests with the user's cookies. This is only a problem for sites that users keep open all the time, and that use cookies for authentication. It's also a problem that affects traditional web apps. The solution is that your AJAX client needs a secret that it can pass back with any requests (POSTs only please). That secret should match a cookie, and then your server can refuse any requests where the cookie and the cgi param don't match. It works because, although the attacker can cause the browser to make arbitrary requests, they can't read the user's cookies, and so can't craft a request with a param that matches the cookie.
This should be done for all actions that change user data.
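Here's a rough sketch of the server-side check, written as plain functions so it doesn't depend on any particular framework. The cookie name csrf_token and the function names are hypothetical, and it assumes the secret is unpredictable and set per session.

```javascript
// Hypothetical cookie name; use whatever your app actually sets.
var TOKEN_COOKIE = 'csrf_token';

// Pull one cookie value out of a raw Cookie header such as "a=1; b=2".
function readCookie(cookieHeader, name) {
  var parts = (cookieHeader || '').split(';');
  for (var i = 0; i < parts.length; i++) {
    var pair = parts[i].trim();
    var eq = pair.indexOf('=');
    if (eq > 0 && pair.slice(0, eq) === name) {
      return pair.slice(eq + 1);
    }
  }
  return null;
}

// Allow a state-changing request only if it is a POST and the secret
// passed as a param matches the secret stored in the cookie.
function isRequestAllowed(method, cookieHeader, tokenParam) {
  if (method !== 'POST') {
    return false;
  }
  var cookieToken = readCookie(cookieHeader, TOKEN_COOKIE);
  return cookieToken !== null && cookieToken.length > 0 && cookieToken === tokenParam;
}

var ok = isRequestAllowed('POST', 'sid=abc; csrf_token=s3cret', 's3cret');     // true
var forged = isRequestAllowed('POST', 'sid=abc; csrf_token=s3cret', 'guess');  // false
```

The forged request still carries the right cookie, because the browser attaches it automatically, but the attacker can't read it and so can't supply the matching param.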
Some of these attacks may sound obscure and so only worthwhile against applications that manage valuable data, and some of them are only feasible for applications that users keep open most of the day, but you should hope that your application becomes one of those, so start thinking about it now :)
cheers,
mike