-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

             __________________________________________________________

                       The U.S. Department of Energy
                   Computer Incident Advisory Capability
                           ___  __ __    _     ___
                          /       |     /_\   /
                          \___  __|__  /   \  \___
             __________________________________________________________

                             TECHNICAL BULLETIN

                  Understanding Cross-Site Scripting (XSS)

June 3, 2008 18:00 GMT                                  Number CIACTech08-003
______________________________________________________________________________
PROBLEM:       Cross-Site Scripting has become an increasingly prevalent attack vector that can be leveraged to perform a wide range of compromises. Web site developers need to understand the problem of cross-site scripting to reduce the likelihood that their servers will be used to compromise their users.
PLATFORM:      All
ABSTRACT:      Cross-Site Scripting has become an increasingly prevalent attack vector that can be leveraged to perform a wide range of compromises. These compromises can range from simple popup displays within a user's browser to session and cookie capture that are used for information and identity theft. As these attacks become more mature, as well as obscure, it is imperative that we understand how they happen, how they propagate, and the ways to prevent them. By understanding the different vectors of attack and realizing and implementing simple security measures against them, we can better protect ourselves and our users now, and in the future.
______________________________________________________________________________
LINKS:
 CIAC BULLETIN:      http://www.ciac.org/ciac/techbull/CIACTech08-003.shtml
 OTHER LINKS:
   HTTP Cookies: http://en.wikipedia.org/wiki/HTTP_cookie
   Samy (XSS) Worm: http://en.wikipedia.org/wiki/Samy_(XSS)
   Htmlspecialchars() Function: http://us.php.net/htmlspecialchars
   Htmlentities() Function:
     http://us3.php.net/htmlentities
   PHP mysql_real_escape_string() Function: http://www.w3schools.com/php/func_mysql_real_escape_string.asp
   What is SQL Injection?: http://www.cgisecurity.com/questions/sql.shtml
   ASCII Table and Description: http://www.asciitable.com
   Disabling Scripts: http://www.cert.org/tech_tips/malicious_code_FAQ.html
   NoScript: http://noscript.net/
   Gilby Productions. TinyURL!: http://tinyurl.com/#example
______________________________________________________________________________

Introduction
============

The term Cross-Site Scripting (also known as "XSS") refers to the act of an attacker injecting malicious code into a web page in order to perform some sort of exploit. This term is not to be confused with "CSS", which refers to Cascading Style Sheets, a completely unrelated topic. XSS exploits the trust that the web browser (as well as the user) places in the page currently being viewed: because the page is considered safe, the browser downloads and/or runs the malicious code that has been embedded within it. The effects of this vulnerability can range from simple information compromise via Cookie retrieval to complete identity theft via session hijacking, all while being completely transparent to the end users.

Types of Cross-Site Scripting
=============================

There are several XSS attack vectors that are used to run the malicious code. Each of these vectors is potent and allows an attacker to custom tailor an injection to the specific vulnerabilities found within the web page.

I. DOM-Based
- -----------------

DOM stands for Document Object Model, a standard set forth by the W3C to allow programs and scripts to dynamically access and modify content within documents. For HTML, this means that this technique can be used to modify content within the page, on the fly, based on instructions retrieved from an external source, such as the end user.
Programmers may or may not choose to use it directly; however, JavaScript requires the DOM in order to read or write a web page dynamically. The DOM provides scripts with several objects, such as events, event changes, and variable values (notably the URL), which a script can parse and act on according to the parameters and values it finds. This allowance for dynamic execution of code plays the main role in this type of XSS attack.

In this attack vector, the problem lies within the local scope of the page's client-side script. Typically, a malicious user simply appends code to a variable within a URL string that links to the vulnerable web page. That code gets executed in line with the valid web page. This type of code can be added to any URL that links to the vulnerable web page, such as a link on another website or one in an HTML-formatted e-mail message, and is usually obfuscated to hide the appended code. Because the DOM allows variables to be retrieved from the URL, a vulnerable web page will extract and run this additional code. What the code does is limited by what JavaScript is allowed to do, so a script cannot normally write to disk or memory. However, it does have access to everything on the web page plus any cookies saved there by the legitimate site.

For example, consider the following HTML code residing on the server for a web page:

    Welcome To Our Safe Site
    Welcome,

A safe and typical web page address may look like the following:

    http://www.safe.site/navigate.php?user=Bob

Upon interpretation of the above URL by the browser, the DOM supplies the JavaScript within the web page with the value of the variable "user" so that it can display the message "Welcome, Bob", all while being completely transparent to the user.
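
The page script described above can be sketched as follows. This is only an illustration of the vulnerable pattern, not the bulletin's actual sample: the function names extractUser and renderWelcomeUnsafe are hypothetical, and the script payload in the usage note is an assumed stand-in for whatever an attacker might append.

```javascript
// Hypothetical sketch of a page script that trusts the "user" URL parameter.

// Return everything that follows "user=" in the URL, decoded, or "" if absent.
function extractUser(url) {
  const marker = "user=";
  const pos = url.indexOf(marker);
  if (pos === -1) {
    return "";
  }
  return decodeURIComponent(url.substring(pos + marker.length));
}

// UNSAFE: the value is spliced into the markup verbatim, so any HTML or
// script it contains becomes part of the page that the browser interprets.
function renderWelcomeUnsafe(url) {
  return "Welcome, " + extractUser(url);
}
```

Called with the address ending in "user=Bob", renderWelcomeUnsafe yields the intended greeting. Called with an address whose "user" value is a script block, it returns that block untouched, which is exactly the text a browser would then treat as code. In a real page, assigning the value to an element's textContent instead of writing it into the markup is one way to have it displayed as plain text.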
For example, the HTML source after processing this URL would look as such:

    Welcome To Our Safe Site
    Welcome, Bob

However, just as this provides much convenience for the programmer, it also provides an opportunity for attackers to insert their own malicious code to be interpreted in place of the user's name value. Suppose a malicious user executed the following URL within the browser:

    http://www.safe.site/navigate.php?user=

The DOM supplies whatever follows "user=" (here, a script block) as the value of the variable "user", and the page writes that code into the HTML in place of a name. This new script is also executed by the web browser, changing the purpose of this variable's value from simple display to execution. In this example, all the script does is pop up a window with your Cookie information for the site, but it could be worse. For example, imagine that this web page was the login to your bank and the script changes the URL to which your login information is sent from the bank's site to the intruder's site. This type of exploit allows any JavaScript code to be run within the page with the local user's permissions and local execution privileges (i.e. executing the code and extracting information residing on the local machine) within the limits of what JavaScript is allowed to do.

II. Non-Persistent
- ----------------------

This type of XSS exploit, known as "Non-Persistent" or "Reflected", is by far the most common method found today in the majority of Phishing emails and is similar to the DOM-based method. It also involves a malicious URL string that might require a little social engineering in order to get the user to navigate to it, and so execute it, within their browser. However, the malicious code and URLs involved in these types of attacks are actually submitted to the hosting server, which echoes the code back within the page it returns; the code then executes in the user's browser as though it came from the trusted site. In DOM-based attacks, by contrast, the page's own client-side script inserts the code locally and the server need not handle it at all.
These attacks can lead to serious information compromise: because the injected script runs inside the victim's authenticated session, it can read or request whatever personal information the server makes available to that user.

For example, suppose the site www.safebank.com has a main login page of www.safebank.com/index.php?page=Login . What this URL tells us is that the website is set up to use the "page" variable to determine which page you are trying to navigate to, and thus which page it should display. If the programmer has not properly coded this script to sanitize the input string (i.e. making sure that the "page" variable you are sending is actually a valid page), then it could be vulnerable to this type of attack.

In such an instance, let's say the user Alice has an account at SafeBank. She navigates to the home page and logs in to check her balance for the day. In doing so, the server creates a Cookie for her (possibly containing personal information, but in this case we will simplify and assume it does not, and that it is used purely for ID purposes) and returns it to her system, where it is stored for later use. Now, as long as Alice does not log out or close her browser (i.e. as long as she does not invalidate the Cookie, and as long as it has not expired), every time she navigates to another page within the website, the Cookie is sent back to the bank's server along with each request for a new page, proving that she is a valid user and is allowed to navigate through her personal account without having to enter her login credentials again. While incredibly convenient, this acts as the catalyst for a Non-Persistent XSS attack.

Now, let's assume a malicious user, Ken, knows about this bank's flaw in the way it retrieves and displays web pages. He can use this knowledge, along with the fact that Alice's computer contains a valid Cookie, in order to retrieve personal information about her, and her account, stored on the server. How would he do that?
Well, the prime method for doing so tends to be Phishing emails. All Ken has to do is send an email to Alice, posing as SafeBank, and somehow coerce her into clicking the specially crafted URL contained within the message. An example email could be as follows:

    From: SafeBank
    To: Alice

    Dear Customer,

    SafeBank has recently observed some questionable online activity within your account. In order to be safe, please click on the following URL to verify your login credentials and change your password.

    http://www.safebank.com/index.php?page=PasswordChange

This URL looks exactly the same as the one Alice sees when she regularly logs in to the bank's website. However, within this link, although obfuscated, is embedded code such as JavaScript. If one simply hovered their mouse over the link, they would see something more along these lines:

    http://www.safebank.com/index.php?page=

*Note that this code is rarely in plain text and is usually obfuscated via some type of encoding so that its construction and purpose are not initially clear to the user observing it.

When Alice clicks on this link, the URL as well as her Cookie is sent to the bank's server. The server receives the Cookie and determines that she is a valid user for the system, has already provided the proper login credentials, and still has a valid session. Now that authentication is taken care of, the server provides any requested information residing on the server to the web page. Additionally, since it trusts that the page variable is a valid page, it writes whatever the variable contains inline into the HTML it returns to Alice's browser, which then executes the embedded JavaScript as if it were part of the bank's own page (because, after all, it arrived from the bank in response to her own request). Now, depending on what the JavaScript code's intention is, all of the personal information her session can reach can be retrieved.
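
The flaw in this scenario is that the server trusts the "page" variable without checking that it names a real page. A minimal sketch of such a check follows, assuming an allow-list approach; the names VALID_PAGES and resolvePage, and the choice of "Login" as the fallback, are hypothetical and not part of any SafeBank code.

```javascript
// Hypothetical allow-list of page names the site is willing to serve.
const VALID_PAGES = new Set(["Login", "PasswordChange"]);

// Read the "page" query parameter and return it only if it is on the
// allow-list; anything else (including script text) falls back to "Login".
function resolvePage(requestUrl) {
  const requested = new URL(requestUrl).searchParams.get("page");
  return VALID_PAGES.has(requested) ? requested : "Login";
}
```

Because the value returned is always one of the known page names, nothing supplied by the requester is ever written back into the response, so a crafted link of the kind Ken sends would simply display the login page.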
The JavaScript could contain code to request her personal information retained on the bank's server (such as name, SSN, address, bank account numbers, balances, etc.) using her session and send it to a remote server where Ken is just sitting and collecting all of this information for later (mis)use.

All of this has happened without Alice suspecting a thing. The web page looks completely legitimate (as it should, since it is the actual bank's web page!), appearing no different than before (because the JavaScript code executes invisibly in the background). Alice continues using the valid web page to change her password, never realizing what has occurred.

III. Persistent
- ------------------

A "Persistent" (also referred to as "Stored" or "Second-Order") XSS attack is considered to be the most powerful method of compromise. Such an attack doesn't necessarily need any type of social engineering and has a much longer lifetime than the others. An attacker simply injects the scripts into a vulnerable server once and has them propagate and execute in the browsers of all users who retrieve and display the infected page.

Consider the infamous case of the MySpace "Samy" (JS.SpaceHero) XSS Worm. This is a prime example of a persistent XSS attack. For anyone not familiar, MySpace is a popular social networking site that allows registered users to post social information on their web pages, as well as post messages on others' web pages (in HTML format, no less). When these messages were submitted and posted, the data was not properly sanitized, thus allowing users to include JavaScript and have it execute upon viewing. This allowed for an incredible chain reaction of events to ensue. The code only had to be submitted to one person's page.
When anyone viewed the posted message, the included code executed, grabbed their cookie information and sent the same code to all of the viewer's friends (which the MySpace server accepted as valid because the requests carried each viewer's own valid cookies), all the while being completely invisible to the user. Then, anyone who viewed the messages sent to those friends got infected, and so on. You can see the pattern here. It continued on exponentially until it was finally caught and quarantined 24 hours later, by which time it had already propagated to 1 million people. All it took was one single submission to a website that stored the malicious code on the server and served it to anyone (unknowingly) requesting it.

The attack could easily have been much worse. If this user had included code to steal all of each user's cookies, the attacker could have submitted even more false (yet valid to the server) requests to many different sites with these stolen credentials (a method known as Cross-Site Request Forgery, which is explained in the next section). This is why this type of attack is the most powerful of all. There is no need to repeatedly hide, submit, and execute code within a user's local browser window, and best of all for the attacker, there is an exponential return on the small amount of code provided.

IV. Cross-Site Request Forgery
- ----------------------------------

In the previous example of "Persistent" XSS attacks, we used the scenario of an attacker extracting and using logged-in users' Cookie credentials to submit valid requests on their behalf to the hosting server. However, consider the situation in which a user has logged into their bank account and email account, and has logged in and bought things from various online vendors, without ending any of these sessions (i.e. not logging out).
Now, they have Cookies residing on their system for each of these sites, which allow the user to return to each one without having to provide login credentials again for a given period of time, or until they log out and thus erase these Cookies. Suppose the user now decides to log into MySpace to check a friend's profile, which is hosting "Persistent" XSS code that silently issues a request to the user's bank.

Without the user knowing it, their web browser has made a request to a completely unrelated server (in this case their bank), using their own login credentials in the Cookie residing on their system (so it's completely valid), and performed actions that the user has not explicitly verified (transferred money from their account to the attacker's). This type of attack greatly widens the range of attacks available to malicious code. Not only can the code mount attacks against the server hosting it, but also against any other server of the attacker's choosing, provided the attacker knows how to formulate a proper request and the unsuspecting user still has a valid session remaining with the given site (*Note that this is a non-subtle hint to those who stay logged in to their email accounts all day and those who don't click the "logout" button once they are finished with an online transaction or session).

Methods of Protection
=====================

Security begins at the programming level. A website is only as secure as its code. Therefore, the code should be the first focus of effort. For example, many websites use the PHP and MySQL combination due to its widespread availability, lack of cost, and ease of use. However, PHP in its default configuration does not sanitize user input. This responsibility lies solely in the hands of the programmer. By using htmlspecialchars() (or htmlentities()) when writing user input into a page, and mysql_real_escape_string() when placing it in a database query, programmers can protect against XSS attacks as well as stealthy SQL Injection attacks (which are not covered in this paper).
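
To make the encoding step concrete, here is a minimal sketch of the kind of translation htmlspecialchars() performs, written in JavaScript to match the client-side examples in this bulletin. It is not PHP's implementation: the name escapeHtml and the table HTML_ENTITIES are hypothetical, and the set of five characters handled is an assumption chosen for illustration.

```javascript
// Hypothetical lookup table: each HTML special character and its entity.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replace every special character with its entity so that a browser
// displays the text literally instead of interpreting it as markup.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}
```

Ordinary input such as a name passes through unchanged, while input containing markup comes out as inert text. Encoding "&" along with the other characters in a single pass avoids double-encoding entities that the function itself has just produced.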
htmlspecialchars() and htmlentities() sanitize user input against XSS by translating HTML special characters to encoded values that are simply displayed and not executed; mysql_real_escape_string() does the corresponding job for characters that are special within SQL strings. If you recall any given sample of XSS previously described, you will see that in order for it to be interpreted and executed by a browser, it needs to be in the proper format (as with any code). The format for HTML scripting within a web page happens to begin with the "<script>" tag. Therefore, in order for the code to run, it would most likely contain the HTML special characters "<" and ">", and might contain "(" and ")" for certain method calls. Without these specific characters, the browser cannot interpret the text as code and thus cannot execute it. Removing or encoding any small part of these tags renders the code useless. Therefore, at the very least, input retrieval and submission from a web page should be coded to have the "<", ">", "(", ")", "'", """, and "&" characters filtered out or HTML encoded to "&lt;", "&gt;", "&#40;", "&#41;", "&#39;", "&quot;", and "&amp;" before being processed by the server. This invalidates the given XSS samples, converting "