-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

             __________________________________________________________

                       The U.S. Department of Energy
                   Computer Incident Advisory Capability
                           ___  __ __    _     ___
                          /       |     /_\   /
                          \___  __|__  /   \  \___
             __________________________________________________________
 
                              TECHNICAL BULLETIN 
 
                    Understanding Cross-Site Scripting (XSS)

June 3, 2008 18:00 GMT                                   Number CIACTech08-003
______________________________________________________________________________
PROBLEM:       Cross-Site Scripting has become an increasingly prevalent 
               attack vector that can be leveraged to perform a wide range of 
               compromises. Web site developers need to understand the problem 
               of cross-site scripting to reduce the likelihood that their 
               servers will be used to compromise their users. 
PLATFORM:      All 
ABSTRACT:      Cross-Site Scripting has become an increasingly prevalent 
               attack vector that can be leveraged to perform a wide range of 
               compromises. These compromises can range from simple popup 
               displays within a user's browser to session and cookie capture 
               that are used for information and identity theft. As these 
               attacks become more mature, as well as obscure, it is 
               imperative that we understand how they happen, how they 
               propagate, and the ways to prevent them. By understanding the 
               different vectors of attack and realizing and implementing 
               simple security measures against them, we can better protect 
               ourselves and our users now, and in the future. 
______________________________________________________________________________
LINKS: 
 CIAC BULLETIN:      http://www.ciac.org/ciac/techbull/CIACTech08-003.shtml 
 OTHER LINKS:        HTTP Cookies: http://en.wikipedia.org/wiki/HTTP_cookie 
                     Samy (XSS) Worm: http://en.wikipedia.org/wiki/Samy_(XSS) 
                     Htmlspecialchars() Function: 
                     http://us.php.net/htmlspecialchars 
                     Htmlentities() Function: 
                     http://us3.php.net/htmlentities 
                     PHP mysql_real_escape_string() Function: 
                     http://www.w3schools.com/php/func_mysql_real_escape_string.asp 
                     What is SQL Injection?: 
                     http://www.cgisecurity.com/questions/sql.shtml 
                     ASCII Table and Description: http://www.asciitable.com 
                     Disabling Scripts: 
                     http://www.cert.org/tech_tips/malicious_code_FAQ.html 
                     NoScript: http://noscript.net/ 
                     Gilby Productions. TinyURL!: http://tinyurl.com/#example 
______________________________________________________________________________

Introduction
============

The term Cross-Site Scripting (also known as "XSS") refers to the act of a user
injecting malicious code into a web page in order to perform some sort of
exploit. This term is not to be confused with "CSS", which refers to Cascading
Style Sheets - a completely unrelated topic. XSS exploits the trust that the
web browser (and the user) places in the page being viewed: because the page is
considered safe, the browser downloads and/or runs the malicious code that has
been embedded within it. The effects of this vulnerability can range from
simple information compromise via Cookie retrieval to complete identity theft
via session hijacking, all while remaining invisible to the end users.

Types of Cross-Site Scripting
=============================

There are several XSS attack vectors that are used to run the malicious code.
Each of these vectors is equally potent and allows an attacker to custom tailor
an injection to the specific vulnerabilities found within the web page.

I.	DOM-Based
- -----------------

DOM stands for Document Object Model, a standard set forth by W3C to allow
programs and scripts to dynamically access and modify content within documents.
For HTML, this means that this technique can be used to modify content within
the page, on the fly, based on instructions retrieved from an external source,
such as the end user. Programmers may or may not choose to use it directly;
however, JavaScript requires the DOM in order to dynamically read or write to a
web page. The DOM provides scripts with objects that expose events and values
such as the page's URL, which a script can parse and act upon according to its
parameters and values. This allowance for dynamic execution of code plays the
main role in this type of XSS attack.

In this attack vector, the problem lies within the page's own client-side
script. Typically, a malicious user simply appends code to a variable within a
URL string that links to the vulnerable web page, and that code gets executed
in line with the valid web page. The crafted URL can be placed anywhere a link
can, such as on another website or in an HTML-formatted e-mail message, and is
usually obfuscated to hide the appended code. Because the DOM lets the page's
script retrieve variables from the URL, a vulnerable web page will extract and
run this additional code. What the code can do is limited by what JavaScript is
allowed to do, so a script cannot normally write to disk or memory. However, it
does have access to everything on the web page plus any cookies saved there by
the legitimate site.

For example, consider the following HTML code residing on the server for a web
page:

  <HTML>
  <TITLE>Welcome To Our Safe Site</TITLE>
  Welcome, <!-- This is where the user's name will be displayed -->
  <SCRIPT>

  /* This finds where the user's name starts in the URL string
     ("user=" is 5 characters long) */
  var user_name = document.URL.indexOf("user=")+5;

  /* This writes the retrieved user's name to the page */
  document.write(document.URL.substring(user_name,document.URL.length));

  </SCRIPT>
  </HTML>

A safe and typical web page address may look like the following:

  http://www.safe.site/navigate.php?user=Bob 

Upon interpretation of the above URL by the browser, the DOM supplies the
JavaScript within the web page with the value of the variable "user" in order
to display the message of "Welcome, Bob", all while being completely
transparent to the user. For example, the HTML source after processing this URL
would look as such:

  <HTML>
  <TITLE>Welcome To Our Safe Site</TITLE>
  Welcome, Bob
  </HTML>

However, just as this provides much convenience for the programmer, it also
provides an opportunity for attackers to insert their own malicious code to be
interpreted in place of the user's name value. Suppose a malicious user
executed the following URL within the browser:

  http://www.safe.site/navigate.php?user=<script>alert(document.cookie)</script>

The script reads "<script>alert(document.cookie)</script>" as the value of the
URL variable "user" and writes this code to the HTML page instead of a name.
This new script is also executed by the web browser, changing the purpose of
this variable's value from simple display to execution. Now, all this script
does is pop up a window with your Cookie information for the site, but it could
be worse. For example, imagine that this web page was the login to your bank
and the script changes the URL to which your login information is sent from the
bank to the intruder's site. This type of exploit allows any JavaScript code to
be run within the page with the same access as the page's legitimate scripts
(i.e. reading and changing the page's content and reading the Cookies the site
has stored on the local machine), within the limits of what JavaScript is
allowed to do.
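
One way to close the hole in the sample page above is to parse the URL properly
and to accept only values that look like a name. The following is a minimal
sketch in JavaScript, not a definitive fix; the function name readUserName, the
fallback value "guest", and the rule for what counts as a name are all
assumptions made for illustration.

```javascript
// Hypothetical rule for an acceptable display name (an assumption, not part
// of the original page): a letter followed by up to 31 letters, digits,
// spaces, dots, underscores, or hyphens.
const NAME_PATTERN = /^[A-Za-z][A-Za-z0-9 ._-]{0,31}$/;

// Reads the "user" variable from a page URL and returns it only if it
// matches NAME_PATTERN; anything else is replaced by a harmless default.
function readUserName(pageUrl) {
  const value = new URL(pageUrl).searchParams.get("user");
  if (value !== null && NAME_PATTERN.test(value)) {
    return value;
  }
  return "guest";
}

console.log(readUserName("http://www.safe.site/navigate.php?user=Bob"));
```

In the page itself, the returned value would ideally be assigned to an
element's textContent property rather than passed to document.write(), so that
the browser treats it as text and never as markup.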

II.	Non-Persistent
- ----------------------

This type of XSS exploit, known as "Non-Persistent" or "Reflected", is by far
the most common method found today in the majority of Phishing emails and is
similar to the DOM-based method. It also involves a malicious URL string that
might involve a little social engineering in order to get the user to navigate
to, and execute, it within their browser. However, the malicious code and URLs
involved in these types of attacks are actually submitted to the hosting
server, which echoes ("reflects") the code back inside the page it generates.
The code still executes in the user's browser, as in the DOM-based attacks, but
it arrives as part of the server's own response rather than being written into
the page by a client-side script.

These attacks allow for much greater information compromise, as the injected
code runs within the user's authenticated session and can retrieve whatever
personal information the server is willing to show that user. For example,
suppose the site www.safebank.com has a main login page of
www.safebank.com/index.php?page=Login. What this URL tells us is that the
website is set up to use the "page"
variable to determine which page you are trying to navigate to, and thus which
page it should display. If the programmer has not properly coded this script to
sanitize the input string (i.e. making sure that the "page" variable you are
sending is actually a valid page), then it could be vulnerable to this type of
attack.
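
The check described above ("is this actually a valid page?") can be sketched as
a simple allow-list. This is an illustration only, written in JavaScript for
consistency with the other examples even though the sample site uses PHP. The
function name resolvePage and the page name "Balance" are assumptions; "Login"
and "PasswordChange" are the page names used in this bulletin.

```javascript
// The only page names the site is willing to serve.
// "Balance" is a hypothetical entry added for illustration.
const VALID_PAGES = new Set(["Login", "PasswordChange", "Balance"]);

// Returns the requested page name if it is on the allow-list and falls back
// to the login page otherwise. The raw input is never echoed into output.
function resolvePage(requested) {
  return VALID_PAGES.has(requested) ? requested : "Login";
}

console.log(resolvePage("PasswordChange"));
console.log(resolvePage("<script>alert(document.cookie)</script>"));
```

Because the value that reaches the page is always one of a few fixed strings,
there is nothing attacker-controlled left to reflect back to the browser.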

In such an instance, let's say the user Alice has an account at SafeBank. She
navigates to the home page and logs in to check her balance for the day. In
doing so, the server creates a Cookie (possibly containing personal
information, but in this case we will simplify it and assume it does not, and
it is used purely for ID purposes) for her and returns it to her system, which
then is stored for later use. Now, as long as Alice does not log out or close
her browser (i.e. as long as she does not invalidate the Cookie, and as long as
it is not expired), every time she navigates to another page within the
website, the Cookie is sent back to the bank's server along with each request
for a new page, proving that she is a valid user and is allowed to navigate
through her personal account without having to enter her login credentials
again. While being incredibly convenient, this acts as the catalyst for a
Non-Persistent XSS attack.

Now, let's assume a malicious user, Ken, knows about this bank's flaw in the
way it retrieves and displays web pages. He can use this knowledge, along with
the fact that Alice's computer contains a valid Cookie, in order to retrieve
personal information about her, and her account, stored on the server. How
would he do that? Well, the prime method for doing so tends to be Phishing
emails. All Ken has to do is send an email to Alice, posing as SafeBank, and
somehow coerce her into clicking the specially crafted URL contained within
the message. An example email could be as follows:

  From: SafeBank
  To: Alice

  Dear Customer,

  SafeBank has recently observed some questionable online activity
  within your account. In order to be safe, please click on the following 
  URL to verify your login credentials and change your password. 
	
  	http://www.safebank.com/index.php?page=PasswordChange

This URL looks exactly the same as the one Alice sees when she regularly logs
in to the bank's website. However, within this link, although obfuscated, is
embedded code such as JavaScript. If one simply hovered their mouse over the
link, they would see something more along these lines:

 
http://www.safebank.com/index.php?page=<script>javascript:var...some...more... 
  code... to...  execute... navigate to real PasswordChange page</script> 

*Note that this code is rarely in plain text; it is usually obfuscated via some
type of encoding so that its construction and purpose are not initially clear
to the user observing it.

When Alice clicks on this link, the URL as well as her Cookie is sent to the
bank's server. The server receives the Cookie and determines that she is a
valid user for the system, has already provided the proper login credentials,
and still has a valid session. Now that authentication is taken care of, the
server provides any requested information residing on the server to the web
page. Additionally, since it trusts that the page variable is a valid page, it
writes whatever the variable contains inline with the HTML output sent to
Alice's browser. The browser cannot tell the embedded JavaScript apart from the
bank's own scripts, so it executes the code within Alice's session as if the
bank had written it. Now, depending on what the JavaScript code's intention is,
all of her personal information can be retrieved. The JavaScript could contain
code to send all of her personal information available through the bank's site
(such as name, SSN, address, bank account numbers, balances, etc.) back to a
remote server where Ken is just sitting and collecting all of this information
for later (mis)use.

All of this has happened without Alice suspecting a thing. The web page looks
completely legitimate (as it should since it is the actual bank's web page!),
appearing no different than before (due to the fact that the JavaScript code
executes invisibly in the background). Alice continues using the valid web page
to change her password, never realizing what has occurred.

III.	Persistent
- ------------------

A "Persistent" (also referred to as "Stored" or "Second-Order") XSS attack is
considered to be the most powerful method of compromise. Such an attack doesn't
necessarily need any type of social engineering and has a much longer lifetime
than the others. An attacker simply injects the script into a vulnerable server
once and has it propagate to, and execute in, the browser of every user who
retrieves and displays the infected page.

Consider the infamous case of the MySpace "Samy" (JS.SpaceHero) XSS Worm. This
is a prime example of a persistent XSS attack. For anyone not familiar, MySpace
is a popular social networking site that allows registered users to post social
information on their web pages, as well as post messages on others' web pages
(in HTML format, nonetheless). In submitting and posting these messages, the
data was not properly sanitized, thus allowing users to include JavaScript and
have it execute upon viewing. This allowed for an incredible chain reaction of
events to ensue.

The code only had to be submitted to one person's page. When anyone viewed the
infected profile, the included code executed in their browser and, using their
logged-in session, copied itself onto the viewer's own profile (which the
MySpace server accepted as valid because the requests carried the viewer's
valid cookies), all the while being completely invisible to the user. Then,
anyone who viewed that newly infected profile got infected and so on. You can
see the pattern here. It continued on exponentially until it was finally caught
and quarantined 24 hours later, by which time it had already propagated to 1
million people.

All it took was one single submission to a website that stored the malicious
code on the server and posted it to anyone (unknowingly) requesting it. The
attack could have easily been much worse. If this user had included code to
steal all of each user's cookies, the attacker could have submitted even more
false (but yet valid to the server) requests to many different sites with these
stolen credentials (known as the method of Cross-Site Request Forgery which
will be explained later). This is why this type of attack is the most powerful
of all. There is no need to repetitively hide, submit, and execute code within
a user's local browser window, and best of all, you get an exponential return
on the small amount of code provided.

IV.	Cross-Site Request Forgery
- ----------------------------------

In the previous example of "Persistent" XSS attacks, we used the scenario of an
attacker extracting and using logged in users' Cookie credentials to submit
valid requests on their behalf to the hosting server. However, consider the
situation in which a user has logged into their bank account, email account,
and logged in and bought things from various online vendors without ending any
of these sessions (i.e. not logging out). Now, they have Cookies residing on
their system for each of these sites, which allows the user to return to each
one without having to provide login credentials again for a given period of
time, or until they log out and thus erase these Cookies. Suppose the user
decides to log into MySpace now to check his friend's profile, which is hosting
the following "Persistent" XSS code:

  <img src="http://your.bank.com/withdraw?fromAccount=Bob&toAccount=
      Haxxor... &moneyAmount=10000&valid=true">

Without the user knowing it, their web browser has made a request to a
completely unrelated server (in this case their bank), using their own login
credentials in the Cookie residing on their system (so it's completely valid),
and performed actions that the user has not explicitly verified (transferred
money from their account to the attacker's). This type of attack opens up an
unlimited realm of attack for malicious code. Not only can the code launch
attacks against the server hosting it, but also against any other server of the
attacker's choosing, provided the attacker knows how to formulate a proper
request and the unsuspecting user still has a valid session remaining with the
given site (*Note that this is a non-subtle hint to those who stay logged in to
their email accounts all day and those who don't click the "logout" button once
they are finished with an online transaction or session).

Methods of Protection
=====================

Security begins at the programming level. A website is only as secure as its
code. Therefore, the code should be the first focus of effort. For example,
many websites are using the PHP and MySQL combination due to its widespread
availability, lack of cost, and ease of use. However, PHP in its default
configuration does not sanitize user input. This responsibility lies solely in
the hands of the programmer. By simply utilizing the methods of
htmlspecialchars() (or htmlentities()) and mysql_real_escape_string(),
programmers can protect against XSS attacks as well as stealthy SQL Injection
attacks (which were not covered in this paper). The first two methods guard
against XSS by translating HTML special characters to encoded values that are
simply displayed and not executed; mysql_real_escape_string() guards against
SQL Injection by escaping characters that have special meaning in a SQL
statement.

If you recall any given sample of XSS previously described, you will see that
in order for them to be interpreted and executed by a browser, they need to be
in the proper format (as with any code). The format for HTML scripting within a
web page happens to begin with the "<script>" tag and end with the "</script>"
tag. Therefore, in order for the code to run, it would most likely contain the
HTML special characters of "<", ">", and might contain "(" and ")" for certain
method calls. Without these specific characters, the browser cannot interpret
the text as code and thus cannot execute it. Removing or encoding any small
part of these tags renders the code useless. Therefore, at the very least,
input retrieval and submission from a web page should be coded to have the "<",
">", "(", ")", "'", """, and "&" characters filtered out or HTML encoded to
"&lt;", "&gt;", "&#40;", "&#41;", "&#39;", "&quot;", and "&amp;" before being
processed by the server. This invalidates the given XSS samples, converting
"<script>" to "script" (or "&lt;script&gt;") and "alert(document.cookie)" to
"alert&#40;document.cookie&#41;", preventing any of it from being interpreted
as a proper HTML script.
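
The encoding step described above can be sketched as a small JavaScript
function. This is a minimal illustration under stated assumptions rather than a
complete defense: the names HTML_ENCODINGS and encodeForHtml are hypothetical,
and only the seven characters named in this bulletin are handled.

```javascript
// Maps each character named above to its HTML-encoded form.
const HTML_ENCODINGS = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  "(": "&#40;",
  ")": "&#41;",
  "'": "&#39;",
  '"': "&quot;",
};

// Replaces every listed character in a single pass, so an "&" produced by
// one replacement is never encoded a second time.
function encodeForHtml(text) {
  return String(text).replace(/[&<>()'"]/g, (ch) => HTML_ENCODINGS[ch]);
}

console.log(encodeForHtml("<script>alert(document.cookie)</script>"));
```

A browser that receives the encoded text displays the original characters on
screen but does not treat them as the start of a tag or a method call.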

However, just a simple conversion isn't always enough. Attackers are getting
smarter and realizing how to trick these methods into treating malicious code
as harmless. For example, a method to sanitize input might be looking for the
HTML special character "<", which should never be submitted in any
circumstance. However, if the attacker converts the "<" to its URL-encoded
hexadecimal equivalent, "%3C", a filter that only looks for the literal
character does not sanitize the text and passes it on. When the "%3C" is later
decoded, it is treated as if it were a "<" (in ASCII format), rendering the
full "<script>" argument within the HTML code and thus completely bypassing
this security measure. Input sanitizing procedures need to also look for and
encode these equivalent character encodings.
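
The bypass described above, and one way to counter it, can be sketched as
follows. The idea is to decode the input to a canonical form before checking
it. This is an illustration only: the function names canonicalize and
containsAngleBracket and the limit of three decoding rounds are assumptions,
and decodeURIComponent() is the standard JavaScript decoder for "%XX"
sequences.

```javascript
// A naive check that only recognizes the literal characters.
function containsAngleBracket(text) {
  return /[<>]/.test(text);
}

// Decodes "%XX" sequences repeatedly (up to maxRounds times) so that singly
// or doubly encoded input is reduced to the characters it stands for.
function canonicalize(input, maxRounds = 3) {
  let current = input;
  for (let round = 0; round < maxRounds; round++) {
    let decoded;
    try {
      decoded = decodeURIComponent(current);
    } catch (err) {
      return current; // malformed escape sequence: keep what we have
    }
    if (decoded === current) {
      break; // nothing left to decode
    }
    current = decoded;
  }
  return current;
}

const submitted = "%3Cscript%3E";
console.log(containsAngleBracket(submitted));               // check on raw input
console.log(containsAngleBracket(canonicalize(submitted))); // check after decoding
```

The first check misses the encoded "<" while the second one finds it. Encoding
the output for HTML remains necessary either way; canonicalizing only ensures
that input checks see the same characters that will eventually be rendered.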

Excessively long time periods of Cookie validity and lack of user credential
re-verification upon return to a site are just a few things that allow for
Cross-Site Request Forgery. In order to lessen the occurrence of these,
programmers should pair the Cookie with unique IDs, or "tokens", that are only
valid for a short time period (until the user navigates away from the page or
until a specified fixed time has elapsed, thus not necessarily requiring an
explicit logout), tied to the user's specific IP address (making sure any
requests coming from another IP would be invalid), or something similar. Upon
returning to the page, users are asked to verify their credentials again and
are issued a new unique identifier to use, and so forth. Such a process renders
these Cross-Site Request Forgeries useless, assuming the attackers don't know
the login information and thus can't renew the unique identifier for another
valid session.
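
The short-lived token described above might be sketched as follows for a
server written in JavaScript (Node.js). Everything here is a simplified
assumption: the names issueToken and verifyToken, the ten-minute lifetime, and
the in-memory Map stand in for whatever session storage a real site uses. The
crypto calls (randomBytes and timingSafeEqual) are part of the Node.js standard
library.

```javascript
const crypto = require("crypto");

// Hypothetical lifetime for a token: ten minutes.
const TOKEN_LIFETIME_MS = 10 * 60 * 1000;

// sessionId -> { token, expiresAt }; a stand-in for real session storage.
const tokens = new Map();

// Creates a fresh random token for a session and records when it expires.
function issueToken(sessionId, now = Date.now()) {
  const token = crypto.randomBytes(32).toString("hex");
  tokens.set(sessionId, { token, expiresAt: now + TOKEN_LIFETIME_MS });
  return token;
}

// Accepts a request only if it carries the unexpired token that was issued
// to the same session. A forged cross-site request carries the Cookie but
// has no way of knowing the token.
function verifyToken(sessionId, submitted, now = Date.now()) {
  const entry = tokens.get(sessionId);
  if (!entry || now > entry.expiresAt) {
    return false;
  }
  const expected = Buffer.from(entry.token);
  const actual = Buffer.from(String(submitted));
  if (expected.length !== actual.length) {
    return false;
  }
  return crypto.timingSafeEqual(expected, actual);
}

const token = issueToken("alice-session");
console.log(verifyToken("alice-session", token));
console.log(verifyToken("alice-session", "guessed-value"));
```

The server would embed the issued token in each form it sends to the user and
reject any state-changing request, such as the withdrawal in the earlier
example, that does not return it.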

At the user-level, users should ideally disable client-side scripting (scripts
that are downloaded to your machine and executed within the context of your
browser, thus getting access to all information residing within it). Scripting
can be disabled in your browser's settings; however, the best option is to run
a more secure browser such as Firefox and install the NoScript extension.
NoScript is installed
with a default rule set to disallow all client-side scripting for all websites.
However, with one click you can enable it for all future uses with the trusted
site you are currently visiting. This gives you the power and discretion to
pick and choose which sites you want to let run scripts within your browser and
prevents you from getting blindsided by unfamiliar sites that are housing
malicious scripts.

Moreover, be aware of what is going on in your browser. Hover over links before
clicking on them to see if the target site is the same as the projected site.
Be aware of excessively long URLs and, at the other extreme, URLs that have
been shortened by an online converter such as "TinyURL!", which takes any URL
and converts it to a shorter one with the same destination. For example,
TinyURL would take the URL of...

  http://www.mapquest.com/maps/map.adp?ovi=1&mqmap.x=300&mqmap.y=75&mapdata=
      %252bKZmeiIh6N%252bIgpXRP3bylMaN0O4z8OOUkZWYe7NRH6ldDN96YFTIUmSH3Q6OzE
      5XVqcuc5zb%252fY5wy1MZwTnT2pu%252bNMjOjsHjvNlygTRMzqazPStrN%252f1YzA0o
      WEWLwkHdhVHeG9sG6cMrfXNJKHY6fML4o6Nb0SeQm75ET9jAjKelrmqBCNta%252bsKC9n
      8jslz%252fo188N4g3BvAJYuzx8J8r%252f1fPFWkPYg%252bT9Su5KoQ9YpNSj%252bmo
      0h0aEK%252bofj3f6vCP

...and convert it to...
 
  http://tinyurl.com/6.
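
The "hover and compare" habit described above can also be automated, for
example by a mail filter. The following JavaScript sketch is only an
illustration: the function names hostOf and linkTargetMismatch are assumptions,
and it can only compare link text that is itself written as a complete URL.

```javascript
// Returns the host name of a complete URL, or null if the text is not one.
function hostOf(text) {
  try {
    return new URL(text).hostname.toLowerCase();
  } catch (err) {
    return null;
  }
}

// Reports true when the text shown for a link names one site but the link
// actually leads to another. Text that is not a URL cannot be compared, so
// it is not flagged.
function linkTargetMismatch(displayText, href) {
  const shown = hostOf(displayText.trim());
  const actual = hostOf(href);
  if (shown === null || actual === null) {
    return false;
  }
  return shown !== actual;
}

console.log(
  linkTargetMismatch(
    "http://www.safebank.com/index.php?page=PasswordChange",
    "http://tinyurl.com/6"
  )
);
```

A check like this does not catch the reflected attack shown earlier, where the
malicious link really does point at the trusted site, so it complements rather
than replaces the other precautions.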

If you receive an email that claims to be from a trusted entity and asks for
personal information, call them. Believe it or not, we still have telephones
(although many are internet-based now; i.e. VoIP) and they are perfect for
getting in touch with a valid entity in such a situation. Question. Question.
Question. I cannot urge that enough. You can never have too many questions in
this day and age of mimics and social engineers. You may waste 5 minutes of
your day placing a call to verify that your bank really does need you to reset
your password. However, that is a small price to pay as compared to losing your
savings due to the fact that you simply could not be inconvenienced by taking
the time to question something. In life, you always pay, whether it is now or
later. Pay now with small portions of your time rather than later with identity
theft, money loss, defamation, and any combination thereof.

Conclusion
==========

XSS attacks have continued to grow and will continue to worsen as attackers
figure out more and more ways to bypass standard security implementations. We
are past the age of reactive measures when a simple patch could retroactively
fix the minimal damage caused by an immature exploit. Just as we take steps in
the direction of fixing these holes and instantiating preventative measures, so
do the attackers in the direction of even more mature and obscure methods of
compromise. We can't just be as creative as the attacker; we have to be more
so. Compromise does not come from what we know; it comes from what we do not.
By failing to think outside of the box and by becoming comfortable in our
current security logic, we become susceptible to the logic of those who do not
settle
for what is known. As the old saying goes, "Where there's a will, there's a
way", and robust and future-intelligent security measures can take that will
away.


CIAC, the Computer Incident Advisory Capability, is the computer
security incident response team for the U.S. Department of Energy
(DOE) and the emergency backup response team for the National
Institutes of Health (NIH). CIAC is located at the Lawrence Livermore
National Laboratory in Livermore, California. CIAC is also a founding
member of FIRST, the Forum of Incident Response and Security Teams, a
global organization established to foster cooperation and coordination
among computer security teams worldwide.

CIAC services are available to DOE, DOE contractors, and the NIH. CIAC
can be contacted at:
    Voice:    +1 925-422-8193 (7x24)
    FAX:      +1 925-423-8002
    STU-III:  +1 925-423-2604
    E-mail:   ciac@ciac.org

Previous CIAC notices, anti-virus software, and other information are
available from the CIAC Computer Security Archive.

   World Wide Web:      http://www.ciac.org/
   Anonymous FTP:       ftp.ciac.org

PLEASE NOTE: Many users outside of the DOE, ESnet, and NIH computing
communities receive CIAC bulletins.  If you are not part of these
communities, please contact your agency's response team to report
incidents. Your agency's team will coordinate with CIAC. The Forum of
Incident Response and Security Teams (FIRST) is a world-wide
organization. A list of FIRST member organizations and their
constituencies can be obtained via WWW at http://www.first.org/.

This document was prepared as an account of work sponsored by an
agency of the United States Government. Neither the United States
Government nor the University of California nor any of their
employees, makes any warranty, express or implied, or assumes any
legal liability or responsibility for the accuracy, completeness, or
usefulness of any information, apparatus, product, or process
disclosed, or represents that its use would not infringe privately
owned rights. Reference herein to any specific commercial products,
process, or service by trade name, trademark, manufacturer, or
otherwise, does not necessarily constitute or imply its endorsement,
recommendation or favoring by the United States Government or the
University of California. The views and opinions of authors expressed
herein do not necessarily state or reflect those of the United States
Government or the University of California, and shall not be used for
advertising or product endorsement purposes.

LAST 10 CIAC BULLETINS ISSUED (Previous bulletins available from CIAC)

S-213: Nukedit 'email' Parameter Vulnerability
S-214: SurgeMail and WebMail 'Page' Command Vulnerability
S-215: Symantec Backup Exec Scheduler ActiveX Control Multiple Vulnerabilities
S-216: Juniper Networks Secure Access 2000 'rdremediate.cgi' Vulnerability
S-217: Drupal Multiple HTML Vulnerabilities
S-218: gd Security Update
S-219: Juniper Networks Secure Access 2000 Web Root Path Vulnerability
S-220: PHP-Nuke My_eGallery Module 'gid' Parameter Vulnerability
S-221: Learn2 STRunner ActiveX Control Vulnerabilities
S-222: Evolution Security Update


-----BEGIN PGP SIGNATURE-----
Version: PGP 8.1

iQA/AwUBSEhYnkrr52ee8YsTEQJTgQCfS7o4Q/Wz3EcDaIcd56XPG53KzYMAoLnk
NJz9kNEmMIb5Bu6vWOsDlsB8
=+Y0A
-----END PGP SIGNATURE-----