Date: Fri, 29 Mar 91 16:35:24 -0500
From: padgett%tccslr.dnet@uvs1.orl.mmc.com (A. Padgett Peterson)
Subject: Six Bytes for Virus Detection paper

WARNING: The method depicted in this paper will not detect every conceivable
         virus; to do so would take far more than six bytes. What it will
         do is detect all currently "common" viruses for a knowledgeable
         user, although CHKDSK can do the same thing if intelligently
         applied. A short .COM file following these principles will make a
         good "first check" before using a scanner to determine if
         something unknown might be resident.

         Some viruses revealed immediately include Brain, Yale, Datalock,
         Stoned, 4096, Fish-6, Flip, Whale, Joshi, MusicBug, and Azusa. 
         TSR viruses such as the Jerusalem, Sunday, and 1701/1704 variants
         will also be revealed if the user is knowledgeable about the system.

						Padgett Peterson, 3/29/91


                  Six Bytes for Virus Detection

                             in the

                        MS-DOS Environment


                                        A. Padgett Peterson, P.E.
                                        Orlando, Florida


                          Introduction

     Considering the size of the population (over fifty million MS-DOS
platforms at last estimate), the 240+ known viruses represent a
relatively small statistic in the macro. In the micro, however, they
can be devastating.

     With  the growth in size of fixed disks and applications,  often 
backups  are obsolete or incomplete where proper discipline  has  not 
been  established. Unfortunately, this seems to include the  majority 
of the non-power users.

     Since  the number of known viruses appears to be  doubling  each 
year, the threat is not diminishing, yet the most accepted utilities, 
John  McAfee's SCAN & CLEAN, rely on detection of  known  infections. 
While  there  are  some  products  that  actually  perform  integrity 
management  of  a system (Certus International  CERTUS,  Enigma-Logic 
VIRUS-SAFE and PC-SAFE, Fischer International PC-WATCHDOG, Dr.  Panda 
BEARTRAP),  most are oriented to file protection rather  than  system 
protection.

     To  adequately  protect  a  machine  that  possesses  no  native 
integrity management requires a layered approach of user  management, 
files/applications management, and systems management. We have a good 
handle  on  the  first two but the  question  of  systems  integrity, 
something  so pervasive in mainframes that it is taken  for  granted, 
does not currently exist for the PC.

     Until recently, the population of viruses, unsuccessful as well
as successful, was too small to support any inferences about their
viability in the general population. At the close of 1990, however,
certain characteristics of "successful" viruses, those listed as
"common" in Patricia Hoffman's Virus Summary, have become clear:

1: Become resident in memory following infection

2: Allocate memory to themselves

3: Redirect part of the operating system (not necessarily interrupts)
 
     Each  of these elements is easily detected, often in  more  than 
one  way, yet few people or programs bother to look. Some years  ago, 
this  author  wrote   three simple assembly language  programs,  each 
about 1k bytes long. The first tests file integrity, the second tests 
disk integrity, and the third tests system integrity. Taken  together 
these still detect every "common" virus, not because they "know"  all 
viruses  but  because  they "know" an  uninfected  system.  There  is 
nothing magical involved, merely a knowledge of how the  architecture 
operates.

     This paper does not address those viruses that attach themselves
to programs or files specifically; rather, consideration is given to
those that attack elements of the operating system. That these
infections may later attack programs or files is incidental. What
follows is a description of the third of these routines.


                          Architecture

     One of the greatest strengths of the architecture of the PC
(both IBM and clones) is its lack of change. Interestingly, this
system has become so pervasive that while all other systems (Mac,
Amiga, etc.) are known by their names or initials, most people
immediately relate the letters PC, standing for Personal Computer, to
a single platform.

     In a world marked by instability, all systems since October  27, 
1982  are  compatible  with  every other  system.  In  fact  programs 
properly written to the PC-DOS 1.0 specification will still run under 
MS-DOS 5.0 and on the latest 486-33 MHz machines. (Out of curiosity,
yesterday I booted my 386-25 with MS-DOS 1.25; I had forgotten about
seeing A: instead of A>.)

     This is possible because the structure of the DOS (disk
operating system), while it has received many enhancements in the
last ten years, has had no deletions. The low level interrupts
(0h-1Fh) still operate to the original specifications and provide all
necessary peripheral operations (disk, keyboard, display, etc.).

     To understand this, one must picture the architecture in layers.
At the bottom are the individual elements: the iAPX 80x86 central
processing unit, the 8255 programmable I/O, the RAM memory, the BIOS
(basic input output system) ROM, the keyboard, the display, and the
disks (not even necessary in a real IBM-PC). Until turned on, these
are merely a collection of inert devices.

     On power on, the CPU performs a simple self test and then  reads 
in the instruction stored at address F000:FFF0h in memory. Since  the 
ROM BIOS is connected to the system so that this address is found  in 
its  non-volatile memory, the program stored there begins  execution. 
The BIOS program performs three functions: 

     The first is a test and system check (POST - power on self test) 
in  a particular order - CPU and the first 64k of memory  (a  failure 
here results in a series of beeps from the speaker, all the processor 
may  be able to do at this point). Next the low level interrupts  are 
loaded  into  the first 64k and checks are made to determine  if  the 
rest   of  memory,  display,  keyboard,  and  connected   disks   are 
operational. Failures now will be shown on the display.

     Second,  the BIOS will scan memory from segment C000h  to  E000h 
for BIOS extensions in hardware. Special hardware cards that  require 
operations  not  found in MS-DOS often contain  their  own  low-level 
interrupt  handlers  &  DOS  makes provision  for  this.  VGA  cards, 
Bernoulli disks, and PLUS Hardcards are good examples.

     Finally, the BIOS will attempt to read in the first sector from
either the first floppy disk (if present) or the first fixed disk (if
the floppy does not respond) and, if successful, will transfer
execution to the first location in that sector.
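
     The extension scan in the second step can be illustrated with a
short simulation. This is only a sketch: the paper does not describe
how an extension is recognized, so the 2 KB stepping, the 55h AAh
signature, the length byte, and the zero checksum are assumptions
labeled in the code, and the "memory" is a plain byte array rather
than real hardware.

```python
# Sketch of the BIOS extension scan from segment C000h up to E000h.
# Assumed details (not stated in the paper): ROMs start on 2 KB
# boundaries with bytes 55h AAh, byte 2 gives the length in 512-byte
# blocks, and all bytes of the ROM sum to zero modulo 256.

STEP = 0x80  # 2 KB expressed in 16-byte paragraphs


def linear(segment):
    """Convert a real-mode segment number to a linear byte address."""
    return segment << 4


def find_extensions(memory, start_seg=0xC000, end_seg=0xE000):
    """Return (segment, size_in_bytes) for each valid extension ROM."""
    found = []
    seg = start_seg
    while seg < end_seg:
        base = linear(seg)
        step = STEP
        if memory[base] == 0x55 and memory[base + 1] == 0xAA:
            size = memory[base + 2] * 512
            if size and sum(memory[base:base + size]) % 256 == 0:
                found.append((seg, size))
                # Resume at the first 2 KB boundary past this ROM.
                step = -(-size // 2048) * STEP
        seg += step
    return found


# Usage: a simulated 1 MB address space holding one 1 KB ROM at C800h.
memory = bytearray(1 << 20)
base = linear(0xC800)
memory[base:base + 3] = bytes([0x55, 0xAA, 2])  # signature, 2 blocks
memory[base + 1023] = -sum(memory[base:base + 1023]) % 256  # checksum
for seg, size in find_extensions(memory):
    print(f"extension ROM at {seg:04X}h, {size} bytes")
```

     The point of the sketch is that whatever passes these few tests
is given control before DOS exists, which is why a ROM extension is
one of the few places where early integrity checking could live.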


     At this point, the PC is a fully functioning computer, but is
not yet an MS-DOS machine. Whether the PC runs Unix, MS-DOS, CP/M-86,
OS/2, or any other system depends on what happens next. It is here,
however, that viruses known as Boot Sector Infectors (Brain, Stoned,
Joshi, MusicBug, Azusa) become resident.

     The  key is that the PC makes an unconditional transfer  to  the 
program found in sector 1 of the selected disk. Only a trivial  check 
is  made  as  to whether it is a valid program  or  conforms  to  any 
standard. It is this willingness to run any code presented to it that 
makes the PC attractive to many viruses and leaves users  unsurprised 
at having to reboot several times a day.
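
     How little the "trivial check" proves can be shown in a few
lines. The paper does not say what the check is; the sketch below
assumes it is a test for the bytes 55h AAh at the end of the sector,
and the sample sectors are invented for illustration.

```python
# Sketch of a boot-time sector check. Assumption (not stated in the
# paper): the "trivial check" is a test for the bytes 55h AAh at the
# end of the 512-byte sector.

SECTOR_SIZE = 512


def passes_trivial_check(sector):
    """True if the sector is 512 bytes long and ends with 55h AAh."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == b"\x55\xAA"


# Any code at all passes, as long as the marker is present.
legitimate = b"\xEB\x3C\x90" + bytes(507) + b"\x55\xAA"  # jump + padding
arbitrary = b"\xCC" * 510 + b"\x55\xAA"                  # junk + marker
blank = bytes(512)                                       # no marker

for name, sector in [("legitimate", legitimate),
                     ("arbitrary", arbitrary),
                     ("blank", blank)]:
    print(name, passes_trivial_check(sector))
```

     Under this assumption the check distinguishes a formatted sector
from an empty one, and nothing more; it says nothing about what the
code in the sector will do.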

     In fact, there are rarely exploited points at which integrity
checking could be layered, either as a hardware ROM extension or as a
"special" partition table. But while products are known that enforce
access control and encryption at this point, virus or malicious
software control seems an afterthought. Hardware devices are also
expensive.

     Should  the access be as intended, the first sector on  a  fixed 
disk  contains  the Partition Table and the program contained  on  it 
defines  the  structure of the disk and  has  information  concerning 
where the Boot Record for the MS-DOS partition is found. Some systems 
such  as the COHERENT operating system will take this opportunity  to 
ask  the  user  if  COHERENT or MS-DOS  is  desired  and  select  the 
appropriate Boot Record matching their choice.

     Next  (or  first  if from a floppy disk),  the  Boot  Record  is 
loaded,  again  unconditionally,  and  the program  found  on  it  is 
executed.  Normally,  the  boot record defines the  disks  and  their 
variables, finds the two DOS operating system files, loads them  into 
memory, and transfers control to them.

     DOS now examines the data left for it by the previous operations 
and loads handlers for the DOS interrupts including 20h-2Fh. Next DOS 
looks  for an optional file in the root directory  called  CONFIG.SYS 
that  can  contain  device  drivers,  executable  files,  and  system 
parameters. These are loaded and executed again without any integrity 
management. The best current anti-viral programs load as part of  the 
CONFIG  process, already too late to prevent or detect some types  of 
infections.

     Finally, DOS loads its Command Line Interpreter, COMMAND.COM,
which as its first action looks in the root directory for a file
called AUTOEXEC.BAT. If it is found, the commands in it are executed;
if not, the familiar requests for Date and Time are presented. DOS is
then ready for use.

     At  this point, the machine should have a  stable  configuration 
with   a   certain  amount  of  low-level  memory  in  use   by   the 
parameter/interrupt tables, DOS, and TSR programs. Free memory should 
be  left  to the (normally) 640k/segment A000h boundary, and  video
buffers,  hardware  drivers  and the resident part of  the  BIOS  (in 
segment F000h) should reside in the address space above 640k.
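
     The boundaries just named are easier to keep straight with the
segment arithmetic written out. The sketch below converts
segment:offset pairs to linear addresses and sorts a segment into the
three regions described above; the region labels are wording chosen
for this illustration, not DOS terminology.

```python
# Sketch of the real-mode address arithmetic behind the memory map.
# The region names are labels chosen for this illustration.

def linear(segment, offset=0):
    """Linear address of segment:offset (segment * 16 + offset)."""
    return (segment << 4) + offset


def region(segment):
    """Classify a segment against the boundaries named in the text."""
    if segment < 0xA000:
        return "conventional memory (tables, DOS, TSRs, free space)"
    if segment < 0xF000:
        return "video buffers and hardware drivers"
    return "resident BIOS"


print(linear(0xA000))               # the 640k boundary, in bytes
print(hex(linear(0xF000, 0xFFF0)))  # the power-on address
print(region(0xB800))
```

     Segment A000h works out to 655,360 bytes, which is why the 640k
boundary and the A000h boundary are the same line.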


                            Intrusion

     As previously noted, the basic vulnerability of the PC is  found 
in  its  willingness to execute anything properly  presented  to  it. 
There  is  no separation of kernel and user space as  found  in  many 
mainframes and there is no integrity management at all which leads to 
frequent  system  crashes  if anything  not  properly  programmed  is 
executed.

     The  ease of subversion of the Boot Sector/Partition  Table  and 
the  fact that this is the first volatile storage that is loaded  and 
executed  makes  this  a  popular target  for  viruses.  Despite  the 
architecture  prescribing  certain  standards  for  such  tables,  no 
checking is done for compliance. This is both good and bad - good  in 
that  few BSI viruses bother to adhere to the standards  making  them 
easy  to  detect on a disk, bad in that integrity  checking  at  this 
point would have detected such viruses early but is not done.

     In  view of this, it is surprising that a  commercial  integrity 
management  program  does not yet exist that goes resident  via  this 
means and blocks/detects any attempt to subvert it.

     Currently, the other target for viruses is the application
files that execute under DOS. Infections of these are easily detected
in a "clean" DOS environment since, for an infection to take place,
the files must have experienced a change, and this is detectable by
nearly any simple checksum routine. Again, unfortunately, even though
.EXE files have provision for such a value in the file header, no use
is made of it.
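
     A "simple checksum routine" of the kind meant here might look
like the following. It is a sketch only: the 16-bit additive sum is
one arbitrary choice of checksum, the file names and contents are
invented, and the files are held in memory rather than read from
disk.

```python
# Sketch of change detection by checksum. The 16-bit additive sum is
# an illustrative choice, not the routine used by the author.

def checksum16(data):
    """Sum of little-endian 16-bit words, modulo 65536."""
    if len(data) % 2:
        data = data + b"\x00"  # pad an odd-length file with a zero
    total = 0
    for i in range(0, len(data), 2):
        total = (total + data[i] + (data[i + 1] << 8)) & 0xFFFF
    return total


def changed_files(baseline, current):
    """Names in `current` whose checksum differs from the baseline."""
    return sorted(name for name, data in current.items()
                  if baseline.get(name) != checksum16(data))


# Usage: record a baseline on a clean system, then compare later.
clean = {"EDIT.COM": b"\x90\x90\xC3", "SORT.EXE": b"MZ" + bytes(30)}
baseline = {name: checksum16(data) for name, data in clean.items()}

later = dict(clean)
later["EDIT.COM"] = clean["EDIT.COM"] + b"\xE9\x00\x01"  # appended code
print(changed_files(baseline, later))
```

     The baseline must be taken and compared in a "clean" environment;
a resident virus is in a position to interfere with the comparison.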

     Fortunately, two means exist to detect such attacks though  they 
are  not  in DOS. First any attempt to write to  an  executable  file 
should  be  viewed with question. A few programs  (CERTUS,  BEARTRAP, 
FLU-SHOT) do trap such attempts though often the trapping is done  on 
top of DOS and not at the BIOS level. When a program claims to  "trap 
interrupt  13h" the question must be asked *when* and *where*  it  is 
done since DOS traps it from the BIOS first. Again fortunately,  this 
is not important if the virus is not memory resident.


                          The Six Bytes

     Now that knowledge of the structure of the PC has been
established, the three common points of "successful" system-infecting
viruses and other malicious software can be examined in terms of
detection. The two most important concern residency: such viruses all
become resident and all allocate memory to themselves. This
allocation is detectable.

     There  have  been some viruses that have gone  resident  without 
allocating  memory (Icelandic, Alabama, 512) but none of  these  have 
become widespread.

     While the potential for "new" techniques exists and several
forms of malicious software that are not discernible in this manner
have been postulated, the following hold true for all currently
"successful" infections.

     Memory allocation is done by today's viruses in three ways:
1) By moving the TOM (top of memory) downward and using the free 
   space. (Brain, Stoned)
2) By reducing free space beneath the TOM (Datalock, 4096)
3) By using conventional TSR techniques (Jerusalem, Sunday)

     Consequently, detection of these memory resident viruses is a
simple matter of checking three values: the TOM (Int 12h return), the
low memory in use (CS: value for a .COM file), and the amount of free 
memory  (Int  21h  Fn  48h with  BX=FFFFh  return).  Of  course  this 
assumes that the user / calling program knows what the memory returns 
are supposed to be. 

     The method used by the author is to make the INT 12h call, shift 
the value left six times to convert the number of kilobytes to number 
of  segments  and  store it. Next the Int 21h is  performed  and  the 
number of free segments is added to the CS: value. This should  match 
the  adjusted Int 12h return and match the number of segments in  the 
machine  (A000h for 640k, 8000h for 512k). Any mismatch is cause  for 
concern.
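
     The comparison just described can be sketched as follows. The
actual routine is a short assembly program that issues the interrupts
itself; here the three values are passed in as parameters, and the
function name, parameter names, and sample numbers are all
hypothetical.

```python
# Sketch of the three-value comparison. The values would come from
# Int 12h, the CS register of a .COM file, and Int 21h Fn 48h with
# BX=FFFFh; here they are plain parameters with invented names.

def check_memory(int12_kb, cs_segment, free_paragraphs, installed_kb=640):
    """Return a list of mismatches; an empty list means consistent."""
    problems = []
    tom = int12_kb << 6            # kilobytes -> 16-byte paragraphs
    installed = installed_kb << 6  # A000h for 640k, 8000h for 512k
    accounted = cs_segment + free_paragraphs
    if tom != installed:
        problems.append(
            f"TOM is {(installed - tom) * 16} bytes below installed memory")
    if accounted != tom:
        problems.append(
            f"{(tom - accounted) * 16} bytes unaccounted for below the TOM")
    return problems


# Usage with invented numbers for a 640k machine.
print(check_memory(640, 0x1000, 0x9000))  # consistent
print(check_memory(638, 0x1000, 0x8F80))  # TOM moved down by 2k
print(check_memory(640, 0x1000, 0x8F00))  # space missing beneath TOM
```

     The second and third calls correspond to the first two allocation
methods listed above. A conventional TSR shows up instead as a CS:
value higher than the one the user has come to expect.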

     Of course this presupposes two things: that the amount of memory
in the machine is known, and that the amount of memory used by the
O/S and TSRs is known. Since this is run on every boot of a machine
and the memory is most stable immediately following a boot, the
expected values are stable. To the technician, eventually a feel for
the numbers becomes commonplace.

     Currently, the task is made easier since the most recent viruses
publicly reported (Datalock, MusicBug) as well as all of the
"stealth" viruses (4096, Flip, Whale, Joshi) affect the TOM. For
these, detection is as simple as running CHKDSK and knowing what the
"total memory" value should be (640k = 655,360 bytes).

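     The CHKDSK comparison amounts to one subtraction. The sketch
below assumes the report contains a line of the form "655360 bytes
total memory"; the sample reports are invented, and the exact layout
of CHKDSK output on a given DOS version is an assumption.

```python
import re

# Sketch of the CHKDSK comparison. The report format matched here is
# an assumption, and both sample reports are invented.


def total_memory_bytes(chkdsk_output):
    """Extract the 'bytes total memory' figure, or None if absent."""
    match = re.search(r"([\d,]+)\s+bytes total memory", chkdsk_output)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def tom_shortfall(chkdsk_output, installed_kb=640):
    """Bytes missing from the top of memory according to the report."""
    reported = total_memory_bytes(chkdsk_output)
    if reported is None:
        raise ValueError("no 'bytes total memory' line found")
    return installed_kb * 1024 - reported


clean = "    655360 bytes total memory\n    589824 bytes free\n"
suspect = "    653312 bytes total memory\n    587776 bytes free\n"
print(tom_shortfall(clean))
print(tom_shortfall(suspect))
```

     Any nonzero shortfall on a machine known to have 640k is reason
to look further.
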
     If a virus is found, the size and type of the discrepancy are a
good indication of what it is. For example, discovery of a 1792 byte
unnamed TSR in low memory immediately suggests a Jerusalem infection.
A 4096 byte loss above the TOM suggests the MusicBug. In any case,
use of tools such as SCAN is in order, and the technician now knows
what part of memory is suspect. Where to look when the only symptom
is "my PC keeps crashing" is not nearly so clear.
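
     The two examples above suggest a small lookup table keyed on the
kind and size of the discrepancy. The sketch holds only the two
pairings given in this paper; the key names are invented, and a match
is a hint for where to point a scanner, not an identification.

```python
# Sketch of a discrepancy-to-suspect table. Only the two pairings
# mentioned in the text are included; the key names are invented.

HINTS = {
    ("unnamed_low_tsr", 1792): "Jerusalem",
    ("tom_loss", 4096): "MusicBug",
}


def suggest(kind, size_bytes):
    """Return the suspect suggested by a discrepancy, if any."""
    return HINTS.get((kind, size_bytes), "unknown - run a scanner")


print(suggest("unnamed_low_tsr", 1792))
print(suggest("tom_loss", 4096))
print(suggest("tom_loss", 2048))
```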

                             Summary

     We have seen how system viruses and other malicious software
rely on two things: the lack of any integrity checking on the part of
either DOS or the user, and the simplicity of creating a "hole" in
memory to hide in. So far, those viruses that attempt other
concealment or fail to go resident simply have not spread very far.

     Since  a  large portion of viruses are "Boot  Sector  Infectors" 
that  become resident before any normal software can  execute,  these 
could  be  difficult  to detect at the DOS  level.  Luckily,  current 
viruses  have  operating  system impacts that  make  them  relatively 
simple  to detect. Hardware ROM extensions or non-standard  partition 
table software would be necessary for increased protection.

     Even at the user level, integrity checking of attempts by a
program to go resident is a simple matter, and as a stand-alone check
would be both trivial and fast. Such a check could be incorporated as
one layer of an integrity shell or Command Line Interpreter. Several
programs have attempted this in the past only to fail through
excessive screens irritating the user. An "intelligent" program that
knows what is permitted to go resident, and how, would be simple to
program and would flag only "unregistered" attempts. The surprising
fact is that no one seems to have done so as yet.
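
     The core of such an "intelligent" check is a comparison against
a list of registered residents. The sketch below is one possible
shape for it, not an existing product: the registry format, the
program names, and the sizes are all hypothetical.

```python
# Sketch of a registry of permitted resident programs. The format,
# names, and sizes are hypothetical.

def unregistered(resident, registry):
    """Return the resident entries that the registry does not permit.

    resident: list of (name, size_in_paragraphs) found in memory
    registry: dict mapping permitted name -> expected size
    """
    return [(name, size) for name, size in resident
            if registry.get(name) != size]


registry = {"MOUSE": 0x02C0, "PRINT": 0x0160}

# A normal boot raises no flags, so the user sees no screens at all.
print(unregistered([("MOUSE", 0x02C0), ("PRINT", 0x0160)], registry))

# An unnamed 1792 byte (70h paragraph) resident is flagged.
print(unregistered([("MOUSE", 0x02C0), ("", 0x0070)], registry))
```

     Because a registered program of the expected size produces no
output, the user is interrupted only when something new appears,
which addresses the irritation that sank earlier attempts.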


     Padgett Peterson
     (407)356-4054, 6384 work
     (407)648-0733 FAX
     (407)352-6007 home