Date: Fri, 29 Mar 91 16:35:24 -0500
From: padgett%tccslr.dnet@uvs1.orl.mmc.com (A. Padgett Peterson)
Subject: Six Bytes for Virus Detection paper

WARNING: The method depicted in this paper will not detect every conceivable virus; to do so would take far more than six bytes. What it will do is detect all currently "common" viruses for a knowledgeable user; however, CHKDSK can do the same thing if intelligently applied. A short .COM file following these principles will make a good "first check" before using a scanner to determine whether something unknown might be resident. Some viruses revealed immediately include Brain, Yale, Datalock, Stoned, 4096, Fish-6, Flip, Whale, Joshi, MusicBug, and Azusa. TSR viruses such as the Jerusalem, Sunday, and 1701/1704 variants will also be revealed if the user is knowledgeable about the system.

Padgett Peterson, 3/29/91

Six Bytes for Virus Detection in the MS-DOS Environment

A. Padgett Peterson, P.E.
Orlando, Florida

Introduction

Considering the size of the population (over fifty million MS-DOS platforms at last estimate), at the macro level the 240+ known viruses represent a relatively small statistic. At the micro level, however, they can be devastating. With the growth in size of fixed disks and applications, backups are often obsolete or incomplete where proper discipline has not been established. Unfortunately, this seems to include the majority of non-power users.

Since the number of known viruses appears to be doubling each year, the threat is not diminishing, yet the most accepted utilities, John McAfee's SCAN & CLEAN, rely on detection of known infections. While there are some products that actually perform integrity management of a system (Certus International CERTUS, Enigma-Logic VIRUS-SAFE and PC-SAFE, Fischer International PC-WATCHDOG, Dr. Panda BEARTRAP), most are oriented to file protection rather than system protection.

To adequately protect a machine that possesses no native integrity management requires a layered approach of user management, files/applications management, and systems management. We have a good handle on the first two, but systems integrity management, something so pervasive in mainframes that it is taken for granted, does not currently exist for the PC.

Until recently, there was not a large enough population of both successful and unsuccessful viruses to draw any inferences concerning their viability in the general population. At the close of 1990, however, certain characteristics of "successful" viruses, those listed as "common" in Patricia Hoffman's Virus Summary, have become clear:

1: Become resident in memory following infection
2: Allocate memory to themselves
3: Redirect part of the operating system (not necessarily interrupts)

Each of these elements is easily detected, often in more than one way, yet few people or programs bother to look.

Some years ago, this author wrote three simple assembly language programs, each about 1k bytes long. The first tests file integrity, the second tests disk integrity, and the third tests system integrity. Taken together these still detect every "common" virus, not because they "know" all viruses but because they "know" an uninfected system. There is nothing magical involved, merely a knowledge of how the architecture operates.

This paper does not address those viruses that attach themselves specifically to programs or files; it considers those that attack elements of the operating system. That these infections may later attack programs or files is incidental. Accordingly, only the third of these routines is described here.

Architecture

One of the greatest strengths of the architecture of the PC (both IBM and clones - interestingly this system has become so pervasive that while all other systems: Mac, Amiga, etc.
are known by their names or initials, most people immediately relate the letters PC - standing for Personal Computer - to a single platform) is its lack of change. In a world marked by instability, all systems since October 27, 1982 are compatible with every other system. In fact, programs properly written to the PC-DOS 1.0 specification will still run under MS-DOS 5.0 and on the latest 486-33 MHz machines. (Out of curiosity, yesterday I booted my 386-25 with MS-DOS 1.25 - I had forgotten about seeing A: instead of A>.)

This is possible because the structure of the DOS (disk operating system), while it has received many enhancements in the last ten years, has had no deletions. The low level interrupts (0h-1Fh) still operate to the original specifications and provide all necessary peripheral operations (disk, keyboard, display, etc.).

To understand this, one must picture the architecture in layers. At the bottom are the individual elements: the iAPX 80x86 central processing unit, the 8251 programmable I/O, the RAM memory, the BIOS (basic input output system) ROM, the keyboard, the display, and the disks (not even necessary in a real IBM-PC). Until turned on, these are merely a collection of inert devices.

On power on, the CPU performs a simple self test and then reads in the instruction stored at address F000:FFF0h in memory. Since the ROM BIOS is connected to the system so that this address is found in its non-volatile memory, the program stored there begins execution.

The BIOS program performs three functions. The first is a test and system check (POST - power on self test) in a particular order, beginning with the CPU and the first 64k of memory (a failure here results in a series of beeps from the speaker, which is all the processor may be able to do at this point). Next the low level interrupts are loaded into the first 64k and checks are made to determine if the rest of memory, display, keyboard, and connected disks are operational.
Failures at this stage are shown on the display.

Second, the BIOS will scan memory from segment C000h to E000h for BIOS extensions in hardware. Special hardware cards that require operations not found in MS-DOS often contain their own low-level interrupt handlers, and DOS makes provision for this. VGA cards, Bernoulli disks, and PLUS Hardcards are good examples.

Finally, the BIOS will attempt to read in the first sector from either the first floppy disk (if present) or the first fixed disk (if the floppy does not respond) and, if successful, will transfer execution to the first location in that sector.

At this point, the PC is a fully functioning computer, but is not yet an MS-DOS machine. Whether the PC runs Unix, MS-DOS, CP/M-86, OS/2, or any other system depends on what happens next; however, it is here that viruses known as Boot Sector Infectors (Brain, Stoned, Joshi, MusicBug, Azusa) become resident.

The key is that the PC makes an unconditional transfer to the program found in sector 1 of the selected disk. Only a trivial check is made as to whether it is a valid program or conforms to any standard. It is this willingness to run any code presented to it that makes the PC attractive to many viruses and leaves users unsurprised at having to reboot several times a day.

In fact, there are rarely exploited points at which integrity checking could be layered, either as a hardware ROM extension or as a "special" partition table. But while products are known that enforce access control and encryption at this point, virus or malicious software control seems an afterthought. Hardware devices are also expensive.

Should the access be as intended, the first sector on a fixed disk contains the Partition Table, and the program contained in that sector defines the structure of the disk and has information concerning where the Boot Record for the MS-DOS partition is found.
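
As a rough illustration of how the partition table program locates the Boot Record, the following Python sketch reads the standard fixed-disk layout from a 512-byte sector image. The offsets (table at 1BEh, four 16-byte entries, 55h AAh marker at 1FEh) are the conventional layout and are not taken from this paper; the function names are hypothetical. The trailing marker is probably the "trivial check" mentioned above, though that is an assumption.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 0x1BE      # first of four 16-byte partition entries
ENTRY_SIZE = 16
SIGNATURE_OFFSET = 0x1FE  # two-byte 55h AAh marker that ends the sector
ACTIVE_FLAG = 0x80        # boot flag value marking the active partition


def parse_partition_table(sector):
    """Return the non-empty partition entries found in a 512-byte sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly one 512-byte sector")
    if bytes(sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2]) != b"\x55\xAA":
        raise ValueError("sector does not end with the 55h AAh marker")
    entries = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        # Fields: boot flag, 3 CHS bytes (skipped), type byte,
        # 3 CHS bytes (skipped), first sector, number of sectors.
        flag, ptype, first_sector, count = struct.unpack(
            "<B3xB3xII", bytes(sector[start:start + ENTRY_SIZE]))
        if ptype != 0:  # a type byte of zero marks an unused entry
            entries.append({
                "index": index,
                "active": flag == ACTIVE_FLAG,
                "type": ptype,
                "first_sector": first_sector,
                "sector_count": count,
            })
    return entries


def find_boot_record(sector):
    """Return the first sector number of the active partition, or None."""
    for entry in parse_partition_table(sector):
        if entry["active"]:
            return entry["first_sector"]
    return None
```

Nothing in this lookup verifies the program that occupies the rest of the sector, which is the gap the paper points to.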
Some systems, such as the COHERENT operating system, use the partition table program to ask the user whether COHERENT or MS-DOS is desired and then select the matching Boot Record.

Next (or first if from a floppy disk), the Boot Record is loaded, again unconditionally, and the program found on it is executed. Normally, the boot record defines the disks and their variables, finds the two DOS operating system files, loads them into memory, and transfers control to them.

DOS now examines the data left for it by the previous operations and loads handlers for the DOS interrupts, including 20h-2Fh. Next DOS looks for an optional file in the root directory called CONFIG.SYS that can contain device drivers, executable files, and system parameters. These are loaded and executed, again without any integrity management. The best current anti-viral programs load as part of the CONFIG process, already too late to prevent or detect some types of infections.

Finally, DOS loads its Command Line Interpreter, COMMAND.COM, which as its first action looks in the root directory for a file called AUTOEXEC.BAT. If it is found, the commands in it are executed; if not, the familiar requests for Date and Time are presented. DOS is then ready for use.

At this point, the machine should have a stable configuration with a certain amount of low-level memory in use by the parameter/interrupt tables, DOS, and TSR programs. Free memory should be left up to the (normally) 640k/segment A000h boundary, and video buffers, hardware drivers, and the resident part of the BIOS (in segment F000h) should reside in the address space above 640k.

Intrusion

As previously noted, the basic vulnerability of the PC is found in its willingness to execute anything properly presented to it. There is no separation of kernel and user space as found in many mainframes, and there is no integrity management at all, which leads to frequent system crashes if anything not properly programmed is executed.

The ease of subversion of the Boot Sector/Partition Table and the fact that this is the first volatile storage that is loaded and executed make this a popular target for viruses. Despite the architecture prescribing certain standards for such tables, no checking is done for compliance. This is both good and bad - good in that few BSI viruses bother to adhere to the standards, making them easy to detect on a disk; bad in that integrity checking at this point would have detected such viruses early but is not done. In view of this, it is surprising that a commercial integrity management program does not yet exist that goes resident via this means and blocks/detects any attempt to subvert it.

Currently, the other target for viruses is the application files that execute under DOS. Infections of these are easily detected in a "clean" DOS environment since, for an infection to take place, the files must have experienced a change, and this is detectable by nearly any simple checksum routine. Again, unfortunately, even though .EXE files have provision for such a value in the file header, no use is made of it.

Fortunately, two means exist to detect such attacks though they are not in DOS. First, any attempt to write to an executable file should be viewed with question. A few programs (CERTUS, BEARTRAP, FLU-SHOT) do trap such attempts, though often the trapping is done on top of DOS and not at the BIOS level. When a program claims to "trap interrupt 13h" the question must be asked *when* and *where* it is done, since DOS traps it from the BIOS first. Again fortunately, this is not important if the virus is not memory resident.

The Six Bytes

Now that knowledge of the structure of the PC has been established, the three common points of "successful" system-infecting viruses and other malicious software can be examined in terms of detection. The two most important concern residency: such viruses all become resident and all allocate memory to themselves. This allocation is detectable.
There have been some viruses that have gone resident without allocating memory (Icelandic, Alabama, 512) but none of these have become widespread. While the potential for "new" techniques exists and several forms of malicious software that are not discernible in this manner have been postulated, the following holds true for all currently "successful" infections.

Memory allocation is done by today's viruses in three ways:

1) By moving the TOM (top of memory) downward and using the space freed above it (Brain, Stoned)
2) By reducing free space beneath the TOM (Datalock, 4096)
3) By using conventional TSR techniques (Jerusalem, Sunday)

Consequently, detection of these memory resident viruses is a simple matter of checking three values: the TOM (Int 12h return), the low memory in use (CS: value for a .COM file), and the amount of free memory (Int 21h Fn 48h with BX=FFFFh return). Of course this assumes that the user / calling program knows what the memory returns are supposed to be.

The method used by the author is to make the Int 12h call, shift the value left six times to convert the number of kilobytes to the number of segments, and store it. Next the Int 21h call is performed and the number of free segments is added to the CS: value. The sum should match the adjusted Int 12h return, and both should match the number of segments in the machine (A000h for 640k, 8000h for 512k). Any mismatch is cause for concern.

Of course this presupposes two things: first, that the amount of memory in the machine is known, and second, that the amount of memory used by the O/S and TSRs is known. Since this is run on every boot of a machine and the memory is most stable immediately following a boot, the expected values are stable. To the technician, eventually a feel for the numbers becomes commonplace.

Currently, the task is made easier since the most recent viruses publicly reported (Datalock, MusicBug) as well as all of the "stealth" viruses (4096, Flip, Whale, Joshi) affect the TOM.
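
The three-value comparison described above can be sketched as plain arithmetic. This is a minimal Python illustration, not the author's assembly routine: on a real machine the inputs would come from Int 12h, the CS: register, and Int 21h Fn 48h, whereas here they are passed in as numbers. The function names and the baseline_cs parameter (a CS: value noted on a known-clean boot) are hypothetical, and the sketch treats the sum as exact, as the text describes.

```python
def kb_to_paragraphs(kilobytes):
    """Convert kilobytes to 16-byte segment units by shifting left six."""
    return kilobytes << 6


def check_memory(tom_kb, cs_segment, free_paragraphs,
                 installed_kb=640, baseline_cs=None):
    """Compare the three values from the text and return a list of findings.

    tom_kb          -- kilobytes reported by Int 12h
    cs_segment      -- CS: value of the running .COM file
    free_paragraphs -- BX returned by Int 21h Fn 48h with BX=FFFFh
    installed_kb    -- memory known to be in the machine (assumed 640k)
    baseline_cs     -- CS: value from a known-clean boot (hypothetical)
    """
    findings = []
    tom_segment = kb_to_paragraphs(tom_kb)
    expected_top = kb_to_paragraphs(installed_kb)

    # Way 1: the TOM itself has been moved downward.
    if tom_segment != expected_top:
        lost = (expected_top - tom_segment) * 16
        findings.append(
            f"TOM at {tom_segment:04X}h, expected {expected_top:04X}h: "
            f"{lost} bytes missing")

    # Way 2: free space beneath the TOM no longer reaches the TOM.
    top_of_free = cs_segment + free_paragraphs
    if top_of_free != tom_segment:
        lost = (tom_segment - top_of_free) * 16
        findings.append(
            f"free memory ends at {top_of_free:04X}h, TOM is "
            f"{tom_segment:04X}h: {lost} bytes unaccounted for")

    # Way 3: something extra went resident in low memory as a TSR.
    if baseline_cs is not None and cs_segment != baseline_cs:
        extra = (cs_segment - baseline_cs) * 16
        findings.append(
            f"CS: is {cs_segment:04X}h, baseline {baseline_cs:04X}h: "
            f"{extra} bytes of additional resident code")

    return findings


if __name__ == "__main__":
    # Illustrative values only; real figures depend on the machine.
    for line in check_memory(636, 0x1000, 0x8F00):
        print(line)
```

An empty list means the three values agree with each other and with the installed memory. The size reported in a finding is the figure the technician would compare against known virus footprints.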
For these TOM-altering viruses, detection is as simple as running CHKDSK and knowing what the "total memory" value should be (640k = 655,360 bytes).

If a virus is found, the size of the discrepancy and its type are a good indication of what it is. For example, discovery of a 1792 byte unnamed TSR in low memory immediately suggests a Jerusalem infection. A 4096 byte loss above the TOM suggests the MusicBug. In any case, use of tools such as SCAN is in order, and the technician now knows what part of memory is suspect, which is not the case when the only report is "my PC keeps crashing".

Summary

We have seen how system viruses and other malicious software rely on two things: the lack of any integrity checking on the part of either DOS or the user, and the simplicity of creating a "hole" in memory to hide in. So far, those viruses that attempt other concealment or fail to go resident simply have not spread very far.

Since a large portion of viruses are "Boot Sector Infectors" that become resident before any normal software can execute, these could be difficult to detect at the DOS level. Luckily, current viruses have operating system impacts that make them relatively simple to detect. Hardware ROM extensions or non-standard partition table software would be necessary for increased protection.

Even at the user level, checking attempts by a program to go resident is a simple matter for a stand-alone utility and would be both trivial and fast. Such a check could be incorporated as one layer of an integrity shell or Command Line Interpreter. Several programs have attempted this in the past only to fail by irritating the user with excessive screens. An "intelligent" program that knows what is permitted to go resident, and how, would be simple to write and would flag only "unregistered" attempts. The surprising fact is that no one seems to have done so as yet.

Padgett Peterson
(407)356-4054, 6384 work
(407)648-0733 FAX
(407)352-6007 home