MalCheck
========

Tool to check up on dynamically allocated blocks.
Intended to work with all kinds of compilers and runtime libraries
on RISC OS.

This manual refers to MalCheck v2.0.
The workings of this version are identical to v1.1.1.1.
Version 2.0 is a Free Software publication of the sources.

This package is Free Software, as defined by the Free Software
Foundation, granting freedoms 0, 1, 2 and 3:

  Freedom 0: The freedom to run the software as you wish, for any
             purpose.
  Freedom 1: The freedom to study how the software works, and change
             it so it does your computing as you wish.
  Freedom 2: The freedom to redistribute copies so you can help your
             neighbor.
  Freedom 3: The freedom to distribute copies of your modified
             versions to others.


What it does
------------

This library helps you track down program errors involving careless
writing to dynamically allocated memory blocks. These usually manifest
themselves as unpredictable crashes at locations in the program that
look perfectly correct. The damage was done at an earlier point, but
only shows itself later. A surefire pointer to such a problem is when
the postmortem backtrace indicates that the last routine called was
malloc(), realloc(), calloc() or free().

This package keeps an eye on dynamically allocated blocks and tries to
see if data is written outside the block or into free or dealloc
blocks. (Free blocks are blocks released by free(), dealloc blocks are
blocks left behind when realloc() needed to move data to a bigger
block.)


How to use MalCheck
-------------------

Include this line in the sources you want to check:

  #include "MalCheck:malcheck.h"

Add the library 'MalCheck:o.malcheck' to the list given to the Link
command in your Makefile.
Note: Put this library *before* UnixLib:o.Unixlib or C:o.ansilib.

Define the macro 'MALCHECK' by passing -DMALCHECK to the compile
commands.
Note: you can switch off the package without changing the sources by
compiling without this macro defined.

Make sure that the MalCheck$Path is set (e.g. by calling SetVars in
this directory).

Make your software. When you run it, checks will be performed and
output written to stderr. If desired you can redirect stderr, stdout
and stdin with one of the following:

  *myprog 0< infile     Read stdin from infile
  *myprog < infile      Same as 0<
  *myprog 1> outfile    Redirect stdout to outfile
  *myprog > outfile     Same as 1>
  *myprog 2> errfile    Redirect stderr to errfile
  *myprog 2>&1          Redirect stderr to same channel as stdout
  *myprog >& outfile    Redirect both stdout and stderr to outfile
  *myprog >> outfile    Append stdout to outfile
  *myprog >>& outfile   Append both stdout and stderr to outfile
  *myprog 1>&2          Redirect stdout to same channel as stderr

Refer to the "Acorn C/C++" manual, page 87, under "Standard
implementation definition", "Environment (A.6.3.2)".


How it works
------------

MalCheck intercepts calls to malloc(), calloc(), realloc() and free()
by replacing them with macros that call special library functions.
These functions then perform the normal action requested, with a
little extra.

When blocks are allocated, a header with maintenance information is
put before them, and a 'magic number' (m3) is put after the block.
The header is used to maintain a list of blocks and to keep
information about each block. Changes in the data can be detected by
a calculated CRC which is kept in the header. The header itself is
protected by its own CRC, plus magic numbers at the start (m1) and at
the end (m2).

Every time one of the above functions is called by the user program,
the library runs through its list of blocks and looks for suspicious
things. For example: a change in m3 shows that the end of the
allocated block was probably overwritten. This commonly happens when
a string written to a block is too large for that block (maybe the
terminating zero was forgotten in the count). Problems are reported on
stderr if their level is equal to or worse than the current report
level.

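The guard scheme described above can be sketched in a few lines of C.
This is only an illustration of the idea, not MalCheck's actual code:
the names guard_header, guard_malloc(), guard_check() and guard_free(),
the magic values and the header layout are all assumptions made for
this example, and the CRCs and the list links are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical magic values, chosen for this sketch only. */
#define MAGIC_1 0x4D414C31u
#define MAGIC_2 0x4D414C32u
#define MAGIC_3 0x4D414C33u

/* Hypothetical header placed in front of the user data. */
typedef struct guard_header {
    uint32_t m1;     /* magic number at the start of the header */
    size_t   size;   /* size requested by the program */
    size_t   alloc;  /* size rounded up to whole 4-byte words */
    uint32_t m2;     /* magic number, last field of the header */
} guard_header;

/* Round a request up to a multiple of 4 bytes. */
static size_t round_up_word(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/* Allocate header + user data + trailing magic number (m3). */
void *guard_malloc(size_t size)
{
    size_t alloc = round_up_word(size);
    uint32_t m3 = MAGIC_3;
    guard_header *h = malloc(sizeof *h + alloc + sizeof m3);

    if (h == NULL)
        return NULL;
    h->m1 = MAGIC_1;
    h->size = size;
    h->alloc = alloc;
    h->m2 = MAGIC_2;
    /* m3 sits directly behind the rounded-up user area. */
    memcpy((unsigned char *)(h + 1) + alloc, &m3, sizeof m3);
    return h + 1;
}

/* Returns 0 if intact, or 1, 2 or 3 for the first damaged magic number. */
int guard_check(const void *user)
{
    const guard_header *h;
    uint32_t m3;

    assert(user != NULL);
    h = (const guard_header *)user - 1;
    if (h->m1 != MAGIC_1)
        return 1;
    if (h->m2 != MAGIC_2)
        return 2;
    memcpy(&m3, (const unsigned char *)user + h->alloc, sizeof m3);
    return m3 == MAGIC_3 ? 0 : 3;
}

/* Release a block obtained from guard_malloc(). */
void guard_free(void *user)
{
    if (user != NULL)
        free((guard_header *)user - 1);
}
```

In this sketch a 5-byte request gets 8 bytes of user area, so writing
up to 8 bytes goes unnoticed, while a ninth byte lands in m3 and makes
guard_check() return 3.
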
If nothing disastrous has happened, the MalCheck function then
performs the duty it was called for, checking that nothing illegal is
done (like calling free() with a pointer that does not point to an
existing block). If no problem occurred that is equal to or worse than
the current quit level, the routine returns to the user program.

As an extra, atexit() is used to install a routine which produces a
listing of all blocks when the program exits.


The list of blocks
------------------

The library starts by checking the list of blocks. If the list is
printed it looks something like this:

  MalCheck:
  MalCheck: Malloc blocks (status, base, size, backtrace):
  MalCheck: M 0x000001f994   15 _main(), main(), MalCheck_malloc()
  MalCheck: M 0x000001f7cc  321 _main(), main(), foo(), MalCheck_malloc()
  MalCheck: M 0x000001e73c    4 _main(), main(), bar(), MalCheck_malloc()
  MalCheck: F 0x000001e484  567 _main(), main(), foo(), hop(), MalCheck_free()
  MalCheck: M 0x000001cf24 1234 _main(), main(), MalCheck_malloc()

Note that currently the most recently allocated block is at the top of
the list. Also note that newer blocks are located at higher addresses.

All messages are preceded by "MalCheck:", so you can see where the
output is coming from.

The first column is a letter which gives the status of the block.
This can be:

  M for Malloc:  a block created by malloc()
  C for Calloc:  a block created by calloc()
  F for Free:    a block released by free()
  R for Realloc: a block created by realloc()
  D for Dealloc: a block left behind when realloc() moved the data to
                 a bigger block.

The next column gives the pointer to the user data block, as returned
by malloc(), calloc() or realloc(). The pointer uniquely identifies
the block.

The third column gives the size of the block as requested by the user
program. Note that the actual allocated size can be slightly bigger
because MalCheck allocates memory a word at a time. Thus for a request
of 5 bytes, 8 bytes (= 2 words) are actually allocated.
This is similar to the behaviour of Clib and UnixLib. Apart from this,
MalCheck also allocates room for the header and the trailing Magic
Number.

The rest of the line is taken up by a backtrace of the functions which
were active the moment the alloc function was called. Note, in the
example, that the backtrace for the Free block shows where free() was
called, not where malloc() was called to create the block.


Reports for the list
--------------------

While checking the list of blocks, MalCheck can give various reports
(in order of severity):

  MalCheck: === INFO - Data changed in regular block

Between this call and the previous one the program wrote some data in
an allocated block. This is quite common and nothing to worry about.
In particular, data tends to be written in a block right after it was
created with malloc().

  MalCheck: === WARNING - Data changed in dealloc or free block

This points to questionable programming practice. Data was written to
a block that formally no longer exists. In olden days this happened a
lot, because the early C libraries had the property that a free'd
block was available at least up to the next malloc(). Programmers
would call "p=free(malloc(size));" and happily use the block.

  MalCheck: === ERROR - Bad Magic 3 (end of block overwritten)

An important message: This is the result of one of the most common
(and most annoying) errors when dealing with dynamic memory. The data
written to the block was bigger than the size of the block. Under
normal runtime conditions this usually messes up the internal
management of dynamic blocks, resulting in a crash on malloc(),
realloc() or free() later on.

  MalCheck: === BAD - Magic 1 overwritten for block
  MalCheck: === BAD - Magic 2 overwritten for block

The program came close to damaging the header that MalCheck maintains
for the block. This may be due to a negative offset to the pointer
(like *(p+d), where d is accidentally negative) or to overflow from
too much data written to a block at a lower address.
MalCheck can still continue.

  MalCheck: === FATAL - Bad header CRC for block ( ?)

A fatal blow to MalCheck: The header is corrupted. It can no longer be
sure that the links that connect the list of blocks are correct, which
means the rest of the blocks cannot be checked. Neither can it be sure
of other data in the header. Therefore further information about the
block cannot be given. Usually it is not a good idea to continue with
the program after this happens.


Reports for the action
----------------------

Next, the MalCheck routine will attempt to execute the action for
which it was called. This can result in various reports.

For starters, if the report level includes Actions, the name of the
routine is printed, with its parameters and the file and the line
number from which it was called. This line comes before the listing
mentioned above. Examples:

  MalCheck: malloc(1234), ^.Test.c.main line 25
  MalCheck: free(0x1e4a4), ^.Test.c.main line 37

After the list of blocks, information is given about the action and
the block or blocks involved. For the various routines they look like
this:

++ For malloc():

  MalCheck: Block allocated at 0x1cf44:
  MalCheck: M 0x000001cf44 1234 _main(), main(), MalCheck_malloc()
  [ The block information is shown ]

++ For calloc():

  MalCheck: Block allocated at 0x1eb54 and set to zero:
  MalCheck: C 0x000001eb54   15 _main(), main(), MalCheck_calloc()
  [ The block information is shown ]

++ For free():

  MalCheck: Block at 0x1e4a4 freed:
  MalCheck: M 0x000001e4a4  567 _main(), main(), MalCheck_malloc()
  MalCheck: F 0x000001e4a4  567 _main(), main(), MalCheck_free()
  [ The block is shown both before it is freed and after.
    Note that the backtrace changes too ]

  MalCheck: Pointer zero: no action
  [ Free was called with a NULL pointer.
    Nothing happens ]

++ For realloc():

  MalCheck: Block 0x1ea3c moved to 0x1eac4
  MalCheck: R 0x000001eac4   12 _main(), main(), MalCheck_realloc()
  MalCheck: D 0x000001ea3c    6 _main(), main(), MalCheck_realloc()
  [ The new size did not fit in the old size, so a new block is
    allocated and the data moved to the new location. Both the old,
    deallocated, block and the new block (with status R) are shown ]

  MalCheck: New alloc size 8 fits in old alloc size 12: no change in block 0x1eac4
  MalCheck: R 0x000001eb04   12 _main(), main(), MalCheck_realloc()
  [ The new block fits in the old size. realloc() takes no action.
    The backtrace is updated to show this call to realloc().
    Note that the allocated size is shown, not the requested size.
    Because the allocated size of the block is rounded up from the
    requested size, a larger realloc can sometimes fit in an
    apparently smaller block. E.g. after p=malloc(5) and
    p=realloc(p,7) it might say:
    "New alloc size 8 fits in old alloc size 8" ]

  MalCheck: Pointer is NULL: using malloc()
  MalCheck: Block allocated at 0x1eeb4:
  MalCheck: M 0x000001eeb4  123 _main(), main(), MalCheck_malloc()
  [ According to the standard, realloc() called with a NULL pointer
    should behave like malloc(). This is exactly what MalCheck does ]

  MalCheck: Size is 0: using free()
  MalCheck: Block at 0x1eb54 freed:
  MalCheck: C 0x000001eb54   15 _main(), main(), MalCheck_calloc()
  MalCheck: F 0x000001eb54   15 _main(), main(), MalCheck_realloc(), MalCheck_free()
  [ According to the standard, realloc() called with a non-NULL
    pointer and with size 0 should behave like free(). This is
    exactly what MalCheck does. Note that in this case the original
    block was created with calloc() ]


Errors on actions
-----------------

Various parameters for calls to allocation routines are considered
naughty by MalCheck and are reported. In order of severity:

  MalCheck: === WARNING - Attempt to free block , which is free

The program tries to free the same block twice.
Not usually fatal, but it might give strange effects if in the normal
environment the space was used again by another call to malloc().
Take the following sequence of statements:

  p=malloc(100);       /* Allocate first block */
  ...                  /* Use it */
  free(p);             /* The first block is released */
  ...
  r=malloc(90);        /* Allocate a second block. The normal library
                          might re-use the earlier block, so r==p */
  free(p);             /* Apparently an accidental second free of the
                          first block */
  ...
  s=malloc(80);        /* Allocate third block. The same block is
                          again re-used, while the programmer thinks
                          his data at pointer r is safe. */
  strcpy(s,"banana");  /* Writing to s also overwrites data at r */

The effect would be that the second block is unintentionally
de-allocated. MalCheck never actually releases free blocks, so each
call to malloc() will return a different pointer.

  MalCheck: === WARNING - Attempt to free block , which is de-allocated

For some reason the program tries to free a block using a pointer that
is no longer valid, because the block was moved due to a realloc().
MalCheck marks the block as free and returns.

  MalCheck: === ERROR - Attempt to free non-existing block

The pointer passed to free() has never pointed to a valid dynamic
memory block. It might be a badly calculated pointer or a pointer in
some non-initialised data. A program can usually continue without
problems. MalCheck simply returns without taking further action.

  MalCheck: === ERROR - Attempt to realloc block , which is free

The program tries to use data which was made free earlier. This may
lead to problems. MalCheck returns NULL, indicating to the program
that the realloc failed.

  MalCheck: === ERROR - Attempt to realloc block , which is de-allocated

Presumably a pointer is used which is no longer valid due to an
earlier realloc(). This call shows an intention of the program to use
the (officially lost) data, so this is classified as an ERROR.
MalCheck returns NULL, indicating to the program that the realloc
failed.

  MalCheck: === BAD - Attempt to realloc non-existing block

Nasty: Not only does the program think that the pointer holds valid
data, it also intends to use it later. MalCheck returns NULL,
indicating to the program that the realloc failed.


Controlling the output level with MalCheck_setlevel()
-----------------------------------------------------

The output reported by MalCheck is classed in different severity
levels. They are: FATAL, BAD, ERROR, WARNING, INFO, Action and List.
The ones in capitals refer to the messages with the '===' on the line.
'Action' messages are described in "Reports for the action" above.
The listing of blocks is of the 'List' level.

Normally only messages of level WARNING or worse are reported. You can
change the report level by calling MalCheck_setlevel(), passing it the
level of your choice. See the header "malcheck.h" for details. Apart
from the above severities, you can set the output level to
MalCheck_Nothing, which means you get nothing except the listing
produced when the program exits.

You can call this routine anywhere in your program and as often as you
like. This gives you the opportunity to fine-tune the output when you
narrow in on a particular bug. Note that even small programs can call
a lot of malloc routines and thus produce a lot of output. In
particular the MalCheck_List level can soon give you megabytes of
text.


Controlling the quit level with MalCheck_setquit()
--------------------------------------------------

Normally MalCheck exits if a message of level ERROR or worse occurs.
You might want it to go on longer or quit earlier. This can be set
with a call to MalCheck_setquit() with the desired level as parameter.
Note that you cannot set the quit level worse than FATAL, because
MalCheck cannot function after a block header is corrupted. The
lightest level available is Action. This will stop MalCheck at the
first call to one of the allocation functions (this will presumably be
the first malloc()).

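The "equal to or worse than" rule for the report level and the quit
level can be pictured as two thresholds on one ordered scale. The
sketch below is not MalCheck's code and does not use its header: the
enum, the LVL_ names and the four functions are hypothetical, and
MalCheck_Nothing is not modelled. Only the ordering of the levels and
the two defaults (WARNING for reporting, ERROR for quitting) are taken
from the text above.

```c
#include <assert.h>

/* Hypothetical level scale, ordered from lightest to worst. */
typedef enum {
    LVL_LIST,
    LVL_ACTION,
    LVL_INFO,
    LVL_WARNING,
    LVL_ERROR,
    LVL_BAD,
    LVL_FATAL
} level;

static level report_level = LVL_WARNING;  /* default report level */
static level quit_level   = LVL_ERROR;    /* default quit level */

/* Stand-in for the idea behind MalCheck_setlevel(). */
void set_report_level(level l)
{
    report_level = l;
}

/* Stand-in for the idea behind MalCheck_setquit(). */
void set_quit_level(level l)
{
    assert(l <= LVL_FATAL);  /* nothing worse than FATAL exists */
    quit_level = l;
}

/* A message is printed if it is equal to or worse than the report level. */
int should_report(level l)
{
    return l >= report_level;
}

/* The program is stopped if the message is equal to or worse than the quit level. */
int should_quit(level l)
{
    return l >= quit_level;
}
```

With the defaults, a WARNING is printed but does not stop the program,
while an ERROR is both printed and fatal to the run. Raising the quit
threshold to LVL_FATAL lets the program continue past BAD messages.
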
When MalCheck quits the program, a listing of all blocks will be
produced. For those looking for memory leaks: ignore the Free and
Dealloc blocks. The rest (i.e. the Malloc, Calloc and Realloc blocks)
are blocks that have never been released in the program. The backtrace
can help you find out which data they contain.

Setting the quit level lighter than the output level can give you some
strange-looking results.

Setting the quit level very high (e.g. FATAL) will make even very
buggy programs plod on and on, producing tons of output with lots of
warnings, errors, etc.


Redirecting the output stream with MalCheck_setoutput()
-------------------------------------------------------

By default, MalCheck prints all its reports to stderr. Apart from
redirecting stderr on the command line, with 2> and similar (see
above), you can call MalCheck_setoutput() from your program to send
the output to a given file. The routine takes a FILE * as an argument.
You must first open the file yourself using fopen().


Differences with MemCheck
-------------------------

This package is not nearly as powerful as Dr. Smith's MemCheck. For
starters, it can only perform checks when the user calls one of the
allocation routines. MemCheck has an option to check *every memory
read or write instruction* to see if it reads from or writes into
illegal memory or a protected block.

- MalCheck has no feature to set read or write permission on blocks.
- MalCheck only watches dynamic memory blocks, not stack chunks, flex
  blocks or user defined blocks.
- MalCheck only reports on blocks allocated directly in the source of
  the program. Any blocks allocated by library routines (like Toolbox
  routines) are invisible to MalCheck. Dr. Smith's reports on *all*
  calls to malloc routines.
- Manipulation of output is limited.


So why use this package?
------------------------

The main disadvantage of Dr. Smith's MemCheck is that it needs to be
linked with the Clib Stubs library.
This is a problem when you want to check software which links with
UnixLib. I've tried to keep things simple, so the library can be
universally used.

And finally: this is freeware.


Differences with normal runtime libraries
-----------------------------------------

The MalCheck routines, because of their nature, do not behave exactly
like the normal runtime routines. Some differences are:

- The space allocated is bigger
  This is because it includes the header and the trailing magic word.
  The size of the header is about 30 words (120 bytes), most of which
  is taken up by the backtrace data. For large allocations (1000 bytes
  or more) this may not be much of a difference. However, most
  programs allocate a large number of small blocks (16 bytes or less).
  The difference is then significant. Program behaviour (i.e. when it
  crashes) may become different, because runtime libraries tend to
  handle small blocks in a different way from large blocks. With
  MalCheck, all blocks become (fairly) large.

- Blocks are never released
  When free() and realloc() are called, the blocks freed or
  deallocated are only marked as free or dealloc, but are not returned
  to the free pool, the way runtime libraries do. This means that bugs
  caused by the re-use of released blocks may no longer show. On the
  other hand, the tests that MalCheck performs will probably pinpoint
  the problem that causes the bug even before the bug makes the
  program fall down.

- Blocks are allocated in multiples of 4 bytes
  I don't know about Clib, but UnixLib allocates blocks in multiples
  of 8 bytes. This means that with UnixLib, the program might have a
  margin of 7 extra bytes at the end of a block. This might mask some
  bad programming. MalCheck is less forgiving (as it should be) and
  only gives the program a maximum of 3 bytes to mess up unnoticed
  before the trailing Magic Word is overwritten.

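The margin mentioned in the last item follows directly from the
rounding. As a small worked example (the helper names round_up() and
slack() are made up for this illustration and are not part of
MalCheck), the number of bytes a program can overrun unnoticed is the
difference between the rounded-up size and the requested size:

```c
#include <assert.h>
#include <stddef.h>

/* Round a request up to the next multiple of the allocation granule. */
size_t round_up(size_t request, size_t granule)
{
    assert(granule != 0);
    return (request + granule - 1) / granule * granule;
}

/* Bytes behind the requested size that still lie inside the block. */
size_t slack(size_t request, size_t granule)
{
    return round_up(request, granule) - request;
}
```

For a 5-byte request, slack(5, 4) is 3 and slack(5, 8) is also 3; the
worst cases are a 1-byte request with a 4-byte granule (3 bytes of
slack) and a 1-byte request with an 8-byte granule (7 bytes of slack).
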
Future extensions
-----------------

- Interception of string and memory routines like strcpy() and
  memcpy(), to check if they try to write to malloc blocks in a
  damaging way.
- Maybe do a check on every write and read, like MemCheck does.
- Add facilities to produce output in case of a signal (like
  Segmentation Fault, i.e. write or read to non-existing memory).
- List the blocks in reversed order, with the oldest block at the top
  and the youngest block at the bottom. This connects better with the
  action reports which are given *after* the listing. It might look
  better intuitively.


Contacts and Acknowledgements
-----------------------------

MalCheck was conceived, designed and developed by Erik Groenhuis.
Webpage: http://www.xs4all.nl/~erikgrnh

With thanks to Leo Smiers for the initial push.
Thanks to the guys of UnixLib for their way of getting a backtrace.
Thanks to the guys of OsLib for ideas on how to make SWI veneers.