Communication with C

Purpose

To illustrate communication between C and assembly code:

Due dates

You must hand in answers to the questions at the beginning of your laboratory session on Thursday, September 11. This part of the assignment is worth 15 points.

You must be prepared to discuss your work at the beginning of your laboratory session on Tuesday, September 16, and to demonstrate your program to your TA during that period. This part of the assignment is worth 10 points.

Background

Modular decomposition is the technique that we use to control complexity. In order to realize the benefits of decomposition, we must physically divide a program into routines (procedures). Each routine carries out a part of the task, communicating with other routines via well-defined interfaces. A complete program may contain both routines written in C and routines written in assembly language. The conventions for communication between these two kinds of routines are set by the authors of the C compiler, and may vary from one machine and operating system to another.

Packaging an algorithm as a function

When decomposing a program into routines, one method for abstracting an algorithm is to hide the details in a function. The function is characterized by an interface specification that defines all desired states of the machine on entry and exit, suppressing the actual implementation. Users of the function need understand only the interface specification.

The interface specification acts as a contract between the implementor and the user, and the implementor is obliged to guarantee that the machine state on exit from the function will satisfy that specification whenever the state on entry does. In order to fulfill this guarantee, the implementor must resort to a combination of reasoning and testing.

Four steps are required to package an algorithm as a function:

  1. Write an interface specification.
  2. Express the entry and exit states in the implementation language.
  3. Implement interface code for function entry and exit.
  4. Implement the algorithm to produce an acceptable exit state whenever the entry state is acceptable.
The first step is often carried out as part of a higher-level design; it provides the context for the remaining steps.

In this course, you will need to package some algorithms as C functions and some as assembly language functions. Some of your assembly language functions will be invoked from C code, others from assembly code. Although most of your C functions will be invoked from C code, some will be invoked from assembly code.

Interface specifications should be stated in terms of the highest-level language involved. Thus the interface specification of a C-coded function should always be stated in C. The interface specification of an assembly-coded function should be stated in C, if that function is to be called from C. The interface specification of an assembly language function should be stated in assembly language if and only if it is invoked only from assembly code.

When an interface specification is stated in C, and the routine is either written in assembly language or invoked from assembly code, the conventions for C/assembler interaction must be used to express the sets of states defined by the interface specification in terms of assembly language. These conventions govern:

The Microtec C compiler's conventions are typical and are described the on-line C manual. (Click here for a description of how to access the documentation if you don't already have an icon for it on your desktop.) You should carefully read Chapter 10, pages 1 through 10, Chapter 11, pages 1 through 8 and pages 25 and 26.

We strongly disagree with Microtec's suggested operands for the movem instruction that restores register values at the end of the routine body:

movem.l -n(a6),<register list>
Here the value of n depends on both the amount of local storage and the number of registers saved, because it is the offset from a6 of the bottom of the save area. Thus if you make a change in either the local storage or the register list, you must make a corresponding change to n. This is a recipe for disaster! We advocate using the stack pointer's postincrement addressing mode:
movem.l (sp)+,<register list>
This requires only that any use of the stack pointer within the routine be balanced (i.e. pushes must have corresponding pops), a principle that should be adhered to in any case in a hand-written assembly code program.

Testing the implementation

No matter how carefully you code a function, mistakes are almost inevitable. Although testing can never guarantee the absence of errors, it can detect them and allow you to correct them before your function is released to users. Because the function has an interface specification, you can test it independently of the code that will eventually be invoking it.

Exhaustive testing of any interesting function is impossible because of the size of the domain of values to which the function can be applied. In addition to being impossible, however, exhaustive testing is really irrelevant for most functions: If a function treats a large number of input values in exactly the same way, and works for one, then it will work for all.

Test cases for a function are usually selected according to two criteria:

For example, consider a function taking two signed shorts as input parameters. The first criterion would suggest four test cases providing all combinations of the values -32768 and 32767. If the function's implementation contained code that was executed only if the first parameter were 0, then those four test cases wouldn't force the function to execute that code -- it wouldn't be ``covered'' by the test cases. The second criterion would therefore suggest another test case in which the first parameter was 0.

A function containing a number of tests may have a vast number of possible paths through it. Again, the numbers involved make it impossible to force the function over every possible path. Here's an example that illustrates the difference between paths and coverage:

if (a < 0) a = -a;
if (b < 0) b = -b;
There are four possible paths through this program, corresponding to all combinations of positive and negative values of the variables:
a < 0,	b < 0
a < 0,	b > 0
a > 0,	b < 0
a > 0,	b > 0
All of the code gets executed, however, when a test case corresponding to the first condition (both variables negative) is provided.

When testing a function, the implementor must select a set of test cases that will demonstrate correct behavior at the extreme input values, cover all of the code, and exercise the ``important'' paths. It is often useful to have a second person (who knows nothing about how the function is implemented) select a set of test cases on the basis of the interface specification and their intuition about possible implementation problems. In any case, the test data and results should be retained for use during maintenance.

Even when a function has been carefully tested in isolation, it may fail in use. When such a failure occurs, there are two possible causes:

  1. The entry conditions are satisfied and the function contains an error.
  2. The invoking system contains an error that resulted in the function being invoked in a state that did not satisfy the entry conditions.
Clearly the debugging process must distinguish these two causes. The easiest way to do this is to stop the program on entry to the function and manually verify the entry conditions. If the entry conditions are satisfied, then the specific values that caused the error should be added to the set of test cases for the function. When the error has been found and corrected, the entire set of test cases should be run in order to ensure that the correction has not resulted in another error. (This strategy is known as regression testing.)

Stepping the simulator

The Code window of the simulator running on the HP has a set of buttons below the pull-down menus at the top. You can use the leftmost six to advance the simulation by various amounts. This can be very useful when verifying interfaces and diagnosing puzzling errors.

From left to right, the buttons are:

Go:
Run the program to completion or to the next breakpoint.
Hi-Level Step Into:
Execute the next C action. If that is a function call, enter the function and stop.
Hi-Level Step Over:
Execute the next C action.
Low-Level Step Into:
Execute the next assembly instruction.
Low-Level Step Over:
Execute the next assembly instruction. If it is a JSR, run until the called routine returns.
Go Return:
Run the program until the current procedure returns.
For a complete description of these buttons, select Help on Code Window from the pull-down Help menu in the Code window of the simulator.

Task

Write a routine that adds two 64-bit integers and returns the overflow status, implementing the interface given by file arith.h. Package this routine by adding your code to the skeleton arith.src file. Section 7.4 of the text discusses multiple-precision arithmetic, with background on binary number representation in Section 3.1. You may also want to review Sections 6.2.2 and 6.2.3, which describe the use the conditional branch instruction.

Test your routine with a driver that reads data and prints results.

Note any problems you have as they occur, and be prepared to discuss them at your next laboratory session. Also be prepared to demonstrate and explain your use of the development system to your TA during the lab session.

The individual steps are expanded below.

Obtain an executable program

Click here to get a zip file containing the source code for the driver, the assembly language skeleton, and an appropriate Makefile. Unpack that zip file just as you did in Homework #2. Start a command-line window, go to the arith directory unpacked from the zip file, and add your code to the skeleton file arith.src. When you have a completed assembly code file, execute make.

Verify the interface

Invoke the MC68000 simulator running on the HP. (Review the process from the last homework if necessary.) Set a breakpoint at the call to LongAdd in the driver, and start the program executing. When the breakpoint is reached, click the Low-Level Step Into button on the the toolbar at the top of the Code window. This will change the program display in the upper pane of the code window to show the assembly code implementing the call, with a box around the second PEA instruction. Click the same button two more times, executing the second and third PEA instructions. At this point, the program is about to call the subroutine.

Go to the Windows pull-down menu of the Code window, and select Register. A small window showing the contents of all of the registers will appear. Again go to the Windows pull-down menu of the Code window, but this time select Memory. Another window, which is capable of displaying the contents of memory locations, will appear.

First, you should configure the memory window so that it is appropriate for the program you are examining: All of the program's data consists of longs, so you want the memory display to show you longs as single entities (rather than as strings of bytes). Therefore you should select longs in the pull-down View menu of the Memory window. You also want to see changes in the memory contents when they occur, so select Automatic Update from the same menu.

The interesting area of memory for this program is the stack, at the location addressed by register A7. Type the hexadecimal address from A7, as shown in the Register window, into the Start Address pane of the Memory window. You must precede the number with 0x to indicate that it is hexadecimal, and you need not type leading zeros. Either upper- or lower-case letters are acceptable, and you should terminate the number with Enter. The display will be easier to read if you type 1 (followed by Enter) into the Display Columns pane of the Memory window. (Re-size the Memory window to display only the relevant information.)

Verify the picture of the stack that you drew for question 2b, labeling all of the locations with the actual addresses. Convince yourself that your understanding of the way the parameters are passed was correct, and be prepared to explain the picture to your TA when you demonstrate your program.

Step through your routine

Write down the address of the instruction following the JSR that is about to be executed, and use the scrollbar of the Memory window to position the top of the stack roughly in the center (so that you can see locations both above and below it). Use the Code window's Low-Level Step Into button to execute the JSR instruction and enter your subroutine. Note that both the PC and A7 displays in the Register window are highlighted in yellow to indicate that their values have changed. Verify that the appropriate return address (the address of the instruction following the JSR) has been placed onto the stack.

The first instruction of your routine should be a LINK that saves the frame pointer and sets it so that you can access the arguments passed to your routine. Before executing this instruction, try to predict the changes that it will make. Then execute it, and verify your predictions.

Continue to step though your routine, predicting the effect of each instruction and then executing it to verify that your prediction was correct. This will give you practice in using the simulator, and also will help you to visualize the behavior of the instructions. If you find this useful, you can click here to download a simple simulator for use on your own machine (this link is also in the navigation bar on the course web site).

After executing the RTS instruction at the end of your routine, you should be back at the instruction following the JSR in the driver. Click the Src tab between the panes of the Code window to see the C code, and use the Hi-Level Step Over button to output the values of the variables. If there is any error, go back and iterate the previous steps until this test case works.

Test your program on the MB5

You have now successfully executed one test, and you need to complete the testing process by running all of the tests that you proposed. Prepare a file that contains the test data you submitted in answer to question 3, plus any that you may have thought of since. Each 64-bit number must be expressed as a pair of 32-bit hexadecimal values. One set of test data is expressed as four values. Put each set of test data on a separate line of the text file. Thus the first (built-in) test could be expressed as follows:
0 1 ffffffff ffffffff

Modify the Makefile to produce code for the MB5: Lines beginning with # in a Makefile are ignored. Lines containing an = sign define a symbol, which can be substituted in a command. The symbols LNKFLAGS and LNKCMD in this Makefile control the linking process. Thus the behavior is changed simply by selecting which pair of definitions should be used.

After changing the Makefile, give the command make clean to remove all of the files concerned with the simulator implementation. Then run make to build the files for the MB5.

Download the program to the MB5

Double-click on the Tera Term icon on the desktop, or use the Windows Start menu to start Tera Term Pro. Select the Serial connection via COM1, and click OK.

Turn on the MB5 board, using the switch on the left side (all descriptions of the MB5 assume that the keypad is ``right side up'' as you look at it). All of the LEDs should be lit. If the board is already on, press the red reset button on the left side below the red LED.

In the Tera Term window, you will see text describing the five single-letter commands, A through F, implemented by MooseLoad (the MB5's ROM monitor). You want to download a program from the HP, so type an upper-case A on the HP's keyboard.

MooseLoad should now request that you send your code. From Tera Term's File menu, select Send file.... This will produce a normal Windows explorer pop-up window to allow you to select the file. Initially, this window is positioned in the TTERMPRO directory. Click the downward-pointing arrow at the right end of the Look in: pane and select the Z: drive. Navigate to the arith directory you created earlier, and open file driver.abs. Send file progress window containing a running count of the bytes transferred will now appear, and text will be written into the main Tera Term window. The downloading process can take several seconds to several minutes, depending on the size of the program. When it is complete, the progress window will disappear and MooseLoad will write its command list to the main TeraTerm window.

Run the program

After downloading your program, but before executing it, pull down the Setup menu of the Tera Term window and select Serial port. Set the Transmit delay to 100 msec/character. This will slow the rate at which characters will be sent to the MB5, giving the I/O routines time to build the data for your program. Now execute your program.

MooseLoad writes the starting address of the downloaded program just above its command list. Copy the last six characters of this starting address onto a piece of paper. You'll need it to run the program.

Type an upper-case B on the HP's keyboard to begin execution. MooseLoad will then ask you for the address at which to start running. Since you want to run the program from the beginning, type the six characters of the program's starting address, followed by Enter. The case of the letters is irrelevant here.

You should see the same output that you saw when running under the simulator.

The program is now waiting for further input. Send your test data file via Tera Term, just as you sent your program. You will see the characters of your data appearing relatively slowly. After each line is completed, the program will output the operands and result for the data on that line. You should verify that each set of test data produces the correct answer.

After you are convinced that your program is correct, change Tera Term's transmit delay back to 0 and try running your program again. Can you explain the behavior you see?

Questions

  1. Suppose that a computer represents integers in n bits using 2's complement notation.
    1. (1 point) What is the largest value of v such that all numbers from -v to +v inclusive are representable?
    2. (1 point) Are any other numbers representable? Explain briefly.
    3. (1 point) Briefly explain why an unsigned, double-precision number whose high-order component is a and whose low-order component is b has the value a*2n + b on this computer.

  2. Consider the following interface specification:
    int
    LongAdd(long *a, long *b, long *c);
    /* 64-bit addition
     *   On entry-
     *     a, b, and c point to 64-bit, big-endian integers
     *   If the computation overflows, then on exit-
     *     LongAdd=1
     *     c is undefined
     *   Else on exit-
     *     LongAdd=0
     *     c = a + b
     ***/
    
    1. (2 points) What must the name of the assembly code routine be? Explain briefly.
    2. (2 points) Draw a picture of the stack as it appears on entry to the assembly code routine.
    3. (2 points) Where must the result be stored on exit? Explain briefly.
    4. (2 points) Which registers must the assembly code routine save and restore? Explain briefly.

  3. (4 points) What pairs of values (a,b) would you select as test cases for a function with the interface specification of the previous problem? Briefly justify each of your selections.

Demonstration

You must demonstrate your program to your TA on one of the machines in the Microlab.
Instructor Revision 1.34 (2007/08/27 00:15:43)