Preface

My aim is not to replace such good text-books as "The C Programming Language" by Kernighan and Ritchie. It is my wish to ease your learning path, by giving you another viewpoint.

Wu Siu Yan from Hong Kong.


Chapter 1

In times past, computer languages were classified into low-level languages (assembler) and high-level languages (e.g. COBOL, FORTRAN, ALGOL). Assembler is tied to the actual machine instructions, that is, it depends on the actual hardware. One statement in a high-level language expands into many assembler statements, and IS INDEPENDENT OF THE HARDWARE.

Assembler language is not standardized. There is one assembler language for Z80, another for 6800, another for 6502, still another for ..... (Note : Z80, 6800, 6502, ... are all microprocessors.)

C may be regarded as a "standardized assembler" that shakes off the dependence on a particular microprocessor. Its line of descent may be written as "Assembler -> BCPL (written by Martin Richards) -> B (written by Ken Thompson in 1970 for PDP-7) -> C (written by Dennis M. Ritchie)".

Nowadays, people rarely write in assembler any more. Nearly all chip manufacturers provide a C compiler for the chips they manufacture.

Also, in times past, most "high-level languages" would compile a program by first translating it into assembler code, then translating the assembler code into machine code.

Now, a number of "high-level languages" translate a program into C first, then into machine code. (In fact, a C compiler will rarely translate directly into machine code, but into a "pseudo code" resembling machine code, then into actual machine code. The reason for this is to shake off the dependence on hardware. Hence one C compiler may compile for 68000, Z8000, Intel chips, ... all sorts of chips; what is needed is a "pseudo code to actual code translator" for each.)

This can be seen from the existence of "p2c" (Pascal to C translator), "f2c" (FORTRAN to C translator), .... (Note : p2c and f2c are two free programs that usually come with LINUX.)

The reason why I write these notes is that C will for a long, long time be the standardized "assembler", and if one is to work with chips at a lower level, one must know C. Moreover, familiarity with C will pave the way to learning many other languages, e.g. JavaScript, PHP, ... because their syntax is similar to that of C. (CAUTION : they are roughly the same as C in syntax, but their methods of doing things are TOTALLY DIFFERENT, because they are "high-level" languages, whereas C is a "standardized assembler", a low-level one.)

The 6800 chip

To fully understand C, one must know hardware. 6800 is one of the simplest chips for understanding hardware.

Number System

Before we discuss the instructions of 6800, let us first review the number system.

Decimal : 0 1 2 3 4 5 6 7 8 9

Octal : 0 1 2 3 4 5 6 7

Hexadecimal : 0 1 2 3 4 5 6 7 8 9 A B C D E F

Therefore, in decimal system, we use 10 symbols. In octal system, we use 8 symbols. In hexadecimal system, we use 16 symbols.


 Decimal       Binary       Octal      Hexadecimal

    1             1            1             1
    2            10            2             2
    3            11            3             3
    4           100            4             4
    5           101            5             5
    6           110            6             6
    7           111            7             7
    8          1000           10             8
    9          1001           11             9
   10          1010           12             A
   11          1011           13             B
   12          1100           14             C
   13          1101           15             D
   14          1110           16             E
   15          1111           17             F
   16         10000           20            10
   17         10001           21            11
   18         10010           22            12
   19         10011           23            13
   20         10100           24            14
   21         10101           25            15
   22         10110           26            16
   23         10111           27            17
   24         11000           30            18

In decimal system, we make a carry every 10 numbers.
In binary system, we make a carry every 2 numbers.
In octal system, we make a carry every 8 numbers.
In hexadecimal system, we make a carry every 16 numbers.

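Example : Binary number 1011110111 (which is 759 in decimal)
written in octal and hexadecimal :

      Octal       : group the bits in threes from the right,    1 011 110 111   =  1367
      Hexadecimal : group the bits in fours from the right,       10 1111 0111  =  2F7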

Exercise : How many symbols would be needed for a "quad decimal system"? Write the number 759 (decimal) as "quad number".

Ans : Taking "quad" to mean base 4, we need 4 symbols (0 1 2 3), and 759 (decimal) is written 23313.

Exercise : in a "m" system, how many symbols do we need? And what is the general expression for a number in that system ?

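Ans : We need m symbols (0, 1, ..., m-1), and a number written with the digits d_n ... d_1 d_0 has the value d_n*m^n + ... + d_1*m + d_0, where each digit d_i is between 0 and m-1.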
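
The carrying rule above can be tried out in C. What follows is only a sketch, not part of the original notes: the function name to_base and its buffer arguments are made up for illustration, and it handles bases 2 to 16 only. It finds the digits by dividing by m again and again, each remainder being the next digit from the right.

```c
#include <limits.h>
#include <stddef.h>

/* Write `value` in base `m` (2 to 16) into `out` as a NUL-terminated string.
 * `out` must have room for `size` bytes. Returns out, or NULL if the base is
 * not supported or the buffer is too small.
 * (to_base is a hypothetical helper, invented for this sketch.) */
char *to_base(unsigned value, unsigned m, char *out, size_t size)
{
    static const char symbols[] = "0123456789ABCDEF";
    char tmp[sizeof(unsigned) * CHAR_BIT];  /* enough digits even for base 2 */
    size_t n = 0;

    if (m < 2 || m > 16)
        return NULL;

    /* Repeated division: each remainder is the next digit, lowest first. */
    do {
        tmp[n++] = symbols[value % m];
        value /= m;
    } while (value != 0);

    if (n + 1 > size)
        return NULL;

    /* The digits came out in reverse order, so copy them back to front. */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return out;
}
```

With a buffer declared as char buf[40], the call to_base(759, 2, buf, sizeof buf) gives "1011110111", base 8 gives "1367" and base 16 gives "2F7".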

Negative Number

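In this section, we use only BINARY numbers.

We usually put a "-" in front to denote a negative number, e.g. "-1234". But in a computer, we do it another way : "-a" is the number that, when added to "a", gives 0, i.e. a + [-a] = 0, where any carry is neglected. e.g. Taking 759 = (1011110111) as a 10-bit number, its negative is (0100001001), because the two add up to (10000000000) and the carry out of the top bit is neglected.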

Exercise : The registers of 6800 are 8 bits long. Hence each can represent at most 2^8 = 256 numbers. And if negative numbers are considered, it can represent -128 to 127.
Write 12 (decimal) in binary, then find its negative [-12], assuming the register is 8 bits long. Also do the same for 28. (Hint : to find a number that adds to a to give 0, (a+[-a])=0 , first invert all the bits of a, then sum the two up. You will find all 1's. Next ....)

Ans : 12 = (00001100), [-12] = (11110100).  28 = (00011100), [-28] = (11100100).



We notice that "negative" in the computer sense is not what we desire mathematically, because of the "carry", or "overflow". But this is hardware limitation that we have to live with.

Exercise : Show that, e.g. 30 - 12 = 30 + [-12] , where [-12] is the computer's negative number, will work.

Ans : Since 12 + [-12] = (100000000) (where (....) is in binary), therefore
                 [-12] = (100000000) - 12.
      Hence 30 + [-12] = 30 - 12 + (100000000)  which is the correct representation
                                                of 18 as the "carry, 100000000" is 
                                                discarded.
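
The same arithmetic can be checked in C. This is a small sketch under stated assumptions: the names negate8 and add8 are made up here, and the type uint8_t from <stdint.h> stands in for an 8-bit register.

```c
#include <stdint.h>

/* The computer's "negative" of an 8-bit value: invert all bits, then add 1.
 * The cast back to uint8_t throws away any carry out of bit 7. */
uint8_t negate8(uint8_t a)
{
    return (uint8_t)(~a + 1u);
}

/* 8-bit addition in which the carry out of the top bit is discarded. */
uint8_t add8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

Here negate8(12) is (11110100), and add8(30, negate8(12)) is 18, in agreement with the exercise.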

Introduction to instructions of 6800.

We can see, from the figure above, that 6800 has 2 registers, A and B. Both registers are 8 bits (= 1 byte) long, hence 6800 is called an 8-bit microprocessor. (Intel's 386 etc. are called 32-bit microprocessors.) We will use M to represent a memory location in the discussion below.

Instructions


      (1) Load registers instructions             M -> A
                                                  M -> B

          (meaning : the content from memory is copied into A, or B)

      (2) Store registers instructions            A -> M
                                                  B -> M

          (meaning : the content of A, or B is copied back to memory)

      (3) Add instructions                        A + B   -> A
                                                  A + M   -> A
                                                  B + M   -> B
                                                  A + M + C   -> A  (C=carry bit from 
                                                  B + M + C   -> B     previous operation.)

          Notice that for adding long numbers, we have to add byte by byte, and in 
          each addition, there may be "C, carry", which is needed for the next addition.

          (A+B -> A, means, the content of A and B are added and the sum is stored
          back to A.) 
          
      (4) Subtract instructions                   A - B   -> A
                                                  A - M   -> A
                                                  B - M   -> B
                                                  A - M - C   -> A  (C=carry bit from
                                                  B - M - C   -> B     previous operation.)
          
          Form "negatives"                        [-M]    -> M
                                                  [-A]    -> A
                                                  [-B]    -> B
          
          (Here the "negatives" are formed as above - the bits are inverted, then
           1 is added to it.)

      (5) Logical instructions                    A and M   -> A
                                                  B and M   -> B

             (inclusive "or")                     A or M    -> A
                                                  B or M    -> B

             (exclusive or, "xor")                A xor M   -> A
                                                  B xor M   -> B

             (complement, " 1 -> 0, 0 -> 1 " )    ~A  -> A       (bits in A are inverted
                                                                  then put back to A.)
                                                  ~B  -> B     

      (6) Increment by 1 instructions             M + 1    -> M
                                                  A + 1    -> A
                                                  B + 1    -> B

          Decrement by 1 instructions             M - 1    -> M
                                                  A - 1    -> A
                                                  B - 1    -> B

      (7) Rotation/shift instructions, the operation may be performed on M, or A, or B.
          Bit by bit, they are rotated/shifted.

          

      (8) Clear                                   0    -> M
                                                  0    -> A
                                                  0    -> B

      (9) Transfer between A and B                A -> B
                                                  B -> A
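
Item (3) notes that long numbers have to be added byte by byte, with the carry handed on from one addition to the next. A possible sketch of this in C is given below. The function name add16_bytewise and the choice of storing the low byte first are assumptions made for the illustration only.

```c
#include <stdint.h>

/* Add two 16-bit numbers one byte at a time, the way an 8-bit processor must.
 * Each number is stored as two bytes: index 0 = low byte, index 1 = high byte.
 * (This byte order is a choice made for the sketch.)
 * Returns the final carry (0 or 1). */
unsigned add16_bytewise(const uint8_t a[2], const uint8_t b[2], uint8_t sum[2])
{
    unsigned carry = 0;

    for (int i = 0; i < 2; i++) {
        unsigned t = (unsigned)a[i] + b[i] + carry;  /* like A + M + C -> A */
        sum[i] = (uint8_t)t;                         /* keep the low 8 bits */
        carry = t >> 8;                              /* bit 8 is the carry  */
    }
    return carry;
}
```

For example, 759 is the byte pair F7, 02 and 300 is the pair 2C, 01. Adding the low bytes gives 23 with a carry, and adding the high bytes plus that carry gives 04, i.e. 0423 hex = 1059.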

Indirect Addressing and Index Register

(We illustrate this concept using the "load register A" (M -> A) instruction in its various formats.)

If you look at the 6800 diagram above, you will find an "index register", which is 16 bits long, in contrast to the A and B registers, which are 8 bits long.

Another name for "index register" is "address register", and it is used for "indirect addressing".

        DIRECT ADDRESSING, we illustrate with "load register A" instruction.
        -----------------

              The following "load register A" instruction uses "direct addressing" :

              (Notice that all numbers are in HEXADECIMAL or HEX, and 1 byte will require 2
               hexadecimal digits.)

                                         Machine code 
                                            in hex
 
              (a) Load A immediate           86 xx     (86 is the "instruction code",
                                                        xx will contain the value to
                                                        be loaded. e.g.

                                                          86 1A

                                                        The value 1A  (=26 in decimal)
                                                        will be loaded in register A.)

              (b) Load A short address       96 xx     (96 is the "instruction code",
                                                        xx represents the address.  As xx
                                                        is one byte long, the address must
                                                        be in the range 0 ~ 255. e.g.

                                                           96 1A

                                                        Computer will get the value from
                                                        address "1A" and put it in A.)

              (c)  Load A long address       B6 xxxx   (Again, "B6" is the "instruction
                                                        code", but the address xxxx is
                                                        2 bytes long, hence the whole 64K
                                                        address, 0 ~ 65535 may be accessed. 
                                                        e.g.

                                                            B6  A94B

                                                        computer will get the value from
                                                        address "A94B (=43339)" and put
                                                        it into A.)

       INDIRECT ADDRESSING
       -------------------

           In indirect addressing, the value is fetched in 2 steps.
           First, we go to memory and get a value.  This value is not what
               we want; it is the "address".
           Next, using this "address", we get the value we want.

           In what follows, "address register" = "index register"

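              (d) Load into address register.  
                                               FE xxxx  ("FE" is the "instruction code",
                                                          xxxx is the address.
                                                          e.g.
                                                          
                                                             FE A94B

                                                          The 2 bytes stored at addresses
                                                          "A94B" and "A94C" are loaded
                                                          into the address register.)

              (e) Load A using address register
                                               A6 xx    ("A6" is the "instruction code".
                                                          xx is the "offset"
                                                          e.g.

                                                             A6 05

                                                          Computer adds the offset 05 to
                                                          the content of the address
                                                          register, gets the value from
                                                          the resulting address, and puts
                                                          it into A.)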

(Notice that "direct addressing" and "indirect addressing" are NOT limited to the "load A" 
 instruction, nearly all instructions can use both direct, and indirect addressing, e.g.
 Add instructions,  M + A -> A, subtract instructions, B - M -> B , .... )
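
To see how these formats differ, we may imitate them in C. This is only a sketch: the array memory stands in for the 64K address space, and the function names are made up for the illustration. It assumes the 2 bytes of an address are stored high byte first, as on the 6800.

```c
#include <stdint.h>

static uint8_t memory[65536];   /* stand-in for the 64K address space */

/* (a) immediate: the value is part of the instruction itself. */
uint8_t load_immediate(uint8_t value)
{
    return value;
}

/* (b), (c) short or long address: the instruction holds the address. */
uint8_t load_direct(uint16_t address)
{
    return memory[address];
}

/* (d) load the 16-bit address register from two consecutive bytes,
 * high byte first. */
uint16_t load_index(uint16_t address)
{
    return (uint16_t)((memory[address] << 8) | memory[(uint16_t)(address + 1)]);
}

/* (e) indexed: the address is the address register plus an 8-bit offset. */
uint8_t load_indexed(uint16_t x, uint8_t offset)
{
    return memory[(uint16_t)(x + offset)];
}
```

In C itself, the address register corresponds roughly to a pointer variable p, and "load A using address register with offset 05" corresponds to reading p[5].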

Instructions of 6800 (cont'd)

(Note : Index register = address register.
        Also we may use direct addressing, which comprises "immediate, short address,
        long address", or "indirect addressing", which uses Index register, with nearly
        all instructions, including the following.)

       (10) Load Index Register                M (2 bytes)  ->  X (2 bytes)

       (11) Store Index Register               X (2 bytes)  ->  M (2 bytes)

       (12) Increment Index Register                X + 1   ->  X
     
       (13) Decrement Index Register                X - 1   ->  X 

Examples of use of "indirect addressing"

         (a) If we have 5 numbers (say, 64, 78, 90, 85, 66) and we wish to find
             their sum, we would proceed as follows :

                     Clear register A
                     Load 5 into register B, which serves as a counter.
                     Load the address where the first number (first byte) is 
                          stored into Address Register.

               L20:  Add to A using Address register
                     Increment Address Register by 1
                     Decrement B by 1

                     If B is not 0, then goto L20
                     store A (which contains the sum) to somewhere.

         (b) Suppose we have a record containing

                     Name (20 bytes)  Postal address (40 bytes)  Telephone (10 bytes)

             Then to access name, we load into Address Register the address of
               the 1st byte of name.
             To access the Postal address, we use Index Register with offset = 20.
             To access Telephone number, we use Index Register with offset = (20+40) = 60.
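
Both examples have a direct counterpart in C, where a pointer plays the part of the address register. The sketch below is an illustration only: the names sum_bytes and struct record are made up, and the sum is kept in an unsigned int instead of an 8-bit register.

```c
#include <stddef.h>
#include <stdint.h>

/* (a) Sum `count` bytes starting at `p`. The pointer p plays the part of the
 * address register and `count` the part of the counter in register B. */
unsigned sum_bytes(const uint8_t *p, unsigned count)
{
    unsigned sum = 0;          /* "Clear register A" (but wider than 8 bits) */

    while (count != 0) {       /* "If B is not 0, then goto L20" */
        sum += *p;             /* "Add to A using Address register" */
        p++;                   /* "Increment Address Register by 1" */
        count--;               /* "Decrement B by 1" */
    }
    return sum;
}

/* (b) The record from the text. All members are char arrays, so ordinary
 * compilers insert no padding and the offsets come out as 0, 20 and 60. */
struct record {
    char name[20];
    char postal_address[40];
    char telephone[10];
};
```

For the 5 numbers in (a), sum_bytes returns 383. This is more than an 8-bit register can hold: on the 6800, register A would be left with only the low 8 bits, 127, and the program would have to take care of the carry. For (b), offsetof(struct record, telephone) gives the offset 60 used above.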

Stack

Inside 6800, there is a "stack pointer" (see diagram above). The "stack pointer" functions like an address register (= index register). It points to the area of memory where we are to store temporary data.

Use of Stack :

  1. During input/output operations, the peripheral I/O chip will pull the "Interrupt Request wire" low (i.e. the voltage changes from high to low), telling 6800 that it needs service. 6800 will finish off its current instruction, store away (push into stack) the contents of Registers A, B, Index Register, condition/status register, Program Counter (which stores the return address), then branch to the I/O service subroutine.

  2. User subroutines. We use subroutines to avoid repeated coding. Before we jump to the subroutine, we have to store away the relevant information (registers A, B, Index Register, condition/status register, Program Counter), and we also have to pass arguments to the subroutine. The values of the arguments are pushed into the stack.

Instructions of 6800 that relate to stack

       (14) Load Stack Pointer                M (2 bytes)  ->  SP (2 bytes)

       (15) Store Stack Pointer             SP  (2 bytes)  ->  M (2 bytes)

       (16) Increment Stack Pointer               SP + 1   ->  SP
     
       (17) Decrement Stack Pointer               SP - 1   ->  SP

       (18) Push Data into stack, address is taken from SP, then SP will be decremented 
            by 1 automatically.
                                                        A  ->  [SP]  indirect addressing
                                                   SP - 1  ->  SP

                                                        B  ->  [SP]  indirect addressing
                                                   SP - 1  ->  SP

       (19) Pop data (or pull back data) from stack.  First SP is incremented, then using
            this address, data are popped back into A or B.

                                                   SP + 1  ->  SP
                                                     [SP]  ->  A     indirect addressing

                                                   SP + 1  ->  SP
                                                     [SP]  ->  B     indirect addressing
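
The push and pop steps in (18) and (19) can be imitated in C as follows. This is a sketch only: the array ram, the variable sp and its starting value 01FF hex are all made up for the illustration.

```c
#include <stdint.h>

static uint8_t ram[65536];      /* stand-in for memory */
static uint16_t sp = 0x01FF;    /* stack pointer; the starting value is arbitrary */

/* Push: store at the address in SP, then decrement SP. */
void push(uint8_t value)
{
    ram[sp] = value;      /* A -> [SP]    */
    sp--;                 /* SP - 1 -> SP */
}

/* Pop: increment SP first, then fetch from the address in SP. */
uint8_t pop(void)
{
    sp++;                 /* SP + 1 -> SP */
    return ram[sp];       /* [SP] -> A    */
}
```

Notice that the stack grows towards lower addresses, and that the last value pushed is the first one popped back.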

Condition/Status Register

The condition/status register has 8 bits :

           bit 0  --  Carry bit
               1  --  Overflow bit
               2  --  Zero bit
               3  --  Negative bit
               4  --  Interrupt bit
               5  --  Half carry bit 
            6, 7  --  not used

and after each instruction, (e.g. load, store, add, subtract, and, or, xor, increment, decrement, rotation or shift, ... ), the bits in the condition/status register are set/reset accordingly.

        e.g.  If we load 0 in A, then the zero bit is set.

              If we subtract 12 - 15 in A, 
                      12 - 15 = (00001100) + (11110001) = (11111101)
              and (11111101) is  -3  in computer representation.  It is a negative
              number, as bit 7 (the leftmost bit) is 1.  Hence the "negative bit" is set.

The bits in the condition/status register are for use in "if" statements, e.g. branch if result is 0, branch if result is positive, ....
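
How three of these bits follow from a result may be sketched in C. The struct flags and the function add8_flags are made-up names, and only 8-bit addition is modelled here. How the real chip sets the carry bit for subtraction, and the overflow and half carry bits, are left out.

```c
#include <stdint.h>

struct flags {
    unsigned carry;     /* 1 if there was a carry out of bit 7 */
    unsigned zero;      /* 1 if the 8-bit result is 0          */
    unsigned negative;  /* 1 if bit 7 of the result is 1       */
};

/* Add two 8-bit values and report the result together with three flags. */
uint8_t add8_flags(uint8_t a, uint8_t b, struct flags *f)
{
    unsigned wide = (unsigned)a + b;    /* the sum, up to 9 bits long */
    uint8_t result = (uint8_t)wide;     /* what stays in the register */

    f->carry = (wide >> 8) & 1u;
    f->zero = (result == 0);
    f->negative = (result >> 7) & 1u;
    return result;
}
```

With a = (00001100) and b = (11110001), i.e. 12 + [-15], the result is (11111101) with the negative flag set and the zero flag clear, as in the example above.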

Instructions of 6800 that set/reset bits in condition/status register

        (20)  Clear carry bit
              Clear Interrupt bit
              Clear Overflow bit
              Set carry bit
              Set Interrupt bit
              Set Overflow bit

        (21)  Bit Test            
                                                      M and A
                                                      M and B
              (logical AND is performed, and the bits in condition register are set/reset
               accordingly, but result "M and A" is not stored, it is discarded.

               e.g.  If we wish to check if bit 3 in M is set (counting from bit 0
                     at the right-hand end), we would use
                                           
                               Load A with (00001000)
                               Bit test      M and A

               If "zero bit" is set, then bit 3 in M is 0. Otherwise it is 1.
               )

         (22)  Subtraction without storing the results, simply for setting/resetting bits
               in condition register
                                                      A - M
                                                      B - M
                                                      A - 0  (or simply A)
                                                      B - 0  (or simply B)
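
In C, "bit test" and "subtract without storing" appear as expressions whose value is used only for a decision. A small sketch, with made-up function names:

```c
#include <stdint.h>

/* Bit test: AND the value with a mask and look only at whether the result is
 * zero. Bits are numbered from 0 at the right-hand end; `bit` must be 0 to 7. */
int bit_is_set(uint8_t m, unsigned bit)
{
    uint8_t mask = (uint8_t)(1u << bit);   /* bit 3 gives (00001000) */
    return (m & mask) != 0;
}

/* Compare: form a - m only to learn about the result; nothing is stored. */
int is_equal(uint8_t a, uint8_t m)
{
    return (uint8_t)(a - m) == 0;          /* the "zero bit" would be set */
}
```

For instance, bit_is_set(m, 3) performs the same check as the "Load A with (00001000), Bit test" pair in (21).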

Instructions of 6800 that relate to both conditional branching and unconditional branching.

         (23)  Unconditional branching (i.e. go to )
               Branch to subroutine
               Return from subroutine
               Software Interrupt (user initiated)

         (24)  Conditional branching

               There are many instructions in this group, based on the various bits
               in condition register, 
               e.g.

                       branch if carry bit is set
                       branch if carry bit is clear
                       branch if zero bit is set
                       branch if zero bit is clear
                       branch if negative bit is set
                       branch if negative bit is clear
                            ......
                            ......
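
C keeps a "goto" statement, so the branching style of (23) and (24) can be written out literally. The sketch below counts the bytes before the first zero byte. The function name count_until_zero is made up, and in everyday C one would write a while loop instead.

```c
#include <stddef.h>

/* Count the bytes starting at p up to, but not including, the first zero byte. */
size_t count_until_zero(const char *p)
{
    size_t n = 0;

loop:
    if (*p == 0)        /* test the byte: "branch if zero bit is set" */
        goto done;
    p++;                /* increment the address register */
    n++;
    goto loop;          /* unconditional branch */

done:
    return n;
}
```

The "if ... goto" line is the conditional branch, and the plain "goto loop" is the unconditional one.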

I have roughly explained the instructions of 6800 (only a few are left out). My aim is not to teach 6800 chip programming, but to give you an idea of microprocessor hardware. We have seen that there is an address register for indirect addressing, there are instructions for hardware increment and decrement, and there is a condition/status register that reflects the results of operations (load, store, add, subtract, and, or, xor ....).

Motorola later produced 68000, which is a great improvement over 6800. 68000 has 8 data registers each 32 bits long, and 8 address registers, also 32 bits long. It can deal with bytes, word (=2 bytes), or long word (=4 bytes). From there evolved 68010, 68020, ...

Intel 8086 was used in old PC's. It has 4 data registers (AX, BX, CX, DX) each 16 bits long, and 4 address registers (SP, BP, SI, DI = stack pointer, base pointer, source index register, destination index register) each 16 bits long. 80286 kept the 16-bit registers; from 80386 onwards (80386, 80486, ....) the registers are extended to 32 bits long. Moreover floating point hardware and memory management are incorporated into the later chips.

I hope, with this background knowledge, you will be better prepared to understand C, which differs greatly from the QBASIC we have learnt, and is closely related to computer hardware.

Should you become interested in microprocessors and wish to learn more, I suggest you buy a "micro-processor training kit" and also read more material about them on the Internet.

