Make an empty tutorial folder. Make a basic.asm file in it. Type this code into the basic.asm file:
BITS 32 ; Tell NASM we're using 32-bit instructions by default.
GLOBAL _main ; Tells the linker about the label called _main
SECTION .text ; Says that this is the start of the code section.
_main: ; Code execution will start at the label called _main
mov eax, 42 ; Put 42 in EAX. The simplest program you'll write in this class.
ret ; Return to DJGPP's crt0 library startup code
Go into the directory and type nasm -f coff basic.asm -l basic.lst. This assembles basic.asm and creates basic.o as a COFF object file, the format that the DJGPP linker can read. (A quick run of nasm -hf lists the output formats NASM supports.) The object file contains the assembled code and data, plus information about the variable and label names so that the linker can link it with other object files and the system libraries. NASM also creates a listing file called basic.lst, which contains the source annotated with line numbers, addresses, and the assembled bytes. Look at this file. What is the opcode for ret? How many bytes do the two instructions in the _main function occupy? Note how many bytes the constant "42" takes up.
Type objdump --disassemble-all basic.o to disassemble the object file that NASM created and print its contents to the screen. (This step doesn't change anything; it's just a way to see what NASM produced.) Look at the objdump output. This is the information that's in an object file. Question 3: How much of the mov instruction is actually opcode, and how much is data? Hint: find the hex value of your data in the instruction bytes.
Type gcc -o basic basic.o. This runs gcc, which in turn runs the linker to combine basic.o with the DJGPP startup code and produce an executable.
A linker takes a bunch of assembled object files and sticks them all into one big object (probably executable) file. Object files can call routines or access variables in other object files as long as they are declared GLOBAL in one object file and EXTERN in the others. When something is declared as GLOBAL, NASM will put its name and address into the object file it creates. Other source files, with EXTERN references to that routine or variable, will be assembled into object files with unresolved references. The linker takes these object files, matches up the names, and puts the address of the GLOBAL routine or variable into the code in place of each unresolved name. This is how the LIB291 library code has been matched up with the MP code since the beginning of the class.
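As a minimal sketch of GLOBAL and EXTERN working together, the file below (hypothetically named hello.asm, not part of this tutorial's files) exports _main and imports _puts from the C library. It assumes the same DJGPP conventions as basic.asm: C symbols carry a leading underscore, and arguments are pushed on the stack and removed by the caller.

```nasm
BITS 32
GLOBAL _main            ; exported: the DJGPP startup code calls _main
EXTERN _puts            ; imported: defined in the C library, resolved by the linker

SECTION .data
msg:    db "Hello from NASM", 0   ; NUL-terminated string for puts

SECTION .text
_main:
        push    dword msg       ; push the argument (the address of msg)
        call    _puts           ; left unresolved in hello.o until link time
        add     esp, 4          ; caller removes the argument from the stack
        xor     eax, eax        ; return 0 from _main
        ret                     ; back to the startup code
```

Assembling with nasm -f coff hello.asm and linking with gcc -o hello hello.o should follow the same steps used for basic. Running objdump on hello.o before linking is one way to see the unresolved reference to _puts.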
Type "objdump --disassemble-all basic > out.txt" and look at out.txt (which is now huge) to see a dump of the linked file, with all the library code included. Find <_qsort>. Find <start> and <exit>. <start> is where protected mode execution actually starts. It eventually calls your <_main>. When <_main> returns, execution passes into the <exit> code which calls the <_exit> code which calls the <__exit> code, which finally leaves protected mode. This is how C works. Be afraid. Be very afraid.
Type basic to run the example program. Nothing happened? Good. Marvel at the fact that there's only one line of assembly, yet twenty million things had to go on to get into and out of protected mode, to load the code, to interact with the operating system, to toggle the bits in the microprocessor, to manipulate the quantum state of billions and billions of electrons, etc., etc.
If it seems like an excessive amount of work for one line of code, it is. It's possible to do in real mode exactly what this "basic" program did. Keep in mind that this is only the beginning, and it's good to start simple.
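For comparison, here is a rough sketch of a real-mode counterpart, written as a DOS .COM program. The .COM format and the file name basic16.asm are assumptions made for illustration; the real-mode programs in this class may be built differently.

```nasm
BITS 16
ORG 100h                ; DOS loads .COM programs at offset 100h
start:
        mov     ax, 42  ; same idea as before, in a 16-bit register
        ret             ; near return to offset 0 of the PSP, which holds INT 20h
```

Something like nasm -f bin basic16.asm -o basic16.com should produce a runnable file of only a few bytes, with no linker and no startup code involved. One difference to keep in mind: this version terminates through INT 20h rather than returning a value to C startup code.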
Type cv32 basic to actually see what's going on. CV32 is the best protected mode debugger available in ECE 291 at the moment. Hit F8 a few times to step through. (Go slowly, or it's easy to miss the one line of code!) Alt-H brings up a help screen. Alt-X exits. CV32 will become more useful as the programs get more complex.