---[ Phrack Magazine   Volume 8, Issue 52 January 26, 1998, article 17 of 20

-------------------------[ Protected mode programming and O/S development

--------[ Mythrandir <jwthomp@cu-online.com>

----[ Foreword

About two months ago I decided to begin learning about developing an operating system from the ground up. I have been involved in trusted operating systems development for over two years now but have always done my work with pre-existing operating systems. Mucking with this driver model, deciphering that streams implementation, loving this, hating that. I decided it was time to begin fresh and start really thinking about how to approach the design of one, so that I would be happy with every part. At least if I wasn't, I would only be calling myself names.

This article is the first tentative step in my development of an operating system. What is here is not really much of a kernel yet. The big focus of this article will be getting a system up and running in protected mode with a very minimal kernel. I stress minimal. I have been asked repeatedly what my design goals for this operating system are. The fact is the operating system itself was the goal for this part. There was simply too much that I didn't know about this stage of the development to go on designing something. It would be like asking a kindergarten fingerpainter what her final masterpiece was going to look like.

However, now that I have this phase reasonably done, it is time to begin thinking about such issues as: a security subsystem, a driver subsystem, as well as developing a real task manager and a real memory manager. Hopefully, by the next phrack I will be able to not only answer what I want for these topics but will also have implemented many of them. This will leave me with a much more solid kernel that can be built upon.

So, why write this article? There are several reasons. First, writing down what you have done always helps solidify your thoughts and understanding.
Second, having to write an article imposes a deadline on me which forces me to get the job done. Finally, and most importantly, I hope to give out enough knowledge that others who are interested in the subject can begin to do some work in it.

One comment on the name. JeffOS is not going to be the final name for this OS. In fact several names have been suggested. However, I have no idea yet what I want to call it, mostly because it just isn't solidified enough for a name. When it's all said and done, I do hope I can come up with something better than JeffOS. For now, getting a real working kernel is more important than a real working name.

I hope that you find the following information interesting, and worth investigating further.

Cheers,
Jeff Thompson
AKA Mythrandir

PS: Some words on the Cryptography article. First, a thank you for all of the letters that I received on the article. I am happy to find that many people found the article interesting. For several people it rekindled an old interest, which is always great to hear. However, for several people I have unfortunate news as well. The next article in the series will have to be postponed for a few issues until I complete this operating system. As with many people, I have been caught by a new bug (the OS bug) and have committed myself to the work for some time. I am of course still interested in discussing the topic with others and look forward to more email on the subject.

The winners of the decryption contest were:

1st message:
    1st) Chaos at chaos@vector.nevtron.si
    2nd) Oxygen at oxygen@james.kalifornia.com

    Solution: The baron's army will attack at dawn. Ready the Templar knights and strike his castle while we hold him.

2nd message:
    1st) Chaos

    Solution: MULTICAST PROTOCOLS HAVE BEEN DEVELOPED TO SUPPORT GROUP COMMUNICATIONS THESE PROTOCOLS USE A ONE TO MANY PARADIGM FOR TRANSMISSION TYPICALLY USING CLASS D INTERNET PROTOCOL ADDRESSES TO SPECIFY SPECIFIC MULTICAST GROUPS

Also, there is one typo in my article. The book which was written without the letter 'e' was not The Great Gatsby, but rather Gadsby. Thanks to Andy Magnusson for pointing that out.

Great job guys!

----[ Acknowledgements

I owe a certain debt to two people who have been available to me during my development work. Both have done quite a bit of work developing their own protected mode operating systems. I would like to thank Paul Swanson of the ACM@UIUC chapter for helping solve several bugs and for giving me general tips on issues I encountered. I would also like to thank Brian Swetland of Neoglyphics for giving me a glimpse of his operating system. He was also nice enough to allow me to steal some of his source code for my use. This source includes the console io routines, which saved me a great deal of time. Also, the i386 functions were given to me by Paul Swanson, which has made a lot of the common protected mode instructions easily usable.

For new releases and information on this operating system's work, I am currently redoing my web site and will have it up by Feb 1, 1998. I will be including this entire article on that site along with all updates to the operating system as I work on it.

One of the first things that I will be doing is rewriting all of the kernel. A large part of what is contained within these pages was a learning experience. Unfortunately, one consequence of trying to get this thing done was it becoming fairly messy and hackish. I would like to clean it up and begin to build upon it. Having a good code base will be invaluable to this. So please watch for the next, and future, releases of this code and feel free to contact me with any feedback or questions. I will do my best to help.
I won't be able to answer every question but I will certainly try. Also, please be patient as I have a very busy schedule outside of this project and am oftentimes caught up by it.

I can be reached at:
    jwthomp@cu-online.com
and my web site is at:
    http://www.cu-online.com/~jwthomp/ (Up Feb 1, 1998)

----[ Introduction

Throughout this document I assume a certain level of knowledge on the part of the reader. This knowledge includes C and assembly language programming, and x86 architecture.

The development requirements for the GuildOS operating system are:

An ELF compiler
    I used the gnu ELF compiler which comes with linux. It is possible to use other ELF cross compilers on other systems as well.

a386 assembler
    This can be obtained from:

        Eric Isaacson
        416 E. University Ave.
        Bloomington IN 47401-4739
        71333.3154@compuserve.com

    or call 1-812-335-1611

        A86+D86+A386+D386 is $80
        Printed manual $10

    This is a really nice assembler. Buy a copy. I did. It is also possible to convert the boot loader assembly code to another assembler.

A 486+ machine
    You must have a machine to test the OS on.

Great books to read to gain an understanding of the various topics presented in the following pages are:

Protected Mode Software Architecture by Tom Shanley from MindShare, Inc.
ISBN 0-201-55447-X $29.95 US
    This book covers the protected mode architecture of the x86. It also explains the differences between real mode and protected mode programming. This book contains much of the information which is in the Intel Operating Systems Developers guide, but also explains things much more in depth.

Developing Your Own 32-Bit Operating System by Richard A. Burgess from SAMS Publishing.
ISBN 0-672-30655-7
    This book covers the development of a complete 32-bit OS. The author also creates his own 32-bit assembler and compiler. Considerable portions of the code are written in asm, but there is still quite a bit in C.

The entire Intel architecture series and their OS developers guides, which are available from their web site for free.

----[ Chapter 1 - Booting into protected mode

The first step in setting up an operating system on the x86 architecture is to switch the machine into protected mode. Protected mode allows you to use hardware protection schemes to provide operating system level security.

The first component which I began working on was the first stage boot loader, which is located in "JeffOS/loader/first/". The first stage boot loader is placed on the first sector of the floppy. Each sector is 512 bytes. This is not a lot of room to write all of the code required to boot into protected mode the way I would like to, so I had to break the boot loader into two parts: the first and second stage floppy loaders.

After the Power-On Self-Test (POST), this first sector is loaded into memory at location 0000:7C00. I designed the first stage of the floppy boot loader to load up all of the files into memory to be executed. The first instruction in the boot loader jumps to the boot code. However, between the jump and the boot code are some data structures.

The first section is the disk parameters. I'm not currently using any of this information but will in future versions. The next set of structures contains information on the other data files on the floppy disk. Each structure looks like this in assembly:

    APCX  DW 0000h   ; Specifies CX value for INT 13h BIOS routine
    APDX  DW 0000h   ; DX
    APES  DW 0000h   ; ES
    APBX  DW 0000h   ; BX
    APSZ  DB 0h      ; Specifies number of sectors to read in
    APSZ2 DB 0h      ; Unused

There are four copies of this structure (APxx, BPxx, CPxx, DPxx).

The INT 13h BIOS call has the following arguments:

    ch: Cylinder number to start reading from.
    cl: Sector number to start at.
    dh: Head number of drive to read from (00h or 01h for 1.44M floppy disk drives)
    dl: Drive number (00h for Disk A)
    es: Segment to store the read in sectors at.
    bx: Offset into the segment to read the sectors into.
    ah: Function number for INT 13h. (02h is to read in from the disk)
    al: Number of sectors to read in.

I use the APxx to load the second stage boot loader. BPxx is being used to load the first stage kernel loader. CPxx is used to load a simple user program. Finally, DPxx is used to load the kernel in.

Following the loader structures are two bytes reserved for temporary data, SIZE and SIZE2. SIZE is used but SIZE2 is not currently used.

The boot code follows these structures. This boot code relocates itself into another section of memory (9000:0000 or 90000h linear). Once relocated, it loads all of the files into memory and then jumps into the beginning of the second stage boot loader.

The first part of the second stage boot loader contains a macro which is used to easily define a Global Descriptor Table (GDT) entry. In protected mode the GDT is used to store information on selectors. A selector in protected mode is referred to by a number stored in any of the segment registers. A selector has the following format:

    Bits      Use
    15 - 3    Descriptor Table Index
    2         Table Indicator
    1 - 0     The Requestor Privilege Level

The Descriptor Table Index (DT) selects an entry in the GDT. Because the 3 least significant bits of a selector are used for other information, the selector for the first entry in the GDT is 00h, the second is 08h, then 10h, etc. Masking those bits off (DT = Selector & 0xfff8) gives the byte offset of the entry within the GDT, since each entry is 8 bytes long; shifting the selector right by 3 gives the entry number.

The Table Indicator selects whether you are using a GDT or a Local Descriptor Table (LDT). I have not yet had a reason to use LDT's so I will leave this information to your own research for now.

Finally, the Requestor Privilege Level is used to tell the processor what level of access you would like to have to the selector.

    0 = OS
    1 = OS (but less privileged than 0)
    2 = OS (but less privileged than 1)
    3 = User level

Typically levels 0 and 3 are the only ones used in modern operating systems.

The GDT entries which describe various types of segments have the following form:

    63 - 56    Upper Byte of Base Address
    55         Granularity Bit
    54         Default Bit
    53         0
    52         Available for Use (free bit)
    51 - 48    Upper Digit of Limit
    47         Segment Present Bit
    46 - 45    Descriptor Privilege Level
    44         System Bit
    43         Data/Code Bit
    42         Conforming Bit
    41         Readable bit
    40         Accessed bit
    39 - 32    Third Byte of Base Address
    31 - 24    Second Byte of Base Address
    23 - 16    First Byte of Base Address
    15 - 8     Second Byte of Limit
    7 - 0      First Byte of Limit

The base address is the starting location of the segment that the descriptor describes (for code or data segments).

The limit is the length of the segment in bytes or in 4k pages. Which one it is depends on the setting of the granularity bit. If the granularity bit is set to 0 then the limit specifies the length in bytes. If it is set to 1 then the limit specifies the length of the segment in 4k pages.

The default bit specifies whether the code segment is 32bit or 16bit. If it is set to 0 then it is 16bit. If it is set to 1 then it is 32bit.

The present bit is set to one if the segment is currently in memory. This is used for virtual paging.

The descriptor privilege level is similar to the RPL. The DPL simply states at what protection level the segment exists. The values are the same as for the RPL.

The system bit is used to specify whether the descriptor describes a system segment. It is set to 0 if it is a system (OS) segment, and to 1 for ordinary code and data segments.

The data/code bit is used to specify whether the segment is to be used as a code segment or as a data segment. A code segment is used to execute code from and is not writable. A data segment is used for stacks and program data. Its format is slightly different from the code segment depicted above.
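
The selector and descriptor layouts above can be exercised with a small C sketch. This is not code from the JeffOS sources: the names `decode_selector`, `selector_offset` and `make_descriptor` are hypothetical, and the grouping of bits 47 - 40 into an "access" byte and bits 55 - 52 into a "flags" nibble is just a convenient way to pass the fields in.

```c
#include <stdint.h>

/* The three fields of a 16-bit selector, as listed in the table above. */
typedef struct {
    unsigned index;  /* bits 15-3: entry number in the descriptor table */
    unsigned ti;     /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;    /* bits 1-0: requestor privilege level */
} selector_fields;

selector_fields decode_selector(uint16_t sel)
{
    selector_fields f;
    f.index = (unsigned)sel >> 3;
    f.ti    = ((unsigned)sel >> 2) & 1u;
    f.rpl   = (unsigned)sel & 3u;
    return f;
}

/* Byte offset of the entry inside the table (Selector & 0xfff8). */
unsigned selector_offset(uint16_t sel)
{
    return (unsigned)sel & 0xfff8u;
}

/*
 * Pack a segment descriptor.
 *   base   : 32-bit start address of the segment
 *   limit  : 20-bit limit
 *   access : bits 47-40 (present, DPL, system, data/code, conforming,
 *            readable, accessed)
 *   flags  : bits 55-52 (granularity, default, 0, available)
 */
uint64_t make_descriptor(uint32_t base, uint32_t limit,
                         uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xffffu);             /* bits 15-0  */
    d |= (uint64_t)(base & 0xffffffu) << 16;      /* bits 39-16 */
    d |= (uint64_t)access << 40;                  /* bits 47-40 */
    d |= (uint64_t)((limit >> 16) & 0xfu) << 48;  /* bits 51-48 */
    d |= (uint64_t)(flags & 0xfu) << 52;          /* bits 55-52 */
    d |= (uint64_t)(base >> 24) << 56;            /* bits 63-56 */
    return d;
}
```

As a usage example, a flat ring 0 code segment covering 4GB (base 0, limit 0xfffff, page granular, 32bit, present, readable) would be `make_descriptor(0, 0xfffff, 0x9a, 0xc)`, and selector 10h decodes to entry number 2 of the GDT with privilege level 0.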

The readable bit is used to specify whether information can be read from the segment or whether it is execute only.

The next part of the second stage floppy boot loader contains the code which is used to enable the A20 address line. This address line allows you to access beyond the 1MB limit that was imposed on normal DOS real mode operation. For a discussion of this address line I recommend looking at the Intel architecture books.

Once A20 is enabled, the GDT that exists as data at the end of the assembly file is loaded into the GDT register. This must be done before the switch into protected mode. Otherwise any memory accesses will not have a valid selector described for them and will cause a fault (I learned this from experience). Once this is completed the move is made to protected mode by setting the protected mode bit in the CR0 register to 1.

Following the code which enables protected mode, there is data which represents a far call into the next portion of the second stage boot loader. This causes a new selector to be used for CS as opposed to an undefined one.

The code that is jumped into simply sets up the various selectors for the data segments. There is then some simple debugging code which prints to the screen. This was for my own use and can be removed.

The stack segment is then set up along with the stack pointer. I placed the stack at 90000h. Finally I push the value for the stack onto the stack (to be retrieved by the kernel) and then call linear address 10080h, which contains the first stage loader for the kernel.

----[ Chapter 2 - The first stage kernel boot loader

The first stage kernel boot loader is located in \boot. First some notes on what is happening with the first stage boot loader. The boot loader is compiled to ELF at a set TEXT address so that I can jump into the code and have it execute for me. In the makefile I specify the text address to be 10080. The first 80h bytes are used as the ELF header.
I completely ignore this information and jump directly into linear memory address 10080h. It is my understanding that newer versions of the ELF compiler have a slightly different header length and may cause this number to need to be modified. This can be determined by using a disassembler (e.g. DEBUG in DOS) to find where the text segment begins.

The two files of importance to the boot loader are main.c and mem.c. main.c contains the function `void _start(unsigned long blh);`. This function must be the first function linked in. So main.c must be the first file which is linked and _start() must be the first function in it. This guarantees that _start will be at 10080h. The parameter blh is the value which was pushed by the second stage boot loader. This originally had meaning, but no longer does.

The first thing that _start does is to call kinit_MemMgmt, which is the initialization routine for memory. The first thing that kinit_MemMgmt does is set nMemMax to 0xfffff. This is the maximum number of bytes on the system. This value is 1MB. kinit_MemMgmt then calls kmemcount, which attempts to calculate the amount of free memory on the system. Currently this routine does not work properly and assumes that there is 2MB of free memory on the system. This is sufficient for now but needs to be fixed in the future.

kinit_MemMgmt then calls kinit_page, which sets up the page tables for the kernel. Paging is the mechanism used to define what memory a task is able to access. This is done by creating a "virtual" memory space which the task accesses. Whenever an access to memory occurs the processor looks into the page tables to determine what "real" physical memory is pointed to by this memory location. For example, the kernel could designate that each task will get 32k (8 pages) of memory to use for the stack. Without using paged memory each of these memory locations would occur at a different address.
However, by using paging you can map each of these physical memory allocations to a paged address, which allows each of these allocations to appear to occur at the same location.

The page tables are broken up in the following manner. First is the page directory. It is composed of 1024 entries which have the following properties:

    31 - 12    Page Table Base Address
    11 - 9     Unused (Free bits)
    8          0
    7          Page Size Bit
    6          0
    5          Accessed Bit
    4          Page Cache Disable Bit
    3          Page Write Through Bit
    2          User/Supervisor Bit
    1          Read/Write Bit
    0          Page Present Bit

The Page Table Base Address holds the upper 20 bits of the physical address of the page table which contains information about this memory location. When a memory location is accessed the most significant 10 bits are used to reference one of the 1024 entries in the page directory. This entry will point to a page table which has a physical memory address equal to the Page Table Base Address. One of the 1024 entries in that table is then selected by bits 21 - 12 of the memory address.

The Page Size Bit tells whether each page is equal to (Bit = 0) 4kb or (Bit = 1) 4MB.

The accessed bit is used to show whether the page has ever been accessed. Once set to 1, the OS must reset it to 0. This is used for virtual paging.

The Page Cache Disable Bit and Page Write Through Bit are not currently used by me, so I will leave their definitions as an exercise to the reader (enjoy).

The User/Supervisor Bit specifies whether access to the page table is restricted to tasks with privilege level 0, 1, or 2, or is also open to level 3. If the bit is set to 0 then only tasks with level 0, 1, or 2 can access this page table. If the bit is set to 1, then tasks with level 0, 1, 2, or 3 can access this page table.

The Read/Write bit is used to specify whether a user level task can write to this page table. If it is set to 0 then it is read only to "User" tasks. If it is set to 1 then it is read/writable by all tasks.

Finally, the Present Bit is used to specify whether the page table is present in memory.
If this is set to 1 then it is.

Once the page directory is referenced, the offset into the page table is selected using the next 10 bits of the memory reference. Each page table has 1024 entries with each entry having the following structure:

    31 - 12    Page Base Address
    11 - 9     Unused (Free bits)
    8 - 7      0
    6          Dirty Bit
    5          Accessed Bit
    4          Page Cache Disable Bit
    3          Page Write Through Bit
    2          User/Supervisor Bit
    1          Read/Write Bit
    0          Page Present Bit

The Page Base Address holds the upper 20 bits of the physical memory address that the memory access points to. The lower 12 bits are taken from the original linear memory access.

The Dirty, Accessed, Page Cache Disable, and Page Write Through Bits are all used for virtual memory and other areas which I have not yet been concerned with. So they are relegated to the reader (for now).

The remaining three bits behave just as in the page directory except that they apply to the physical memory page as opposed to a page table.

All kernel pages are marked Supervisor, Read/Write, and Page Present. User pages are not marked Supervisor.

The code in kinit_page creates the page directory in the first of the three physical pages that it set aside. The next page is used to create a low (user) memory area of 4MB (one page table of 1024 entries points to 1024 4kb pages, thus 4MB). The third page is used to point to high (OS) memory. The kinit_page function maps all of the low paged memory directly onto physical memory. This means that there is a one to one correlation between paged memory and physical memory for the first 4MB. kinit_page then maps ten pages starting at 70000h physical to linear address 0x80000000. Entry number 0 of the page directory is then set to point to the low page table. Entry number 512 is set to point to the high page table.

Finally the kinit_page function places the address of the page directory into the cr3 register. This tells the processor where to look for the page tables.
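
The two-level lookup described above can be simulated in ordinary C, which is a handy way to check a mapping before trusting it to the processor. The sketch below is an illustration only and is not taken from kinit_page: every name in it is hypothetical, and the `tables` pointer array is a hosted stand-in for following the physical address stored in a page directory entry.

```c
#include <stdint.h>

#define PG_PRESENT 0x001u   /* bit 0: Page Present */
#define PG_WRITE   0x002u   /* bit 1: Read/Write */
#define PG_USER    0x004u   /* bit 2: User/Supervisor (1 = level 3 allowed) */

/* Split a linear address into its three parts. */
unsigned pd_index(uint32_t linear)  { return linear >> 22; }             /* top 10 bits  */
unsigned pt_index(uint32_t linear)  { return (linear >> 12) & 0x3ffu; }  /* next 10 bits */
unsigned pg_offset(uint32_t linear) { return linear & 0xfffu; }          /* low 12 bits  */

/* Build a directory or table entry from a 4k-aligned physical address. */
uint32_t make_entry(uint32_t phys, uint32_t flags)
{
    return (phys & 0xfffff000u) | (flags & 0xfffu);
}

/* Map `count` consecutive pages, starting at table slot `first`,
 * onto physical memory starting at `phys`. */
void map_range(uint32_t table[1024], unsigned first, unsigned count,
               uint32_t phys, uint32_t flags)
{
    for (unsigned i = 0; i < count && first + i < 1024; i++)
        table[first + i] = make_entry(phys + i * 4096u, flags);
}

typedef struct {
    uint32_t  dir[1024];     /* page directory entries */
    uint32_t *tables[1024];  /* stand-in: where each page table lives */
} address_space;

/* Software walk of the tables. Returns 0 and stores the physical
 * address in *phys, or -1 if the directory or table entry is not present. */
int translate(const address_space *as, uint32_t linear, uint32_t *phys)
{
    unsigned d = pd_index(linear);
    if (!(as->dir[d] & PG_PRESENT) || as->tables[d] == 0)
        return -1;
    uint32_t pte = as->tables[d][pt_index(linear)];
    if (!(pte & PG_PRESENT))
        return -1;
    *phys = (pte & 0xfffff000u) | pg_offset(linear);
    return 0;
}
```

With a low table filled by `map_range(low, 0, 1024, 0, ...)` in directory entry 0 and a high table filled by `map_range(high, 0, 10, 0x70000, ...)` in directory entry 512, linear address 0x80000080 translates to physical 70080h, and linear 20000h translates to itself.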

Finally, cr0 has its paging bit turned on, which informs the processor that memory accesses should go through the page table rather than just being direct physical memory accesses.

After this, control returns to the _start function. k_start() has been set to 0x80000080, which points to the _start() function in the main kernel. _start in the boot code calls this function, which starts the real kernel off.

----[ Chapter 3 - The Kernel

The kernel is where all of the fun begins. Unfortunately, this is the place that needs the most work. However, there is enough here to demonstrate the beginnings of what needs to be done to build a viable kernel for your own work.

The kernel boot loader created the kernel page table and then jumped into the kernel at _start(). _start() then sets up the console, clears it, and displays the message "Main kernel loaded.". Once this is done it runs the memory manager initialization routine 'kinit_page()'. The memory manager initialization routine begins by initializing a structure called the PMAT. The PMAT is a giant bit field (2048 bytes, so 16384 bits, enough to track 64MB of 4k pages), where each bit represents one page of physical memory. If a bit is set to 1, the corresponding page of memory is considered allocated. If the bit is set to 0 then it is considered unallocated. Once this array is initialized the memory management code sets aside the chunks of physical memory which are already in use. This includes the system BUS memory areas, as well as the location of the kernel itself in physical memory. Once this is completed the memory manager returns to the _start() function so that it can proceed with kernel initialization.

The _start() function then calls a temporary function which I am using for now to allocate the memory which is used by the user program loaded in by the first stage floppy loader. This will go away after I add the loading of processes off of disk during run time. This function sets aside the physical memory which is located at 20000h linear.
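
A bit field like the PMAT needs only a few operations: clear it, mark ranges as in use, and find a free page. The following is a minimal sketch of that idea and not the kernel's actual code. The function names, the first-fit search, and the choice of storing page N in bit (N % 8) of byte (N / 8) are all assumptions made for the illustration.

```c
#include <stdint.h>
#include <string.h>

#define PMAT_BYTES 2048u
#define PMAT_PAGES (PMAT_BYTES * 8u)  /* 16384 pages of 4k = 64MB */
#define PAGE_SIZE  4096u

static uint8_t pmat[PMAT_BYTES];      /* 1 bit per physical page */

/* Mark every page as unallocated. */
void pmat_init(void)
{
    memset(pmat, 0, sizeof pmat);
}

/* 1 if the page is allocated (or outside the table), 0 if free. */
int pmat_is_used(unsigned page)
{
    if (page >= PMAT_PAGES)
        return 1;
    return (pmat[page / 8] >> (page % 8)) & 1;
}

/* Mark `npages` pages starting at physical address `phys` as allocated. */
void pmat_reserve(uint32_t phys, unsigned npages)
{
    unsigned page = phys / PAGE_SIZE;
    for (unsigned i = 0; i < npages && page + i < PMAT_PAGES; i++)
        pmat[(page + i) / 8] |= (uint8_t)(1u << ((page + i) % 8));
}

/* Mark the page containing `phys` as free again. */
void pmat_free(uint32_t phys)
{
    unsigned page = phys / PAGE_SIZE;
    if (page < PMAT_PAGES)
        pmat[page / 8] &= (uint8_t)~(1u << (page % 8));
}

/* First-fit allocation: returns the physical address of a free page
 * after marking it allocated, or -1 if every page is in use. */
long pmat_alloc(void)
{
    for (unsigned page = 0; page < PMAT_PAGES; page++) {
        if (!pmat_is_used(page)) {
            pmat_reserve(page * PAGE_SIZE, 1);
            return (long)(page * PAGE_SIZE);
        }
    }
    return -1;
}
```

Setting aside the ten kernel pages at 70000h and the user program page at 20000h would then look like `pmat_reserve(0x70000, 10)` and `pmat_reserve(0x20000, 1)` right after `pmat_init()`.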

Now that the basic memory system is set up, the _start() function calls the kinit_task() function. kinit_task() sets up the kernel task so that it can run as a task rather than as the only process on the system.

kinit_task() is really a shell function which calls two other functions: kinit_gdt() and kinit_ktask(). kinit_gdt() initializes a new kernel GDT which is to be used by the kernel rather than the previous temporary one which was set up by the second stage floppy boot loader. Once the new location for the gdt is mapped into memory several selectors are added to it. Kernel Code and Data selectors are added. Also, User Code and Data selectors are added. Once these selectors are put into place, the new gdt is placed in the gdt register on the processor so that it can be used.

kinit_task() now calls the kinit_ktask() function. This function creates the task which the kernel code will be executed as. The first thing this function does is to clear out the kernel's task list. This list contains the tasks on the system. Next a 4k page is allocated for the kernel task segment. The current executing task is then set to the kernel task. Next the task segment is added to the GDT. This task segment has the following structure, and I fill it out for the kernel with the following values. In fact all tasks will start out with these settings.

    struct TSS {
        ushort link;       // set to 0
        ushort unused0;
        ulong esp0;        // set to the end of the task segment page
        ushort ss0;        // set to SEL_KDATA (Kernel Data segment)
        ushort unused1;
        ulong esp1;        // set to 0
        ushort ss1;        // set to 0
        ushort unused2;
        ulong esp2;        // set to 0
        ushort ss2;        // set to 0
        ushort unused3;
        ulong cr3;         // set to the physical address of this task's
                           // page tables
        ulong eip;         // set to the entry point to this task's code
        ulong eflags;      // set to 0x4202
        ulong eax, ecx, edx, ebx, esp, ebp, esi, edi;
                           // set to garbage values
        ushort es;         // set to SEL_KDATA (Kernel data segment)
        ushort unused4;
        ushort cs;         // set to SEL_KCODE (Kernel code segment)
        ushort unused5;
        ushort ss;         // set to SEL_KDATA
        ushort unused6;
        ushort ds;         // set to SEL_KDATA
        ushort unused7;
        ushort fs;         // set to SEL_KDATA
        ushort unused8;
        ushort gs;         // set to SEL_KDATA
        ushort unused9;
        ushort ldt;        // set to 0
        ushort unused10;
        ushort debugtrap;  // set to 0
        ushort iomapbase;  // set to 0
    };

Here ushort is a 16 bit and ulong a 32 bit unsigned type, which makes the structure 104 bytes long.

The link field is used by the processor when an interrupt is called. The processor stores there the selector of the task segment which was running prior to the interrupt. This is useful for determining access rights based on the calling process.

The espx and ssx parameters are used to store a pointer to a stack which will be used when a task with a lower privilege level tries to access a higher privilege level area.

The cr3 parameter is used to store the physical address of this task's page table. Whenever this task is switched to, the processor will load the value stored in cr3 into the cr3 register. This means that each task can have a unique set of page tables and mappings.

The eax, ebx, etc. registers are all set to a garbage value as they are uninitialized and will only gain values once they are used. When the processor switches to this task these parameters will be loaded into their respective processor registers.

The cs, es, ss, ds, fs, and gs parameters are all set to meaningful values which will be loaded into their respective processor registers when this task is switched to.

As I am not using a local descriptor table I set the ldt parameter to 0, along with the debugtrap and iomapbase parameters.

As I have mentioned, every time a task is switched to, the processor will load all of the parameters from the task segment into their respective registers. Likewise, when a task is switched out of, all of the registers will be stored in their respective parameters. This allows tasks to be suspended and to restart with the state they left off at. Switching tasks will be discussed later, when the point in the kernel where this takes place is reached.

Once this task state segment is created it is necessary to create an entry in the GDT which points to this task segment. The format of this 64 bit entry is as follows:

    63 - 56    Fourth Byte of Base Address
    55         Granularity Bit
    54 - 53    0
    52         Available for use (free bit)
    51 - 48    Upper Nibble of Size
    47         Present in Memory Bit
    46 - 45    Descriptor Privilege Level
    44         System Bit
    43         16/32 Bit
    42         0
    41         Busy Bit
    40         1
    39 - 32    Third Byte of Base Address
    31 - 24    Second Byte of Base Address
    23 - 16    First Byte of Base Address
    15 - 8     Second Byte of Segment Size
    7 - 0      First Byte of Segment Size

As you have probably noticed, this structure is very similar to the code segment descriptor. The differences are the 16/32 bit and the Busy Bit.

The 16/32 Bit specifies whether the task state segment is 16 bit or 32 bit. We will only be using the 32 Bit task segment (Bit = 1). The 16 bit task state segment was used for the 286 and was replaced by a 32 bit task state segment on the 386+ processors.

The busy bit specifies whether the task is currently busy.

Once the kernel task is allocated, a new kernel stack is allocated and made active. This allows the stack to be in a known and mapped in location which uses the memory manager of the kernel.
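
To make the bit layout of the task segment entry concrete, here is a small C sketch that packs one. It is an illustration under assumptions rather than the kernel's own routine: `make_tss_descriptor`, `tss_is_busy` and `descriptor_base` are hypothetical names, the granularity bit is left at 0 so the size is counted in bytes, and the size field is filled with the segment length minus one (0x67 for a 104 byte task segment), which is how the processor interprets it.

```c
#include <stdint.h>

#define TSS_TYPE_32BIT 0x9u          /* bits 43-40 = 1 0 0 1 (32 bit, not busy) */
#define TSS_BUSY_BIT   (1ull << 41)  /* bit 41: Busy Bit */

/* Pack a 32 bit task state segment descriptor.
 *   base  : linear address of the task segment
 *   limit : size of the task segment in bytes, minus one
 *   dpl   : descriptor privilege level (0-3)
 */
uint64_t make_tss_descriptor(uint32_t base, uint32_t limit, unsigned dpl)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xffffu);               /* bits 15-0  */
    d |= (uint64_t)(base & 0xffffffu) << 16;        /* bits 39-16 */
    d |= (uint64_t)TSS_TYPE_32BIT << 40;            /* bits 43-40; bit 44 stays 0 */
    d |= (uint64_t)(dpl & 3u) << 45;                /* bits 46-45 */
    d |= 1ull << 47;                                /* bit 47: present */
    d |= (uint64_t)((limit >> 16) & 0xfu) << 48;    /* bits 51-48 */
    d |= (uint64_t)(base >> 24) << 56;              /* bits 63-56 */
    return d;
}

/* Non-zero if the Busy Bit is set in the descriptor. */
int tss_is_busy(uint64_t d)
{
    return (d & TSS_BUSY_BIT) != 0;
}

/* Recover the 32-bit base address from a descriptor. */
uint32_t descriptor_base(uint64_t d)
{
    uint32_t low  = (uint32_t)((d >> 16) & 0xffffffu);
    uint32_t high = (uint32_t)(d >> 56) << 24;
    return low | high;
}
```

For example, `make_tss_descriptor(0x12340, 0x67, 0)` describes a task segment at 12340h that only privilege level 0 code may switch to directly.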

The user task is then created in a similar fashion to the kernel task. In this current implementation the user task is located at 0x20000. Its stack is located at 0x2107c. Currently, this user task operates with OS level privilege. I encountered some problems when changing its selectors to user entries in the GDT. As soon as I fix this problem I will post a fix on my web site.

After the user task is created it is added to the task queue to be switched to once the scheduler starts.

Now that the kernel task and a user task (though running with kernel privilege level) have been created, it is necessary to set up the interrupt tables. This is done by a call to the kinit_idt() function.

kinit_idt() starts by setting all of the interrupts to point to a null interrupt function. This means that for most interrupts a simple return occurs. However, interrupt handlers are installed for the timer as well as for one system call. Also, interrupts are set up to handle the various exceptions. Once this table is filled out the interrupt descriptor table (IDT) is loaded into the idt register. The interrupts are then enabled to allow them to be called.

The timer interrupt handler is a simple function which calls a task switch every time the hardware timer fires. When the system call (interrupt 22h) is invoked, the handler will print out on the console the string which is pointed to by the eax register. The exception handling routine will dump the task registers and then hang the system.

The jump.S file in JeffOS/kernel/ contains the assembly wrappers which are called when an interrupt occurs. These wrapper functions then call the C handler functions.

Now that the IDT is set up and interrupts are occurring, task switches can occur. These occur when the swtch() function is called in the task.c file. The swtch() function locates the next task in its queue and does a call to the selector address of the new task. This causes the processor to look up the selector and switch to the new task.
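
The flow just described (every vector starts out pointing at a null handler, the timer vector is then replaced, and each timer tick advances to the next task in the queue) can be modelled in plain C. This is a hosted sketch of the control flow only, not the code in task.c or jump.S: the names ending in `_sketch`, the timer vector being passed in as a parameter, the integer task ids, and the strict round-robin order are all assumptions. The real swtch() performs a far call through the new task's selector, which is reduced here to updating an index.

```c
#include <stddef.h>

#define NUM_VECTORS 256
#define MAX_TASKS   8

typedef void (*handler_fn)(void);

static handler_fn vectors[NUM_VECTORS];  /* stand-in for the IDT */
static int        task_ids[MAX_TASKS];   /* the task queue */
static size_t     task_count;
static size_t     current;               /* queue slot of the running task */

/* Default handler: do nothing and return. */
void null_handler(void) { }

/* Pick the next task in the queue, wrapping around at the end. */
void swtch_sketch(void)
{
    if (task_count > 0)
        current = (current + 1) % task_count;
}

/* Timer handler: switch tasks on every tick. */
void timer_handler(void) { swtch_sketch(); }

/* Point every vector at the null handler, then install the timer. */
void kinit_idt_sketch(int timer_vector)
{
    for (int i = 0; i < NUM_VECTORS; i++)
        vectors[i] = null_handler;
    if (timer_vector >= 0 && timer_vector < NUM_VECTORS)
        vectors[timer_vector] = timer_handler;
}

/* Append a task to the queue; returns 0, or -1 if the queue is full. */
int add_task(int id)
{
    if (task_count == MAX_TASKS)
        return -1;
    task_ids[task_count++] = id;
    return 0;
}

/* Id of the running task, or -1 if the queue is empty. */
int current_task(void)
{
    return task_count ? task_ids[current] : -1;
}

/* Simulate the processor dispatching an interrupt. */
void raise_interrupt(int vector)
{
    if (vector >= 0 && vector < NUM_VECTORS && vectors[vector] != NULL)
        vectors[vector]();
}
```

With two tasks queued, raising the timer vector alternates `current_task()` between them, while raising any other vector falls through to the null handler and leaves the running task unchanged.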

You now have a very simple multi-tasking kernel.

----[ Chapter 4 - User level libraries

The user level libraries are fairly simplistic. There are two files in this directory. The first is the crt0.c file. This file contains one function, which is the _start() function. This function makes a call to main, which will be defined in user code. This stub function must always be linked in first as it will be jumped into by the kernel to begin running the process.

The second file is the syscall.c file. This file contains one system call function, which is simply an interrupt 22h. This interrupt calls the console system call. eax is passed in as a pointer to a string which is printed to the system console.

Both of these source files are compiled to objects and are used during the linking phase of any user code.

----[ Chapter 5 - User code

The user code is stored in one file called test.c. This file is located in the /user/ directory. All this code does is call the console system call function provided by the library, wait a short amount of time, and call it again in a non-terminating loop (good thing, as I don't handle task termination yet).

The important thing to note is that when linking, this user process is set to have a text segment of 20000h linear. Also the crt0.o and syscall.o files are linked in as well. crt0.o is linked in first to ensure that its _start() function is at 20080h so it will be jumped into by the kernel. In truth, _start() is the real main as opposed to the main() everyone is used to dealing with. This code is the task which is created and run alongside the kernel, as described in chapter 3.

----[ Chapter 6 - Creating a disk image out of the binaries

Once you have compiled all of the binaries and placed them into the build directory you will need to create two more files before continuing. These files are called STUFF.BIN and STUFF2.BIN. These files are simply containers of empty space to cause alignment of other binaries.
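
Working out how large each padding file has to be is simple subtraction, and a small helper avoids doing it by hand after every rebuild. The sketch below is a hypothetical helper and not part of the JeffOS build: the names `pad_bytes` and `write_padding` are made up, and filling the padding with zero bytes is an assumption, since the loader only cares about the combined length.

```c
#include <stdio.h>

/* Bytes of padding needed so that a binary of `size` bytes fills a slot
 * of `slot` bytes exactly. Returns -1 if the binary does not fit. */
long pad_bytes(long size, long slot)
{
    if (size < 0 || size > slot)
        return -1;
    return slot - size;
}

/* Write `n` zero bytes to the file at `path`. Returns 0 on success,
 * -1 on any I/O error. */
int write_padding(const char *path, long n)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    for (long i = 0; i < n; i++) {
        if (fputc(0, f) == EOF) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

For the sizes the floppy loader expects in this chapter, a 700 byte USER.BIN in its 1024 byte slot would need `pad_bytes(700, 1024)` bytes in STUFF2.BIN, and a 3000 byte BOOT.BIN in its 3584 byte slot would need `pad_bytes(3000, 3584)` bytes in STUFF.BIN; the 700 and 3000 are made-up example sizes.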
The floppy loader expects the user program to be 1k in size. If the user program is not exactly this size, then STUFF2.BIN needs to be created with a size such that, when added to USER.BIN, the total is 1024 bytes. Also, the floppy boot loader expects the kernel boot loader to be 3.5k (3584 bytes) in size. STUFF.BIN needs to be made of such a length that, when added to the size of the BOOT.BIN (kernel boot loader) file, the total is 3584 bytes. In the future I will try to automate this process, but for now this is simply how it must be done.
Once this is complete the shell program 'go' must be run. This will place all of the binary files into one file called 'os.bin'. This file can then be written to disk by one of the following two methods. From linux you can run the following command, which places os.bin directly onto the floppy disk:
dd if=os.bin of=/dev/fd0
From DOS you can obtain the rawrite command, run it, and follow its directions.
----[ Conclusion
The kernel contained within is far from complete. However, it is a first step towards creating a real protected mode operating system. It is also enough to begin working with, or to refer to during your own work on a protected mode operating system.
Doing this work is both one of the most rewarding things you will ever do and one of the most frustrating. Many a night has been spent at the local tavern telling war stories about this stuff. But in the end, it has all been great fun. I wish you all the best of luck!
Jeff Thompson jwthomp@cu-online.com http://www.cu-online.com/~jwthomp/ <++> JeffOS.tgz.uue begin 600 JeffOS.tgz M'XL(`(-CQC0``^P\:W?:R)+Y"N?D/_3!<]=@$R(!Q@Y,<@\&G#CKUQKG9C(W M<WP$:D"QD%BU,'AF\]^WJKI;+^-'LH[WSAUW'JA?]>[J:G7!>SX:'?=?/ON1 MA=6-;<-@SQB6[*>JL$:]8<+?K;K)F&F86^8SMO5#J5)E+D(K8.Q9X/OA;>/N MZO^3EO=2_ZYOV3SX06:P0O]FO;Y"_U5CJV;6H==LU(U_,?W/!0_$8Q#TN"6M M?\&'OF<_M!E\N_YKIEE[TO]CE)7ZEQ\52TP?!(=I&(U;]%_;WM[*Z'^K:H#^ MC0?!?D?YB^N_S\==+H;LL-TY/7Z>S]D+5ERKL?91EQE[4$JZK9II&[`U4S=- M5$-=?6ZQXU.VUHB&]=^=LIWG^;7>X?/\\SQ(TH2^X].WS)A@PRZ(OHD/N:E_ MR0;+,MN>J(HU*8.MZ(I;9NOM=:@Y7@A!PB2:8\&<H5`56Y2A056XJCS/MZM& MV[8#+@3@0KMKL3X/&30SU<Y<Q^-LP@..XW%(YV`?X'3:!P>,>];`Y3!:]O3/ ML.?]X0GK^%[H>'.8$PU1"+`,+=>-Y_IFW//%^YV)B1^{body}lt;;?M>SP>H%@V;',2 M-_KS$);*I&RY#XMCE,5A2!P)L,WT-#Y<EJL&E$EJE)L8]F4Z4^A_VJS&S8Z' M.(&+N"GD(L3&Q"C7]V=`?A)T`A&RD<`4\!#4@6I[GM<*:=[/H'8S!A6;QEO7 M'U@NP_41.+/0#]@98B<+*;-PX@@V!??!!IPA.>SD=!_L/O392>"'[-"W(SLZ M>-L](WOJ-_\)C[^E\>!P/@RY37.4`4(OL(+V)X<>'O^#..VU?RFSSJDR0\`7 MMYKID3"HC!VR%4T5"VF"5F8N9S0:(`Z#6[2`%[F</QH)(&@&!$%[0S<;.R09 M;&[F8/G6`$1.TCX'#?/QE'NA7GX<Y4VRC%;C(+D:H\HH61FKRO-\!,,8[$CK M(B5>A9S-PH#]$_I^*[/JUJH.MLE,Z#2V;^BLWM99@TXRRA1SH36\T"RF.-S1 M'(H4A^B*7M&R8"]8/>)[5L8N::"YV5Q,9#4I,6.'A"97LY0%N$]C8!-)8#8N M]QC_[SEK;.Q0'1VS,8)">K)AS1JO7DF=,8,0:>]N@"40364YH0S83#AE&<L! MM`Q'V`X\7W3ZK`C*+K'.R8%QUWSC^OPNS`?5E[YC:A^G[MP]55.]DR#:;+!B M]1O0[B2PTMR[\&*Y1K+5Z=\UZ3J?5O=[)O7[J,Y#,3Z`.(V6<7>7&5WL;L-_ MZ]CL>.-UA$;6@ILK%G.O!_6^,_:L<!YPG/@18K!V>PO7S[,;XK^ARRVO,K`> MZ,![1_S'V%8M$__5MZO5I_CO,8K-7;91\0=?P-G`HXK\!XZ7JHNK*5G+4_EW M*RO7?\`M>\HKX?)A',`=Z]_<,O7Z-XWMADGG/_-I_3]*.<,HUG8"B#_]X(J! 
M^D/+\01$MYQ90O#IP,56"$M'$/]BJ[01#(S&T.CZL]D5&X#P\M*&*HSM>\P/ MX!%#882!,;,#`>65/Y<!LS]`)!)';:>A{body}lt;$,#R)A.P]]$)`'/EH@TP.#T(%] M*4$KT#/U;6=T1?U$HT*(L-C<`Q+R%D"$@#I&4<GG]T=$B@7[X5S`MLG:2(1P MIC/@-;R:<;9.9,6O0=8KC`2U<"`V@Z/BW"6Z+#9R7)['B`V"]Y$3B!!=)UM, MG.%$#E8"3<H3AL(@*[BJL+:0N`&TQ;K'?98?6`+Z(=(>!]94GC$6?G`A@7D< M^H!'?>``N#@)!/'1\6Q_(9!,+CB1):G-)^8,`G\^GH#X+Z5ND"8X\LZ7"{body}amp;* M:>`OH=D*,P("W>>'_G3FN%;H^*#>$4V^X(''79#HZ=SS<-Q@[K@V!@Z:^=E5 MPF9(-D@::E>BIPD)G2[P\)-'Y$/+8T-P1!"C6S!`7#!GBA8G65D$3AA".(P* MUT:(@Q*T1{body}amp;,UAJ$V@`+@^ME&%@>]^=""6H4^%,B)R:$+X=\%N:UT4O=@B5( M^BU8`$2^%8*B"4CES[D_KO3_D1H?!L>=\5\M^_ZOOEW;?O+_CU%HB2:69Z7R M&?Z2`3Q%?'^!DE[_Y.8>_!;PV^Y_P!>8U>WMI_N_1RFK]/_`Q_^[_7]]*Z/_ M.HQ_\O^/43+G_XWXZ"]#GJ>3_[]W6;7^HV#W@7#<OO[-:@,JF?6_77^*_QZE MR(N.,VO0#_T9'J?J*Z]H]0T27M7*VM'QB>YXQP/+M>/=O+O+V/I;#"&/^^:Z M'./M7H5<G/"@#P<L_1JZ2F^KJ1N;H;?CBC!0(`Q][^F=BDM;S6-ZKA%W[[7/ M^BR-WJCJJ4!PUPEZ7BCBJ;T([9D?6FZ?SGQ"=^_6=??@D-N.E8&\EZ49T"?) M>I7I/@OP!DEWFU7=_8Y;MDC`EK/C;L>VX8`;=7>IVU`7KFGB:]4;A@Q07]W` MN>0)!G3G*0>QHEK8BL[>,L2Y?6<<=U8CUH[[L<PT[68]!?BC'VB34*SI[G_X MKF=--59E+J?[G7>JKBQFSPK/\%U,:AC(VJS*,<_SN?9)YY<<0">YY5KR'8TC M7UYE+E4/R,'12ZPK?QZPXS[-[^KY!L[?9Q,+1"7FLYGKX&L3@%=A$BSWPN`* M;[/G>#*"J;T^3FVHJ>!%U*L.9$VA8<6&O`D\<#QN!26:MYM">2QO7`'!!P]6 MG03=_S4'W*)]4Z6*-7FAMQMS7(LXGOKV'%^JB)@]^68&W^N@%A7W-#_#<:A8 M0@:"D%FA?.?D^D/YIL<:A3R(!(JB/+`E(,F_J0#U069XT8\4N+XW1EE'+Z'P MC8T%YFPSUPH5'6DQ%,V4H%:)95>)A:[R=]-BZ<1BL6X0B\7,"X9.7+]<HVD) M:5!=,E555-&N:*-09*(#)8A8`8V\4XT=12^:)HRHXBF;E@SP6JW3[;,HJ7&2 M%970THV9&2+LMGJ%J?C)Z)@FI)5ZZ`<\V2N9VE9,=N\FO:M(KZH)&0+[^[_V ME.SI.59$"_ZR/EF2/Y+B"ZA5I?C@.NX<[+/[E!;K.H+R+?8],)I@/L,4`YTW M<8@O\VB]T04];%RO5+9`.@D"\R)>Q;Y'-_?[9:8S(Z*VDS)+P6B!)Z.%P.4J MDCZ/<!FWXLI1:6'F2.AX<B'U=0I!<D8O0<8O*I>CNU^&?]<A;W?``&^451\L M8LA78^FNP-('+/T,E@Y@V2+W>A.6#KXMPB%DO<K;]V1B"4#I[\:B>S^?4CSA M\47D36Y5S<F'_CO9?'W8#K]AT&GO;.]&>M,"<J9S]#ZX_(?<QLOP@(?SP&,C MB/_B3*+[0<,7P+!8F!/RJ6!?\%(#DSIP]7P;(%^_"9=V+-VGEE9DALCEV`=O 
M"H.C]"#,0;LGEC/DTXF6$1L@KA0&CCY`OTC'@"%KW-V#\K5P(HUD+WX-SXKM M9BEM<*A&K;7](QDQF36EUO<="637LF6$F</4,=RU]J&B6IM9DG9_D8E=R;;V M.YG?E6H#TM=KZQG<Y-N@EC,;T=,K6M,2;S/C'6#!'._M]7MG3.5A*'>&R8%0 M3N9A9V(%(FD`E$CT7W,'A(T&8SM!>%6I5-BI/P"+P9LKV@_EN#T\=;$9^$Z5 M%C1<@L1@+]")5%3MZJI%U5X_D^<89;QAH)%(<\,X0E719<-@5^<:(2][L%=' MB4]TOY<A9%<2,@2U8)6$]^5W=N1/?33'B,+=-(6[MU"XFZ9P]]X4PNX>7".P M<T\".VD".[<0V$D3V+DW@7O@C,-)EL+N/2GLIBGLWD)A-TUA]WX42HQ-F;^: MB_-7I>7CJ*;<O_:/FK0I=.$?;E6XX!"P3F$;+N6XCKXUIB`2+]]DK$-6/[R" MJ$E&GS3)SD[JZTD3.!')A2(=4,07'/5HRBD.<#QFJJU8#<$U3U&62MM#=[J. M8]?9WMP;TO:+CA66.'F<W!(J*+[!,IJCHB#PKX84PUN_*;V"]%*Y&1R/[:5Z M&%)V'JD2$TH-B;LSX<J%6QX#5TL$@H9WPX4O>3[RRW0CC/)F!"+MRV34X(B9 M"Z<)`-(C:!I4UL?E4@ZNMZ[HC;V;=J):\4A(,R%7X?R.8K;Y4%H*93-"&QJ. M$CA=FIKRAE0-)Z[=B.LV[*0+>2$MV>W"DTI(HP&?N$@E=W,AC<L;!AR#%9U= M"4%%,K;(P8F!QE<5(GVZDTFJF_CR@`E?S]ZL1BFG,LM4JC;$6!ACD$O+G=.- MN:+&\89LZ,9J`4A.3%'&CM?TN@6VP1PD_2-D&^_(+6ER'"Q778R'>.:'.1(\ MM'A,+!R\*U81D1*6]WND'QKY40*$@!^!(D`@"\-HF+0,%34JB13IL2=9-8"I MTS(R_J[@'_%%1ZU`33:"3Y,3KU'MKY!1"?A7'OB4$JYDX<VG`ZZ7'A*@QO4C MB$3`FIDQP-0*SK"3(#+]18"A5&,/;^R'4LS#>1"@CA)^14Q@/0/%^(V#Y7`R MILI@DK`A16,;JJ9.?=``8H[2<\7$C:!J'Z[,ZL2UACPE.9G2H/S"MTK06"U! M0YOI))=9,==)OZ^D$XL3/0'ES>,?';ZHJ.?@N(O!/#[J%'-T,NT#%:_]*MOZ M[XY/SVC!7X_"C-[D>K@&X?[VBCB,8*H$=0DS#J>Z<:8_Q,'74OR_.R!LK`H( MG^.7"++Q%6U;T;YKQ<%5W`%R'$G2?DR:[@\KJ][_/W#ZWSWR_XPX_Z]&^7\P MX>G]_V.4;\__(QOY*Z3_1==@3]E__[?LOU@T3\E__VIEE?]_X/2_>^3_)>]_ M&Y3_8=:?_/]C%%JD\0)]2O_[BQ6U_G