dthain / basekernel
A simple OS kernel for research, teaching, and fun.
License: GNU General Public License v2.0
Right now we just use preset VBE modes, but these have been deprecated. I am working on adding querying on a branch of my fork: https://github.com/JohnathonNow/basekernel/tree/feature/video-mode
@kevinwern with the latest merge including kevinfs, loading a program from the cdromfs results in a crash, please take a look:
I see work being done on querying the video mode before the kernel boots up. Could it be done such that switching video modes is possible from within the kernel? The only example of such code I've seen using VESA runs a virtual x86 mode thread to make the switch, but I don't know enough about VESA to be certain.
btw, thanks for your efforts. I'm seriously considering switching my project to use your core, but I really need a VESA method to switch from 320x640x256, 640x480, and text 25x80. (Mine just uses a VGA driver)
https://github.com/xlar54/emudore64
(Edit: I also have the beginnings of a FAT32 filesystem that may be useful to this project, unless filesystems are outside the scope and intent)
If we want to start working on support for the C standard library, we should add a few of these headers: stdarg.h, stddef.h, stdint.h, limits.h, stdnoreturn.h, stdalign.h, and stdbool.h.
Pick something and use it as a starting point for automated testing. Preferably something that's easily comparable, as things like graphics/console/disk would require code to be able to compare outputs. My thought is either memory allocation (perform some memory operations and see that memory is structured how we expect), data structures (list), or output formatting (although that would require some modification so we have an interim string before printf happens).
@kevinwern please add a page to the wiki describing the gdb-qemu technique that we discussed this week.
Burn a basekernel cdrom and try it on real hardware, see how far it gets, and report back on any problems.
(Access to a suitable VBE video mode in BIOS may be a problem.)
Currently, programs are loaded in the form of a "binary blob". We just take the raw assembly, dump it into user space, and jump to the first location.
Let's take the next step towards loading the a.out binary format (Linux variant). This is pretty straightforward: read the program header, allocate space for code, data, and bss, load the code and data into the appropriate place, and then jump to the proper program entry point.
(@JohnathonNow observed that the current binary blob doesn't even convert bss into data, so empty globals don't even appear in the program image!)
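The loading steps described above can be sketched as a layout calculation. This is only a sketch under stated assumptions, not the project's loader: the header fields follow the classic 32-bit a.out layout, but `struct aout_layout`, `aout_plan_layout`, and the `load_base` parameter are hypothetical names, and only the OMAGIC variant (text and data contiguous) is handled.

```c
#include <stdint.h>

/* Header of the classic 32-bit a.out format. */
struct aout_header {
    uint32_t a_info;    /* magic number in the low 16 bits */
    uint32_t a_text;    /* size of the text segment in bytes */
    uint32_t a_data;    /* size of initialized data in bytes */
    uint32_t a_bss;     /* size of zero-filled data in bytes */
    uint32_t a_syms;    /* size of the symbol table */
    uint32_t a_entry;   /* virtual address of the entry point */
    uint32_t a_trsize;  /* size of text relocations */
    uint32_t a_drsize;  /* size of data relocations */
};

#define AOUT_OMAGIC 0407u  /* text and data contiguous, not page aligned */

/* Hypothetical result type: where each segment lands in user memory. */
struct aout_layout {
    uint32_t text_start;
    uint32_t data_start;
    uint32_t bss_start;
    uint32_t end;
    uint32_t entry;
};

/* Returns 0 and fills *out on success, -1 if the header is unusable.
 * Assumption: the text segment is loaded at load_base. */
int aout_plan_layout(const struct aout_header *h, uint32_t load_base,
                     struct aout_layout *out)
{
    if ((h->a_info & 0xffffu) != AOUT_OMAGIC)
        return -1;

    /* Sum in 64 bits so an oversized header cannot wrap around. */
    uint64_t data_start = (uint64_t)load_base + h->a_text;
    uint64_t bss_start = data_start + h->a_data;
    uint64_t end = bss_start + h->a_bss;
    if (end > UINT32_MAX)
        return -1;

    /* The entry point must fall inside the text segment. */
    if (h->a_entry < load_base || h->a_entry >= data_start)
        return -1;

    out->text_start = load_base;
    out->data_start = (uint32_t)data_start;
    out->bss_start = (uint32_t)bss_start;
    out->end = (uint32_t)end;
    out->entry = h->a_entry;
    return 0;
}
```

A loader built on this would copy text and data to `text_start` and `data_start`, zero the range from `bss_start` to `end`, and then jump to `entry`.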
(Project notes have very limited text, so this seems like a better place to document a discussion.)
We need several kinds of abstractions related to I/O in the kernel that will sit on top of some of the existing code. For example, sys_run invokes cdromfs directly since that's the only filesystem we have. But we need abstractions that will hide and unify multiple implementations.
The ones that come to mind are:
Before writing a whole mess of code, try sketching this out in the form of a header file that defines the operations possible on each one, and let's argue about that for a while before digging into the implementation.
Add a gettimeofday system call which pulls the current time out of the rtc device and presents it to the user as seconds since the epoch.
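The conversion from calendar fields to seconds since the epoch could look roughly like the sketch below. It assumes the RTC values have already been decoded from BCD into a full year and are in UTC; `struct rtc_fields` and both function names are hypothetical, not the kernel's actual rtc interface. The day count uses the well-known "days from civil" calculation for the proleptic Gregorian calendar.

```c
#include <stdint.h>

/* Hypothetical snapshot of the RTC registers, already decoded from BCD. */
struct rtc_fields {
    int year;    /* full year, e.g. 2017 */
    int month;   /* 1..12 */
    int day;     /* 1..31 */
    int hour;    /* 0..23 */
    int minute;  /* 0..59 */
    int second;  /* 0..59 */
};

/* Days between 1970-01-01 and the given Gregorian date. */
static int64_t days_since_epoch(int year, int month, int day)
{
    /* Treat January and February as months 10 and 11 of the previous
     * year, so the leap day falls at the end of the shifted year. */
    int64_t y = year - (month <= 2);
    int64_t era = (y >= 0 ? y : y - 399) / 400;      /* 400-year cycle */
    int64_t yoe = y - era * 400;                     /* year of era */
    int64_t mp = month > 2 ? month - 3 : month + 9;  /* March = 0 */
    int64_t doy = (153 * mp + 2) / 5 + day - 1;      /* day of shifted year */
    int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Seconds since 1970-01-01 00:00:00 for the given RTC reading. */
int64_t rtc_to_epoch_seconds(const struct rtc_fields *t)
{
    return days_since_epoch(t->year, t->month, t->day) * 86400
         + (int64_t)t->hour * 3600
         + (int64_t)t->minute * 60
         + t->second;
}
```

The system call itself would then only need to read the rtc device, fill in the structure, and return this value to the caller.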
Add support for indirect blocks to kevinfs, so that it can access files larger than a few kilobytes. Once that's working, try double-indirect and triply-indirect up to a reasonable maximum file size.
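As a sketch of the lookup this implies, the function below maps a logical block number within a file to the chain of pointer slots that must be followed. The constants `NDIRECT` and `PTRS_PER_BLOCK`, the `struct block_path` type, and the function name are all assumptions chosen for illustration; kevinfs's real inode layout may differ.

```c
#include <stdint.h>

#define NDIRECT        6u     /* assumed number of direct pointers per inode */
#define PTRS_PER_BLOCK 1024u  /* assumed pointers per indirect block */

struct block_path {
    int depth;          /* 0 = direct, 1..3 = levels of indirection */
    uint32_t index[3];  /* slot to follow at each level, outermost first;
                           for depth 0, index[0] is the direct slot */
};

/* Fills *p for logical block n. Returns 0 on success, or -1 if n lies
 * beyond what triple indirection can address. */
int block_path_for(uint32_t n, struct block_path *p)
{
    uint64_t span = 1;  /* blocks reachable through one pointer at this depth */
    uint64_t rest = n;

    p->index[0] = p->index[1] = p->index[2] = 0;

    if (rest < NDIRECT) {
        p->depth = 0;
        p->index[0] = n;
        return 0;
    }
    rest -= NDIRECT;

    for (int depth = 1; depth <= 3; depth++) {
        span *= PTRS_PER_BLOCK;  /* capacity of a pointer tree of this depth */
        if (rest < span) {
            p->depth = depth;
            /* Peel off one index per level, outermost level first. */
            for (int level = 0; level < depth; level++) {
                span /= PTRS_PER_BLOCK;
                p->index[level] = (uint32_t)(rest / span);
                rest %= span;
            }
            return 0;
        }
        rest -= span;
    }
    return -1;
}
```

Starting with single indirection only and raising the depth limit later matches the incremental plan in the issue, since the same loop handles all three levels.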
Using make with a clean project (without test.exe prebuilt), the build fails:
genisoimage: No such file or directory. Invalid node - 'test.exe'.
Makefile:13: recipe for target 'basekernel.iso' failed
make: *** [basekernel.iso] Error 2
I think the target all should be:
all: test.exe basekernel.iso
Build a simple user-level shell that can take simple actions on the filesystem (mount, list, copy, etc) and stop and start child processes. Use this opportunity to make sure that all of the needed functionality is available via well-formed system calls. Once this is working, remove the kernel shell and replace it with an invocation of the user-level shell.
So mount 2 will not work, but hard coding 2 where it is needed will make it run correctly.
Not sure what the reason for that is.
Right now, a process is allowed to touch any page within its virtual address space. If the page does not have a frame allocated, the page fault handler allocates one and maps it in. That makes things easy for the user, but doesn't allow us to rationally limit the memory used by each process.
Implement a more controlled management scheme similar to that of Unix:
1 - Allow a process to request an increase/decrease in heap size relative to the original break point.
2 - Allow a process to automatically extend the stack by touching the page just below the current stack top.
There are some boundary cases to consider: what if the page allocation fails, or the heap runs into the stack?
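The two rules and the heap/stack collision case could be expressed as a pair of checks like the ones below. This is a sketch only: `struct vm_limits`, `heap_adjust`, and `stack_may_grow` are hypothetical names, and it covers the address bookkeeping rather than the actual frame allocation, which can still fail separately.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical per-process bookkeeping for the two growable regions. */
struct vm_limits {
    uint32_t heap_start;    /* original break point */
    uint32_t heap_break;    /* current end of the heap (exclusive) */
    uint32_t stack_bottom;  /* lowest page-aligned address mapped for the stack */
};

/* Rule 1: move the break by delta bytes relative to its current value.
 * Returns 0 on success, -1 if the heap would shrink below its start
 * or grow into the stack. */
int heap_adjust(struct vm_limits *v, int32_t delta)
{
    int64_t new_break = (int64_t)v->heap_break + delta;

    if (new_break < (int64_t)v->heap_start)
        return -1;
    if (new_break > (int64_t)v->stack_bottom)
        return -1;

    v->heap_break = (uint32_t)new_break;
    return 0;
}

/* Rule 2: a fault may extend the stack only if it touches the page
 * just below the current stack and that page is clear of the heap.
 * Returns 1 if growth is allowed, 0 otherwise. */
int stack_may_grow(const struct vm_limits *v, uint32_t fault_addr)
{
    if (v->stack_bottom < PAGE_SIZE)
        return 0;

    uint32_t next_page = v->stack_bottom - PAGE_SIZE;
    if ((fault_addr & ~(PAGE_SIZE - 1u)) != next_page)
        return 0;

    return next_page >= v->heap_break;
}
```

With checks like these, the page fault handler would refuse any fault that satisfies neither rule instead of allocating a frame unconditionally.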
Add a system call that causes the calling process to block for a specified amount of time using the kernel clock module. Update the sample user level program to use it.
When the exception handler calls process_dump, it is likely to cause the console to overflow and clear, which makes most of the dump invisible.
I would suggest adding "\f" to the start of the console_printf calls on line 44 and line 54 of interrupt.c. This would allow the whole error message and process dump to be visible to the user.
Maybe not such high priority, but, for instance, it's hard to create a real-time clock with 0 padding because format modifiers such as %02d are not available.
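One way to support a width and pad character is to format the digits first and then emit the padding, which is what a %02d conversion would do internally. The helper below is a hypothetical standalone sketch, not the kernel's printf code; its name and signature are assumptions.

```c
#include <stddef.h>

/* Writes value into out as decimal text, left-padded to at least
 * width characters with pad. Returns the number of characters written
 * (excluding the terminator), or 0 if the buffer is too small. */
size_t format_int_padded(char *out, size_t size, int value, int width, char pad)
{
    char digits[12];
    size_t n = 0;

    /* Work on the magnitude as unsigned so INT_MIN is handled safely. */
    unsigned int mag = value < 0 ? 0u - (unsigned int)value
                                 : (unsigned int)value;
    do {
        digits[n++] = (char)('0' + mag % 10);  /* least significant first */
        mag /= 10;
    } while (mag != 0);

    size_t len = n + (value < 0 ? 1 : 0);
    size_t padding = 0;
    if (width > 0 && (size_t)width > len)
        padding = (size_t)width - len;
    if (len + padding + 1 > size)
        return 0;

    size_t pos = 0;
    /* With zero padding the sign goes before the zeros: "-05". */
    if (value < 0 && pad == '0')
        out[pos++] = '-';
    for (size_t i = 0; i < padding; i++)
        out[pos++] = pad;
    if (value < 0 && pad != '0')
        out[pos++] = '-';
    while (n > 0)
        out[pos++] = digits[--n];
    out[pos] = '\0';
    return pos;
}
```

A clock display could then call it three times with width 2 and pad '0' for the hours, minutes, and seconds fields.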
In the code where we determine the highest place value for large ints, we have:
while((i/f)>0) {
    f*=10;
}
f=f/10;
Problem is, this means f must go one place value higher than the actual number. Meaning that 2*10^9 will require f to be 10^10, which would overflow for 32-bit ints and render the remaining calculations incorrect.
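A possible fix is to stop the loop one step earlier, so that f never grows past the number itself. The sketch below assumes the value is non-negative and uses an unsigned 32-bit type; the function name is hypothetical.

```c
#include <stdint.h>

/* Returns the largest power of ten that is <= i (or 1 when i is 0).
 * The loop only multiplies f while i / f >= 10, which guarantees
 * f * 10 <= i, so f can never overflow. */
uint32_t highest_place_value(uint32_t i)
{
    uint32_t f = 1;

    while (i / f >= 10)
        f *= 10;

    return f;
}
```

For 2*10^9 this stops at f = 10^9, which fits in 32 bits, and no trailing division by 10 is needed.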
Very minor: graphics_line requires w and h have the same sign, but it might be useful to allow for opposite signs, for things like drawing a polygon as a list of line segments.
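For reference, the integer Bresenham algorithm handles any combination of signs by stepping each axis in its own direction. The sketch below is not the existing graphics_line; it collects points into an array so the behavior is easy to inspect, and `struct point`, `line_points`, and the choice to include both endpoints are assumptions.

```c
#include <stdlib.h>

struct point {
    int x;
    int y;
};

/* Traces a line from (x0, y0) to (x0 + w, y0 + h), where w and h may
 * have any signs. Stores up to max points in out, both endpoints
 * included. Returns the number of points, or -1 if out is too small. */
int line_points(int x0, int y0, int w, int h, struct point *out, int max)
{
    int dx = abs(w);
    int dy = -abs(h);
    int sx = w >= 0 ? 1 : -1;  /* step direction along x */
    int sy = h >= 0 ? 1 : -1;  /* step direction along y */
    int x1 = x0 + w;
    int y1 = y0 + h;
    int err = dx + dy;         /* combined error term for both axes */
    int n = 0;

    for (;;) {
        if (n == max)
            return -1;
        out[n].x = x0;
        out[n].y = y0;
        n++;

        if (x0 == x1 && y0 == y1)
            break;

        int e2 = 2 * err;
        if (e2 >= dy) {        /* error allows a step in x */
            err += dy;
            x0 += sx;
        }
        if (e2 <= dx) {        /* error allows a step in y */
            err += dx;
            y0 += sy;
        }
    }
    return n;
}
```

In a graphics routine the store into `out` would be replaced by a pixel write, and a polygon becomes a loop over consecutive vertex pairs.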
You should not use the system compiler for operating system development. Use a cross-compiler instead. You should not pass -m32 and -march to your compiler. Your system gcc is likely to come from the package management of your distro, so you have no way of knowing which release will break the build. Additionally, your system compiler targets a hosted Linux environment, which is not a freestanding environment. Lastly, compiling the kernel on a different architecture will break everything (try compiling this on a Raspberry Pi!). System compilers are too different and will break the code way too often, likely even in silent ways you won't notice right now but will some time down the road in the form of mysterious bugs.
We tend to favor looking up dirents/inodes repeatedly in cases where we need the properties of a directory or file multiple times, so create a caching layer to reduce our overhead from these lookups.
Next step: build up the user visible process management API. We are going to do things a bit differently from Unix, in order to have better control over resource management. Here is the idea:
process_run( path, args ) - Creates a new child process running the indicated executable and arguments, returns the process id.
process_exit( status ) - Terminates the current process and all of its children, indicating an integer status to the parent.
process_kill( pid ) - Terminates the indicated process and all of its children, indicating that the process tree was killed to the parent.
process_wait( &info, timeout ) - The caller waits to see if any of its children have terminated. If one has, then return the relevant information (pid, status, etc) into the info structure. This only works for the immediate parent of a process on its children.
process_reap( id ) - Remove the termination record for process id
When issuing an ata/atapi read/write, the ata driver should generally call process_wait() to wait for an interrupt to arrive. However, on "fast" virtual hardware, the interrupt is often delivered even before process_wait() has been called. Hence, the current code has these waits commented out, and just does a busy-wait on drive status.
Fix the race condition by blocking interrupts in the critical section, so that we can properly handle "slow" real hardware without busy waiting.
See PR #72 for an instance of the problem and PR #82 for the temporary workaround.
Using GCC 6.3.0, running make yields:
cc -Wall -c -ffreestanding -m32 -march=i386 bootblock.S -o bootblock.o
ld -m elf_i386 -Ttext 0 -s --oformat binary bootblock.o -o bootblock
cc -Wall -c -ffreestanding -m32 -march=i386 kernelcore.S -o kernelcore.o
cc -Wall -c -ffreestanding -m32 -march=i386 main.c -o main.o
cc -Wall -c -ffreestanding -m32 -march=i386 console.c -o console.o
cc -Wall -c -ffreestanding -m32 -march=i386 memory.c -o memory.o
cc -Wall -c -ffreestanding -m32 -march=i386 keyboard.c -o keyboard.o
cc -Wall -c -ffreestanding -m32 -march=i386 mouse.c -o mouse.o
cc -Wall -c -ffreestanding -m32 -march=i386 clock.c -o clock.o
cc -Wall -c -ffreestanding -m32 -march=i386 interrupt.c -o interrupt.o
cc -Wall -c -ffreestanding -m32 -march=i386 kmalloc.c -o kmalloc.o
cc -Wall -c -ffreestanding -m32 -march=i386 pic.c -o pic.o
cc -Wall -c -ffreestanding -m32 -march=i386 ata.c -o ata.o
cc -Wall -c -ffreestanding -m32 -march=i386 cdromfs.c -o cdromfs.o
cdromfs.c: In function 'cdrom_dirent_readdir':
cdromfs.c:95:5: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types]
d = ((char*)d)+d->descriptor_length;
^
cc -Wall -c -ffreestanding -m32 -march=i386 string.c -o string.o
cc -Wall -c -ffreestanding -m32 -march=i386 bitmap.c -o bitmap.o
cc -Wall -c -ffreestanding -m32 -march=i386 graphics.c -o graphics.o
cc -Wall -c -ffreestanding -m32 -march=i386 font.c -o font.o
cc -Wall -c -ffreestanding -m32 -march=i386 syscall.S -o syscall.o
cc -Wall -c -ffreestanding -m32 -march=i386 syscall_handler.c -o syscall_handler.o
cc -Wall -c -ffreestanding -m32 -march=i386 process.c -o process.o
cc -Wall -c -ffreestanding -m32 -march=i386 mutex.c -o mutex.o
cc -Wall -c -ffreestanding -m32 -march=i386 list.c -o list.o
cc -Wall -c -ffreestanding -m32 -march=i386 pagetable.c -o pagetable.o
cc -Wall -c -ffreestanding -m32 -march=i386 rtc.c -o rtc.o
ld -m elf_i386 -Ttext 0x10000 -s --oformat binary kernelcore.o main.o console.o memory.o keyboard.o mouse.o clock.o interrupt.o kmalloc.o pic.o ata.o cdromfs.o string.o bitmap.o graphics.o font.o syscall.o syscall_handler.o process.o mutex.o list.o pagetable.o rtc.o -o kernel
main.o: In function `kernel_main':
main.c:(.text+0xe): undefined reference to `_GLOBAL_OFFSET_TABLE_'
console.o: In function `console_reset':
console.c:(.text+0xc): undefined reference to `_GLOBAL_OFFSET_TABLE_'
console.o: In function `console_writechar':
console.c:(.text+0xf0): undefined reference to `_GLOBAL_OFFSET_TABLE_'
console.o: In function `console_heartbeat':
console.c:(.text+0x13c): undefined reference to `_GLOBAL_OFFSET_TABLE_'
console.o: In function `console_putchar':
console.c:(.text+0x1ce): undefined reference to `_GLOBAL_OFFSET_TABLE_'
console.o:console.c:(.text+0x2fa): more undefined references to `_GLOBAL_OFFSET_TABLE_' follow
Makefile:20: recipe for target 'kernel' failed
make: *** [kernel] Error 1
This does not happen with either GCC 4.7 or GCC 5.4.
See here.
Basically, I created an inode-like filesystem and I think it would be worthwhile to use this as a starting point. I've been using the function calls directly to test it, but I think that if we can get something simple that "just works" out of this, that'd be great.
We can discuss this on Thursday, but my goals would probably be something along the lines of:
There are definitely tons of things I haven't done, like enforcing allowed characters, implementing multi-level path strings, and reserving a boot section for booting from the same disk. Then, there are things that can be simplified: inodes/data could be a union instead of being a buffer, and "commit types" have an unnecessary level of distinction/verbosity.
Maybe we can call this a meta-issue? I'm not sure if Git Issues has that concept.
Add a new command mount to the kernel shell which takes a device number, attempts to load a filesystem, and then stores the root dirent as a global variable root_directory. Then, you can make list and run relative to the current working directory and mount whichever unit you want.
And the reason why isn't readily apparent...
We need a syscall for updating a window. I am working on it here: https://github.com/JohnathonNow/basekernel/tree/feature/windows
but there are a few questions that need to be answered:
When I run the build script for the cross compiler on an installation of Arch Linux it fails:
g++ -no-pie -g -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -DHAVE_CONFIG_H -static-libstdc++ -static-libgcc -o lto1 \
lto/lto-lang.o lto/lto.o lto/lto-object.o attribs.o lto/lto-partition.o lto/lto-symtab.o libbackend.a main.o libcommon-target.a libcommon.a ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a -lmpc -lmpfr -lgmp -rdynamic -ldl -L./../zlib -lz libcommon.a ../libcpp/libcpp.a ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a ../libdecnumber/libdecnumber.a
collect2: error: ld returned 1 exit status
make: *** [../../gcc-7.2.0/gcc/lto/Make-lang.in:81: lto1] Error 1
I am running the 4.9.69-1-lts kernel with gcc version 7.2.1 20171128.
Any advice?
Once that works we want to collapse the window table into the file/kobject descriptor table.
In some areas (e.g. kshell.c), we have a mix of tab and 4-space indentations. We should probably do a quick search/replace to get everything back to only tab characters.
Which means that newlines are treated like single spaces extending to the end of the line.
Once we have a basic libc and processes that can display their own pid, go into process.c and turn on allow_preempt which should enable periodic preemption via clock_interrupt(). Run two or more processes from main that display something different, and let's see if preemption works!
Easy one: help should show a list of commands in the kernel shell.
Currently, the CDROM filesystem reports back all files in uppercase, which is quaint but not really necessary. Fix this as follows:
This will be handy down the road when we want the cdrom filesystem to be swappable for other more capable filesystems...
We are going to need some simple user-space libraries in order to build slightly less trivial user-level programs. The kernel module string.c is a good starting point, since that is fairly pure code, excepting the use of console_write in printf. Adapt that module so it can be used in a user-level program, so that printf invokes debug. Then, you can use printf from user space.
@JohnathonNow also a good one for you to work on, after adding the system calls.
If I load the basekernel.img file as a floppy, if an IDE disk controller is present but there is no disk, in both qemu and virtualbox it hangs forever. Disabling the unused IDE controller gets around this.
Here is a small project to gain familiarity with all of the various layers of the system:
Add system calls to return the current process number (getpid) and the parent process (getppid):
Modify process.[ch] to generate a unique incrementing id each time a process is created.
Modify syscall_handler.[ch] to handle the new system calls.
Modify syscalls.c to implement the user side of the system calls.
Modify test.c so the user level program can invoke them and print the results.
@JohnathonNow this would be a good one for you to work on.
ATAPI (CDROM) devices are not detected when running on VirtualBox.
(They are detected with Bochs)
If the first command I enter is mount 2 cdrom, the mount succeeds. Strangely, if I do anything before running that command, it hangs when I do. I traced the problem to this statement:
return f->mount(device_no);
I think it's supposed to call this function, cdrom_volume_open, but a debug print statement at the beginning of it is never reached.
cdromfs_readdir handles dot and dot-dot, but cdromfs_lookup does not.
@TBurchfield will fix it.
Run these commands to reproduce with a freshly-cloned copy:
make
qemu-system-i386 -fda basekernel.img
Am I doing something wrong (if so, we should adjust README.md) or is this a bug in the OS?
I noticed when working on setting up passing argv and argc to processes that I could not write to any address in the stack before 0x0FF0 bytes down. That seemed strange, until I realized that the stack starts 0xF from the end of memory, meaning that I couldn't write to the page assigned to the end of memory at all. I eventually thought it might be related to how pages are assigned, so I did some math. For the top of the stack, 0xFFFFFFF0, we have 0xFFFFFFF0 - 0x00001000 passed in as the argument to pagetable_alloc here.
0xFFFFFFF0 - 0x00001000 = 0xFFFFEFF0
0xFFFFEFF0 & 0xFFFFF000 = 0xFFFFE000, which is the page before the one we wanted. To test, I added a debug message to the page fault handler, and sure enough, as soon as a process starts it page faults and gets the top page allocated for it.
This isn't intended, right? Should the line in process.c read
pagetable_alloc(p->pagetable,PROCESS_STACK_INIT-stack_size + 0x10,stack_size,PAGE_FLAG_USER|PAGE_FLAG_READWRITE);
instead?
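The arithmetic behind this can be checked with a small model. It assumes that pagetable_alloc rounds the start address down to a page boundary and maps length / PAGE_SIZE pages from there; that behavior, along with the names `page_base` and `alloc_covers`, is an assumption made for illustration rather than a description of the real function.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u
#define PAGE_MASK 0xFFFFF000u

/* Base address of the page containing addr. */
uint32_t page_base(uint32_t addr)
{
    return addr & PAGE_MASK;
}

/* Model of an allocation that starts at the page containing start and
 * spans length / PAGE_SIZE whole pages. Returns 1 if addr falls inside
 * the mapped range, 0 otherwise. 64-bit math avoids wrapping at 4 GB. */
int alloc_covers(uint32_t start, uint32_t length, uint32_t addr)
{
    uint64_t first = page_base(start);
    uint64_t last = first + (uint64_t)(length / PAGE_SIZE) * PAGE_SIZE;

    return addr >= first && addr < last;
}
```

Under this model, a one-page allocation starting at 0xFFFFFFF0 - 0x1000 maps only the page at 0xFFFFE000, so the first push below 0xFFFFFFF0 faults, while adding 0x10 to the start address maps the page at 0xFFFFF000 that the stack actually uses.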