💾 Archived View for zaibatsu.circumlunar.space › ~krixano › phlog › 111018_edimcoder_updates.txt captured on 2020-09-24 at 01:35:32.
-=-=-=-=-=-=-
------------------------------------------------------------
More On EdimCoder - Including Updates
------------------------------------------------------------

EdimCoder is my line editor program that I wrote in C. It is based on ideas from Ed, Vim, and 4Coder. In this post, I will cover some of the technical things about EdimCoder, some experiments that I did to see what its memory usage looked like, how it compares to other text editors that are similar to it, and recent updates that I have made.

How Text Is Stored

The first thing is how the editor stores the text in memory while the file is open. Various text editors do this in different ways. Vi and Vim use a dynamic array of dynamic arrays of characters (or dynamic strings). Emacs uses gap buffers. Microsoft Word uses piece tables. And other text editors use ropes, or just strings. Below I will provide a brief overview of each one.

The simplest is a dynamic array of characters (aka dynamic string). This is the simplest but also the worst. It's the worst because adding a character anywhere in the middle or at the beginning requires all of the characters that come after it to be moved down to make room for the new character. And if not enough memory was previously allocated, then you'd need to allocate a whole new chunk for all of the characters plus that one added character, then move the memory over.

So, there's a much better way of doing this - the Gap Buffer. With a dynamic array, there is often unused memory at the end of the buffer to facilitate appending. When that memory is used up, another chunk of memory is allocated for all of the current content (to be moved over) plus another big chunk at the end. This is so you don't have to reallocate memory every time you append something. However, whenever you want to add something to the beginning or the middle, you still need to move the elements that come after it down.
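
To make the cost concrete, here is a minimal sketch of inserting one character into a single dynamic array. The DynString type and dynstring_insert function are hypothetical names made up for illustration; they are not taken from EdimCoder's source. The memmove call is the part that gets slower the closer the insertion point is to the start of the text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string (illustrative only, not EdimCoder's type). */
typedef struct {
    char  *data;  /* heap buffer, not NUL-terminated */
    size_t len;   /* characters in use */
    size_t cap;   /* characters allocated */
} DynString;

/* Insert c at index pos (0 <= pos <= len).
 * Returns 0 on success, -1 if memory could not be allocated. */
static int dynstring_insert(DynString *s, size_t pos, char c)
{
    assert(pos <= s->len);
    if (s->len == s->cap) {
        /* No spare room left: grow geometrically so appends stay cheap. */
        size_t new_cap = s->cap ? s->cap * 2 : 16;
        char *p = realloc(s->data, new_cap);
        if (p == NULL)
            return -1;
        s->data = p;
        s->cap = new_cap;
    }
    /* Shift everything at and after pos one slot down: O(len - pos). */
    memmove(s->data + pos + 1, s->data + pos, s->len - pos);
    s->data[pos] = c;
    s->len++;
    return 0;
}
```

Appending (pos equal to len) moves nothing, while inserting at index 0 moves the entire contents. That difference is what the structures described next try to avoid.
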
So, to solve this, you move the chunk of unused allocated memory to where the cursor is. Then, when the cursor moves, you move the chunk of memory along with it. More correctly, what you would do is have two dynamic arrays, one before the cursor and one after the chunk of unused memory. When you move the cursor back, you move the last character of the dynamic array before the cursor to the front of the dynamic array after that chunk of unused allocated memory (the "gap"). When you move the cursor forward, you move the first character of the second dynamic array to the end of the first. Whenever you start typing, you just append to the end of the first dynamic array. When the "gap" is filled, you must reallocate the memory, leaving room for a new gap. This is how Emacs works.

But there is another idea. This idea is what Vi, Vim, and EdimCoder use. The idea is that you have a dynamic array that contains pointers to dynamic arrays for each line. With this, adding a new line will require moving the lines after it down in memory. However, there would be significantly fewer lines than there would be characters. Additionally, adding text to a line, whether at the beginning or in the middle, would require moving the characters after it down in memory. However, like with the lines, the characters on a line would be far fewer than the characters in the whole file. Since EdimCoder is line based, I chose to do it this way (and because it's simpler than gap buffers).

Piece tables store the original file in an immutable buffer and other immutable blocks for each addition. Then, the resulting file is stored in memory as segments of the original file's buffer and the other buffers of the additions.
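
A minimal sketch of the segment idea follows. It makes several simplifying assumptions that are not from any particular editor: all additions go into one append-only buffer (rather than a separate block per addition), the table sizes are fixed, and only insertion is shown. PieceTable, pt_init, pt_insert, and pt_read are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_PIECES = 64, ADD_CAP = 1024 };  /* fixed sizes keep the sketch short */

typedef struct {
    int    from_add;  /* 0: text is in the original buffer, 1: in the add buffer */
    size_t start;     /* offset into that buffer */
    size_t len;       /* number of characters */
} Piece;

typedef struct {
    const char *orig;          /* original file contents, never modified */
    char        add[ADD_CAP];  /* append-only buffer holding every insertion */
    size_t      add_len;
    Piece       pieces[MAX_PIECES];
    size_t      npieces;
} PieceTable;

static void pt_init(PieceTable *pt, const char *orig)
{
    size_t n = strlen(orig);
    pt->orig = orig;
    pt->add_len = 0;
    pt->npieces = 0;
    if (n > 0) {
        pt->pieces[0] = (Piece){0, 0, n};
        pt->npieces = 1;
    }
}

/* Insert text at document offset pos.
 * Returns 0 on success, -1 if the tables are full or pos is past the end. */
static int pt_insert(PieceTable *pt, size_t pos, const char *text)
{
    size_t n = strlen(text);
    size_t i = 0, off = pos;
    if (n == 0)
        return 0;
    if (pt->add_len + n > ADD_CAP || pt->npieces + 2 > MAX_PIECES)
        return -1;

    /* Find piece i and the offset inside it where pos falls. */
    while (i < pt->npieces && off > pt->pieces[i].len) {
        off -= pt->pieces[i].len;
        i++;
    }
    if (i == pt->npieces && off > 0)
        return -1;  /* pos is past the end of the document */

    /* The new text is only ever appended; existing bytes never move. */
    Piece added = {1, pt->add_len, n};
    memcpy(pt->add + pt->add_len, text, n);
    pt->add_len += n;

    if (i < pt->npieces && off > 0 && off < pt->pieces[i].len) {
        /* pos is inside piece i: split it into left | added | right. */
        Piece left = pt->pieces[i], right = pt->pieces[i];
        left.len = off;
        right.start += off;
        right.len -= off;
        memmove(&pt->pieces[i + 3], &pt->pieces[i + 1],
                (pt->npieces - i - 1) * sizeof(Piece));
        pt->pieces[i] = left;
        pt->pieces[i + 1] = added;
        pt->pieces[i + 2] = right;
        pt->npieces += 2;
    } else {
        /* pos is on a piece boundary: slot the new piece in between. */
        size_t at = (off == 0) ? i : i + 1;
        memmove(&pt->pieces[at + 1], &pt->pieces[at],
                (pt->npieces - at) * sizeof(Piece));
        pt->pieces[at] = added;
        pt->npieces += 1;
    }
    return 0;
}

/* Copy the current document into out (NUL-terminated); returns its length. */
static size_t pt_read(const PieceTable *pt, char *out, size_t out_cap)
{
    size_t w = 0;
    for (size_t i = 0; i < pt->npieces; i++) {
        const Piece *p = &pt->pieces[i];
        const char *src = (p->from_add ? pt->add : pt->orig) + p->start;
        if (w + p->len + 1 > out_cap)
            break;  /* output buffer too small: stop early */
        memcpy(out + w, src, p->len);
        w += p->len;
    }
    if (out_cap > 0)
        out[w] = '\0';
    return w;
}
```

Note that an insertion only shuffles small Piece records. The text itself is written once and never moved, which is what keeps old content available afterwards.
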
Piece tables allow undos to be easily implemented, because every addition remains in memory even after deletion (deletion just consists of shortening a segment of the resulting file or splitting one segment up into two pieces with the deleted content not included in the new segments). Piece tables are used by Word, VSCode, and Abiword.

Ropes use binary trees. I am very unfamiliar with ropes and binary trees, so I won't be able to effectively detail how they work. Text editors that use ropes include GtkTextBuffer, Xi, and early versions of 4Coder.

EdimCoder's Custom Alternative to readline (and linenoise)

EdimCoder uses its own code to read input from the keyboard and display it on the screen. It is essentially a replacement for readline/linenoise. The primary reason why I did this was to be able to get tabs to show up as 4 spaces. However, it will also enable me to add custom keyboard shortcuts as well as ghost text and a few other things.

The way this works is by using VT100 escape codes and '\b'. Printing '\b' to the screen backs the cursor up one character. Then, I can basically "erase" that character by printing a space. This is really everything that is used. I am able to get the key codes typed in, detect the key, add it to a dynamic array used for the current input, then print it to the screen. If tab was pressed, I add '\t' and print 4 spaces. If backspace was pressed, I delete a character from the dynamic array (moving characters after it if needed) and print '\b '. I delete a tab by doing the same thing, except backing up 4 spaces.

Moving the cursor is a little more complicated, but not much. When moving the cursor left, I just print '\b'. When moving the cursor right, I just print the same character that was already there. When deleting a character, I do '\b' then print all of the characters that came after so that they are moved down. Then I print a space for that last character that was moved down.
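
The deletion redraw (step back, reprint the tail, blank the stale last column, then walk the cursor back into place) could be sketched like this. It is an illustration under stated assumptions, not EdimCoder's actual code: delete_before_cursor is a hypothetical helper, every character is assumed to be one column wide (so no tabs), and the bytes are written into a buffer rather than straight to the terminal so the sequence is easy to inspect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Delete the character before *cursor from line[0..*len) and build the
 * terminal byte sequence that redraws the rest of the line.
 * Hypothetical helper; assumes one screen column per character.
 * Returns the number of bytes placed in out (NUL-terminated), or 0 if
 * nothing was deleted (cursor at column 0, or out is too small).
 */
static size_t delete_before_cursor(char *line, size_t *len, size_t *cursor,
                                   char *out, size_t out_cap)
{
    assert(*cursor <= *len);
    if (*cursor == 0)
        return 0;

    size_t tail = *len - *cursor;   /* characters after the cursor */
    size_t need = 2 * tail + 3;     /* '\b' + tail + ' ' + (tail + 1) '\b' */
    if (need + 1 > out_cap)
        return 0;

    /* Update the in-memory line: slide the tail over the deleted char. */
    memmove(line + *cursor - 1, line + *cursor, tail);
    (*len)--;
    (*cursor)--;

    size_t w = 0;
    out[w++] = '\b';                        /* step back onto the deleted char */
    memcpy(out + w, line + *cursor, tail);  /* reprint the tail one column left */
    w += tail;
    out[w++] = ' ';                         /* blank the stale last column */
    memset(out + w, '\b', tail + 1);        /* walk back to the cursor position */
    w += tail + 1;
    out[w] = '\0';
    return w;
}
```

In a real program the returned bytes would then be sent to the terminal, for example with fwrite(out, 1, n, stdout) followed by fflush(stdout). Deleting at the very end of the line (an empty tail) reduces to the three bytes '\b', space, '\b'.
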
Finally, I back up the cursor to where it should be by printing as many '\b' as needed. There might be a more efficient way to do this, but this is how EdimCoder currently works.

Memory Usage of EdimCoder

I tested a 50KiB file in EdimCoder. The memory usage was 64KiB, 4KiB more than Ed. I tested the same file in Vim, where it was around 4MB, and in Nano, where it was around 2MB. So EdimCoder's memory usage is currently comparable to Ed. However, if I end up adding more language-aware features (like outlines for various languages) or an undo system, then this memory usage would grow a little.

Recent Updates to EdimCoder

I have started working on EdimCoder again. I have fixed typing on Linux and some other bugs. The last time I worked on EdimCoder (around April, I think) I was working on implementing a new way to input commands. Because of this, there are a few small bugs to work out. But I believe I got most of them fixed, and EdimCoder's command input should be decent now (although a little awkward due to the experiments I was doing previously).

I have not tested it on Windows again. I do remember that it wasn't working too well on Windows when I tried it semi-recently. However, I don't know what the problem could be, because the exact same version was working on Windows a while ago. So I'll need to investigate this soon.

The GitHub page for EdimCoder is linked below. It should be relatively simple to compile the program. It uses C99. All you should need is a C compiler (or Visual Studio or clang on Windows).

https://github.com/krixano/Edim

Final Note: This whole post was written using EdimCoder, ssh'd into circumlunar.space!