💾 Archived View for zaibatsu.circumlunar.space › ~krixano › phlog › 111018_edimcoder_updates.txt captured on 2020-09-24 at 01:35:32.

------------------------------------------------------------
 More On EdimCoder - Including Updates
------------------------------------------------------------

EdimCoder is my line editor program that I wrote in C. It
is based on ideas from Ed, Vim, and 4Coder. In this post,
I will cover some of the technical things about EdimCoder,
some experiments that I did to see what its memory usage
looked like, how it compares to other text editors that
are similar to it, and recent updates that I have made.


How Text Is Stored

The first thing is how the editor stores the text in
memory as the file is open. Various text editors do this in
different ways. Vi and Vim use a dynamic array of dynamic
arrays of characters (or dynamic strings). Emacs uses gap
buffers. Microsoft Word uses Piece Tables. And other text
editors use ropes, or just strings. Below I will provide
a brief overview of each one.

The simplest is a dynamic array of characters (aka dynamic
string). It is also the worst, because adding a character
anywhere in the beginning or middle requires all of the
characters that come after it to be moved down to make
room for the new character. And if not enough memory was
previously allocated, then you'd need to allocate a whole
new chunk for all of the characters plus the added one,
then copy everything over. There's a much better way of
doing this - the Gap Buffer.
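
To make the cost of the plain dynamic string concrete, here
is a minimal sketch in C. The DynStr type and dynstr_insert
function are hypothetical names made up for illustration,
not EdimCoder's actual code. The memmove call is the "move
everything after it down" step, and the realloc branch is
the "allocate a whole new chunk" step.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string, for illustration only. */
typedef struct {
    char  *data;
    size_t len;  /* characters in use */
    size_t cap;  /* characters allocated */
} DynStr;

/* Insert c at index pos (0 <= pos <= len).
 * Returns 0 on success, -1 if allocation fails. */
int dynstr_insert(DynStr *s, size_t pos, char c)
{
    assert(pos <= s->len);
    if (s->len == s->cap) {
        /* Out of room: grow (doubling is an assumed policy). */
        size_t new_cap = s->cap ? s->cap * 2 : 16;
        char *p = realloc(s->data, new_cap);
        if (!p) return -1;
        s->data = p;
        s->cap = new_cap;
    }
    /* Shift every character from pos onward right by one. */
    memmove(s->data + pos + 1, s->data + pos, s->len - pos);
    s->data[pos] = c;
    s->len++;
    return 0;
}
```

Starting from a zeroed DynStr, inserting at index 0 moves
the whole string every time, while inserting at index len
moves nothing.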

With a dynamic array, there is often unused memory at the
end of the buffer to facilitate appending. When that memory
is used up, another chunk of memory is allocated for all
of the current content (to be moved over) plus another big
chunk at the end. This is so you don't have to reallocate
memory every time you append something. However, whenever
you want to add something to the beginning or the middle,
you still need to move the elements that come after down. So,
to solve this, you move the chunk of unused allocated
memory to where the cursor is. Then when the cursor moves
you move the chunk of memory along with it. More correctly,
what you would do is have two dynamic arrays, one before
the cursor and one after the chunk of unused memory. When
you move the cursor back, you are moving the last character
of the dynamic array before the cursor to the front of the
dynamic array after that chunk of unused allocated memory
(the "gap").  When you move the cursor forward, you move
the first character of the second dynamic array to the
end of the first.

Whenever you start typing, you just append to the end
of the first dynamic array. When the "gap" is filled,
you must reallocate the memory, leaving room for a new
gap. This is how Emacs works.
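
The gap buffer described above can be sketched in C roughly
as follows. This is a simplified illustration, not how
Emacs implements it: the GapBuf type and the gb_* function
names are hypothetical, and doubling the buffer when the
gap fills up is an assumed growth policy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Text lives in buf[0..gap_start) and buf[gap_end..cap).
 * The bytes in between are the unused "gap" at the cursor. */
typedef struct {
    char  *buf;
    size_t gap_start;  /* cursor position */
    size_t gap_end;    /* first byte after the gap */
    size_t cap;
} GapBuf;

int gb_init(GapBuf *g, size_t cap)
{
    assert(cap > 0);
    g->buf = malloc(cap);
    if (!g->buf) return -1;
    g->gap_start = 0;
    g->gap_end = cap;
    g->cap = cap;
    return 0;
}

/* Cursor left: the last character before the gap moves to
 * just after the gap. */
void gb_left(GapBuf *g)
{
    if (g->gap_start == 0) return;
    g->gap_start--;
    g->gap_end--;
    g->buf[g->gap_end] = g->buf[g->gap_start];
}

/* Cursor right: the first character after the gap moves to
 * just before it. */
void gb_right(GapBuf *g)
{
    if (g->gap_end == g->cap) return;
    g->buf[g->gap_start] = g->buf[g->gap_end];
    g->gap_start++;
    g->gap_end++;
}

/* Typing a character just fills one byte of the gap. */
int gb_insert(GapBuf *g, char c)
{
    if (g->gap_start == g->gap_end) {
        /* Gap is full: reallocate and open a new gap by
         * sliding the text after the cursor to the end. */
        size_t tail = g->cap - g->gap_end;
        size_t new_cap = g->cap * 2;
        char *p = realloc(g->buf, new_cap);
        if (!p) return -1;
        memmove(p + new_cap - tail, p + g->gap_end, tail);
        g->buf = p;
        g->gap_end = new_cap - tail;
        g->cap = new_cap;
    }
    g->buf[g->gap_start++] = c;
    return 0;
}

size_t gb_length(const GapBuf *g)
{
    return g->gap_start + (g->cap - g->gap_end);
}

/* Copy the text, without the gap, into out as a C string.
 * out must hold at least gb_length(g) + 1 bytes. */
void gb_text(const GapBuf *g, char *out)
{
    size_t tail = g->cap - g->gap_end;
    memcpy(out, g->buf, g->gap_start);
    memcpy(out + g->gap_start, g->buf + g->gap_end, tail);
    out[g->gap_start + tail] = '\0';
}
```

Moving the cursor costs one byte copy per step, and typing
at the cursor costs nothing extra until the gap runs out.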

But there is another idea. This idea is what Vi, Vim,
and EdimCoder use. The idea is that you have a dynamic
array that contains pointers to dynamic arrays for each
line. With this, adding a new line requires moving the
lines after it down in memory. However, there are far
fewer lines than there are characters. Additionally,
adding text to a line, whether at the beginning or in the
middle, requires moving the characters after it down in
memory. However, as with the lines, a single line holds
far fewer characters than the whole file. Since EdimCoder
is line based, I chose to do it this way (and because
it's simpler than gap buffers).
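
As a rough sketch of the array-of-lines idea in C (the
LineBuf type and the lb_* names are hypothetical, not taken
from EdimCoder's source): inserting a line shifts only the
pointers for the lines after it, never the text of those
lines.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer: a growable array of pointers, each
 * to a separately allocated, NUL-terminated line. */
typedef struct {
    char **lines;
    size_t count;
    size_t cap;
} LineBuf;

/* Insert a copy of text as line index at (0 <= at <= count).
 * Returns 0 on success, -1 if allocation fails. */
int lb_insert_line(LineBuf *b, size_t at, const char *text)
{
    assert(at <= b->count);
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);
    if (!copy) return -1;
    memcpy(copy, text, n);

    if (b->count == b->cap) {
        size_t new_cap = b->cap ? b->cap * 2 : 8;
        char **p = realloc(b->lines, new_cap * sizeof *p);
        if (!p) { free(copy); return -1; }
        b->lines = p;
        b->cap = new_cap;
    }
    /* Only one pointer per following line is moved. */
    memmove(b->lines + at + 1, b->lines + at,
            (b->count - at) * sizeof *b->lines);
    b->lines[at] = copy;
    b->count++;
    return 0;
}

void lb_free(LineBuf *b)
{
    for (size_t i = 0; i < b->count; i++)
        free(b->lines[i]);
    free(b->lines);
    b->lines = NULL;
    b->count = b->cap = 0;
}
```

Editing within a line would then be an ordinary dynamic
string insert on that one line's array.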

Piece tables store the original file in an immutable
buffer and other immutable blocks for each addition.
Then, the resulting file is stored in memory as segments
of the original file's buffer and the other buffers of
the additions. Piece tables allow undos to be easily
implemented, because every addition remains in memory even
after deletion (deletion just consists of shortening a
segment of the resulting file or splitting one segment up
into two pieces with the deleted content not included in
the new segments). Piece tables are used by Word, VS Code,
and AbiWord.
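
Here is a small sketch of the insertion half of a piece
table in C. Everything in it is an assumption made for
illustration: the Piece and PieceTable types, the pt_*
names, the fixed-size arrays, and the use of one shared
append-only "add" buffer for all additions. Deletion and
undo are left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { PT_MAX_PIECES = 64, PT_ADD_CAP = 1024 };

typedef struct {
    int    from_add;  /* 0: original buffer, 1: add buffer */
    size_t start;     /* offset into that buffer */
    size_t len;
} Piece;

typedef struct {
    const char *original;         /* never modified */
    char        add[PT_ADD_CAP];  /* append-only */
    size_t      add_len;
    Piece       pieces[PT_MAX_PIECES];
    size_t      count;
} PieceTable;

void pt_init(PieceTable *t, const char *original)
{
    assert(original != NULL);
    t->original = original;
    t->add_len = 0;
    t->count = 0;
    size_t n = strlen(original);
    if (n > 0) {
        t->pieces[0] = (Piece){0, 0, n};
        t->count = 1;
    }
}

/* Insert text at document offset pos. Returns 0 on success,
 * -1 if pos is past the end or the fixed arrays are full. */
int pt_insert(PieceTable *t, size_t pos, const char *text)
{
    size_t n = strlen(text);
    if (n == 0) return 0;
    if (t->add_len + n > PT_ADD_CAP) return -1;
    if (t->count + 2 > PT_MAX_PIECES) return -1;

    /* Find the piece that contains pos. */
    size_t i = 0, off = 0;
    while (i < t->count && pos >= off + t->pieces[i].len) {
        off += t->pieces[i].len;
        i++;
    }
    if (i == t->count && pos != off) return -1;
    size_t local = pos - off;  /* 0 at a piece boundary */

    /* The new text is appended; nothing is overwritten. */
    Piece added = {1, t->add_len, n};
    memcpy(t->add + t->add_len, text, n);
    t->add_len += n;

    if (local == 0) {
        memmove(&t->pieces[i + 1], &t->pieces[i],
                (t->count - i) * sizeof(Piece));
        t->pieces[i] = added;
        t->count += 1;
    } else {
        /* Split piece i into left and right around pos. */
        Piece left = t->pieces[i], right = t->pieces[i];
        left.len = local;
        right.start += local;
        right.len -= local;
        memmove(&t->pieces[i + 3], &t->pieces[i + 1],
                (t->count - i - 1) * sizeof(Piece));
        t->pieces[i] = left;
        t->pieces[i + 1] = added;
        t->pieces[i + 2] = right;
        t->count += 2;
    }
    return 0;
}

/* Assemble the current document into out as a C string
 * and return its length. out must be large enough. */
size_t pt_text(const PieceTable *t, char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < t->count; i++) {
        const Piece *p = &t->pieces[i];
        const char *src = p->from_add ? t->add : t->original;
        memcpy(out + n, src + p->start, p->len);
        n += p->len;
    }
    out[n] = '\0';
    return n;
}
```

Because the original buffer and the add buffer are never
changed, restoring an earlier list of pieces is enough to
get an earlier version of the text back.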

Ropes use binary trees. I am very unfamiliar with
ropes and binary trees, so I won't be able to effectively
detail how they work. Text editors that use ropes include
GtkTextBuffer, Xi, and early versions of 4Coder.


EdimCoder's Custom Alternative to readline (and linenoise)

EdimCoder uses its own code to read input from the
keyboard and display it on the screen. It is essentially
a replacement for readline/linenoise. The primary reason
why I did this was to be able to get tabs to show up as
4 spaces. However, it will also enable me to add custom
keyboard shortcuts as well as ghost text and a few other
things.

The way this works is by using VT100 escape codes and
'\b'. Printing '\b' to the screen backs the cursor up
one character. Then, I can basically "erase" that
character by printing a space. This is really
everything that is used. I am able to get the key codes
typed in, detect the key, add it to a dynamic array used
for the current input, then print it to the screen. If tab
was pressed, I add '\t' and print 4 spaces.  If backspace
was pressed, I delete a character from the dynamic array
(moving characters after it if needed) and print '\b '. I
delete a tab by doing the same thing, except just backing
up 4 spaces.
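
The echo rules above could look something like this in C.
This is a sketch under assumptions, not EdimCoder's actual
code: echo_typed and echo_erased are hypothetical names,
and they build the bytes into a buffer instead of printing
them. The erase sequence here is '\b', space, '\b' per
column, with the second '\b' added so that the cursor ends
up on the blanked cell. That extra '\b' is my assumption
about the intended effect.

```c
#include <assert.h>
#include <string.h>

enum { TAB_WIDTH = 4 };

/* Bytes to echo when c is typed: a tab shows as 4 spaces,
 * anything else as itself. Returns the byte count. */
size_t echo_typed(char c, char *out)
{
    assert(out != NULL);
    if (c == '\t') {
        memset(out, ' ', TAB_WIDTH);
        return TAB_WIDTH;
    }
    out[0] = c;
    return 1;
}

/* Bytes to erase c from the end of the line: back up,
 * blank the cell, back up again, once per column that c
 * occupied. out needs 3 * TAB_WIDTH bytes at most. */
size_t echo_erased(char c, char *out)
{
    assert(out != NULL);
    size_t cols = (c == '\t') ? TAB_WIDTH : 1;
    size_t n = 0;
    for (size_t i = 0; i < cols; i++) {
        memcpy(out + n, "\b \b", 3);
        n += 3;
    }
    return n;
}
```

The returned bytes would then be written to the terminal,
for example with fwrite followed by fflush(stdout).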

Moving the cursor is a little more complicated, but not
much. When moving the cursor left, I just print '\b'.  When
moving the cursor right, I just print the same character
that was already there. When deleting a character, I do
'\b' then print all of the characters that came after so
that they are moved down. Then I print a space for that
last character that was moved down. Finally, I back up the
cursor to where it should be by printing many '\b'. There
might be a more efficient way to do this, but this is how
EdimCoder currently works.
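
The delete-in-the-middle redraw can be sketched in C like
this. The function name delete_before_cursor and its
parameters are hypothetical, and the sketch assumes every
character takes one column (no tabs). It updates the input
array and builds the bytes to send to the terminal: one
'\b', the characters after the cursor, one space over the
leftover last character, then enough '\b' to return the
cursor to its place.

```c
#include <assert.h>
#include <string.h>

/* Delete the character before the cursor in line[0..*len)
 * and write the redraw bytes into out. Returns the number
 * of bytes written (0 if the cursor is at the start).
 * out must hold at least 2 * (*len - *cursor) + 3 bytes. */
size_t delete_before_cursor(char *line, size_t *len,
                            size_t *cursor, char *out)
{
    assert(*cursor <= *len);
    if (*cursor == 0) return 0;

    /* Close the hole in the input array. */
    size_t tail = *len - *cursor;
    memmove(line + *cursor - 1, line + *cursor, tail);
    (*len)--;
    (*cursor)--;

    size_t n = 0;
    out[n++] = '\b';                         /* step back */
    memcpy(out + n, line + *cursor, tail);   /* reprint tail */
    n += tail;
    out[n++] = ' ';                          /* blank last cell */
    memset(out + n, '\b', tail + 1);         /* restore cursor */
    n += tail + 1;
    return n;
}
```

With the cursor at the end of the line the tail is empty,
so the output collapses to '\b', space, '\b'.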


Memory Usage of EdimCoder

I tested a 50KiB file in EdimCoder. The memory usage was
64KiB, 4KiB more than Ed. I tested the same file in Vim,
where it was around 4MB, and in Nano, where it was around
2MB. So EdimCoder's memory usage is currently comparable
to Ed. However, if I end up adding more language-aware
features (like outlines for various languages) or an undo
system, then this memory usage would grow a little.


Recent Updates to EdimCoder

I have started working on EdimCoder again. I have fixed
typing on Linux and some other bugs. The last time I worked
on EdimCoder (around April, I think) I was working on
implementing a new way to input commands. So, because
of this, there are a few small bugs to work out. But, I
believe I got most of them fixed and EdimCoder's command
input should be decent now (although a little awkward due
to the experiments I was doing previously).

I have not tested it on Windows again. I do remember
that it wasn't working too well on Windows when I tried
it semi-recently. However, I don't know what the problem
could be because the exact same version was working on
Windows a while ago. So I'll need to investigate this soon.

The GitHub page for EdimCoder is here:
https://github.com/krixano/Edim

It should be relatively simple to compile the program. It
uses C99. All you should need is a C compiler (or Visual
Studio or clang on Windows).

Final Note: This whole post was written using EdimCoder,
ssh'd into circumlunar.space!