💾 Archived View for gemini.hitchhiker-linux.org › gemlog › a_fun_little_bug_hunt.gmi captured on 2023-05-24 at 18:06:06. Gemini links have been rewritten to link to archived content
⬅️ Previous capture (2023-01-29)
-=-=-=-=-=-=-
I've been getting occasional pull requests from a Russian gentleman on my Vapad text editor. Not only has he provided Russian translations for the application and kept them up to date, but he has been helping with features as well. This has been a nice change from my usual workflow, as most of my projects are fairly special insterest and don't often get contributions more than just the very occasional drive by. Alex (my contributor) laid the groundwork for moving Vapad to using Libadwaita originally, and added the initial support for in application messaging using the Adwaita Toast system. Most recently, he sent me a pull request which replaced a few instances of the GtkMessageDialog widget with it's Adwaita equivalent, which helps keep things a little bit more visually consistent.
Pulling the changes into my local copy for testing, the code immediately compiled without errors and ran as expected, with one small hiccup. Vapad knows whether the current document has been changed since it was last saved, or, in the instance of a new file, if it has had any typing occur since the new buffer was opened and not been saved. If you press the tab's close button, ask to close the tab via keyboard shortcut, or close the window it pops up a message dialog asking you if you would like to save the file. This is the before mentioned Adw.MessageDialog. Everything was still functioning after the change, but if the tab being closed was the last tab in a window then I was getting a sementation fault. I never had when using the Gtk.MessageDialog. Weird. I needed to investigate.
Looking at the docs, the Adwaita version of this widget automatically closes when it receives a response (when the user presses the Ok or Cancel button). The Gtk+ version of the dialog had to be closed manually. Since the calls to the dialog's `.close ()` method were still in place, I suspected use after free. I took the calls to `close ()` out, and everything still worked, including the segfault. Interesting.
Attempting to run Valgrind was not fruitful. Valgrind wanted debug symbols for Glibc. I didn't feel like trying to figure out how to get them on Arch. So I just started trying to reason about what the code was doing, because it's not overly complex.
It turns out, the new code was not the problem. It just revealed a bug that I had written myself but which was masked by the old message dialog's behavior. See, the FileSaver object for a GtkSourceBuffer saves the buffer contents asynchronously. The call to it's `begin` method takes an optional argument for a callback to be executed once the file save action has completed. In this case, the callback being run updates a boolean flag on Vapad's `Tab` object which indicates whether the file has any changes sinc ethe last save. When it ran on the last tab in a window, the window was already closed and the application terminated by the time the filesystem reported the file was saved. The reason it wasn't showing up as a bug before is that the message dialog (remember the message dialog?) had to be manually closed, which it was, after the call to the save method. And the message dialog was holding it's parent window open until it was itself closed, which ate up enough cycles that unless you had a really slow filesystem it was unlikely that you'd ever see the bug. That's the difference. The new dialog closed, then ran it's callback, where the old dialog ran it's callback, which closed the dialog as it's last instruction.
A classic race condition, basically.
The fix wasn't all that difficult. A new method was created for the `Tab` object, `save_file_on_close`, which does exactly what the `save_file` method does, with the exception that the `FileSaver` does not run a callback when it completes. This is the version that runs when closing a tab, if you answer affirmatively to save the file.
It got me thinking about memory access, and programming languages. Vapad is written in Vala, which compiles to C and is basically just a layer of indirection above C itself. I do a lot of programming in Rust and love it for a lot of reasons, but I've never really given a ton of thought to the memory safety aspects as there is so much more to recommend it as a language. I'm usually more excited about algabraic data types and generics, and the awesome compiler messages and documentation. But Rust's borrow checker would have caught this sort of bug before it ever had a chance to display a symptom. I've seen it do it. And while I still think Vala is a great language for the domain of Gtk+ application programming, I'm even more sold on Rust after this little bug hunt. I know for a fact that I coded a program with a memory access bug. I wonder how many more I have coded in the past without realizing it?
All content for this site is released under the CC BY-SA license.
© 2022 by JeanG3nie