Even though it isn't quite Saturday, I finished all the issues for Author Intrusion v0.10.0[1], so I finalized the milestone and decided to write a celebratory post to announce it.
1: https://gitlab.com/author-intrusion/author-intrusion-cil/milestones/1
This still isn't even remotely stable enough to use, but I think I need a cadence to keep working on it instead of letting it atrophy for a few months and then losing track of things when I come back. So, to keep it fresh in my mind, I'm going to try sticking to a two-week cadence where I do at least something on the project to keep it going, updated, and working. Over time (not unlike writing a novel), this should produce useful results for other people.
The `v0.10.0` release is mainly about making the project easier to develop. Most of the changes aren't very sexy: allowing packages to be force-installed for debugging, adding scripts, reducing noise.
One of the reasons this project is complicated is that I have to break apart English (a non-structured way of communicating) into discrete components. Looking at the first three words of a paragraph requires the system to know what a paragraph is.
In v0.9.0, I did a quick and dirty paragraph splitter. It failed on some of my bigger projects, so I rewrote it to be faster and use less memory (a `foreach` loop instead of a `Regex`).
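To illustrate the loop-over-regex idea, here is a minimal sketch in Python rather than the project's C#. It is not the actual Author Intrusion splitter: the function name `split_paragraphs` is hypothetical, and it assumes a paragraph is a run of non-blank lines separated by one or more blank lines.

```python
from typing import Iterable, Iterator, List


def split_paragraphs(lines: Iterable[str]) -> Iterator[str]:
    """Yield paragraphs in a single pass over the lines, without a regex.

    Assumption (not from the project): paragraphs are runs of non-blank
    lines separated by one or more blank lines.
    """
    current: List[str] = []
    for line in lines:
        stripped = line.strip()
        if stripped:
            # Still inside a paragraph, keep collecting its lines.
            current.append(stripped)
        elif current:
            # A blank line closes the paragraph collected so far.
            yield " ".join(current)
            current = []
    if current:
        # Flush the last paragraph when the input has no trailing blank line.
        yield " ".join(current)
```

Because this is a generator, handing it an open file reads one line at a time and only ever holds a single paragraph in memory, which is the kind of saving a whole-document regex pass cannot offer.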
I still have to do the same for the token splitter, and I realized that I also need to find a better way of handling large documents. C# doesn't like objects over 85 kB (they end up on the large object heap). My largest story (a single-file document) is 43 kw (kilowords) and 238 kB. Also, C# uses UTF-16, which means loading that entire thing into memory requires a bit over 480 kB of RAM. That will be a bigger mess, but the limit is low enough that it needs to be dealt with sooner rather than later.
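The arithmetic behind that can be sketched as follows, again in Python for illustration only. The helper names `utf16_size` and `chunk_text` are hypothetical, the 80,000-byte budget is an assumed safety margin below the 85,000-byte threshold (to leave room for object overhead), and chunking is just one possible way to stay under it, not necessarily what the project will do.

```python
from typing import List

LOH_THRESHOLD_BYTES = 85_000  # .NET large object heap threshold
CHUNK_BUDGET_BYTES = 80_000   # assumed headroom for object overhead


def utf16_size(text: str) -> int:
    """Bytes taken by the UTF-16 code units of text (no byte order mark)."""
    return len(text.encode("utf-16-le"))


def chunk_text(text: str, max_bytes: int = CHUNK_BUDGET_BYTES) -> List[str]:
    """Split text into pieces whose UTF-16 payload fits within max_bytes."""
    chunks: List[str] = []
    start = 0
    size = 0
    for index, char in enumerate(text):
        # Characters outside the Basic Multilingual Plane need a surrogate pair.
        char_size = 4 if ord(char) > 0xFFFF else 2
        if size + char_size > max_bytes:
            chunks.append(text[start:index])
            start, size = index, 0
        size += char_size
    if start < len(text):
        chunks.append(text[start:])
    return chunks


# A stand-in document of 238,000 single-byte characters.
story = "a" * 238_000
print(utf16_size(story))       # twice the character count for this input
print(len(chunk_text(story)))  # number of pieces under the assumed budget
```

A real splitter would want to break on paragraph or token boundaries rather than at an arbitrary character, but the size bookkeeping would look much the same.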
I added a `length()` function for the XSLT calls. That way, plugins like echo detection can ignore short words.
plugins:
  analysis:
    - compare: text()
      error: 5
      plugin: EchoDetection
      key: echo-1
      warning: 2
      within: 200
      select: //token[length() > 4]
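The post does not spell out what each setting means, so the following Python sketch is a guess at the behavior rather than the plugin's real logic. It assumes `within` is a window measured in tokens, `warning` and `error` are the number of occurrences inside that window that trigger each severity, and the `select` filter keeps only tokens longer than four characters. The function name `detect_echoes` and its return shape are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


def detect_echoes(
    tokens: List[str],
    within: int = 200,
    warning: int = 2,
    error: int = 5,
    min_length: int = 5,
) -> List[Tuple[int, str, int, str]]:
    """Return (index, token, count, severity) for repeated long tokens.

    Assumed semantics, not confirmed by the project: a token "echoes"
    when the same text already appeared within the previous `within`
    tokens, and the count of occurrences decides the severity.
    """
    findings: List[Tuple[int, str, int, str]] = []
    window: Deque[Tuple[int, str]] = deque()
    for index, token in enumerate(tokens):
        # Forget selected tokens that are now too far behind.
        while window and index - window[0][0] > within:
            window.popleft()
        if len(token) < min_length:
            # Mirrors the select filter: short words are ignored.
            continue
        count = 1 + sum(1 for _, seen in window if seen == token)
        if count >= error:
            findings.append((index, token, count, "error"))
        elif count >= warning:
            findings.append((index, token, count, "warning"))
        window.append((index, token))
    return findings
```

For example, `detect_echoes("the storm rolled in and the storm broke".split())` flags only the second "storm", since "the" is too short to be selected.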
One of the biggest things was reducing information overload by breaking apart the logging into different categories. Like MPlayer, there are a lot of things going on, so I added a switch.
./pcli analyze --log NuGet:verbose --verbose
The `--verbose` switch turns on what is logged to the console, while `--log NuGet:verbose` changes the NuGet management category from its default of warning to verbose to get the tedious details.
./pcli log-list
The `log-list` command lets you see the categories. Plugins can add additional logging targets, which is why it's a project-based command, but even without a project it should work (we'll find out).
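As a rough model of per-category log levels, here is a Python sketch built on the standard `logging` module. It only mimics the `Category:level` shape seen in the command above and is not how `pcli` is implemented. The function names are hypothetical, and only the `verbose` and `warning` level names come from this post; the others in the table are assumptions.

```python
import logging
from typing import Dict, Iterable, Tuple

# Only "verbose" and "warning" appear in the post; the rest are assumed.
LEVELS = {
    "verbose": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
}


def parse_log_spec(spec: str) -> Tuple[str, int]:
    """Split a 'Category:level' string into a logger name and a level."""
    category, separator, level = spec.partition(":")
    if not separator or not category:
        raise ValueError(f"expected Category:level, got {spec!r}")
    try:
        return category, LEVELS[level.lower()]
    except KeyError:
        raise ValueError(f"unknown log level {level!r}") from None


def configure_categories(
    categories: Iterable[str],
    overrides: Iterable[str],
    default: int = logging.WARNING,
) -> Dict[str, int]:
    """Give every category the default level, then apply the overrides."""
    levels = {name: default for name in categories}
    for spec in overrides:
        category, level = parse_log_spec(spec)
        levels[category] = level
    for name, level in levels.items():
        logging.getLogger(name).setLevel(level)
    return levels
```

Returning the resolved table also gives a listing command something to print: each known category alongside its effective level.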
The next sprint, starting next Sunday, will be focused on those memory management problems. I think it will take me a while to puzzle through them.
The sprint after that is currently slated for server mode. This is going to be used by the Language Server Protocol[2], which will let me hook up to Atom[3] and get real-time analysis, highlighting, and other fancy features.
Author Intrusion is currently being managed via its GitLab project[4]. I'm not sure if it would be worthwhile for anyone to consider joining, but if you want to watch it, this would be the place.
4: https://gitlab.com/author-intrusion/author-intrusion-cil
If you have questions, please don't hesitate to poke me on any social network I'm on[5]. I always love to bounce ideas around or talk about future plans. The more I do, the more I can make it useful for everyone, not just myself.
https://d.moonfire.us/blog/2018/07/11/author-intrusion-0.10.0/