💾 Archived View for d.moonfire.us › blog › 2015 › 07 › 29 › node-author-intrusion-0.1.0 captured on 2022-01-08 at 13:40:18. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Author Intrusion 0.1.0

Over the last few weeks, I made some minor improvements to Author Intrusion[1]. Since I need to actually use it to write stories and novels, I figured I'd get to a stopping point and update the code.

1: /tags/author-intrusion/

Part of Speech

One of the biggest features I needed was part-of-speech (POS) tagging. This is in the node-author-intrusion-pos-tagger[2] module. It takes the output from the splitter and annotates each token with its part of speech, such as “present tense verb” or “adverb”. This lets me identify overuse of adverbs within a short distance (much like before).

2: https://github.com/author-intrusion/node-author-intrusion-pos-tagger
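
To make the adverb check concrete, here is a minimal Python sketch of the idea. It is not the plugin's actual code: the function name `adverb_clusters`, the `(word, tag)` token shape, and the convention that adverb tags start with "RB" (as in the Penn Treebank tag set) are all assumptions made for illustration.

```python
def adverb_clusters(tagged, window=10, max_adverbs=2):
    """Return (first, last) token index pairs where adverbs pile up.

    tagged: list of (word, tag) pairs. Tags starting with "RB" are
    treated as adverbs (an assumption borrowed from Penn Treebank tags).
    A pair is reported when more than max_adverbs adverbs fall within
    a span shorter than `window` tokens.
    """
    positions = [i for i, (_, tag) in enumerate(tagged) if tag.startswith("RB")]
    clusters = []
    for n in range(len(positions) - max_adverbs):
        first = positions[n]
        last = positions[n + max_adverbs]
        if last - first < window:
            clusters.append((first, last))
    return clusters


# Hypothetical tagged tokens, standing in for the tagger's output.
sample = [
    ("She", "PRP"), ("quickly", "RB"), ("and", "CC"), ("quietly", "RB"),
    ("ran", "VBD"), ("very", "RB"), ("far", "RB"),
]
print(adverb_clusters(sample))
```

Tightening `window` or raising `max_adverbs` makes the check less sensitive, which is the kind of knob a per-project configuration would likely want to expose.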

Reworking Echo

I also significantly reworked the echo[3] plugin to work with the POS tagging and to check for echo words against the stem, so that “spit” and “spitting” can be treated as the same word.

3: https://github.com/author-intrusion/node-author-intrusion-pos-tagger

This was a breaking change, sadly, but hopefully not too many people are using it. I added some unit tests to this plugin (and a number of other plugins) to help explore the different functionality.
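
As a rough illustration of stem-based echo detection, the Python sketch below flags words whose stems repeat within a given token distance. Everything in it is hypothetical: `naive_stem` is a deliberately crude stand-in for a real stemmer, and `find_echoes` and its `distance` parameter are invented for this example rather than taken from the plugin.

```python
def naive_stem(word):
    """Crude suffix stripper, a stand-in for a real stemming algorithm."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Undo consonant doubling, e.g. "spitt" -> "spit".
    if len(word) >= 3 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word


def find_echoes(words, distance=20):
    """Return (previous_index, index, stem) for stems repeated within `distance` tokens."""
    last_seen = {}
    echoes = []
    for i, word in enumerate(words):
        stem = naive_stem(word)
        if stem in last_seen and i - last_seen[stem] <= distance:
            echoes.append((last_seen[stem], i, stem))
        last_seen[stem] = i
    return echoes


words = "He spit on the ground and kept spitting".split()
print(find_echoes(words))
```

A practical version would probably skip common words such as "the" and "and" before comparing stems, since those repeat constantly without reading as echoes.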

Documentation

I've updated the documentation[4] and the script to check everything out.

4: https://github.com/author-intrusion/author-intrusion-docs

Forums

I also created a category[5] on my forum, if anyone wants to talk about it. It also includes a sub-category for recipes for those who want to talk about how to do some of the analysis.

5: http://discuss.moonfire.us/c/author-intrusion

Next Steps

The next biggest step is to get the echo plugin to handle ngrams[6]. Right now, it works on a token-by-token basis, but I also want it to be able to identify three- or four-word segments and treat them as higher priority duplicates. For example, I want to see whether paragraphs start the same way or whether sentences share the same beginning.

6: https://en.wikipedia.org/wiki/N-gram
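
One possible shape for the sentence-beginning check, sketched in Python under some simplifying assumptions: sentences arrive already split into strings, words are separated by whitespace, and the function name `repeated_openings` is made up for this example. It groups sentences by their first `n` words and reports any opening that occurs more than once.

```python
from collections import defaultdict


def repeated_openings(sentences, n=3):
    """Map each repeated opening n-gram to the indexes of the sentences using it."""
    openings = defaultdict(list)
    for index, sentence in enumerate(sentences):
        words = sentence.lower().split()
        if len(words) >= n:
            # The opening n-gram is the first n words, kept as a tuple.
            openings[tuple(words[:n])].append(index)
    return {gram: hits for gram, hits in openings.items() if len(hits) > 1}


sentences = [
    "She looked at the door.",
    "The rain fell.",
    "She looked at him again.",
]
print(repeated_openings(sentences))
```

The same grouping would work for paragraph openings by passing paragraphs instead of sentences, and sliding the n-gram window across the whole text rather than only the opening would generalize it to repeated phrases anywhere.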
