2009-08-31 RSS Feed Without RPG Articles

I added a feed to the metadata of this site that covers the English diary pages, excluding the RPG category. As far as I can tell, the occasional German page still slips through. Let me know if you find one so that I can try to improve the language filter. To be honest, it is very primitive – basically I count matches for the following regular expressions:

`\b(the|that|and|why|what|you|it|one)\b` – three matches and the page is considered to contain English text.

`\b(der|die|das|und|oder|eins|zwei|drei|vier|ich|du|er|sie|es|den|dem|des|ein|eine|einer|eines)\b` – three matches and the page is considered to contain German text.

If a page is not identified as English or German, it’ll match any language filter and get shown. Thus, German pages that are too short for three matches, or German pages that contain some English book titles matching the other regular expression three times will still show up in the feed.
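For illustration, here is a minimal Python sketch of the counting rule described above. It is not the code running on this site: the function names `detect_languages` and `matches_filter`, the `"en"`/`"de"` labels, and the choice of case-sensitive matching are all assumptions made for the example. Only the two regular expressions and the three-match threshold come from the text.

```python
import re

# The two patterns quoted above. Case-sensitive matching is an assumption.
ENGLISH = re.compile(r"\b(the|that|and|why|what|you|it|one)\b")
GERMAN = re.compile(
    r"\b(der|die|das|und|oder|eins|zwei|drei|vier|ich|du|er|sie|es|"
    r"den|dem|des|ein|eine|einer|eines)\b"
)
THRESHOLD = 3  # three matches and the page counts as that language


def detect_languages(text):
    """Return the set of language labels (hypothetical "en"/"de") found in text."""
    languages = set()
    if len(ENGLISH.findall(text)) >= THRESHOLD:
        languages.add("en")
    if len(GERMAN.findall(text)) >= THRESHOLD:
        languages.add("de")
    return languages


def matches_filter(text, wanted):
    """A page with no identified language matches any language filter."""
    languages = detect_languages(text)
    return not languages or wanted in languages


# A short German page has too few matches, so it is shown in the English feed.
print(matches_filter("ich bin", "en"))
# A German page quoting an English title is identified as both languages.
print(detect_languages("ich las das buch und dann the lord of the rings and the hobbit"))
```

Both failure modes follow directly from the rule: the first page is never identified as anything, and the second reaches three matches for each pattern, so an English filter lets both through.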

​#Web