Done

Done 2022-01-04

- Improve site:-query QOL (/)

- Fix byte folder bug (/)

- refactor EC_URL (/)

ALTER TABLE EC_URL MODIFY COLUMN PROTO ENUM('http', 'https', 'gemini') NOT NULL;

-- put visit-metadata in separate table (/)

Done 2021-12-03

- fix bug in language detection (/)

-- re-fetching some pages (/)

Done 2021-12-02

- new approach for query rewriting (/)

Done 2021-11-14

- make site:-queries return a dummy entry when no site information is available (/)

Done 2021-11-11

- hybridized ordering of domains on reindex, F(previous rank, previous quality). (/)

- mark documents with audio, video, object tags (/)

Done 2021-11-10

- car service <2021-11-18> (/)

Done 2021-10-30

- Add auto redirects for guesswork rss/atom/feed-requests to /log/feed.xml (/)

Done 2021-10-29

- investigate extracting more keywords (/)

-- textrank (/)

-- tf-idf (x)

-- sideload additional keywords for most popular sites (/)

Done 2021-10-12

- refactor index converter (/)

- clean up code garbage (/)

Done 2021-10-05

- trial more vanilla PageRank approach as a tertiary algorithm (/)

- fix a search result prioritization bug for mixed rankings (/)

- fix search interface for firefox on android (x)

It is reportedly broken

-- figure out how to replicate this problem (x)

- fix potential DoS where certain search queries with a large number of common but mutually exclusive terms would take forever to process. (/)

test query: generic stores underground unusual

Done 2021-10-03

- prioritize n-gram matches over word matches (/)

- show informative error page when the index server reboots (/)

Done 2021-10-02

- Personalized Page Rank (/)

- Duelling Algorithms (/)

Done 2021-10-30

- Launch October Update (/)

Done 2021-09-26

- fix broken search use-cases (/)

-- c language (/)

-- 67 chevy (/)

-- 68000 (/)

-- c# (/)

-- @twitterhandle (/)

-- #hashtag (/)

- trial tar based archiving to save the poor ext4 fs (/)

- use words to tag document format etc (/)

- dynamic re-bucketing based on something like (/)

SELECT DEST.URL_PART, EXP(DEST.QUALITY)*SUM(EXP(SOURCE.QUALITY)) AS Q FROM EC_DOMAIN DEST INNER JOIN EC_DOMAIN_LINK ON DEST.ID=DEST_DOMAIN_ID INNER JOIN EC_DOMAIN SOURCE ON SOURCE.ID=SOURCE_DOMAIN_ID WHERE DEST.INDEXED>0 GROUP BY DEST_DOMAIN_ID;

Done 2021-09-19

- Fix several indexing bugs that hid relevant search results (/)

Done 2021-09-17

- Added search profiles (/)

Done 2021-09-16

- Rephrased an error message that some people took to mean they weren't speaking a proper language (/)

Done 2021-09-15

- Using in-site domain link-names to add search terms (/)

- Fixed buggy default content-type (/)

- Even more aggressive Unicode language detection (/)

Done 2021-09-11

- Status flag for domains (/)

Indexed, Active, Blocked

- Improve topic detection (/)

Done 2021-09-09

- Tuned search results to demote very short results (/)

Done 2021-09-08

- Encyclopedia tries harder to find the right article if the case match isn't exact (/)

Done 2021-09-06

- Breaking changes for next Index-rebuild (/)

-- Change writer bucket scaling to 1/4 (/)

-- Move protocol and port from EdgeDomain to EdgeURL (/)

-- Change database schemas to reflect (/)

-- ISO-8859-1/UTF-8 charset sniffer (/)

-- Fixed a bug that would occasionally cause the crawler to re-index the same working set multiple times (/)

Done 2021-09-02

- improve edge-director throughput (/)

- give edge-director state for semi-blocking tasks (/)

Done 2021-08-31

- optimize URL index size (/)

Done 2021-08-28

- clean up gemini navigation (/)

- Atom feed for HTTPS and Gemini (/)

Done 2021-08-27

- Feed gemini server with rendered gmi-content (/)

-- Output the content (/)

-- Generate feeds (/)

-- Make the gemini server read it (x)

-- Switch over (/)

Done 2021-08-26

- Absorb gemini server into WMSA (/)

Done 2021-08-25

- wildcard domain for marginalia.nu (/)

-- move memex to memex-subdomain (/)

- feeds on FEED pragma (/)

Done 2021-08-24

- Top nav bar overhaul (/)

Done 2021-08-23

- add marker for which files are todo files (/)

Added %%%/pragmas for toggling behavior

-- Added template helpers for consuming pragmas (/)

-- Used to improve topic pages (/)

- Fixes for git (/)

Done 2021-08-22

- File manager (/)

-- Delete (/)

-- Delete Empty Dir (x)

-- Move/Rename (/)

--- System for tombstones/redirects (/)

- Edit for / does not work (/)

Needed better support for non-normalized URLs, e.g. //index.gmi

- Backlinks for index (/)

Done 2021-08-21

- Git Integration (/)

-- Use commit hooks to trigger pull (/)

https://git-scm.com/book/uz/v2/Appendix-B%3A-Embedding-Git-in-your-Applications-JGit

- Recursive directory watch (/)

- Two column layout (/)

Done 2021-08-20

- Overhaul MEMEX navigation (/)

-- Navigation bar (/)

-- Generate site map (x)

-- Editing (/)

--- Add update-root link (/)

- Tombstones aren't generated properly on-delete (/)

The tombstone db wasn't properly reloaded after being updated.

- Just write static files to disk instead of using an intermediary backend server. (/)

-- Use alias directive to set different root for memex path. (/)

-- Content-type is finicky (/)

I want to serve html-wrapped .gmi and .html

location ~* \.(gmi|png)$ {

types {

text/html gmi;

text/html png;

}

}

Done 2021-08-19

- Move away from statically generated HTML forms in memex (/)

- Fix stability of podcast scraper (/)

- Get crawling up again (/)

-- Monitoring (/)

--- Extraction (/)

--- Status page (/)

-- Scraper config (/)

-- DNS cache (?)

-- IP Block CDNs (/)

--- Parse CIDR (/)

Apache Commons Net SubnetUtils seems to do the job, although it can't deal with IPv6 :-/

--- CloudFlare (/)

173.245.48.0/20

103.21.244.0/22

103.22.200.0/22

103.31.4.0/22

141.101.64.0/18

108.162.192.0/18

190.93.240.0/20

188.114.96.0/20

197.234.240.0/22

198.41.128.0/17

162.158.0.0/15

172.64.0.0/13

131.0.72.0/22

104.16.0.0/13

104.24.0.0/14

2400:cb00::/32

2606:4700::/32

2803:f800::/32

2405:b500::/32

2405:8100::/32

2a06:98c0::/29

2c0f:f248::/32

--- Fastly (/)

23.235.32.0/20

43.249.72.0/22

103.244.50.0/24

103.245.222.0/23

103.245.224.0/24

104.156.80.0/20

146.75.0.0/17

151.101.0.0/16

157.52.64.0/18

167.82.0.0/17

167.82.128.0/20

167.82.160.0/20

167.82.224.0/20

172.111.64.0/18

185.31.16.0/22

199.27.72.0/21

199.232.0.0/16

- Refactor task management (/)

-- Fix prepend (/)

-- Add tests (/)

- Refactor Floyd-Steinberg ditherer (/)

- Todo move-to-done function puts header last in #Done (/)
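The CIDR parsing note under "IP Block CDNs" above mentions that IPv6 ranges were a problem. A minimal sketch of one way to match both address families using only the JDK follows. The class name CidrBlock and its methods are hypothetical and are not taken from WMSA; the sketch assumes the input is always a literal IP address, never a hostname.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper: tests whether an address falls inside a CIDR block (IPv4 or IPv6). */
final class CidrBlock {
    private final byte[] network;
    private final int prefixLength;

    private CidrBlock(byte[] network, int prefixLength) {
        this.network = network;
        this.prefixLength = prefixLength;
    }

    /** Parses notation such as "173.245.48.0/20" or "2400:cb00::/32". */
    static CidrBlock parse(String cidr) {
        String[] parts = cidr.split("/", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Missing prefix length: " + cidr);
        }
        byte[] address = toBytes(parts[0]);
        int prefix = Integer.parseInt(parts[1]);
        if (prefix < 0 || prefix > address.length * 8) {
            throw new IllegalArgumentException("Bad prefix length: " + cidr);
        }
        return new CidrBlock(address, prefix);
    }

    /** True if the literal address shares the block's first prefixLength bits. */
    boolean contains(String ip) {
        byte[] candidate = toBytes(ip);
        if (candidate.length != network.length) {
            return false; // different address family
        }
        int fullBytes = prefixLength / 8;
        for (int i = 0; i < fullBytes; i++) {
            if (candidate[i] != network[i]) return false;
        }
        int remainingBits = prefixLength % 8;
        if (remainingBits == 0) return true;
        // Compare only the leading bits of the partially covered byte.
        int mask = (0xFF << (8 - remainingBits)) & 0xFF;
        return (candidate[fullBytes] & mask) == (network[fullBytes] & mask);
    }

    // Assumption: callers pass literal addresses; a hostname here would trigger a DNS lookup.
    private static byte[] toBytes(String literal) {
        try {
            return InetAddress.getByName(literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not an IP address: " + literal, e);
        }
    }
}
```

Usage would be along the lines of CidrBlock.parse("2606:4700::/32").contains(address), run once per listed range for each resolved crawl target.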

Done 2021-08-16

- Pictures-in-HTML (/)

-- Implement compression via Floyd-Steinberg dithering (/)

https://encyclopedia.marginalia.nu/wiki/Floyd%E2%80%93Steinberg_dithering

http://image4j.sourceforge.net/javadoc/index.html?net/sf/image4j/util/ConvertUtil.html

--- Ensure 4 bit (/)

--- On upload (/)

--- Convert existing stuff on-read (x)

-- Render image views (/)

--- Add to index (/)

-- Upload form (/)
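As an illustration of the Floyd-Steinberg step with the 4 bit constraint noted above, here is a minimal sketch that works on a plain array of 8-bit gray values. Whether the memex images are grayscale or palette-based is not stated in these notes, so the grayscale input, the class name and the method name are all assumptions.

```java
/** Hypothetical sketch: Floyd-Steinberg dithering of 8-bit gray values down to 16 levels. */
final class FloydSteinberg {

    /** Returns 4-bit values (0..15) for a row-major image of gray values in 0..255. */
    static int[][] ditherTo4Bit(int[][] gray) {
        int height = gray.length;
        int width = height == 0 ? 0 : gray[0].length;

        // Work on a float copy so diffused error can temporarily leave the 0..255 range.
        float[][] work = new float[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                work[y][x] = gray[y][x];
            }
        }

        int[][] out = new int[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                float old = work[y][x];
                // The 16 output levels map back to gray values 0, 17, 34, ..., 255.
                int level = Math.round(Math.max(0f, Math.min(255f, old)) / 17f);
                out[y][x] = level;
                float error = old - level * 17;

                // Classic weights: 7/16 right, 3/16 below-left, 5/16 below, 1/16 below-right.
                if (x + 1 < width) work[y][x + 1] += error * 7 / 16;
                if (y + 1 < height) {
                    if (x > 0) work[y + 1][x - 1] += error * 3 / 16;
                    work[y + 1][x] += error * 5 / 16;
                    if (x + 1 < width) work[y + 1][x + 1] += error * 1 / 16;
                }
            }
        }
        return out;
    }
}
```

A flat mid-gray input comes out as a mix of the two neighbouring levels rather than a single flat level, which is what lets a 16-level image approximate smooth gradients.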

Done 2021-08-15

- CSS fixes for mobile (/)

-- text align for tasks (/)

-- indent overflowed tasks (/)

- Fix CME (/)

java.util.ConcurrentModificationException: null

at java.util.HashMap.forEach(HashMap.java:1428) ~[?:?]

at nu.marginalia.wmsa.memex.MemexData.forEach(MemexData.java:51) ~[WMSA-1628951793.jar:?]

at nu.marginalia.wmsa.memex.Memex.reRender(Memex.java:49) ~[WMSA-1628951793.jar:?]

at io.reactivex.rxjava3.core.Scheduler$PeriodicDirectTask.run(Scheduler.java:566) [WMSA-1628951793.jar:?]

at io.reactivex.rxjava3.core.Scheduler$Worker$PeriodicTask.run(Scheduler.java:513) [WMSA-1628951793.jar:?]

at io.reactivex.rxjava3.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:65) [WMSA-1628951793.jar:?]

at io.reactivex.rxjava3.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:56) [WMSA-1628951793.jar:?]

at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]

at java.lang.Thread.run(Thread.java:832) [?:?]

ERROR 2021-08-14 16:36:39,467 RxCachedThreadScheduler-2 MemexMain : Uncaught exception

java.util.ConcurrentModificationException: null

at java.util.HashMap.forEach(HashMap.java:1428) ~[?:?]

at nu.marginalia.wmsa.memex.MemexData.forEach(MemexData.java:51) ~[WMSA-1628951793.jar:?]

at nu.marginalia.wmsa.memex.Memex.reRender(Memex.java:49) ~[WMSA-1628951793.jar:?]

at io.reactivex.rxjava3.core.Scheduler$PeriodicDirectTask.run(Scheduler.java:566) ~[WMSA-1628951793.jar:?]

at io.reactivex.rxjava3.core.Scheduler$Worker$PeriodicTask.run(Scheduler.java:513) ~[WMSA-1628951793.jar:?]

at io.reactivex.rxjava3.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:65) [WMSA-1628951793.jar:?]

at io.reactivex.rxjava3.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:56) [WMSA-1628951793.jar:?]

at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]

at java.lang.Thread.run(Thread.java:832) [?:?]

Done 2021-08-14

- Automatic TODO task categorization (/)

- Login API on separate service (/)

-- Set up service (/)

-- Route requests (/)

- Fix header auto-location (/)

- Display top tasks in index (/)

Done 2021-08-10

-- + in URLs? (/)

proxy_pass with / forces nginx to parse the url (why?)

Bad:

proxy_pass http://127.0.0.1:5025/public/wiki/;

Good:

rewrite ^ $request_uri;

rewrite ^/(.*) /public/$1 break;

return 400;

proxy_pass http://127.0.0.1:5025$uri;

- Encyclopedia (/)

-- Search API (/)

-- code tags (/)

Done 2021-08-06

- Memex (/)

-- GemtextParser (/)

-- Service skeleton (/)

-- Link extraction (/)

-- Rendering (/)

--- Stylesheet (/)

-- Metadata (-)

-- Updates (/)

--- API (/)

--- Form (/)

Done 2021-08-04

- Service Lockdown (/)

-- X-Public header in code (/)

-- Move endpoints (/)

--- Resource Store (/)

--- Search (/)

--- Assistant (/)

-- Update clients (/)

--- Resource Store (/)

--- Search Service (/)

--- Assistant (-)

-- Update nginx (/)

-- Update links on website (/)

- Tune wiki archive fs (/)

sudo tune2fs -O ^dir_index /dev/nvme0n1p2

- marginalia.nu:9999 "BBS" (/)

Done 2021-08-03

- encyclopedia.marginalia.nu (/)

- Verify automatic backup of git (/)

- Reddit frontend (/)

-- Scraper: (/)

-- API: Marginalia 2: (/)

- Wiki (/)

-- on Optane (/)

-- fix Hildegard of Bingen (/)

- Block bots on nginx (/)

https://kb.linuxlove.xyz/nginx-badbotblocker.html

Done 2021-08-02

- Install Optane (/)

-- Migrate MariaDB (/)

- Wiki (/)

-- redirects (/)

-- top notices (/)

- Bucket4J rate limiting (/)

- Service Monitoring (/)

Done 2021-08-01

- Update Cert (/)

- Backups for git (/)

Done 2021-07-30

- Load Wikidata from ZIM (/)

- Migrate Server to Debian Buster (/)

Done 2021-07-28

- Update description generation algorithm (/)

-- Recalculate descriptions (...) (/)

- Wiki data (/)

-- Load data (/)

-- Wrap wikipedia (/)

-- ZIM? (-)

-- Wikipedia Cleaner (/)

Done 2021-07-27

- Spell checker service? (/)

https://github.com/wolfgarbe/SymSpell

- Calculations (/)

-- Detection (/)

-- Parser (/)

-- Unit conversion (/)

--- Temperature (/)

--- Distance (/)

--- Weight (/)

--- Area (/)

--- Volume (/)

Done 2021-07-26

- Save websites to disk? (/)

-- GZipped (/)

-- XFS (?)

- Local backlinks in GMI (/)

-- Parse GMI for links and titles (/)

-- Create tags system (/)

- Use prime sizing for HashMap! (/)

-- How to find primes (/)

- Arbitrary size HashMap (/)
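For the "How to find primes" item above, a minimal sketch using trial division is probably enough, since a table size only has to be computed when the map is created or resized. The class and method names are hypothetical, and the notes do not say which method was actually used.

```java
/** Hypothetical helper for picking a prime hash table capacity. */
final class PrimeSizing {

    /** Smallest prime greater than or equal to n. */
    static long nextPrime(long n) {
        if (n <= 2) return 2;
        long candidate = (n % 2 == 0) ? n + 1 : n; // even numbers above 2 are never prime
        while (!isPrime(candidate)) {
            candidate += 2;
        }
        return candidate;
    }

    /** Trial division by odd numbers up to the square root of n. */
    static boolean isPrime(long n) {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (long d = 3; d * d <= n; d += 2) {
            if (n % d == 0) return false;
        }
        return true;
    }
}
```

With a prime capacity, indexing by hash modulo capacity tends to spread keys that share a common stride more evenly than a power-of-two capacity does.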

Done 2021-07-25

- Syntax for orgmode + GMI in kate (/)

Use /usr/share/kde4/apps/katepart/syntax/markdown.xml

Done 2021-07-23

- Dictionary analysis in scraping (/)

It seems viable to estimate the language of a document based on the overlap with an N-most-common-words dictionary.

Threshold 0.05 ok?

-- English (/)

-- Swedish (/)

-- Latin (/)

- Clean up tests (/)
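The dictionary analysis idea above can be sketched as follows. The notes do not say how "overlap" is measured, so this assumes it is the fraction of the document's words that occur in the common-words set. The class name, the method names and the tokenization rule are all hypothetical.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch: guess a document's language from overlap with a common-words set. */
final class DictionaryLanguageGuess {

    /** Fraction of the words in the text that occur in the given common-word dictionary. */
    static double overlap(String text, Set<String> commonWords) {
        // Assumption: words are runs of Unicode letters; everything else is a separator.
        String[] words = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        int total = 0;
        int hits = 0;
        for (String word : words) {
            if (word.isEmpty()) continue; // split yields an empty token for leading separators
            total++;
            if (commonWords.contains(word)) hits++;
        }
        return total == 0 ? 0.0 : (double) hits / total;
    }

    /** True if the overlap reaches the threshold, e.g. the 0.05 suggested in the notes. */
    static boolean looksLike(String text, Set<String> commonWords, double threshold) {
        return overlap(text, commonWords) >= threshold;
    }
}
```

One dictionary per language (English, Swedish, Latin) would be checked in turn. Very short documents give noisy ratios, so a minimum word count may be worth adding.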

Done 2021-07-22

GZip Compression stats:

63% old

21% new

- Hash map (/)

-- Contiguous memory bins (/)

- Key Folding (/)

-- For strings (/)

-- For integers (/)

-- For dates (x)

- Debian Desktop (/)

-- Docker (/)

-- Java 14 (/)

-- IntelliJ (/)

-- Code (/)

-- Gradle (/)

-- OrgMode (/)

Done 2021-07-21

- Bugfix: Domain Resolution (/)

Done 2021-07-20

- Index Changes (/)

-- Remove Junk Logging (/)

-- Split Query (/)

-- Implement in Frontend (/)

- Dictionary Service (/)

-- Add Index To Table (/)

-- Populate test db (/)

-- Build tests (/)

-- Integrate into frontend (/)

- Site Information (/)

-- Fetch (/)

-- 404 (/)
