💾 Archived View for kennedy.gemi.dev › changelog.gmi captured on 2023-05-24 at 17:41:25. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Kennedy Changelog
2023-05-23
- Organized captures by year in the URL history view.
- Streamlined search results page and image search results page based on feedback (Thanks Buckeye Lady!).
- Streamlined and better organized "Page Info" view.
- Hashtag and @mentions indexes deprecated and removed.
- Fix: Show results even if Wikipedia/Gemipedia unavailable.
2023-05-01
- Rebuilt the entire system to work off Web Archive (WARC) files. The Kennedy crawler now produces WARC files, and the search indexer and archiver ingest them. Additional information, like the IP address of remote capsules, is stored in the WARC files.
- Converted previous crawl databases to WARC files, allowing easier ingest into Delorean.
- Imported @mozz's late 2021 Gemini archives, which were in WARC format, into Delorean
- Delorean now stores metaline and response body, allowing storage of 1x, 3x, and 6x response codes.
- Changed the archive database to allow for easier calculation of content sizes and of savings due to content deduplication.
- Added a /stats/ endpoint, showing stats on URLs, snapshots, and sizes of the archive.
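The deduplication and stats entries above can be illustrated with a minimal sketch. It assumes response bodies are keyed by a SHA-256 hash so identical content is stored once. The class name `SnapshotStore`, its fields, and the stat names are hypothetical and are not Kennedy's actual schema.

```python
import hashlib


class SnapshotStore:
    """Hypothetical in-memory archive: bodies are stored once per content hash."""

    def __init__(self):
        self.bodies = {}     # content hash -> body bytes
        self.snapshots = []  # (url, captured, metaline, content hash)

    def add(self, url, captured, metaline, body):
        # Keeping the metaline lets responses with no body (such as 3x redirects)
        # be recorded as snapshots too.
        digest = hashlib.sha256(body).hexdigest()
        self.bodies.setdefault(digest, body)
        self.snapshots.append((url, captured, metaline, digest))

    def stats(self):
        # Logical size counts every snapshot; stored size counts each body once.
        logical = sum(len(self.bodies[d]) for *_, d in self.snapshots)
        stored = sum(len(b) for b in self.bodies.values())
        return {
            "urls": len({s[0] for s in self.snapshots}),
            "snapshots": len(self.snapshots),
            "logical_bytes": logical,
            "stored_bytes": stored,
            "saved_bytes": logical - stored,
        }


store = SnapshotStore()
store.add("gemini://example.org/", "2023-04-01", "20 text/gemini", b"hello")
store.add("gemini://example.org/", "2023-05-01", "20 text/gemini", b"hello")
store.add("gemini://example.org/old", "2023-05-01", "31 gemini://example.org/", b"")
print(store.stats())
```

With this layout, the savings figure is the difference between the summed size of all snapshots and the summed size of the unique bodies.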
2023-03-19
- Massive improvement to Delorean, making it store a history of cached versions of content, and not just the copy found in the most recent crawl
2023-01-27
- Redesigned the crawler code, which improved crawl speed. robots.txt files are now downloaded on demand instead of requiring a pre-flight step, ensuring that every capsule with a robots.txt file is respected.
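One way to fetch robots.txt on demand is a per-host cache that is filled the first time a URL on that host is checked. The sketch below assumes this approach and is not the crawler's actual code. `RobotsCache` and the injected `fetch_robots` callable (host in, robots.txt text or None out) are hypothetical. The parsing uses Python's standard `urllib.robotparser`.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class RobotsCache:
    """Hypothetical lazy cache: robots.txt is fetched once per host, on first use."""

    def __init__(self, fetch_robots):
        self._fetch = fetch_robots  # assumed callable: host -> robots.txt text or None
        self._parsers = {}

    def allowed(self, url, agent="*"):
        host = urlsplit(url).netloc
        if host not in self._parsers:
            # First URL seen for this host: fetch and parse its rules now.
            text = self._fetch(host)
            parser = RobotFileParser()
            parser.parse((text or "").splitlines())
            self._parsers[host] = parser
        return self._parsers[host].can_fetch(agent, url)


def fake_fetch(host):
    # Stand-in for a real Gemini request to gemini://<host>/robots.txt.
    return "User-agent: *\nDisallow: /private"


cache = RobotsCache(fake_fetch)
print(cache.allowed("gemini://example.org/public/a.gmi"))
print(cache.allowed("gemini://example.org/private/b.gmi"))
```

Because the lookup happens at the moment a URL is about to be requested, no capsule can be crawled before its rules are known.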
2022-08-06
- Updated "Page Info" view to support image meta data (dimensions, format, text used in index)
- Updated Delorean to show cached images and other cached, non-text content
2022-07-26
- Added image search! Images are indexed based on the text in their file path, as well as the text in all their inbound links
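As a rough illustration of indexing an image by its file path and inbound link text, the sketch below builds a single searchable string. The function name `image_index_text`, the tokenizing rule (runs of ASCII letters and digits), and the lowercasing are assumptions, not the indexer's actual behavior.

```python
import re
from typing import Iterable
from urllib.parse import unquote, urlsplit

WORD = re.compile(r"[A-Za-z0-9]+")


def image_index_text(image_url: str, inbound_link_texts: Iterable[str]) -> str:
    """Hypothetical helper: combine path words and inbound link words."""
    path = unquote(urlsplit(image_url).path)
    # Drop the file extension if the last path segment has one.
    if "." in path.rsplit("/", 1)[-1]:
        path = path.rsplit(".", 1)[0]
    words = WORD.findall(path)
    for text in inbound_link_texts:
        words.extend(WORD.findall(text))
    return " ".join(w.lower() for w in words)


print(image_index_text(
    "gemini://example.org/photos/cat_pic-01.jpg",
    ["My Cat, sleeping"],
))
```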
2022-06-04
- Updated search to also include a snippet from Gemipedia about the search query and a link to the Gemipedia entry
2022-03-01
- Added a "Page Info" view that shows title, language, # lines, size of response, and incoming/outbound links to a page
- Improved Delorean by adding a "View Cached" link for each page in the "Page Info" view.
- Streamlined the meta data shown on the search results page into a single line and made it a link to "Page Info" view.
- Improved "title" extraction code to use the first header encountered, regardless of level, or alt text from the first pre-formatted section.
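The title rule in the last entry above can be sketched as follows. This reading assumes that a heading of any level takes priority and that the alt text of the first pre-formatted section is only a fallback. The entry could also be read as "whichever comes first". The function name `extract_title` is hypothetical.

```python
from typing import Optional


def extract_title(gemtext: str) -> Optional[str]:
    """Return the first heading's text, else the first preformatted block's alt text."""
    in_pre = False
    seen_pre = False
    first_alt = None
    for line in gemtext.splitlines():
        if line.startswith("```"):
            if not in_pre and not seen_pre:
                # Opening fence of the first block: trailing text is its alt text.
                first_alt = line[3:].strip() or None
                seen_pre = True
            in_pre = not in_pre
            continue
        if in_pre:
            continue  # lines inside a preformatted block are never headings
        if line.startswith("#"):
            text = line.lstrip("#").strip()
            if text:
                return text
    return first_alt


print(extract_title("Welcome\n## Gemlog\n# Home"))
```

Here the level-two heading is returned because it appears first, even though a level-one heading follows it.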
2022-02-21
- Added Delorean, which lets you view cached content from the most recent scan by providing a URL
2022-02-14
- Added route/view for showing capsules with valid security.txt files