💾 Archived View for jb55.com › ward.asia.wiki.org › sitemap-scrape-statistics captured on 2022-01-08 at 14:17:26. Gemini links have been rewritten to link to archived content


Sitemap Scrape Statistics

We collect various counts while scraping and report them as a text file.

json

Here we plot the most recent counts available.

plots

We fetch the counts.txt file and parse it line by line. Each line is its own JSON record.

github
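
As a minimal sketch of the line-by-line parsing described above, assuming the file has already been fetched into a string: the helper name parse_counts and the field names in the sample are hypothetical, since the page does not list the actual fields.

```python
import json

def parse_counts(text):
    """Parse newline-delimited JSON: one record per non-empty line."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, such as a trailing newline
        records.append(json.loads(line))
    return records

# Hypothetical sample content; the real counts.txt fields may differ.
sample = '{"sites": 10, "items": 200}\n{"sites": 11, "items": 230}\n'
records = parse_counts(sample)
print(len(records))  # 2
```

Parsing each line on its own means the file can grow by appending one record per scrape without rewriting what is already there.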

We render each field as a separate time series using the recently open-sourced plotly.js, following the advice in its quick-start documentation.

plotly
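
One way to turn parsed records into one series per field might look like the sketch below. It is written in Python for illustration rather than in the page's own JavaScript. The function name to_traces is hypothetical, and the output dictionaries only mimic the general shape of a plotly scatter trace (name, x, y, type). The x values are in days, using the six-hour sampling interval stated on this page.

```python
def to_traces(records):
    """Build one trace per field, with x in days since sample zero."""
    # Collect field names in first-seen order.
    fields = []
    for rec in records:
        for key in rec:
            if key not in fields:
                fields.append(key)

    traces = []
    for field in fields:
        xs, ys = [], []
        for n, rec in enumerate(records):
            if field in rec:  # a field may be missing from older records
                xs.append(n * 0.25)  # four samples per day
                ys.append(rec[field])
        traces.append({"name": field, "x": xs, "y": ys, "type": "scatter"})
    return traces

# Hypothetical records for illustration only.
traces = to_traces([{"items": 1, "links": 2}, {"items": 3, "links": 5}])
print([t["name"] for t in traces])  # ['items', 'links']
```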

The horizontal axis is in days. The scraper runs every six hours. Our first sample, sample number zero, was recorded on Sep 5, 2015 at 22:30 GMT.
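
The relationship between sample number, time, and the day axis can be sketched as follows. This assumes the scraper holds to an exact six-hour schedule, so real run times may drift from these values. The names sample_time and sample_day are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Sample number zero, as stated above: Sep 5, 2015 at 22:30 GMT.
FIRST_SAMPLE = datetime(2015, 9, 5, 22, 30, tzinfo=timezone.utc)
SAMPLE_INTERVAL = timedelta(hours=6)

def sample_time(n):
    """Nominal time of sample n, assuming exact six-hour spacing."""
    return FIRST_SAMPLE + n * SAMPLE_INTERVAL

def sample_day(n):
    """Position of sample n on a horizontal axis measured in days."""
    return n * SAMPLE_INTERVAL / timedelta(days=1)

print(sample_time(4).isoformat())  # 2015-09-06T22:30:00+00:00
print(sample_day(4))               # 1.0
```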

Now with 6-hour growth rates for items and links.

github
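
A 6-hour growth rate can be read as the change between consecutive samples. The sketch below takes that reading and uses the absolute difference. Whether the plotted rate is absolute or relative is not stated here, so treat this as an assumption. The function name growth and the sample values are hypothetical.

```python
def growth(values):
    """Change between consecutive samples, one value per 6-hour interval."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

# Hypothetical item counts from four consecutive scrapes.
items = [200, 230, 230, 260]
print(growth(items))  # [30, 0, 30]
```

The result has one fewer entry than the input, because the first sample has no predecessor to compare against.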

Now with x-axis dates that snap to weeks.

github
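
Snapping dates to weeks could be done by rounding each tick down to the start of its week. The sketch below assumes weeks start on Monday at 00:00 UTC, which the page does not specify. The names snap_to_week and weekly_ticks are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def snap_to_week(t):
    """Round a datetime down to Monday 00:00 of its week (assumed week start)."""
    monday = t - timedelta(days=t.weekday())
    return monday.replace(hour=0, minute=0, second=0, microsecond=0)

def weekly_ticks(start, end):
    """Week-aligned tick dates from the week containing start up to end."""
    ticks = []
    tick = snap_to_week(start)
    while tick <= end:
        ticks.append(tick)
        tick += timedelta(weeks=1)
    return ticks

first = datetime(2015, 9, 5, 22, 30, tzinfo=timezone.utc)  # a Saturday
print(snap_to_week(first).date())  # 2015-08-31
```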

Now with a runtime range of 20 to 80 minutes, tolerating timezone and daylight-saving shifts.

github
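
One possible reading of the 20 to 80 minute range, not confirmed by this page, is that a runtime computed from mismatched clocks can be off by whole hours, and that folding it into a 60-minute-wide window removes the offset. The sketch below illustrates only that interpretation. The function name fold_runtime is hypothetical.

```python
def fold_runtime(minutes, low=20, high=80):
    """Fold a runtime into [low, high) by removing whole-hour offsets.

    Assumes any error is a multiple of (high - low) minutes, which is
    60 minutes with the defaults, as a timezone or daylight shift would be.
    """
    span = high - low
    return (minutes - low) % span + low

print(fold_runtime(35))   # 35, already in range
print(fold_runtime(95))   # 35, one hour too large
print(fold_runtime(-25))  # 35, one hour too small
```

Under this reading, a true runtime outside the window would be misreported, so the window only works while actual runs stay between 20 and 80 minutes.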

See Search Index Downloads.

Search Index Downloads