Comment by Killed_Mufasa on 15/08/2019 at 21:50 UTC

6 upvotes, 2 direct replies (showing 2)

View submission: Why did I build AmputatorBot?

View parent comment

Hey there, this is an interesting one:

The bot marked that link as an AMP link because it contained the string `amp.` (detention_cAMP.). So that's a stupid but funny coincidence.
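To illustrate how that coincidence can happen, here is a minimal sketch of a naive substring check. The function name `looks_like_amp` and the example URLs are hypothetical, and this is not the bot's actual code, just one way such a check could misfire.

```python
def looks_like_amp(url: str) -> bool:
    """Hypothetical naive check: flag any URL that contains 'amp.'."""
    return "amp." in url


# Hypothetical URLs, made up for illustration only.
amp_url = "https://amp.example.com/story"
camp_url = "https://example.com/news/detention_camp.html"

print(looks_like_amp(amp_url))   # flagged: a real AMP subdomain
print(looks_like_amp(camp_url))  # also flagged: "camp." happens to contain "amp."
```

A substring match like this can't tell an `amp.` subdomain from a word that merely ends in "amp", which is why extra false-positive tests are needed on top of it.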

However, there are plenty of other ways the bot checks what is and isn't a false positive. It was determined that the linked page contained the HTML tag `link` with the attribute `rel` set to `canonical`. ~~This tag was specifically designed for AMP practices.~~ Basically, it (or rather, its `href` value) points to the direct source (which isn't using AMP). ~~So this tag being there is already weird.~~

Not every website with this attribute is using AMP, but every AMP page *is* (in theory) using the canonical attribute. If this attribute is missing, it's a false positive. But since it was there, it passed the first test.

The link also passed the second false-positives test: if the submitted link is the same as the canonical one, it's either a false flag or a badly implemented spec. But it wasn't the same, because of the last part of your link: `?smid=nytcore-ios-share` (yup). See how the bot removed that part?
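The two false-positive tests described above can be sketched roughly like this. It's a simplified illustration using only Python's standard library; the names `CanonicalFinder`, `find_canonical`, and `passes_false_positive_tests`, as well as the `example.com` URLs, are assumptions made for the example and not the bot's real implementation.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def passes_false_positive_tests(submitted_url: str, html: str) -> bool:
    """Return True if the link still looks like AMP after both tests."""
    canonical = find_canonical(html)
    if canonical is None:
        return False  # Test 1: no canonical link, so treat it as a false positive.
    if canonical == submitted_url:
        return False  # Test 2: submitted link already is the canonical one.
    return True


# Hypothetical page and URL; only the tracking parameter comes from the thread.
page = '<html><head><link rel="canonical" href="https://example.com/story.html"></head></html>'
submitted = "https://example.com/story.html?smid=nytcore-ios-share"
print(passes_false_positive_tests(submitted, page))
```

Because the comparison in the second test is an exact string match, a tracking parameter such as `?smid=nytcore-ios-share` is enough to make the submitted link differ from the canonical one, so the link slips through both tests.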

Sorry about this. I genuinely thought the bot was 100% false flag proof this time. Welp guess not. Back to the drawing board it is.. Thx for letting me know haha!

Edit: removed the misinformation after a good comment of u/TheNominated!

Replies

Comment by TheNominated at 20/08/2019 at 16:01 UTC

6 upvotes, 1 direct replies

> However, there are plenty of other ways the bot checks what is and isn't a false positive. It was determined that the link contained the HTML tag `rel` with attribute `canonical`. This tag was specifically designed for AMP practices. Basically, it (or rather has the value) points to the direct souce (which isn't using AMP). So this tag being there is already weird.

I think you're a bit mistaken on this point. The canonical attribute was conceived years before AMP became a thing, to signal the "single source of truth" to search engines in the event that there are multiple URLs with the same content. While it does play well with AMP, it has other, more important uses, and you definitely cannot deduce that a site is using AMP from that attribute.

Comment by femtoaggression at 15/08/2019 at 21:59 UTC

4 upvotes, 0 direct replies

No worries dude, I work in software and there’s always edge cases. Glad I could help.