Grabbing BibTeX from a DOI

DATE: 2021-08-08

AUTHOR: John L. Godlee

There's a website called doi2bib.org[1] that takes a DOI and returns a BibTeX entry. I have been using it for a while to quickly get references while writing my PhD thesis. The website uses the DOI proxy server REST API in the background, so I figured it wouldn't be too hard to call the API directly with curl and do the same thing in the terminal, saving me from opening my web browser. The request below works well, where $1 is the DOI. The -L flag follows the resolver's redirects, and the Accept header asks for the response as BibTeX.

1: https://www.doi2bib.org/

curl -LH "Accept: application/x-bibtex" "https://doi.org/$1"
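
As a rough sketch, the request could be wrapped in a shell function so that it also accepts DOIs pasted as "doi:..." or as a full URL. The function names strip_doi_prefix and doi2bib are made up for this example, and I'm assuming the doi.org resolver is used and that curl is installed.

```shell
#!/usr/bin/env sh

# Hypothetical helper: remove a leading "doi:" or resolver URL,
# leaving only the bare DOI (e.g. 10.1000/xyz123).
strip_doi_prefix() {
    doi=$1
    doi=${doi#doi:}
    doi=${doi#https://doi.org/}
    doi=${doi#http://doi.org/}
    doi=${doi#https://dx.doi.org/}
    doi=${doi#http://dx.doi.org/}
    printf '%s\n' "$doi"
}

# Hypothetical wrapper: print the BibTeX entry for the DOI given as $1.
doi2bib() {
    if [ -z "${1:-}" ]; then
        echo "usage: doi2bib DOI" >&2
        return 1
    fi
    # -s hides the progress meter, -L follows redirects,
    # -H sets the Accept header that requests BibTeX.
    curl -sLH "Accept: application/x-bibtex" \
        "https://doi.org/$(strip_doi_prefix "$1")"
}
```

With the functions sourced into the shell, calling doi2bib with a DOI as its only argument, with or without a prefix, should print the entry to stdout, ready to be appended to a .bib file with >>.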

In a similar vein, I wrote a script that grabs DOIs from a PDF. I used the regex for DOIs provided in a blog post on Crossref[2], which apparently matches 74.4 million of 74.9 million registered DOIs. The script returns only the first DOI in the PDF, because that's most often the DOI of the article itself, rather than a DOI for one of the references in the article.

2: https://www.crossref.org/blog/dois-and-matching-regular-expressions/

#!/usr/bin/env sh

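# Print the first DOI found in the PDF given as $1.
# Needs pdftotext and a grep that supports -P (GNU grep).
pdftotext "$1" - |
    grep -ioP '\b(10\.\d{4,9}/[-._;()/:A-Z0-9]+)\b' |
    head -n 1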
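
The two pieces fit together naturally, so here is a minimal sketch of a combined version that goes straight from a PDF to a BibTeX entry. The function names first_doi and pdf2bib are made up for this example. It assumes GNU grep (for -P), pdftotext, and curl are available, and that the first DOI in the PDF really is the article's own.

```shell
#!/usr/bin/env sh

# Hypothetical helper: read text on stdin, print the first DOI found.
# The dot after "10" is escaped so that it only matches a literal dot.
first_doi() {
    grep -ioP '\b(10\.\d{4,9}/[-._;()/:A-Z0-9]+)\b' |
        head -n 1
}

# Hypothetical wrapper: print a BibTeX entry for the PDF given as $1.
pdf2bib() {
    doi=$(pdftotext "$1" - | first_doi)
    if [ -z "$doi" ]; then
        echo "No DOI found in $1" >&2
        return 1
    fi
    curl -sLH "Accept: application/x-bibtex" "https://doi.org/$doi"
}
```

Keeping the extraction in its own function that reads stdin means it can be tried on plain text without a PDF, for example by piping a line of text into first_doi with printf.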