So, you have two copies of a document. One of them contains an outline or bookmarks that you want to use for the other document. Except that maybe you also want to fix it up a bit. Basically it all boils down to this:
1. Extract bookmarks from an existing PDF document into a text file.
2. Edit bookmarks.
3. Apply bookmarks from the text file to an existing PDF document.
Painful experimentation has led me to this: Use mbtPdfAsm.
I also submitted this – in French! – to the mbtPdfAsm author → Editer des signets.
First, get the outline:
mbtPdfAsm -mSource.pdf -gO > Outline.txt
This will take the outline from Source.pdf and write it to Outline.txt.
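Edit the outline. The format is hard to read and hard to write. There is one bookmark per line, and each line appears to hold five space-separated fields: a key, the key of the parent (0 for top-level entries), the position among its siblings, the page number, and the title. Here’s an example:
1 0 1 1 My Book
2 1 1 1 Front Cover
3 1 2 2 Credits
4 1 3 3 Contents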
Edit at will, harr harr, then apply the new outline:
mbtPdfAsm -mSource.pdf -dResult.pdf -oOutline.txt
This will take Source.pdf and Outline.txt and create Result.pdf.
Clearly we need a program to be able to write a decent outline.
This is a Perl program to convert the outline produced by mbtPdfAsm into editable text.
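#!/usr/bin/perl
use strict;
use warnings;

while (<>) {
    # One outline entry per line: key, parent key, number, page, title
    my ($key, $parent, $no, $page, $title) = m/(\d+) (\d+) (\d+) (\d+) (.*)/;
    next unless $key;            # skip lines that are not outline entries
    print ' ' if $parent > 0;    # indent entries that have a parent
    print "$page $title\n";
}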
Example use:
perl ~/bin/outline-to-text.pl < Outline.txt > Editable.txt
The format of Editable.txt is as follows:
1. If the line starts with a page number, it’s a chapter
2. If the line is indented and starts with a page number, it’s a section
3. After the page number is a space
4. Everything after that is the title of the bookmark
Example:
1 My Book
 1 Front Cover
 2 Credits
 3 Contents
4 My First Chapter
 4 First Section
 7 Second Section
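Since a hand-edited file is easy to get wrong, it may help to check it before converting it back. The following is only a sketch, not part of mbtPdfAsm or of the scripts in this post: `check_editable` is a made-up helper that reports every non-blank line that does not look like "optional indent, page number, space, title". The sample lines are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns "line N: text" for every non-blank line
# that does not follow the four format rules listed above.
sub check_editable {
    my @lines = @_;
    my @problems;
    my $n = 0;
    for my $line (@lines) {
        $n++;
        chomp(my $text = $line);
        next if $text =~ /^\s*$/;              # blank lines are harmless
        next if $text =~ /^[ \t]*\d+ .+$/;     # indent, page, space, title
        push @problems, "line $n: $text";
    }
    return @problems;
}

# Invented sample with two mistakes: a missing page number and a missing space.
my @sample = (
    "1 My Book\n",
    " 1 Front Cover\n",
    "Credits\n",
    " 3 Contents\n",
    "4My First Chapter\n",
);

print "$_\n" for check_editable(@sample);
# prints:
# line 3: Credits
# line 5: 4My First Chapter
```

To check a real file, one could pass the lines of Editable.txt to `check_editable` instead of the sample array.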
We also need something to convert it back...
This is a Perl program to convert the editable text described above into the kind of outline used by mbtPdfAsm.
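#!/usr/bin/perl
use strict;
use warnings;

my $id = 1;                              # running key, one per bookmark
my ($chapter, $chapterno, $section);
while (<>) {
    my ($indent, $page, $title) = m/([ \t]*)(\d+) (.*)/;
    next unless $page;                   # skip lines without a page number
    if (not $indent) {                   # unindented line: a new chapter
        $chapter = $id;
        $chapterno++;
        $section = 1;
    }
    print join(' ', $id++,
               $indent ? $chapter : 0,             # parent key, 0 for chapters
               $indent ? $section++ : $chapterno,  # position among siblings
               $page, $title), "\n";
}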
Example use:
perl ~/bin/text-to-outline.pl < Editable.txt > Outline.txt
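One way to gain some confidence in the pair of scripts is a round trip: convert an outline to editable text and back, then compare. The sketch below wraps the same logic as the two scripts in functions (the names `outline_to_text` and `text_to_outline` are chosen here for illustration) and runs them on the four-line sample outline from above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same logic as outline-to-text.pl, over a list of lines.
sub outline_to_text {
    my @out;
    for (@_) {
        my ($key, $parent, $no, $page, $title) = m/(\d+) (\d+) (\d+) (\d+) (.*)/;
        next unless $key;
        push @out, ($parent > 0 ? ' ' : '') . "$page $title";
    }
    return @out;
}

# Same logic as text-to-outline.pl, over a list of lines.
sub text_to_outline {
    my @out;
    my $id = 1;
    my ($chapter, $chapterno, $section);
    for (@_) {
        my ($indent, $page, $title) = m/([ \t]*)(\d+) (.*)/;
        next unless $page;
        if (not $indent) {
            $chapter = $id;
            $chapterno++;
            $section = 1;
        }
        push @out, join(' ', $id++, $indent ? $chapter : 0,
                        $indent ? $section++ : $chapterno, $page, $title);
    }
    return @out;
}

my @outline = (
    '1 0 1 1 My Book',
    '2 1 1 1 Front Cover',
    '3 1 2 2 Credits',
    '4 1 3 3 Contents',
);
my @text  = outline_to_text(@outline);
my @again = text_to_outline(@text);

my $same = join("\n", @again) eq join("\n", @outline);
print $same ? "round trip ok\n" : "round trip differs\n";
# prints: round trip ok
```

Note that this only holds for outlines with two levels. As written, the first script gives every entry with a parent the same indent, and the second attaches every indented line to the most recent chapter, so deeper nesting would come back flattened.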
And it works! 😄
#PDF #Software