💾 Archived View for plot47.space › docs › fixed-size-image-format.gmi captured on 2024-05-10 at 10:38:55. Gemini links have been rewritten to link to archived content
The idea is to have a simple, fixed-size, 1-bit image format that can act as a stand-in replacement for paper sizes such as A4 and A5. If we use the A series of paper (A0-A10) as the basis for a P series (pixel format), and a resolution a little greater than or equal to 300dpi (a lot of printing is happy with 300dpi (11.8dpmm)), then the equivalent of A4, which is 8.268" x 11.693" (210mm x 297mm), would be 2704 x 3824 pixels @ ~327dpi (~12.9dpmm). The reason for picking 2704 x 3824 for P4 is that it belongs to the best optimised set of doubling/halving numbers that also best approximate sqrt 2 (the A series aspect ratio); every ratio in the set is good to four decimal places. The percentage error is also approximately constant throughout the range at between 0.00087532332258330253 and 0.00087533098454029533 percent:
P0   10816 x 15296 pixels
P1    7648 x 10816 pixels
P2    5408 x  7648 pixels (~8K)
P3    3824 x  5408 pixels (~5K)
P4    2704 x  3824 pixels (~4K)
P5    1912 x  2704 pixels
P6    1352 x  1912 pixels (~HD)
P7     956 x  1352 pixels
P8     676 x   956 pixels [1] (~800 x 640)
P9     478 x   676 pixels [1] (~640 x 480)
P10    338 x   478 pixels [1] (~480 x 320)
P11    239 x   338 pixels [1][2] (~320 x 240)
P12    169 x   239 pixels [1][2]
P13   3586 x  5542 pixels [3] (Ledger/Tabloid)
P14   2771 x  3586 pixels [3] (Letter)
P15   2771 x  4564 pixels [3] (Legal)
These are all @ ~327dpi (326dpi for P13-P15) so a little greater than the normal print resolution of 300dpi.
[1] Close to the optimal 3 digit fraction approximations for sqrt 2.
[2] The A series runs from A0-A10; P11 and P12 were added to the series to accommodate smaller sizes of screen.
[3] These are 3 additional common paper formats that are not in the A series: P13 Ledger/Tabloid (11" x 17" (279mm x 432mm)) is similar in size to P3, P14 Letter (8.5" x 11" (216mm x 279mm)) is similar to P4, and P15 Legal (8.5" x 14" (216mm x 356mm)) is a bit bigger than P4. They are all at 326dpi, close enough to the ~327dpi of the rest of the P series to make no difference, and all have exactly the same ratios as their paper equivalents, with sides of the same paper length sharing the same pixel length (e.g. 3586 pixels for the 11" side of both P13 and P14).
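As a rough check of the table above, the P0-P12 sizes can be regenerated by starting from P12 (169 x 239) and repeatedly doubling. This is only a sketch; the function names `p_series` and `sqrt2_error_percent` are made up for illustration and are not part of the format.

```python
import math

def p_series(base=(169, 239), smallest=12):
    """Return {index: (width, height)} for P0..P12, doubling up from P12."""
    sizes = {}
    short, long = base
    for index in range(smallest, -1, -1):
        sizes[index] = (short, long)
        # The next larger size reuses the old long side as its short side
        # and doubles the old short side to get its new long side.
        short, long = long, 2 * short
    return sizes

def sqrt2_error_percent(width, height):
    """Relative error of height/width against sqrt 2, in percent."""
    root2 = math.sqrt(2)
    return abs(height / width - root2) / root2 * 100

sizes = p_series()
for index in sorted(sizes):
    w, h = sizes[index]
    print(f"P{index}: {w} x {h}  error {sqrt2_error_percent(w, h):.8f}%")
```

The error alternates between two nearly identical values because the ratio flips between 239/169 and 338/239 at each step, one slightly below sqrt 2 and one slightly above.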
The reason for aiming for at least 300dpi is that the 1-bit format can then use dithering without too much loss of detail, which keeps the format simple but allows some flexibility in the grey scale representation of images. If a P series image is printed at 300dpi rather than ~327dpi, it will print larger than the equivalent A series page size. This could be mitigated by scaling, or by leaving a small white border around the edge of the image, so that when printed at 300dpi only the part within the white border would be printed. This isn't too bad, as a border normally has to be left when printing anyway; this would just make the border a little larger, and wouldn't scale the contents noticeably. This is a decent compromise between standard printing resolutions and using a range of values which is the best optimised set of doubling/halving numbers, which also best approximate sqrt 2 and has an almost constant percentage error.
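To give a feel for the size of that border, here is a small sketch that works out how many pixels of a P4 image fall off an A4 page at 300dpi. The helper name `border_pixels` is hypothetical, and centring the printable area is an assumption.

```python
MM_PER_INCH = 25.4

def border_pixels(image_px, paper_mm, dpi=300):
    """Return (printable_px, border_px_per_side) along one axis.

    Assumes the printable area is centred, so the unprinted pixels are
    split evenly between the two edges.
    """
    printable = round(paper_mm / MM_PER_INCH * dpi)
    return printable, (image_px - printable) // 2

# P4 (2704 x 3824) printed on A4 (210mm x 297mm) at 300dpi
print(border_pixels(2704, 210))  # width
print(border_pixels(3824, 297))  # height
```

Under these assumptions the border comes to 112 pixels on the left and right and 158 pixels on the top and bottom, i.e. roughly 9mm and 13mm on the printed page.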
If more resolution is required, for example for high quality black and white photographs, then picking a P size two steps larger (an index 2 lower) and printing it on the paper size of the original will double the resolution, e.g. P2 printed as if it were P4 would print at ~654dpi.
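The arithmetic behind this is just pixels divided by paper width. A minimal sketch, with `print_dpi` as an illustrative name only:

```python
def print_dpi(pixels_wide, paper_width_mm):
    """Dots per inch when pixels_wide pixels span paper_width_mm of paper."""
    return pixels_wide / (paper_width_mm / 25.4)

print(round(print_dpi(2704, 210)))  # P4 on A4
print(round(print_dpi(5408, 210)))  # P2 on A4
```

With A4 being 210mm wide, P4 comes out at about 327dpi and P2 at about 654dpi.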
The resolutions could also be used as specific screen resolutions and if made at ~327ppi they'd also be the same size as the A series paper they mimic i.e. a P4 screen @ ~327ppi would be the same size as a piece of A4 paper. A P3 screen would show two P4 pages side by side etc.
To allow storage of text data in the P series image format (not very efficient compared to pure text storage, but like paper it might survive format changes better), a specific fixed-size (16 point?) Unicode bitmap font without anti-aliasing could be used. This would mean that OCR isn't required for parsing the text, as each character would always be rendered identically and fit exactly into the pixel grid. Obviously other text can be rendered without issues, but OCR would be needed if parsing was required.
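A toy sketch of why a fixed bitmap font removes the need for OCR: if every glyph occupies an identical cell, parsing is an exact table lookup. The 3 x 5 glyphs, the cell size and the names `GLYPHS`, `render` and `parse` below are all invented for illustration; the actual font and cell size are not decided here.

```python
# Hypothetical 3x5 glyphs, for illustration only ('1' = black pixel).
GLYPHS = {
    "A": ("010", "101", "111", "101", "101"),
    "4": ("101", "101", "111", "001", "001"),
    " ": ("000", "000", "000", "000", "000"),
}
CELL_W, CELL_H = 3, 5
LOOKUP = {rows: ch for ch, rows in GLYPHS.items()}

def render(text):
    """Render one line of text as CELL_H strings of '0'/'1' pixels."""
    return ["".join(GLYPHS[ch][y] for ch in text) for y in range(CELL_H)]

def parse(rows):
    """Recover text by cutting the bitmap into cells and matching exactly."""
    chars = []
    for x in range(0, len(rows[0]), CELL_W):
        cell = tuple(row[x:x + CELL_W] for row in rows)
        chars.append(LOOKUP.get(cell, "?"))  # '?' marks an unknown cell
    return "".join(chars)

bitmap = render("A4 A")
print("\n".join(bitmap))
print(parse(bitmap))
```

Any pixel that differs from the reference glyph makes the lookup fail, which is exactly the point: text that is meant to be machine-readable has to be rendered with the agreed font on the agreed grid.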
The P series image format can be either uncompressed, or losslessly compressed with run-length encoding[1]. The first and last bytes of the file both contain the specific P format from the P series that the image represents. The high nibble indicates the format from 0-15, while the low nibble indicates the preferred orientation (portrait (1) / landscape (0)) using the lowest bit of the nibble, and whether run-length encoding is used (1) or not (0) using the second lowest bit of the nibble. After the first byte, the next 16 bytes (128 bits) hold a UUID[2], specifically a UUID version 7, which is a time-based, sortable unique identifier. The bytes in between the header and footer are the raw, or run-length encoded, pixel data. So an uncompressed landscape P4 image would start and end with #40 as the two nibbles:
40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 .. .. .. .. {pixel data} .. .. .. .. 40
While a run-length encoded portrait P12 image would start and end with #C3:
C3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 .. .. .. .. {pixel data} .. .. .. .. C3
The P series image format doesn't need any more header/footer than this, as the P format number already implies the bit depth (1-bit) and the image size.
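The header/footer layout described above could be sketched as follows. The function names are hypothetical, and the UUID is built by hand following the version 7 bit layout (48-bit millisecond timestamp, version nibble, variant bits, random fill) rather than relying on a library call; how pixels are packed into bytes is not covered here.

```python
import os
import time

def header_byte(p_index, portrait, rle):
    """Pack P format (high nibble), RLE flag (bit 1) and orientation (bit 0)."""
    if not 0 <= p_index <= 15:
        raise ValueError("P format must be in the range 0-15")
    return (p_index << 4) | (int(rle) << 1) | int(portrait)

def parse_header_byte(value):
    """Return (p_index, portrait, rle) from a header/footer byte."""
    return value >> 4, bool(value & 0x01), bool(value & 0x02)

def uuid7_bytes(unix_ms=None):
    """Build 16 bytes laid out as a UUID version 7."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    raw = bytearray(unix_ms.to_bytes(6, "big") + os.urandom(10))
    raw[6] = (raw[6] & 0x0F) | 0x70  # version 7 in the high nibble
    raw[8] = (raw[8] & 0x3F) | 0x80  # variant bits 10
    return bytes(raw)

def wrap(p_index, portrait, rle, pixel_data):
    """Frame pixel data as: header byte, UUID, pixel data, footer byte."""
    tag = bytes([header_byte(p_index, portrait, rle)])
    return tag + uuid7_bytes() + pixel_data + tag

print(f"{header_byte(4, portrait=False, rle=False):02X}")   # landscape, raw P4
print(f"{header_byte(12, portrait=True, rle=True):02X}")    # portrait, RLE P12
```

Repeating the byte at the end means a reader can identify the format from either end of the file, and a mismatch between the two is a cheap sign of truncation.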
The exact type of run-length encoding is to be determined, but perhaps the one used in the TGA image format?
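For reference, a TGA-style scheme uses packets that start with a one-byte header: if the top bit is set, the low 7 bits plus one give a repeat count for the single value that follows; otherwise they give the number of literal values that follow. The sketch below applies that idea with one byte (8 pixels) as the unit, which is an assumption made here since the unit for 1-bit data is undecided. It also lets runs cross row boundaries, which a final specification might not allow.

```python
def rle_encode(data):
    """Encode bytes as TGA-style run-length and raw packets."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        # Measure the run of identical bytes starting at i (max 128).
        run = 1
        while i + run < n and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            out.append(0x80 | (run - 1))  # run packet: count - 1, top bit set
            out.append(data[i])
            i += run
        else:
            # Gather literals until a run of 2+ begins or 128 are collected.
            start = i
            i += 1
            while (i < n and i - start < 128
                   and not (i + 1 < n and data[i + 1] == data[i])):
                i += 1
            out.append(i - start - 1)     # raw packet: count - 1, top bit clear
            out += data[start:i]
    return bytes(out)

def rle_decode(packets):
    """Invert rle_encode."""
    out = bytearray()
    i = 0
    while i < len(packets):
        count = (packets[i] & 0x7F) + 1
        if packets[i] & 0x80:
            out += bytes([packets[i + 1]]) * count
            i += 2
        else:
            out += packets[i + 1:i + 1 + count]
            i += 1 + count
    return bytes(out)

sample = b"\x00" * 300 + b"\xff\x0f\xf0" + b"\xaa" * 5
packed = rle_encode(sample)
print(len(sample), "->", len(packed), "bytes")
```

Mostly-white pages compress very well under this kind of scheme, while heavily dithered areas fall back to raw packets and grow slightly, by one header byte per 128 bytes of data.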