💾 Archived View for bbs.geminispace.org › u › blah_blah_blah › 16056 captured on 2024-05-10 at 12:54:47. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Comment by 🚀 blah_blah_blah

Re: "How Can We Determine Files Types and Text File Encodings?"

In: s/Gemini

The responses to my post confirm my view that the final determinant of a file's type or encoding is human judgment about whether the software expected to handle the data chokes on it or not. I guess I'm the only one who finds this topic intriguing, or alarming.

🚀 blah_blah_blah [OP]

Apr 10 · 4 weeks ago

Original Post

🌒 s/Gemini

How Can We Determine Files Types and Text File Encodings? — Determining File Types I have a security question. How can we verify that a UTF-8 file contains only UTF-8 encoded bytes? Running iconv all the time (the preferred solution) isn't appropriate in every situation, and only pushes back the question: how does iconv perform the verification? Other proposals suggest pushing text through UTF-8 language tools, like `read().decode('UTF-8')` in Python, but, again, the /how/ remains...

💬 blah_blah_blah · 7 comments · Apr 04 · 5 weeks ago
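
On the "how" raised in the original post: one way to see what a validator has to do is to write the byte-level check by hand. The sketch below follows the well-formed byte sequence ranges in RFC 3629 (the same table appears in the Unicode Standard). It is not a claim about how iconv or Python's decoder is actually implemented; the function name `is_valid_utf8` is made up for illustration. Note that it only answers "is every byte part of a well-formed UTF-8 sequence?", not "was this file meant to be UTF-8?", which is the part that still comes down to judgment.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence (per RFC 3629).

    Hypothetical helper for illustration; not taken from iconv or any library.
    """
    i, n = 0, len(data)
    while i < n:
        b0 = data[i]
        if b0 <= 0x7F:                      # single-byte (ASCII) character
            i += 1
            continue

        # Pick the number of continuation bytes and the allowed range of the
        # FIRST continuation byte. The narrowed ranges are what reject
        # overlong forms, UTF-16 surrogates, and values above U+10FFFF.
        if 0xC2 <= b0 <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif b0 == 0xE0:
            need, lo, hi = 2, 0xA0, 0xBF    # below 0xA0 would be overlong
        elif 0xE1 <= b0 <= 0xEC or 0xEE <= b0 <= 0xEF:
            need, lo, hi = 2, 0x80, 0xBF
        elif b0 == 0xED:
            need, lo, hi = 2, 0x80, 0x9F    # above 0x9F would be a surrogate
        elif b0 == 0xF0:
            need, lo, hi = 3, 0x90, 0xBF    # below 0x90 would be overlong
        elif 0xF1 <= b0 <= 0xF3:
            need, lo, hi = 3, 0x80, 0xBF
        elif b0 == 0xF4:
            need, lo, hi = 3, 0x80, 0x8F    # above 0x8F would exceed U+10FFFF
        else:
            return False                    # 0x80-0xC1 and 0xF5-0xFF never lead

        if i + need >= n:                   # sequence is cut off at end of data
            return False
        if not lo <= data[i + 1] <= hi:
            return False
        for k in range(2, need + 1):        # remaining continuation bytes
            if not 0x80 <= data[i + k] <= 0xBF:
                return False
        i += need + 1
    return True


if __name__ == "__main__":
    print(is_valid_utf8("héllo 🚀".encode("utf-8")))
    print(is_valid_utf8(b"\xc0\xaf"))
```

A file that passes this check can still be something other than UTF-8 in intent: plain ASCII passes, and so does any Latin-1 text that happens to contain only bytes below 0x80. So the check narrows the question without fully settling it.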