From 7d31eda45dc9d615cbfa084a2149b17c5e313a85 Mon Sep 17 00:00:00 2001
From: Adrian Cochrane
Date: Thu, 11 Nov 2021 16:52:12 +1300
Subject: [PATCH] Note charset optimization.

---
 ISSUES/charset-sniffing.md | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 ISSUES/charset-sniffing.md

diff --git a/ISSUES/charset-sniffing.md b/ISSUES/charset-sniffing.md
new file mode 100644
index 0000000..4ed7031
--- /dev/null
+++ b/ISSUES/charset-sniffing.md
@@ -0,0 +1,3 @@
+# Optimize Charset Sniffing
+
+Almost all charsets are supersets of ASCII, so when sniffing the charset for files which don't specify the encoding in their MIMEtype I can treat all the preceding text as ASCII. Though I suppose for this trick to work on UTF16 or UTF32 I'd need to remove any 0 bytes.
-- 
2.30.2