[dev] [sbase] what is a text file?

From: Evan Gates <evan.gates_AT_gmail.com>
Date: Fri, 21 Nov 2014 17:28:19 -0800

POSIX defines a text file as follows:

3.397 Text File

A file that contains characters organized into zero or more lines. The
lines do not contain NUL characters and none can exceed {LINE_MAX}
bytes in length, including the <newline> character. Although
POSIX.1-2008 does not distinguish between text files and binary files
(see the ISO C standard), many utilities only produce predictable or
meaningful output when operating on text files. The standard utilities
that have such restrictions always specify "text files" in their STDIN
or INPUT FILES sections.
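
To make the two constraints in that definition concrete, here is a
minimal sketch of a checker. It is not sbase code: the name
is_posix_text and the max_line parameter (standing in for {LINE_MAX})
are made up for illustration. It also assumes the strict reading in
which a trailing fragment with no <newline> is an incomplete line
rather than a line.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of sbase.
 * Returns 1 if buf[0..len) contains no NUL bytes and no line longer
 * than max_line bytes (newline included), and ends with a newline
 * unless it is empty. Returns 0 otherwise.
 */
int
is_posix_text(const unsigned char *buf, size_t len, size_t max_line)
{
	size_t i, linelen = 0;

	for (i = 0; i < len; i++) {
		if (buf[i] == '\0')
			return 0;	/* lines must not contain NUL */
		if (++linelen > max_line)
			return 0;	/* line exceeds the limit */
		if (buf[i] == '\n')
			linelen = 0;	/* line complete, start the next */
	}
	/* assumption: an unterminated final fragment is not a line */
	return linelen == 0;
}
```

Note that bytes such as 0xFF pass this check; nothing here looks at
the encoding of the characters.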

Notice there's no mention of ASCII, so bytes 0x80 to 0xFF are valid.
For sbase we want UTF-8 support. Should we assume/enforce only valid
UTF-8? Doing so makes a lot of coding easier and less sucky, but means
that some POSIX text files will not be sbase text files, namely those
where the aforementioned bytes do not form valid UTF-8 sequences. In
this case what's more important? Strict POSIX compliance? Or code that
sucks less?
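
For reference, "enforce only valid UTF-8" could look roughly like the
sketch below. This is an illustration under my own assumptions, not a
proposal for the actual sbase interface: the name is_valid_utf8 is
hypothetical, and it takes the strict view that overlong encodings,
surrogates and code points above U+10FFFF are invalid.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of sbase.
 * Returns 1 if s[0..n) is well-formed UTF-8, 0 otherwise.
 */
int
is_valid_utf8(const unsigned char *s, size_t n)
{
	size_t i = 0, k, need;
	unsigned long cp, min;

	while (i < n) {
		unsigned char c = s[i++];

		if (c < 0x80) {
			continue;		/* ASCII */
		} else if (c >= 0xC2 && c <= 0xDF) {
			need = 1; cp = c & 0x1F; min = 0x80;
		} else if ((c & 0xF0) == 0xE0) {
			need = 2; cp = c & 0x0F; min = 0x800;
		} else if (c >= 0xF0 && c <= 0xF4) {
			need = 3; cp = c & 0x07; min = 0x10000;
		} else {
			return 0;	/* stray continuation or bad lead byte */
		}
		if (n - i < need)
			return 0;	/* sequence truncated at end of input */
		for (k = 0; k < need; k++, i++) {
			if ((s[i] & 0xC0) != 0x80)
				return 0;	/* not a continuation byte */
			cp = (cp << 6) | (s[i] & 0x3F);
		}
		/* reject overlong forms, surrogates and out-of-range values */
		if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
			return 0;
	}
	return 1;
}
```

A Latin-1 line such as "caf\xe9\n" has no NUL and is short enough to
be a POSIX text file, yet this check rejects it, which is exactly the
gap in question.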

-emg
Received on Sat Nov 22 2014 - 02:28:19 CET
