Re: [dev] sed breaks utf8 in [ ]

From: FRIGN <dev_AT_frign.de>
Date: Tue, 31 Mar 2015 01:30:04 +0200

On Mon, 30 Mar 2015 19:09:41 -0400
Roger <rogerx.oss_AT_gmail.com> wrote:

Hey Roger,

> I thought non-ASCII characters required 16 bits within UTF-8, versus just 8
> bits for ASCII. Therefore more memory. More memory referencing, requires more
> processing.

I can't take you seriously, sorry. UTF-8 is the future, there's no way around it.
Non-ASCII code points do need more than one byte (two to four in UTF-8), but
every ASCII character still takes exactly one byte, so UTF-8 is doing a great job.
Keep in mind: most text streams consist largely of ASCII characters. This is one
argument against UTF-16, which spends at least two bytes on every character,
ASCII included.
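
To put rough numbers on this, here is a minimal Python sketch. It is not
from the thread itself: the helper name encoded_sizes and the sample strings
are made up for illustration, and UTF-16 is measured without a byte order mark.

```python
# Sketch: compare how many bytes a string needs in UTF-8 and in UTF-16.
# encoded_sizes and the sample strings are hypothetical, for illustration only.

def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no BOM in either."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

samples = [
    "sed 's/[a-z]/x/'",   # pure ASCII, typical for scripts and source code
    "na\u00efve",         # mostly ASCII with one accented letter
    "\u65e5\u672c\u8a9e", # CJK text, three code points
]

for s in samples:
    u8, u16 = encoded_sizes(s)
    # !a prints an ASCII-only repr, so the output is safe on any terminal
    print(f"{s!a}: utf-8={u8} bytes, utf-16={u16} bytes")
```

For ASCII-heavy input the UTF-8 size equals the character count, while UTF-16
doubles it. CJK-heavy text is the case where UTF-16 comes out smaller.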

Cheers

FRIGN

-- 
FRIGN <dev_AT_frign.de>
Received on Tue Mar 31 2015 - 01:30:04 CEST