Re: [dev] sed breaks utf8 in [ ]


From: Roger <rogerx.oss_AT_gmail.com>
Date: Tue, 31 Mar 2015 20:48:47 -0400

On Tue, Mar 31, 2015 at 01:30:04AM +0200, FRIGN wrote:
>On Mon, 30 Mar 2015 19:09:41 -0400
>Roger <rogerx.oss_AT_gmail.com> wrote:
>
>Hey Roger,
>
>> I thought non-ASCII characters required 16 bits within UTF-8, versus just 8
>> bits for ASCII. Therefore more memory. More memory referencing, requires more
>> processing.
>
>I can't take you seriously, sorry. UTF-8 is the future, there's no way around it.
>You need multiple 8-bit bytes (two to four) to store a non-ASCII code point, but
>UTF-8 is doing a great job.

I tend to agree that UTF-8 is the future.
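
The per-character cost is easy to check. Below is a minimal sketch, assuming
Python 3 (chosen only for illustration, it is not part of this thread); the
helper name utf8_len is hypothetical. It prints how many bytes UTF-8 needs for
one ASCII character and three non-ASCII ones:

```python
def utf8_len(ch):
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))

# Sample code points written as escapes so the source itself stays ASCII:
# U+0061 (a), U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (emoji).
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print("U+%04X -> %d byte(s)" % (ord(ch), utf8_len(ch)))
```

ASCII stays at one byte, and the other code points take two, three and four
bytes, so "16 bits for anything non-ASCII" is not how UTF-8 works.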

>Keep in mind: most text streams consist mostly of ASCII characters. This is one
>argument against UTF-16, which spends at least 16 bits on every character, ASCII
>included.

Copy that.
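
The size argument can be illustrated with a rough sketch, again assuming
Python 3. The function name encoded_sizes and the sample string are made up for
this example; "utf-16-le" is used so that no byte-order mark is counted:

```python
def encoded_sizes(text):
    """Return (UTF-8 size, UTF-16 size) in bytes, without a byte-order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

# A mostly-ASCII line containing two non-ASCII characters (U+00E4, U+00F6).
text = "sed 's/[\u00e4\u00f6]/x/g' file.txt"

u8, u16 = encoded_sizes(text)
print("UTF-8: %d bytes, UTF-16: %d bytes" % (u8, u16))
```

For text like this the UTF-16 form comes out close to twice the size of the
UTF-8 form, because every ASCII character costs two bytes there instead of one.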

-- 
Roger
http://rogerx.freeshell.org/

Received on Wed Apr 01 2015 - 02:48:47 CEST
