[ruby-core:79551] [Ruby trunk Bug#13216] Possible unexpected behaviour reading string starting with a byte order mark

From: nobu@...
Date: 2017-02-16 08:03:48 UTC
List: ruby-core #79551
Issue #13216 has been updated by Nobuyoshi Nakada.


Shyouhei Urabe wrote:
> > $ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U")'
> > ï
> 
> This IS weird.  Smells like a bug to me.

Not a bug.

`pack("U")` packs just one codepoint, and U+00EF is LATIN SMALL LETTER I WITH DIAERESIS, which is exactly what is printed.

```
$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U*")'
ï»¿id
```
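
To make the difference between the two directives concrete, here is a small sketch that works on the byte array directly instead of reading STDIN. The variable names are only illustrative, and the last step (packing with `"C*"` and tagging the result as UTF-8) is my assumption about what the reporter probably wanted, not something stated in the report.

```ruby
# Bytes of the input in the report: the UTF-8 BOM (EF BB BF) followed by "id".
bytes = [0xEF, 0xBB, 0xBF, 0x69, 0x64]

# "U" consumes one array element and encodes it as a single UTF-8 character.
one = bytes.pack("U")

# "U*" consumes every element, treating each byte value as its own codepoint.
all = bytes.pack("U*")

p one.codepoints       # [239]
p all.codepoints       # [239, 187, 191, 105, 100]

# Assumption: to rebuild the original text, pack the values as raw bytes
# ("C*") and then tag the resulting binary string as UTF-8.
original = bytes.pack("C*").force_encoding("UTF-8")
p original.codepoints  # [65279, 105, 100]
```

In other words, `"U"` reinterprets each byte value as a codepoint, so it never reassembles the three BOM bytes into the single character U+FEFF.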


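The symbol comparison in the report follows from the same bytes: the BOM stays in the string, so it becomes part of the symbol name. A minimal sketch, assuming the input is read as UTF-8; `strip_bom` is a hypothetical helper written for this example, not a Ruby method.

```ruby
# Hypothetical helper (not part of Ruby): drop one leading U+FEFF if present.
def strip_bom(text)
  text.sub(/\A\uFEFF/, "")
end

# What STDIN.read returns for the reported input, assuming a UTF-8 locale.
raw = "\uFEFFid"

p raw.to_sym == :id             # false
p raw.to_sym.length             # 3
p strip_bom(raw).to_sym == :id  # true
```

When reading from a file rather than STDIN, opening it with `File.open(path, "r:BOM|UTF-8")` makes Ruby skip a leading BOM on its own.
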
----------------------------------------
Bug #13216: Possible unexpected behaviour reading string starting with a byte order mark
https://bugs.ruby-lang.org/issues/13216#change-62991

* Author: Gabriel Giordano
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
Maybe the comparison between symbols has unexpected behaviour. Tested with ruby 2.4.0.

```
$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes'
239
187
191
105
100

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.bytes'
105
100

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym'
id

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.to_sym'
id

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym == :id' 
false

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.to_sym == :id'
true

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U")'
ï
```

-- 
https://bugs.ruby-lang.org/
