[#107430] [Ruby master Feature#18566] Merge `io-wait` gem into core IO — "byroot (Jean Boussier)" <noreply@...>

Issue #18566 has been reported by byroot (Jean Boussier).

22 messages 2022/02/02

[#107434] [Ruby master Bug#18567] Depending on default gems when not needed considered harmful — "Eregon (Benoit Daloze)" <noreply@...>

Issue #18567 has been reported by Eregon (Benoit Daloze).

31 messages 2022/02/02

[#107443] [Ruby master Feature#18568] Explore lazy RubyGems boot to reduce need for --disable-gems — "headius (Charles Nutter)" <noreply@...>

Issue #18568 has been reported by headius (Charles Nutter).

13 messages 2022/02/02

[#107481] [Ruby master Feature#18571] Removed the bundled sources from release package after Ruby 3.2 — "hsbt (Hiroshi SHIBATA)" <noreply@...>

Issue #18571 has been reported by hsbt (Hiroshi SHIBATA).

9 messages 2022/02/04

[#107490] [Ruby master Bug#18572] Performance regression when invoking refined methods — "palkan (Vladimir Dementyev)" <noreply@...>

Issue #18572 has been reported by palkan (Vladimir Dementyev).

12 messages 2022/02/05

[#107514] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY` — "byroot (Jean Boussier)" <noreply@...>

Issue #18576 has been reported by byroot (Jean Boussier).

47 messages 2022/02/08

[#107536] [Ruby master Feature#18579] Concatenation of ASCII-8BIT strings shouldn't behave differently depending on string contents — "tenderlovemaking (Aaron Patterson)" <noreply@...>

Issue #18579 has been reported by tenderlovemaking (Aaron Patterson).

11 messages 2022/02/09

[#107547] [Ruby master Bug#18580] Range#include? inconsistency for String ranges — "zverok (Victor Shepelev)" <noreply@...>

Issue #18580 has been reported by zverok (Victor Shepelev).

10 messages 2022/02/10

[#107603] [Ruby master Feature#18589] Finer-grained constant invalidation — "kddeisz (Kevin Newton)" <noreply@...>

Issue #18589 has been reported by kddeisz (Kevin Newton).

17 messages 2022/02/16

[#107624] [Ruby master Bug#18590] String#downcase and CAPITAL LETTER I WITH DOT ABOVE — "andrykonchin (Andrew Konchin)" <noreply@...>

Issue #18590 has been reported by andrykonchin (Andrew Konchin).

13 messages 2022/02/17

[#107651] [Ruby master Misc#18591] DevMeeting-2022-03-17 — "mame (Yusuke Endoh)" <noreply@...>

Issue #18591 has been reported by mame (Yusuke Endoh).

11 messages 2022/02/18

[#107682] [Ruby master Feature#18595] Alias `String#-@` as `String#dedup` — "byroot (Jean Boussier)" <noreply@...>

Issue #18595 has been reported by byroot (Jean Boussier).

15 messages 2022/02/21

[#107699] [Ruby master Feature#18597] Strings need a named method like `dup` that doesn't duplicate if receiver is mutable — "danh337 (Dan H)" <noreply@...>

Issue #18597 has been reported by danh337 (Dan H).

18 messages 2022/02/21

[ruby-core:107549] [Ruby master Feature#18576] Rename `ASCII-8BIT` encoding to `BINARY`

From: duerst <noreply@...>
Date: 2022-02-10 07:53:35 UTC
List: ruby-core #107549
Issue #18576 has been updated by duerst (Martin D端rst).


Eregon (Benoit Daloze) wrote in #note-4:

> The property that bytes < 128 are interpreted as US-ASCII is nothing special, every `Encoding#ascii_compatible?` behaves like that.
> And almost all non-dummy Ruby encodings are `#ascii_compatible?`, the only two exceptions are UTF-16/32 (both LE/BE).
> 
> Two things particularly confusing about the name ASCII-8BIT:
> * It's completely unclear it might mean binary data or unknown encoding

Well, binary data can be character data with unknown encoding (or with encoding not yet set), or it can be truly binary data (e.g. as in a .jpg file or .zip file,...).

> * ISO-8859-* and many other encodings are 8-bit ascii-compatible encodings. Yet ASCII-8BIT which name seems to imply something close is nothing like that (the 8th bit is undefined, uninterpreted but valid).

ASCII-8BIT is an 8-bit ascii-compatible encoding, isn't it?

I think the idea of ASCII-8BIT goes back to the fact that in Ruby, many encodings can be used for source code, and as long as you only use ASCII in the code, it doesn't actually matter. That's to a large extent how Ruby 1.8 operated, and that was carried over into Ruby 1.9.

Now that the default source encoding is UTF-8, we have an encoding pragma for source files in other encodings, and so on, the importance of "something where we know ASCII is ASCII, but we are not sure about the upper half of the byte values" may be quite a bit less important.



----------------------------------------
Feature #18576: Rename `ASCII-8BIT` encoding to `BINARY`
https://bugs.ruby-lang.org/issues/18576#change-96461

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
### Context

I'm now used to it, but something that confused me for years was errors such as:

```ruby
>> "f辿e" + "\xFF".b
(irb):3:in `+': incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)
```

When you aren't that familiar with Ruby, it's really not evident that `ASCII-8BIT` basically means "no encoding" or "binary".

And even when you know it, if you don't read carefully it's very easily confused with `US-ASCII`.

The `Encoding::BINARY` alias is much more telling IMHO.

### Proposal

Since `Encoding::ASCII_8BIT` has been aliased as `Encoding::BINARY` for years, I think renaming it to `BINARY` and then making asking `ASCII_8BIT` the alias would significantly improve usability without backward compatibility concerns.

The only concern I could see would be the consistency with a handful of C API functions:

  - `rb_encoding *rb_ascii8bit_encoding(void)`
  - `int rb_ascii8bit_encindex(void)`
  - `VALUE rb_io_ascii8bit_binmode(VALUE io)`

But that's for much more advanced users, so I don't think it's much of a concern.




-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread