[ruby-core:85990] [Ruby trunk Bug#14586] URI::RFC2396_Parser#unescape raises an exception if the input is mixed Unicode and percent-escapes
From: ashe@...
Date: 2018-03-08 01:55:11 UTC
List: ruby-core #85990
Issue #14586 has been reported by kivikakk (Ashe Connor).
----------------------------------------
Bug #14586: URI::RFC2396_Parser#unescape raises an exception if the input is mixed Unicode and percent-escapes
https://bugs.ruby-lang.org/issues/14586
* Author: kivikakk (Ashe Connor)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.6.0dev (2018-03-07 trunk 62693) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
Currently, the following test case passes:
~~~ ruby
def test_unescape
  p1 = URI::Parser.new
  assert_equal("\xe3\x83\x90", p1.unescape("\xe3\x83\x90"))
  assert_equal("\xe3\x83\x90", p1.unescape('%e3%83%90'))
end
~~~
But the following raises `Encoding::CompatibilityError`:
~~~ ruby
def test_unescape
  p1 = URI::Parser.new
  assert_equal("\xe3\x83\x90", p1.unescape("\xe3\x83\x90"))
  assert_equal("\xe3\x83\x90", p1.unescape('%e3%83%90'))
  assert_equal("\xe3\x83\x90\xe3\x83\x90", p1.unescape("\xe3\x83\x90%e3%83%90"))
end
~~~
The issue is in the definition of `URI::RFC2396_Parser#unescape`:
~~~ ruby
def unescape(str, escaped = @regexp[:ESCAPED])
  str.gsub(escaped) { [$&[1, 2].hex].pack('C') }.force_encoding(str.encoding)
end
~~~
The behaviour is as follows:
* If the `String` contains only ASCII characters (including percent-escapes), then substituting each result of `[$&[1, 2].hex].pack('C')` (which returns ASCII-8BIT) succeeds, because an ASCII-only `String` is compatible with ASCII-8BIT, so the result built so far is safely coerced to `ASCII-8BIT` and the concatenation works.
* If the `String` contains ASCII and non-ASCII (e.g. UTF-8) characters but no percent-escapes, then `gsub` matches nothing and the contents are returned unchanged.
* If the `String` contains both non-ASCII characters and percent-escapes, however, then `gsub` has to combine the non-ASCII text with individual ASCII-8BIT bytes (which can't be coerced to UTF-8), and this fails with `Encoding::CompatibilityError`.
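The three cases above can be reproduced outside of `URI` with a small sketch. The method name `naive_unescape` and the `/%\h\h/` regexp are stand-ins chosen for illustration (the real method uses `@regexp[:ESCAPED]`); the body is copied from the definition quoted above.

~~~ ruby
# Hypothetical free-standing copy of the current #unescape body.
# /%\h\h/ is an assumed stand-in for @regexp[:ESCAPED].
def naive_unescape(str, escaped = /%\h\h/)
  str.gsub(escaped) { [$&[1, 2].hex].pack('C') }.force_encoding(str.encoding)
end

# The three bytes used in the report, explicitly tagged as UTF-8.
ba = "\xe3\x83\x90".dup.force_encoding(Encoding::UTF_8)

ascii_only = naive_unescape('%e3%83%90') # percent-escapes only: succeeds
no_escapes = naive_unescape(ba)          # no escapes: gsub matches nothing

# Non-ASCII text followed by percent-escapes: gsub cannot append the
# ASCII-8BIT replacement to the UTF-8 text it has already copied.
mixed_error =
  begin
    naive_unescape(ba + '%e3%83%90')
    nil
  rescue Encoding::CompatibilityError => e
    e
  end
~~~

Here `mixed_error` ends up holding the exception, while the other two calls return the same three bytes as `ba`.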
This patch:
1. Adds the test.
2. Records the original encoding of the input string, forces the encoding to ASCII-8BIT for the `gsub`, then forces the encoding back after `gsub`. If the percent-encoded characters aren't valid in the original encoding, that's up to the user, but this is better than just refusing to perform the unescape at all.
3. Corrects a minor doc mismatch.
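The patch itself is attached rather than inlined, so the following is only a rough sketch of what step 2 describes, not the contents of the attachment. The method name `unescape_binary` and the `/%\h\h/` regexp are assumptions made for the example.

~~~ ruby
# Hypothetical sketch of step 2: do the substitution on a binary copy,
# then restore the caller's encoding. Not the attached patch itself.
def unescape_binary(str, escaped = /%\h\h/)
  original = str.encoding                        # remember the input encoding
  str.dup.force_encoding(Encoding::ASCII_8BIT)   # work on raw bytes
     .gsub(escaped) { [$&[1, 2].hex].pack('C') } # replace each %XX with its byte
     .force_encoding(original)                   # tag the result as the original encoding
end

ba = "\xe3\x83\x90".dup.force_encoding(Encoding::UTF_8)
mixed = unescape_binary(ba + '%e3%83%90')

# Escapes that are not valid in the original encoding are passed through
# as-is; checking validity is left to the caller.
invalid = unescape_binary('%ff')
~~~

With this approach `mixed` is the UTF-8 string `ba + ba`, and `invalid.valid_encoding?` is `false`, which matches the "that's up to the user" behaviour described in step 2.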
Thanks to @tenderlovemaking, who helped me find this in upstream Ruby and suggested that @naruse might be interested in reviewing this patch. We currently run with this monkey-patched at GitHub, and in Rails `master`.
---Files--------------------------------
rfc2396-uri-encoding.patch (1.48 KB)