[#110568] [Ruby master Misc#19096] [Question] Time with `-00:00` offset is in UTC — "andrykonchin (Andrew Konchin)" <noreply@...>

Issue #19096 has been reported by andrykonchin (Andrew Konchin).

10 messages 2022/11/01

[#110578] [Ruby master Feature#19099] Support `private_constant` for an undefined constant — "ujihisa (Tatsuhiro Ujihisa)" <noreply@...>

Issue #19099 has been reported by ujihisa (Tatsuhiro Ujihisa).

7 messages 2022/11/02

[#110621] [Ruby master Feature#19104] Introduce the cache-based optimization for Regexp matching — "make_now_just (Kitsune TSUYUSATO)" <noreply@...>

Issue #19104 has been reported by make_now_just (Kitsune TSUYUSATO).

8 messages 2022/11/05

[#110636] [Ruby master Bug#19108] Format routines like pack blindly treat a string as ASCII-encoded — "chrisseaton (Chris Seaton)" <noreply@...>

Issue #19108 has been reported by chrisseaton (Chris Seaton).

8 messages 2022/11/07

[#110663] [Ruby master Bug#19113] Inconsistency in retention of compare_by_identity flag in Hash methods — "jeremyevans0 (Jeremy Evans)" <noreply@...>

Issue #19113 has been reported by jeremyevans0 (Jeremy Evans).

10 messages 2022/11/09

[#110670] [Ruby master Bug#19115] OpenSSL fails to autoload (macOS) — "thomthom (Thomas Thomassen)" <noreply@...>

Issue #19115 has been reported by thomthom (Thomas Thomassen).

10 messages 2022/11/09

[#110683] [Ruby master Feature#19117] Include the method owner in backtraces, not just the method name — "byroot (Jean Boussier)" <noreply@...>

Issue #19117 has been reported by byroot (Jean Boussier).

53 messages 2022/11/10

[#110689] [Ruby master Bug#19119] Add an interface for out-of-process profiling tools to access Ruby information — "kjtsanaktsidis (KJ Tsanaktsidis)" <noreply@...>

Issue #19119 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).

7 messages 2022/11/10

[#110708] [Ruby master Misc#19122] Use MADV_DONTNEED instead of MADV_FREE when freeing a Fiber's stack — "smcgivern (Sean McGivern)" <noreply@...>

Issue #19122 has been reported by smcgivern (Sean McGivern).

8 messages 2022/11/11

[#110737] [Ruby master Bug#19130] MRI failing when executing shell builtins with Errno::ENOENT — "ifiht (Mikal R)" <noreply@...>

Issue #19130 has been reported by ifiht (Mikal R).

9 messages 2022/11/14

[#110843] [Ruby master Feature#19141] Add thread-owned Monitor to protect thread-local resources — "wildmaples (Maple Ong)" <noreply@...>

Issue #19141 has been reported by wildmaples (Maple Ong).

10 messages 2022/11/21

[#110870] [Ruby master Bug#19144] Ruby should set AI_V4MAPPED | AI_ADDRCONFIG getaddrinfo flags by default — "kjtsanaktsidis (KJ Tsanaktsidis)" <noreply@...>

Issue #19144 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).

7 messages 2022/11/24

[#110876] [Ruby master Bug#19147] `TestFileExhaustive#test_expand_path_for_existent_username` and `TestDir#test_home` fails on i686 — "vo.x (Vit Ondruch)" <noreply@...>

Issue #19147 has been reported by vo.x (Vit Ondruch).

6 messages 2022/11/24

[#111027] [Ruby master Bug#19154] Specify require and autoload guarantees in ractors — "fxn (Xavier Noria)" <noreply@...>

Issue #19154 has been reported by fxn (Xavier Noria).

14 messages 2022/11/26

[#111036] [Ruby master Bug#19156] ObjectSpace.dump_all segfault during string inspection — mk <noreply@...>

Issue #19156 has been reported by mk (Matthias Käppler).

25 messages 2022/11/28

[#111053] [Ruby master Bug#19158] Ruby 3.1.3 installs wrong gemspec for debug gem — deivid <noreply@...>

Issue #19158 has been reported by deivid (David Rodríguez).

10 messages 2022/11/29

[#111075] [Ruby master Bug#19161] Cannot compile 3.0.5 or 3.1.3 on Red Hat 7 — "werebus (Matt Moretti)" <noreply@...>

Issue #19161 has been reported by werebus (Matt Moretti).

15 messages 2022/11/29

[ruby-core:111020] [Ruby master Feature#19102] Optimize ERB::Util.html_escape more than CGI.escapeHTML for template engines

From: "Eregon (Benoit Daloze)" <noreply@...>
Date: 2022-11-26 13:03:14 UTC
List: ruby-core #111020
Issue #19102 has been updated by Eregon (Benoit Daloze).





I think it is unfortunate to add a C extension to ERB for that. ERB was always pure Ruby, and that was nice.



Also, the C extension is slower on TruffleRuby: the Regexp is actually JIT-compiled and can use vectorization, unlike that C code. Part of the cost is also that `RSTRING_PTR()` basically forces a copy from managed memory (`byte[]`) to native memory (`char*`) on TruffleRuby.

```
truffleruby 23.0.0-dev-57e53f8a, like ruby 3.1.2, GraalVM CE Native [x86_64-linux]
      CGI.escapeHTML     31.985M (± 1.2%) i/s -    160.093M in   5.006001s
ERB::Util.html_escape     7.427M (± 3.3%) i/s -     37.162M in   5.009721s
```

and CRuby 3.1 is:

```
ERB::Util.html_escape    14.551M (± 0.8%) i/s -     73.308M in   5.038335s
      CGI.escapeHTML     10.065M (± 0.6%) i/s -     51.054M in   5.072629s
```



Given those results, could you build the C extension only for CRuby?



I think it would also be much nicer to keep the optimized HTML escape in CGI, which is in stdlib, so it can be used by all template engines.



In #19090 I did not expect `rb_str_dup()` to be so costly on CRuby. I guess the allocation is slow, and of course CRuby can't escape-analyze it.
A new method in CGI sounds best, or, probably nicer, an optional argument controlling whether to always return a copy.
It seems tricky to change the existing method to return a mutable string as-is. I guess the person who reported #11858 actually ran into that incompatibility.
Ruby users seem to assume in general that if a core/stdlib method returns a string, either it's frozen or they are free to modify it, but here modifying it would actually also mutate the original String passed to `CGI.escapeHTML`.
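
The aliasing concern can be illustrated with a small sketch. `escape_html_sketch` and its `always_copy:` keyword are hypothetical names, not part of CGI or ERB. The sketch only models a helper that returns its argument as-is when nothing needs escaping, plus an optional argument that forces a copy.

```ruby
# Hypothetical helper for illustration only; not the CGI or ERB implementation.
SKETCH_ESCAPE_TABLE = {
  "'" => '&#39;', '&' => '&amp;', '"' => '&quot;', '<' => '&lt;', '>' => '&gt;'
}.freeze
SKETCH_ESCAPE_PATTERN = /['&"<>]/

def escape_html_sketch(str, always_copy: false)
  if SKETCH_ESCAPE_PATTERN.match?(str)
    # gsub always builds a new String, so the result never aliases the argument.
    str.gsub(SKETCH_ESCAPE_PATTERN, SKETCH_ESCAPE_TABLE)
  elsif always_copy
    str.dup
  else
    str # nothing to escape: the very same object is returned
  end
end

original = +'hello'
aliased = escape_html_sketch(original)
aliased << ' world'   # also changes `original`, since both names refer to one object

fresh_source = +'hello'
copied = escape_html_sketch(fresh_source, always_copy: true)
copied << ' world'    # `fresh_source` is left untouched
```

After running this, `original` reads `"hello world"` while `fresh_source` is still `"hello"`, which is the surprise a caller would hit if the existing method started returning its argument.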



`String#to_s` really sounds like something every Ruby JIT/VM should be able to trivially optimize.
So I think that should be an insignificant cost, even on CRuby (and if it isn't, it should be easy to fix).
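
For reference, a short check of what `String#to_s` does in plain Ruby. The `Label` subclass is a hypothetical name used only for this illustration.

```ruby
plain = 'hello'
same_object = plain.to_s.equal?(plain)        # true: String#to_s returns the receiver itself

class Label < String; end                      # hypothetical subclass, for illustration only
label = Label.new('hello')
subclass_same = label.to_s.equal?(label)       # false: a subclass instance is converted
converted_class = label.to_s.class             # String
```

So for an exact `String`, the call does no conversion work beyond the method dispatch itself.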



----------------------------------------

Feature #19102: Optimize ERB::Util.html_escape more than CGI.escapeHTML for template engines

https://bugs.ruby-lang.org/issues/19102#change-100275



* Author: k0kubun (Takashi Kokubun)

* Status: Closed

* Priority: Normal

----------------------------------------

## Proposal

Change the behavior of `ERB::Util.html_escape` in the following two parts:



1. Skip converting an argument with `#to_s` if the argument is already a `T_STRING`.
2. Do not allocate and return a new String when nothing needs to be escaped.
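
The two points can be modeled in plain Ruby. This is only a sketch of the proposed semantics, not the code from the linked pull request: `EscapeSketch` and its constants are hypothetical names, `is_a?(String)` stands in for the `T_STRING` check, and a `gsub` with a lookup table stands in for the real escaping routine.

```ruby
# Hypothetical pure-Ruby model of the proposed behavior, for illustration only.
module EscapeSketch
  TABLE = {
    "'" => '&#39;', '&' => '&amp;', '"' => '&quot;', '<' => '&lt;', '>' => '&gt;'
  }.freeze
  PATTERN = /['&"<>]/

  def self.html_escape(value)
    # (1) Only call #to_s when the argument is not already a String.
    str = value.is_a?(String) ? value : value.to_s
    # (2) Return the string itself, without allocating, when nothing needs escaping.
    return str unless PATTERN.match?(str)

    str.gsub(PATTERN, TABLE)
  end
end

s = 'hello world'
EscapeSketch.html_escape(s).equal?(s)      # true: no new String is allocated
EscapeSketch.html_escape('<a href="x">')   # "&lt;a href=&quot;x&quot;&gt;"
EscapeSketch.html_escape(42)               # "42"
```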



## Background

The current `ERB::Util.html_escape` is implemented as `CGI.escapeHTML(s.to_s)`. So the performance is almost equal to `CGI.escapeHTML` except for the `to_s` call. Because it's common to embed non-String expressions in template engines, a template engine typically calls `to_s` to convert non-String expressions to a String before escaping it, which is why the difference exists. Proposal (1) is useful for optimizing the case that something that's already a `String` is embedded. We ignore the extreme case that `String#to_s` is weirdly monkey-patched.



As to proposal (2), my original implementation of `CGI.escapeHTML` (https://github.com/ruby/ruby/pull/1164) was not calling `rb_str_dup` for that case. However, because [Bug #11858] claimed returning the argument object for non-escaped cases is a backward incompatibility with the old `gsub`-based implementation, we added the unneeded `rb_str_dup` call and the performance for that case has been compromised. This behavior is completely unnecessary for template engines. On the other hand, because `ERB::Util.html_escape` is a helper for ERB, we should not need to consider any backward compatibility that is not relevant to ERB or any template engines. So proposal (2) should be possible in `ERB::Util.html_escape` unlike `CGI.escapeHTML`.



## Benchmark

Implementation: https://github.com/ruby/erb/pull/27



```rb
require 'benchmark/ips'
require 'erb'

class << ERB::Util
  def html_escape_old(s)
    CGI.escapeHTML(s.to_s)
  end
end

Benchmark.ips do |x|
  s = 'hello world'
  x.report('before') { ERB::Util.html_escape_old(s) }
  x.report('after')  { ERB::Util.html_escape(s) }
  x.compare!
end
```



```
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]
Warming up --------------------------------------
              before     1.066M i/100ms
               after     1.879M i/100ms
Calculating -------------------------------------
              before     10.615M (± 0.3%) i/s -     53.320M in   5.023083s
               after     18.742M (± 0.4%) i/s -     93.929M in   5.011847s

Comparison:
               after: 18741747.6 i/s
              before: 10615137.1 i/s - 1.77x  (± 0.00) slower
```







--
https://bugs.ruby-lang.org/

 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/
