[PATCH] Clarify String#scan Documentation (was Re: [ ruby-Bugs-3329 ] String#scan loops forefever if scanned string is modified inside block.)

From: Paul Duncan <pabs@...>
Date: 2006-01-26 16:58:12 UTC
List: ruby-core #7220
* noreply@rubyforge.org (noreply@rubyforge.org) wrote:
> Summary: String#scan loops forefever if scanned string is modified inside block.

The subject doesn't really reflect what's actually happening.

> Initial Comment:
> ruby 1.8.4 (2005-12-24)
> 
> Following code loops infinitely:
> 
> a = " 12345678 "; a.scan(/\d/) {|s| a[3,2]='test';  s} 

I'm not convinced this is a bug per se.  At least not any more than
"loop { }" is.  What's actually happening is easier to demonstrate than
explain, so here goes (I'm using the caret as the position indicator).

  " 12345678 "
   ^ #=> no match
  " 12345678 "
    ^ #=> match, a = " 12test5678 "
  " 12test5678 "
     ^ #=> match, a = " 12testst5678 "
  " 12testst5678 "
      ^ #=> no match
    ... (snipped several irrelevant steps)
  " 12testst5678 "
           ^ #=> no match
  " 12testst5678 "
            ^ #=> match, a = " 12teststst5678 "  <-- eek!
  " 12teststst5678 "  
             ^ #=> no match
  " 12teststst5678 "  
              ^ #=> match, a = " 12testststst5678 "
  " 12testststst5678 "
               ^ #=> no match
  " 12testststst5678 "
                ^ #=> match, a = " 12teststststst5678 "
  (and so on, ad infinitum)
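
The trace above can be reproduced with a rough model of the scan loop.  This
is only a sketch, not Ruby's actual C implementation: naive_scan and its
max_matches cap are hypothetical names added for illustration, and the cap
exists only so the example terminates.

```ruby
# Rough model of a scan loop that re-reads the receiver on every step.
# naive_scan and max_matches are hypothetical, for illustration only.
def naive_scan(str, pattern, max_matches)
  pos = 0
  matches = []
  while matches.size < max_matches && (m = pattern.match(str, pos))
    matches << m[0]
    yield m[0]
    # Resume after the end of the match, but index into the *current*
    # contents of str, which the block may have changed.
    pos = [m.end(0), m.begin(0) + 1].max
  end
  matches
end

a = " 12345678 ".dup
seen = naive_scan(a, /\d/, 5) { |s| a[3, 2] = 'test' }
# Every insertion pushes the "5" two characters to the right, so the
# scanner keeps finding it again just ahead of the resume position.
```

Without the cap, the loop never reaches the end of the string, because the
string grows by two characters for every match consumed.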

What honestly bothers me about this behavior is the converse: making the
receiver _smaller_ can cause the scanner to actually _miss_ matches,
like so:

  a, strs = '    abcdef', []
  a.scan(/[\w]/) { |s| a[0, 1] = ''; strs << s }
  strs #=> ['a', 'c', 'e'] 

Most people would expect ['a', 'b', 'c', 'd', 'e', 'f'] there.  This could be
"fixed" in a couple of ways:

* Raise an exception if the receiver is modified during a scan (I don't
  really like this option).
* Attempt to hack offset adjustment into string modification.  The
  functions in question are rb_str_splice() and rb_str_aref(), although
  I haven't investigated fully, so there may be other methods as well.
  This is really my least-favorite option, because it doesn't handle the
  case where someone modifies the receiver while keeping the length the
  same.
* Leave things as they are and add a big warning to the String#scan
  documentation.  Personally, I prefer this option.
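
For code that really does need to change the string while walking its
matches, one workaround (a sketch, not something the patch itself does) is to
scan a copy, so that edits to the original cannot shift the scan position:

```ruby
a = "    abcdef".dup
strs = []

# Scan a snapshot of the string; the block edits only the original.
snapshot = a.dup
snapshot.scan(/\w/) { |s| a[0, 1] = ''; strs << s }

strs #=> ["a", "b", "c", "d", "e", "f"]
```

The cost is one extra copy of the string, which is usually cheaper than
debugging skipped or repeated matches.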

Anyway, attached is a patch that adds a brief note to String#scan.  The
patch is against 1.8.4, but it applies cleanly to HEAD as well.

-- 
Paul Duncan <pabs@pablotron.org>        OpenPGP Key ID: 0x82C29562
http://www.pablotron.org/               http://www.paulduncan.org/

Attachments (2)

ruby-1.8.4-str_scan_warning.diff (504 Bytes, text/x-diff)
diff -ur ruby-1.8.4/string.c ruby-1.8.4-string_doc/string.c
--- ruby-1.8.4/string.c	2005-10-27 04:19:20.000000000 -0400
+++ ruby-1.8.4-string_doc/string.c	2006-01-26 11:52:03.000000000 -0500
@@ -4240,6 +4240,11 @@
  *     
  *     <<cruel>> <<world>>
  *     rceu lowlr
+ *     
+ *  <em>Note:</em> You probably don't want to modify the receiver string
+ *  inside the block.  Ruby will let you do it, but the result probably
+ *  won't be what you expect or what you want.
+ *     
  */
 
 static VALUE