[#27380] [Bug #2553] Fix pthreads slowness by eliminating unnecessary sigprocmask calls — Dan Peterson <redmine@...>

Bug #2553: Fix pthreads slowness by eliminating unnecessary sigprocmask calls

21 messages 2010/01/03

[#27437] [Feature #2561] 1.8.7 Patch reduces time cost of Rational operations by 50%. — Kurt Stephens <redmine@...>

Feature #2561: 1.8.7 Patch reduces time cost of Rational operations by 50%.

9 messages 2010/01/06

[#27447] [Bug #2564] [patch] re-initialize timer_thread_{lock,cond} after fork — Aliaksey Kandratsenka <redmine@...>

Bug #2564: [patch] re-initialize timer_thread_{lock,cond} after fork

18 messages 2010/01/06

[#27545] [Feature #2594] 1.8.7 Patch: Reduce time spent in gc.c is_pointer_to_heap(). — Kurt Stephens <redmine@...>

Feature #2594: 1.8.7 Patch: Reduce time spent in gc.c is_pointer_to_heap().

8 messages 2010/01/11

[#27635] [Bug #2619] Proposed method: Process.fork_supported? — Hongli Lai <redmine@...>

Bug #2619: Proposed method: Process.fork_supported?

45 messages 2010/01/20
[#27643] [Feature #2619] Proposed method: Process.fork_supported? — Luis Lavena <redmine@...> 2010/01/21

Issue #2619 has been updated by Luis Lavena.

[#27678] Re: [Feature #2619] Proposed method: Process.fork_supported? — Yukihiro Matsumoto <matz@...> 2010/01/22

Hi,

[#27684] Re: [Feature #2619] Proposed method: Process.fork_supported? — Charles Oliver Nutter <headius@...> 2010/01/22

On Thu, Jan 21, 2010 at 11:27 PM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

[#27708] Re: [Feature #2619] Proposed method: Process.fork_supported? — Yukihiro Matsumoto <matz@...> 2010/01/22

Hi,

[#27646] Re: [Bug #2619] Proposed method: Process.fork_supported? — Tanaka Akira <akr@...> 2010/01/21

2010/1/21 Hongli Lai <redmine@ruby-lang.org>:

[#27652] Re: [Bug #2619] Proposed method: Process.fork_supported? — Hongli Lai <hongli@...99.net> 2010/01/21

On 1/21/10 5:20 AM, Tanaka Akira wrote:

[#27653] Re: [Bug #2619] Proposed method: Process.fork_supported? — Tanaka Akira <akr@...> 2010/01/21

2010/1/21 Hongli Lai <hongli@plan99.net>:

[#27662] Re: [Bug #2619] Proposed method: Process.fork_supported? — Vladimir Sizikov <vsizikov@...> 2010/01/21

On Thu, Jan 21, 2010 at 10:53 AM, Tanaka Akira <akr@fsij.org> wrote:

[#27698] [Bug #2629] ConditionVariable#wait(mutex, timeout) should return whether the condition was signalled, not the waited time — Hongli Lai <redmine@...>

Bug #2629: ConditionVariable#wait(mutex, timeout) should return whether the condition was signalled, not the waited time

8 messages 2010/01/22

[#27722] [Feature #2635] Unbundle rdoc — Yui NARUSE <redmine@...>

Feature #2635: Unbundle rdoc

14 messages 2010/01/23

[#27757] [Bug #2638] ruby-1.9.1-p37[68] build on aix5.3 with gcc-4.2 failed to run for me because it ignores where libgcc is located. — Joel Soete <redmine@...>

Bug #2638: ruby-1.9.1-p37[68] build on aix5.3 with gcc-4.2 failed to run for me because it ignores where libgcc is located.

10 messages 2010/01/24

[#27778] [Bug #2641] Seg fault running miniruby during ruby build on Haiku — Alexander von Gluck <redmine@...>

Bug #2641: Seg fault running miniruby during ruby build on Haiku

10 messages 2010/01/25

[#27791] [Bug #2644] memory over-allocation with regexp — Greg Hazel <redmine@...>

Bug #2644: memory over-allocation with regexp

12 messages 2010/01/25

[#27794] [Bug #2647] Lack of testing for String#split — Hugh Sasse <redmine@...>

Bug #2647: Lack of testing for String#split

14 messages 2010/01/25

[#27912] [Bug #2669] mkmf find_executable doesn't find .bat files — Roger Pack <redmine@...>

Bug #2669: mkmf find_executable doesn't find .bat files

11 messages 2010/01/27

[#27930] [Bug:trunk] some behavior changes of lib/csv.rb between 1.8 and 1.9 — Yusuke ENDOH <mame@...>

Hi jeg2, or anyone who knows the implementation of FasterCSV,

15 messages 2010/01/28
[#27931] Re: [Bug:trunk] some behavior changes of lib/csv.rb between 1.8 and 1.9 — James Edward Gray II <james@...> 2010/01/28

On Jan 28, 2010, at 10:51 AM, Yusuke ENDOH wrote:

[ruby-core:27712] Re: [Feature #2594] 1.8.7 Patch: Reduce time spent in gc.c is_pointer_to_heap().

From: Kurt Stephens <ks@...>
Date: 2010-01-22 23:39:04 UTC
List: ruby-core #27712
Roger Pack wrote:
>> I heard from the REE guys that 1.9 heaps do not grow exponentially, but remain fixed.  In that case a linear O(1) probe function could help find either the min or max index, before doing a binary search.
> 

Disclaimer: I probably need to read more of 1.9 gc.c.

> They do, though it allocates exponentially more of them as it needs to :)
> 
> i.e. first time it allocates 16K, second time 16K*2, third time 16K*4, etc.

Why allocate exponentially at the risk of never returning anything back 
to the OS?

If a program has a stable memory profile at 16K*16, but it allocates
just *one* more object, it just stole 16K*32 from the OS with almost no 
possibility of ever returning it -- just for one object.  Now multiply 
that by 10 processes, all with the same work profile.

> 
> The hope is that smaller heap chunks can be freed more readily, so it
> has a fixed size.
> 

A single live object in a chunk would prevent returning the chunk back 
to the malloc() pool.  It's rare that malloc() returns memory back to 
the OS anyway, unless the allocation profile is LIFO-like (i.e. last 
malloc()'ed, first free()'ed).  Since we share the same malloc() pool 
with other C code (and with resizing String and Array buffers), 
returning memory is very unlikely: malloc() suffers from the same 
restriction -- it cannot return partially used pages it has chunked up 
back to the OS.

I doubt that we ever get lucky enough, because
MRI gc.c is not a copying collector and other code is competing for 
malloc()s.  Has anybody instrumented
free_unused_heaps(rb_objspace_t *objspace) to see if it ever actually
finds anything to free() in a real program?

Those unused pages of the 16k*32 have all been dirtied by chunking them 
into the free list.  Depending on the OS, allocated virtual memory may 
not be mapped to real (or swap) memory until it's been dirtied.

There is no benefit to allocating *anything* exponentially.  In fact 
it's likely that the allocation causes longer pauses over time, because 
of the chunking.   Any exponential allocation scheme, including 
"power-of-2" malloc() implementations, wastes up to about 50% of its 
memory: a request just past a power of two receives nearly double what 
it asked for.

> How would a probe function work with tons of 16K chunks, though?

If they are semi-contiguous, a probe would probably get close since the 
chunks are all of the same size.  If we allocated heap_slots via mmap() 
in our own "zone", we could avoid internal fragmentation of heap_slots 
by other malloc() calls.

FRAGMENTATION:

If there is a wide range of sizes in the RVALUE.as union, we would do 
better by allocating heap_slots by size of the objects and segregating 
free lists.

(gdb) p sizeof(struct RBasic)
$1 = 8
(gdb) p sizeof(struct RObject)
$2 = 20
(gdb) p sizeof(struct RClass)
$3 = 20
(gdb) p sizeof(struct RFloat)
$4 = 16
(gdb) p sizeof(struct RString)
$5 = 20
(gdb) p sizeof(struct RArray)
$6 = 20
(gdb) p sizeof(struct RRegexp)
$7 = 20
(gdb) p sizeof(struct RHash)
$8 = 20
(gdb) p sizeof(struct RData)
$9 = 20
(gdb) p sizeof(struct RTypedData)
$10 = 20
(gdb) p sizeof(struct RStruct)
$11 = 20
(gdb) p sizeof(struct RBignum)
$12 = 20

Thus each RFloat suffers from 25% internal fragmentation.
By segregating chunks by size, we could inline some of the auxiliary 
structures and further reduce external fragmentation caused by 
malloc()ing them:

Change:

struct RHash {
     struct RBasic basic;
     struct st_table *ntbl;      /* possibly 0 */
     int iter_lev;
     VALUE ifnone;
};

TO:

struct RHash {
     struct RBasic basic;
     struct st_table ntbl;
     int iter_lev;
     VALUE ifnone;
};

to avoid malloc()ing an st_table for each RHash.

"sizeof" allocators have advantages over "power-of-2" allocators:
* less internal fragmentation,
* better locality of types (if "size" is an attribute of "type"),
* can be tuned to behave like "power-of-2" allocators above a certain 
threshold.

Some Lisp systems segregate allocations by type, which localizes consed 
lists, improving memory cache hits during linked list processing.

I've written a "sizeof/type"-based conservative collector, but I'd 
rather invest time on getting the BDW collector working under MRI:

* it's thread-safe,
* it's been in use for more than 22 years,
* it works on almost every platform,
* in almost any program (except in MRI!)
* and is embedded in more hardware than we can probably guess.

I think BDW supports "sizeof" allocation through:

    GC_API void *GC_malloc_explicitly_typed(size_t size_in_bytes, GC_descr d);

The biggest impediment to hooking up a different conservative GC
is ObjectSpace.each_object.  The sooner we deprecate that feature,
the sooner we can separate MRI from its allocator.

> -r
> 

- KAS
