[ruby-core:74733] [Ruby trunk Feature#12222] Introducing basic statistics methods for Enumerable (and optimized implementation for Array)
From: duerst@...
Date: 2016-03-31 05:23:46 UTC
List: ruby-core #74733
Issue #12222 has been updated by Martin Dürst.
Benoit Daloze wrote:
> It seems to me Enumerable is not the right place for this, because it expects more than just #each.
The code is currently written in terms of #length and #[], but this can easily be fixed to use #each.
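A minimal sketch of what that change might look like: the same single-pass recurrence as the Array#variance in the issue description, but driven only by #each. Defining it directly on Enumerable under the name `variance` is an assumption for illustration, not the patch under discussion, and the `Readings` class is a hypothetical collection that provides nothing but #each.

```ruby
module Enumerable
  # Sample variance in one pass, using only #each (no #length or #[]).
  # Hypothetical method name; same update rule as the Array version.
  def variance
    n = 0
    m1 = 0.0 # running mean
    m2 = 0.0 # running sum of squared deviations from the mean
    each do |x|
      n += 1
      delta = x - m1
      m1 += delta / n
      m2 += delta * (x - m1)
    end
    return Float::NAN if n < 2
    m2 / (n - 1)
  end
end

# Hypothetical collection that exposes only #each.
class Readings
  include Enumerable

  def initialize(values)
    @values = values
  end

  def each(&block)
    @values.each(&block)
  end
end

p Readings.new([1, 2, 3, 4, 5]).variance #=> 2.5
p (1..5).variance                        #=> 2.5
```

The element count is accumulated inside the loop, so the method also works for collections that cannot report their size up front.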
> Also, these methods are likely useful only for numeric collections.
Then just don't use them on other collections :-).
> Maybe a "Statistics" module at a stdlib?
> Statistics.mean/variance/etc(enum) would be a nicer API than mixing everything in Enumerable IMHO.
Why? I don't see much potential for conflicts. Or does anybody have any mean (as opposed to nice) collections?
Also, as far as I understand, a bigger API doesn't really slow anything down.
I would definitely see providing these (and more) statistical methods for Ruby as a big plus.
----------------------------------------
Feature #12222: Introducing basic statistics methods for Enumerable (and optimized implementation for Array)
https://bugs.ruby-lang.org/issues/12222#change-57868
* Author: Kenta Murata
* Status: Assigned
* Priority: Normal
* Assignee: Yukihiro Matsumoto
----------------------------------------
Since Python has provided a statistics library for calculating the mean, variance, etc. of arrays and iterators since version 3.4,
I would like to propose introducing such features for the built-in Enumerable, together with an optimized implementation for Array.
In particular, I want to provide Enumerable#mean and Enumerable#variance as built-in features, because they should be implemented with precision-compensated algorithms.
The following example shows that the simple variance algorithm cannot yield a standard deviation for some arrays, because it produces a negative variance.
```ruby
class Array
  # Kahan summation
  def sum
    s = 0.0
    c = 0.0 # running compensation for lost low-order bits
    n = self.length
    i = 0
    while i < n
      y = self[i] - c
      t = s + y
      c = (t - s) - y
      s = t
      i += 1
    end
    s
  end

  # precision compensated algorithm
  def variance
    n = self.length
    return Float::NAN if n < 2
    m1 = 0.0 # running mean
    m2 = 0.0 # running sum of squared deviations from the mean
    i = 0
    while i < n
      x = self[i]
      delta = x - m1
      m1 += delta / (i + 1)
      m2 += delta * (x - m1)
      i += 1
    end
    m2 / (n - 1)
  end
end

ary = [ 1.0000000081806004, 1.0000000009124625, 1.0000000099201818, 1.0000000061821668, 1.0000000042644555 ]

# simple variance algorithm
a = ary.map {|x| x ** 2 }.sum
b = ary.sum ** 2 / ary.length
p((a - b) / (ary.length - 1)) #=> -2.220446049250313e-16

# precision compensated algorithm
p ary.variance #=> 1.2248208046392579e-17
```
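To make the consequence concrete, here is a small sketch of what happens when each of the two variances reported above is turned into a standard deviation. The `stddev` helper is a hypothetical name introduced only for this illustration; `Math.sqrt` raising `Math::DomainError` for a negative argument is standard Ruby behavior.

```ruby
# Hypothetical helper: standard deviation from a variance,
# or nil when the variance is negative and has no real square root.
def stddev(variance)
  Math.sqrt(variance)
rescue Math::DomainError
  nil
end

naive_variance       = -2.220446049250313e-16  # simple algorithm, as reported above
compensated_variance = 1.2248208046392579e-17  # compensated algorithm, as reported above

p stddev(naive_variance)       #=> nil
p stddev(compensated_variance) # a small positive Float
```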
I think a precision-compensated algorithm is too complicated to expect users to implement themselves.
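The proposal also names Enumerable#mean. As a rough sketch only, the Kahan summation from the example could be reused over #each as follows; the method name `mean` on Enumerable and the choice to return NaN for an empty collection are assumptions made here, not part of the issue.

```ruby
module Enumerable
  # Arithmetic mean using Kahan (compensated) summation over #each.
  # Hypothetical method; returns NaN for an empty collection (an assumption).
  def mean
    n = 0
    s = 0.0 # running sum
    c = 0.0 # running compensation for lost low-order bits
    each do |x|
      y = x - c
      t = s + y
      c = (t - s) - y
      s = t
      n += 1
    end
    return Float::NAN if n.zero?
    s / n
  end
end

p [1.0, 2.0, 3.0].mean #=> 2.0
```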
--
https://bugs.ruby-lang.org/