Ruby used functionally is great, but it takes an already time-inefficient langua...

mullsork · on March 21, 2018

Isn’t there something called LazyEnumerator that deals with this? In terms of chained .map at least

djur · on March 21, 2018

Lazy enumerators don't make each individual step of the iteration any more efficient. They just let you stop the iteration partway through without traversing the entire original enumerable. Given this contrived example:

   a = (1..1000).lazy.select(&:even?).map{|i| i*2 }.map{|i| i + 1 }.map(&:to_s).map(&:reverse)

There is no benefit from lazy if you do `a.to_a.join(",")`, since that iterates over the entire sequence. But if you do `a.first(10)` or `a.detect{|s| s == "14" }`, the chain will only be executed for each item in turn until 10 elements are produced or an element is the string "14", respectively.
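A rough way to see the early exit is to count how many source elements get pulled through the chain. This is only a sketch: the counting `map` step and the `visited` variable are added for illustration and are not part of the example above. It uses `first(10)` because `take` on a lazy enumerator is itself lazy, and `to_a` before `join` because `join` is an Array method.

```ruby
visited = 0

# Same chain as above, with a hypothetical counting step in front.
chain = (1..1000).lazy
  .map { |i| visited += 1; i }
  .select(&:even?)
  .map { |i| i * 2 }
  .map { |i| i + 1 }
  .map(&:to_s)
  .map(&:reverse)

# Stops as soon as 10 results exist (source elements 1..20).
first_ten = chain.first(10)
visited_for_first = visited

# Stops at the first match ("14" comes from i = 20).
visited = 0
found = chain.detect { |s| s == "14" }
visited_for_detect = visited

# Forces the whole sequence, so every source element is visited.
visited = 0
joined = chain.to_a.join(",")
visited_for_join = visited

puts "first(10): #{visited_for_first} visited"
puts "detect:    #{visited_for_detect} visited"
puts "join:      #{visited_for_join} visited"
```

Each call re-runs the chain from the start of the range, which is why the counter is reset in between.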

cutler · on March 21, 2018

Ruby's composability really puts Python to shame. I can't imagine how Ruby lost out to Python.

blunte · on March 21, 2018

While I much prefer writing Ruby to Python, Python is just dead simple to get real work done with. It's not elegant, but it's still so much less nasty than C++ or Java.

Ruby's #1 problem, for me at least, is that your small project someday runs into a performance wall. I don't know the latest benchmarks, but last I recall Python was about 8x faster than Ruby when both are being interpreted. Yes, there's JRuby, but that's not something you can drop into a new system and do useful things with immediately (without more setup).

And with Clojure, you get all the elegance and lovely collection manipulation tools you can possibly want, much faster performance, and a huge stable pile of Java libraries (compared to Ruby).

So with Python and Clojure as one's main tools, life is quite nice.

yxhuvud · on March 21, 2018

Say what? Ruby has been faster than Python for interpreted code for years now.

Granted, Python is faster for tasks like those Numpy solves, but in arbitrary execution performance, Python is just slower. 8x has NEVER been true. That must have been you doing some really bad stuff in one of them but not both.

blunte · on March 22, 2018

I stand corrected. I used Ruby for 6 years, up to 2015. I wasn't using Python during those years, but I thought I had heard or read about the 8x speed difference in favor of Python.

For sure, Ruby was slow for what we were doing... processing millions of rows of data was so slow that it caused me to decide to try Go (and the same Go program was hundreds of times faster). But I didn't try Python then.

jashmatthews · on March 21, 2018

I think there was a short period about 11 years ago before the release of Ruby 1.9 where the Python VM was basically re-written to be as fast as YARV, but YARV wasn't released yet.

jashmatthews · on March 21, 2018

MRI Ruby has been slightly faster than Python for about a decade and the next release of Ruby will contain a basic JIT compiler giving MRI Ruby a significant advantage over CPython.

cutler · on March 22, 2018

What's your evidence for Python ever having been 8 times faster than Ruby? Apart from the numeric libraries, written mostly in C or Fortran, Python has at best only ever gained a very marginal speed lead over Ruby and that has been wiped out in recent releases of MRI. Ruby's startup time is a bit slower than Python's but once they're off to the races both of these horses have been neck and neck for a long time.

bbatha · on March 21, 2018

There is absolutely a benefit to lazy if you're doing more than one combinator even if you ultimately need to iterate over the whole list such as the join case.

The lazy version basically turns into

   a = ""
   (1..1000).each { |i|
       next if i.odd?
       i *= 2
       i += 1
       a << i.to_s.reverse + ", " # ya ya, trailing comma
   }

Whereas the non-lazy version turns into:

    evens = []
    (1..1000).each { |i| evens << i if i.even? }
    doubled = []
    evens.each { |e| doubled << e * 2 }
    incremented = []
    doubled.each { |d| incremented << d + 1 }
    strs = []
    incremented.each { |i| strs << i.to_s }
    a = ""
    strs.each { |s| a << s.reverse + ", " }

Which means that you went from N iterations to 5*N iterations, with 5 intermediate arrays in this case.
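To make the difference in evaluation order visible, here is a small sketch that logs each step. The `trace` lambda and its log strings are made up for illustration, and the chain is shortened to three steps over `1..4` to keep the log readable.

```ruby
# Runs the same three-step chain on whatever enumerable it is given
# and records the order in which the blocks are called.
trace = lambda do |enum|
  log = []
  result = enum
    .select { |i| log << "select #{i}"; i.even? }
    .map { |i| log << "double #{i}"; i * 2 }
    .map { |i| log << "to_s #{i}"; i.to_s }
    .to_a
  [result, log]
end

eager_result, eager_log = trace.call(1..4)
lazy_result, lazy_log = trace.call((1..4).lazy)

puts "eager: #{eager_log.join(', ')}"
puts "lazy:  #{lazy_log.join(', ')}"
```

The eager log shows every `select` call before any `double` call, one full pass per step. The lazy log shows each element going through all steps before the next element is touched. Both produce the same result array.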

djur · on March 22, 2018

That's how it could work theoretically, but in practice lazy iterators are much less efficient due to their implementation. I benchmarked it here:

https://gist.github.com/mboeh/bb480d93c71046e23816f2c24c23e3...

When applied to the same amount of work, lazy iterators consume more memory and take more CPU time than the equivalent non-lazy chained iteration.

There's a place for lazy iterators, but you should pretty much always profile first. If you have performance problems due to a long chain of iterators, you're always going to get better results by merging some of those iterators into a single block than by making the whole chain lazy. The best use case for lazy iterators is if you're trying to avoid having the original collection in memory (if it's being streamed from a file or the like). And at that point, if you're trying to optimize for performance, you should avoid chained iterations, period.
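For anyone who wants to try this on their own machine, here is a minimal sketch using the stdlib `Benchmark` module. It is not the benchmark from the gist; the lambda names, the range size, and the choice of `each_with_object` for the merged version are all assumptions made for illustration, and timings will vary by Ruby version.

```ruby
require "benchmark"

range = (1..100_000) # arbitrary size, picked only for this sketch

# Eager chain: one intermediate array per step.
chained = lambda do |r|
  r.select(&:even?).map { |i| i * 2 }.map { |i| i + 1 }.map(&:to_s).map(&:reverse).join(",")
end

# Same chain, made lazy and forced at the end.
lazy_chained = lambda do |r|
  r.lazy.select(&:even?).map { |i| i * 2 }.map { |i| i + 1 }.map(&:to_s).map(&:reverse).to_a.join(",")
end

# All steps merged into a single block, one pass, one output array.
merged = lambda do |r|
  r.each_with_object([]) { |i, out| out << (i * 2 + 1).to_s.reverse if i.even? }.join(",")
end

{ "chained" => chained, "lazy" => lazy_chained, "merged" => merged }.each do |name, fn|
  seconds = Benchmark.realtime { fn.call(range) }
  puts format("%-8s %.4fs", name, seconds)
end
```

A single wall-clock run like this is noisy, so treat it as a starting point for profiling rather than a result.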

The good news is that in the vast majority of cases this just doesn't matter. Chaining iterators works fine until you start running into performance problems.

mullsork · on March 22, 2018

Thanks a lot for the insight!