Ruby Performance

At ThoughtWorks, I work as a developer consultant, which is an airy way of saying I bounce around to many clients, in many tech stacks.

Right now, I’m working in Ruby, which is a language I have many Feelings™ about.

On the one hand, I love the low barrier separating your ideas from your implementation, which feels uniquely Ruby. Truly, I love that Ruby flow where charming near-perfect English flows from your fingers and becomes working code. Amazing! So nice! So poignant!

On the other, I insist: Ruby is lies. The quality of the dev experience is brought to you by super expensive sugar everywhere and veritable TONS of code that looks innocent enough but will surprise you in sad ways.

For example, consider this sad surprise from Alexander Dymo's wonderful Ruby Performance Optimization:

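require "benchmark"

num_rows = 100_000
num_cols = 10
# 100k rows of 10 cells, each cell a 1,000-character string (roughly 1 GB of data)
data = Array.new(num_rows) { Array.new(num_cols) { "x" * 1000 } }

# GC.disable  # uncomment to time the same work with garbage collection off
time = Benchmark.realtime do
  csv = data.map { |row| row.join(",") }.join("\n")
end
puts time.round(2)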

OK, making a 100k x 10 csv. How does it do?

Results

And here’s what I got as well:

Ruby version        2.0    2.1    2.2    2.3    2.4
GC enabled (s)      7.32   1.89   1.91   1.84   1.86
GC disabled (s)     1.16   1.06   1.33   1.06   1.12
% time in GC        84%    43%    30%    42%    40%

DO YOU SEE THAT?! Even in later versions*, Ruby is spending a huge chunk of its time in the garbage collector! Yeesh!
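
If you'd rather not run everything twice, once with GC on and once with it off, here is a minimal sketch of estimating the GC share in a single run. It uses Ruby's built-in GC::Profiler; the helper name time_with_gc is made up for illustration, and the dataset is shrunk from 100k rows so it finishes quickly. Treat the output as a rough guide, since the profiler adds some overhead of its own.

```ruby
require "benchmark"

# Hypothetical helper: returns [wall_seconds, gc_seconds] for the given block.
def time_with_gc
  GC::Profiler.enable
  GC::Profiler.clear                      # drop any earlier profiling data
  wall = Benchmark.realtime { yield }     # wall-clock time for the block
  gc = GC::Profiler.total_time            # seconds the profiler attributes to GC
  [wall, gc]
ensure
  GC::Profiler.disable
end

# Assumption: 1,000 rows instead of 100,000 to keep the run short.
data = Array.new(1_000) { Array.new(10) { "x" * 1000 } }

wall, gc = time_with_gc { data.map { |row| row.join(",") }.join("\n") }
share = wall > 0 ? (gc / wall * 100).round : 0
puts "wall: #{wall.round(3)}s, GC: #{gc.round(3)}s (~#{share}%)"
```

On a dataset this small the GC may barely run at all, so bump the row count back up if you want to see the effect from the table.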

Garbage

Well then.

Well then…there is hope, and it stems from always considering two things in turn. First, your code must be clean. Reducing complexity and bringing down the execution time is crucial, and within your control. Second, but perhaps more importantly, you have to deliberate over the memory impact of your choices. As we see above, memory consumption and garbage collection make Ruby slow, before/below/beyond any single feature you implement.

One can make a dent, and I really love the questions Dymo uses to capture a process for making realistic improvements. I record his great must-ask questions here for posterity:

  1. Is Ruby the right tool to solve my problem?

    There are things that Ruby is not so good at. The prime example is large dataset processing.

  2. How much memory will my code use?

    The less memory your code uses, the less work Ruby GC has to do.

  3. What is the raw performance of this code?

    Once you’re sure the memory is used optimally, take a look at the algorithmic complexity of the code itself.
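
To make the second question concrete, here is a rough sketch of building the same CSV text while allocating less along the way. Instead of creating one joined string per row and then joining those again, it appends every cell into a single string. The method name build_csv is hypothetical, the sample data is tiny on purpose, and this is one possible approach rather than a definitive fix; measure it against your own data before trusting it.

```ruby
# Hypothetical helper: produces the same text as
#   data.map { |row| row.join(",") }.join("\n")
# but appends into one string instead of allocating a joined string per row.
def build_csv(data)
  csv = "".dup        # mutable result buffer
  comma = ","         # hoisted so the loop does not allocate a new literal each time
  newline = "\n"
  last_row = data.length - 1

  data.each_with_index do |row, i|
    last_col = row.length - 1
    row.each_with_index do |cell, j|
      csv << cell
      csv << comma unless j == last_col
    end
    csv << newline unless i == last_row
  end

  csv
end

# Tiny sample so the output is easy to eyeball.
data = Array.new(3) { |i| Array.new(2) { |j| "r#{i}c#{j}" } }
puts build_csv(data)
```

The trade is readability for memory: the one-liner is prettier, but for a 100k-row dataset it creates 100,000 intermediate row strings that the GC then has to clean up.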

Lovely. So many feelings.

* Ruby was only designed with performance in mind starting around 2.0, which made the garbage collector copy-on-write friendly. The big garbage collection improvements came in 2.1 (generational GC) and 2.2 (incremental GC).

12/9/2017
