0x0a: Some optimization hacks…

We are working on a fairly large Rails application, and one of the actions renders a table with up to 9,000 table elements. Somewhere in the rendering of that table we use a line like

some_array.collect(&:some_method)

and this line gets called several thousand times. As you can imagine, this turns out to be not super-fast. But to our astonishment we found that more than 20% of the processing time was used up creating several thousand Proc objects!

Caching Proc

That led to a first optimization idea: caching those Proc objects inside the Symbols. As you might know, &:some_method calls the to_proc method of the :some_method Symbol. Rails comes with an implementation which looks like this:

def to_proc
  Proc.new { |*args| args.shift.__send__(self, *args) }
end

As these Proc objects always represent the same block for each individual Symbol, they should be cacheable in the Symbol itself:

def to_proc
  @to_proc ||= Proc.new { |*args| args.shift.__send__(self, *args) }
end

This little change sped up rendering by 25%!
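The same idea can be tried without reopening Symbol. What follows is only a minimal sketch of a swapped-in variant, not the patch we tested: it keeps one Proc per Symbol in a Hash instead of in an instance variable (as far as we know, Symbols are frozen on current Ruby versions, so they cannot hold one). SYMBOL_PROCS and cached_proc are hypothetical names, and thread safety is not addressed.

```ruby
# Hypothetical cache, not part of Rails or Ruby: one Proc per Symbol.
# The default block runs only on the first lookup of each Symbol.
SYMBOL_PROCS = Hash.new do |cache, sym|
  cache[sym] = Proc.new { |*args| args.shift.__send__(sym, *args) }
end

# Hypothetical helper: returns the cached Proc for a Symbol.
def cached_proc(sym)
  SYMBOL_PROCS[sym]
end

words = %w[foo bar]

# Use the cached Proc where &:upcase would otherwise go.
upcased = words.collect(&cached_proc(:upcase))

# Repeated lookups hand back the very same Proc object.
same_object = cached_proc(:upcase).equal?(cached_proc(:upcase))
```

Whether this buys anything depends on the Ruby version: it only helps where Symbol#to_proc builds a new Proc on every call.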

Garbage uncollecting

We noticed some strange hiccups during rendering of that very same action. One specific partial usually rendered in about 5 milliseconds, but a few times it took about 200 milliseconds, without any dramatic change in the data to be rendered. After some time we came up with the idea that Ruby’s garbage collector was responsible.

Which makes sense: we create thousands of temporary objects that become eligible for garbage collection pretty soon. And as you cannot fine-tune when Ruby’s garbage collector kicks in without recompiling the Ruby interpreter, we tried instead to disable garbage collection while the response gets rendered:

...
around_filter do |controller, action|
  GC.disable
  begin
    action.call
  ensure
    GC.enable
    GC.start
  end
end
...

This still runs the garbage collector before the response is sent back to the client. But at least the garbage collector runs only once per request. This gave us a 20% speedup at the cost of somewhat higher memory consumption. Our tests showed that memory usage doesn’t grow over time, so garbage collection still seems to run fine.
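Outside of a controller, the same pattern can be sketched as a small helper. This is an illustration under assumptions, not code from our application: without_gc is a hypothetical name, and the block body merely stands in for rendering. It relies on GC.disable returning true when the collector was already off, so a nested call does not switch it back on too early.

```ruby
# Hypothetical helper (not a Rails or Ruby API): run a block with the
# garbage collector switched off, then collect once afterwards.
def without_gc
  already_disabled = GC.disable # true if GC was already disabled
  yield
ensure
  # Only re-enable and collect if this call was the one that disabled GC.
  unless already_disabled
    GC.enable
    GC.start
  end
end

rendered = without_gc do
  # Stand-in for rendering: allocate many short-lived strings.
  (1..10_000).map { |i| "<td>#{i}</td>" }.join
end
```

As with the filter above, the price is that all garbage created inside the block stays in memory until the block has finished.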

Note: What I would call proper behaviour of Rails would be to run the garbage collector only after the response is sent back to the client. And I would expect the code above without the GC.start code line to behave just like this. Our tests, however, showed a slowly but continuously rising memory consumption.

Is this ready for production?

While our tests are running perfectly with those hacks, we considered that those code snippets modify the core behaviour of the Rails framework. And because a live system has different access characteristics than a test environment we decided against using those hacks in production mode.

To prevent Symbol#to_proc from being called that often, we explicitly write out a block in that one place. And when we evaluated Ruby Enterprise Edition we found that its memory allocator deals much better with our memory usage characteristics. With those two changes the performance advantages of our hacks are not so huge anymore. So we are fine for the moment.
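For illustration, this is roughly what writing out the block looks like. Row and label are hypothetical stand-ins for our real objects and for some_method.

```ruby
# Hypothetical stand-in for the objects rendered in the table.
Row = Struct.new(:label)
rows = [Row.new("a"), Row.new("b")]

# Shorthand form: goes through Symbol#to_proc.
via_symbol = rows.collect(&:label)

# Explicit block: no Symbol#to_proc call involved.
via_block = rows.collect { |row| row.label }
```

Both forms return the same array; the explicit block just avoids the to_proc call.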

But still: would you consider such hacks production safe? How would *you* test for that?


3 responses to “0x0a: Some optimization hacks…”

  1. This reply is so hilariously late and moot, but fwiw, if you wanted to run GC _after_ the response had been sent, you can now use rack middleware. Thanks for a nice read.

  2. alex, thanks for the comment. Yes, rack was one piece of a game changer in the web-on-ruby-world.

    It would probably be interesting to revisit those issues once more. If I remember correctly, the Rails crowd started using Symbol#to_proc massively, which probably explains why there was no proper native implementation back then. But I don’t know if that is still the case; if it is, someone should build one instantly ;)

    The MRI GC has become much better in the last years too, so the other hack is probably even more unnecessary. Rereading it, that hack would not even be a good idea with async web servers (thin), as it would hurt parallel requests massively. Back then we used to run a bunch of mongrel servers: one stalling during GC would not affect the rest.

    Today, with Rack and all, that hack doesn’t even work anymore, because it would run the GC _before_ Rails returns its response to the web server. To run the GC _after_ the response was written back to the client one would have to hack into the server.

    You mentioned a rack middleware which does that; I didn’t find any. Can you point me in the direction? (Not that I would use it, see above.)

    BTW: this blog has moved, and is now at radiospiel.org.

  3. BTW: Just verified that a) Symbol#to_proc is native on 1.9.3 at least, b) it caches the procs, and c) marks them as used so they will never be GCed at all. Nice.
