0x23 – Study: Ruby coders are more attractive than PHP coders.

Well, that is of course total bullshit. With that title I will conduct a small experiment.

I found that my least informative blog posts attract the most readers. So if this holds true for this post (and a title like that should attract at least some attention) I will add more noise here. Promise!

0x22 – Your daily class variables and constants surprise

It is not Christmas, hence no quiz, but this was strangely surprising:

class A
  @@t = "A::@@t"
  T="A::T"

  def self.s1; @@t; end
  def self.s2; T; end
end

class B < A
  def self.s1; @@t; end
  def self.s2; T; end
end

class C < A
  @@t = "C::@@t"
  T="C::T"
  def self.s1; @@t; end
  def self.s2; T; end
end

[ A.s1, A.s2, B.s1, B.s2, C.s1, C.s2 ]

gives you

["C::@@t", "A::T", "C::@@t", "A::T", "C::@@t", "C::T"]
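One way to see where the surprise comes from is to ask each class what it actually owns. The following is only a sketch; the class names Base and Derived are made up for illustration and mirror A and C above. Assigning a class variable in a subclass appears to reuse the variable already found in the superclass, while assigning a constant creates a new one in the subclass.

```ruby
class Base
  @@t = "Base::@@t"
  T = "Base::T"
end

class Derived < Base
  @@t = "Derived::@@t"   # finds @@t in Base and overwrites it
  T = "Derived::T"       # defines a new constant; Base::T is untouched
end

# Passing false restricts the listing to the receiver itself.
base_vars      = Base.class_variables(false)      # [:@@t]
derived_vars   = Derived.class_variables(false)   # []
derived_consts = Derived.constants(false)         # [:T]

base_value = Base.class_variable_get(:@@t)        # "Derived::@@t"
base_const = Base::T                              # "Base::T"
```

So there is only one @@t in the whole hierarchy, and the last assignment wins, whereas each class can carry its own T.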

0x21 – Array#to_proc


class Array
  def to_proc
    proc do |obj|
      self.map { |sym| obj.send(sym) } 
    end
  end 
end

lets you write


Account.all.map(&%w(id email))
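To try this without a database, here is a small self-contained sketch. The Account struct and its sample data are stand-ins invented for illustration; they are not the ActiveRecord model used above.

```ruby
class Array
  # Turn a list of method names into a proc that calls each of them
  # on its argument and collects the results.
  def to_proc
    proc { |obj| map { |sym| obj.send(sym) } }
  end
end

# Hypothetical stand-in for an ActiveRecord model.
Account = Struct.new(:id, :email)

accounts = [
  Account.new(1, "a@example.com"),
  Account.new(2, "b@example.com")
]

rows = accounts.map(&%w(id email))
# rows is [[1, "a@example.com"], [2, "b@example.com"]]
```

Each element comes back as an array holding the requested attributes in the order they were named.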

0x20 – Sphinx/ThinkingSphinx vs Xapian/ActsAsXapian shootout

This benchmark compares thinking_sphinx with acts_as_xapian. We need a search engine that gives us the IDs of matching documents from a fulltext index, basic text search only.

Data

  • one table with 200k entries with 5k of text (avg) in one column
  • one table with 500k entries with 7k of text (avg) in 6 columns
  • one table with 500k entries with 7k of text (avg) in 4 columns

Indexing

Initial indexing took 10 mins with thinking_sphinx and 75 mins(!!) with acts_as_xapian.

Search performance

The search performance on queries that return only a few items is nearly identical.

The search performance on queries that return many items (~10000) is nearly
identical; 90% of the time is spent in ActiveRecord.

In our case (we only need IDs and not the entire documents) sphinx runs
at 0.6 secs for a particular query (with 10000 results),
where acts_as_xapian needs 4.5 secs. This is because thinking_sphinx allows
you to fetch only the IDs, whereas acts_as_xapian insists on pulling the
models from the database. When patching acts_as_xapian to allow for pulling
IDs only, we land at 0.6 secs (sphinx) vs 0.4 secs (patched acts_as_xapian).

Results

We will choose sphinx because

  • it is similarly fast to xapian
  • it runs over the network by default
  • indexing is way faster (I guess because acts_as_xapian pulls all data to be indexed from the database to hand it over, while sphinx can do that itself)
  • acts_as_xapian would need to be patched for performance reasons.

And here is some food for our beloved web spiders

CouchDB is here to stay!

While some of my five or so readers per day might already know, for everyone else: CouchDB will be part of the next Ubuntu release. And that one comes with long-term support.

Congratulations, Couchies.

F# – not on German keyboards

Whoever designed F# certainly did not have a German programmer in mind. Or how would you explain this:


(* Print even fibs *)
[1 .. 10]
|> List.map     fib
|> List.filter  (fun n -> (n mod 2) = 0)
|> printlist

The pipe on a German Mac keyboard is <Alt> + <7>, which in itself is ok; but directly followed by a ‘>’, which is <Shift> + <the key right next to <Shift>>, it is an absolute nightmare for anyone with delicate bones in the left wrist. RSS: you nearly got me, but I still prefer other functional languages.

0x1f – There are no nested functions in Ruby!

Long time no read, I know. Well, I have been away.

Anyways, while I am not back yet, I stumbled across something that made me wonder. Consider this:

class X
  def self.a
    def self.b; "b"; end
    b()
  end

  def self.x
    b
  end
end

The Ruby feature which allows you to define a function within another function is relatively new to me. Since I found out about it I have used it to split a function into several parts without publishing those parts in any namespace accessible from the outside, i.e. the parts should be accessible only from within the method.

Turns out that I was wrong. Apparently a “not-really inner function” is defined at whatever outer level exists (hence the need for the “self” part in “def self.b; …; end” in the example above). Once “a” has been called, the method “b” is defined on the X class object, i.e. as if it was written

class X
  def self.b; "b"; end
  def self.a; b(); end
end


Seems I will stop using that idiom.
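For the record, here is a small sketch of the behaviour and of one possible replacement. The class Y and the method local_helper are made up for illustration. The inner def only takes effect once the outer method has run, and from then on the "inner" method is reachable from anywhere. A lambda held in a local variable, on the other hand, stays inside the method.

```ruby
class Y
  def self.a
    def self.b; "b"; end   # executed only when Y.a runs
    b
  end

  def self.local_helper
    part = lambda { "b" }  # a local variable, not a method
    part.call
  end
end

before = Y.respond_to?(:b)       # false: the inner def has not run yet
Y.a
after = Y.respond_to?(:b)        # true: b is now an ordinary class method

via_lambda = Y.local_helper      # "b"
leaked = Y.respond_to?(:part)    # false: nothing was defined on Y
```

The lambda costs a ".call" at each use, but it does what I originally expected from the nested def.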