Thursday, September 20, 2007

Some Thoughts about Ruby, Watir, IronPython, WatiN and Unicode

We had some trouble with Ruby and some Unicode testing lately @work. While Watir, the Ruby-based library for IE test automation, is still a great tool in my opinion, Ruby became a real pain :-( My decision to use Watir for the project in question also made the decision for Ruby (so much for the simple logic ;-)). At first I really enjoyed the great flexibility Ruby gave me while implementing our test framework, but Ruby had some serious (and expensive) downsides as well:

  • Ruby may be very popular and well known among web developers because of Rails, but it is rather unknown among test engineers. Most of my teammates on that project first had to learn Ruby and its (sometimes weird, sometimes beautiful) syntax
  • There may be many web-related Ruby libs around (again because of Rails), but in other areas the library support sometimes seems very immature
  • Unicode support sucks in Ruby (maybe I underestimated how important Unicode really was for our project and I expected Ruby just to support it properly - silly me)
  • The "flexible" syntax of Ruby makes code less comprehensible and sometimes harder to debug or maintain, in my opinion - but it was always fun coding it in the first place ;-)

I did know about WatiN and Watij, the .NET and Java ports of Watir (for some reason those didn't seem "cool" to me - more on that later), and about cPamie, a Python lib similar to Watir (but it seemed kinda "dead" - the Watir community on the other hand was very much alive and the mailing list was always very inspiring - thanx guys :)). We had also already done a prototype/proof of concept using Selenium-RC, which wasn't very stable at that time and had issues with HTTPS but would have been language independent (I chose Ruby for that proof of concept as well, however...). I chose Watir because it seemed to be more up to the task than the alternatives.

One of the requirements we had to meet was verifying UTF-8 encoded strings on the web interface of the SUT against values from ASCII Java *.properties files, which store non-ASCII characters as escaped Unicode sequences (\uXXXX). Languages like Java, JavaScript or Python support that encoding out of the box; Ruby on the other hand doesn't (or we didn't find anything that does). We spent some man-hours on that problem and came up with the following hack for Ruby's String class as a solution, which works well enough for us:

class String
  # returns a copy of the string with all \uXXXX sequences unescaped
  def unescape_unicode
    # convert each escaped sequence from hex to its UTF-8 character;
    # the block form makes $1 refer to the current match, not just the first one
    gsub(/\\u([0-9a-fA-F]{4})/) { [$1.hex].pack("U") }
  end

  # replaces the escaped unicode sequences inline (returns nil if nothing was replaced)
  def unescape_unicode!
    gsub!(/\\u([0-9a-fA-F]{4})/) { [$1.hex].pack("U") }
  end
end
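For comparison, here is a minimal sketch of what the "out of the box" support mentioned above can look like in Python. It is written for Python 3 (not the Python 2 of the time), and the function names and the sample value are made up for illustration. The first function mirrors the regex approach of the Ruby hack, the second leans on the built-in `unicode_escape` codec.

```python
import re

# Matches one escaped sequence such as \u0105 (exactly four hex digits).
ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def unescape_unicode(text):
    """Return text with every escaped Unicode sequence replaced by its character."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def unescape_with_codec(text):
    """Same idea via the built-in codec; only suitable for pure-ASCII input."""
    return text.encode("ascii").decode("unicode_escape")


# Hypothetical sample value, as it would sit in a .properties file.
raw_value = r"Przegl\u0105d"
decoded = unescape_unicode(raw_value)
```

The codec variant also interprets other backslash escapes such as \n and \t, which may or may not be what you want for a given properties file, so the regex variant is the closer match to the Ruby code.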

The "set" method of "text_field" objects didn't like UTF-8 chars at all, which caused us another headache. Željko Filipin describes a workaround for that problem in his blog. Conversion of strings into another encoding is not really well supported in Ruby either. At least there was a wrapper lib for Iconv available.
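To illustrate what such an encoding conversion amounts to, here is a small Python 3 sketch (not the Iconv wrapper we used in Ruby). The helper name and the sample bytes are assumptions chosen for the example: the bytes are decoded from the source encoding and encoded again in the target encoding.

```python
def convert_encoding(data, source, target):
    """Re-encode a byte string from the source encoding into the target encoding."""
    return data.decode(source).encode(target)


# 'Przegląd' encoded as ISO-8859-2, where the letter 'ą' is the single byte 0xB1.
latin2_bytes = b"Przegl\xb1d"

# In UTF-8 the same letter takes two bytes, 0xC4 0x85.
utf8_bytes = convert_encoding(latin2_bytes, "iso-8859-2", "utf-8")
```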

Apart from the Unicode problems we found some other problematic areas in Ruby which nearly made the language too immature for use in production. I know there are many people out there using Ruby in production systems (mainly Rails apps - there's a big community and it seems to work just great), and it does work for us now as well after some extra work (which was a real PITA sometimes - but some coffee can make a test engineer almost unstoppable).

On the bright side, the trouble caused by Ruby rekindled my deep love for Python, which is now my first-choice programming language again :-) After all that buzz about dynamic languages like Ruby and Python on the two big VMs, .NET/Mono and the JVM (seems like Sun has some serious plans for JRuby - same goes for Microsoft with IronPython), I decided to give IronPython a try. With our recent Unicode problems in mind I wrote a mini script to check if IronPython + WatiN could handle those encodings out of the box:

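import clr

# load the WatiN assembly before importing from it
# (assumes WatiN.Core.dll can be found by IronPython)
clr.AddReference('WatiN.Core')
from WatiN.Core import *

ie = IE()
for s in ['Przegląd', u'\u00fc', unicode('\305\204', 'UTF-8')]:
    ie.Button(Find.ByValue('Szczęśliwy traf')).Click()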

What can I say, it worked like a charm. The reason I didn't consider WatiN or Watij before was that I somehow thought the time of languages like Java or C# was up and that dynamic languages like Ruby or my beloved Python should rule the world ;-) I didn't really realize the potential of dynamic languages on top of those mature, enterprise-class, production-ready platforms .NET and Java - silly me! ;-)
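The list in the mini script above builds its test strings in three different ways. A rough Python 3 equivalent of those three literal styles (the variable names are made up for this sketch; in Python 3 every str is Unicode, so the u'' prefix and unicode() are no longer needed):

```python
# Non-ASCII characters typed directly into the source file.
literal = "Przegląd"

# A \uXXXX escape: U+00FC, the letter 'ü'.
escaped = "\u00fc"

# Octal-escaped UTF-8 bytes 0xC5 0x84, decoded to U+0144, the letter 'ń'.
from_bytes = b"\305\204".decode("utf-8")

# Code points of the two escaped strings, handy for checking the results.
code_points = [hex(ord(ch)) for ch in escaped + from_bytes]
```

If all three end up as proper Unicode strings, the language side of the check has passed; whether the browser automation library passes them on correctly is the part the mini script tests.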

Actually, what the mini script above does would probably be possible with CPython and cPamie as well (didn't check). So no need for .NET there? Well, I just trust WatiN more than cPamie and I was really curious about IronPython. I guess I'll give Jython or JRuby and Watij a try as well (and maybe even IronRuby) and post some of the discoveries to this blog...

BTW: An interesting aspect of dynamic languages on .NET/Java is the current discussion about removing (or not) the global interpreter lock (GIL) from CPython - Ruby 2.0 will have a GIL as well - /Iron[Python, Ruby]/, Jython or JRuby don't have this "limitation"....

So what's the point? I guess I have to change the way I think about Java/.NET on one side and Python/Ruby on the other. Java and .NET are cool platforms from which my work can benefit greatly without losing the fun that dynamic languages (can) provide... maybe JRuby could make me love Ruby again... we'll see

1 comment:

Nick Chistyakov said...

Thanks a lot for sharing that code. I've recently had the same problem with parsing java *.properties files. Your code solved the \uXXXX unicode problem :)
Thank you one more time!