Ruby: Error with Gems 1.2.0 + Win32_Process + $KCODE + DRb

This one is really odd. The problem appeared after moving some scripts that worked fine on one server to another server where RubyGems had recently been updated. The script normally spawns several processes that act as DRb servers, but on the new server it failed silently. After some testing I found that, with everything else being equal (Ruby 1.8.6 patchlevel 111 and win32-process 0.5.3 on both machines), the script crashed with RubyGems 1.2.0 but not with 0.9.4.

The script calls Process.fork in a loop, passing a code block that creates a new DRb server. The class served over DRb is loaded via require at the beginning of the script. Here is a reduced example for clarity:

puts "iniciando"
require 'rubygems'
require 'drb'
puts 'loading pingpong'
require 'pingpong'
puts 'loaded pingpong'
puts 'loading process '
require 'win32/process'
puts 'loaded process'

def start_server(port)
  uri="druby://0.0.0.0:#{port}"
  trap("INT"){puts("Interrupted"); DRb.thread.exit}
  DRb.start_service(uri,PingPong.new)
  puts("Listening #{uri}")
  DRb.thread.join
end 

puts "here #{ARGV.inspect}"
2.times{|port| puts("Sending #{port}");Process.fork{start_server(5850+port)}}
puts "out"

The PingPong class is defined in the following script. Note the $KCODE declaration since it turned out to be the problem:

$KCODE = 'UTF-8'

class PingPong
  def ping
    "pong"
  end
end

In the current implementation of win32-process, Process.fork works by creating a new Windows process that runs Ruby with the same script, passing an argument that indicates which child is being created. This matters, as the documentation points out, because one would otherwise expect each child to start executing at the code block passed to fork; instead, each child runs the script again from the top.
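To make the re-execution model concrete, here is a minimal sketch of the same idea. It is written in Python purely for illustration and is not win32-process’s code: the class name ForkEmulator, the helper names and the "skip fork calls until my own index" logic are assumptions, inferred from the child#N arguments that show up in the output of the test script.

```python
import subprocess
import sys

CHILD_PREFIX = "child#"  # marker seen in ARGV of the children


def child_index(argv):
    """Return N if argv contains 'child#N', or None in the parent."""
    for arg in argv:
        if arg.startswith(CHILD_PREFIX):
            return int(arg[len(CHILD_PREFIX):])
    return None


def child_command(script, index):
    """Command line that re-runs the same script as child number `index`."""
    return [sys.executable, script, CHILD_PREFIX + str(index)]


class ForkEmulator:
    """Hypothetical stand-in for a fork that works by re-running the script."""

    def __init__(self, argv, script, spawn=subprocess.Popen):
        self.me = child_index(argv)  # None means "I am the parent"
        self.script = script
        self.spawn = spawn
        self.calls = 0               # number of fork calls seen so far

    def fork(self, block):
        index = self.calls
        self.calls += 1
        if self.me is None:
            # Parent: start a fresh interpreter on the same script.
            return self.spawn(child_command(self.script, index))
        if self.me == index:
            # Child reached its own fork call: run the block, then exit.
            block()
            raise SystemExit(0)
        # Child that belongs to a later fork call: skip this one.
        return None
```

The point of the sketch is that everything before the fork call (every require, every top-level assignment such as $KCODE) is executed again in each child, so the order of those statements can affect the children.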

The process failed like this with RubyGems 1.2.0 (but worked perfectly with RubyGems 0.9.4):

C:\>test_fork_simple.rb
iniciando
loading pingpong
loaded pingpong
loading process
loaded process
here []
Sending 0
Sending 1
out

After some hours of debugging I found that the problem arose when the $KCODE assignment was executed in the child processes; if the assignment is removed, the processes are created correctly:

C:\>test_fork_simple.rb
iniciando
loading pingpong
loaded pingpong
loading process
loaded process
here []
Sending 0
Sending 1
out

iniciando
iniciando
loading pingpong
loading pingpong
loaded pingpong
loading process
loaded pingpong
loading process
loaded process
here ["child#0"]
Sending 0
loaded process
here ["child#1"]
Sending 0
Sending 1
Listening druby://0.0.0.0:5850
Listening druby://0.0.0.0:5851

But in the real programs the line can’t be left out since it’s critical. Oddly enough, the workaround consisted of swapping the order of the ‘require’ statements for the served class (pingpong) and win32/process:

#Crashes silently:
require 'pingpong'
require 'win32/process'

#Works!
require 'win32/process'
require 'pingpong'

It’s also very strange that, when the script fails, the child processes are not even created (as you can see in the first output, the children never print the “iniciando” message). I’ll post an update if I find out what’s happening here.

Loading and Processing Big Matrices on Python/numpy for NNMF

I was loading a fairly big dataset (3446×14807 floats) into a Python/NumPy matrix to perform a Nonnegative Matrix Factorization (NNMF). First I tried generating the Python source code from another process, with the whole matrix inserted as a literal, something like:

w = matrix([
[0.0072992700729927,0.0072992700729927,0.0291970802919708,0.0145985401459854,
0.0072992700729927,0.0072992700729927,0.0072992700729927,0.0072992700729927, ... ,])

But the process crashed with a cryptic MemoryError while loading the matrix. Then I tried NumPy’s loadtxt() function, but the same thing happened. I found this thread. I’m using both an XP laptop and a Win2003 server with Python 2.5.1 and 2.5.2 respectively, both 32-bit. Total memory didn’t seem to be the problem, since a matrix of the same size could be constructed with ones or zeros and enough memory was available (the server has 8 GB of RAM). It rather looked like the problem occurred while building the matrix. So I found that allocating the array first and then filling in the values did the trick:


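from sys import argv
from numpy import zeros

# jobs and words are the row and column labels, defined earlier in the script
JW = zeros((len(jobs), len(words)), float)
f = open(argv[1] + '_JW.csv')
line = 0
for l in f:
    vals = l.split(',')
    for i in range(0, len(vals)):
        JW[line, i] = float(vals[i])
    line += 1
f.close()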
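For reference, here is a self-contained variant of the same approach, as a sketch in modern Python. The function name load_csv_preallocated and its parameters are made up for this example, and it assumes the number of rows and columns is known in advance and that the file has no blank lines. The last line works out the raw size of the final array, which is modest compared with what the intermediate Python lists of float objects may need while parsing (that explanation is a guess, not something I measured).

```python
import numpy as np


def load_csv_preallocated(path, n_rows, n_cols):
    """Fill a preallocated float64 array from a comma-separated file."""
    out = np.zeros((n_rows, n_cols), dtype=float)
    with open(path) as f:
        for row, line in enumerate(f):
            # Parse one row at a time, so only one line's values exist
            # as Python float objects at any moment.
            out[row, :] = [float(v) for v in line.split(",")]
    return out


# Raw size of the dataset from this post as float64 (8 bytes per value).
print(3446 * 14807 * 8)  # 408199376 bytes, roughly 408 MB
```

A row with the wrong number of values raises a ValueError on assignment, which makes a malformed file fail loudly instead of leaving zeros behind.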

Chrome usage, a comparison US-MX

Update 2: Data from Feb.-March’09

Update: I’ve added a more recent (January ’09) statistic at the end.

To complement my last post on the usage of Chrome, I’m posting data from Net Applications referenced by Computerworld, and our data:

Net Applications (presumably worldwide)

Week starting   Aug. 24   Aug. 31   Sept. 7   Sept. 15
IE              72.39%    71.03%    71.24%    71.48%
Firefox         19.54%    19.78%    19.35%    19.42%
Safari          6.27%     6.67%     6.95%     6.73%
Chrome          --        0.67%     0.85%     0.77%

OCCMundial.com (mostly México)

Week starting   Aug. 24   Aug. 31   Sept. 7   Sept. 15
IE              90.88%    90.69%    90.60%    90.39%
Firefox         7.46%     7.32%     7.22%     7.38%
Safari          1.27%     1.26%     1.25%     1.32%
Chrome          --        0.35%     0.54%     0.50%

Update:

From August to January, IE is clearly losing share to Firefox and, to a lesser extent, to Safari and Chrome.

Data from Google Analytics for Dec. 22, 2008 to Jan. 21, 2009

Period    Dec. 22 to Jan. 21
IE        87.48%
Firefox   9.62%
Safari    1.66%
Chrome    0.72%
Opera     0.44%

Data from Google Analytics for Feb. 23, 2009 to March 25, 2009

Period    Feb. 23 to March 25, 2009
IE        86.82%
Firefox   10.10%
Safari    1.74%
Chrome    0.78%
Opera     0.46%

Chrome #4 on OCCMundial.com

Google Chrome ranks #4 with 0.51% of visits on OCCMundial (my employer) as of today, counting from Sept. 2. It ranks above Opera (0.27%) and below Internet Explorer (90.54%), Firefox (7.29%) and Safari (1.26%). OCCMundial is the leading online job board in Mexico, with more than 100 million pageviews per month.

And yes, IE’s margin over every other browser is striking. I think it can be extrapolated to browser usage in Mexico as a whole, since our traffic is pretty high for this market. (Data from Google Analytics.)

On Artistic Taste and Psychology

How psychology accounts for taste development, via Psychology Today:

“We wrongly assume, for instance, that people with highly decorated and cluttered rooms are more extroverted. We also assume such people are more open—when really we should be looking for variety in books and music, for books on art and poetry, and for art supplies. We assume that rooms with stale air belong to emotionally unstable people—when really we should be scanning for inspirational posters”