Loading and Processing Big Matrices in Python/NumPy for NNMF

I was loading a fairly large dataset (3446×14807 floats) into a Python/NumPy matrix to perform a Nonnegative Matrix Factorization (NNMF). First I tried generating the Python source code from another process, inserting the whole matrix as a literal, something like:

w = matrix([
[0.0072992700729927,0.0072992700729927,0.0291970802919708,0.0145985401459854,
0.0072992700729927,0.0072992700729927,0.0072992700729927,0.0072992700729927, ... ,])

But the process crashed with a cryptic MemoryError while loading the matrix. Then I tried NumPy's loadtxt() function, but the same thing happened. I found this thread. I'm using both an XP laptop and a Win2003 Server, with Python 2.5.1 and 2.5.2 respectively, both 32-bit. That didn't seem to be the problem, since the matrix could be constructed with ones or zeros and enough memory was available (the server has 8 GB of RAM). Rather, the failure seemed to happen while the matrix was being built. I found that allocating the array first and then filling in the values did the trick:


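from sys import argv
from numpy import zeros

# jobs and words are built earlier in the script (not shown here)
JW = zeros((len(jobs), len(words)), float)  # allocate the full array up front

f = open(argv[1] + '_JW.csv')
line = 0
for l in f:
    vals = l.split(',')
    for i in range(0, len(vals)):
        JW[line, i] = float(vals[i])  # fill one cell at a time
    line += 1
f.close()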
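The same idea can be wrapped in a small reusable function. What follows is only a sketch under a few assumptions: the function name load_csv_preallocated, its parameters, and the tiny demo file are made up for illustration, and it is written for a current Python and NumPy rather than the 2.5 setup described above. It allocates the array once and fills it one row at a time, so only a single row of Python floats is alive at any moment.

```python
import os
import tempfile

import numpy as np


def load_csv_preallocated(path, n_rows, n_cols, dtype=np.float64):
    """Fill a preallocated (n_rows, n_cols) array from a comma-separated file.

    Hypothetical helper: the shape must be known in advance, as it is
    in the post (len(jobs) by len(words)).
    """
    out = np.zeros((n_rows, n_cols), dtype=dtype)  # single allocation
    row = 0
    with open(path) as f:
        for text in f:
            text = text.strip()
            if not text:
                continue  # ignore blank lines
            values = [float(v) for v in text.split(',')]
            out[row, :len(values)] = values  # copy one row, then drop the list
            row += 1
    return out


# Tiny made-up file, just to exercise the function.
fd, demo_path = tempfile.mkstemp(suffix='_JW.csv')
with os.fdopen(fd, 'w') as f:
    f.write('0.5,0.25,0.0\n1.0,0.0,0.125\n')
demo = load_csv_preallocated(demo_path, 2, 3)
os.remove(demo_path)

# Size of the finished array for the dataset in this post, at 8 bytes per float64.
rows, cols = 3446, 14807
full_size_mib = rows * cols * np.dtype(np.float64).itemsize / 2.0 ** 20
```

For the 3446×14807 case the finished array comes to roughly 389 MiB. That is large for a 32-bit process but consistent with the observation that a matrix of zeros or ones could be created without trouble. Passing dtype=np.float32 would halve that, at the cost of precision.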

Chrome usage, a comparison US-MX

Update 2: Data from Feb.-March’09

Update: I’ve added a more recent (January ’09) statistic at the end.

To complement my last post on the usage of Chrome, I’m posting data from Net Applications referenced by Computerworld, and our data:

Net Applications (presumably worldwide)

Week Starting   Aug.24   Aug.31   Sept.7   Sept.15
IE              72.39%   71.03%   71.24%   71.48%
Firefox         19.54%   19.78%   19.35%   19.42%
Safari           6.27%    6.67%    6.95%    6.73%
Chrome              --    0.67%    0.85%    0.77%

OCCMundial.com (mostly México)

Week Starting   Aug.24   Aug.31   Sept.7   Sept.15
IE              90.88%   90.69%   90.60%   90.39%
Firefox          7.46%    7.32%    7.22%    7.38%
Safari           1.27%    1.26%    1.25%    1.32%
Chrome              --    0.35%    0.54%    0.50%

Update:

From August to January, IE is clearly losing share to Firefox and, to a lesser extent, Safari and Chrome.

Data from Google Analytics for Dec. 22, 2008 to Jan. 21, 2009

          Dec.22 to Jan.21
IE        87.48%
Firefox    9.62%
Safari     1.66%
Chrome     0.72%
Opera      0.44%

Data from Google Analytics for Feb. 23, 2009 to March 25, 2009

          Feb.23 to March 25, 2009
IE        86.82%
Firefox   10.10%
Safari     1.74%
Chrome     0.78%
Opera      0.46%

Chrome #4 on OCCMundial.com

Google Chrome ranks #4 with 0.51% of visits on OCCMundial (my employer) as of today, counting from Sept. 2. It ranks above Opera (0.27%) and below Internet Explorer (90.54%), Firefox (7.29%) and Safari (1.26%). OCCMundial is the leading online job board in Mexico, with more than 100 million pageviews per month.

And yes, IE's margin over every other browser is striking. I think it can be extrapolated to overall browser usage in Mexico, since our traffic is pretty high for this market. (Data from Google Analytics.)

On Artistic Taste and Psychology

How psychology accounts for taste development, via Psychology Today:

“We wrongly assume, for instance, that people with highly decorated and cluttered rooms are more extroverted. We also assume such people are more open—when really we should be looking for variety in books and music, for books on art and poetry, and for art supplies. We assume that rooms with stale air belong to emotionally unstable people—when really we should be scanning for inspirational posters”