Thursday 30 September 2010

Query to select last observation by date for each customer

Sometimes you have a table with a list of transactions of each, say, customer and you need the latest picture of each customer. The so called window functions in SQL2003 are instrumental in order to write neat, concise SQL for such purpose.

It would work as follows (in Teradata's SQL dialect):

select
  cust_id,
  status,
  status_date,
row_number() over (partition by cust_id order by status_date desc ) as ranking
from customer_status
qualify ranking = 1;

Saturday 25 September 2010

Relaying Postfix SMTP via smtp.gmail.com

You just need to follow these instructions to relay Postfix mail via smtp.gmail.com. You just need to be careful that some packages mentioned there have updated their names in recent Ubuntu versions: for instance, package libsasl2 mentioned there is now (as of today!) called libsasl2-2. 

Other than that, it works as a charm!

Saturday 11 September 2010

Run BOINC on a Windows box without Internet connection

If you have access to a Windows box without internet connection on which you wish to run BOINC, you can do as follows:
  1. Install BOINC on another Windows box with internet connection and register to your project(s).
  2. Copy the BOINC folder (possibly at C:\Program Files\BOINC) to your pen-drive. You may want to uninstall BOINC from this machine at this point.
  3. Navigate to the BOINC folder in your pendrive, and run the boinc executable file in order to download new tasks. Try to get as many as you can by setting the BOINC options adequately.
  4. Plug your pen-drive to the machine without internet connection and run the boinc executable file from it. Your tasks will start running. Those finishing will stay there waiting to be transferred back to the project servers.
  5. When the tasks have finished (or when you see fit!) unplug the pen-drive, plug it back to the machine with internet connection and see your tasks be uploaded to the project servers.
Note: beware of the submit time limit for tasks so that they do not expire.

Wednesday 25 August 2010

R trick: plot data in one scale, label axes on the original scale

Here is the way to do it. First, get the data and plot it in the logarithmic scale, say, without the y axis:

x <- exp( rnorm( 100 ) )
plot( log( x ), yaxt="n" )

Create the labels:

a <- 1:ceiling( max( x ) )

Finally, draw the axis at the original scale and with your desired labels:

axis(2, at=log(a), label=a)

Wednesday 18 August 2010

How many rows assigns Teradata to each AMP for a given table?

Teradata distributes table rows to different AMP's according to a hash value calculated from its primary key. The following query indicates how many rows fall into each AMP for a given table.

SELECT
HASHAMP (HASHBUCKET(HASHROW( col1, col2, col3 ))) AS AMP_NUM,
COUNT(*) as NUM_ROWS_PER_AMP
FROM
mytable
GROUP BY 1
ORDER BY 1;


It is useful to check whether your table is properly distributed or not!

Tuesday 17 August 2010

PHP's "Fatal error: Call to undefined function: curl_init()" on Ubuntu

After Twitter changed its access settings for applications to tweet directly into it, I had some problems updating my WP to Twitter plugin. I got a "Fatal error: Call to undefined function: curl_init()". 

Problem? Missing php5-curl package on my Ubuntu server.

Solution?

sudo apt-get install php5-curl
sudo /etc/init.d/apache2 restart

Listing missing statistics from Teradata query optimizer

Teradata is a fine DBMS whose performance critically depends on whether the right table statistics are available to the query optimizer.

You can request it to hint you for the missing statistics that it would like to have on your tables running the code

diagnostic helpstats on for session;

before checking query plans. After running the command above, the query plan, at the end, will list those statistics that the end of the query plan.

Saturday 14 August 2010

Completely eliminate packages marked as rc in Ubuntu or Debian

Removing a package is not the same as purging it. Simply removing leaves some cruft behind. If you remove packages from your Ubuntu or Debian system using

apt-get remove mypackage

you will still see it when you run a

dpkg -l

comand. And it is the case because configuration files are not erased. In order to completely remove (i.e., purge) a package you need to run

apt-get --purge remove mypackage

But what if you removed it but you did not purge it? It will always show up in your system.

If you want to get rid of all those rc-status (removed but not purged) packages in your system you can run something like:

sudo dpkg --purge `dpkg --list |grep "^rc" | cut -d " " -f 3 | xargs`

Tuesday 13 July 2010

Calling Python from R

The new R package, rJython, permits calling Python from R. It is not on CRAN yet, but it will be soon. Details about the package (installation from r-forge, architectural details, etc.) can be found here.

Sunday 4 July 2010

Passing arguments to Python functions

I found a good reference on how to pass arguments to Python functions using the star (or star-star) notation. You can find it here. Here goes a summary.

If you want to pass unnamed arguments, you can do as follows:

>>> a = (1,5)
>>> range( *a )
[1, 2, 3, 4]


If you want to pass named arguments, you need dictionaries:

>>> a = range( 1, 4 )
>>> b = range( 4, 8 )
>>> def concat( a, b ): return a + b
...
>>> concat( **dict( a = a, b = b ) )
[1, 2, 3, 4, 5, 6, 7]


You can mix named and unnamed arguments thus:

my_foo( *a, **b )

where a and b are a list and a dictionary. Caution needs to be taken to pass a single unnamed argument, though. Details can be found in the reference above.

Passing arguments to Python functions

I found a good discussion on how to pass vectors and dictionaries as arguments to Python functions. It can be found here.

In summary, for unnamed arguments:

>>> a = (1,4)
>>> range( *a )
[1, 2, 3]


And for named arguments:

>>> def concat(a,b): return a + b
...
>>> a = [1,2,3]
>>> b = range( 4,7)
>>> b
[4, 5, 6]
>>> concat(**dict( a = a, b = b ) )
[1, 2, 3, 4, 5, 6]


Named and unnamed arguments can be mixed:

my_foo( *a, **b )

There is only one caveat with functions requiring a single, unnamed argument. See the link above for details.

Wednesday 30 June 2010

Certificate problem trying to access Google Analytics

I upgraded my Firefox browser to version 3.6.6 yesterday. And I got errors when trying to access my Google Analtics account. A warning indicated me that I got something like a "ssl.gstatic.com:443 uses an invalid certificate".

Unwilling to learn about certificates by myself, I asked Google. And one hour later I found a page with a hint to solve the problem: download this
file (from the same server that issues the certificate) and just accept it.

From that point on, you will be able to navigate your Google Analytics account as usual.

Friday 18 June 2010

SVG to PS using inkscape

If you want to change the format of your SVG file into PS format, you can use inkscape thus:

inkscape my_graphic.svg --export-eps=my_graphic.eps --export-text-to-path

Sunday 30 May 2010

Using git to deploy code to remote server

I am developing a web application locally (at my laptop). I am using git to version my code. Now, I wish to deploy it to my virtual server (obviously, at a remote location).

Suppose I start from scratch. I usually keep my webpage code in /var/www/my_domain and at the same location in my web server.

The first thing I have to do is to create a bare git repo in my remote machine. Usually, in my home directory. I log into your remote machine and do (thanks to Caius) as follows:

mkdir git_my_domain
cd git_my_domain
git init --bare
git --bare update-server-info
git config core.worktree /var/www/my_domain
git config core.bare false
git config receive.denycurrentbranch ignore
cat > hooks/post-receive
#!/bin/sh
git checkout -f
^D
chmod +x hooks/post-receive

Then, at my local machine, I go to /var/www/my_domain and create a local git repository and add the remote repository I created before as a remote git repository:

mkdir /var/www/my_domain
cd /var/www/my_domain
git init
git remote add vps ssh://username@12.12.12.12[:port]/path/to/git_my_domain

Now I can add files to /var/www/my_domain, commit changes and then deploy automatically typing

git push vps master

Tuesday 25 May 2010

Permalink issue in Wordpress

As I tried to activate permalinks in Wordpress on Apache, I got 404 errors (page not found) after following the instructions. The problem was related to my default Apache configuration file.

Wordpress relies on Apache's mod_rewrite module for the URL conversion. As you define your conversion rules, it creates a custom .htaccess file that indicates Apache how to rewrite your URLs.

But there is a directive at the Apache configuration file that prevents Apache from reading your .htaccess file explicitly: AllowOverride.

In particular, within my directory section, I had: 

Options Indexes FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
allow from all


Changed it to

Options Indexes FollowSymLinks MultiViews
AllowOverride All
Order allow,deny
allow from all


and rocked!

Saturday 13 March 2010

Annoying error message from startxfce4

Whilst trying to understand errors and warnings in logs from my brand new Xubuntu 10.04, I came accross some annoying error messages on my .xsession-error file:
<stdin>:1:3: error: invalid preprocessing directive #Those
<stdin>:2:3: error: invalid preprocessing directive #or
<stdin>:3:3: error: invalid preprocessing directive #Xft
<stdin>:4:3: error: invalid preprocessing directive #Xft
xrdb: "Xft.hinting" on line 9 overrides entry on line 6
xrdb: "Xft.hintstyle" on line 11 overrides entry on line 7
Not critical, it seems, not serious, just annoying. I guessed that xrdb was the culprit and searched for its configuration files:

carlos@molotov:~$ locate xrdb | egrep etc
/etc/xdg/xfce4/Xft.xrdb

There I had lines starting with "# Those", etc. A look at the man page for xrdb indicated that the Xft.xrdb file is passes through the C preprocessor, (whose directives are preceded by #). It clearly states that comments should be preceded by!rather than #.

So edit the conf file, replace # by!and the annoyance is gone!