Exim4 and Postgresql : Callouts

Sam Smith : 5 August 2008

Recently I had a project to limit the number of callouts Exim4 performs per message.

The theory was that a callout can be made by means of a variable, and the results written to the cache. This 'variable-in-the-cache' would then be referred to each time the needed information is referenced in the message process, limiting database lookups and processing time.

This theory was supported by 'Google evidence', but in practice, when we watched the logs, multiple callouts were still being made, once per variable:

  2008-08-05 11:08:01 BST LOG:  duration: 1.947 ms  statement: select
  mailbox,username,uid from mailtest where username='sam' 

  2008-08-05 11:08:01 BST LOG:  duration: 1.966 ms  statement: select
  mailbox,username,uid from mailtest where username='sam' 

  2008-08-05 11:08:01 BST LOG:  duration: 12.974 ms statement: select
  mailbox,username,uid from mailtest where username='sam'

We discovered that the reason multiple callouts were ensuing was that the cache is flushed between routers and transports in Exim4. Exim4 however provides one variable which is cached and not flushed, and this variable is address_data.

First, we define the lookup variable:

  PGS_VAR = {lookup pgsql \
     {select mailbox,username,uid from mailtest \   ## The select query
     where username='${quote_pgsql:$local_part}'}}  ## Defines what part of
                                                       the email address to 
                                                       quote postgres

Then, we define this as address_data so it doesn't get 'flushed' (note: this is done in the first router where it is referenced):

  address_data= $PGS_VAR

Then the lookup to PGS_VAR, where it is cached under address_data:

  local_parts = ${extract{username}$PGS_VAR}}	  ## Notice how only one field
                                                     can be extracted at a time
                                                     - this limits callouts
                                                     further!

Then the rest of the lookups can be made to address_data:

  user = ${extract{uid}{$address_data}}

  file = ${extract{mailbox}{$address_data}}

And the logs show only one lookup for the entire message processing - even though the select query is referenced 3 times:

  2008-08-05 10:28:22 BST LOG:  duration: 82.434 ms  statement: select
  mailbox,username,uid from mailtest where username='sam'

Many thanks to Phil Pennock, Rory, and Mark for their assistance and direction in this fun project.