Data extraction, Perl style

October 21, 2010 under Main

We use a great email filtering solution from SpamTitan. It’s a commercial product which we license for a certain number of users. The only problem we’ve had with it is that it doesn’t provide a mechanism to determine the user count for a set of domains. This makes it difficult to offer as a service to our customers who prefer to run their own email servers.

It does, however, offer a feature whereby you can export its mail logs. So today I whipped up a 63-line Perl script which can parse its log files and count the unique users for a given account and all domains under it. In my test run, it parsed 7,062,499 lines (last week’s logs) in 42 seconds, or about 168,000 lines per second.
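For anyone curious about the general approach, here is a minimal sketch of the idea. It is not the 63-line script itself. It assumes Postfix-style log lines in which each recipient appears as to=<user@domain>; the actual SpamTitan export format may differ, so the regular expression would need adjusting. The function name count_unique_users and the command-line layout are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count unique recipient addresses per domain in a mail log.
# ASSUMPTION: recipients are logged as "to=<user@domain>" (Postfix style).
# The real SpamTitan export may use a different layout.
sub count_unique_users {
    my ($fh, @domains) = @_;

    # Lookup table of the domains we care about (case-insensitive).
    my %wanted = map { ( lc($_) => 1 ) } @domains;
    my %seen;    # domain => { address => 1 }

    while (my $line = <$fh>) {
        next unless $line =~ /\bto=<([^\@<>\s]+)\@([^<>\s]+)>/;
        my ($user, $domain) = (lc $1, lc $2);
        next unless $wanted{$domain};
        # Hash keys are unique, so repeated addresses collapse to one entry.
        $seen{$domain}{"$user\@$domain"} = 1;
    }

    # Reduce each domain's set of addresses to a count.
    my %count;
    $count{$_} = keys %{ $seen{$_} } for keys %seen;
    return \%count;
}

# Hypothetical usage: perl count_users.pl maillog example.com example.org
if (@ARGV) {
    my ($logfile, @domains) = @ARGV;
    open my $fh, '<', $logfile or die "Cannot open $logfile: $!";
    my $count = count_unique_users($fh, @domains);
    close $fh;

    my $total = 0;
    for my $domain (sort keys %$count) {
        printf "%-30s %d\n", $domain, $count->{$domain};
        $total += $count->{$domain};
    }
    print "Total unique users: $total\n";
}
```

Reading the log one line at a time and keeping only a hash of addresses means memory use grows with the number of distinct users rather than the number of log lines, which is what makes a multi-million-line file practical to process in a single pass.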

Maybe not a perfect solution to the problem, but definitely a workable one. If anyone is interested in a copy of the script, just let me know.

