<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Chris Wik</title>
	<atom:link href="http://www.cwik.ch/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.cwik.ch</link>
	<description>hostmaster, postmaster, servermaster</description>
	<lastBuildDate>Thu, 10 Jan 2013 10:29:27 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Setting up a CentOS 6 server to host a secure site</title>
		<link>http://www.cwik.ch/2013/01/setting-up-a-centos-6-server-to-host-a-secure-site/</link>
		<comments>http://www.cwik.ch/2013/01/setting-up-a-centos-6-server-to-host-a-secure-site/#comments</comments>
		<pubDate>Thu, 10 Jan 2013 10:26:06 +0000</pubDate>
		<dc:creator>Chris Wik</dc:creator>
				<category><![CDATA[Main]]></category>

		<guid isPermaLink="false">http://www.cwik.ch/?p=2258</guid>
		<description><![CDATA[I recently had the task of setting up a CentOS 6 server which will be hosting a secure site. The site will be storing sensitive customer data including billing and credit card information, so security is critical. The site will be Apache/PHP/MySQL, so a fairly standard LAMP stack. It will need to pass PCI compliance [...]]]></description>
				<content:encoded><![CDATA[<p>I recently had the task of setting up a CentOS 6 server which will be hosting a secure site. The site will be storing sensitive customer data including billing and credit card information, so security is critical.</p>
<p>The site will be Apache/PHP/MySQL, so a fairly standard LAMP stack. It will need to pass PCI compliance scans which involve a lot of port scanning, fingerprinting and attempts to break the web server by sending it unexpected requests.</p>
<p>While this list is in no way complete or authoritative, I thought I&#8217;d share a few of the steps I took to configure the server. The very first step was to install a clean CentOS 6 x86_64 and apply all the latest updates available by running &#8216;yum update&#8217;. I then installed mysql-server, php, php-mysql (and other php-* modules we need), httpd and mod_ssl. I enabled the services I needed and disabled everything else, until I ended up with just:<br />
<code><br />
# chkconfig --list|grep :on<br />
crond              0:off    1:off    2:on    3:on    4:on    5:on    6:off<br />
httpd              0:off    1:off    2:on    3:on    4:on    5:on    6:off<br />
iptables           0:off    1:off    2:on    3:on    4:on    5:on    6:off<br />
mysqld             0:off    1:off    2:on    3:on    4:on    5:on    6:off<br />
network            0:off    1:off    2:on    3:on    4:on    5:on    6:off<br />
rsyslog            0:off    1:off    2:on    3:on    4:on    5:on    6:off<br />
sshd               0:off    1:off    2:on    3:on    4:on    5:on    6:off<br />
udev-post          0:off    1:on    2:on    3:on    4:on    5:on    6:off<br />
</code></p>
<p>When it comes to security, the fewer services running the better: less code means fewer potential vulnerabilities, and fewer services means fewer potential points of exploitation.</p>
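<p>The same check can be scripted so it is easy to repeat after package updates. This is only a rough sketch, not something that was run on this server: the function name unexpected_services and the ALLOWED list are illustrative, and the list simply mirrors the services shown above.</p>

```shell
#!/bin/bash
# Hypothetical helper: read `chkconfig --list` output on stdin and print
# every service that is on in runlevel 3 but missing from the allowlist.
# The allowlist mirrors the services listed above; adjust it to taste.
ALLOWED="crond httpd iptables mysqld network rsyslog sshd udev-post"

unexpected_services() {
  awk -v allowed="$ALLOWED" '
    BEGIN { n = split(allowed, a, " "); for (i = 1; i <= n; i++) ok[a[i]] = 1 }
    /3:on/ && !($1 in ok) { print $1 }
  '
}

# Usage on a real CentOS 6 host:
#   chkconfig --list | unexpected_services
```

<p>Anything it prints is a candidate for &#8216;chkconfig servicename off&#8217;.</p>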
<p><strong>1. Firewall<br />
</strong>CentOS ships with iptables which is a very capable and flexible firewall. There are lots of ways to configure iptables, but my personal favourite is to just open up the config file and write out the rules by hand. This way I can be completely confident that the firewall is doing what I intended it to do, nothing more and nothing less, and not what some configuration tool thinks will be a good configuration for me. The config file on CentOS is /etc/sysconfig/iptables and I ended up with the following rules:<br />
<code><br />
*filter<br />
:FORWARD ACCEPT [0:0]<br />
:INPUT ACCEPT [0:0]<br />
:OUTPUT ACCEPT [0:0]<br />
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT<br />
-A INPUT -p icmp -j ACCEPT<br />
-A INPUT -i lo -j ACCEPT<br />
-A INPUT -p tcp -m state -m tcp --dport 22 --state NEW -s 10.2.3.0/24 -j ACCEPT<br />
-A INPUT -p tcp -m state -m tcp --dport 443 --state NEW -j ACCEPT<br />
-A INPUT -j REJECT --reject-with icmp-host-prohibited<br />
-A FORWARD -j REJECT --reject-with icmp-host-prohibited<br />
COMMIT<br />
</code></p>
<p>Enable the firewall using &#8216;chkconfig iptables on; service iptables start&#8217;</p>
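<p>When writing rules by hand it can be reassuring to double-check which ports the file really opens. A minimal sketch, assuming the file is in the iptables-save format shown above; accepted_ports is a made-up helper name, not an iptables tool.</p>

```shell
#!/bin/bash
# Hypothetical helper: print the destination ports that a rules file in
# iptables-save format accepts on the INPUT chain, one per line.
accepted_ports() {
  awk '
    $1 == "-A" && $2 == "INPUT" && /-j ACCEPT/ {
      for (i = 3; i < NF; i++) if ($i == "--dport") print $(i + 1)
    }
  ' "$1" | sort -n
}

# Usage on the server itself (needs root to read the file):
#   accepted_ports /etc/sysconfig/iptables
```

<p>For the rules above this should list only 22 and 443. It says nothing about source restrictions, so it complements reading the file rather than replacing it.</p>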
<p><strong>2. SSH</strong><br />
You may have noticed the only 2 ports I opened in the firewall are 443 for https (SSL web server) and 22 for SSH, but that the rule for port 22 has a source network restriction in place. This restriction limits access to the SSH server to connections from the local LAN. In this case that LAN is in a remote datacenter, and we can access the LAN using a VPN connection. This adds a layer of security to the already secure SSH server. If you don&#8217;t have a VPN available another good option is to restrict access to your static IP(s). If you don&#8217;t have a static IP, see <a title="Convenient And Secure Temporary Firewall Exceptions" href="http://www.cwik.ch/2011/04/convenient-and-secure-temporary-firewall-exceptions/" rel="bookmark">Convenient And Secure Temporary Firewall Exceptions</a> for another good solution.</p>
<p>The next step we took is to set up key based authentication by generating a keypair on our workstation using &#8216;ssh-keygen -t rsa&#8217;, then copying the public key part of the keypair to ~/.ssh/authorized_keys on the server. The .ssh directory, if it doesn&#8217;t already exist, must be chmod to 700 and authorized_keys to 600 &#8211; if you forget this step SSH will fail to use the keys as the permissions are insecure!</p>
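<p>The key installation and permission steps can be wrapped in a small function so nothing gets forgotten. This is a sketch under assumptions: install_pubkey and the example path in the usage comment are hypothetical names, not part of OpenSSH.</p>

```shell
#!/bin/bash
# Hypothetical helper: append a public key to authorized_keys under the
# given home directory and apply the permissions sshd insists on.
install_pubkey() {
  local home_dir="$1" pubkey_file="$2"
  mkdir -p "$home_dir/.ssh"
  chmod 700 "$home_dir/.ssh"
  cat "$pubkey_file" >> "$home_dir/.ssh/authorized_keys"
  chmod 600 "$home_dir/.ssh/authorized_keys"
}

# Usage on the server, after copying the public key across
# (the path is a placeholder):
#   install_pubkey "$HOME" /tmp/mynewprivatekey.pub
```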
<p>On our workstation we set up an alias for the host by opening (creating if it doesn&#8217;t exist) the file ~/.ssh/config and inserting:<br />
<code><br />
Host secureserver<br />
HostName 10.2.3.123<br />
IdentityFile ~/.ssh/mynewprivatekey<br />
Port 22<br />
User mylogin<br />
</code></p>
<p>(real values obscured)</p>
<p>We can now log in to the server simply by typing &#8216;ssh secureserver&#8217;. Once key based authentication was working, I disabled password authentication entirely by editing /etc/ssh/sshd_config and setting PasswordAuthentication to &#8216;no&#8217;, and restarting sshd. SSH is now locked down to our local LAN IP range and to users in possession of the correct private key. This should be pretty secure, and the PCI scanner won&#8217;t even pick up on the existence of the SSH server as the firewall will block any connection attempts from the WAN.</p>
<p>In addition to using SSH for administering the server, we&#8217;ll also use the SFTP subsystem to upload files to our website. This means we don&#8217;t need to run a separate FTP server &#8211; one less server is only a good thing from a security perspective. We&#8217;ll also use the SSH server to tunnel connections to our MySQL server. MySQL Workbench (free from dev.mysql.com) supports this type of connection out of the box, as do some other MySQL clients.</p>
<p><strong>3. Apache and PHP<br />
</strong>One of the things the PCI vulnerability scanners tend to do is try to fingerprint your Web server to see what software you have installed and what versions you are running. They then compare the version numbers against a database of known vulnerabilities. There are 2 problems with this approach: 1. it does not make any attempt to verify whether you are actually vulnerable and 2. the RHEL/CentOS philosophy of freezing software versions at release time then backporting security fixes means that the stock versions of things like Apache and PHP available from the CentOS yum repositories are never the latest ones available. This causes the vulnerability scanner to go wild saying you&#8217;re vulnerable to dozens of things that have actually been patched long ago. You can verify this if you are curious by looking at the changelogs which list the CVE numbers of security fixes which have been backported, eg. &#8216;rpm -q --changelog httpd&#8217;.</p>
<p>So let&#8217;s make their job a little harder by limiting what information our Web server discloses:<br />
1. Edit /etc/php.ini and set &#8220;expose_php = Off&#8221;. This prevents PHP from adding a line to the HTTP response headers declaring its presence on the server.<br />
2. Edit /etc/httpd/conf/httpd.conf and set:<br />
- &#8220;ServerTokens Prod&#8221; &#8211; hide the version number from the HTTP response headers<br />
- &#8220;ServerSignature Off&#8221; &#8211; hide the server name and version from server generated responses such as errors, directory listings, etc.</p>
<p>The HTTP headers now look like this:<br />
<code><br />
HTTP/1.1 200 OK<br />
Date: Thu, 10 Jan 2013 09:58:33 GMT<br />
Server: Apache<br />
Last-Modified: Tue, 08 Jan 2013 21:58:40 GMT<br />
ETag: "85af-348-4d2ce0c66c400"<br />
Accept-Ranges: bytes<br />
Content-Length: 840<br />
Connection: close<br />
Content-Type: text/html; charset=UTF-8<br />
</code></p>
<p>Good &#8211; the version number is no longer reported and the existence of PHP is also omitted.</p>
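<p>To repeat this check after future configuration changes, the headers can be filtered for anything that still gives software details away. A rough sketch: leaky_headers is an illustrative name, and the hostname in the usage comment is a placeholder.</p>

```shell
#!/bin/bash
# Hypothetical helper: read HTTP response headers on stdin and print any
# line that discloses software details: an X-Powered-By header, or a
# Server header containing a digit (i.e. a version number).
leaky_headers() {
  tr -d '\r' | grep -iE '^(x-powered-by:|server:.*[0-9])'
}

# Usage against a live site (hostname is a placeholder):
#   curl -skI https://myhostname/ | leaky_headers || echo "nothing disclosed"
```

<p>With the settings above it should print nothing, since the Server header is reduced to just &#8216;Apache&#8217;.</p>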
<p>Also in /etc/httpd/conf/httpd.conf, I commented out all the modules which we don&#8217;t need, such as ldap, webdav, usertrack, userdir, status, info, vhost_alias, autoindex, speling, proxy, cache and version. This step will depend on what features of Apache you intend to use. I also removed all the configuration stuff we don&#8217;t need such as (in our case) the entire virtual hosting and proxying sections.</p>
<p>The SSL server configuration lives in /etc/httpd/conf.d/ssl.conf on a default install and this is where I set up the SSL virtual host. The most important part of the SSL configuration is disabling weak ciphers and outdated protocol versions. This matters for any secure site, and the PCI scan will flag it if you leave the insecure defaults in place. I ended up with the following virtual host definition:</p>
<pre>&lt;VirtualHost 1.2.3.4:443&gt;
        DocumentRoot "/var/www/mysite/public_html"
        ServerName myhostname:443

        &lt;Directory "/var/www/mysite/public_html"&gt;
                Options FollowSymLinks
                AllowOverride None
        &lt;/Directory&gt;

        # Use separate log files for the SSL virtual host; note that LogLevel
        # is not inherited from httpd.conf.
        ErrorLog logs/ssl_error_log
        TransferLog logs/ssl_access_log
        LogLevel warn

        #   SSL config
        SSLEngine on
        SSLProtocol -ALL +SSLv3 +TLSv1
        SSLCipherSuite ALL:!aNULL:!ADH:!eNULL:!LOW:!EXP:RC4+RSA:+HIGH:-MEDIUM
        SSLCertificateFile ssl/mysslcert.crt
        SSLCACertificateFile ssl/mycertprovider.ca
        SSLCertificateKeyFile ssl/myprivate.key

        CustomLog logs/portal_request_log "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
&lt;/VirtualHost&gt;</pre>
<p><strong>4. MySQL<br />
</strong>MySQL by default includes an anonymous user with access to the &#8216;test&#8217; database, and a root user without a password! (granted, access is limited to the local server, but still not very secure). So the first thing I did after installing mysql-server was log in as root and secure the server:<br />
<code><br />
USE mysql;<br />
DELETE FROM user WHERE User='';<br />
DELETE FROM db;<br />
UPDATE user SET Password=PASSWORD('mysupersecretpassword') WHERE User='root';<br />
FLUSH PRIVILEGES;<br />
</code></p>
<p>Then proceed to set up any databases and appropriate access control lists as required.</p>
<p>I also copied /usr/share/doc/mysql-server-5.1.66/my-large.cnf to /etc/my.cnf and used it as the starting point for my MySQL server configuration, as it was a reasonable match for our requirements on this server.</p>
<p><strong>Summary</strong><br />
This is not an exhaustive list of steps we took to secure this server and will probably not apply in full to you as every deployment is unique. I hope however that some of the information in this post is useful, and as always I welcome any comments, suggestions and feedback.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cwik.ch/2013/01/setting-up-a-centos-6-server-to-host-a-secure-site/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Scheduled MySQL server backups with cron and mysqldump</title>
		<link>http://www.cwik.ch/2012/12/scheduled-mysql-server-backups-with-cron-and-mysqldump/</link>
		<comments>http://www.cwik.ch/2012/12/scheduled-mysql-server-backups-with-cron-and-mysqldump/#comments</comments>
		<pubDate>Thu, 20 Dec 2012 10:06:59 +0000</pubDate>
		<dc:creator>Chris Wik</dc:creator>
				<category><![CDATA[Main]]></category>

		<guid isPermaLink="false">http://www.cwik.ch/?p=2252</guid>
		<description><![CDATA[This is a task I perform fairly regularly, but not regularly enough to remember exactly which permissions are needed, so I invariably end up having to look up the MySQL reference manual. The task is to set up a scheduled backup of a MySQL server using mysqldump and cron. The backup should contain not only [...]]]></description>
				<content:encoded><![CDATA[<p>This is a task I perform fairly regularly, but not regularly enough to remember exactly which permissions are needed, so I invariably end up having to look up the MySQL reference manual.</p>
<p>The task is to set up a scheduled backup of a MySQL server using mysqldump and cron. The backup should contain not only a copy of all databases on the server but also the state of the binary logs at the time, ie. using the --master-data option. I find binary logging + regular full snapshots of the MySQL server to be a great and simple backup strategy that allows not only for restoring the entire server to a known good state (from the snapshot backup) but also restoring to any point in time using the binary logs.</p>
<p>For example, let&#8217;s say you run a daily backup at midnight. At 13:26 a friendly Web developer (definitely not you) made a typo in an SQL query and wiped out a whole column full of very important data.</p>
<p>With snapshots + binary log, you can:</p>
<p>1. Restore the snapshot from midnight<br />
2. Use the mysqlbinlog command to extract all the data-affecting queries that were run on the server between the time the backup ran and the time when the erroneous query was made. You can do this by starting your search from the log position recorded at the top of your backup file, and ending with the position just before the bad query. Use mysqlbinlog and grep to find the latter position. Pipe all this extracted data into another sql file.<br />
3. Import this sql file into your MySQL server.<br />
4. You now have your server restored to the exact state it was in immediately prior to the nefarious query.</p>
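<p>The starting position mentioned in step 2 can be pulled out of the backup file automatically. This is a sketch only: binlog_coords is a hypothetical helper, it assumes the dump was taken with --master-data and gzipped, and the stop position and paths in the usage comment are placeholders.</p>

```shell
#!/bin/bash
# Hypothetical helper: print the binary log file name and position recorded
# by --master-data near the top of a gzipped dump, separated by a space.
binlog_coords() {
  gzip -dc "$1" | head -n 50 |
    sed -n "s/.*MASTER_LOG_FILE='\([^']*\)', MASTER_LOG_POS=\([0-9]*\);.*/\1 \2/p"
}

# Usage sketch; the stop position 4321 and the file paths are placeholders
# that you would find with mysqlbinlog and grep as described in step 2:
#   read -r logfile startpos < <(binlog_coords /backups/current/MySQL/backup.sql.gz)
#   mysqlbinlog --start-position="$startpos" --stop-position=4321 \
#     "/var/lib/mysql/$logfile" > replay.sql
#   mysql -u root -p < replay.sql
```

<p>If more than one binary log file was written since the backup, each later file has to be passed to mysqlbinlog as well, in order.</p>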
<p>Sounds good? Here&#8217;s how to set up this backup scheme:</p>
<p>1. Set up a MySQL user with permission to run the backups, nothing more:</p>
<p>mysql&gt; GRANT SELECT, LOCK TABLES, SHOW VIEW, RELOAD, SUPER, REPLICATION CLIENT ON *.* TO 'backup'@'localhost';</p>
<p>mysql&gt; FLUSH PRIVILEGES;</p>
<p>2. Set up a cron job: [root@myserver ~]# crontab -e<br />
# MySQL backup<br />
0 0 * * * mysqldump -u backup -A --master-data -v | gzip &gt; /backups/current/MySQL/backup.sql.gz</p>
<p>Change the path to taste. Or if you prefer to store 7 days&#8217; worth of backups, you could do something like:</p>
<p>0 0 * * * mysqldump -u backup -A --master-data -v | gzip &gt; /backups/current/MySQL/backup-`date +\%a`.sql.gz</p>
<p>This will name your backups with the day of the week in the filename, eg. backup-Mon.sql.gz. See &#8216;man date&#8217; for formatting options.</p>
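<p>If the cron line grows any longer, it may be easier to move the naming logic into a small wrapper script. A minimal sketch, assuming a hypothetical helper called backup_path; note that the backslash before % is only needed inside a crontab line, where cron treats a bare % as a newline.</p>

```shell
#!/bin/bash
# Hypothetical helper: build the weekday-stamped backup path. Inside a script
# the % needs no backslash; the \% escaping is only required in a crontab
# line. LC_ALL=C keeps the day names in English regardless of locale.
backup_path() {
  local dir="$1"
  echo "$dir/backup-$(LC_ALL=C date +%a).sql.gz"
}

# Usage from a wrapper script that cron calls instead of the one-liner:
#   mysqldump -u backup -A --master-data | gzip > "$(backup_path /backups/current/MySQL)"
```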
<p>Esc, :w and :q to save your crontab and exit.</p>
<p>An explanation of the options to mysqldump:</p>
<p>-u: The MySQL user name to use when connecting to the server.<br />
--master-data: Use this option to dump a master replication server to produce a dump file that can be used to set up another server as a slave of the master. It causes the dump output to include a CHANGE MASTER TO statement that indicates the binary log coordinates (file name and position) of the dumped server. These are the master server coordinates from which the slave should start replicating after you load the dump file into the slave.<br />
-A: Dump all tables in all databases.<br />
-v: Verbose mode. Print more information about what the program does.</p>
<p>MySQL dump files are plain text which compress really well, so I feed the data into gzip before writing to disk.</p>
<p>It is worth noting that in order for mysqldump to take a consistent snapshot using --master-data, --lock-all-tables is automatically turned on: &#8220;Lock all tables across all databases. This is achieved by acquiring a global read lock for the duration of the whole dump.&#8221;</p>
<p>If you have a large amount of data, the backup may take a considerable amount of time, during which any INSERT/UPDATE/DELETE operations on your databases will be blocked. The queries will be shown in the process list as &#8220;Waiting for release of readlock&#8221; and will execute when your backup has completed, if the connection has not yet timed out. To avoid this problem, I typically recommend that customers run a separate MySQL server specifically as a disaster recovery and/or backup system. This can be on a separate server, a small VM, or even on the same server as the primary but bound to a different port. Set the backup server to replicate from the master using <a title="MySQL Replication" href="http://dev.mysql.com/doc/refman/5.5/en/replication.html" target="_blank">MySQL&#8217;s built-in replication system</a>, then schedule your backups to run on the backup server instead of the master. This way you will avoid any disruption to your live system and still get clean snapshots. While your backup is running the replication slave process will have to wait for the table locks, but as soon as the backup is finished replication will automatically resume so your slave catches up with the master again.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cwik.ch/2012/12/scheduled-mysql-server-backups-with-cron-and-mysqldump/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Script to convert ECSL CSV files exported from MYOB on Mac to HMRC compatible CSV</title>
		<link>http://www.cwik.ch/2012/12/script-to-convert-ecsl-csv-files-exported-from-myob-on-mac-to-hmrc-compatible-csv/</link>
		<comments>http://www.cwik.ch/2012/12/script-to-convert-ecsl-csv-files-exported-from-myob-on-mac-to-hmrc-compatible-csv/#comments</comments>
		<pubDate>Tue, 04 Dec 2012 19:13:48 +0000</pubDate>
		<dc:creator>Chris Wik</dc:creator>
				<category><![CDATA[Main]]></category>

		<guid isPermaLink="false">http://www.cwik.ch/?p=2248</guid>
		<description><![CDATA[Every 3 months we have to send Her Majesty&#8217;s Revenue and Customs (HMRC) in the UK a list of customers in the EU to which we have sold our services, this list is called the ECSL or European Community Sales List. Unfortunately due to choices made long ago which are non-trivial to change, we use [...]]]></description>
				<content:encoded><![CDATA[<p>Every 3 months we have to send Her Majesty&#8217;s Revenue and Customs (HMRC) in the UK a list of customers in the EU to which we have sold our services, this list is called the ECSL or European Community Sales List. Unfortunately due to choices made long ago which are non-trivial to change, we use a program called MYOB on a Mac for our book keeping. Unlike some of its competitors, MYOB can&#8217;t upload the ECSL data directly to HMRC, so we have to enter the data into forms manually.</p>
<p>This is a very tedious job, and since I don&#8217;t like tedious things, I set out to find a better solution. HMRC offer a facility whereby you can automate the data entry by importing a CSV file formatted a <a href="http://customs.hmrc.gov.uk/channelsPortalWebApp/channelsPortalWebApp.portal?_nfpb=true&amp;_pageLabel=pageImport_ShowContent&amp;id=HMCE_PROD_010685&amp;propertyType=document" target="_blank">specific way</a>. MYOB can export the ECSL to CSV, but the format is not compatible with the HMRC system.</p>
<p>The solution I came up with was to write a small Perl script to convert from the MYOB format to the HMRC format. Posted here, in case anyone else finds it useful!</p>
<pre>#!/usr/bin/perl -w

# Input file (MYOB export) should be passed as an argument to this script
my $infile = $ARGV[0];
chomp($infile);

my $outfile = $infile . '-hmrc.csv';

# Ask for year and month for this submission
print "Year: ";
my $year = &lt;STDIN&gt;;
chomp($year);
print "Month: ";
my $month = &lt;STDIN&gt;;
chomp($month);

# open output file
open(HMRCFILE, '&gt;', $outfile) or die "Could not open output file: $!";

# Write header
my $vatregno = "your_number_here";
my $subsidiary = "000";
my $name = "Christopher Wik";
print HMRCFILE "HMRC_VAT_ESL_BULK_SUBMISSION_FILE\n";
print HMRCFILE "$vatregno,$subsidiary,$year,$month,GBP,$name,0\n";

# Set record separator to \r (Mac) - MYOB saves in this format
$/ = "\r";

# Convert data from MYOB to HMRC format
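open(MYOBFILE, '&lt;', $infile) or die "Could not open input file: $!";
# Read from the MYOB export itself, not from STDIN
while(&lt;MYOBFILE&gt;) {
  next if !/,\w\w\d+/;
  my ($cust,$vat,$amount) = split(/,/,$_,3);
  my $country = substr($vat, 0, 2);
  my $vatno = substr($vat, 2);
  # Take the first run of digits as the amount; skip the line if there is none
  next unless $amount =~ /(\d+)/;
  my $amount_num = $1;
  # A minus sign anywhere in the amount field marks a negative value
  $amount_num *= -1 if $amount =~ /-/;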
  print HMRCFILE "$country,$vatno,$amount_num,3\n";
}
close(MYOBFILE);

# close HMRC output file
close(HMRCFILE);</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.cwik.ch/2012/12/script-to-convert-ecsl-csv-files-exported-from-myob-on-mac-to-hmrc-compatible-csv/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Multi-file, multi-line find/replace with Perl</title>
		<link>http://www.cwik.ch/2012/11/multi-file-multi-line-findreplace-with-perl/</link>
		<comments>http://www.cwik.ch/2012/11/multi-file-multi-line-findreplace-with-perl/#comments</comments>
		<pubDate>Tue, 27 Nov 2012 13:50:14 +0000</pubDate>
		<dc:creator>Chris Wik</dc:creator>
				<category><![CDATA[Main]]></category>

		<guid isPermaLink="false">http://www.cwik.ch/?p=2244</guid>
		<description><![CDATA[A customer recently contacted me for assistance. Their PC had contracted a virus which had installed a keylogger, which in turn had been used to steal their FTP password. The attacker logged into their FTP account and infected almost every file in the site with a couple lines of Javascript which attempted to install a [...]]]></description>
				<content:encoded><![CDATA[<p>A customer recently contacted me for assistance. Their PC had contracted a virus which had installed a keylogger, which in turn had been used to steal their FTP password. The attacker logged into their FTP account and infected almost every file in the site with a couple lines of Javascript which attempted to install a trojan on the PC of anyone visiting their site.</p>
<p>Unfortunately they didn&#8217;t have a clean copy of the site files and since the infection had happened some time ago, our own backups only had infected copies of their files. This left no option but to try to clean up their site by removing the infection.</p>
<p>Thankfully the attacker had thoughtfully surrounded every line of inserted code with HTML comments. This made the cleanup easier as we could attempt a find/replace across all files replacing everything between the start and end comments with an empty string, thus removing the code from the site. An example:<br />
<code><br />
&lt;!--7e5a0c--&gt;<br />
&lt;script type="text/javascript" language="javascript" &gt;nefarious code here&lt;/script&gt;<br />
&lt;!--/7e5a0c--&gt;<br />
</code></p>
<p>My first thought was to use a GUI editor like BBEdit on Mac which has a great find/replace function, but the site had many thousands of files totaling 2.6GB. Downloading, cleaning and re-uploading would take hours.</p>
<p>Instead I turned to my trusty friend Perl in collaboration with find and xargs. The solution turned out to be very simple, just one line on the Linux terminal:</p>
<p>find /path/to/webroot -type f -print0 | xargs -0 perl -0777 -i -pe 'BEGIN{undef $/;} s/7e5a0c.*?7e5a0c//smg'</p>
<p>Breaking this down, we get:</p>
<p>find: find everything of type file in the given directory and print this list, using the null character as a separator instead of newline so xargs doesn&#8217;t choke on files with spaces in their names.</p>
<p>xargs: feed the results of the find command as arguments to the following command. -0 says expect null as separator instead of newline.</p>
<p>perl: -0777 sets slurp mode, ie. read the input file in one go. -i says replace in-place (don&#8217;t write out a new file), -p says iterate over given files in a sed-like manner, and -e says execute the perl code given as the following argument. The code has 2 parts: 1. undef the record separator which defaults to newline so we can match multiple lines (strictly redundant with -0777, but harmless). 2. A find/replace regular expression (s = substitute). The match is non-greedy (.*?) so that in a file containing several infected blocks each block is removed on its own, rather than everything between the first and the last marker. Modifiers (after the regex): m = treat string as multi-line, s = treat string as a single line so that . also matches newlines, g = global matching. See <a title="Perl Regular Expressions" href="http://perldoc.perl.org/perlre.html" target="_blank">perlre</a> for more details.</p>
<p>The whole thing took just a few seconds to complete, searching 11028 files and removing all instances of the infection. Success!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cwik.ch/2012/11/multi-file-multi-line-findreplace-with-perl/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Isolating the cause of seemingly random webapp crashes (and identifying who is responsible for fixing it)</title>
		<link>http://www.cwik.ch/2012/07/isolating-the-cause-of-seemingly-random-webapp-crashes/</link>
		<comments>http://www.cwik.ch/2012/07/isolating-the-cause-of-seemingly-random-webapp-crashes/#comments</comments>
		<pubDate>Sun, 01 Jul 2012 13:00:49 +0000</pubDate>
		<dc:creator>Chris Wik</dc:creator>
				<category><![CDATA[Main]]></category>
		<category><![CDATA[lasso]]></category>
		<category><![CDATA[mySQL]]></category>
		<category><![CDATA[preprocessing]]></category>
		<category><![CDATA[Who's-Fault-Is-It]]></category>

		<guid isPermaLink="false">http://www.cwik.ch/?p=2203</guid>
		<description><![CDATA[As a managed hosting provider, it is sometimes difficult to definitively draw the line between a customer&#8217;s problem and our problem. We are paid to provide reliable infrastructure and platform, on which customers deploy their code. What we&#8217;re not responsible for is the reliability of their code. What happens then when a customer&#8217;s webapp starts [...]]]></description>
				<content:encoded><![CDATA[<p><strong>As a managed hosting provider, it is sometimes difficult to definitively draw the line between a customer&#8217;s problem and our problem.</strong> We are paid to provide reliable infrastructure and platform, on which customers deploy their code. What we&#8217;re not responsible for is the reliability of their code.</p>
<p><strong>What happens then when a customer&#8217;s webapp starts crashing occasionally and seemingly at random?</strong><br />
In one recent case, the frequency of such crashes went from zero to as much as half a dozen crashes per day over the course of a few months. The customer was pushing for us to fix the problem, but as far as I could tell the problem did not lie in the infrastructure or platform.</p>
<p>Out of memory errors indicated the problem may be a resource shortage, but doubling the RAM allocated to the virtual machine had no impact on the frequency of the errors. At this point I began to suspect buggy code was causing a runaway condition. <strong>The problem I faced at this point was convincing an increasingly unhappy customer that the problem was with their code.</strong></p>
<p><strong><em>To do this I needed evidence that the crashes were related to requests to URIs in their app.</em></strong></p>
<p>This particular app is written using the <a href="http://www.lassosoft.com">Lasso programming language</a>. One nice feature of Lasso is define_atbegin and define_atend, two tags which can be used to insert code to preprocess and postprocess for a particular script. These can be defined globally to run as preprocessor and postprocessor scripts for every request to the server. Using this feature,<strong> I installed a debugger system which is relatively simple in principle but surprisingly powerful.</strong></p>
<p>At the start of each request, before any customer code is executed, the preprocessor script creates a record in a MySQL table containing details of the request: date and time, request URI including any GET params, client IP, client browser, and a column called &#8216;closed&#8217; with a value of 0.</p>
<p>Once the customer code has finished executing, the postprocessor script updates the MySQL table to set the &#8216;closed&#8217; column to 1, record the execution time (helps to identify slow running scripts) and record the contents of the error stack. If the customer code crashes, the postprocessor code won&#8217;t get a chance to run. In this case the status of &#8216;closed&#8217; will remain 0.</p>
<p>The MySQL database can then be queried to retrieve a list of pages that never finished executing, pages that were slow to execute, or pages which finished but contained an error stack. This is all very useful debugging info, but it is the first condition which is the most useful, as <strong>this allows you to see exactly which URIs were requested but never completed &#8211; ie which pages are crashing the system!</strong></p>
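<p>The principle is independent of Lasso and MySQL. As a rough illustration only, here is the same open/close bookkeeping with a flat file standing in for the MySQL table. All names here (REQUEST_LOG, request_open, request_close, unclosed_requests) are made up for the example and have nothing to do with the real debugger code.</p>

```shell
#!/bin/bash
# Flat-file stand-in for the MySQL table, for illustration only. Each request
# gets one "open" line when it starts and one "close" line when it finishes;
# a request with no "close" line never completed.
REQUEST_LOG="${REQUEST_LOG:-/tmp/request.log}"

# Called by the preprocessor: arguments are a request id and the URI.
request_open() { printf 'open\t%s\t%s\n' "$1" "$2" >> "$REQUEST_LOG"; }

# Called by the postprocessor: argument is the request id.
request_close() { printf 'close\t%s\n' "$1" >> "$REQUEST_LOG"; }

# Print the URI of every request that was opened but never closed.
unclosed_requests() {
  awk -F '\t' '
    $1 == "open"  { uri[$2] = $3 }
    $1 == "close" { delete uri[$2] }
    END { for (id in uri) print uri[id] }
  ' "$REQUEST_LOG"
}
```

<p>A crashed request never reaches request_close, so its URI is exactly what unclosed_requests reports.</p>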
<div id="attachment_2207" class="wp-caption alignright" style="width: 310px"><a href="http://www.cwik.ch/wp-content/uploads/2012/07/Screen-shot-2012-07-01-at-11.png"><img class="size-medium wp-image-2207 " title="Screenshot of HTML display" src="http://www.cwik.ch/wp-content/uploads/2012/07/Screen-shot-2012-07-01-at-11-300x143.png" alt="" width="300" height="143" /></a><p class="wp-caption-text">Screenshot of HTML display</p></div>
<p>As the aim of this exercise was to highlight to the customer where they should be looking for errors, I quickly created an HTML display of the data from MySQL and put a password on it. The customer could then keep an eye on the various reports to check for slow pages, error stacks and pages that crashed.</p>
<h5><strong>Within a day of going live, the customer had identified the culprit code and uploaded a patched version. The webapp hasn&#8217;t crashed since.</strong></h5>
<p>While I can&#8217;t take credit for the original concept (<a href="http://www.linkedin.com/in/bilcorry" target="_blank">Bil Corry</a> came up with it many years ago) or even for writing the debugger code (I hired an <a href="http://www.lassosoft.com/cld/17029/Ke-Carlton">enormously talented programmer</a> to do that), I can claim to have successfully implemented this method to isolate the offending code. This is a really great tool that not only helped fix a hard-to-find bug, but also helped to clearly establish that it was customer code and not infrastructure or platform at fault. This made the customer happy, and a happy customer makes me happy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cwik.ch/2012/07/isolating-the-cause-of-seemingly-random-webapp-crashes/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
