Using PowerShell Scripts as SVN Hooks

§ December 2, 2008 18:18 by beefarino |

You can tell from my recent blog posts that I'm spending a good deal of time in my SCM role at work.  I've recently been looking at using powershell scripts to replace some binary hooks that have become dated, and to expand our repository's integration with our bug tracking system.  All in all it was pretty simple, but a few gotchas held me up:

First, SVN hooks have to be directly executable (e.g., .exe, .bat, .com, .cmd, or windows script files); powershell scripts are not directly executable for security reasons, so I had to use a batch script as a proxy.  

Second, SVN invokes the hooks with an empty environment (again for security purposes); without an active PATH variable specified, the batch and powershell scripts need to specify full paths to all files.

Finally, some hooks, such as the post-revprop-change hook, read data from STDIN, but the proxy batch file will not forward its STDIN to the powershell process without some intervention.  

Here is the batch proxy for the post-revprop-change hook:

rem   =====================================================================
rem    %  Argument     Description
rem   --- ------------ ----------------------------------------------------
rem   [1] REPOS-PATH   (the path to this repository)
rem   [2] REV          (the revision that was tweaked)
rem   [3] USER         (the username of the person tweaking the property)
rem   [4] PROPNAME     (the property that was changed)
rem   [5] ACTION       (the property was 'A'dded, 'M'odified, or 'D'eleted)
rem   [STDIN] PROPVAL  ** the old property value is passed via STDIN.
rem   ---------------------------------------------------------------------

c:\windows\system32\windowspowershell\v1.0\powershell.exe -command "$input | d:\repository\hooks\postrevprop.ps1" -args "%1" "%2" "%3" "%4" "%5"

A few things to note:

  • I'm using full paths to the powershell executable and the powershell script; this is necessary because the batch file will be run with an empty environment and no PATH to search.
  • I'm not really executing the powershell script, I'm executing a command string that passes the implicit $input variable to the powershell script.  This $input variable resolves to the STDIN for the batch proxy, and enables the powershell script to read it.
  • Because my command string invokes a script file, I need to make sure that the powershell execution policy is set to RemoteSigned or Unrestricted on the SVN server.    
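Regarding that last point: if the execution policy on the server hasn't been set yet, something like the following from a powershell prompt on the SVN server (with administrative rights) takes care of it.  RemoteSigned permits unsigned local scripts like this one while still blocking unsigned scripts downloaded from the internet:

Set-ExecutionPolicy RemoteSigned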

Here is the powershell script invoked by the batch file proxy:

param( $repo, $rev, $user, $propname, $action );
@($repo, $rev, $user, $propname, $action) + $input |
    out-file "d:\repository\hooks\log.txt" -append

The script doesn't do much at the moment - just logs the hook parameters and input to a file so I can verify that the script is executing correctly.  Now that I have ironed out the major wrinkles, I can start porting our hooks to powershell!
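One handy trick: you can sanity-check the STDIN forwarding without touching a real revision property by faking the hook invocation from a plain command prompt.  The hook file name and argument values below are just examples, not part of the actual hook configuration:

echo some-old-property-value | d:\repository\hooks\post-revprop-change.bat d:\repository 42 beefarino svn:log M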

Correlating and Grepping Logs with PowerShell

§ November 6, 2008 16:44 by beefarino |

I've spent the last few days doing a post-mortem on a colossal system failure at a client site.  We have plenty of data on the failure, in the form of log4net application logs and Windoze event logs, but the amount of data at our disposal is making interpreting and correlating those logs a ginormous task.  Some of the logs are XML, others are CSV, so making correlations between them has involved searching multiple spreadsheets and text files for information manually.  Correlating across the logs is made even more difficult as the event log export contains datetime instants that are local to the machine, whereas the log4net logs contain UTC timestamps.

So what do I do when presented with such a Gordian Knot?  I select a katana sharp enough to shred the knot into pieces.  Lately my weapon of choice for this type of problem has been powershell.  My goal is to be able to correlate events by time and find errors or warnings across all of these logs.  To that end, I'll be using some powershell Shikai to create object hierarchies out of XML and CSV data.  In addition, I'll release powershell Bankai and extend those objects with custom properties that allow both types of logs to be combined and searched as if they were a single log.

Importing Log4Net XML Log Files

My application logs use the XML formatter built into the standard log4net distribution, which produces a partial XML document (as described here) similar to the following:

<log4net:event logger="MyApp" timestamp="2008-11-04T04:42:42.2612828-07:00" level="INFO">
    <log4net:message>validating xml digital signature on document</log4net:message>
</log4net:event>
<log4net:event logger="MyApp" timestamp="2008-11-04T04:42:43.1279382-07:00" level="INFO">
    <log4net:message>digital signature has passed validation</log4net:message>
</log4net:event>

Note there is no root element containing the <log4net:event /> elements, so the import script does a little fernagling to make the log appear to be a valid XML document:

# import-logxml
# imports a log4net application log file
# file: path to xml log file

param( $file )

# surround the log file contents with a root element
$xml = '<log xmlns:log4net="uri:doesnotmatter">' + ( gc $file ) + '</log>';

# load the log as XML and slurp out every log event
( [xml] $xml ).log.event |

    # add a note property indicating this log event is from an application log file
    add-member noteproperty -name logsource -value "applicationlog" -passthru |        
    # add a datetimeoffset property representing the instant of the log event
    add-member scriptproperty -name datetime -value {
            [DateTimeOffset]::Parse( $this.timestamp );
        } -passthru;

First, the contents of the log file are surrounded with a root element to make the log valid XML.  The script then uses powershell's built-in [xml] datatype to translate the XML string into an object hierarchy, and retrieves every log event from the log file.  Each <log4net:event /> element from the XML log file is represented by an object containing properties corresponding to the node's attribute values and child elements.
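If you haven't seen the [xml] datatype in action before, a quick example makes the behavior obvious: attribute values and child elements both surface as ordinary properties on the resulting objects:

> ( [xml] '<e level="INFO"><message>hi</message></e>' ).e.level
INFO
> ( [xml] '<e level="INFO"><message>hi</message></e>' ).e.message
hi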

The rest of the pipeline uses powershell's add-member cmdlet to extend the log entry objects with new properties.  The first add-member call adds a simple note property named "logsource" with a static value indicating that this object was created from an application log.  The second adds a calculated script property named "datetime" containing the instant of the log message as a System.DateTimeOffset structure.  Having this DateTimeOffset property will give me an appropriately-typed value against which I can compare, filter, and sort log entry timestamps.

Using the import-logxml script is straightforward:

> $log = import-logxml 'myapp.log.xml'

After running the script, the $log variable contains an array of objects representing the individual log events in the XML log file:

> $log[ 0 ]
datetime          : 11/04/2008 4:42:42 AM -07:00
logger            : MyApp
timestamp         : 2008-11-04T04:42:42.2612828-07:00
level             : INFO
thread            : 5076
message           : validating xml digital signature on document

Now powershell's filtering capabilities can be used to isolate log events; e.g., by severity level:
> $log | ? { $_.level -eq 'error' }
datetime          : 11/04/2008 4:44:16 AM -07:00
logger            : MyApp
timestamp         : 2008-11-04T04:44:16.8272804-07:00
level             : ERROR
thread            : 5076
message           : an exception was raised while processing the application configuration

Importing Event Log CSV Files

Importing the CSV event log data is a little more involved:

# import-eventlogcsv.ps1
# translates an event log export from
# CSV format to an object collection
# file: path to CSV export of event log
# offset: a timespan indicating the UTC offset
#     of the event log source

param( $file, [TimeSpan] $offset = [DateTimeOffset]::Now.Offset )

# add CSV column headers, if necessary
convert-eventlogcsv $file;

# import the CSV file
import-csv $file |

    # add a note property indicating this entry is from an event log
    add-member noteproperty -name logsource -value "eventlog" -passthru |
    # add a note property to store the value of the UTC offset
    add-member noteproperty -name offset -value $offset  -passthru |
    # add a datetimeoffset property representing the instant of the log entry
    add-member scriptproperty -name datetime -value {
            new-object DateTimeOffset(
                [DateTime]::Parse( $this.date + " " + $this.instant ),
                $this.offset );
        } -passthru |
    # add a property that translates the event log message type
    #    into a log4net log-level value
    add-member scriptproperty -name level -value {
            switch -re ( $this.eventtype )
            {
                "info" { "INFO" }
                "succ" { "INFO" }
                "fail" { "INFO" }
                "warn" { "WARN" }
                "error" { "ERROR" }
            }
        } -passthru

Event log timestamps are expressed in the time zone of the source computer, and in my particular case that source computer exists in a different time zone than my laptop.  So the import script accepts an $offset parameter containing a UTC offset to indicate the time zone of the event log source computer.

Importing CSV data into powershell is as simple as invoking the import-csv cmdlet; however, import-csv assumes the first row in the file contains column headers.  The event log export contains no such header row, so one must be added.  I've wrapped up the process of adding this header row into a script (convert-eventlogcsv.ps1) that is included in the download for this post; it is invoked just before import-csv.
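The actual convert-eventlogcsv.ps1 lives in the download rather than in this post, but the idea is simple enough to sketch.  This is an illustration only, not the script from the download, with the column names inferred from the event log output shown further below:

# illustrative sketch only - prepend a header row to the CSV export if one isn't already present
param( $file )
$header = "date,instant,source,eventtype,category,eventid,user,computer,message";
$lines = @( get-content $file );
if( $lines[ 0 ] -notmatch '^date,' )
{
    @( $header ) + $lines | set-content $file;
}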

The result of the import-csv cmdlet is an object array.  Each object in the array represents a single row of the CSV, with properties corresponding to each column in the row.  As with the application logs, several properties are added to each object using powershell's add-member cmdlet:

  • a "logsource" property indicating that the log event was read from an event log export;
  • an "offset" property containing the UTC offset passed into the script;
  • a "datetime" property that creates a DateTimeOffset structure representing the instant the log message was produced;
  • a "level" property that translates the event log entry type (Information, Warning, Error, etc. ) into strings matching the standard log4net log severity levels (DEBUG, INFO, WARN, ERROR, and FATAL).

Script use is straightforward:

> $evt = import-eventlogcsv 'eventlog.system.csv' ( [TimeSpan]::FromHours( -7 ) )

With the result similar to the log4net import:

> $evt[ 0 ]
date      : 11/04/2008
instant   : 4:42:42 AM
source    : MyApp
eventtype : Information
category  : None
eventid   : 2
user      : N/A
computer  : PPROS1
message   : application has started
logsource : eventlog
offset    : -07:00:00
datetime  : 11/04/2008 4:42:42 AM -07:00
level     : INFO

Correlating Events Across Logs

Now that the import scripts are available, correlating the logs is pretty simple.  First, we need to import our event and application logs:

> $log = import-logxml 'myapp.log.xml'
> $evt = import-eventlogcsv 'eventlog.system.csv' $log[ 0 ].datetime.offset

Note that the log4net log is imported first; this allows the time zone offset information it contains to be used to adjust the event log timestamps.  Once the two logs are imported, we can combine them using powershell's array concatenation operator:

> $data = $log + $evt

At this point, the $data variable contains the merged contents of the event log and application log.  The add-member cmdlet was used to extend these objects with custom properties during the import; now every merged log entry contains a logsource, datetime, and level property with identical types and semantics.  This means that we can now search, sort, and filter the combined logs using a single set of criteria.  For example, consider the following:

> $start = [DateTimeOffset]::Parse( "11/04/2008 4:40 AM -7:00" )
> $end = [DateTimeOffset]::Parse( "11/04/2008 4:45 AM -7:00" )
> $data | ? { $_.datetime -gt $start -and $_.datetime -lt $end }

which finds all events that occurred between 4:40 and 4:45 AM.  Or this:

> $data | ? { $_.level -eq 'error' -and $_.datetime -gt $start -and $_.datetime -lt $end }

which isolates all error-level events during the same time period.  Or this:

> $data | sort datetime

which outputs all log entries sorted by the time they were written.
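And because every merged entry now carries the logsource property, a quick format-table makes it easy to see the two logs interleaved and know where each entry came from; e.g.:

> $data | sort datetime | ft datetime, logsource, level, message -auto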

Exporting the Correlated Logs

While powershell's interactive filtering capabilities are useful, you may find you want to export the newly merged logs for further analysis in a spreadsheet or other tool.  This can be easily accomplished using the export-csv cmdlet:

> $data | export-csv 'mergedlog.csv' -notypeinformation

Download the Code

The import scripts are available here: (1.49 kb).


SVN Log Parsing with Powershell

§ June 20, 2008 07:47 by beefarino |

I absolutely love powershell.  I would estimate that over the past 6 months, powershell has saved me about two work weeks of time.

Case in point: next week we're entering a rather bulky merge phase across - count 'em now - four branches of the same project.  I needed a list of all of my commits for each branch to use as a cross-reference.  I thought about seeking out or coding up an SVN log parser, until I saw my powershell console and realized I didn't need to.  After a half-minute of hacking, I came up with this little ditty:

( [xml] ( svn log --xml | join-string -sep "" ) ).log.logentry |
    ? { $_.author -eq "jim christopher" } |
    format-list

In a nutshell, this one-liner requests the SVN log as XML, filters log entries with my SVN username as the author, and formats them as a very readable list.
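When I want to keep the results around as a cross-reference, a slightly longer variation selects the interesting fields and dumps them to CSV; the file name here is just an example:

( [xml] ( svn log --xml | join-string -sep "" ) ).log.logentry |
    ? { $_.author -eq "jim christopher" } |
    select revision, date, msg |
    export-csv 'mycommits.csv' -notypeinformation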

And that's the "power" in "powershell:" when I need a quick hack, powershell is there for me; but when I need the reliability and safety of objects, it's there for me too.



A Coalesce Filter for PowerShell

§ May 22, 2008 16:49 by beefarino |

Here's a little PowerShell script that acts as a coalescing pipeline filter:

param( [scriptblock] $predicate = { $_ } )
begin {
    $script:found = $false;
    $script:item = $null;
}
process {
    if( ! $script:found -and ( &$predicate ) ) {
        $script:found = $true;
        $script:item = $_;
    }
}
end { return $script:item; }

When used as a pipeline filter, this script returns the first non-null object coming out of the executing pipeline.  Here's a simple example:

$value = @( $null, "first", "second" ) | coalesce;

The result of this script is that $value is equal to "first".  I find myself using this quite a bit for things like default values for optional parameters:

param( $p1, $p2 );
$value = @( $p1, $p2, "defaultValue" ) | coalesce; 

The $predicate parameter allows you to pass in your own criteria as a script block; e.g.:

$p1 = @( $a, $b, $c ) | coalesce { $_ -gt 1 };

will return the first value from the list greater than 1.

Looks a lot like the where-object filter, except that coalesce returns only the first object from the pipeline matching the specified criteria.
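To make the difference concrete, here's a quick comparison (assuming the script above is saved as coalesce.ps1 somewhere on your path):

@( 0, 1, 2, 3 ) | ? { $_ -gt 1 }          # returns 2 and 3
@( 0, 1, 2, 3 ) | coalesce { $_ -gt 1 }   # returns only 2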