2 Jun 2008

Designing a Log reporting tool

Last week was really interesting in terms of developer productivity. After a long time I managed to get my hands dirty with some interesting coding. The idea had been at the back of my mind all along, but it took a job requirement to really get it moving.

Many of us developers have used logging, and at times found it cumbersome to wade through a sea of log files looking for statistics. This is especially true of web applications that run on clusters and generate log dumps on multiple machines. That was exactly the requirement at our end: we needed to quickly come up with a tool to parse the logs and extract statistics from them.

The groundwork for this had already been laid by another team, which had a nice log extraction utility in place. It could gather logs from multiple machines and extract lines matching particular patterns into a new file. So as output, you get summary files containing all instances of a particular search term.

So far so good. However, we needed to extract some statistics from these logs, and therein comes the reporting part. How does one go about creating a report from these output files? I hunted around on SourceForge for open-source applications that would meet this goal, but none seemed to fit the bill. And to top it all, we hadn't yet discussed with our client how the report needed to be formatted, how the details needed to be segregated, and so on. So whatever we came up with might need to be changed. Reports are always like that: everyone has a different opinion on how they should present the data.

Therein lies a challenge for a developer: using a strategy that allows him to change the report format easily to suit the client's requirements. The more I thought about it, the more I was convinced that the report data needed to be generated as XML and the presentation dynamically built using XSL. So we decided to build a Java utility that would take the logs generated by the first stage, extract statistics from them, output the data as XML, and then apply the required XSL stylesheet to generate the desired report. The pipeline ended up something like this: raw logs → extraction utility → statistics as XML → XSL transform → HTML report.
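The XML-to-report stage can be sketched in a few lines of Java using the standard JAXP `javax.xml.transform` API. To be clear, the stats schema and the stylesheet below are invented for illustration; the real extractor's output would look different:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ReportDemo {

    // Hypothetical statistics extracted from the logs, serialised as XML.
    static final String STATS_XML =
          "<stats>"
        + "<entry date='2008-06-01' machine='web01' errors='3'/>"
        + "<entry date='2008-06-01' machine='web02' errors='5'/>"
        + "<entry date='2008-06-02' machine='web01' errors='1'/>"
        + "</stats>";

    // A bare-bones XSLT 1.0 stylesheet rendering the stats as an HTML table.
    static final String REPORT_XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html'/>"
        + "<xsl:template match='/stats'>"
        + "<table><xsl:apply-templates select='entry'/></table>"
        + "</xsl:template>"
        + "<xsl:template match='entry'>"
        + "<tr><td><xsl:value-of select='@date'/></td>"
        + "<td><xsl:value-of select='@machine'/></td>"
        + "<td><xsl:value-of select='@errors'/></td></tr>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the stats document and returns the report.
    static String renderReport() throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(REPORT_XSL)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(STATS_XML)),
                    new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(renderReport());
    }
}
```

The point of the split is that swapping the stylesheet changes the report format without touching the Java code at all.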

The interesting part was that I managed to wrap it up in two days flat. It was the XSL styling that took up most of the time, but it was fun; I've always liked dabbling in XSL, especially grouping the data and displaying it by date, by machine and so on. It's not yet finished, but now I'll let the client put forth his views on it and drive the next phase of required changes. I can already envisage requirements for, say, a provision to dynamically select report criteria in the HTML page. Hmmm, better start brushing up on some jQuery selectors ;-).
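On that grouping point: XSLT 1.0 (which the JDK ships with) has no for-each-group, so grouping by date is typically done with the Muenchian method, i.e. a key plus a generate-id() comparison to pick one representative entry per group. Here is a sketch against the same invented stats schema as above, summing an errors count per date:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class GroupedReport {

    // Hypothetical stats document (same invented schema as the earlier sketch).
    static final String STATS_XML =
          "<stats>"
        + "<entry date='2008-06-01' machine='web01' errors='3'/>"
        + "<entry date='2008-06-01' machine='web02' errors='5'/>"
        + "<entry date='2008-06-02' machine='web01' errors='1'/>"
        + "</stats>";

    // Muenchian grouping: the key indexes entries by date; the for-each
    // selects only the first entry of each date group, then sums the group.
    static final String GROUP_XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:key name='by-date' match='entry' use='@date'/>"
        + "<xsl:template match='/stats'>"
        + "<xsl:for-each select='entry[generate-id()=generate-id(key(\"by-date\",@date)[1])]'>"
        + "<xsl:value-of select='@date'/><xsl:text>: </xsl:text>"
        + "<xsl:value-of select='sum(key(\"by-date\",@date)/@errors)'/>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the grouping stylesheet over the stats and returns plain text.
    static String render() throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(GROUP_XSL)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(STATS_XML)),
                    new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(render());
        // prints:
        // 2008-06-01: 8
        // 2008-06-02: 1
    }
}
```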


  1. Neat design mate.

    Post extraction, we could store the extracted data in a database rather than flat files, and during the analysis phase just convert the DB values into XML. Just a peasant's tweak!

    This would make the system more scalable if the client wants to query legacy data for creating reports. It would also reduce the overhead of maintaining the log files of multiple servers for later reference. jQuery is a good bet.

    We had created a similar reporting tool to handle data from multiple systems, and storing the legacy data in a common store worked wonders for us, with the ever-thirsty client always on the lookout for new types of jazzy reporting apps.

    Cheers mate!!
