OpenStats - logging of information

If you're interested in the OpenStats project and have installed it (see installation and configuration), nothing is easier than trying to log a few sessions, actions or parameters ;-)

The logging library consists of a single class, OpenStats_Logging. It uses the OpenStats_DB class (and the related OpenStats_DB_ResultSet class), which defines an abstraction layer over the database and is shared by all parts of the OpenStats project.

Let's have a look at these classes. The following text contains just a brief description of the methods; if you need more detailed information, check the comments in the source code.

OpenStats_DB

As the OpenStats project was founded before PDO was introduced to PHP, we wrote our own abstraction layer to separate the application code from the database. The class described in the next section (OpenStats_DB_ResultSet) is part of this layer too.

This layer is not responsible for opening or closing DB connections - it just runs DB queries and starts or ends transactions.

Only a PostgreSQL implementation is part of the OpenStats project, but thanks to this separation it should be possible to port the project to a different database (e.g. MySQL).

Public methods of the class are:

  • __construct($connection) - constructor, a connection to the database is a parameter
  • query($query) - executes a query and returns the result wrapped as OpenStats_DB_ResultSet
  • execute($query) - executes a query and returns number of affected rows
  • begin() - creates a new transaction
  • commit() - commits a transaction
  • rollback() - rolls back a transaction
  • escape($value) - escapes the value for use in an SQL query
  • getLastId($sequence) - returns last used value from a sequence
  • getNextId($sequence) - returns the next value from the sequence
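
To show how begin(), execute(), commit() and rollback() might fit together, here is a minimal sketch. It does not use the real OpenStats_DB class: FakeStatsDb is a hypothetical in-memory stand-in that only mimics the method names listed above, so the code runs without a database. The runInTransaction() helper, the notes table and the assumption that a failed query throws an exception are all illustrative and not taken from the OpenStats source.

```php
<?php

// Hypothetical in-memory stand-in for OpenStats_DB, used only so that this
// sketch runs without a database. It mimics the method names listed above.
class FakeStatsDb
{
    public $log = array();

    public function begin()    { $this->log[] = 'BEGIN'; }
    public function commit()   { $this->log[] = 'COMMIT'; }
    public function rollback() { $this->log[] = 'ROLLBACK'; }

    // Naive escaping for the stand-in only; the real class presumably
    // delegates to the database driver.
    public function escape($value)
    {
        return str_replace("'", "''", $value);
    }

    public function execute($query)
    {
        // Assumption: a failed query is reported by throwing an exception.
        if (strpos($query, 'FAIL') !== false) {
            throw new RuntimeException('query failed');
        }
        $this->log[] = $query;
        return 1; // pretend that one row was affected
    }
}

// Hypothetical helper: runs all queries in a single transaction and rolls
// back if any of them fails. Returns the number of affected rows or false.
function runInTransaction($db, array $queries)
{
    $db->begin();
    try {
        $affected = 0;
        foreach ($queries as $query) {
            $affected += $db->execute($query);
        }
        $db->commit();
        return $affected;
    } catch (Exception $e) {
        $db->rollback();
        return false;
    }
}

$db   = new FakeStatsDb();
$name = $db->escape("O'Reilly");
$ok   = runInTransaction($db, array(
    "INSERT INTO notes (author) VALUES ('$name')",
));
```

With the real class you would pass an OpenStats_DB instance instead of the stand-in; check the source comments for how it actually reports errors before relying on the try/catch shown here.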

Warning: Although this library separates the application code from the database, the separation is by no means complete. Do not expect that rewriting this class is all you need to port the library to a different database (e.g. MySQL) - there are many other implementation issues the library does not solve. Among the most important ones are differences in SQL support (and in performance - what's fast in one database may be very slow in another), sequence support (e.g. MySQL uses AUTO_INCREMENT columns instead of sequences), etc.

Note: One of the possibilities provided by this encapsulation is logging of SQL queries along with their duration - see the PGMon project.

OpenStats_DB_ResultSet

This class encapsulates query (SELECT) results as an iterator. Thanks to this the application code does not depend on the database and it's quite simple to port it to a different database.

Public methods of the class:

  • __construct($result) - constructor, wraps the result
  • hasNext() - checks if the result contains at least one more row
  • fetchNext($mapper = null) - returns next row from the result (or an instance created by the mapper)
  • free() - removes the result from memory
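
The iterator interface above is typically consumed with a simple loop. The sketch below uses FakeResultSet, a hypothetical array-backed stand-in with the same method names, so it runs without a database. Treating the mapper as a PHP callable is an assumption; the real class may expect something else, so check the source comments.

```php
<?php

// Hypothetical array-backed stand-in for OpenStats_DB_ResultSet, used only
// so that this sketch runs without a database.
class FakeResultSet
{
    private $rows;
    private $position = 0;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function hasNext()
    {
        return $this->position < count($this->rows);
    }

    public function fetchNext($mapper = null)
    {
        $row = $this->rows[$this->position++];
        // Assumption: the mapper is a callable that turns a row into
        // whatever the application wants to work with.
        return ($mapper === null) ? $row : call_user_func($mapper, $row);
    }

    public function free()
    {
        $this->rows = array();
    }
}

$result = new FakeResultSet(array(
    array('id' => 1, 'name' => 'homepage'),
    array('id' => 2, 'name' => 'contact'),
));

// Typical loop: ask whether another row exists, then fetch it.
$names = array();
while ($result->hasNext()) {
    $row     = $result->fetchNext();
    $names[] = $row['name'];
}

// Release the result once it is no longer needed.
$result->free();
```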

OpenStats_Logging

This library is responsible for the logging of statistics - creation of sessions, actions belonging to sessions, and parameters for actions.

In your application code you'll be working with this class only (apart from creating the OpenStats_DB instance that it needs).

Public methods of the class are:

  • __construct($db) - constructor, parameter is an instance of the OpenStats_DB class
  • getVisitorId() - returns a new, randomly generated visitor ID
  • createSession($visitorId, $ip, $forwardedFor, $referer, $userAgent, $language) - creates a session
  • createAction($sessionId, $language, $typeId, $pageId, $parameters = null) - creates an action
  • createParameter($sessionId, $actionId, $name, $value) - creates a parameter
  • createParameters($sessionId, $actionId, $parameters) - creates multiple parameters at once

Basic usage

The following piece of code creates necessary instances, logs a new session, creates a new action for this session, and then logs a parameter for this action:

<?php

    // load necessary classes
    require_once('openstats/openstats_logging.php');

    // starts a new PHP session
    session_start();

    // definition of useful constants
    define('TYPE_PAGEVIEW', 1);
    define('PAGE_HOMEPAGE', 1);

    // initialization of variables
    $visitorId    = null;
    $ip           = $_SERVER['REMOTE_ADDR'];
    $forwardedFor = $_SERVER['HTTP_X_FORWARDED_FOR'];
    $referer      = $_SERVER['HTTP_REFERER'];
    $userAgent    = $_SERVER['HTTP_USER_AGENT'];
    $language     = $_SERVER['HTTP_ACCEPT_LANGUAGE'];

    // create an instance of the DB layer
    $conn = pg_connect(...);
    $db = new OpenStats_DB($conn);

    // create an instance of the logging class
    $stats = new OpenStats_Logging($db);

    // read a previous value of visitor ID (stored in a COOKIE)
    if (isset($_COOKIE['visitorId'])) {
        $visitorId = $_COOKIE['visitorId'];
    } else {
        $visitorId = $stats->getVisitorId();
        setcookie('visitorId', $visitorId, time() + 31*24*60*60);
    }

    // create a new session (if necessary)
    if (! isset($_SESSION['sessionId'])) {
        $sessionId = $stats->createSession($visitorId, $ip, $forwardedFor,
                                           $referer, $userAgent, $language);
        $_SESSION['sessionId'] = $sessionId;
    } else {
        $sessionId = $_SESSION['sessionId'];
    }

    // create an action
    $actionId = $stats->createAction($sessionId, 'cs',
                                     TYPE_PAGEVIEW, PAGE_HOMEPAGE);

    // create a parameter
    $parameterId = $stats->createParameter($sessionId, $actionId,
                                           'myParameter', 'my value');

?>

This example illustrates the usage of the library quite nicely - I believe it's not too difficult to customize it for your needs. For the sake of brevity I took the liberty of simplifying things a little, and I'd like to fix that in the next section.

Several recommendations

Firstly, some of the values read from the $_SERVER array (especially the HTTP_ variables) may not be defined, so the log (or the HTML output, if you - my goodness - have not set a custom error handler) may fill up with notices. This is not a fatal problem - the code will work - but it's annoying at the very least.

A better way to handle this situation is, for example:

<?php

  $referer = (isset($_SERVER['HTTP_REFERER'])) ? $_SERVER['HTTP_REFERER'] : null;

?>
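
Since the same check is needed for several values, it can be wrapped in a small helper. The following is a minimal sketch: valueOrDefault() is a hypothetical function name, not part of OpenStats, and a sample array stands in for $_SERVER so the code runs anywhere.

```php
<?php

// Hypothetical helper: returns $source[$key] if it is set,
// otherwise the given default.
function valueOrDefault(array $source, $key, $default = null)
{
    return isset($source[$key]) ? $source[$key] : $default;
}

// A sample array stands in for $_SERVER here; in a real page you would
// pass $_SERVER itself.
$server = array(
    'REMOTE_ADDR'     => '192.0.2.10',
    'HTTP_USER_AGENT' => 'ExampleBrowser/1.0',
);

$ip           = valueOrDefault($server, 'REMOTE_ADDR');
$forwardedFor = valueOrDefault($server, 'HTTP_X_FORWARDED_FOR');
$referer      = valueOrDefault($server, 'HTTP_REFERER');
$userAgent    = valueOrDefault($server, 'HTTP_USER_AGENT');
$language     = valueOrDefault($server, 'HTTP_ACCEPT_LANGUAGE');
```

On PHP 7 and later the null coalescing operator gives the same result without a helper, e.g. $_SERVER['HTTP_REFERER'] ?? null.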

Secondly, the cookie value deserves better treatment - the code listed above is vulnerable, as an attacker may easily pass in an arbitrary value. It probably won't cause fatal problems for the application - at worst it will confuse the detection of unique visitors - but why not fix it, right?

One possibility is to store not just the visitor ID but also a value that makes any modification detectable, for example a hash of the ID computed with a secret salt (and checked whenever the cookie is read):

<?php

    // definition of salt value (secret value)
    define('SALT', '3s8f9df');

    // read the visitorId value from the COOKIE and verify its hash
    $visitorId = null;
    if (isset($_COOKIE['visitorId']) && is_string($_COOKIE['visitorId'])) {
        $tmp = explode(':', $_COOKIE['visitorId']);
        // accept the ID only if the cookie has both parts and the hash matches
        if (count($tmp) == 2 && $tmp[1] === md5(SALT . $tmp[0])) {
            $visitorId = $tmp[0];
        }
    }

    // if the visitor ID is not set (or the cookie was invalid), create a new one
    if ($visitorId === null) {
        $visitorId = $stats->getVisitorId();
        $cookieValue = $visitorId . ':' . md5(SALT . $visitorId);
        setcookie('visitorId', $cookieValue, time() + 31*24*60*60);
    }

?>
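
A somewhat stronger variant of the same idea replaces the salted md5 with a keyed hash. The sketch below uses PHP's hash_hmac() and hash_equals() (the latter needs PHP 5.6 or later). The function names signVisitorId() and verifyVisitorCookie() and the sample secret are hypothetical and not part of OpenStats.

```php
<?php

// Builds the cookie value "visitorId:signature".
function signVisitorId($visitorId, $secret)
{
    return $visitorId . ':' . hash_hmac('sha256', $visitorId, $secret);
}

// Returns the visitor ID if the signature is valid, otherwise null.
function verifyVisitorCookie($cookieValue, $secret)
{
    if (!is_string($cookieValue)) {
        return null;
    }

    // Split at the last colon, so the ID itself may contain colons.
    $pos = strrpos($cookieValue, ':');
    if ($pos === false) {
        return null;
    }

    $visitorId = (string) substr($cookieValue, 0, $pos);
    $signature = (string) substr($cookieValue, $pos + 1);
    $expected  = hash_hmac('sha256', $visitorId, $secret);

    // hash_equals() compares in constant time, which avoids timing leaks.
    return hash_equals($expected, $signature) ? $visitorId : null;
}

// Assumption: in a real application the secret is long, random and
// stored outside the source code.
$secret = 'replace-with-a-long-random-secret';

$cookieValue = signVisitorId('abc123', $secret);
$visitorId   = verifyVisitorCookie($cookieValue, $secret);
$tampered    = verifyVisitorCookie('abc124' . substr($cookieValue, 6), $secret);
```

Here $visitorId holds the original ID again, while $tampered is null because the signature no longer matches the modified ID.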

Another possibility is to encrypt the value (but it's still necessary to check that the value has not been modified "blindly").

And one last recommendation - when calling the createAction method, don't pass numeric values directly, as this quickly becomes confusing. Instead, define constants for action types and pages in a separate file, and use those constants.
