
Web analytics - where does the data come from?

Web analytics may sound daunting but it simply refers to looking at who is visiting your website, where they came from and how they behaved once there. Most importantly though, it is vital information for any online marketer.

So how do you find all this information? Where does the data that has to be analysed come from?

There are two methods of data acquisition. Either a small amount of JavaScript is added to each page for which information is required, and the data this JavaScript generates is analysed, or the log files that every website automatically generates can be used as the raw data.

The JavaScript method

This is usually done through a hosted subscription service. For a monthly subscription, the service provider supplies the JavaScript, analyses the data it collects and provides an interface through which the results can be viewed. The level of analysis depends on the particular provider and on the level of service subscribed to.

The advantage of the JavaScript method is that you do not need to access your log files, and the cost of analysis is spread over a monthly fee. The downside is that the fee has to be paid in perpetuity, and that your data is held on the service provider's server; that is, you do not own it or have final control over it.
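To make the mechanism a little more concrete: the snippet on each page typically sends a small request back to the service provider, carrying details of the page view as query parameters. The sketch below, written in Python, shows how a collection server might decode one such request. The host name, the file name hit.gif and the parameter names (page, ref, vid) are hypothetical and are not those of ClickTracks or any other real service.

```python
from urllib.parse import urlsplit, parse_qs


def decode_beacon(request_url):
    """Turn one tracking request into a page-view record.

    The parameter names (page, ref, vid) are made up for illustration.
    """
    query = parse_qs(urlsplit(request_url).query)

    def first(name):
        # parse_qs returns a list per parameter; take the first value, if any.
        values = query.get(name)
        return values[0] if values else None

    return {
        "page": first("page"),
        "referrer": first("ref"),
        "visitor_id": first("vid"),
    }


# A hypothetical request as the page snippet might send it.
example = (
    "http://stats.example.com/hit.gif"
    "?page=%2Fproducts.html&ref=http%3A%2F%2Fwww.example.org%2F&vid=abc123"
)
print(decode_beacon(example))
```

Because every record arrives at the provider in this way, the data ends up on their server rather than yours, which is the ownership point made above.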

The log file method

With this method, log file analysis software is required. Again, the level of web analysis provided will depend on the functionality of the software. The log files are downloaded from the web server and run through the web analytics software, which processes the raw data into reports ready for use.
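As a rough idea of what such software does with the raw data, here is a minimal sketch in Python that reads log lines and counts successful requests per page. It assumes the server writes the widely used Combined Log Format; the sample lines, addresses and page names are invented for illustration, and a commercial package does a great deal more than this.

```python
import re
from collections import Counter

# One entry in Combined Log Format:
# host ident user [time] "request" status bytes "referrer" "user agent"
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def page_counts(lines):
    """Count successful (status 200) requests per path."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["status"] == "200":
            counts[entry["path"]] += 1
    return counts


# Invented sample data: two successful page views and one "not found".
sample = [
    '192.0.2.1 - - [17/Aug/2005:10:00:00 +0100] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0"',
    '192.0.2.2 - - [17/Aug/2005:10:01:00 +0100] "GET /index.html HTTP/1.1" 200 5120 "http://www.example.org/" "Mozilla/4.0"',
    '192.0.2.1 - - [17/Aug/2005:10:02:00 +0100] "GET /missing.html HTTP/1.1" 404 310 "-" "Mozilla/4.0"',
]
print(page_counts(sample))
```

Each line already carries the referrer and the browser (user agent), which is where the "where they came from" part of the analysis comes from.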

The advantage of log file analysis is that you own the data (which has security advantages where this is business critical, such as for banks), and that after a one-off payment for the purchase of the software, no further financial investment is necessary. The downside is that log files must be downloaded and may be lost if they are not downloaded within a certain timescale. However, with higher-end products, such as ClickTracks Pro, the download is usually automated, which avoids the problem.

The differences in the information the two methods provide

We provide training on how to use ClickTracks to analyse your website data, so let's look at how the two flavours of ClickTracks, the hosted JavaScript version and the log file version, differ in the information they deliver. The differences are few, but they are significant.

By the way, you can download a free trial of ClickTracks at http://www.clicktracks.com

The robot report

One big and important difference is that the JavaScript method of data capture cannot monitor robot (spider) activity. Robots generally do not run the JavaScript on the pages they fetch, so their visits are never recorded. (This is an intrinsic shortcoming of the JavaScript method and applies to all programs and services that capture data this way.)

In order to index and subsequently rank a page, search engines send out programs called robots (or spiders) to grab information on what each site, and each page within it, is about. To do this, the robot must be able to follow the links within a site, otherwise it cannot find all the pages. Robots are simple beasts and find static HTML links the easiest to follow. Robot technology is improving, but it is not keeping pace with the technical developments of website design. A particular area of difficulty is the proliferation of content management systems: some are spiderable but many are not.

ClickTracks (log file version) has a robot report that shows clearly which pages the robot has looked at and when. This report is probably the most useful tool there is if rankings in the SERPs, ie organic or free rankings, are your aim. All your efforts at on-page and off-page optimisation are totally useless if the search engine robot simply cannot find the page. The robot report shows exactly what is being found and what is not.
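The idea behind a robot report can be sketched in a few lines of Python. This is not how ClickTracks itself works, merely an illustration under stated assumptions: the logs are in Combined Log Format, robots are recognised by a short and deliberately incomplete list of user agent names, and the list of site pages is supplied by hand. The sample log lines are invented.

```python
import re
from collections import defaultdict

# Combined Log Format, keeping only the fields this report needs.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "\S+ (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative list only; a real report would need a fuller, maintained list.
ROBOT_TOKENS = ("googlebot", "slurp", "msnbot")


def robot_report(lines, site_pages):
    """Return (pages fetched by robots with the times, pages no robot fetched)."""
    visits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        if any(token in agent for token in ROBOT_TOKENS):
            visits[match.group("path")].append(match.group("time"))
    not_found = [page for page in site_pages if page not in visits]
    return dict(visits), not_found


# Invented sample: one robot request and one request from an ordinary browser.
sample_log = [
    '192.0.2.10 - - [17/Aug/2005:03:12:45 +0100] "GET /index.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.7 - - [17/Aug/2005:09:30:02 +0100] "GET /products.html HTTP/1.1" 200 7311 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]
site_pages = ["/index.html", "/products.html"]  # hypothetical page list

crawled, missed = robot_report(sample_log, site_pages)
print("Fetched by robots:", crawled)
print("Not fetched by any robot:", missed)
```

The second list is the one to watch: a page that people visit but no robot ever requests is a page the search engines probably cannot reach.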

Returning visitors

Log files can only show whether a visitor has been to the site before if the site sets persistent cookies. With the JavaScript version, the cookie is set automatically.
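Once a persistent cookie is in place, separating new from returning visitors is straightforward. The Python sketch below assumes each hit has already been reduced to a date and the value of a visitor cookie (or None where no cookie was sent); the cookie values and dates shown are invented.

```python
def split_new_and_returning(hits):
    """Split visitor IDs into those seen on one day only and those seen again later.

    hits: (date, visitor_id) pairs in chronological order, where visitor_id is
    the persistent cookie value, or None if the request carried no cookie.
    """
    first_seen = {}
    returning = set()
    for day, visitor_id in hits:
        if visitor_id is None:
            continue  # without a cookie the visitor cannot be recognised
        if visitor_id not in first_seen:
            first_seen[visitor_id] = day
        elif day != first_seen[visitor_id]:
            returning.add(visitor_id)  # same cookie seen on a later day
    new_only = set(first_seen) - returning
    return new_only, returning


# Invented sample data.
hits = [
    ("2005-08-01", "a"),
    ("2005-08-01", "b"),
    ("2005-08-01", "a"),
    ("2005-08-03", "a"),
    ("2005-08-03", None),
]
new_only, returning = split_new_and_returning(hits)
print("New:", new_only, "Returning:", returning)
```

Here a visitor counts as returning only when the same cookie turns up on a different day; other definitions, such as a gap of 30 minutes or more, are equally reasonable.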

Access to data

Sometimes technical issues affect which web analytics method is used. It must be possible to add the requisite JavaScript to each page for the JavaScript method. Equally, it must be possible to access the raw log files in order to analyse them. Occasionally getting at log files can be a problem. It may simply be a matter of security, in that FTP access to the site is required in order to download the logs. This can usually be overcome, though, by the IT department downloading the log files and then making them available to the marketing department offline.

A bigger problem is when the hosting company does not provide raw log files but insists on presenting the website stats using only its own package, which is usually poor to very poor. All websites generate log files, but some hosting company servers are simply not configured to provide individual websites with their own log files. When multiple sites are hosted on a single server, sometimes the configuration does not separate out the individual sites' data. In this case the only solution is to move hosting company.

The scope of the analysis

The JavaScript method will give results only for pages to which the required JavaScript code has been added. This is usually done automatically, but any pages without the code will not feature in the analysis.

Equally, information about a particular page will only be available in log files if there has been some activity on that page, ie someone has visited it.
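Both blind spots can be checked for. The Python sketch below takes the HTML of each page and the set of paths that appear in the logs, and lists the pages that lack the tracking code and the pages that have had no activity. The marker string used to spot the tracking code, and the sample pages, are hypothetical.

```python
def coverage_gaps(page_sources, logged_paths, tag_marker):
    """Find pages each method would miss.

    page_sources: dict mapping page path to its HTML source
    logged_paths: set of paths that appear in the log files
    tag_marker:   text that identifies the tracking snippet in a page
    """
    untagged = sorted(
        path for path, html in page_sources.items() if tag_marker not in html
    )
    unvisited = sorted(path for path in page_sources if path not in logged_paths)
    return untagged, unvisited


# Hypothetical marker and sample pages.
TAG = "stats.example.com/track.js"
page_sources = {
    "/index.html": '<html><script src="http://stats.example.com/track.js"></script></html>',
    "/about.html": "<html>no tag here</html>",
    "/contact.html": '<html><script src="http://stats.example.com/track.js"></script></html>',
}
logged_paths = {"/index.html", "/about.html"}

untagged, unvisited = coverage_gaps(page_sources, logged_paths, TAG)
print("Missing the JavaScript code:", untagged)
print("No activity in the logs:", unvisited)
```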

The web analytics choice

Whether to use the log file method or JavaScript may well be determined for you. If getting access to your log files is a problem then JavaScript is the way to go. If robot activity is vital then the log file approach is essential.

It may come down to a commercial decision based on whether you prefer a one-off payment for a log file analysis program or a monthly subscription to a service. Or you may base your choice on whether you prefer to have everything running on your own machines or to have someone else responsible for running the software, as happens with the hosted JavaScript solution. Whichever approach you take, the most important thing is that you analyse your visitors' behaviour. Anything else is flying blind when the competition has its eyes wide open!

Search Engine Workshops Ltd
Rosedale House
Rosedale Road
Richmond
TW9 2SZ

17th August 2005