Keeping Your Finger On Your Web Site’s Pulse: Using Web Site Statistical Report Programs

InterNIC: 3/98
Mike A. Hall, General Manager, HostAmerica (a division of HomeCom)

So you finally have your Web pages online and you anticipate hordes of visitors coming to your site to read about your products and services. It’s easy to think the most difficult work is behind you. But now that your Web site is online and being used, how are you going to rate its effectiveness?

That’s easy, you say - Web site statistics. Statistics will tell you how popular your Web site is. But you obviously want to know more than just the number of "hits" to your site each day. Most statistical reports also show the times of day your Web site is most active, the amount of data transferred to your visitors, and who some of your most frequent visitors are (or at least which Internet Service Provider they are going through).

This information is helpful, but there is a great deal more information locked inside your Web site’s access logs. For instance, you can quickly find out which of your pages are requested most frequently, which least frequently, and which page is most often the last that visitors see before exiting your site. You can also find out which Web browsers are being used to view your content. (Have you checked the way your HTML (Hypertext Markup Language) appears on Netscape’s and Microsoft’s latest browsers? How about America Online (AOL) or Prodigy?) Do you also want to know which Web pages outside of your site are sending visitors your way?

If the answer is "yes," you need to choose from among the growing set of log analysis or statistical report generation programs available. Most of these programs are available for download via the Internet as "restricted shareware," which uses the "try before you buy" approach.
These programs will cease to function after a brief period, usually 14-30 days, but that is plenty of time to find out whether a particular log analyzer will work for your needs.

Why are they useful and what kinds of options are there?

Since these products must eventually be purchased, and costs are often a concern for small business owners, the question arises, "why should someone with a Web site consider getting such software?" The short answer is that these programs can greatly improve the tracking, design and success of a Web site. For instance, going back to the type of Web browser used to view your content, Web site visitors won’t come back if your site looks like a mess on their systems. It pays to be proactive and to know as much as you can about the visits to your Web site.

As all veteran Web site designers know, it takes a concerted effort to catch the fleeting interest of Internet Web surfers. The often-mentioned two or three clicks are sometimes all you can count on from each visitor. You must therefore make the most of each visit, and over time you will be able to improve and change your design based on the data revealed by your Web site statistical reports.

At HostAmerica, we use "Virtual WebTrends." There are numerous reasons, but in a nutshell, this well-written program is very easy to download and install, has a user-friendly interface, and creates very detailed reports with little fuss. Still, a number of other software products provide equally useful statistical reports, and I’ve included a few links to lists of these programs at the end of this article.

The Statistical Report Creation Process

Most Web hosting providers create daily and monthly statistical reports automatically for their customers’ Web sites. This automation is designed to give the client a useful reporting mechanism with minimal time required of the provider.
Reports are processed by an automated system program running late at night in order to conserve each Web server’s CPU resources. Compiling thousands of reports takes many hours, even with the most powerful servers.

Many Web designers and developers require more complex statistical reports than their Web hosting providers can generate automatically on a daily basis. So, to keep clients happy, most providers offer access to each site’s "raw logs" in addition to their normal reports. Web site owners can then process their raw logs into customized statistical reports using software programs like Virtual WebTrends.

Let’s take a brief look at the makeup of these "raw logs." Raw access logs are most likely to be in the "NCSA Common Log Format." For those of you who aren’t yet up to speed on your World Wide Web history, the NCSA Development Team was part of the Software Development Group of the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign (please see http://hoohoo.ncsa.uiuc.edu/ for more information). They created the original Mosaic Web browser and originated one of the most widely used Web server software programs of recent history, NCSA HTTPd. Over the years, their chosen log format was adopted by the majority of the Internet community, although significant modifications and additions have been made since 1995.

The common log format consists of a one-line text entry that is created by the Web server each time a file or Web document is requested. A typical raw log file is made up of thousands of these single-line entries.
Here is an example of one entry:

ncsa.uiuc.edu - - [19/Sep/1995:15:19:07 -0500] "GET /images/icon.gif HTTP/1.0" 200 1656

The general common log syntax is as follows (each field separated by a space):

host ident authuser date request status bytes

Log analyzers and statistical report generation programs use those data fields to compile detailed reports with graphs and charts that aid quick and easy interpretation of the data. Recent additions to this log format are the "referrer_log" and "agent_log" entries. These additional pieces of information are often added to the general common log format and allow log analyzers to report on browser type and referral location. For a detailed description of these specific items, please see: http://www.apache.org/docs/mod/mod_log_common.html

Generating a Statistical Report

Stat programs acquire a Web site’s raw log files via the Web or FTP (File Transfer Protocol) and process them into detailed reports. These reports are often created in HTML so they can be displayed via the Web and reviewed by the many different parties involved with the Web site.

Virtual WebTrends, for instance, is designed to retrieve log files located on remote systems using an HTTP (Hypertext Transfer Protocol) URL (Uniform Resource Locator). Once Virtual WebTrends has retrieved the log file to your local system, subsequent analysis requires transferring only the differences between your local file and the remote file. This step saves a great deal of time by identifying any changes in the log file and automatically retrieving only the new information before starting the report generation process.

Once you determine how to create these customized reports, many programs offer a scheduling feature so that an automated report is waiting when you return from work or arrive for work in the morning. Just make sure you have a way to connect your local computer to the Internet in order to retrieve the updated raw log files.
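
For readers curious about what a log analyzer does with the common log format described above, here is a minimal sketch in Python. It is not how Virtual WebTrends or any other product works internally; it only illustrates the idea of splitting each entry into the seven fields (host, ident, authuser, date, request, status, bytes) and tallying a few of the numbers a report might show. The function names parse_entry and summarize are made up for this illustration, and every sample line other than the first (which is the entry quoted in this article) is invented.

```python
import re
from collections import Counter
from datetime import datetime

# The seven Common Log Format fields:
# host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)


def parse_entry(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A "-" in the bytes field means no content was sent; treat it as zero.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    # Assumes English month abbreviations such as "Sep" (the default C locale).
    entry["time"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
    # The request is normally "METHOD path PROTOCOL"; keep the path if present.
    parts = entry["request"].split()
    entry["path"] = parts[1] if len(parts) >= 2 else entry["request"]
    return entry


def summarize(lines):
    """Tally hits per page, hits per hour of day, and total bytes transferred."""
    by_path = Counter()
    by_hour = Counter()
    hits = 0
    total_bytes = 0
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            continue  # skip lines that are not in the common log format
        hits += 1
        total_bytes += entry["bytes"]
        by_path[entry["path"]] += 1
        by_hour[entry["time"].hour] += 1
    return {"hits": hits, "bytes": total_bytes,
            "by_path": by_path, "by_hour": by_hour}


# First line is the example entry from this article; the rest are invented.
SAMPLE_LOG = [
    'ncsa.uiuc.edu - - [19/Sep/1995:15:19:07 -0500] "GET /images/icon.gif HTTP/1.0" 200 1656',
    'example.com - - [19/Sep/1995:15:42:10 -0500] "GET /index.html HTTP/1.0" 200 2048',
    'example.org - - [19/Sep/1995:16:03:55 -0500] "GET /index.html HTTP/1.0" 304 -',
    'not a log line',
]

if __name__ == "__main__":
    report = summarize(SAMPLE_LOG)
    print("Total hits:", report["hits"])
    print("Bytes transferred:", report["bytes"])
    print("Most requested pages:", report["by_path"].most_common(3))
    print("Hits by hour of day:", sorted(report["by_hour"].items()))
```

A real report generator would go on to read the log from a file, handle the referrer and agent information, and render the totals as charts in HTML, but the counting step at its core looks much like this.
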
If keeping a connection available is too difficult when you’re not present at your computer, simply save the report configuration and run it again each time you want an updated report.

Once you have your report generated, take your time viewing it with your favorite Web browser. The higher-quality report generation programs will automatically open your default browser with the locally generated file loaded. If you’d like to show others your newly created statistical report, FTP this HTML file to your Web site and voila! You’re instantly well ahead of the pack with regard to displaying detailed statistics.

Some Statistical Report Generating Software Links

http://www.reallybig.com/stats.htm
This site lists 12 of the most popular and versatile statistical report creation/log analyzing products for a variety of platforms.

http://www.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/Log_Analysis_Tools

http://www.uu.se/Software/Analyzers/Access-analyzers.html

Permission is granted to quote, copy, or otherwise reproduce the materials in the InterNIC News, provided that the following copyright notice is retained on each and every copy: © Copyright 1997 Network Solutions, Inc. All rights reserved. For full copyright notice and disclaimer, please see copyright notice and disclaimer.

- - - - - - - - - - - - - - - - - - - - - - - - -
Original source: http://rs.internic.net/nic-support/nicnews/mar98/pulse.html
- - - - - - - - - - - - - - - - - - - - - - - - -