Database Functions

In this Virtual Workshop we will examine some of the built-in 'database' functions of Excel as well as using Excel to produce graphs. We will use Excel to analyse some of our Jimmy Webserver logs and then use this analysis to produce graphs which can be used as a tool to give an overview of the situation.

Getting Started

In order to prepare for what we want to do in this Virtual Workshop we must do two things before starting Excel. Firstly we must have some server log data and secondly we must know the names of the pages (or files) that we want to analyse and count the number of 'hits' for.

To collect the data go to the Jimmy Logfind Utility and choose a month and year to collect data from and also enter your jimmy username. Start the search. For example purposes I am going to look for the files in kbrown/vworks during October 2001. You should see as a result something like this:


62.31.64.2 - - [01/Oct/2001:08:27:51 +0100] "GET /usr/kbrown/vworks/ HTTP/1.1" 200 2088
62.31.64.2 - - [01/Oct/2001:08:27:51 +0100] "GET /usr/kbrown/vworks/Templates/vworks.css HTTP/1.1" 304 -
10.1.5.148 - - [01/Oct/2001:09:22:19 +0100] "GET /usr/kbrown/vworks/use_ssh.shtml HTTP/1.0" 200 11230
10.1.5.148 - - [01/Oct/2001:09:22:19 +0100] "GET /usr/kbrown/vworks/Templates/vworks.css HTTP/1.0" 304 -
10.1.5.148 - - [01/Oct/2001:09:29:52 +0100] "GET /usr/kbrown/vworks/use_ssh.shtml HTTP/1.0" 200 11230
10.1.5.148 - - [01/Oct/2001:09:29:52 +0100] "GET /usr/kbrown/vworks/Templates/vworks.css HTTP/1.0" 304 -
etc etc

Select all the log entries between the Horizontal Rule bars and copy them (edit/copy or right-click copy or CTRL-C). Next open notepad and paste the logfile entries into the blank file. Save that file as mylog.txt and close Notepad.

Next I want to identify the files to be included in the analysis, which are:

Note: The URL http://jimmy.qmuc.ac.uk/usr/kbrown/vworks/ will also retrieve the 'default.shtml' file and thus we will have to consider this when doing the analysis.

Getting the logfile into Excel

If we are going to make Excel analyse the logs, we first need to open that file in Excel and convert it into something meaningful to Excel, as a plain text file cannot be manipulated. Start Excel and open the 'mylog.txt' file. Excel should start the Text Import Wizard.

Step 1 - Accept the default settings of the first dialog box with the 'Delimited' box checked and click next.

Step 2 - Will ask how you wish to delimit (or separate) the fields - add Space to default Tab selection and click next:

Step 3 - Leave the settings as they are and click finish. The text file has now been imported and placed into fields.

There are some fields which we don't need so we may as well remove them. Select the whole of Column 'B' by clicking on the 'B' at the top.

Right-click and select delete (or select delete from the Edit menu). Column 'B' has now been removed. It may not appear so as Column 'C' has now become Column 'B' and they are similar, so delete that column now as well, and you should have something similar to this:

Columns A-F with sample data

Also delete columns 'C', 'E' and 'F' and resize fit the text. The quickest way to do this is to select all the columns and then double click on the point that the columns join, which automatically resizes the columns.

Save the file as an Excel file, rather than a text file.

Preparing the File to be Analysed

We are going to use the DCOUNTA function to count the occurance of each of the filenames within our data. The DCOUNTA syntax is as follows:

DCOUNTA(database,field,criteria)

So we need to define an area or RANGE to serve as the database and we also need to define fields. We will define the fields first. Select the first row by clicking on the row number, then select Insert / Row, which will add a row above our data. Add the following text to the new first row.

Next select all the data in rows 'A','B' and 'C' from row 1 to the last row containing data - this selection should be apparent by the cells turning light blue. Next we name that range by clicking Insert / Name / Define, and call the range log_data (this range is also known as the 'database').

fefine Name Dialog Box

We should now have defined the database and fields and the only thing left to define is the critera. This is done by specifying fields and critera in pairs of cells. Insert a second worksheet (Insert / Worksheet) and enter URL in cell 'A1' and your first filename in 'A2' surrounded by the wildcard character '*'. eg:

Column A with defined Critera

This means that we can now use the range 'A1:A2' as our critera (ie URL = *use_ssh.shtml*). Fill out the rest of the critera for each of the files you need to count hits for.

Columns A-D with defined critera

The final critera we have to add is slightly different. As we noted above, just asking for the directory will automatically display the default file, so we have to create a critera for hits on the directory only as well as the 'default.shtml' file.

Critera DefinedWe could set the critera as '*vworks/*' but that would include all of the files as they all have that as the first part of their URL. What we need to do is add a space between the forward slash and the wildcard to select a hit on that directory only, with no other filename specified, i.e. *vworks/ *.

Doing the Analysis

Having defined the database, fields and critera we can now use them within the DCOUNTA function to count the number of hits per file. Enter the names of the pages in column 'A'.

The pages listed in Col A

We are now ready to place the DCOUNT formula in column 'B'. Select 'B5' (ie miss out the default file for now) and enter this formula:

=DCOUNTA(log_data,"URL",A1:A2)

...this will calculate the number of hits from the 'log_data' database in the 'URL' field, meeting criteria defined in 'A1:A2'. The result should be a figure displayed in cell 'B5'.

First Count worked out (Using SSH - 166)

Fill out the rest of the fields EXCEPT the default page and you should see something similar to this.

Other counts completed

So that only leaves out the default page which is slightly trickier as we need to add all the hits on the 'default.shtml' file to all the hits on the 'vworks/' directory. This is done by using the SUM function and including both DCOUNTA functions for each file.

=SUM(DCOUNTA(log_data,"URL",G1:G2)+DCOUNTA(log_data,"URL",H1:H2))

Thus the resulting (and final) total of the hits would look like so:

Final result 

Creating the Graph

Now that we have some data to represtents the number of hits per page we have received, we can draw a graph to represent this. To do this we will select the data then use the 'Chart Wizard'. Select all the data that you want to chart.

results selected

Then click the 'Chart Wizard' button Chart Wizard Toolbar Button on the standard toolbar to will start the wizard. Choose a bar graph, with a 3D style and click next.

Chart Wizard - Step 1

Step 2 asks you to define a range, but we have already done this by making a selection so just click next.

In Step 3 you can fill out the graphs labels, giving a title and appropriate Z-axis label to describe the values.

Chart Options dialog box

Step 4 asks where we want the final graph to be placed. The default option (the same worksheet as our selection) is usually fine.

We have now generated a graph from our data. If you cannot see all your graph labels resize the graph. This can be copied and placed into a Word document or made into a graphic. For more information on that type of transformation see George McMurdo's Using word to make gif's.

 

Lastest 10 Threads - view all

ctiiwqcbbi

Posted By: ctiiwqcbbi at 17:23:02 on Wednesday the 4th of July 2007

Thanks for this site! 63t2wh.rtehe.biz i0eld.poiui.info wrs.gfngfn.biz pwbh.rtreh.biz ucp8.xczv.info

Reply to this comment

tzcqjzapbx

Posted By: tzcqjzapbx at 16:32:21 on Tuesday the 3rd of July 2007

Hello! Good Site! Thanks you! sstrzlnahqotde

Reply to this comment

information about excel and databases

Posted By: bekale joel at 23:30:20 on Tuesday the 9th of August 2005

hello I am very interested by your site.I saw on your site of the mysql of Excel into the same time.I would like to know if it a bond between mysql and Excel?

Reply to this comment

3d sum functions

Posted By: Kerry-Ann Saddler at 17:08:31 on Monday the 4th of April 2005

I would like to know how to create a 3d sum function in excel. Could I get the formula and the concept. Thanks.

Reply to this comment

help needed

Posted By: manojkumar at 11:46:32 on Monday the 21st of February 2005

Plz. anybody help me I am new to MySQL, I want to populate a table with a text file available in my /var/lib/mysql/db directory. the file is exactly as the table description. with delimited by tabs. Please give the exact command I tried with insert db_table values ('mydata.txt'); but it is not working HELP ME..... manoj

Reply to this comment

Database and Excel

Posted By: Amjad at 06:45:40 on Thursday the 15th of July 2004

why we use database instead of excel? Addvantage of database as compare to excel

Reply to this comment

help with if funtions

Posted By: Steve at 15:05:01 on Monday the 12th of July 2004

Hey

I just would like some over view on if funtions please?

1. the formula

2. explanation on the concept

3. yes and no concept in the fuction example if a=1, answer= yes or if b=2 answer=b

Im juct helping kind learn this funtion but i forgot it

Thank for creating a site which helps.

Reply to this comment

formulae help

Posted By: matt at 14:43:16 on Thursday the 17th of June 2004

basically i have many spreadsheets that contain information, then i have one more spreadsheet which is my template. Im trying to pull accross information using the vlookup formulae. e.g. =vlookup(a15,"name of excel sheet",3,false). now the bit that says "name of excel sheet" can i reference this to a cell, e.g a3, so that what ever is in cell a3 now appears in my formulae?

Help Really Really Apreciated

Reply to this comment

DSUM(Database,field,criteria)

Posted By: KENNETH at 12:33:57 on Friday the 12th of March 2004

Pease,this isn't a comment but a request,assist me and explain the above because I'm trying to program my workshheet which has 3 reportsin that the formulas put in can pick answers say,there are 2 worksheets 1-involving the general,the other the day which has been day1 with 2reports.I want the final to pick the total hrs within day1 using the above. Pliz exolain the precise meaning of criteria.

Reply to this comment

Re: DSUM(Database,field,criteria)

Posted By: Keith at 12:50:44 on Friday the 12th of March 2004

...much like the about example with DCOUNT the critera are a limiting factor so that instead of searching the entire database they would only search for critera (eg people working on certain days etc) rather than the entire database

Reply to this comment

Neat one

Posted By: Alexander V. Alexeev at 02:03:36 on Wednesday the 2nd of April 2003

This is a neat way of analysing the Apache log files with a widely available application. Also serves as a good demonstration how portable delimited (somehow structured) data can be analysed.

Reply to this comment

Re: Neat one

Posted By: vitalik at 17:56:51 on Monday the 28th of March 2005

hi. ti v php sharish horosho?

Reply to this comment


Post a Comment or Question

Name
Email
URL (optional)
Title of Comment
Comment
HTML Allowed:
a,code,strong,em
I would like to be e-mailed any response to my comment.

In this section

Related Reading

Related Books

Data Analysis and Business Modeling with Excel

Related Ads