Operations Management & Spreadsheets - Data Structure

In yesterday's post about spreadsheets vs databases, I talked about why clients get stuck when using a spreadsheet to create management reports. Today, I want to expand upon this concept.

When clients come to me because they are stuck, often the problem is that the client does not know how to properly enter the data so the spreadsheet can be used as a flat-file database. Put another way, the data is not structured properly.

Sometimes, the data is almost structured properly. Let me share a few examples.

First, let's remember a few rules:

  • In a spreadsheet, every column is a field

  • The column heading is the field name

  • Every field name should be an item type - name, date, month, ID Number

  • Data in each cell in the column should be a form of the data type.

Problem 1 – Data is not normalized

This happens when the client thinks a field is unique when in fact it is not.

A typical error is that the client will have 12 columns, one for each month, with the headings “Jan”, “Feb”, “Mar”, and so on. But these are not unique fields. If fact, they are the same field – Month.

The client should have just one column labeled “Month”, with 12 rows, one for each month of the year.

When you correct a problem this way – removing columns that are data points, not fields – it is called normalizing the data.

Problem 2 – Purpose and the end result

The client has created the spreadsheet without taking the time to think through the report, answering such questions as:

  • Why do I need this report?

  • How will I use this report?

  • Who will use the report?

  • What is the source of data for the report?

  • How many people provide the data for the report?

  • What are the action steps to be taken based on the report data?

  • How critical is this report?

  • Does this report duplicate or overlap another report we use?

  • Is the report information available some other place so we don’t have to create this report at all?

Problem 3 – Singular focus

Often, the client's request is to fix only one spreadsheet, which is used to manage only one step in the entire process. The client does not view the spreadsheet in context - how it relates to other spreadsheets being used throughout the manufacturing process.

Data that serves as input into or output from the spreadsheet may be used in other spreadsheets that others use to manage the operations process. So the client ends up with multiple spreadsheets, all of which contain the same set of data. Duplicating data in this way increases the risks of error if there is no way to ensure all data sets are exactly the same.

It makes more sense to locate the source data in one location rather than having a separate data tab in every spreadsheet. This will not only ensure that data is not duplicated, but also that all relevant data is caputured, so management can make the best, fully informed decision.

I would note that with this situation occurs it is often because employees work in a silo - solely focused on finishing the tasks they are assigned, and viewing any coordination with others as a hindrance to getting their own work done. In such circumstances, the spreadsheet's structure and use is not the problem, but a symptom of a larger organizational issue that management must correct to improve operating efficiency and profitability.