 # How to correctly calculate meaningful average fuel consumption.

`You should always calculate the average fuel consumption from many individual values before you draw any conclusions.`

In my articles in the fleet management category, I repeatedly pointed out that no conclusions should be drawn from an individual consumption value determined during normal driving. It is too volatile.

Now, in this article, I will show you how to correctly calculate the average consumption and how to assess whether the value is good enough to be used.

You will find three ways of calculating and learn the pros and cons of each alternative.

Finally, we discuss how the quality of the average consumption value can be assessed by calculating a key figure.

The article will help you to choose which of the three options is most suitable for you.

You will be prepared to start improving your fuel consumption in a targeted and systematic manner.

Don’t let the formulas scare you, it’s easy to calculate. I will explain it very simply in the text.

## What is average consumption?

On some other websites, you will find what I call the “single value” under the term “average consumption”.

This is not a wrong statement.

The consumption over the route from one refueling stop to the next is made up of a number of averaged, unknown 100 kilometer values.

But, if I use the name “average consumption” for this number, how do I call then the average of the averages that we need for fleet monitoring?

For the sake of clarity, let’s agree:

• The consumption, which we calculate when filling up, is a single value of the relative fuel consumption. You can read how to calculate this figure here.
• The average consumption value covers many of these single values. This is the figure we need for drawing conclusions. You learn the correct calculation of this number here in this article.

This allows us to distinguish the two different values with their strengths and weaknesses.

You can read how to do fleet monitoring in the article How To Find Out The Fuel Consumption Using Fleet Monitoring.

The average consumption is the figure, which you need to compare and improve.

It can be calculated for individual vehicles, for the entire vehicle fleet, or for a selected group of vehicles sharing some common features. It depends on what you want to look at.

• You sort the values in groups depending on your matter of interest.
• You calculate the average consumption for every group.
• You compare the values and draw conclusions. The average fuel consumption values tell you the correlation between the respective features and the fuel consumption.

I am going to explain in detail how to do this search for improvement measures in a separate article.

Let’s look now at the different ways how the average fuel consumption value can be calculated.

## 1. Alternative—The mean of the single consumption.

The most obvious and therefore probably the most frequently used formula is the mean value.

All individual consumption values are added up and divided by the number of values.

Microsoft Excel has its own mean formula that you can use.

=AVERAGE(A1:A10)

Column A of the Excel spreadsheet contains the individual consumption values.

### Using the mean formula, we lose the correlation to the distance!

`However, this calculation method has a considerable potential for error. Therefore, I recommend using this procedure only if you control the error closely!`

By calculating the average consumption as the mean value from the individual consumption, you lose the relationship between the fuel consumption and the distance.

Each individual consumption value is taken equally seriously.

That’s bad because it means that the frequency of refueling has an impact on the result of the average consumption calculation, and that’s not correct.

The following happens:

• In heavy-duty applications with high fuel consumption, the tank is empty after a shorter distance compared to less heavy-duty applications.
• This increases the number of high consumption values because you have to fill up more often.
• For light operations with low consumption, the tank filling lasts longer and the refueling process is therefore delayed.
• This reduces the number of low fuel consumption figures because you don’t have to refuel as often on the same route.
• The balance of the individual values shifts in favor of the high and to the disadvantage of the low consumption values.
• The average consumption appears greater than it really is.

This error occurs whenever the severity of the operation or the load changes significantly during operation.

I want to show that with an example.

I chose some values arbitrarily so that this effect becomes clear. I am going to use the same values for all sample calculations. Then we can compare the results at the end.

The calculation of the mean value looks like this:

Remember this value, we will compare it with the results of the other calculation options below.

### Is the accuracy acceptable?

`Since this calculation method is very simple, you will surely wonder whether the error is acceptable.`

Well, the first thing you should do, is to assess how big the error really is in your specific case.

Therefore, you should use the other calculation methods for comparison.

If the deviation between the results of the different ways of calculating is small in your typical use cases, you can go on with the mean value formula.

Calculating the scatter gives you another clue. I’m going to show you how to do it later in the article. If the scatter of the individual values is low, then the error is also small.

Another alternative is to do the refueling based on the time instead of the fill level.

If you always fill up at the same time, regardless of the fill level in the tank, then this error is reduced as well. The refueling could be done at the beginning or the end of each day or shift.

However, with time-based refueling, you risk an error due to the mixing temperature. But that could be the better alternative because the errors are evened out over the refueling events if you don’t skip any.

## 2. Alternative—The total consumption.

A simple alternative that avoids the error of the mean value is the calculation of the average consumption as the total consumption.

For the total consumption, the tank quantities and the distances driven are added up individually. Then the sum of the diesel volumes is divided by the sum of the distances traveled and multiplied by 100.

Calculation of the total consumption

Although there is no predefined formula for this calculation in Excel, it is very easy to write down manually:

=100*SUM(A1:A10)/SUM(B1:B10)

Column A contains the tank quantities and column B the distances traveled.

Here is the example of the total consumption calculation with the input values from above:

In my example, this results in the correct average consumption of 30.0 l/100 km.

The error of the mean calculation is 1.7% (30.52 l/100 km vs. 30.0 l/100 km).

A drastic mistake! And that, although the individual consumption values are not sooooo far apart.

So you see it’s important to look at this closely.

`I recommend using the calculation of the total consumption, since all data are usually available and the time-consuming weighting of the mean values can be avoided.`

## 3. Alternative—The weighted mean of the single consumption.

Let’s now look at the calculation of the weighted average.

This formula gives each individual input value its correct share of the overall result.

The weighted average is calculated by calculating the share of the total consumption for each individual consumption value. To do this, the associated distance is divided by the total distance traveled and then multiplied by the fuel consumption value. The average is then the sum of all products from individual consumption and the associated route factor.

Calculation of the weighted average

For calculation of the weighted average in Microsoft Excel, you can use the sum-product formula.

=SUMPRODUCT(A1:A10, B1:B10)

Column A of the Excel table contains the individual consumption values. In column B, you need to calculate the weight for each individual value. This is the corresponding distance divided by the total distance.

Using my example data, the manual calculation looks like this:

As you can see, the result here is the same as when calculating the total consumption. So this method is also accurate.

Calculating the total consumption is more workable in my opinion, so I would prefer it.

At the moment, I can’t think of a situation where I would use the weighted mean method. Possibly if only the consumption data from the vehicle and the mileage are available and not the amount of diesel.

## The standard deviation – an indicator of trustworthiness.

Last but not least, we now want to check how much the individual consumption values scatter around the average consumption.

It is logical. The closer the single values are to each other, the more likely the average value is really representative.

If the individual values differ widely, there is a risk that the average consumption contains a larger error and is therefore not representative.

The statistical standard deviation (scattering) is useful for assessing this.

### Calculate the standard deviation for the average consumption.

Statistics offers us the standard deviation formula as an indicator of the scattering of values. So let’s dive right in.

The standard deviation is calculated by taking the difference between each individual value and the mean. Then square them, add them up, divide them by the number of measured values minus one and take the root of that.

Calculation of the scatter

Fortunately, the standard deviation is directly available as a formula in Microsoft Excel:

=STDEV.S(A1:A10)

Column A contains the individual fuel consumption values.

This calculation works for the mean calculation method.

Unfortunately, it’s not quite as easy with the other two values, the total consumption and the weighted average consumption.

However, since statistics don’t need to be absolutely accurate in this case, we can also use the individual values to calculate the standard deviation from the mean and transfer it to the other average values.

While that’s not entirely correct, it still gives us a sense of how great the uncertainty is.

This is what the scattering looks like in our example when I use the mean:

The scatter is here +/- 3,63 l/100 km.

The real average consumption is therefore very likely in the range between 26.89 and 34.15 l/100 km. That’s a pretty big range.

We have already seen that we found an error of 1.7%. However, there is still a great deal of uncertainty.

The different individual consumption values, which I deliberately chose to show the error effect in the mean, lead to a very large scatter.

In such a case, I would recommend that you take a closer look at the data and find out whether this is really correct or whether there are errors in it.

## Assess the statistical scattering of the average consumption.

But what does the standard deviation really mean exactly?

The standard deviation marks the area around the mean value, which contains more than half of all possible consumption values. (68%)

Since we don’t have an infinite number of values, but only a countable number, we don’t know the true mean.

The real mean value that would result if there were an infinite number of individual values is called the “expected value” or “true value”.

Unfortunately, it will always remain unknown to us, because we will never find out the “entirety” of all possible values.

Statistics give us formulas that we can use to find out how good our data is. I will describe that in the next article.

It is often helpful to compare the scattering of the average values with one another.

Let’s take a look at what the scatter can tell us.

### Case 1: Same standard deviation—The average consumption values are further apart than the standard deviation. The difference between the mean values is 2 l/100 km, the scatter is +/- 1 l/100 km in each case

That’s how we would like to see it always.

The standard deviation of the normal distributions is the same, so the spread of the values is the same. This suggests comparable operating conditions.

The difference between the means is equal to or greater than the standard deviation of the two means. With this, we can be relatively sure that the consumption of the blue groups is really better than the consumption of the red group.

To figure that out, you don’t necessarily need to visualize it with this graph. You can also see this when you look at the numerical values of the two distributions.

• Mean values: 25 l/100 km and 27 l/100 km
• The difference in consumption: 2 l/100 km
• Standard deviation: 2 x 1 l/100 km
• Conclusion: Consumption difference = standard deviation 1 + standard deviation 2

Now that I have shown you the curves, you can certainly imagine what it looks like based on the numerical values.

### Case 2: Same standard deviation—The average consumption values are within the standard deviation. Die Differenz der Mittelwerte ist 0,5 l/100 km, die Streuung ist jeweils +/- 1 l/100 km

When the curves are related, as in this graph, further analysis is required.

The standard deviation of both curves is the same here as well, so the scatter of the values of both groups of individual consumption is the same. The operating conditions, in which the trucks have been driven, do not seem to pose a problem.

However, the difference between the mean values is significantly smaller than the scatter at only 0.5 l/100 km. So we cannot be sure that the consumption of the blue groups is really better than the consumption of the red group.

Imagine the following scenario:

• The true expected value of the red curve is slight to the right of the red mean.
• The true expected value of the blue curve is slight to the left of the blue mean.
• Then the conclusion is tipped!

The difference in consumption could disappear or even turn around.

In such a situation, we must be careful with quick conclusions.

In the next article, I will explain in detail how to take a closer look at this situation in order to gain clarity. Basically, you have two options.

• Consider the average consumption to be about the same.
• Collect more data to reduce the standard deviation to a smaller amount.

### Case 3: Different standard deviation—The average consumption values are within the standard deviation. The difference between the mean values is 0.5 l/100 km, the scatter is +/- 1 l/100 km and +/- 2 l/100 km

This situation is something like the statistical super meltdown.

Here we have to pull out all the stops if we want to get a reasonably reliable statement at all.

The standard deviation of the normal distribution is significantly different. This forces us to analyze the data carefully.

Are the operating conditions so different that the scatter of the individual values is so different? If the usage is so different, does it even make sense to compare the consumption values with each other?

But there can also be errors or systematic influencing factors behind it.

If the use conditions are comparable but the distribution of the single values is so different, then something is wrong.

We take a closer look at this problem in the article on improving consumption with fleet monitoring.

At 0.5 l/100 km, the difference between the mean values is again significantly smaller than the scatter. So we cannot be sure that the consumption of the blue groups is really better than the consumption of the red group.

Due to the large spread, the situation is even more uncertain. So, in addition to the systematic factors, you also have to deal with the statistics.

In any case, caution is required at this point. If in doubt, we prefer not to draw any conclusions with consequences.

### Emergency aid

In case of such doubts, be sure to take a look at the data to see if anything stands out to you.

• Did you forget to write down refueling?
• Are the seasons or the duration of the data collection similar? (One full year vs. just one summer e.t.c.)
• Differences in tire changes?
• Are there systematic influencing factors that we have to exclude before we work with the data?

If you have documented additional usage conditions with your individual consumption values, you will be able to find such explanations.

The next step is to postpone the conclusion and collect more data first.

A self-healing effect with more data occurs when the database is too small. More data gives chance a chance to iron out the mistakes better.

With time and some experience, you get a feeling for these situations and can assess them better and better.

## The range of the data is not meaningful.

Scattering is not to be confused with range!

`The range is the difference from the lowest to the highest single value. It is easy to calculate, but says little, since individual “outliers” have a strong influence on the value.`

That’s some good news. We can simply ignore the value that suggests the greatest uncertainty.

## Summary

• An individual consumption value contains too many errors, so an average value is calculated, with which the random errors are averaged out.
• There are three different ways in which the average consumption can be calculated.
• In the case of widely differing operating conditions, the mean value is incorrect because the relation to the driving distance is lost.
• The calculation of the total consumption is more accurate.
• A weighted average gives the same result as the total consumption, but is more complex to calculate.
• The scatter provides information on how accurate the calculated average consumption is.
• The range is not relevant and should not be confused with the standard deviation.

In the next article, I will explain which statistical approaches you can use to further narrow down the uncertainties and when and how more data can help.

So please read it, it brings an even better understanding of the problem.