Cycling products have come and gone for the past 100 years, and they still do today. And with them, so do their marketing sound bites...
Any intelligent cyclist must carefully inspect the marketing data handed to him, and question what is missing and why it's missing. Weak data can lead to weak correlations, spurious percentage differences and other logical fallacies. Until the missing numbers are accounted for, I wouldn't advise anyone to put their faith, or their money, in the claims.
When James posted a small article yesterday on the E-Hub at the Bicycle Design blog, I got very amused and decided to take a peek at the product website. I spent a little time looking at the interesting item proudly displayed but then had an itching desire to see the numbers behind the invention. Not just plain numbers. I wanted to see if they're meaningful numbers.
This page has a statement from Dr. Alen Orbanić (a university mathematician from Slovenia) telling us that the designers behind the innovation carried out a surefire experiment to prove without doubt that cycling with the E-hub showed the following things:
1) Increased average power output when compared to cycling with a conventional rear hub.
2) A 4% reduction in average and maximal heart rates in cyclists using this product, when compared to the same figures for cycling with a conventional hub.
3) A 10-15% reduction in blood lactate using the E-hub versus using a conventional hub.
So What Was The Experiment?
Well, I'll tell you the part of it they conducted outdoors. They brought together a population of cyclists from 20-60 years of age. How many? Not specified. Then they categorized them as "Professionals", "Recreational" and "Amateurs". How did they define who belonged where? No indication. What were their weights, fitness levels etc? No indication.
These cyclists were fitted with a Polar heart rate measuring system and then rode MTBs equipped with Ergomo powermeters on a 2km track (1.24 miles) with 14 degrees of average inclination. Apparently, they did this twice, once with the E-hub and once with a classic hub, with 24 hours of rest between the two runs. Levels of lactic acid were measured twice, immediately after each run with a hub.
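As a quick sanity check on the track description, a slope quoted in degrees can be converted to a percent grade with the tangent of the angle. This is only a sketch of the standard conversion; the function name is my own and is not taken from the product website.

```python
import math


def degrees_to_percent_grade(angle_deg: float) -> float:
    """Convert a slope angle in degrees to percent grade (rise / run * 100)."""
    return math.tan(math.radians(angle_deg)) * 100.0


# The study quotes 14 degrees of average inclination.
grade = degrees_to_percent_grade(14.0)
print(f"14 degrees is roughly a {grade:.1f}% grade")
```

So 14 degrees works out to a grade of a little under 25%, which is a seriously steep climb to hold for 2km.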
So What Does The Data Look Like?
The authors go on to claim they gathered a "vast quantity of data" but, for the sake of the reader's convenience, they picked 3 'random' data points corresponding to 3 cyclists for each class of cyclist. I guess this is a solid example of where you can't really thank people for their kindness :).
Here are the numbers :
1. Sample Points & Averages : There's a rule of thumb in good statistics. You need a minimum of 30 sample points before you do descriptive analysis on them to explain trends.
Take a look at the amount of power these cyclists are producing on this so-called 25% grade, 1.2 mile track. The professionals are producing puny average power outputs, while the recreational riders and amateurs are easily rivaling them, not only in power but also in speed.
This leads me to question, firstly, how the authors classified and defined these cyclists. From this meager amount of data, it seems to me that all three classes were almost equal in their cycling abilities.
I also have to say that averages can fool you if the data jumps around wildly. For the meager sample points presented above, you can see that the average power is pretty sensitive to outliers.
In fact, if we had been handed 30 sample points or more for each class of cyclist, the data could well have shown a lower average power, which could have reduced the resultant power differences between the E-Hub and the classic hub. Any guarantee that's not the case? The authors haven't proven it here but go on to artificially bump up the averages using just 3 data points mined from here and there. Furthermore, their conclusions about the apparent efficiency increase with the E-Hub are only relevant for these 3 sample points.
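To make the outlier argument concrete, here is a minimal sketch of how much a single unusual reading moves the average of 3 points versus 30. The wattages are hypothetical numbers I made up for illustration; they are not the figures from the product website.

```python
from statistics import mean

TYPICAL_WATTS = 255.0  # hypothetical "normal" reading
OUTLIER_WATTS = 400.0  # hypothetical single unusual reading


def mean_shift_from_one_outlier(n: int) -> float:
    """Return how far one outlier pulls the mean of n readings away from typical."""
    sample = [TYPICAL_WATTS] * (n - 1) + [OUTLIER_WATTS]
    return mean(sample) - TYPICAL_WATTS


shift_3 = mean_shift_from_one_outlier(3)
shift_30 = mean_shift_from_one_outlier(30)
print(f"n=3:  mean shifts by {shift_3:.1f} W")
print(f"n=30: mean shifts by {shift_30:.1f} W")
```

With these made-up numbers, the same outlier moves a 3-point average ten times further than a 30-point average, since the shift is simply the outlier's excess divided by the sample size.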
2. Spread : Closely following the absence of more samples is the question, what's the spread and deviation of this "vast amount" of data? I have no idea, as there's no indication of standard deviation. Without it, the data is meaningless. How can I tell if a majority of data points in this experiment are close to the average power output or not? What if outliers are pushing the average up?
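A small sketch shows why an average without a standard deviation tells you so little. The two sets of power readings below are hypothetical (invented for this illustration, not taken from the study); they share the same mean but have very different spreads.

```python
from statistics import mean, stdev

# Hypothetical power readings in watts, invented for illustration only.
tight_readings = [248, 250, 252, 249, 251]
wide_readings = [180, 320, 210, 290, 250]

tight_mean, wide_mean = mean(tight_readings), mean(wide_readings)
tight_sd, wide_sd = stdev(tight_readings), stdev(wide_readings)

print(f"tight: mean={tight_mean} W, sample sd={tight_sd:.1f} W")
print(f"wide:  mean={wide_mean} W, sample sd={wide_sd:.1f} W")
```

Both sets would be reported as "250 W average", yet only in the first one is a typical rider actually anywhere near 250 W.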
3. Range : Because only one sample data point (for power, HR and lactic acid) has been presented to us for each cyclist, we have no idea of the true range, or the true maximum and minimum values that would be observed. The data point presented to us is just one of what could be many, and they are all bound to vary, because that's how all processes are... they vary! Hence, the range could vary pretty significantly if we had more tests on the same individual.
4. Instrument & Measurement Error : Lastly, what about the instruments used? Were they calibrated properly and checked for accuracy against other power measurement systems? What's the bias in the system, if any? Could these numbers be the result of just random variability or regression to the mean? Some people take it for granted that measurement systems (instrument+human operator) that produce such outstanding numbers are always pin-point accurate.
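Here is a rough simulation of the random variability worry, combined with the fact that only 3 riders per class were shown. It assumes, purely for illustration, that the hub makes no difference at all and that each run carries some measurement noise; the rider count, baseline power and noise level are all hypothetical. I am not claiming this is how the authors chose their data points, only showing what selection can do to noise.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

N_RIDERS = 30            # hypothetical number of riders
TRUE_POWER_WATTS = 250.0  # hypothetical true power, identical for both hubs
NOISE_SD_WATTS = 10.0     # hypothetical run-to-run measurement noise

# Each rider does two runs with NO real difference between the hubs.
differences = []
for _ in range(N_RIDERS):
    classic_run = random.gauss(TRUE_POWER_WATTS, NOISE_SD_WATTS)
    ehub_run = random.gauss(TRUE_POWER_WATTS, NOISE_SD_WATTS)
    differences.append(ehub_run - classic_run)

overall_mean = mean(differences)
# Report only the 3 riders with the most favorable differences.
top_three_mean = mean(sorted(differences, reverse=True)[:3])

print(f"mean difference, all {N_RIDERS} riders: {overall_mean:+.1f} W")
print(f"mean difference, best 3 riders:  {top_three_mean:+.1f} W")
```

The mean of the best 3 can never be lower than the mean of everyone, so a hand-picked trio will always look at least as good as the full data set, even when nothing real is going on.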
I simply have to conclude that, so far, this data is meaningless to me. The rest of the data that follows on the webpage, done on an indoor ergometer, suffers from exactly the same types of weaknesses I have mentioned. These are basic rules to follow in statistics and I'm surprised they weren't followed in this case.
The product itself may be great. I can't say for certain that it isn't. But the numbers don't show me much so far. Thus, I think the declaration that this hub system really improves the efficiency of a cyclist compared to what we usually use must be taken with a handful of salt.