Fernando Cuenca

You have more data than you think


"I don't have enough data to make any meaningful analysis... I need to wait until I collect more..."


Actually, you need less data than you think:


💡 5 data points are enough to know the order of magnitude of the distribution's SCALE (are we talking about days? months? weeks? years?) 


💡 12 data points: the central 6 data points determine the "range of the median" (the "typical case", "this is how long things usually take")


💡 30 data points: things get more interesting:

  • take the lowest 6 data points: range of the "best case" (10th percentile, "this is how fast we can be")

  • take the highest 6 data points: range of the "worst case" (90th percentile, "this is how bad it can get")

  • take the central 10 data points: range of the "typical case" (median, or 50th percentile)
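The rules of thumb above can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary keys, and the sample lead times are made up for the example, and applying each rule to any dataset at or above its threshold (rather than exactly 5, 12, or 30 points) is an assumption of this sketch, not something stated above.

```python
def central(sorted_data, k):
    """Return (low, high) of the k central values of an already sorted list."""
    start = (len(sorted_data) - k) // 2
    return sorted_data[start], sorted_data[start + k - 1]


def rough_ranges(samples):
    """Apply the 5 / 12 / 30 data point rules of thumb to a list of numbers.

    Thresholds are treated as minimums, which is an assumption of this sketch.
    """
    data = sorted(samples)
    n = len(data)
    ranges = {}
    if n >= 5:
        # Enough to see the order of magnitude of the scale.
        ranges["scale"] = (data[0], data[-1])
    if n >= 12:
        # Central 6 points: range of the median.
        ranges["typical"] = central(data, 6)
    if n >= 30:
        ranges["best"] = (data[0], data[5])      # lowest 6 points
        ranges["worst"] = (data[-6], data[-1])   # highest 6 points
        ranges["typical"] = central(data, 10)    # central 10 points
    return ranges


# Hypothetical lead times in days (invented for illustration).
lead_times = [3, 5, 4, 9, 7, 6, 12, 8, 5, 10, 4, 7]
print(rough_ranges(lead_times))
```

With these 12 invented values, the "typical" range comes out as 5 to 8 days and the "scale" range as 3 to 12 days.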



In all these cases you can compare the ranges you get from the data to the expectation of your customer/stakeholders, and use that as a guide to stimulate improvement. 


To end with an "alexeism": 

"An improved service is better than a more precise model of an unsatisfactory service" 😉 -- Alexei Zheglov

 

Some additional clarifications

A clarification on the meaning of the ranges you find with this technique: each one is a high-confidence range (roughly 90%) for the location of the given percentile. So, for example, the 6 central data points in a dataset of 12 samples give us a range where, with roughly 90% confidence, we can expect to find the median.


In the example diagram above, for 12 data points the range would be 3 to 9, meaning: I can say with 90% confidence that the median for that distribution will be located there. Of course, I can't say how close to 3 or how close to 9, and there's a small chance it will fall outside the range.
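For readers who want to check the confidence figures themselves, here is a minimal sketch. It assumes the standard distribution-free argument for order statistics (independent samples from a continuous distribution), which this post does not spell out, so treat it as one plausible way to arrive at the numbers. The function name `coverage` is hypothetical.

```python
from math import comb


def coverage(n, lo, hi, p):
    """Probability that the p-quantile falls between the lo-th and hi-th
    smallest of n samples (ranks are 1-based).

    Assumes independent samples from a continuous distribution.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(lo, hi))


# 5 points: median between the smallest and largest value
print(f"{coverage(5, 1, 5, 0.5):.3f}")
# 12 points: median within the central 6 (ranks 4 to 9)
print(f"{coverage(12, 4, 9, 0.5):.3f}")
# 30 points: median within the central 10 (ranks 11 to 20)
print(f"{coverage(30, 11, 20, 0.5):.3f}")
# 30 points: 10th percentile within the lowest 6 (ranks 1 to 6)
print(f"{coverage(30, 1, 6, 0.1):.3f}")
# 30 points: 90th percentile within the highest 6 (ranks 25 to 30)
print(f"{coverage(30, 25, 30, 0.9):.3f}")
```

Under these assumptions the results land between about 85% and 94% rather than at exactly 90%, which is close enough for the kind of conversation starter described here.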


The point here is not a high degree of accuracy, but to show that a few data points are enough to give you an informed starting point for a conversation. For example, if someone claims that the "work here takes months", the 5 data points in the example above are enough for me to respond that it's likely not the case, and that we should be discussing "weeks" and not "months".
