They Shoot, They Score!
Are you trying to do a thing?
Probably a good idea to have some clear, unambiguous, and measurable success criteria then.
I know, I know, not exactly an earth-shaking revelation, but still, it's an important point that a lot of people tend to just gloss over. Like me. I used to gloss over it all the time.
Not anymore though, mostly because of an adventure that I had a few years back with Objectives and Key Results (OKRs).
I've written in depth about said adventure, but I went a little bit light on the process that we used to measure and discuss the key results for each objective.
That changes today.
Trying To Reach The Green From Here?
Let's go back a step and talk about key results and success criteria, because I've just used the terms interchangeably in the intro and I suspect it might be a little bit confusing.
From an OKR point of view, you generally have a high-level, unmeasurable, somewhat abstract objective, backed up by a set of lower-level, measurable, and well-defined key results.
In comparison, the projects that I've been running recently do not use OKRs. Instead, you generally have a problem statement of some sort, supported by a set of success criteria that provide an indication of when enough of the problem has been solved.
To me, key results and success criteria are basically the same thing: a metric or value of some sort that you watch over time, with the expectation that it will change based on your actions.
And they are hard to do well.
The real value here is just measuring something, talking about it regularly and adjusting the trajectory of the work based on whether or not you are seeing the changes that you're expecting.
And that isn't quite as hard.
Did That Go In? I Wasn't Watching, Did It Go In?
I'm going to focus on the OKR scoring process in this blog post, but I think that the same sort of thing could easily be applied any time you want to evaluate a metric relevant to a piece of work.
The key to the entire thing is to do the scoring process regularly and as soon as possible after the work starts.
I can't emphasise that enough, because the sooner you have some sort of idea about how things are going and what direction you are trending in, the more likely you are to succeed overall.
The last time I did OKRs, scoring was a ritual each team did every two weeks.
It went something like this:
- Gather the key results for the current point in time
- Enter said key results into a simplified spreadsheet
- Distribute spreadsheet for consumption
- Get everyone to look at the key results and give a score between 0% and 100% that describes how confident they are that the goal of the key result will be accomplished by the end date
- Find outliers (e.g. Bob is 10% confident but Mary is 80% confident)
- Talk about the outliers and re-align confidence scores as necessary
- Look into low confidence scores and create actions to increase confidence
- Retire to a small, dark room
- Cry quietly
Okay, those last two are more of a personal choice, but the rest of it is pretty much how it went.
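To make the ritual a little more concrete, here's a minimal sketch of the sort of thing the simplified spreadsheet captured and the two checks that mattered most (outliers and low confidence). The key results, names, and thresholds below are invented purely for illustration; in practice this lived in a spreadsheet and a conversation, not code.

```python
from statistics import mean

# Hypothetical data: one entry per key result, with the current measurement
# and each person's confidence (0-100%) that the target will be hit by the
# end date.
key_results = [
    {
        "name": "Reduce average support ticket resolution time to 2 days",
        "current": "3.5 days",
        "confidence": {"Bob": 10, "Mary": 80, "Alex": 45},
    },
    {
        "name": "Increase weekly active users to 5000",
        "current": "4100",
        "confidence": {"Bob": 70, "Mary": 75, "Alex": 65},
    },
]

OUTLIER_SPREAD = 40   # flag for discussion if scores differ by more than this
LOW_CONFIDENCE = 50   # flag for follow-up actions if the average is below this

for kr in key_results:
    scores = kr["confidence"].values()
    spread = max(scores) - min(scores)
    average = mean(scores)

    print(f"{kr['name']} (currently {kr['current']})")
    print(f"  average confidence: {average:.0f}%, spread: {spread}%")

    if spread > OUTLIER_SPREAD:
        print("  -> outlier: talk about why the scores differ and re-align")
    if average < LOW_CONFIDENCE:
        print("  -> low confidence: create actions to change the trajectory")
```

The exact thresholds don't matter much; the point is that a big spread means people are working from different information, and a low average means the current approach probably needs to change.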
It's All In The Hips
There are a lot of points in the process that I've just described where you can get real value.
The first and most obvious is the part where you actually have to go and get some numbers for your key results. This can be time-consuming, especially if it's the first time that you're measuring something, but it's incredibly important.
The moment you try to get real numbers for a metric is the moment that it all becomes real. You can't hide behind vagaries anymore and you have to actually follow through with what you said you were going to do.
The second is in the sharing of those key result numbers. Distributing those key results via a simplified representation is a great way to reduce noise and cut down the amount of information that people have to digest.
What this does is provide clarity about how things are trending, so that there is no ambiguity in the progress towards the goal. Obviously, you want to provide source dashboards or links for those who want to delve into the details, but it's best to provide a simplified view at the top level to reduce cognitive load.
The third is in the calibration process, where the team discusses outliers in the confidence scores. This is an exercise in creating a shared understanding, ensuring that everyone in the team has access to the same information as everyone else.
The last is where you discuss any key results with low confidence scores and then change your approach to increase confidence moving forward. What you really want is a series of actions, each with a well-defined owner, describing how things will change and what sort of impact you should be able to see on the key results come the next scoring session.
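For what it's worth, those actions don't need to be anything elaborate. Something shaped roughly like the record below (all fields and values are made up for illustration) is enough, as long as it actually gets reviewed at the next session.

```python
# Hypothetical shape of a follow-up action coming out of a scoring session:
# an owner, a concrete change, and the impact you expect to see by the time
# you score again.
action = {
    "key_result": "Increase weekly active users to 5000",
    "owner": "Mary",
    "change": "Run the onboarding email experiment on new signups",
    "expected_impact": "Weekly active users trending above 4500",
    "review_by": "next scoring session",
}
```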
Honestly, the earlier steps are an implicit way of creating change by ensuring that everyone knows what is going on, but the last step is explicit. If you fail to do that well, you're probably just going to keep trundling along with middling confidence scores and maybe achieve mediocrity if you're lucky.
And I don't know about you, but I hate being mediocre.
Damned Alligator Bit My Hand Off
Overall, I think it's a pretty good process.
But in my experience, it tends to fall down in a few places, and, other than being more disciplined, I'm not sure what to do about that.
The most common problem that I saw was a fundamental inability to get actual numbers for the key results. People love coming up with measures that they hypothesise should get them closer to a goal, but actually going out and getting real numbers for those measures can require an incredible amount of effort.
The real impact is that sometimes the numbers just aren't there, which kind of makes the entire scoring process break down.
Another issue I found was that often so much time was spent doing the first few bits of the scoring process (discussing the numbers, getting the confidence scores, discussing outliers) that there was little to no time left for the part where you actually discuss what you're going to change.
Without that reflection, the value of the entire process is diminished significantly. There is still value in just knowing what is going on and how things are tracking, but it's easy for people to just accept that and continue on their current trajectory.
The last issue is something I mentioned in this blog post: scoring fatigue.
In general, I had to keep the scoring sessions relatively short (30 minutes) or people started complaining about "scoring taking up all their time". Trying to push through that just led to people disconnecting from the process, especially if the scores were saying that they were not going to achieve the goal.
Sure, it might have been the truth and it would have been better if it inspired them to try different things, but it never really seemed to shake out that way.
Just Get The Ball In With One Shot Every Time
Personally, I like the scoring process. It pleases the part of me that enjoys numbers and talking about numbers and using said numbers to actually measure progress.
It's the same part of me that likes watching numbers go up in video games.
I'm probably a bit biased though, because I helped to put it together and then invested a significant amount of sanity in getting people to use it.
I do feel like a bit of a hypocrite after writing this post, because while I'm shilling this process and writing about how awesome it is, I haven't exactly been doing it. I justify it in my head by telling myself that we just don't have good success criteria right now and that I need to focus on that first.
But this blog post has made me think that maybe I shouldn't get so stuck on that and that maybe I should push ahead and do some good old-fashioned scoring.
I mean, you can't score if you never take a shot, right?