Two Years Later, I Finally Saw Useful Software Testing

Feb 14, 2018

We’ve all heard these buzzwords flying through the office — Automation! Tests! Robustness! And some of you may have read a book or two on testing written by pompous authors who love nothing more than their own metaphors, cleverness, and theories. Almost everyone, I’m willing to bet, has written some tests and seen no useful benefit come of it. Somehow, for all this talk about testing, it always seems to fall short.

After two years of being a test engineer, looking at one team’s test suite taught me 10x more about testing than anything before it. I’m writing about my experience because I don’t want anyone to suffer another horrible book on testing software and endure more cringe-worthy metaphors. I also want to share how all this talk about testing actually becomes something tangible and useful that provides a ton of real-life benefits: some obvious ones, like fewer bugs and time and money spent more wisely (duh), and some more surprising and specific ones as well.

Controlling code quality over time is hard because so many things change. Programming always starts off innocently. You sit down, refreshed and passionate, to write a function that reverses a word. It reminds you of a simpler time. You’re happy. Three weeks and five product meetings later, the reverse function can also render the fourth dimension in 3D in real time and future time and past time. It’s also been worked on by four interns, a new hire, and the company mascot, none of whom fully understand what this function was originally supposed to do. And it’s broken. It works as reliably as the Millennium Falcon’s hyperdrive.

Exaggeration aside, you’ve probably seen code complexity bloat before. No matter how hard you try to pretend it won’t happen, a year of “Oh, can you just add…” doubles a codebase in size and quadruples it in complexity. If you have several years of experience, you’ve probably tried to stop the inevitable from happening. A few years beyond that, you realize that it’s unavoidable, but that over the course of several failures and slow improvements, it becomes more manageable.

I’ve seen one team stay in complete control in an exceptionally turbulent environment. For those of you still improving how you build software and avoiding the inevitable landslide of bloating complexity and unforeseen change (everyone), I hope my experience at Microsoft can be an inspiration that slingshots you a couple iterations ahead.

Working on the Hololens — an augmented reality headset that lets you see holograms around you — was chaos. Roughly 10 teams were given a Hololens and told to make cool applications that demonstrated what this technology could do because nobody, not even those teams, knew. Every application was completely different since they were supposed to show the diversity of the new technology. From games to tourism to construction, every programmer was a pioneer. We were the ones inventing how to build mixed reality applications. We invented the best design practices and the tools that future programmers would develop with. I was part of a team making the tools that future programmers would test and automate with.

With grand discoveries being made on a weekly basis, sharing information and new technology between teams was important, and incredibly challenging. For this, we had a Central team — a team dedicated to owning and managing our internally open sourced code. Each application team had a few test and automation engineers, like me, who built tools for the application team’s specific needs. When we made something for our team that happened to be useful, we would submit it to the Central team to standardize it and share with everyone else.

In other words, dozens of programmers were hurling code at the Central team with almost no idea what the Central team, or any other team, was doing. The Central team had the privilege of managing all of these contributions by duct taping them together and sharing them with everyone — bug free. The Central team only had 4 people.

Thanks to the Central team, a year of development went smoothly. Every team was able to contribute to our toolset easily, including new hires. In the end, we had a production-ready automation framework to ship out with the Hololens SDK for others to develop with. Had they failed, the automation and test features in the Hololens SDK might have been cut, stripped down, or too buggy to use, and flaky tools would have been the death of much-needed early adoption. How did they control all that change, keep code quality high, and still have time for development of their own? They wrote tests. But how? We’ve all written tests without getting the miraculously wonderful effects that they had. What did they do differently?

Secret #1 Understand What’s at Stake

The Central team understood the real-world importance of testing and the consequences of cutting corners. They looked ahead at all the hypothetical changes that usually hurt teams: What happens during crunch when you need help from other people? What happens when you need to hire 5 new programmers and an intern? What happens when a senior team member leaves or gets pulled onto another project and 3 new people are hired to fill her shoes?

That’s just from a team’s perspective. What happens when the business can’t add more features to a product because of uncontrollable technical debt? What happens if the business ends up spending all of its money reacting to bugs instead of moving forward? What if fixing bugs causes more bugs and the product spirals? What if the code is so deadlocked that hiring more people only makes the problem worse? Does this sound familiar?

The Central team knew that doing the right thing early on saved immeasurable amounts of time and money, and could be the difference between a successful product and a failed product that hurts the company’s reputation and brand.

If you’re lucky, these situations will only be hypothetical. But the Central team did lose a senior engineer and kept running smoothly without anyone noticing — not that it wasn’t a challenge. They were prepared.

There are a lot of ways to prepare for these situations, such as documentation, good communication, and over-hiring, but all of these fell short of what tests could accomplish.

Secret #2 Understand the Goal of Tests

Tests allow the team to control their code instead of letting code control them. Are you spending more time adding features you’re passionate about or reacting to bugs and unforeseen changes? Are you often surprised by how certain functionality works or by side-effect behaviors? Take control. Become predictable and make these annoying problems go away. Spend your time working on the fun things.

Code is fragile. Any programmer can change what you’ve just written at any time. What’s scarier is that the change can easily go unnoticed. Subtle, important behaviors may be changed by accident. I’ve been the culprit before. During my first contribution with Central, I made a tiny change that I thought would be innocent. The associated test passed, but two other seemingly unrelated tests failed. I realized that I had added one piece of unintended behavior that seemed correct to me at first. Their tests allowed me to course-correct and avoid accidents without ever reaching out to them.

Good design practices can help control what is allowed to change, but any decent programmer will still be able to find a way around it. Documentation can also help to preserve original intentions, but it only goes as far as the people willing to govern and police what’s documented. In every case I’ve seen, documentation gathers dust and becomes obsolete. For most practical cases, it’s too slow and boring to look through and isn’t worth the time lost.

Tests cement the intentions, behaviors, and subtleties of code. If behaviors are changed unintentionally, especially side-effect behaviors, tests will fail and the bugs will never reach production. Intentional changes will be made with more awareness and understanding, and with no accidental side effects.

Use tests in this way to declare with confidence exactly how the code you’ve written is supposed to work. Prepare your code to be worked on by new employees of any skill level. Empower them to write code at the same level of quality that you did by writing tests that act like bowling lane bumpers, making it hard or impossible for others to steer off course and fall into the gutter.
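To make "cementing" a little more concrete, here is a minimal sketch in Python using the standard `unittest` module. The `reverse_word` function and its tests are hypothetical, borrowed from the word-reversing example earlier in this post; they are not the Central team's code. The point is that each test names one intended behavior, including the subtle ones a well-meaning contributor might change without noticing.

```python
import unittest


def reverse_word(word: str) -> str:
    """Hypothetical example: return the characters of `word` in reverse order."""
    return word[::-1]


class ReverseWordTests(unittest.TestCase):
    def test_reverses_characters(self):
        self.assertEqual(reverse_word("stressed"), "desserts")

    def test_preserves_letter_case(self):
        # Subtle behavior: casing is left untouched. A "helpful" change that
        # lowercases the input would make this test fail.
        self.assertEqual(reverse_word("Abc"), "cbA")

    def test_keeps_whitespace(self):
        # Subtle behavior: leading and trailing spaces are reversed, not stripped.
        self.assertEqual(reverse_word(" ab"), "ba ")

    def test_empty_string_stays_empty(self):
        self.assertEqual(reverse_word(""), "")
```

Assuming this is saved in a test file, something like `python -m unittest` would run it. If an intern later adds `.strip()` or `.lower()` to `reverse_word`, two of these bumpers push back immediately.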

Secret #3 Tests are Essential for Successful Product Development

On the Central team, the ability for others to contribute code at a large scale was an absolute requirement. Tests were the best way to control rapid contributions from outside the team. Central didn’t have to pretend.

A programmer’s perception of tests is usually negative. Tests are the annoying extra work you have to do after coding your masterpiece. They’re rushed, thoughtless and, as a result, useless. Sometimes this isn’t a programmer’s fault. Project managers can have the same negative perception and prioritize speed over quality, mistakenly thinking tests are a waste of time since nothing is immediately produced, choosing short-term gains over long-term growth and stability. Sometimes the problem is systemic. When you’re being considered for a raise, what’s more important? Speed or quality? What gets you paid?

On Central, the culture was aligned at every level around quality because we were making the tools that others would test with. If our product was buggy, it would be an absolute, unusable failure. Even if it was fixed later, the public perception would be negative: reviews and blogs already criticizing it, trust lost, and a comeback requiring 10x the effort. Every code change was required to include, without exception, an added or modified test. If that took more time to accomplish, they understood that it would still be time saved over dealing with bugs and messy code later. Tests were worked on as collaboratively as any other code with other developers and specialized testers. During the code review, the test received just as much attention and scrutiny as the feature, if not more.

Secret #4 Tests are Better Documentation Than Documentation

Including tests in every code change had the additional benefit of making changes easier to understand. They described the behavior, intentions, and different states of the code better than documentation, and in a more meaningful context. As a result, feedback and conversation around the change was more useful, and there was more understanding of the codebase across the team.

Outside the team, anyone could make changes to the code with the confidence that they weren’t doing anything unintended because the tests clearly documented how code should behave. This allowed programmers from anywhere to reliably contribute without going back and forth with the 4 programmers on Central until the peer review at the very end.

Secret #5 Make Testing Unavoidable

Tests are only useful if you use them. I wish that was sarcasm, but I’ve seen tests go unused, or partially used to the point of providing no benefit, more often than not. The Central team understood that outside teams had pressures and priorities of their own. It wouldn’t be realistic to expect every contributor to learn how their test ecosystem works, and prioritize it above their own team’s work. For this reason, tests were run automatically when changes were submitted and if they didn’t pass, the changes were automatically blocked from production.

Make sure that quick unit and integration tests are run automatically as early on and as frequently as possible. Every code commit should automatically trigger tests to run on the feature branch, as well as every merge to master and every push to production. If any test fails, the merges and deployments should automatically be stopped, and the programmer responsible should be notified and given the ability to diagnose the problem.

With these checkpoints in place, deploying code shouldn’t be scary. If every team member has written unit and integration tests that thoroughly cover the codebase and cement intended behavior, deploying should be boring.

Here’s the test: hire an entry level programmer today and tell them to push a minor change out to production tomorrow with minimal review or assistance. If that seems scary to you, try to figure out why. If they push and all tests are green, what could go wrong? What test coverage is missing? What other vulnerabilities exist? Make it automatic and make it impossible to mess up.

Secret #6 Understand Manual QA?

Disclaimer: The Central team didn’t necessarily have a dedicated QA team. People like me who contributed to their code also used their code on our specific application teams. It was up to us to discuss and fix things that sucked from a usage standpoint since we were using our own tools as we made them, and manually testing them ourselves.

I’ve seen a number of companies use manual QA as a final safety net to catch bugs with no other testing in place, which is slow, expensive, and ineffective. In this environment they’re seen as ‘testers’ rather than ‘quality assurance’ since they are bogged down by bugs and spend most of their time reacting. If you’ve passed the test in Secret #5, then you live in a perfect world where anyone can push code and bugs never exist. What does manual QA do? They go beyond testing to also fully assure quality — both technical issues and non-functional aspects including productivity, ease of use and responsiveness.

No software is without bugs, so their job doesn’t change; how they respond to bugs should change. Normally, they identify a bug and give a programmer all the information they need to fix it. With a focus on automated test coverage, there should be one additional step: try to figure out how this bug got past the wall of tests, then require both a bug fix and a new or modified test that will make sure this bug never sneaks in again.
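Here is a small sketch of that "fix plus test" pairing in Python. Everything in it is made up for illustration: imagine QA reported that a hypothetical `average` function crashed on an empty list. The fix is one guard clause, and the regression test sits next to it so the same bug can't quietly return.

```python
import unittest


def average(values):
    """Hypothetical function from an imagined bug report:
    average([]) used to crash with ZeroDivisionError."""
    if not values:  # the fix: handle the empty case explicitly
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTests(unittest.TestCase):
    def test_empty_input_returns_zero(self):
        # Regression test for the imagined bug: removing the guard above
        # makes this test error out instead of passing.
        self.assertEqual(average([]), 0.0)

    def test_typical_input(self):
        # Existing behavior that the fix must not disturb.
        self.assertEqual(average([2, 4, 6]), 4.0)
```

Whether an empty list should return `0.0` or raise a clearer error is a product decision; the test is where that decision gets written down.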

The more exciting change is that manual QA can more completely fill their role of quality assurance. Having automated tests will give manual QA more time to explore and understand the product. They can do exploratory testing in order to find the more interesting technical bugs and flaws in product design. They can provide opinions for improvement. No opinions are more important than those of manual QA, who provide feedback before reputation or money is being risked with a live user base.

Manual QA and automated tests build each other up. A colleague told me a story from two previous video games he’d worked on. On one, the team invested solely in automated testing. The game was never over-burdened with bugs and released virtually bug free. But it wasn’t fun. On the other, the team invested solely in manual testing. The game received incredible ratings for being fun, but was marked down for being so buggy that it was, in some moments, unplayable. Manual QA will not catch every bug, and automated tests will not make a product good. But together, they will both be more effective.

Now What?

Write a test! Just one. See what you learn. Is your code too hard to test? Then you need tests more than others. Did you not have the time to test? Rushed teams will benefit even more. Try it and don’t rush. Refactor your tests 20 times if you want. Keep going until you can feel how rock solid the code being tested is. Throw a cat on your keyboard and if your tests are still green, submit the code without flinching. Delete a random line of code and if the tests are still green, ship it! (Or write more complete tests). Feel the difference getting a good night’s sleep knowing that nothing you did today could have possibly created a bug. Then imagine your entire codebase feeling like that — even after a large refactor, a version upgrade, or an army of cats breaking into the building and walking on every keyboard.
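If you want to try the "delete a random line" trick without touching real files, here is a toy sketch in Python. It is only an illustration under heavy assumptions: the `clamp` function, the two checks, and the `surviving_deletions` helper are all hypothetical, and it works on a source string with `exec` rather than on a real codebase. It deletes each line in turn and reports the deletions your checks failed to notice.

```python
SOURCE = '''\
def clamp(value, low, high):
    if value < low:
        return low
    if value > high:
        return high
    return value
'''


def weak_check(namespace):
    # Only exercises the in-range case.
    assert namespace["clamp"](5, 0, 10) == 5


def strong_check(namespace):
    clamp = namespace["clamp"]
    assert clamp(5, 0, 10) == 5
    assert clamp(-1, 0, 10) == 0
    assert clamp(11, 0, 10) == 10


def surviving_deletions(source, check):
    """Delete each line in turn; return the lines whose removal `check` missed."""
    lines = source.splitlines()
    survivors = []
    for i in range(len(lines)):
        mutated = "\n".join(lines[:i] + lines[i + 1:])
        namespace = {}
        try:
            exec(mutated, namespace)  # may raise SyntaxError for broken code
            check(namespace)          # may raise AssertionError
        except Exception:
            continue  # the deletion was caught
        survivors.append(lines[i])
    return survivors
```

With `weak_check`, deleting the `if value > high:` line goes unnoticed; with `strong_check`, nothing survives. Any line that survives is a behavior nobody has cemented yet.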

Start today and come back here when you’re ready to make the Central team’s strategies your own.