Continuous Integration (CI) and Automated Testing are key components of a successful agile software process. Over the last 19 years, SDS has built a significant number of automated test systems for our customers, and we have assisted customers in creating and improving their agile software environments.
This whitepaper describes the best practices that SDS has identified over many years and many projects. Some of these are non-controversial agile practices that will get little argument from any reader. Others may seem like arbitrary opinion, but they are backed up by a combination of successful projects as well as painful object lessons.
Continuous Integration on Every Change
Every high-performance software engineer is in constant search of ways to do less work, or more precisely, less repetitive, boring, error-prone work. Many engineers will spend 2 hours building an automated system to perform a 1-hour, one-time task. Except for that one case, automating a manual process pays off in efficiency and developer productivity.
When we ask customers about their automated build system, many of them say, “Yes, we have one, it runs every night.” There are generally two things wrong with that statement. First, “every night” is the wrong frequency. Second, our next follow-up question is always, “When was the last time the build succeeded?” Customers will usually answer, “A few days/weeks/months ago.” (See Broken Builds below.)
Automated builds need to occur immediately on every change that goes into source control. There should be a quick “build and smoke test” that can complete in 5 minutes or less, before the developer has changed context and moved on to the next task. This doesn’t mean full testing; the smoke test just proves that the system builds (e.g., that nobody forgot to add a new file to source control) and that the simplest of tests will run. After the smoke test, the more elaborate testing can begin.
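As a rough illustration, the build-and-smoke-test step could be wrapped in a small script that enforces the 5-minute budget. This is a minimal sketch, not a description of any particular CI product: the function name run_smoke_check and the placeholder commands are assumptions made for the example, and a real project would substitute its own build and test commands.

```python
import subprocess
import sys
import time

SMOKE_BUDGET_SECONDS = 5 * 60  # the "5 minutes or less" target


def run_smoke_check(steps, budget_seconds=SMOKE_BUDGET_SECONDS):
    """Run (name, argv) steps in order, stopping at the first failure.

    The whole sequence shares one time budget. Returns (ok, message).
    """
    start = time.monotonic()
    for name, argv in steps:
        remaining = budget_seconds - (time.monotonic() - start)
        if remaining <= 0:
            return False, f"time budget used up before step '{name}'"
        try:
            result = subprocess.run(argv, timeout=remaining)
        except subprocess.TimeoutExpired:
            return False, f"step '{name}' exceeded the time budget"
        if result.returncode != 0:
            return False, f"step '{name}' failed with exit code {result.returncode}"
    elapsed = time.monotonic() - start
    return True, f"smoke check passed in {elapsed:.1f}s"


if __name__ == "__main__":
    # Placeholder commands (assumptions): replace with the real build
    # command and the simplest test that proves the system runs.
    steps = [
        ("build", [sys.executable, "-c", "pass"]),
        ("smoke test", [sys.executable, "-c", "pass"]),
    ]
    ok, message = run_smoke_check(steps)
    print(message)
    sys.exit(0 if ok else 1)
```

Sharing a single budget across the steps keeps the feedback loop short: a slow build leaves less time for the smoke test, and the script reports the overrun instead of quietly running long.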
Broken Builds are a Black Mark
Just like the description of “broken windows” in The Pragmatic Programmer (Hunt, Thomas 1999), allowing checked-in changes to break a software build and leaving that build broken for any length of time is a signal to your team that substandard results are acceptable and maybe even the norm. This can have a significant negative impact on your quality culture and morale.
Breaking changes happen. The key is to address them quickly. When the build breaks, someone (preferably the offending party) stops what they are doing and fixes it. That is why the 5-minute smoke test above is so important. The offending party should not be “too busy” to fix the build they broke only 5 minutes ago.
In our shop, SDS has run an online chat for over 12 years, first in Campfire and now Slack. In both systems, we have added a bot to integrate our CI system with the chat. When a build breaks, the bot posts to the chat channel identifying the change (and sometimes the author of the change). Someone stops what they are doing and fixes the build. Yes, if builds were breaking all day long, we would not get anything accomplished. But if the builds were breaking all day long, it’s clear we weren’t getting anything useful accomplished anyway, right?
Developers Must Take Responsibility for Test Writing
Test writing is not a QA or Test Team responsibility. Certainly, these teams may own the testing process, and they may maintain the test equipment and the test cases. But allowing developers to abdicate the responsibility for test writing harms the product and the team overall, and it fosters an adversarial QA approach that is the antithesis of agile software development and high-performance teams.
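A notification of this kind can be quite small. The sketch below is an illustration only and is not the bot described above: the function names and message wording are assumptions, and it relies on the chat service accepting a JSON body with a "text" field, as Slack incoming webhooks do.

```python
import json
import urllib.request


def build_failure_message(job, change_id, author=None):
    """Format the chat message; the author is optional because the CI
    system may not always know who made the change."""
    who = f" by {author}" if author else ""
    return f"Build broken: {job} failed on change {change_id}{who}. Please fix or revert."


def post_to_chat(webhook_url, text):
    """POST the message to a webhook URL and return the HTTP status code.

    webhook_url is supplied by the caller; none is assumed here.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A CI job would call post_to_chat(url, build_failure_message(...)) from its failure hook, so the message appears while the change is still fresh in the author's mind.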
Developers need to work directly with the QA and Test team during requirements analysis, design, and development. The QA/Test team should be the owners of the test frameworks and the test automation system. They should not be writing tests in a vacuum, either during development or after it is complete. This is the equivalent of trying to paint over a large dent in an auto fender. You can make it shiny, but it will still be a dent.
Whitebox before Blackbox
When it comes to testing, there is whitebox and blackbox (and shades of gray, but we’ll leave those out for now). Whitebox testing is done knowing the inner workings of the system, leveraging that knowledge, and perhaps exposing some of those inner workings to the test case. Blackbox testing is done purely at an interface or user perspective with none of the advantages of system knowledge.
Except for usability testing, we prefer whitebox over blackbox in all aspects of testing. The additional knowledge and visibility gained by having “hooks” into the system make the testing process more rigorous and often more robust. There is no comparison in the efficiency with which a whitebox, scriptable test suite can be written (see Scripting Language below) versus a blackbox GUI test (see Paradox of GUI Testing below).
Build in a Scripting Language
Scripting can do things that manual testers will tire of, such as loading and saving a file thousands of times to check for memory leaks in that portion of the code, or running a regression test that times the loading of a large input file to watch for slowdowns in the file load process and catch any change that causes one (we have done this, and it paid off).
Building in a scripting language prevents two major mistakes later. The first is the temptation to start building a “command window” with commands, then loops, then ifs (“Hey, I’m building my own scripting language, this is fun.”). That is a waste of time. Second, it removes the need to use these “automated testing tools” with their substandard scripting languages.
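Both kinds of check fit in a few lines of script. The following is a minimal sketch under stated assumptions: the helper names are invented for illustration, the cycle and action callables stand in for whatever load/save hooks the application exposes to scripts, and tracemalloc only sees allocations made through Python's allocator, so a leak inside native code would need a different probe.

```python
import time
import tracemalloc


def memory_growth_over_cycles(cycle, iterations=1000, warmup=10):
    """Return the net growth in traced memory (bytes) across repeated cycles.

    A few warm-up cycles run first so one-time caches are not counted.
    """
    for _ in range(warmup):
        cycle()
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        cycle()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


def timed(action):
    """Return the wall-clock seconds taken by one call to action()."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def exceeds_baseline(elapsed, baseline, tolerance=0.25):
    """True when elapsed is more than `tolerance` slower than the baseline."""
    return elapsed > baseline * (1 + tolerance)
```

A nightly script might call memory_growth_over_cycles with a load-then-save function and fail when growth passes a chosen threshold, and compare timed(load_large_file) against a stored baseline. The threshold and the 25% tolerance are judgment calls, not values taken from the text.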
The Paradox of GUI Testing
This is probably the most controversial statement in this whitepaper: We have participated in half a dozen pure GUI testing projects, and none of them had a positive ROI (return on investment). Sure, we found bugs while writing the GUI tests; in fact, we found all the bugs while writing the tests. Then came the ROI drain of maintaining a GUI test suite. Months and years go by of fixing the GUI tests, adapting to changes, and making them robust under all possible circumstances. Bye-bye ROI.
Macro-record-playback tests have a shelf life of about one software revision, then they are useless and impossible to maintain. More sophisticated object-matching frameworks yield tests that can be maintained, but in all of our experience, the cost of maintaining those tests would have been better spent writing new tests (and finding new bugs) or doing other process improvement.
So, what is the answer? See above – scripting. In one of our systems (a drag-and-drop application builder with many GUI actions), SDS automated tens of thousands of “GUI” tests with no actual GUI interaction. Every GUI action was performed one level down by scripting that drove the same action as the GUI. We didn’t need to test that drag/drop from a palette to an X,Y position worked in MS Windows; we needed to test that the software did the right thing when component C1 was dropped at position X,Y, connected to component C2 and then the “Build” action was performed. For that, scripting was a far better choice.
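To make the idea concrete, here is a minimal sketch of a test that drives the layer beneath the GUI. Everything in it is hypothetical: the Design class, its drop, connect, and build methods, and the rule that build fails when a component is left unconnected are invented for illustration and do not describe the application mentioned above.

```python
class Design:
    """Hypothetical application model that both the GUI and scripts drive."""

    def __init__(self):
        self.components = {}   # component name -> (x, y) position
        self.connections = []  # (source, target) name pairs

    def drop(self, name, x, y):
        """The action a GUI drag-and-drop handler would call."""
        if name in self.components:
            raise ValueError(f"component {name!r} already placed")
        self.components[name] = (x, y)

    def connect(self, source, target):
        for name in (source, target):
            if name not in self.components:
                raise KeyError(f"unknown component {name!r}")
        self.connections.append((source, target))

    def build(self):
        """Illustrative rule: the build succeeds only if every component
        takes part in at least one connection."""
        connected = {name for pair in self.connections for name in pair}
        unconnected = sorted(set(self.components) - connected)
        return {"ok": not unconnected, "unconnected": unconnected}


def test_drop_connect_build():
    design = Design()
    design.drop("C1", 120, 80)
    design.drop("C2", 300, 80)
    design.connect("C1", "C2")
    assert design.build() == {"ok": True, "unconnected": []}
```

Because the test calls the same methods the GUI handlers would call, it exercises the application logic without depending on pixel positions, window focus, or widget identifiers, which are the parts that tend to change between releases.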
In our experience, GUI testing is also usability testing, which still requires human interaction. An automated GUI test won’t notice an annoying GUI flicker or a 300 ms latency that is enough to annoy the user, unless a GUI test is specifically written for it. That is another waste of resources.
Perform your usability testing at the GUI manually. Do everything else that you can with scripting.
Start Small and Automate Everything
These two statements seem contradictory, and they are. And at the same time, they are not. The end goal of a CI system is to automate the entire build/test/deliver-deploy process. One of SDS’s largest projects had a system that could take a source change, build it, run the smoke test, run the full validation test, build the installer, test the installer, generate the release notes, FTP the installer and release notes to the customer, and send email to the customer informing them of the release. That is a world-class system and it felt great to sit and watch it do our work while we added value in other ways.
But even that project started small, with a simple automated CI build. Then a few unit tests. Then an automated installer build. Each step adding to the capability, removing human error, and freeing developers and testers for more engaging, productive work.
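One way to picture this growth is a pipeline that is just an ordered list of stages, where adding capability means appending to the list. This is a minimal sketch under assumptions: the run_pipeline function and the stand-in stage callables are invented for illustration and do not describe the system mentioned above.

```python
def run_pipeline(stages):
    """Run (name, callable) stages in order; each callable returns True on success.

    Returns a list of (name, status) pairs, where status is "passed",
    "failed", or "skipped" (for stages after the first failure).
    """
    results = []
    failed = False
    for name, stage in stages:
        if failed:
            results.append((name, "skipped"))
            continue
        if stage():
            results.append((name, "passed"))
        else:
            results.append((name, "failed"))
            failed = True
    return results


if __name__ == "__main__":
    # Start small: one stage. The lambdas are stand-ins for real work.
    stages = [("build", lambda: True)]
    # Later additions slot in without changing the runner.
    stages.append(("unit tests", lambda: True))
    stages.append(("installer build", lambda: True))
    for name, status in run_pipeline(stages):
        print(f"{name}: {status}")
```

Reporting skipped stages explicitly shows at a glance how far a change got before it failed.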
Continuous Integration and Automated Testing do not have to be all-or-nothing propositions. Selectively choosing those parts that are easily done first and getting some payback there can go a long way to convincing yourself, and your management, that CI and Automated Testing are well worth the investment.
SDS has helped many customers build automated build and test environments. We have a certified Agile coach on staff who guides our internal processes and has helped our customers improve their processes as well. No matter what your level of agile adoption or process automation, SDS can help you take that first step or fine-tune the process that you already have in place.
Contact: [email protected]