
Mockists Are Dead. Long Live Classicists.
The recent hangout has made the level of dissatisfaction with excessive mocking and stubbing in automated tests pretty clear. DHH expressed strong opinions about the fundamentalism around tests that can't access collaborators. This has also been a key point in . The three of them claimed to "mock almost nothing".
The "Can you sleep at night?" test
Kent Beck emphasized that, at the end of the day, it's our job as developers/programmers to make sure we sleep at night knowing we didn't break anything. This is very different from the early days, when developers would just commit code and wait for a different group of people, the testers, to make sure nothing had been broken. This is one of the main purposes of an automated suite of tests, regardless of whether it was written "test first" or after the code. The objective of the tests is to make sure things are working and that new code doesn't cause any problems with existing code. We do not get this level of confidence from a test that mocks or stubs all its collaborators. I'll explain why.
Why is mocking/stubbing dangerous?
Mocks and stubs are not evil per se. However, as with any other tool, it's the way we use them that makes them bad. The problem is not when we stub or mock external dependencies like web service calls or other integration points. The main issues emerge when we isolate our code from our own code. TTDD is an anti-pattern that explains some of the reasons why mocking and stubbing too much is dangerous. It was said during the hangout, "If I use TDD I can refactor". But what happens if your test is so white box, and knows so much about how things are implemented, that when you refactor something your test fails and you have to refactor the test too? That entirely defeats the purpose of having a test to make sure you didn't break anything with your changes.
TTDD is already extremely dangerous in languages like Java and C#, but even more so in dynamic languages like Ruby and JavaScript, where you can stub or mock a method that doesn't even exist. I will illustrate this with a real-world example that I've seen happen several times. Let's say you have a controller whose responsibility is to validate a model and display its errors. Let's say the model has a method "errors". The image below shows this example.
When testing the controller, mockists would decide to isolate the model, and then mock or stub the method errors on the model. Ruby, dynamic as it is, and its testing frameworks allow us to stub/mock the model.
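The scenario can be sketched in plain Ruby. This is a minimal illustration, not the original example: the class names Model and Controller, the method show_errors, and the hand-rolled stub (used here instead of a mocking framework, so the snippet runs without gems) are all assumptions.

```ruby
# Hypothetical model: exposes validation problems through `errors` (plural).
class Model
  def initialize(name)
    @name = name
  end

  def errors
    @name.to_s.empty? ? ["name can't be blank"] : []
  end
end

# Hypothetical controller: note that it calls `error` (singular).
class Controller
  def show_errors(model)
    model.error.join(", ")
  end
end

# Mockist-style test: the model is replaced by a bare object, and the stub
# defines whatever method the controller happens to call.
fake_model = Object.new
fake_model.define_singleton_method(:error) { ["name can't be blank"] }

result = Controller.new.show_errors(fake_model)
puts result == "name can't be blank" ? "stubbed test: green" : "stubbed test: red"

# The real model does not respond to the stubbed method at all.
puts Model.new("").respond_to?(:error)
```

Passing a real Model to the controller raises NoMethodError, which the stubbed test never exercises.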
If you pay attention, the method in the model is called errors (plural). However, the controller calls it in the singular, error, which means the code is wrong! But the test passes! What do I have here? A false positive: a green test giving the developer a sense of security when the code is actually wrong. Not only could the names of the methods be wrong, but also their interfaces. Recently, after upgrading a dependency, I discovered that a method that was being "stubbed" in several unit tests had changed its return contract: instead of returning nil when there are no errors, it now returns an empty array. Once again, our tests are green, but the code is broken. Furthermore, beyond names and interfaces, and most importantly, the behavior of the code could be wrong. Wrong methods could be called, but if the test is too white box and only checks for interactions, the test will pass and the developers will think they're sleeping well and safe at night, when their code is broken, sometimes already in production, where problems will be caught by real users. Believe me, I've seen this happen. Have you?
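The changed return contract can be illustrated the same way. In this hedged sketch, StatusController and UpgradedModel are hypothetical names; the only detail taken from the text is that "no errors" used to be nil and is now an empty array.

```ruby
# Hypothetical controller written against the old contract: nil means "no errors".
class StatusController
  def status(model)
    model.errors.nil? ? "valid" : "invalid"
  end
end

# Stand-in for the upgraded dependency: "no errors" is now an empty array.
class UpgradedModel
  def errors
    []
  end
end

# The stub in the unit test still returns nil, as the old version did.
stubbed_model = Object.new
stubbed_model.define_singleton_method(:errors) { nil }

puts StatusController.new.status(stubbed_model)      # valid (the test stays green)
puts StatusController.new.status(UpgradedModel.new)  # invalid (although there are no errors)
```

The stub freezes the old contract inside the test, so the upgrade cannot turn the test red.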
So what's the purpose of tests that only check interactions between collaborators and mock more than they need to? It's better not to have them; at least then we would know that we had to test this behaviour at a different level. The 3 gurus vigorously discussed this topic as well: What is a Unit?
How do you define a "Unit"?
The unit to be tested is the entire point of confusion and debate. As Martin pointed out, "Object-oriented design tends to treat a class as the unit, procedural or functional approaches might consider a single function as a unit", when the unit is actually a behavior. It has to be up to the developers, without any fundamentalism, to determine what a unit is. Think about the depth of test (DOT) and make sure your tests are shallow enough that each one tests a single behavior. The image below illustrates the concept of DOT.
It's not necessarily a class or a method or a function, but it's whatever YOU, as the developer writing the code and the test, decide it to be based on your design and your boundaries. And, of course, if after determining the depth of your test you still need some stubs or mocks underneath its boundaries, go for it. But let's stop mocking and stubbing just because we can or think we have to.
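As a rough sketch of that idea, the test below treats "display the model's errors" as the unit: the controller and the model are both real, and only an external integration point underneath them is faked. All names here (RemoteRegistry, taken?, show_errors) are hypothetical and chosen for illustration.

```ruby
# Hypothetical external dependency, e.g. a web service client.
class RemoteRegistry
  def taken?(name)
    raise NotImplementedError, "would perform a network call"
  end
end

class Model
  def initialize(name, registry)
    @name = name
    @registry = registry
  end

  def errors
    problems = []
    if @name.empty?
      problems << "name can't be blank"
    elsif @registry.taken?(@name)
      problems << "name is already taken"
    end
    problems
  end
end

class Controller
  def show_errors(model)
    model.errors.join(", ")
  end
end

# Only the integration point is faked; Controller and Model are the real classes.
fake_registry = Object.new
fake_registry.define_singleton_method(:taken?) { |name| name == "bob" }

controller = Controller.new
puts controller.show_errors(Model.new("bob", fake_registry))  # name is already taken
puts controller.show_errors(Model.new("", fake_registry))     # name can't be blank
```

Because the real model is exercised, renaming errors or changing what it returns would make this test fail instead of leaving it green.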
Let's reboot our thinking about TDD!