I usually write tests before the implementation. In my backup application project this has turned out to really slow me down. The problem is not limited to my personal projects, either. It also affects my professional work.
Here’s the issue: some tests need test data, and generating that data can be a tedious task, depending on its complexity. This has caused me to procrastinate on my backup app. So, one evening, after having thought about this during a workout, I grabbed my laptop, sat down in my comfy bed and wrote a “DSL” that makes creating the data much simpler. Not only is it easier to create the data now, allowing me to continue at a faster pace, it’s also much more readable, and the test setup doesn’t clutter the test case anymore. This is a very important aspect of a test. What good is a test if, after some time, you have to update it and no longer understand what it does?
I put “DSL” in quotes because I wouldn’t call it a proper Domain Specific Language. Still, it is specific enough to my domain that I wouldn’t know what else to call it.
Let’s get to some code. In my application I have the following POJOs (omitting methods for brevity).
public enum ArchiveType {
    FULL,
    INCREMENTAL
}
public class ArchiveBase implements Comparable<ArchiveBase> {
    private ArchiveType type;
    private Instant created;
}
public class Archive extends ArchiveBase {
    private Archive incremental;
}
An Archive is the result of a backup operation and is either a full or an incremental backup. These classes are serialized to JSON and make up the file that contains the complete backup history.
Let’s assume I wanted to create the POJO equivalent of this JSON.
[
  {
    "type": "full",
    "created": "2019-03-21T21:23:33Z",
    "incremental": {
      "type": "inc",
      "created": "2019-03-23T11:23:33Z",
      "incremental": {
        "type": "inc",
        "created": "2019-04-23T11:23:33Z",
        "incremental": null
      }
    }
  },
  {
    "type": "full",
    "created": "2019-03-24T21:23:33Z",
    "incremental": null
  }
]
Using only the available classes, I would have to write something like this in my tests.
var now = Instant.now();
var full1 = Archive.createFull(now);
var inc1 = Archive.createIncremental(
        now.plus(1, ChronoUnit.DAYS), full1);
var inc2 = Archive.createIncremental(
        now.plus(2, ChronoUnit.DAYS), inc1);
var full2 = Archive.createFull(
        now.plus(3, ChronoUnit.DAYS));
As you can see, this is extremely verbose and difficult to read. I wouldn’t want to write it 10 or 20 times, let alone more. Just imagine how hard this becomes to follow when I create 15 or 25 such objects in order to simulate a complex file.
In my case, I rarely need exact control over the timestamp. It is enough that it is predictable. I don’t want to calculate the different values every time, as that adds a lot of extra code to the test.
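To make “predictable” a bit more concrete, here is a minimal sketch of a helper that hands out evenly spaced timestamps. The class InstantSequence and its step parameter are hypothetical and not part of my project; they only illustrate the idea that a test should never have to do the date arithmetic itself.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: hands out evenly spaced, strictly increasing timestamps. */
final class InstantSequence {
    private final Instant base;
    private final Duration step;
    private long issued = 0;

    InstantSequence(Instant base, Duration step) {
        this.base = base;
        this.step = step;
    }

    /** Returns base, base + step, base + 2 * step, ... on successive calls. */
    Instant next() {
        return base.plus(step.multipliedBy(issued++));
    }
}
```

With a fixed base and a step of one day, every call to next() yields a timestamp one day after the previous one, so the order of creation alone determines the value.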
In this application only two parameters are required to create the objects that I want: the created timestamp and the nested incremental archive. With more complex classes it would be even worse. From my experience I dare say that most parameters can be set to the same default value in every test. Most of the time it is enough to have a specific number of objects whose relationships to each other make sense in the context of the application.
To make this simpler, I have created classes that are only available in tests. The timestamps are calculated for them, they carry a few internals to maintain the hierarchy, and short helper methods keep the test data concise to write.
import java.lang.reflect.Constructor;
import java.time.Instant;

public abstract class ArchiveType {
    public static class FullArchive extends ArchiveType {
        FullArchive(IncrementalArchive child) {
            super(child);
        }
        Archive create(Instant instant) {
            Archive archive = Archive.createFull(instant);
            // setAndReturnParent is one of the internals not shown here.
            return setAndReturnParent(child, archive);
        }
    }
    public static class IncrementalArchive extends ArchiveType {
        Archive parent = null;
        IncrementalArchive(IncrementalArchive child) {
            super(child);
        }
        Archive create(Instant instant) {
            Archive archive = Archive.createIncremental(
                    instant, parent);
            return setAndReturnParent(child, archive);
        }
    }
    // Generic approach is less to write here,
    // but more when using it.
    public static <T> T mk(Class<T> clazz)
            throws ReflectiveOperationException {
        return mk(clazz, null);
    }
    public static <T> T mk(Class<T> clazz, IncrementalArchive child)
            throws ReflectiveOperationException {
        Constructor<T> constructor =
                clazz.getDeclaredConstructor(IncrementalArchive.class);
        return constructor.newInstance(child);
    }
    // Specific methods create some boilerplate here,
    // but make using it shorter.
    public static FullArchive Full() {
        return Full(null);
    }
    public static FullArchive Full(IncrementalArchive child) {
        return new FullArchive(child);
    }
    public static IncrementalArchive Inc() {
        return Inc(null);
    }
    public static IncrementalArchive Inc(IncrementalArchive child) {
        return new IncrementalArchive(child);
    }
    IncrementalArchive child;
    ArchiveType(IncrementalArchive child) {
        this.child = child;
    }
    abstract Archive create(Instant instant);
}
The other piece of the puzzle is an abstract class that test classes extend.
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;

public abstract class ArchiveTestBase {
    List<Archive> allArchives = new ArrayList<>();
    Instant baseInstant;
    @BeforeEach
    void setup() {
        baseInstant = Instant.now();
    }
    @AfterEach
    void cleanup() {
        allArchives.clear();
    }
    protected void create(ArchiveType... archives) {
        Arrays.stream(archives)
                .forEach(this::createWithChildrenAndAddToAllArchives);
    }
    protected Archive get(int index) {
        assert index >= 0 && index < allArchives.size();
        return allArchives.get(index);
    }
    // Walks down the chain: creates the archive, stores it, then continues with its child.
    private void createWithChildrenAndAddToAllArchives(
            ArchiveType archiveData) {
        if (Objects.nonNull(archiveData)) {
            // getNextInstant() is not shown in this excerpt.
            Archive archive = archiveData.create(getNextInstant());
            allArchives.add(archive);
            createWithChildrenAndAddToAllArchives(
                    archiveData.child);
        }
    }
}
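The two listings above leave out setAndReturnParent and getNextInstant. To show how the pieces could fit together, here is a minimal, self-contained sketch without JUnit. Several parts are assumptions made for illustration and may differ from the real code: the reduced Archive stand-in (including that createIncremental links the new archive to its parent), the bodies of setAndReturnParent and getNextInstant (one day per archive), and the ArchiveFixture class that takes the place of ArchiveTestBase.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the application's Archive class, reduced to what the sketch needs.
final class Archive {
    final boolean full;
    final Instant created;
    Archive incremental; // next archive in the chain, or null

    private Archive(boolean full, Instant created) {
        this.full = full;
        this.created = created;
    }

    static Archive createFull(Instant created) {
        return new Archive(true, created);
    }

    // Assumption: the factory links the new archive to its parent.
    static Archive createIncremental(Instant created, Archive parent) {
        Archive archive = new Archive(false, created);
        parent.incremental = archive;
        return archive;
    }
}

abstract class ArchiveType {
    final IncrementalArchive child;

    ArchiveType(IncrementalArchive child) {
        this.child = child;
    }

    abstract Archive create(Instant instant);

    // Assumed behaviour: remember the real parent on the child node, hand the parent back.
    static Archive setAndReturnParent(IncrementalArchive child, Archive parent) {
        if (child != null) {
            child.parent = parent;
        }
        return parent;
    }

    static final class FullArchive extends ArchiveType {
        FullArchive(IncrementalArchive child) { super(child); }

        @Override
        Archive create(Instant instant) {
            return setAndReturnParent(child, Archive.createFull(instant));
        }
    }

    static final class IncrementalArchive extends ArchiveType {
        Archive parent; // set while the enclosing node is created

        IncrementalArchive(IncrementalArchive child) { super(child); }

        @Override
        Archive create(Instant instant) {
            return setAndReturnParent(child, Archive.createIncremental(instant, parent));
        }
    }

    static FullArchive Full() { return Full(null); }
    static FullArchive Full(IncrementalArchive child) { return new FullArchive(child); }
    static IncrementalArchive Inc() { return Inc(null); }
    static IncrementalArchive Inc(IncrementalArchive child) { return new IncrementalArchive(child); }
}

// Hypothetical replacement for ArchiveTestBase so the sketch runs without JUnit.
final class ArchiveFixture {
    private final List<Archive> allArchives = new ArrayList<>();
    private final Instant baseInstant;

    ArchiveFixture(Instant baseInstant) {
        this.baseInstant = baseInstant;
    }

    void create(ArchiveType... archives) {
        for (ArchiveType archive : archives) {
            createWithChildren(archive);
        }
    }

    Archive get(int index) {
        return allArchives.get(index);
    }

    // Assumption: every archive is one day younger than the one created before it.
    private Instant getNextInstant() {
        return baseInstant.plus(Duration.ofDays(allArchives.size()));
    }

    private void createWithChildren(ArchiveType node) {
        if (node != null) {
            allArchives.add(node.create(getNextInstant()));
            createWithChildren(node.child);
        }
    }
}
```

Under these assumptions, calling create(Full(Inc(Inc())), Full()) on a fixture stores four archives in the order they were written down: the first full archive, its two incrementals, then the second full archive. Each parent is created before its child, which is why the child node can pick up the real parent Archive when its own turn comes.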
Side-note: In Swift I very likely could have gone with enums all the way, because enum cases there can carry associated values that you supply when you create an instance. Then there would have been no need to write any factory methods.
The usage now looks like this. First the generic version…
class ArchivesFileTest extends ArchiveTestBase {
    @Test
    void read_from_json_simple()
            throws ReflectiveOperationException {
        create(
            mk(FullArchive.class,
                mk(IncrementalArchive.class,
                    mk(IncrementalArchive.class))),
            mk(FullArchive.class)
        );
    }
}
…and now the one with specific methods.
class ArchivesFileTest extends ArchiveTestBase {
    @Test
    void read_from_json_simple() {
        create(Full(Inc(Inc())), Full());
    }
}
I simply nest a few method calls et voilà, I have my test data. In my eyes it’s much easier to follow than creating every object using its constructor or factory method. This little one-liner replaces the ugly eight lines I showed you earlier. Now tell me, which version is easier to reason about?
Side-note: If Java supported creating an instance from a generic type parameter, as C++ templates do, it should be possible to drop the Class<T> parameter and call the method without the “.class” suffix. Something like this:
create(mk<FullArchive>(mk<IncrementalArchive>(mk<IncrementalArchive>())), mk<FullArchive>());
Unfortunately this is not possible, but the Full() and Inc() methods are nicely concise as well.
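There is a middle ground in Java that avoids both reflection and the checked exception: passing a constructor reference instead of a Class<T>. The following is only a sketch of that alternative, not what the project uses. The classes are simplified stand-ins and the holder class TestData is hypothetical.

```java
import java.util.function.Function;

// Simplified stand-ins for the test-data classes; create() is left out.
abstract class ArchiveType {
    final IncrementalArchive child;

    ArchiveType(IncrementalArchive child) {
        this.child = child;
    }
}

final class FullArchive extends ArchiveType {
    FullArchive(IncrementalArchive child) { super(child); }
}

final class IncrementalArchive extends ArchiveType {
    IncrementalArchive(IncrementalArchive child) { super(child); }
}

// Hypothetical holder for the generic factory methods.
final class TestData {
    private TestData() {}

    // The constructor reference takes the place of the Class<T> argument.
    static <T extends ArchiveType> T mk(Function<IncrementalArchive, T> constructor) {
        return mk(constructor, null);
    }

    static <T extends ArchiveType> T mk(
            Function<IncrementalArchive, T> constructor, IncrementalArchive child) {
        return constructor.apply(child);
    }
}
```

With a static import of TestData.mk, the call site becomes mk(FullArchive::new, mk(IncrementalArchive::new, mk(IncrementalArchive::new))). That is a little shorter than the “.class” version and needs no throws clause, but it is still longer than Full(Inc(Inc())).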
As you can see from the ArchiveTestBase abstract class, my approach to accessing the data is by index in the order the objects were specified.
Archive inc1 = get(1);
Archive full2 = get(3);
The complete code can be found on GitHub, including the Javadoc that I have left out here to keep the samples shorter. Beware that this is only the commit containing the relevant files for the “DSL”. The project does not compile yet because I still have unfinished code on my computer.