One of the hardest things to teach growing devs is when not to use their newfound skills. After putting huge effort into learning with a mentor, having that same mentor pick up a feature and say “No, just crap this one out, it’s not important” seems inconsistent and bewildering.

To help explain, I use what I call “The Everybody Poops Rule.” It goes like this:

Everybody poops. But you don’t poop in every room in the house. You have a special room where you poop, you put a door on it and you only poop there.

Okay, it’s crass but that makes it memorable. There are two key points here:

First, in the same way DDD is about admitting there is no single consistent model of the world or that TDD is about admitting failures are unavoidable, the Everybody Poops rule is about admitting some parts of a system will be…well, crap. Not every part of the system can or should be top-tier quality. Eventually, everybody poops. It’s a natural process.

But, and this is the second point, that doesn’t mean it’s unmanageable. DDD copes with the lack of a consistent model by embracing and integrating multiple models. TDD puts verification upfront, admitting all code is untrustworthy until proven otherwise.

The Everybody Poops rule responds to the existence of crap by embracing encapsulation. That company blog? It’s not a core system, throw it together. A shell script? Pffft, crap it out. That little framework glue code? Psha, it’ll be removed in the next framework release.

When in doubt, default to quality. But if the code is isolated, unimportant and behind an interface: who cares if it smells? We can still build a reasonable codebase, we can live in a reasonable house, as long as we only poop in specific areas and keep the door closed.

Let’s talk about some of the implications.

Let It Go

First and foremost, you can’t stop people from pooping. When dealing with a sufficiently complex system, there will always be waste. Holding it in is unhealthy and impossible.

Most teams follow the Broken Window Theory, fearing even a single tradeoff starts the slide down a slippery slope. This can reduce discussion (read: dissension) in the short term but leads to arbitrary compliance or worse.

Deciding on a level of quality isn’t like deciding on a coding standard: there’s no off-the-shelf, always-okay answer. Quality is the place to have nuanced discussions.

Not On The Walls

Even when you are producing crap, there are still standards of decency. Maybe the team is fine with a God Class or a really long method; maybe it doesn’t have all the unit tests. That’s okay, but it had better still have docs and pass your static analyzers.

Finding the right balance can be difficult since you’re already lowering your standards. Focus on the tradeoffs here, including the time and energy spent discussing.

Encrapsulation

How does one isolate crap? In practical terms, any way you normally split code. Put it behind an interface. Drop in a service layer. Pull it out to another namespace. Split it into a separate codebase. Delegate to a third party package. Build a bridge interface. Push it behind an API.

Remember, none of this erases the existence of the poop. Putting it behind a door doesn’t somehow make it clean or better. It’s just hopefully somewhere you don’t have to touch it all the time.
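As a sketch of what that isolation can look like in practice (every name here is invented for illustration): the crappy implementation hides behind a narrow interface, and callers never see past the door.

```php
<?php
// Hypothetical example: the company blog is non-core, so its implementation
// is allowed to be crap, as long as it stays behind a narrow interface.
interface BlogPosts
{
    /** @return string[] Post titles, newest first. */
    public function latestTitles(int $limit): array;
}

// The crap lives in here: quick, dirty and easy to throw away later.
final class QuickAndDirtyBlogPosts implements BlogPosts
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function latestTitles(int $limit): array
    {
        // No validation, no pagination, no caching. It's the blog. Who cares.
        usort($this->rows, function ($a, $b) {
            return $b['date'] <=> $a['date'];
        });

        return array_slice(array_column($this->rows, 'title'), 0, $limit);
    }
}

// Consumers only ever see the interface, never the mess behind the door.
function renderSidebar(BlogPosts $posts): string
{
    return implode(', ', $posts->latestTitles(3));
}
```

If the blog ever does become important, the interface marks exactly where the replacement implementation has to slot in.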

This isn’t easy work. Perversely, it often takes more skill to handle poop safely than it would to not make crap in the first place. Poop juggling is a senior ability.

Waste Management Systems

Never put poop behind a door unless you have a plan to get rid of it later.

That might seem silly since the poop is, by definition, unimportant. But ask yourself: if you can’t flush it, is it really isolated? Without good waste management, poop has a nasty habit of backing up and flooding you at the worst moment. And poop rarely ages well.

Besides, it might be an unimportant detail now but times change. Make sure you’re in a good position to replace or redirect that code if the business pivots. Today’s manure is tomorrow’s fertilizer.

Iterable Bowel Syndrome

Other times, you might want to go in and make a deliberately crappy pass at building an important system. Even worse than a spike or proof-of-concept, these barebones first drafts may stink, but they’re a great way of gaining knowledge or scaffolding other systems.

Just make a firm commitment to flush it at the end. If you name it, you (or your manager) are going to get attached to it.

Don’t Force It

Just because you can manage poop successfully doesn’t mean you should do it everywhere. No amount of engineering will ever make a kitchen toilet a good idea. Similarly, don’t crap in the domain layer.

Before you make the decision to crap something out, ask yourself:

  • How often do you have to touch it?
  • Is it going to hang around for a long time?
  • Is it a core domain for your company, e.g. do you draw competitive advantage from it?
  • What are the consequences if the project fails in development? At each quarter of its projected life cycle?
  • What are the consequences if it exceeds its lifecycle?
  • Do you know the maximum and minimum amount of complexity possible?
  • Can you find a solution to keep the complexity “bounded”, i.e. within a certain range?

Raising a Stink

If you do make the decision to underengineer something, bring your stakeholders into the decision. They’ll know if it’s worth it and it avoids surprise. Nothing undermines confidence in an engineering team faster than someone else coming along and saying “What is this crap?!” without knowing it’s supposed to be crap.

Make sure you document it. Not only will this help cover your butt, this is the stuff you want to touch the least so it’s the most likely to be forgotten. Finally, this documentation signals to new hires that quality is the standard and YOLO is a rare deviation.

Teach the standard. Document the deviation.

In The End

Many have commented that the best devs often write code indistinguishable from the newest juniors. On a line by line basis, that might be true. But that’s okay. When the work is done, you’ll realize their focus was somewhere else the whole time.

They weren’t worrying about the plumbing.

They were building the house.

Thanks to Cees-Jan Kiewiet, Igor Wiedler, Frank de Jonge, Anthony Ferrera and Matthias Noback for proofreading.

The most common question I get about my command bus library is: “can commands really return nothing?” The second most common question is “Why do you let commands return something, don’t you know that’s Wrong(tm)?”

It’s easy to get hung up on form, not function. That’s a shame because command buses aren’t important. In the grand scheme of things, they’re completely, utterly, totally unimportant. It’s all about where the messages are going, not how they get there.

Still, let’s take a look at some myths about command buses.

“Command Buses are a part of CQRS and they always go together.”

CQRS is about the separation of a (write) model that solves a problem and a (read) model that informs you about problems. This is a useful lens for viewing some domains.

While I wasn’t there when the term was coined (and haven’t asked), it’s pretty logical to assume the “Command” in “Command Bus” had the same etymology. It was a piece of plumbing used to address the write model, which is the important part of our app.

But that’s a one-way relationship. The write model doesn’t care how you send it information. An “old school” service layer like this:

$this->serviceLayer->openAccount($email, $password);

is just as CQRS-y as a command bus like this:

$this->serviceLayer->handle(new OpenAccount($email, $password));

It has the same guarantees, the same ability to encapsulate. It’s just a minor difference in implementation. And no, it doesn’t return anything.

Conversely, if we’re using either version in a non-CQRS app, it doesn’t matter if we return something or not. The non-returning is a property of CQRS, not the Command Bus.

“You can only use a Command Bus if you’re doing CQRS.”

Nowadays I would argue the term “Command” in Command Bus has less to do with the CQRS write model and more to do with the classic Gang of Four definition of “Command”, i.e. breaking down operations into discrete units and passing them around. Command Bus has found a broader audience as a modern take on implementing a service layer. And that’s okay.

The advantages for a command bus hold true whether you’re using CQRS or not:

  • It’s easy to decorate the uniform interface
  • It’s different enough to form a clear barrier between UI and domain model
  • The command class is a nice way to capture ubiquitous language
  • The command DTOs can pair well with form libraries and serializers

We use a command bus because it’s a useful implementation idiom, not because it’s a fixed part of CQRS. It’s not key architecture, it’s just a handy little twist.
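The first bullet above deserves a sketch. Because every command enters through the same `handle()` method, cross-cutting concerns wrap cleanly around the bus as decorators. (The `CommandBus` interface and class names below are invented for illustration, not taken from any particular library.)

```php
<?php
// Hypothetical uniform interface: one entry point for every command.
interface CommandBus
{
    public function handle(object $command): void;
}

// A decorator adds logging to *all* commands without touching any handler.
final class LoggingCommandBus implements CommandBus
{
    private $inner;
    private $log;

    public function __construct(CommandBus $inner, callable $log)
    {
        $this->inner = $inner;
        $this->log = $log;
    }

    public function handle(object $command): void
    {
        // Log before delegating to the decorated bus.
        ($this->log)('Handling ' . get_class($command));
        $this->inner->handle($command);
    }
}
```

The same shape works for transactions, validation or authorization: stack as many decorators as you like around the one `handle()` method.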

“I should be using CQRS and Command Buses because that’s good practice”

No, they’re useful practices for some types of problems. CQRS is rarely a primary goal anyway. Most of the folks I see using it are actually aiming at Event Sourcing, which is another orthogonal concept altogether.

“I shouldn’t be returning anything because CQRS is CQS which says don’t return anything”

CQS (like CQRS) can be a useful lens for some problems but not all situations or practices. It is a good guideline for warning you against side-effect prone code. That said, CQS is aimed at the method level whereas CQRS targets higher-level architecture. I’d argue they’re spiritually similar but CQRS isn’t a direct superset of CQS.

“We don’t return things because commands are asynchronous”

At a technical level, this seems legit. We could be sending the command into a background queue so we don’t know when it’s actually going to be executed. Therefore, we can’t count on a return value for any command.

In earlier times, this was touted as a feature. Commands are so easy to serialize, why wouldn’t we put them in queues and message buses like we do for events? At least, this is how I first understood it.

As time passed though, there’s been a general realization this doesn’t hold up. If I can’t execute the bulk of a user’s request directly because of some resource constraint, then I should still record and acknowledge they started it.

If you’re doing Event Sourcing, using a Saga or Process Manager is a very natural fit here; the initial command doesn’t do anything beyond raising an event that a user requested we do something, but at least we have a firm record of it. Then, using the same systems we use to distribute events to projectors, we can trigger a background worker when resources are available. If there are any problems, the Saga/Process Manager can retry using the recorded event.

In other words: let events be asynchronous, make commands be synchronous.

“We don’t return things because it’s our write model.”

Yes, now we’re getting somewhere.

The CQRS aversion to return values has nothing to do with return values specifically. It’s about making certain the write model isn’t reporting on the current state of the system; that’s the read model’s job. That’s the main supposition of CQRS, after all: that treating problem solving and reporting separately leads to less friction and more flexible models.

Whether the write model gives back state by return value, reference, pointer or passing indirectly via another object doesn’t matter. The goal is to keep the write model from taking on reporting duties.

The lack of return values is a consequence, not a first order rule. We’re not silencing the write model, it just has nothing to say.

“Right, we never return anything ever. EVER.”

We’re not having the write model generate and pass back the changed state of the app.

That said, there are instances the write model responds. In PHP or similar languages, the write model typically raises exceptions to signal error states. While exceptions are semantically and idiomatically different than return values, they can be used in a similar fashion. If we were working in a language without exceptions (like Golang), then it would be very reasonable to return errors from the command bus.
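As a sketch of that idiom (handler and exception names here are hypothetical): the handler returns nothing on success and throws on error, which the boundary layer can catch and translate.

```php
<?php
// Hypothetical write-model error signalling: silent on success,
// an exception on failure, no state passed back either way.
final class EmailAlreadyTaken extends \DomainException
{
}

final class OpenAccountHandler
{
    private $taken;

    public function __construct(array $alreadyTakenEmails)
    {
        $this->taken = $alreadyTakenEmails;
    }

    public function handle(string $email): void
    {
        if (in_array($email, $this->taken, true)) {
            throw new EmailAlreadyTaken($email . ' is already registered');
        }

        $this->taken[] = $email; // success: change state, return nothing
    }
}
```

A controller can then catch `EmailAlreadyTaken` and turn it into a 409 or a form error, without the write model ever having “returned” anything.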

None of this violates CQRS principles. Even orthodox folks will concede that returning an acknowledgement or some other “small” value is reasonable (albeit in rare, always unspecified edge cases). In practice, the most common use case like this is returning a complex id from the command handler. While I (personally) don’t think that’s a huge problem, it’s pretty easy to work around this by introducing a domain service that can be used to pre-generate the id in the controller.

Finally, it’s important to remember that write models often produce state and make it available through other means. The most common reason for CQRS right now is Event Sourcing, which produces state as events. While events are recorded mainly for the purpose of future write model decisions, they are distributed through the system, so in the sense of availability, they’re as good as returned.

“Wait, so I can return events from my command handlers and still be doing CQRS?”

I don’t know. Maybe? But even if that was internally consistent, I wouldn’t recommend returning events all the way up to the controller dispatching the command.

Examining the event stream and deciding what to show the user can be a complex task. After all, it’s complex enough that we have the concept of a read model.

The last thing most apps need is more logic in controllers, so don’t burden controllers with the task of interpreting a partial event stream. There might not be enough information to make a good projection but there is enough complexity to make a bad one.

So, while we could get that data at the controller level, that doesn’t make it the best place to handle it. Nor does having the events there overcome the UX frustrations of eventual consistency. If eventual consistency is a problem for you, you don’t have scaling issues, and you have a taste for the unorthodox, then flout it however you like, but just returning events isn’t the fix.

In the end, my advice is let the write model communicate to the read model horizontally and query the read model from your user interface vertically.

“Cool story bro but I’m using a Command Bus that doesn’t return anything so I know I’m doing CQRS right.”

Not necessarily. What if your read and your write model are still tightly coupled? That can happen regardless of any bus.

In the same way it’s possible to use a data mapper ORM and still produce tight coupling on a database, it’s perfectly possible to use a command bus that doesn’t return anything and not be doing CQRS.

Tooling never guarantees an outcome.

“So…should I be returning stuff or not?”

In a non-CQRS app? Do whatever the heck you want.

In a CQRS app? Probably not, depending on how you’ve wired it up. But remember, the point isn’t to cargo-cult about return values, it’s to understand the constraints and decisions that lead to that tactical choice.

Architectures have evolved certain conventions in the way they’re built, it’s the tradeoffs we’re trying to respect, not the specific form. At the same time, these conventions evolved because of the experience of previous developers and we can learn from them to avoid bumping into the same issues.

Thanks to Frank de Jonge, Jeroen vd Gulik and Shawn McCool for proofreading and feedback.

tl;dr When creating value objects representing time, I recommend choosing how fine-grained the time should be with your domain experts and rounding it off to that precision in the value object.

When modeling important numbers, it’s considered good form to specify the precision. Whether it’s money, size or weight; you’ll typically round off to a given decimal point. Even if it’s only for user display, rounding off makes the data more predictable for manipulation and storage.

Unfortunately, we don’t often do this when handling time and it bites us in the rear. Consider the following code:

$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');

// let's assume today is ALSO 2017-06-21
$now = new DateTimeImmutable('now');

if ($now > $estimatedDeliveryDate) {
    echo 'Package is late!';
} else {
    echo 'Package is on the way.';
}

Since it’s June 21 in the code sample, this code should print “Package is on the way.” After all, the day isn’t over yet, it might just be coming later in the afternoon.

Except the code doesn’t do that. Because we didn’t specify the time component, PHP helpfully zero pads $estimatedDeliveryDate to 2017-06-21 00:00:00. On the other hand, $now is calculated for…now. “Now” includes the current time (which probably isn’t midnight), so you’ll get 2017-06-21 15:33:34 which is indeed later than 2017-06-21 00:00:00.

Solution 1

“Oh, this is a quick fix,” folks might say, and update it to the following.

$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');
$estimatedDeliveryDate = $estimatedDeliveryDate->setTime(23, 59);

Cool, we changed the time to include up to midnight. Except the time is padded to 23:59:00 so if you look in the last 59 seconds of the day, you’ll have the same problem.

“Grrr, okay,” folks might say.

$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');
$estimatedDeliveryDate = $estimatedDeliveryDate->setTime(23, 59, 59);

Cool, now it’s fixed.

…Unless you’re on PHP 7.1, which adds microseconds to DateTime objects. So now it only occurs on the last second of the day. I may be biased after working on too many high-traffic systems but sooner or later a user or an automated process will hit that and complain. Good luck tracking down THAT bug. :-/

Okay, let’s add microseconds.

$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');
$estimatedDeliveryDate = $estimatedDeliveryDate->modify('23:59:59.999999');

And this works.

Until we get nanoseconds.

In PHP 7.2.

Okay, okay, we CAN reduce the margin of error further and further to the point that errors become unrealistic. Still, at this point it should be clear this approach is flawed: we’re chasing an infinitely divisible value closer and closer to a point we can never reach. Let’s try a different approach.

Solution 2

Instead of calculating the last moment before our boundary, let’s check against the boundary instead.

$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');

// Start calculating when it's late instead of the last moment it's running on time
$startOfWhenPackageIsLate = $estimatedDeliveryDate->modify('+1 day');

$now = new DateTimeImmutable('now');

// We've changed the > operator to >=
if ($now >= $startOfWhenPackageIsLate) {
    echo 'Package is late!';
} else {
    echo 'Package is on the way';
}

So this version works and it’s always accurate throughout the whole day. Unfortunately, it’s also become more complex. If you don’t encapsulate this logic within a value object or similar, it’ll get missed somewhere in your app.

Even if you do encapsulate it, we’ve made this one type of operation (>=) logical and consistent but it’s not a consistent fix for all operations. If we wanted to support equality checks, for example, we’d have to do another, different type of special data juggling to make that operation work correctly. Meh.

Finally (and this might just be me) this solution has the misleading smell of a potentially missed domain concept. “Is there a LatePeriodRange? A DeliveryDeadline?” you might say. “The package enters into a late period, then….something happens? The domain expert never mentioned a deadline, but it seems to be there. Is that different than the EstimatedDeliveryDate? Where does it go?” It doesn’t go. It doesn’t go anywhere. It’s just a weird quirk of the implementation that’s now stuck in your head.

So, this is a better solution in that it consistently yields a correct answer…but it’s not a great solution. Let’s see if we can do better.

Solution 3

So, all we want to do is compare two days. Now, if we picture a DateTime object as a set of numbers (year, month, day, hour, minute, second, etc…) everything up to the day part is working fine. All of the problems we’ve had are due to the extra values after that: hour, minute, second, etc. We can argue about the annoying and insidious ways those values keep leaking in there, but the fact remains that it’s the time component that’s wrecking our checks.

If the day component is all that’s important to us, why do we put up with these extra values? Unless it rolls over into the next day, a few extra hours or minutes won’t change the outcome of the business rules.

So, let’s just throw the extra cruft away.

// Simplify the dates down to just the day, discarding the rest
$estimatedDeliveryDate = day(new DateTimeImmutable('2017-06-21'));
$now = day(new DateTimeImmutable('now'));

// Now the comparison is simple
if ($now > $estimatedDeliveryDate) {
    echo 'Package is late!';
} else {
    echo 'Package is on the way.';
}

// Clunky but effective way to discard extra precision. PHP
// will zero pad the remaining values such as milli/nanosecond.
// In the case of dates, we can just use the setTime function but for
// other precisions, you'll have to use other logic to discard the extra
// data beyond what you need.
function day(DateTimeImmutable $date) {
    return $date->setTime(0, 0, 0);
}

This gives us the simpler comparison/calculation we saw in solution 1, with the accuracy we had in solution 2. It’s just…the ugliest version of the code yet, plus it’s super easy to forget to call day() in your code.

However, the code IS easy to abstract. More importantly though, it’s becoming clear that when we’re talking about estimated delivery dates, we’re ALWAYS talking about a day, never about a time. Both of these things make this a good candidate for pushing this into a type.

Encapsulation At Last

In other words, let’s make this a value object.

$estimatedDeliveryDate = EstimatedDeliveryDate::fromString('2017-06-21');
$today = EstimatedDeliveryDate::today();

if ($estimatedDeliveryDate->wasBefore($today)) {
    echo 'Package is late!';
} else {
    echo 'Package is on the way.';
}

Look how nice that reads. The value object itself is nice and boring:

class EstimatedDeliveryDate
{
    private $day;

    private function __construct(DateTimeImmutable $date)
    {
        $this->day = $date->setTime(0, 0, 0);
    }
    public static function fromString(string $date): self
    {
         // Possibly verify YYYY-MM-DD format, etc
        return new static(new DateTimeImmutable($date));
    }
    public static function today(): self
    {
        return new static(new DateTimeImmutable('now'));
    }
    public function wasBefore(EstimatedDeliveryDate $otherDate): bool
    {
        return $this->day < $otherDate->day;
    }
}

Because we’ve now made this a class, we’re automatically enforcing a lot of helpful rules:

  • You can only compare an EstimatedDeliveryDate to another EstimatedDeliveryDate, so the precision always lines up.
  • The correct precision handling lives in a single internal place; the consuming code never needs to consider precision at all.
  • It’s easy to test.
  • You’ve got a great single place to centralize your timezone handling (not discussed here but super important).

One quick pro-tip: I’ve used a today() method here to show how you can have multiple constructors. In practice, I’d recommend creating a system clock and getting your “now” instances from that; it’ll make your unit tests much easier to write. The “real” version would probably look like:

$today = EstimatedDeliveryDate::fromDateTime($this->clock->now());
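A minimal sketch of such a clock (the `Clock` interface and class names here are assumptions, not from any particular library): one implementation reads the real time, the other stands still for tests.

```php
<?php
// Hypothetical system clock abstraction.
interface Clock
{
    public function now(): DateTimeImmutable;
}

// Production implementation: the real current time.
final class SystemClock implements Clock
{
    public function now(): DateTimeImmutable
    {
        return new DateTimeImmutable('now');
    }
}

// Test implementation: time stays exactly where you pinned it.
final class FrozenClock implements Clock
{
    private $now;

    public function __construct(DateTimeImmutable $now)
    {
        $this->now = $now;
    }

    public function now(): DateTimeImmutable
    {
        return $this->now;
    }
}
```

In a unit test, a FrozenClock pinned to 23:59:59 makes those end-of-day edge cases deterministic instead of flaky.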

Precision Through Imprecision

The important takeaway here isn’t “value objects, yay, inline juggling, boo!” It’s that we were able to remove several classes of errors by reducing the precision of the DateTime we were handling. If we hadn’t done that, the value object would still be handling all of these edge cases and probably failing at some of them too.

Reducing the quality of data to get a correct answer might seem counter-intuitive but it’s actually a more realistic view of the system we’re trying to model. Our computers might run in picoseconds but our business (probably) doesn’t. Plus, the computer is probably lying anyway.

As devs, it might feel we’re being more flexible and future-proof by keeping all possible information. After all, who are you to decide what information to throw away? Yet, the truth is that while information can potentially be worth money in the future, it definitely costs money to keep it until that future. It’s not just the cost of a bigger hard drive either, it’s the cost of complexity, of people, of time, and in the case of bugs, reputation. Sometimes working with data in its most complex form will turn out to be worth the cost but sometimes it isn’t, so just blindly saving everything you can because you can isn’t always a winning game.

To be clear: I’m not recommending you just randomly remove available time information.

What I am recommending: Explicitly choose a precision for your time points, together with your domain experts. If you’re getting more precision than you expect, it can cause bugs and additional complexity. If you’re getting less precision than you expect, it can cause bugs and failed business rules. The important thing is that we define the expected and necessary level of precision.

Further, choose the precision separately for each use case. Rounding will usually be in the value object, not at the system clock level. As we’ll talk about later, some places still need nanosecond precision but others might only need a year. Getting the precision right makes the language clearer.

This Crap Is Everywhere

It’s worth pointing out that we’ve only talked about a specific type of bug here: excess precision throwing off greater than/less than checks. But this advice applies to a much wider set of errors. I won’t go into all of them, though I do want to point out a personal favorite, “leftover” precision.

// Let's assume today is June 21st, so this equals June 28
$oneWeekFromNow = new DateTimeImmutable('+7 days');
// Also June 28 but set explicitly or loaded from DB
$explicitDate = new DateTimeImmutable('2017-06-28');

// Comparing based on state, are these the same date?
var_dump($oneWeekFromNow == $explicitDate);

No, they’re not the same date because $oneWeekFromNow also has the current time whereas $explicitDate is set to 00:00:00. Delightful.

The examples above talked about precision primarily in time vs date but modeling precision applies to any unit of time. Imagine how many scheduling apps only need times to the minute and how many financial apps need support for quarters of the year.

Once you start looking at it, you realize how many time errors can be explained by undefined precision. They might look like bad range checks or poorly designed bounds but when you dive in, you start to see a pattern emerge.

My experience is that this class of errors is often missed in testing. System clock objects aren’t a common sight (yet), so testing code that uses the current time is a bit tricky. And when there are tests, the fixtures often don’t pad the date out completely, so it’s easy to miss the error windows.

Nor is this a problem specific to PHP’s DateTime library. When I tweeted about this last week, Anthony Ferrara mentioned how Ruby’s time precision varies depending on the operating system yet the database library had a fixed level. That sounds fun to debug.

Time is just hard to work with. Time math doubly so.

Choosing A Level Of Precision

So we can say that choosing a level of precision for your time objects is super important but how do we select the right one? As a rule of thumb, I would say be open-ended with timepoints for your technical needs but set an explicit level of precision for all of your domain objects.

For your logs, your event sourcing data, your metrics, go as fine-grained as you need/want/can. These are primarily aimed at technical personnel who are more familiar with fine-grained dates and the extra precision is often necessary for debugging. You’ll likely need to get very fine-grained for system or sequenced data. That’s okay, it’s what the constraints demand.

For business concerns, talk to your domain experts about how fine-grained that information needs to be. They can help you balance what they’re using now vs what they might need in the future. Business rules are often an area where you’re playing with borrowed knowledge, so shedding complexity can be a smart move. Remember, you’re not after an accurate-to-real-life model, you’re after a useful one.

Within the code, this might occasionally lead to varying levels of precision, even within the same class. For example, consider this class in an event sourced application.

class OrderShipped
{
    // Order object that's capped to day precision.
    private $estimatedDeliveryDate;

    // Order object that's capped to second level precision.
    private $shippedAt;

    // Event sourcing object that's capped to microsecond
    private $eventRecordedAt;
}

If the varying levels of precision seem strange, remember that these time points have very different use cases. Even though $shippedAt and $eventRecordedAt might point to the same “time”, they belong to very different sections of the code.

You might also find the business is working with units (and therefore precisions) of time you don’t expect: quarters, financial calendars, shifts, morning/afternoon/evening parts. There’s a lot of interesting conversations to be had in exploring these extra units.

Changing Requirements

Another good part of having these conversations: If the business rules change in the future and you turn out to need more precision than originally recorded, then it was a joint decision and you talk about how to fix the legacy cases. Ideally, we can shift it from being a technical problem to a business one.

In most cases, this can be simple: “We originally only needed the date they signed up but now we need the time so we can see if it falls before business closing times.” Maybe this affects a small number of accounts and you can set them to the beginning of the next business day. Or just zero pad the dates. Or maybe there’s an extra business rule where signing up after 18:00 sets the subscription end date to tomorrow+1 year instead of today+1 year. Talk to them about it. Folks are more proactive and understanding with changes if you include them in the discussion from the beginning (if only to mitigate blame).

In more complex scenarios, you can look at reconstructing data based on other data in your system. Maybe we can derive it from event sourcing times or user registration dates. In some cases, it simply isn’t possible and you’ll have to construct new business rules about what to do with migrating legacy cases. But the truth is, you can’t plan for everything and you probably won’t know what will change. That’s life.

So that’s my thoughts about time precision: use what you need and no more.

Appendix: An Ideal Solution

Going forward, I feel there’s real practical benefits to choosing fixed precisions and modeling them as custom types. My ideal PHP time library would probably be something that provides units of time as abstract classes that I extend into my value objects and then build on.

class ExpectedDeliveryDate extends PointPreciseToDate
{
}
class OrderShippedAt extends PointPreciseToMinute
{
}
class EventGenerationTime extends PointPreciseToMicrosecond
{
}

By pushing the precision to the class, we force a decision about precision. We can limit methods like “setTime()” to the precisions they actually apply to (not on Dates!) and we can round DateInterval to whatever makes sense for the type. Most of the utility methods could have protected visibility and my value objects could expose only those that make sense for my domain. Also, we’d be encouraging folks to create value objects. So. Many. Value objects. Yessss.
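For the curious, here’s a rough sketch of what such a base class might look like. To be clear: this library doesn’t exist, and every name below is invented for illustration.

```php
<?php
// Sketch of a precision-enforcing base class (hypothetical, not a real library).
// Subclasses inherit comparison at exactly day precision.
abstract class PointPreciseToDate
{
    private $day;

    protected function __construct(DateTimeImmutable $point)
    {
        // Discard everything below day precision; setTime() also
        // zeroes the microseconds added in PHP 7.1.
        $this->day = $point->setTime(0, 0, 0);
    }

    public static function fromDateTime(DateTimeImmutable $point): self
    {
        return new static($point);
    }

    protected function isBefore(self $other): bool
    {
        return $this->day < $other->day;
    }

    public function toDayString(): string
    {
        return $this->day->format('Y-m-d');
    }
}

final class ExpectedDeliveryDate extends PointPreciseToDate
{
    // Expose only the comparison that makes sense for the domain.
    public function wasBefore(ExpectedDeliveryDate $other): bool
    {
        return $this->isBefore($other);
    }
}
```

Note how the rounding happens once, in the base class constructor, so no subclass can accidentally reintroduce a time component.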

Bonus points if the library makes it easy to define custom units of time.

Actually building it though? Ain’t nobody got time for that.

Many thanks to Frank de Jonge, Jeroen Heijmans, Anna Baas and Matthias Noback for feedback on earlier drafts of this article!

Over the last couple years, I’ve started putting my Exception messages inside static methods on custom exception classes. This is hardly a new trick, Doctrine’s been doing it for the better part of a decade. Still, many folks are surprised by it, so this article explains the how and why.

How does it work?

Let’s say you’re writing a large CSV import and you stumble across an invalid row, perhaps it’s missing a column. Your code might look like this:

if (!$row->hasColumns($expectedColumns)) {
    throw new Exception("Row is missing one or more columns");
}

This works in terms of stopping the program but it’s not very flexible for the developer. We can improve this by creating a custom exception class.

class InvalidRowException extends \Exception
{
}

Now we throw our custom Exception instead:

if (!$row->hasColumns($expectedColumns)) {
    throw new InvalidRowException("Row is missing one or more columns");
}

This might look like boilerplate but it allows higher level code to recognize which error was raised and handle it accordingly. For example, we might stop the entire program on a NoDatabaseConnectionException but only log an InvalidRowException before continuing.
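For instance, that higher-level handling might look like the following sketch; the importer closure, the sample rows and NoDatabaseConnectionException are invented for illustration:

```php
<?php
declare(strict_types=1);

class InvalidRowException extends \Exception {}
class NoDatabaseConnectionException extends \Exception {}

// Illustrative import step: an empty row counts as invalid.
$importRow = function (array $row): void {
    if ($row === []) {
        throw new InvalidRowException('Row is missing one or more columns');
    }
};

$log = [];
foreach ([['a', 'b'], [], ['c', 'd']] as $i => $row) {
    try {
        $importRow($row);
    } catch (InvalidRowException $e) {
        $log[] = "row $i: " . $e->getMessage(); // log and continue
    } catch (NoDatabaseConnectionException $e) {
        throw $e; // unrecoverable: stop the entire import
    }
}
```

The import survives the bad row, the good rows still land, and the log tells you what to clean up afterwards.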

Still, the error message isn’t very helpful from a debugging perspective. Which row failed? It would be better if we always included the row number in our error message.

if (!$row->hasColumns($expectedColumns)) {
    throw new InvalidRowException(
        "Row #" . $row->getIndex() . " is missing one or more columns"
    );
}

That’s better in the log but now the formatting on this one little message is getting a bit noisy and distracting. There’s no upper bound on this complexity: as the log message gets complex, the code will get uglier. Not to mention, there are multiple reasons we might throw an InvalidRowException but we’d need to format them all to include the row number. Booorrrriiing.

Moving the Formatting

We can remove the noise by pushing the formatting into the custom Exception class. The best way to do this is with a static factory:

class InvalidRowException extends \Exception
{
    public static function incorrectColumns(Row $row) {
        return new static("Row #" . $row->getIndex() . " is missing one or more columns");
    }
}

And now we can clean up the importing code without losing readability:

if (!$row->hasColumns($expectedColumns)) {
    throw InvalidRowException::incorrectColumns($row);
}

The only extra code is the function block surrounding our message. That function block isn’t just noise though: it allows us to typehint and document what needs to be passed to generate a nicely formatted message. And if those requirements ever change, we can use static analysis tools to refactor those specific use cases.

This also frees us from the mental constraints of the space available. We’re not bound to writing code that fits into a single if clause, we have a whole method to do whatever makes sense.

Maybe common errors warrant more complex output, like including both the expected and the received list of columns.

public static function incorrectColumns(Row $row, $expectedColumns)
{
    $expectedList = implode(', ', $expectedColumns);
    $receivedList = implode(', ', $row->getColumns());

    return new static(
        "Row #" . $row->getIndex() . " did not contain the expected columns. " .
        " Expected columns: " . $expectedList .
        " Received columns: " . $receivedList
    );
}

The code here got significantly richer but the consuming code only needed to pass one extra parameter.

if (!$row->hasColumns($expectedColumns)) {
    throw InvalidRowException::incorrectColumns($row, $expectedColumns);
}

That’s easy to consume, especially when throwing it in multiple locations. It’s the type of error message everyone wants to read but rarely takes the time to write. If it’s an important enough part of your Developer Experience, you can even unit test that the exception message contains the missing column names. Bonus points if you use array_diff/array_intersect to show the actual unexpected columns.

Again, that might seem like overkill and I wouldn’t recommend gold plating every error scenario to this extent. Still, if this is code you really want to own and you can anticipate the common fix for these errors, spending 1 extra minute to write a solid error message will pay big in debugging dividends.
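Here's a hedged sketch of that array_diff bonus round; I've dropped the Row object so the example stands alone, but the class and method names follow the earlier examples:

```php
<?php
declare(strict_types=1);

// Sketch of the "bonus points" version: name the actual missing and
// unexpected columns rather than dumping both full lists.
class InvalidRowException extends \Exception
{
    public static function incorrectColumns(array $received, array $expected)
    {
        $missing    = array_diff($expected, $received);
        $unexpected = array_diff($received, $expected);

        return new static(
            'Row did not contain the expected columns.'
            . ' Missing: ' . implode(', ', $missing)
            . ' Unexpected: ' . implode(', ', $unexpected)
        );
    }
}
```

A typo'd header like "nmae" now shows up on both sides of the message, which is usually all the hint a developer needs.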

Multiple Use Cases

So far we’ve created a method for one specific use case, an incorrect number of columns.

Maybe we have other issues with our CSV file, like the existence of blank lines. Let’s add a second method to our exception:

class InvalidRowException extends \Exception
{
    // ... incorrectColumns ...

    public static function blankLine($rowNumber)
    {
        return new static("Row #$rowNumber was a blank line.");
    }
}

Again, a bit of boilerplate but we get extra space to write more detailed messages. If the same issue keeps occurring, perhaps it’s worth adding some extra details, links or issue numbers (or you know, fixing it more permanently).

public static function blankLine($rowNumber)
{
    return new static(
        "Row #$rowNumber was a blank line. This can happen if the user " .
        "exported the source file with incorrect line endings."
    );
}

Locking It Down

When you’re creating “named constructors” for domain objects, you use a similar technique but also declare the original constructor as private. I don’t do this with exceptions though.

First off, we can’t in PHP. Exceptions all extend from a parent Exception class and can’t change their parent’s __construct access level from public to private.

More importantly, there’s little benefit for Exceptions. With domain objects, we do this to capture the ubiquitous language and prevent users from instantiating the object in an invalid state. But with an exception, there’s very little state to keep valid. Furthermore, we’re inherently dealing with exceptional circumstances: we can’t foresee every reason a user might throw an exception.

So, the best you can do is create a base factory that you recommend other folks use when creating their exceptions. This can typehint for useful things commonly included in the message:

class InvalidRowException extends \Exception
{
    // One off error messages can use this method...
    public static function create(Row $row, $reason)
    {
        return new static("Row " . $row->getIndex() . " is invalid because: " . $reason);
    }

    // ...and predefined methods can use it internally.
    public static function blankLine(Row $row)
    {
        return static::create($row, "Is a blank line.");
    }
}

Which might be useful but is probably pretty far out there. I haven’t seen a convenient way to enforce it though.

The Big Picture

There’s one final benefit I’d like to touch on.

Normally, when you write your exception messages inline, the various error cases might be split across different files. This makes it harder to see the reasons you’re raising them, which is a shame since exception types are an important part of your API.

When you co-locate the messages inside the exception, however, you gain an overview of the error cases. If these cases multiply too fast or diverge significantly, it’s a strong smell to split the exception class and create a better API.

// One of these isn’t like the others and should probably be a different Exception class
class InvalidRowException extends \Exception
{
    public static function missingColumns(Row $row, $expectedColumns);
    public static function blankLine(Row $row);
    public static function invalidFileHandle($fh);
}

Sometimes we underestimate the little things that shape our code. We’d like to pretend that we’re not motivated by getting our error messages neatly on one line or that we regularly do a “Find Usages” to see our custom exception messages, but the truth is: these little details matter. Creating good environments at a high level starts with encouraging them at the lowest levels. Pay attention to what your habits encourage you to do.

Recently, a few folks asked about a trait in a new project I wrote. Right around the same time, Rafael Dohms showed me his new talk about complex cognitive processes we don’t notice. Because my brain is a big mushy sack, the two blended together. The result was this post, which tries to capture how I use traits but also how I decide to use them in the first place.

Leverage vs Abstraction

The first thing you should do is go read this blog post: “Abstraction or Leverage” from Michael Nygard. It’s an excellent article.

If you’re short on time, the relevant part is that chunks of code (functions, classes, methods, etc) can be split out for either abstraction or leverage. The difference is:

  • Abstraction gathers similar code behind a higher level concept that’s more concise for other code to work with.
  • Leverage gathers similar code together so you only have one place to change it.

A common abstraction would be a Repository: you don’t know how an object is being stored or where but you don’t care. The details are behind the concept of a Repository.

Common leverage would be something like your framework’s Controller base class. It doesn’t hide much, it just adds some nifty shortcuts that are easier to work with.

As the original blog post points out, both abstraction and leverage are good. Abstraction is slightly better because it always gives you leverage but leverage doesn’t give you abstraction. However, I would add that a good abstraction is more expensive to produce and isn’t possible at every level. So, it’s a trade off.

What’s this have to do with traits?

Some language features are better than others at producing either Leverage or Abstraction. Interfaces, for example, are great at helping us build and enforce abstractions.

Inheritance, on the other hand, is great at providing leverage. It lets us override select parts of the parent code without having to copy it or extract every method to a reusable (but not necessarily abstracted) class. So, to answer the original question, when do I use traits?

I use traits when I want leverage, not abstraction.

Sometimes.

Sometimes?

Benjamin Eberlei makes a good argument that traits have basically the same problems as static access. You can’t exchange or override them and they’re lousy for testing.

Still, static methods are useful. If you’ve got a single function with no state and you wouldn’t want to exchange it for another implementation, there’s nothing wrong with a static method. Think about named constructors (you rarely want to mock your domain models) or array/math operations (well defined input/output, stateless, deterministic). It makes you wonder if static state, rather than static methods, is the real evil.

Traits have roughly the same constraints, plus they can only be used when mixed into a class. They’re more macro than object.

This does give traits an extra feature though: they can read and write the internal state of the class they’re mixed into. This makes them more suitable for some behavior than a static method would be.

An example I often use is generating domain events on an entity:

trait GeneratesDomainEvents
{
    private $events = [];

    protected function raise(DomainEvent $event)
    {
        $this->events[] = $event;
    }

    public function releaseEvents()
    {
        $pendingEvents = $this->events;
        $this->events = [];
        return $pendingEvents;
    }
}

While we can refactor this to an abstraction, it’s still a nice example of how a trait can work with local object state in a way static methods can’t. We don’t want to expose the events array blindly or place it outside the object. We might not want to force another abstraction inside our model but we certainly don’t want to copy and paste this boilerplate everywhere. Plain as they are, traits help us sidestep these issues.
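For example, mixing the trait into an entity might look like this; the Subscription entity and event name are invented, and the trait is repeated from above so the sketch runs standalone:

```php
<?php
declare(strict_types=1);

class DomainEvent
{
    public $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

trait GeneratesDomainEvents
{
    private $events = [];

    protected function raise(DomainEvent $event)
    {
        $this->events[] = $event;
    }

    public function releaseEvents()
    {
        $pendingEvents = $this->events;
        $this->events = [];
        return $pendingEvents;
    }
}

// Illustrative entity: the events array stays encapsulated, and the
// entity alone decides when something is worth raising an event for.
class Subscription
{
    use GeneratesDomainEvents;

    public function cancel()
    {
        $this->raise(new DomainEvent('subscription.cancelled'));
    }
}
```

Calling code can cancel the subscription and later release the pending events to a dispatcher, without ever touching the array directly.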

Other practical examples might be custom logging functions that dump several properties at once or common iteration/searching logic. Admittedly, we could solve these with a parent class but we’ll talk about that in a moment.

So traits are a solid fit here but that doesn’t make static methods useless. In fact, I prefer static methods for behavior that doesn’t need the object’s internal state since it’s always safer to not provide it. Static methods are also more sharply defined and don’t require a mock class to be tested.

Assertions are a good example of where I prefer static methods, despite seeing them commonly placed in traits. Still, I find Assertion::positiveNumber($int) gives me the aforementioned benefits and it’s easier to understand what it is (or isn’t) doing to the calling class.
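As a sketch, such an assertion might look like this; the class and the exact behavior of positiveNumber are my assumptions, not a specific library's API:

```php
<?php
declare(strict_types=1);

// An assertion as a static method: stateless, deterministic, and
// obvious about what it does (and doesn't) touch in the caller.
class Assertion
{
    public static function positiveNumber($value)
    {
        if (!is_numeric($value) || $value <= 0) {
            throw new \InvalidArgumentException(
                'Expected a positive number, got: ' . var_export($value, true)
            );
        }
    }
}
```

There's no hidden state and no mixed-in methods to wonder about: the call site tells you everything it can possibly do.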

If you do have an assertion you’re tempted to turn into a trait, I’d treat it as a code smell. Perhaps it needs several parameters you’re tired of giving it. Perhaps validating $this->foo relies on the value of $this->bar. In either of these cases, refactoring to a value object can be a better alternative. Remember, it’s best if leverage eventually gives way to abstraction.
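For instance, a rule where validating one field depends on another collapses naturally into a value object; DateRange and its rule here are illustrative:

```php
<?php
declare(strict_types=1);

// Instead of a validation trait poking at $this->foo and $this->bar,
// pull both fields into a small object that can't exist in an invalid state.
final class DateRange
{
    private $start;
    private $end;

    public function __construct(\DateTimeImmutable $start, \DateTimeImmutable $end)
    {
        if ($end < $start) {
            throw new \InvalidArgumentException('End date must not precede start date.');
        }

        $this->start = $start;
        $this->end = $end;
    }
}
```

The cross-field check now runs exactly once, at construction, and every consumer of a DateRange can rely on it having passed.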

So, to restate: I use traits when I want leverage that needs access to an object’s internal state.

Parent Classes

Everything we’ve seen is also possible through inheritance. An EventGeneratingEntity would arguably be even better since the events array would be truly private. However, traits offer the possibility of multiple inheritance instead of a single base class. Aside from their feature set, is there a good heuristic for choosing?

All things being equal, I like to fallback on something akin to the “Is-A vs Has-A” rule. It’s not an exact fit because traits are not composition but it’s a reasonable guideline.

In other words, use parent classes for functionality that’s intrinsic to what the object is. A parent class is good at communicating this to other developers: “Employee is a Person”. Just because we’re going for leverage doesn’t mean the code shouldn’t be communicative.

For other, non-core functionality on an object (fancy logging, event generation, boilerplate code, etc.), a trait is an appropriate tool. It doesn’t define the nature of the class; it’s a supporting feature or, better yet, an implementation detail. Whatever you get from a trait is just in service of the main object’s goal: traits can’t even pass a type check, that’s how unimportant they are.

So, in the case of the event generation, I prefer the trait to a base EventGeneratingEntity because Event Generation is a supporting feature.

Interfaces

I rarely (if ever) extend a class or create a trait without also creating an interface.

If you follow this rule closely, you’ll find that traits can complement the Interface Segregation Principle well. It’s easy to define an interface for a secondary concern and then ship a trait with a simple default implementation.

This allows the concrete class to implement its own version of the interface or stick with the trait’s default version for unimportant cases. When your choices are boilerplate or forcing a poor abstraction, traits can be a powerful ally.
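A sketch of that pattern, with invented names throughout:

```php
<?php
declare(strict_types=1);

// A small interface for a secondary concern, shipped alongside a trait
// that carries a default implementation.
interface Describable
{
    public function describe(): string;
}

trait DefaultDescription
{
    public function describe(): string
    {
        return static::class; // boring-but-fine default
    }
}

class AuditLogEntry implements Describable
{
    use DefaultDescription; // unimportant case: take the default
}

class Invoice implements Describable
{
    public function describe(): string // important case: custom version
    {
        return 'Invoice #42';
    }
}
```

Consumers only ever typehint Describable; whether a class leaned on the trait or rolled its own is invisible to them.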

Still, if you’ve only got one class implementing the interface in your code and you don’t expect anyone else to use that implementation, don’t bother with a trait. You’re probably not making your code any more maintainable or readable.

When I do not use traits

To be quite honest, I don’t use traits very often, perhaps once every few months. The heuristic I’ve outlined (leverage requiring access to internal state) is extremely niche. If you’re running into it very often, you probably need to step back and reexamine your style of programming. There’s a good chance you’ve got tons of objects waiting to be extracted.

There’s a few places I don’t like to use traits due to style preferences:

  • If the code you’re sharing is just a couple of getters and setters, I wouldn’t bother. IDEs can be your leverage here and adding a trait will only add mental overhead.
  • Don’t use traits for dependency injection. That’s less to do with traits themselves and more to do with setter injection, which I’m rather opposed to.
  • I don’t like traits for mixing in large public APIs or big pieces of functionality. Skip the leverage step and move directly to finding an abstraction.

Finally, remember that while traits do not offer abstraction and they are not composition, they can still have a place in your toolbox. They’re useful for providing leverage over small default implementations or duplicate code. Always be ready to refactor them to a better abstraction once the code smells pile up.