The most common question I get about my command bus library is: “can commands really return nothing?” The second most common question is “Why do you let commands return something, don’t you know that’s Wrong(tm)?”

It’s easy to get hung up on form, not function. That’s a shame because command buses aren’t important. In the grand scheme of things, they’re completely, utterly, totally unimportant. It’s all about where the messages are going, not how they get there.

Still, let’s take a look at some myths about command buses.

“Command Buses are a part of CQRS and they always go together.”

CQRS is about the separation of a (write) model that solves a problem and a (read) model that informs you about problems. This is a useful lens for viewing some domains.

While I wasn’t there when the term was coined (and haven’t asked), it’s pretty logical to assume the “Command” in “Command Bus” had the same etymology. It was a piece of plumbing used to address the write model, which is the important part of our app.

But that’s a one-way relationship. The write model doesn’t care how you send it information. An “old school” service layer like this:

$this->serviceLayer->openAccount($email, $password);

is just as CQRS-y as a command bus like this:

$this->serviceLayer->handle(new OpenAccount($email, $password));

It has the same guarantees, the same ability to encapsulate. It’s just a minor difference in implementation. And no, it doesn’t return anything.

Conversely, if we’re using either version in a non-CQRS app, it doesn’t matter if we return something or not. The non-returning is a property of CQRS, not the Command Bus.

“You can only use a Command Bus if you’re doing CQRS.”

Nowadays I would argue the term “Command” in Command Bus has less to do with the CQRS write model and more to do with the classic Gang of Four definition of “Command”, i.e. breaking down operations into discrete units and passing them around. Command Bus has found a broader audience as a modern take on implementing a service layer. And that’s okay.

The advantages for a command bus hold true whether you’re using CQRS or not:

  • It’s easy to decorate the uniform interface (see the sketch after this list)
  • It’s different enough to form a clear barrier between UI and domain model
  • The command class is a nice way to capture ubiquitous language
  • The command DTOs can pair well with form libraries and serializers
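
On that first point, decorating might look something like this. A minimal sketch, assuming a simple CommandBus interface with a single handle() method and a PSR-3 logger; the names are illustrative:

interface CommandBus
{
    public function handle($command);
}

// Wraps any other bus and logs each command before passing it along
class LoggingCommandBus implements CommandBus
{
    private $innerBus;
    private $logger;

    public function __construct(CommandBus $innerBus, \Psr\Log\LoggerInterface $logger)
    {
        $this->innerBus = $innerBus;
        $this->logger = $logger;
    }

    public function handle($command)
    {
        $this->logger->info('Handling ' . get_class($command));
        $this->innerBus->handle($command);
    }
}

Because every command travels through the same handle() method, cross-cutting concerns like logging, transactions or validation only need to be written once.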

We use a command bus because it’s a useful implementation idiom, not because it’s a fixed part of CQRS. It’s not key architecture, it’s just a handy little twist.

“I should be using CQRS and Command Buses because that’s good practice”

No, they’re useful practices for some types of problems. CQRS is rarely a primary goal anyways. Most of the folks I see using it are actually aiming at Event Sourcing, which is another, orthogonal concept altogether.

“I shouldn’t be returning anything because CQRS is CQS which says don’t return anything”

CQS (like CQRS) can be a useful lens for some problems but not for all situations or practices. It’s a good guideline for warning you against side-effect-prone code. That said, CQS is aimed at the method level whereas CQRS targets higher-level architecture. I’d argue they’re spiritually similar but CQRS isn’t a direct superset of CQS.

“We don’t return things because commands are asynchronous”

At a technical level, this seems legit. We could be sending the command into a background queue so we don’t know when it’s actually going to be executed. Therefore, we can’t count on a return value for any command.

In earlier times, this was touted as a feature. Commands are so easy to serialize, why wouldn’t we put them in queues and message buses like we do for events? At least, this is how I first understood it.

As time passed though, there’s been a general realization this doesn’t hold up. If I can’t execute the bulk of a user’s request directly because of some resource constraint, then I should still record and acknowledge they started it.

If you’re doing Event Sourcing, using a Saga or Process Manager is a very natural fit here; the initial command doesn’t do anything beyond raising an event that a user requested we do something but at least we have a firm record of it. Then using the same systems we use to distribute events to projectors, we can trigger a background worker when resources are available. If there are any problems, the Saga/Process Manager can retry using the recorded event.

In other words: let events be asynchronous, make commands be synchronous.

“We don’t return things because it’s our write model.”

Yes, now we’re getting somewhere.

The CQRS aversion to return values has nothing to do with return values specifically. It’s about making certain the write model isn’t reporting on the current state of the system; that’s the read model’s job. That’s the main supposition of CQRS, after all: that treating problem solving and reporting separately leads to less friction and more flexible models.

Whether the write model gives back state by return value, reference, pointer or passing indirectly via another object doesn’t matter. The goal is to keep the write model from taking on reporting duties.

The lack of return values is a consequence, not a first order rule. We’re not silencing the write model, it just has nothing to say.

“Right, we never return anything ever. EVER.”

We’re not having the write model generate and pass back the changed state of the app.

That said, there are instances the write model responds. In PHP or similar languages, the write model typically raises exceptions to signal error states. While exceptions are semantically and idiomatically different than return values, they can be used in a similar fashion. If we were working in a language without exceptions (like Golang), then it would be very reasonable to return errors from the command bus.

None of this violates CQRS principles. Even orthodox folks will concede that returning an acknowledgement or some other “small” value is reasonable (albeit in rare, always unspecified edge cases). In practice, the most common use case like this is returning a complex id from the command handler. While I (personally) don’t think that’s a huge problem, it’s pretty easy to work around this by introducing a domain service that can be used to pre-generate the id in the controller.
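
As a sketch of that workaround (the generator service and method names here are hypothetical), the controller asks for the id up front and ships it along inside the command:

// In the controller: pre-generate the id, then dispatch.
// OpenAccount now carries the id as an extra field.
$accountId = $this->accountIdGenerator->nextIdentity();

$this->commandBus->handle(new OpenAccount($accountId, $email, $password));

// The controller already knows the id; no return value needed
$this->redirectTo('/accounts/' . $accountId);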

Finally, it’s important to remember that write models often produce state and make it available through other means. The most common reason for CQRS right now is Event Sourcing, which produces state as events. While events are recorded mainly for the purpose of future write model decisions, they are distributed through the system, so in the sense of availability, they’re as good as returned.

“Wait, so I can return events from my command handlers and still be doing CQRS?”

I don’t know. Maybe? But even if that was internally consistent, I wouldn’t recommend returning events all the way up to the controller dispatching the command.

Examining the event stream and deciding what to show the user can be a complex task. After all, it’s complex enough that we have the concept of a read model.

The last thing most apps need is more logic in controllers, so don’t burden controllers with the task of interpreting a partial event stream. There might not be enough information to make a good projection but there is enough complexity to make a bad one.

So, while we could get that data at the controller level, that doesn’t make it the best place to handle it. Nor does having the events there overcome the UX frustrations of eventual consistency. If that’s a problem, you don’t have scaling issues, and you have a taste for the unorthodox, then flout the eventual consistency of your read models however you like; just returning events isn’t the fix for that.

In the end, my advice is let the write model communicate to the read model horizontally and query the read model from your user interface vertically.

“Cool story bro but I’m using a Command Bus that doesn’t return anything so I know I’m doing CQRS right.”

Not necessarily. What if your read and your write model are still tightly coupled? That can happen regardless of any bus.

In the same way it’s possible to use a data mapper ORM and still produce tight coupling on a database, it’s perfectly possible to use a command bus that doesn’t return anything and not be doing CQRS.

Tooling never guarantees an outcome.

“So…should I be returning stuff or not?”

In a non-CQRS app? Do whatever the heck you want.

In a CQRS app? Probably not, depending on how you’ve wired it up. But remember, the point isn’t to cargo-cult about return values, it’s to understand the constraints and decisions that lead to that tactical choice.

Architectures have evolved certain conventions in the way they’re built; it’s the tradeoffs we’re trying to respect, not the specific form. At the same time, these conventions evolved from the experience of previous developers, and we can learn from them to avoid bumping into the same issues.

Thanks to Frank de Jonge, Jeroen vd Gulik and Shawn McCool for proofreading and feedback.

tl;dr When creating value objects representing time, I recommend choosing how fine-grained the time should be with your domain experts and rounding it off to that precision in the value object.

When modeling important numbers, it’s considered good form to specify the precision. Whether it’s money, size or weight, you’ll typically round off to a given decimal point. Even if it’s only for user display, rounding off makes the data more predictable for manipulation and storage.

Unfortunately, we don’t often do this when handling time and it bites us in the rear. Consider the following code:

$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');

// let's assume today is ALSO 2017-06-21
$now = new DateTimeImmutable('now');

if ($now > $estimatedDeliveryDate) {
    echo 'Package is late!';
} else {
    echo 'Package is on the way.';
}

Since it’s June 21 in the code sample, this code should print “Package is on the way.” After all, the day isn’t over yet, it might just be coming later in the afternoon.

Except the code doesn’t do that. Because we didn’t specify the time component, PHP helpfully zero pads $estimatedDeliveryDate to 2017-06-21 00:00:00. On the other hand, $now is calculated for…now. “Now” includes the current time (which probably isn’t midnight), so you’ll get 2017-06-21 15:33:34 which is indeed later than 2017-06-21 00:00:00.
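
You can see the padding for yourself:

$date = new DateTimeImmutable('2017-06-21');

// Prints "2017-06-21 00:00:00": the time was silently filled in
echo $date->format('Y-m-d H:i:s');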

Solution 1

“Oh, this is a quick fix,” folks might say, and update it to the following.

$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');
$estimatedDeliveryDate = $estimatedDeliveryDate->setTime(23, 59);

Cool, we changed the time to run right up to midnight. Except the time is padded to 23:59:00, so if you check during the last 59 seconds of the day, you’ll have the same problem.

“Grrr, okay,” folks might say.

$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');
$estimatedDeliveryDate = $estimatedDeliveryDate->setTime(23, 59, 59);

Cool, now it’s fixed.

…Unless you’re on PHP 7.1, which adds microseconds to DateTime objects. So now the bug only occurs on the last second of the day. I may be biased after working on too many high-traffic systems but sooner or later a user or an automated process will hit that and complain. Good luck tracking down THAT bug. :-/

Okay, let’s add microseconds.

$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');
$estimatedDeliveryDate = $estimatedDeliveryDate->modify('23:59:59.999999');

And this works.

Until we get nanoseconds.

In PHP 7.2.

Okay, okay, we CAN reduce the margin of error further and further to the point that errors become unrealistic. Still, at this point it should be clear this approach is flawed: we’re chasing an infinitely divisible value closer and closer to a point we can never reach. Let’s try a different approach.

Solution 2

Instead of calculating the last moment before our boundary, let’s check against the boundary instead.

$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');

// Start calculating when it's late instead of the last moment it's running on time
$startOfWhenPackageIsLate = $estimatedDeliveryDate->modify('+1 day');

$now = new DateTimeImmutable('now');

// We've changed the > operator to >=
if ($now >= $startOfWhenPackageIsLate) {
    echo 'Package is late!';
} else {
    echo 'Package is on the way.';
}

So this version works and it’s always accurate throughout the whole day. Unfortunately, it’s also become more complex. If you don’t encapsulate this logic within a value object or similar, it’ll get missed somewhere in your app.

Even if you do encapsulate it, we’ve made this one type of operation (>=) logical and consistent but it’s not a consistent fix for all operations. If we wanted to support equality checks, for example, we’d have to do another, different type of special data juggling to make that operation work correctly. Meh.

Finally (and this might just be me) this solution has the misleading smell of a potentially missed domain concept. “Is there a LatePeriodRange? A DeliveryDeadline?” you might say. “The package enters into a late period, then….something happens? The domain expert never mentioned a deadline, but it seems to be there. Is that different than the EstimatedDeliveryDate? Where does it go?” It doesn’t go. It doesn’t go anywhere. It’s just a weird quirk of the implementation that’s now stuck in your head.

So, this is a better solution in that it consistently yields a correct answer…but it’s not a great solution. Let’s see if we can do better.

Solution 3

So, all we want to do is compare two days. Now, if we picture a DateTime object as a set of numbers (year, month, day, hour, minute, second, etc…), everything up to the day part is working fine. All of the problems we’ve had are due to the extra values after that: hour, minute, second, etc. We can argue about the annoying and insidious ways those values keep leaking in there, but the fact remains that it’s the time component that’s wrecking our checks.

If the day component is all that’s important to us, why do we put up with these extra values? Unless it rolls over into the next day, a few extra hours or minutes won’t change the outcome of the business rules.

So, let’s just throw the extra cruft away.

// Simplify the dates down to just the day, discarding the rest
$estimatedDeliveryDate = day(new DateTimeImmutable('2017-06-21'));
$now = day(new DateTimeImmutable('now'));

// Now the comparison is simple
if ($now > $estimatedDeliveryDate) {
    echo 'Package is late!';
} else {
    echo 'Package is on the way.';
}

// Clunky but effective way to discard extra precision. PHP
// will zero pad the remaining values such as milli/nanosecond.
// In the case of dates, we can just use the setTime function but for
// other precisions, you'll have to use other logic to discard the extra
// data beyond what you need.
function day(DateTimeImmutable $date) {
    return $date->setTime(0, 0, 0);
}

This gives us the simpler comparison/calculation we saw in solution 1, with the accuracy we had in solution 2. It’s just…the ugliest version of the code yet, plus it’s super easy to forget to call day() in your code.

However, the code IS easy to abstract. More importantly though, it’s becoming clear that when we’re talking about estimated delivery dates, we’re ALWAYS talking about a day, never about a time. Both of these things make this a good candidate for pushing this into a type.

Encapsulation At Last

In other words, let’s make this a value object.

$estimatedDeliveryDate = EstimatedDeliveryDate::fromString('2017-06-21');
$today = EstimatedDeliveryDate::today();

if ($estimatedDeliveryDate->wasBefore($today)) {
    echo 'Package is late!';
} else {
    echo 'Package is on the way.';
}

Look how nice that reads. The value object itself is nice and boring:

class EstimatedDeliveryDate
{
    private $day;

    private function __construct(DateTimeImmutable $date)
    {
        $this->day = $date->setTime(0, 0, 0);
    }

    public static function fromString(string $date): self
    {
        // Possibly verify YYYY-MM-DD format, etc
        return new static(new DateTimeImmutable($date));
    }

    public static function today(): self
    {
        return new static(new DateTimeImmutable('now'));
    }

    public function wasBefore(EstimatedDeliveryDate $otherDate): bool
    {
        return $this->day < $otherDate->day;
    }
}

Because we’ve now made this a class, we’re automatically enforcing a lot of helpful rules:

  • You can only compare an EstimatedDeliveryDate to another EstimatedDeliveryDate, so the precision always lines up.
  • The correct precision handling lives in a single internal place; the consuming code never needs to consider precision at all.
  • It’s easy to test.
  • You’ve got a single place to centralize your timezone handling (not discussed here but super important).

One quick pro-tip: I’ve used a today() method here to show how you can have multiple constructors. In practice, I’d recommend creating a system clock and getting your “now” instances from that; it’ll make your unit tests much easier to write. The “real” version would probably look like:

$today = EstimatedDeliveryDate::fromDateTime($this->clock->now());
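
A minimal version of such a clock might look like this (a sketch; the interface and class names are my own):

interface Clock
{
    public function now(): DateTimeImmutable;
}

class SystemClock implements Clock
{
    public function now(): DateTimeImmutable
    {
        return new DateTimeImmutable('now');
    }
}

// In tests, swap in a fixed clock so "now" is deterministic
class FixedClock implements Clock
{
    private $now;

    public function __construct(DateTimeImmutable $now)
    {
        $this->now = $now;
    }

    public function now(): DateTimeImmutable
    {
        return $this->now;
    }
}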

Precision Through Imprecision

The important takeaway here isn’t “value objects, yay; inline juggling, boo!” It’s that we were able to remove several classes of errors by reducing the precision of the DateTime we were handling. If we hadn’t done that, the value object would still be handling all of these edge cases and probably failing at some of them too.

Reducing the quality of data to get a correct answer might seem counter-intuitive but it’s actually a more realistic view of the system we’re trying to model. Our computers might run in picoseconds but our business (probably) doesn’t. Plus, the computer is probably lying anyways.

As devs, it might feel we’re being more flexible and future-proof by keeping all possible information. After all, who are you to decide what information to throw away? Yet, the truth is that while information can potentially be worth money in the future, it definitely costs money to keep it until that future. It’s not just the cost of a bigger hard drive either, it’s the cost of complexity, of people, of time, and in the case of bugs, reputation. Sometimes working with data in its most complex form will turn out to be worth the cost but sometimes it isn’t, so just blindly saving everything you can because you can isn’t always a winning game.

To be clear: I’m not recommending you just randomly remove available time information.

What I am recommending: Explicitly choose a precision for your time points, together with your domain experts. If you’re getting more precision than you expect, it can cause bugs and additional complexity. If you’re getting less precision than you expect, it can cause bugs and failed business rules. The important thing is that we define the expected and necessary level of precision.

Further, choose the precision separately for each use case. Rounding will usually be in the value object, not at the system clock level. As we’ll talk about later, some places still need nanosecond precision but others might only need a year. Getting the precision right makes the language clearer.

This Crap Is Everywhere

It’s worth pointing out that we’ve only talked about a specific type of bug here: excess precision throwing off greater than/less than checks. But this advice applies to a much wider set of errors. I won’t go into all of them, though I do want to point out a personal favorite, “leftover” precision.

// Let's assume today is June 21st, so this equals June 28
$oneWeekFromNow = new DateTimeImmutable('+7 days');
// Also June 28 but set explicitly or loaded from DB
$explicitDate = new DateTimeImmutable('2017-06-28');

// Comparing based on state, are these the same date?
var_dump($oneWeekFromNow == $explicitDate);

No, they’re not the same date because $oneWeekFromNow also has the current time whereas $explicitDate is set to 00:00:00. Delightful.

The examples above talked about precision primarily in time vs date but modeling precision applies to any unit of time. Imagine how many scheduling apps only need times to the minute and how many financial apps need support for quarters of the year.
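
The same rounding trick from earlier works for those other units too. For instance, a scheduling app might cap its times to the minute (a sketch along the lines of the day() helper above):

// Discard seconds and microseconds, keeping minute precision
function minute(DateTimeImmutable $date) {
    return $date->setTime(
        (int) $date->format('H'),
        (int) $date->format('i'),
        0
    );
}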

Once you start looking at it, you realize how many time errors can be explained by undefined precision. They might look like bad range checks or poorly designed bounds but when you dive in, you start to see a pattern emerge.

My experience is that this class of errors is often missed in testing. System clock objects aren’t a common sight (yet), so testing code that uses the current time is a bit tricky. And when there are tests, the fixtures often don’t pad the date out completely, so it’s easy to miss the error windows.

Nor is this a problem specific to PHP’s DateTime library. When I tweeted about this last week, Anthony Ferrara mentioned how Ruby’s time precision varies depending on the operating system yet the database library had a fixed level. That sounds fun to debug.

Time is just hard to work with. Time math doubly so.

Choosing A Level Of Precision

So we can say that choosing a level of precision for your time objects is super important but how do we select the right one? As a rule of thumb, I would say be open-ended with timepoints for your technical needs but set an explicit level of precision for all of your domain objects.

For your logs, your event sourcing data, your metrics: go as fine-grained as you need/want/can. These are primarily aimed at technical personnel who are more familiar with fine-grained dates, and the extra precision is often necessary for debugging. You’ll likely need to get very fine-grained for system or sequenced data. That’s okay, it’s what the constraints demand.

For business concerns, talk to your domain experts about how fine-grained that information needs to be. They can help you balance what they’re using now vs what they might need in the future. Business rules are often an area where you’re playing with borrowed knowledge, so shedding complexity can be a smart move. Remember, you’re not after an accurate-to-real-life model, you’re after a useful one.

Within the code, this might occasionally lead to varying levels of precision, even within the same class. For example, consider this class in an event sourced application.

class OrderShipped
{
    // Time point from the order, capped to day precision.
    private $estimatedDeliveryDate;

    // Time point from the order, capped to second precision.
    private $shippedAt;

    // Event sourcing metadata, capped to microsecond precision.
    private $eventRecordedAt;
}

If the varying levels of precision seem strange, remember that these time points have very different use cases. The $shippedAt and $eventRecordedAt might even point to the same “time”, but they belong to very different sections of the code.

You might also find the business is working with units (and therefore precisions) of time you don’t expect: quarters, financial calendars, shifts, morning/afternoon/evening parts. There are a lot of interesting conversations to be had in exploring these extra units.

Changing Requirements

Another good part of having these conversations: If the business rules change in the future and you turn out to need more precision than originally recorded, then it was a joint decision and you talk about how to fix the legacy cases. Ideally, we can shift it from being a technical problem to a business one.

In most cases, this can be simple: “We originally only needed the date they signed up but now we need the time so we can see if it falls before business closing times.” Maybe this affects a small number of accounts and you can set them to the beginning of the next business day. Or just zero pad the dates. Or maybe there’s an extra business rule where signing up after 18:00 sets the subscription end date to tomorrow+1 year instead of today+1 year. Talk to them about it. Folks are more proactive and understanding with changes if you include them in the discussion from the beginning (if only to mitigate blame).

In more complex scenarios, you can look at reconstructing data based on other data in your system. Maybe we can derive it from event sourcing times or user registration dates. In some cases, it simply isn’t possible and you’ll have to construct new business rules about what to do with migrating legacy cases. But the truth is, you can’t plan for everything and you probably won’t know what will change. That’s life.

So those are my thoughts about time precision: use what you need and no more.

Appendix: An Ideal Solution

Going forward, I feel there are real practical benefits to choosing fixed precisions and modeling them as custom types. My ideal PHP time library would probably be something that provides units of time as abstract classes that I extend into my value objects and then build on.

class ExpectedDeliveryDate extends PointPreciseToDate
{
}
class OrderShippedAt extends PointPreciseToMinute
{
}
class EventGenerationTime extends PointPreciseToMicrosecond
{
}

By pushing the precision to the class, we force a decision about precision. We can limit methods like “setTime()” to the precisions they actually apply to (not on Dates!) and we can round DateInterval to whatever makes sense for the type. Most of the utility methods could have protected visibility and my value objects could expose only those that make sense for my domain. Also, we’d be encouraging folks to create value objects. So. Many. Value objects. Yessss.
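
To be clear, this library doesn’t exist; a rough, entirely hypothetical sketch of one of those base classes might be:

// Hypothetical base class: every subclass automatically
// discards everything below day precision.
abstract class PointPreciseToDate
{
    protected $moment;

    protected function __construct(DateTimeImmutable $moment)
    {
        $this->moment = $moment->setTime(0, 0, 0);
    }

    protected function isBefore(self $other): bool
    {
        return $this->moment < $other->moment;
    }
}

A concrete ExpectedDeliveryDate could then expose only the comparisons that make sense in the domain, say a public wasBefore() that delegates to isBefore().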

Bonus points if the library makes it easy to define custom units of time.

Actually building it though? Ain’t nobody got time for that.

Many thanks to Frank de Jonge, Jeroen Heijmans, Anna Baas and Matthias Noback for feedback on earlier drafts of this article!

Over the last couple of years, I’ve started putting my Exception messages inside static methods on custom exception classes. This is hardly a new trick; Doctrine’s been doing it for the better part of a decade. Still, many folks are surprised by it, so this article explains the how and why.

How does it work?

Let’s say you’re writing a large CSV import and you stumble across an invalid row, perhaps it’s missing a column. Your code might look like this:

if (!$row->hasColumns($expectedColumns)) {
    throw new Exception("Row is missing one or more columns");
}

This works in terms of stopping the program but it’s not very flexible for the developer. We can improve this by creating a custom exception class.

class InvalidRowException extends \Exception
{
}

Now we throw our custom Exception instead:

if (!$row->hasColumns($expectedColumns)) {
    throw new InvalidRowException("Row is missing one or more columns");
}

This might look like boilerplate but it allows higher level code to recognize which error was raised and handle it accordingly. For example, we might stop the entire program on a NoDatabaseConnectionException but only log an InvalidRowException before continuing.
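
In the import loop, that might look like this (a sketch; NoDatabaseConnectionException is just an illustrative name):

foreach ($rows as $row) {
    try {
        $this->importRow($row);
    } catch (InvalidRowException $e) {
        // A bad row shouldn't stop the whole import: log it and move on
        $this->logger->warning($e->getMessage());
    }
    // Anything else, say a NoDatabaseConnectionException,
    // bubbles up and stops the entire import.
}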

Still, the error message isn’t very helpful from a debugging perspective. Which row failed? It would be better if we always included the row number in our error message.

if (!$row->hasColumns($expectedColumns)) {
    throw new InvalidRowException(
        "Row #" . $row->getIndex() . " is missing one or more columns"
    );
}

That’s better in the log but now the formatting on this one little message is getting a bit noisy and distracting. There’s no upper bound on this complexity: as the log message gets complex, the code will get uglier. Not to mention, there are multiple reasons we might throw an InvalidRowException but we’d need to format them all to include the row number. Booorrrriiing.

Moving the Formatting

We can remove the noise by pushing the formatting into the custom Exception class. The best way to do this is with a static factory:

class InvalidRowException extends \Exception
{
    public static function incorrectColumns(Row $row) {
        return new static("Row #" . $row->getIndex() . " is missing one or more columns");
    }
}

And now we can clean up the importing code without losing readability:

if (!$row->hasColumns($expectedColumns)) {
    throw InvalidRowException::incorrectColumns($row);
}

The only extra code is the function block surrounding our message. That function block isn’t just noise though: it allows us to typehint and document what needs to be passed to generate a nicely formatted message. And if those requirements ever change, we can use static analysis tools to refactor those specific use cases.

This also frees us from the mental constraints of the space available. We’re not bound to writing code that fits into a single if clause, we have a whole method to do whatever makes sense.

Maybe common errors warrant more complex output, like including both the expected and the received list of columns.

public static function incorrectColumns(Row $row, $expectedColumns)
{
    $expectedList = implode(', ', $expectedColumns);
    $receivedList = implode(', ', $row->getColumns());

    return new static(
        "Row #" . $row->getIndex() . " did not contain the expected columns. " .
        " Expected columns: " . $expectedList .
        " Received columns: " . $receivedList
    );
}

The code here got significantly richer but the consuming code only needed to pass one extra parameter.

if (!$row->hasColumns($expectedColumns)) {
    throw InvalidRowException::incorrectColumns($row, $expectedColumns);
}

That’s easy to consume, especially when throwing it in multiple locations. It’s the type of error message everyone wants to read but rarely takes the time to write. If it’s an important enough part of your Developer Experience, you can even unit test that the exception message contains the missing column names. Bonus points if you use array_diff/array_intersect to show the actual missing or unexpected columns (see the sketch below).
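
For instance, the factory itself could compute the differences (a sketch, building on the version above):

public static function incorrectColumns(Row $row, $expectedColumns)
{
    $missing = array_diff($expectedColumns, $row->getColumns());
    $unexpected = array_diff($row->getColumns(), $expectedColumns);

    return new static(
        "Row #" . $row->getIndex() . " did not contain the expected columns." .
        " Missing columns: " . implode(', ', $missing) .
        " Unexpected columns: " . implode(', ', $unexpected)
    );
}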

Again, that might seem like overkill and I wouldn’t recommend gold plating every error scenario to this extent. Still, if this is code you really want to own and you can anticipate the common fix for these errors, spending 1 extra minute to write a solid error message will pay big in debugging dividends.

Multiple Use Cases

So far we’ve created a method for one specific use case, an incorrect number of columns.

Maybe we have other issues with our CSV file, like the existence of blank lines. Let’s add a second method to our exception:

class InvalidRowException extends \Exception
{
    // ... incorrectColumns ...

    public static function blankLine($rowNumber)
    {
        return new static("Row #$rowNumber was a blank line.");
    }
}

Again, a bit of boilerplate but we get extra space to write more detailed messages. If the same issue keeps occurring, perhaps it’s worth adding some extra details, links or issue numbers (or you know, fixing it more permanently).

public static function blankLine($rowNumber)
{
    return new static(
        "Row #$rowNumber was a blank line. This can happen if the user " .
        "exported the source file with incorrect line endings."
    );
}

Locking It Down

When you’re creating “named constructors” for domain objects, you use a similar technique but also declare the original constructor as private. I don’t do this with exceptions though.

First off, we can’t in PHP. Exceptions all extend from a parent Exception class and can’t change their parent’s __construct access level from public to private.

More importantly, there’s little benefit for Exceptions. With domain objects, we do this to capture the ubiquitous language and prevent users from instantiating the object in an invalid state. But with an exception, there’s very little state to keep valid. Furthermore, we’re inherently dealing with exceptional circumstances: we can’t foresee every reason a user might throw an exception.

So, the best you can do is create a base factory that you recommend other folks use when creating their exceptions. This can typehint for useful things commonly included in the message:

class InvalidRowException extends \Exception
{
    // One off error messages can use this method...
    public static function create(Row $row, $reason)
    {
        return new static("Row " . $row->getIndex() . " is invalid because: " . $reason);
    }

    // ...and predefined methods can use it internally.
    public static function blankLine(Row $row)
    {
        return static::create($row, "Is a blank line.");
    }
}

Which might be useful but is probably pretty far out there. I haven’t seen a convenient way to enforce it though.

The Big Picture

There’s one final benefit I’d like to touch on.

Normally, when you write your exception messages inline, the various error cases might be split across different files. This makes it harder to see the reasons you’re raising them, which is a shame since exception types are an important part of your API.

When you co-locate the messages inside the exception, however, you gain an overview of the error cases. If these cases multiply too fast or diverge significantly, it’s a strong smell to split the exception class and create a better API.

// One of these isn’t like the others and should probably be a different Exception class
class InvalidRowException extends \Exception
{
    public static function missingColumns(Row $row, $expectedColumns);
    public static function blankLine(Row $row);
    public static function invalidFileHandle($fh);
}

Sometimes we underestimate the little things that shape our code. We’d like to pretend that we’re not motivated by getting our error messages neatly on one line or that we regularly do a “Find Usages” to see our custom exception messages, but the truth is: these little details matter. Creating good environments at a high level starts with encouraging them at the lowest levels. Pay attention to what your habits encourage you to do.

Recently, a few folks asked about a trait in a new project I wrote. Right around the same time, Rafael Dohms showed me his new talk about complex cognitive processes we don’t notice. Because my brain is a big mushy sack, the two blended together. The result was this post, which tries to capture how I use traits but also how I decide to use them in the first place.

Leverage vs Abstraction

The first thing you should do is go read this blog post: “Abstraction or Leverage” from Michael Nygard. It’s an excellent article.

If you’re short on time, the relevant part is that chunks of code (functions, classes, methods, etc) can be split out for either abstraction or leverage. The difference is:

  • Abstraction gathers similar code behind a higher level concept that’s more concise for other code to work with.
  • Leverage gathers similar code together so you only have one place to change it.

A common abstraction would be a Repository: you don’t know how an object is being stored or where but you don’t care. The details are behind the concept of a Repository.

Common leverage would be something like your framework’s Controller base class. It doesn’t hide much, it just adds some nifty shortcuts that are easier to work with.

As the original blog post points out, both abstraction and leverage are good. Abstraction is slightly better because it always gives you leverage but leverage doesn’t give you abstraction. However, I would add that a good abstraction is more expensive to produce and isn’t possible at every level. So, it’s a trade-off.

What’s this have to do with traits?

Some language features are better than others at producing either Leverage or Abstraction. Interfaces, for example, are great at helping us build and enforce abstractions.

Inheritance, on the other hand, is great at providing leverage. It lets us override select parts of the parent code without having to copy it or extract every method to a reusable (but not necessarily abstracted) class. So, to answer the original question, when do I use traits?

I use traits when I want leverage, not abstraction.

Sometimes.

Sometimes?

Benjamin Eberlei makes a good argument that traits have basically the same problems as static access. You can’t exchange or override them and they’re lousy for testing.

Still, static methods are useful. If you’ve got a single function with no state and you wouldn’t want to exchange it for another implementation, there’s nothing wrong with a static method. Think about named constructors (you rarely want to mock your domain models) or array/math operations (well-defined input/output, stateless, deterministic). It makes you wonder if static state, rather than static methods, is the real evil.

Traits have roughly the same constraints, plus they can only be used when mixed into a class. They’re more macro than object.

This does give traits an extra feature though: they can read and write the internal state of the class they’re mixed into. This makes them more suitable for some behavior than a static method would be.

An example I often use is generating domain events on an entity:

trait GeneratesDomainEvents
{
    private $events = [];

    protected function raise(DomainEvent $event)
    {
        $this->events[] = $event;
    }

    public function releaseEvents()
    {
        $pendingEvents = $this->events;
        $this->events = [];
        return $pendingEvents;
    }
}

While we could refactor this to an abstraction, it’s still a nice example of how a trait can work with local object state in a way static methods can’t. We don’t want to expose the events array blindly or place it outside the object. We might not want to force another abstraction inside our model but we certainly don’t want to copy and paste this boilerplate everywhere. Plain as they are, traits help us sidestep these issues.

Other practical examples might be custom logging functions that dump several properties at once or common iteration/searching logic. Admittedly, we could solve these with a parent class but we’ll talk about that in a moment.

So traits are a solid fit here but that doesn’t make static methods useless. In fact, I prefer static methods for behavior that doesn’t need the object’s internal state since it’s always safer to not provide it. Static methods are also more sharply defined and don’t require a mock class to be tested.

Assertions are a good example of where I prefer static methods, despite seeing them commonly placed in traits. Still, I find Assertion::positiveNumber($int) gives me the aforementioned benefits and it’s easier to understand what it is (or isn’t) doing to the calling class.
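
Such an assertion is only a few lines and needs no mock to test (a sketch, loosely in the style of libraries like beberlei/assert):

class Assertion
{
    public static function positiveNumber($value)
    {
        if (!is_numeric($value) || $value <= 0) {
            throw new \InvalidArgumentException(
                'Expected a positive number, got ' . var_export($value, true)
            );
        }
    }
}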

If you do have an assertion you’re tempted to turn into a trait, I’d treat it as a code smell. Perhaps it needs several parameters you’re tired of giving it. Perhaps validating $this->foo relies on the value of $this->bar. In either of these cases, refactoring to a value object can be a better alternative. Remember, it’s best if leverage eventually gives way to abstraction.

So, to restate: I use traits when I want leverage that needs access to an object’s internal state.

Parent Classes

Everything we’ve seen is also possible through inheritance. An EventGeneratingEntity would arguably be even better since the events array would be truly private. However, traits offer the possibility of multiple inheritance instead of a single base class. Aside from their feature set, is there a good heuristic for choosing?

All things being equal, I like to fallback on something akin to the “Is-A vs Has-A” rule. It’s not an exact fit because traits are not composition but it’s a reasonable guideline.

In other words, use parent classes for functionality that’s intrinsic to what the object is. A parent class is good at communicating this to other developers: “Employee is a Person”. Just because we’re going for leverage doesn’t mean the code shouldn’t be communicative.

For other, non-core functionality on an object (fancy logging, event generation, boilerplate code, etc.), a trait is an appropriate tool. It doesn’t define the nature of the class; it’s a supporting feature or, better yet, an implementation detail. Whatever you get from a trait is just in service of the main object’s goal: traits can’t even pass a type check, that’s how unimportant they are.

So, in the case of the event generation, I prefer the trait to a base EventGeneratingEntity because Event Generation is a supporting feature.

Interfaces

I rarely (if ever) extend a class or create a trait without also creating an interface.

If you follow this rule closely, you’ll find that traits can complement the Interface Segregation Principle well. It’s easy to define an interface for a secondary concern and then ship a trait with a simple default implementation.

This allows the concrete class to implement its own version of the interface or stick with the trait’s default version for unimportant cases. When your choices are boilerplate or forcing a poor abstraction, traits can be a powerful ally.
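
For example, pairing the earlier event generation trait with a small interface (a sketch):

interface RecordsDomainEvents
{
    public function releaseEvents();
}

// This class takes the trait's default implementation...
class Order implements RecordsDomainEvents
{
    use GeneratesDomainEvents;
}

// ...while this one is free to implement the interface by hand
class AuditedOrder implements RecordsDomainEvents
{
    public function releaseEvents()
    {
        // a custom implementation would go here
        return [];
    }
}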

Still, if you’ve only got one class implementing the interface in your code and you don’t expect anyone else to use that implementation, don’t bother with a trait. You’re probably not making your code any more maintainable or readable.

When I do not use traits

To be quite honest, I don’t use traits very often, perhaps once every few months. The heuristic I’ve outlined (leverage requiring access to internal state) is extremely niche. If you’re running into it very often, you probably need to step back and reexamine your style of programming. There’s a good chance you’ve got tons of objects waiting to be extracted.

There’s a few places I don’t like to use traits due to style preferences:

  • If the code you’re sharing is just a couple of getters and setters, I wouldn’t bother. IDEs can be your leverage here and adding a trait will only add mental overhead.
  • Don’t use traits for dependency injection. That’s less to do with traits themselves and more to do with setter injection, which I’m rather opposed to.
  • I don’t like traits for mixing in large public APIs or big pieces of functionality. Skip the leverage step and move directly to finding an abstraction.

Finally, remember that while traits do not offer abstraction and they are not composition, they can still have a place in your toolbox. They’re useful for providing leverage over small default implementations or duplicate code. Always be ready to refactor them to a better abstraction once the code smells pile up.

I started this month by speaking or hosting at 4 events in 2 countries over 5 days (DrupalCon Amsterdam, DomCode, ZgPHP, WebCamp). While hectic, it was a great way to see a cross section of the community back-to-back. So, I’d like to talk about the events in turn but also some meta-topics about conferences themselves.

DrupalCon

This was one of the biggest conferences I’ve been to, and definitely the slickest, most well produced. Huge stages, a giant conference center, a “pre-note” with zebra pattern suits and super heroes. The conference had a good sense of humor about itself but the ambition was evident. Drupal is Big(tm) and they know it.

From a tech and community perspective, it’s an interesting position. On the one hand, Drupal began life as an open-source project but now has a massive commercial side (check out the sponsors hall). The keynote discussed Drupal becoming a Public Good, citing Paul Samuelson’s work and outlining a 3-step lifecycle in this process: Volunteers → Business → Government. The fact that the keynote began with a sponsored doppelganger act was a tacit admission of where Drupal currently stands in this process.

This isn’t necessarily bad. I have little knowledge of Drupal but my impression is the commercial interest helps drive an increasing “professionalization” of the developers and tooling. Rolling out best practices and applying general engineering principles to Drupal is a great step forward. Make no mistake, they are tackling genuinely hard problems.

Perhaps for this reason, Drupal is also trying to reconnect to the PHP community at large. This is also a hugely positive (and arguably necessary) move to bring in some of those professional dev practices. At the same time, the feedback I received for my talk on the general PHP track was different enough from other conferences to remind that it is a separate audience, at least for now.

Still, the question on everyone’s mind was “How do we keep hobbyist developers?” I interpreted this as: “Has the re-engineering of Drupal made it so complex that only experienced pro developers can reasonably work with it and was it worth the additional complexity?” To that: I don’t know. I don’t know the code. A well engineered codebase should be less complex but using it may require you to sideload a lot of knowledge (i.e. indirect complexity). That’s possibly a culture or documentation issue, not a technical one.

Some of the articles in the Drupal magazine I found in my swag bag questioned if this was even an issue. Perhaps the natural evolution of Drupal is a focus on high-end, high-traffic CMS sites. Perhaps not.

Either way, it was a great conference. Drupal is a juggernaut and here to stay. I don’t know where the road will take them but they’re trying to head in the right direction, even when it’s uphill. Much respect.

ZgPHP

Two days later, and I was presenting at ZgPHP in Zagreb, Croatia (my first trip to the region). The difference was intense.

ZgPHP is a local user group in Zagreb and this was their second annual conference but their first in English. As such, it’s a much smaller event: Total attendees were around 80, though the number waxed and waned throughout the day.

One of the great things about this conference though is the price: it’s completely free to attend. Like DrupalCon, ZgPHP is worried about the hobbyists, the lone cowboy coders and the tiny dev shops. Their response is to lower the barrier to obtaining pro-level knowledge as much as possible, and free tickets are a great step toward that in a country where the unemployment rate is a crushing 20%. They can’t reach everyone with this tactic but it’s certainly not for lack of trying.

To be fair, that comes with certain tradeoffs: The venue was donated by the local Chamber of Commerce. There was no catering staff and lunch was local pastries with 2 liter bottles of Coke. My badge was an oversized sticker with my name and twitter handle written on it in marker. (When I returned to the hotel, I peeled the sticker off carefully and cut up toilet paper with shaving scissors then stuck to the back so I could bring it home for my badge collection).

So, yes, it’s a small conference but by design. It was well run, well organized and a really nice community.

Croatia isn’t a huge country (4.23 million people, slightly fewer than Kentucky). Nonetheless, there are multiple user groups and even another annual conference. That’s some great hustle. By and large, the conversations I had were the same as anywhere else: HHVM and REST APIs, unit testing and decoupling. I was the only speaker not from the region, which made me nervous, but folks laughed at my jokes and left nice joind.in reviews. It was a good crowd and I had fun.

WebCamp

Two days later and I was at another conference in Zagreb: WebCamp. Rather than focus on any specific language or tool, WebCamp is about, well, the web.

It’s also massive. And free. Like ZgPHP, WebCamp is totally free to attend. All you had to do was register for one of the 1100+ free tickets on their website, which lasted less than a day. Of course, as with any free ticket, there were a lot of no-shows but around 800 people attended on the first day. True, that’s smaller than DrupalCon but 800 people is significantly larger than almost any PHP conference out there.

Still, a conference of this scale has bills, so attendees could buy a supporter pack (~40 euros?), which came with a t-shirt and some other goodies. Lots of sponsors chipped in and they received perks like evening stands and an hour-long recruitment session on the first day.

WebCamp was a joint effort by several local user groups: PHP, Ruby, Python, JS, you name it. In fact, WebCamp and ZgPHP were part of a whole initiative called “Week of WebCamp” where there were several smaller conferences and workshops throughout the city of Zagreb.

It’s an ambitious undertaking which culminated in WebCamp itself. I don’t know if any of the individual groups could’ve pulled this off alone but together, they made it work really well. I saw talks on everything from Elixir to Gearman to data aggregation.

For my part, I was asked to do the opening keynote, which was an enormous honor but the size made it intimidating. Still, the crowd laughed and had fun and even I was reasonably pleased with the result. They also asked me to host one of the tracks, which meant a crash course in pronouncing Croatian names (final score: D+).

I’m a huge fan of multi-technology events and also the free-to-attend model but they’re hard to get right. Watching someone pull off a 2-day, dual-track, 800 person conference in those formats plus the additional events throughout the week was amazing. Yes, nothing is perfect but it was absolutely a success.

I’m not an “Everything is awesome”, “Community Works” cheerleader sort of fellow. Still, WebCamp in action was inspiring and leaves me wondering if it’s a taste of things to come: a free-to-attend, multi-community event supported by donations and sponsors.

Conclusion

I had a great time at all of the events and would highly recommend any of them (I skipped over DomCode as I’m one of the founders but would highly recommend it as well)!

Still, I left with a lot of questions. Everywhere I went, people mentioned hobbyist or junior developers. How do we reach them? How do we attract them? What do we teach them? This is especially true in the PHP community where there’s a huge variation in knowledge level.

It also left me wondering about the role of those teaching. Are free-to-attend events more open? Doesn’t that leave us more dependent on our sponsors? Should our conferences be more cost-effective affairs or can we teach more effectively when we go all out? Is commercial backing driving knowledge or impeding it? How can we encourage the right attitude in our partners? Who are we really targeting with technical conferences? Is turning teaching into a commercial interest an acceptable path for an open source community? What is the role of speakers in this?

Each of these events brings a different model to the table. Is there a right one? Well, the fact that all of them are successful in different ways says probably not. I loved the variety of speakers plus the pomp and circumstance of DrupalCon but you probably couldn’t pull that off on a ZgPHP budget. At the same time, I felt like I could take more specific questions and help people more deeply at ZgPHP precisely because it was so direct. There are things you can do with one model that you can’t necessarily do with another. I suppose we need different events pushing different models to catch the widest possible subset of the community, though I wish we could do better.

One amazing thing was that all of these conferences were recording their talks to publish them on the web for free. My DrupalCon talk was online less than 2 hours after I gave it, which is bonkers. Perhaps the web-based model is the real future here, though as long as there’s a hallway track and a social to network at, there will be value in attending conferences. Which just leaves one question for an event to consider: who are we trying to bring in and how do we get them there?

Many thanks to Igor Wiedler for proofreading an earlier version of this article!