Persisting Value Objects in Doctrine

I’ve been using more and more Value Objects in my applications over the last year, primarily with Doctrine ORM. Value Objects are an extremely powerful technique and I’ve been impressed with how much they can clean up a codebase.

One of the main questions I’ve had when starting with Value Objects is how to persist them with Doctrine. This post attempts to create a reference for all the different persistence techniques I’ve seen so far. Keep in mind, I’ll be talking purely about persistence, not about how to use Value Objects as part of your modeling. I’ll also assume that you’re somewhat familiar with the concept of a Value Object and a fair hand with Doctrine. If not, I’d suggest reading up on both topics before going further.

Until recently, Doctrine had no true support for Value Objects, so almost all of the techniques you’ll see here are workarounds. Keep that in mind as you read.

Map It Yourself

Let’s say you wanted to store IP ranges in your database. We’ll start off simple, storing ranges as strings in CIDR notation like “192.168.1.1/24” (don’t worry if that doesn’t make sense, it’s just a shorthand for the block of addresses from 192.168.1.0 to 192.168.1.255). We’ll use a couple of Value Objects like this:

class IPAddress
{
    public static function createFromString($ipAsString);
    public function __toString();
}

class IPRange
{
    // Returns the high or low IP addresses of this range.
    public function getStartIP();
    public function getEndIP();

    // Different ways of creating this value object
    public static function createFromCidrNotation($cidr);
    public static function createFromIPAddresses(IPAddress $startIP, IPAddress $endIP);

    // Cast the range to a CIDR notation
    public function __toString();
}
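
If you’d like something runnable behind those signatures, here is a minimal sketch. Only the class and method names come from the interface above; everything else is an assumption: it handles IPv4 only, relies on 64-bit PHP for the integer math, and treats a CIDR string as the whole block from network address to broadcast address.

```php
<?php

// Hypothetical implementations: only the class and method names come from
// the article. IPv4 only, and 64-bit PHP is assumed for the integer math.
final class IPAddress
{
    private $ip;

    private function __construct($ip)
    {
        $this->ip = $ip;
    }

    public static function createFromString($ipAsString)
    {
        if (filter_var($ipAsString, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
            throw new InvalidArgumentException('Invalid IPv4 address: ' . $ipAsString);
        }

        return new self($ipAsString);
    }

    public function __toString()
    {
        return $this->ip;
    }
}

final class IPRange
{
    private $startIP;
    private $endIP;

    private function __construct(IPAddress $startIP, IPAddress $endIP)
    {
        $this->startIP = $startIP;
        $this->endIP = $endIP;
    }

    public static function createFromIPAddresses(IPAddress $startIP, IPAddress $endIP)
    {
        return new self($startIP, $endIP);
    }

    public static function createFromCidrNotation($cidr)
    {
        $parts = explode('/', $cidr);
        if (count($parts) !== 2 || !ctype_digit($parts[1]) || (int) $parts[1] > 32) {
            throw new InvalidArgumentException('Invalid CIDR notation: ' . $cidr);
        }

        $address = ip2long((string) IPAddress::createFromString($parts[0]));
        // Network mask with the first N bits set, e.g. /24 => 255.255.255.0
        $mask = (0xFFFFFFFF << (32 - (int) $parts[1])) & 0xFFFFFFFF;
        $start = $address & $mask;
        $end = $start | (~$mask & 0xFFFFFFFF);

        return new self(
            IPAddress::createFromString(long2ip($start)),
            IPAddress::createFromString(long2ip($end))
        );
    }

    public function getStartIP() { return $this->startIP; }
    public function getEndIP() { return $this->endIP; }

    public function __toString()
    {
        // Assumes the range lines up with a single CIDR block; a real
        // implementation would have to reject or split ranges that do not.
        $size = ip2long((string) $this->endIP) - ip2long((string) $this->startIP) + 1;
        for ($prefix = 32; $size > 1; $size >>= 1) {
            $prefix--;
        }

        return $this->startIP . '/' . $prefix;
    }
}

$range = IPRange::createFromCidrNotation('192.168.1.1/24');
echo $range->getStartIP(), ' - ', $range->getEndIP(), "\n"; // 192.168.1.0 - 192.168.1.255
echo $range, "\n"; // 192.168.1.0/24
```

Private constructors plus named factories keep both objects immutable and always valid, which is most of what makes them Value Objects in the first place.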

Now, let’s imagine we have a Server entity that’s responsible for storing an IPRange:

class Server
{
    public function setIPRange(IPRange $range);
    public function getIPRange();
}

So, we have an Entity (Server) that stores a Value Object (IPRange) that is made out of two other Value Objects (IPAddress), and all of this needs to be stored. This is more complicated than a standard Entity->Value Object relationship but it lets us explore more trade-offs.

The important thing is the interface on the entity itself: it only deals in Value Objects. What happens inside the entity is technically the entity’s business. We only really care that an IPRange is given and returned, not what form it takes inside the entity.

With that in mind, the simplest way to handle a value object in Doctrine is to just map it to and from the database format yourself, in the entity code.

/**
 * @Column(type="string")
 */
protected $ipRangeAsCidr;

public function setIPRange(IPRange $range) {
    $this->ipRangeAsCidr = (string) $range;
}

public function getIPRange() {
    return IPRange::createFromCidrNotation($this->ipRangeAsCidr);
}

The example above might seem a bit dumb but that’s all internal to the entity. To the rest of the world, the integration is flawless and that’s what’s really important. Yes, we are mixing concerns. Yes, that makes a single tear roll down my cheek. Nonetheless, on the Grand Scale Of Architectural Sins, this is probably one of the lesser ones.

However, before you grit your teeth and bear it, keep in mind that for 80% of your use cases, a better way already exists.

DBAL Types

For most Value Objects, a custom DBAL type is the best option today. If you’re not familiar with a DBAL Type, imagine the DateTime mapping. When I set a property as a DateTime in Doctrine, it appears in PHP as a DateTime object but it’s stored in the database in the famous ‘Y-m-d H:i:s’ format. Something has to do this conversion. That’s what the DBAL types are for.

These are small converter objects with a super simple interface:

convertToDatabaseValue($value, AbstractPlatform $platform)
convertToPHPValue($value, AbstractPlatform $platform)

where $value is the PHP or raw database value and $platform represents the type of database you’re working with (MySQL, Postgres, etc). The $platform is extremely useful if the column type might vary based on the database (for example, Postgres has a native IP Address type but we might fall back to a string or integer in MySQL).

If your value is going to be represented as a single column in the database, DBAL types are by far the best solution. They’re easy to build and Doctrine supports them as a core feature. Furthermore, you can override getSQLDeclaration() to add support for Doctrine migrations or custom declarations.

public function getSQLDeclaration(array $fieldDeclaration, AbstractPlatform $platform)
{
    // Built-in helper function for getting platform independent DDL
    return $platform->getVarcharTypeDeclarationSQL($fieldDeclaration);
}
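
Putting the three methods together, a complete type for the IPAddress object might look roughly like the sketch below. It is written against the DBAL 2.x API (newer major versions add type declarations and rename some platform helpers), and it assumes doctrine/dbal is installed through Composer. The class name IPAddressType and the tiny IPAddress stand-in are made up for illustration.

```php
<?php

// Composer autoloader; the path assumes this runs from the project root.
require 'vendor/autoload.php';

use Doctrine\DBAL\Platforms\AbstractPlatform;
use Doctrine\DBAL\Types\Type;

// Stand-in for the article's IPAddress so the sketch is complete.
final class IPAddress
{
    private $ip;

    public static function createFromString($ipAsString)
    {
        $address = new self();
        $address->ip = $ipAsString;

        return $address;
    }

    public function __toString()
    {
        return $this->ip;
    }
}

// Hypothetical custom type, targeting the DBAL 2.x method signatures.
class IPAddressType extends Type
{
    const NAME = 'ip_address';

    public function getSQLDeclaration(array $fieldDeclaration, AbstractPlatform $platform)
    {
        // Store the address as a plain string column on every platform.
        return $platform->getVarcharTypeDeclarationSQL($fieldDeclaration);
    }

    public function convertToPHPValue($value, AbstractPlatform $platform)
    {
        // Nullable columns hand us null; pass it through untouched.
        return $value === null ? null : IPAddress::createFromString($value);
    }

    public function convertToDatabaseValue($value, AbstractPlatform $platform)
    {
        return $value === null ? null : (string) $value;
    }

    public function getName()
    {
        return self::NAME;
    }
}

// Register once during bootstrap, before any mapping refers to the type.
Type::addType(IPAddressType::NAME, 'IPAddressType');
```

After registration, a property annotated with @Column(type="ip_address") would be hydrated as an IPAddress and written back as a string.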

For more details, check the docs. http://docs.doctrine-project.org/projects/doctrine-dbal/en/latest/reference/types.html#custom-mapping-types

Multiple Columns

Unfortunately, DBAL types are limited: They only work if you have a single value and you want to store it in one column. So, if my database looked like this…

|----------------|
| ip_range       |
|----------------|
| 192.168.1.1/24 |
| 4.2.2.1/24     |
|----------------|

…DBAL types would be fine. If I wanted them to look like this…

|-------------+---------------|
| start_ip    | end_ip        |
|-------------+---------------|
| 192.168.1.1 | 192.168.1.254 |
| 4.2.2.1     | 4.2.2.254     |
|-------------+---------------|

…then I’m out of luck.

In the current stable release of Doctrine, falling back to the “Map It Yourself” method is your best bet.

/**
 * @ORM\Column(type="ip_address")
 */
protected $startIP;

/**
 * @ORM\Column(type="ip_address")
 */
protected $endIP;

public function setIPRange(IPRange $ipRange)
{
    $this->startIP = $ipRange->getStartIP();
    $this->endIP = $ipRange->getEndIP();
}

public function getIPRange()
{
    return IPRange::createFromIPAddresses($this->startIP, $this->endIP);
}

In the example above, we’re splitting this into two properties/columns internally but holding to the external contract. Keep in mind that the getStartIP() and getEndIP() methods return IPAddress Value Objects (rather than IPRange). This means we’re still using our IPAddress DBAL type for its standard string handling, etc (otherwise, you would also need to cast each $ipAddress to and from a string representation like in the Map It Yourself technique).

If you’re using the value object elsewhere inside your object and don’t want to constantly recreate it, you might consider a third unmapped property with the value object and only using the other two mapped properties for database reading/writing. This is useful, even when using the Map It Yourself technique.

// No annotations, thus not saved in the db.
protected $ipRange;

/**
 * @ORM\Column(type="ip_address")
 */
protected $startIP;

In the event of multiple properties shared across multiple classes, you could potentially use a trait to avoid repeating this code everywhere.
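
As a rough sketch of both ideas at once, the trait below holds the two mapped properties plus the unmapped cache, and rebuilds the Value Object lazily on first access. The trait name HasIPRange and the LoadBalancer class are invented for the example, and the IPRange here is a stripped-down stand-in that takes plain strings instead of IPAddress objects. The annotations are inert without Doctrine, so the snippet runs on its own.

```php
<?php

// Stand-in for the article's IPRange; plain strings replace IPAddress objects.
final class IPRange
{
    private $startIP;
    private $endIP;

    public static function createFromIPAddresses($startIP, $endIP)
    {
        $range = new self();
        $range->startIP = $startIP;
        $range->endIP = $endIP;

        return $range;
    }

    public function getStartIP() { return $this->startIP; }
    public function getEndIP() { return $this->endIP; }
}

// Hypothetical trait: two mapped columns plus an unmapped cache.
trait HasIPRange
{
    /**
     * @ORM\Column(type="ip_address")
     */
    protected $startIP;

    /**
     * @ORM\Column(type="ip_address")
     */
    protected $endIP;

    // No annotations, thus not saved in the db.
    protected $ipRange;

    public function setIPRange(IPRange $ipRange)
    {
        $this->startIP = $ipRange->getStartIP();
        $this->endIP = $ipRange->getEndIP();
        $this->ipRange = $ipRange;
    }

    public function getIPRange()
    {
        // Doctrine only hydrates the mapped columns, so rebuild on demand.
        if ($this->ipRange === null) {
            $this->ipRange = IPRange::createFromIPAddresses($this->startIP, $this->endIP);
        }

        return $this->ipRange;
    }
}

class Server
{
    use HasIPRange;
}

class LoadBalancer
{
    use HasIPRange;
}
```

Both classes now expose the same setIPRange()/getIPRange() contract without repeating the mapping code.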

If you read the above, you might be thinking “Why should I bother with DBAL types at all if I’m just going to map everything myself eventually?” Remember, in most cases, you’ll only have the one value and the DBAL Type will suffice.

So, this method is reasonably simple but we’re still mixing concerns and repeating this across entities will create duplicate code.

Don’t fear, help is on the way.

Embeddables, The Promised Land

A better method for managing Value Objects has been a long running Doctrine pull request (the original JIRA ticket dates back to 2009). Several years and PRs later, it’s only one release away.

This new feature, called Embeddables, allows one to embed a non-entity class inside an entity. The feature is aimed at storing Value Objects but uses the more generic name “Embeddable” to be more inclusive towards other use cases.

I haven’t personally used this feature but the early docs are already online.

Based on the PR, however, we should be able to get the best of both worlds: Reusable mappings/types AND flattened database columns. I suspect a solution would look like this:

<?php

/**
 * @Embeddable
 */
class IPRange
{

    /**
     * @Column(type="ip_address")
     */
    protected $startIP;

    /**
     * @Column(type="ip_address")
     */
    protected $endIP;

    // public methods...
}

// And then in our entity...

class Server
{
    /**
     * @Embedded(class="IPRange")
     */
    protected $ipRange;
}

In this example, our Server entity has an embedded IPRange object. Notice that we’re using a different annotation, @Embedded, rather than an @Column type. That’s because @Column(type=*) maps to DBAL types but @Embedded maps to a regular PHP object with Doctrine metadata.

The IPRange itself is still using the ip_address DBAL type we created before because of a major limitation with Embeddables: you can’t use them inside of other Embeddables. (Update: No longer true, this was fixed by Doctrine 2.5! Hooray!) That might sound like overkill but it would’ve been perfect for our use case: we could’ve had the Server containing an Embeddable for the IPRange and the IPRange could’ve had 2 Embeddables for the start and end IP Addresses. (Update: And now that it’s there in 2.5, it’s exactly what I’d do! Hooray!)

Still, embeddables are simpler than creating lots of DBAL types: you’re just adding Doctrine metadata to PHP objects like you normally do.

Even better, the Value Object fields are split into actual database columns so you can query them easily. The default naming strategy is the entity property name and the embeddable property name, combined with an underscore. In our case:

|-------------------+-----------------|
| ip_range_start_ip | ip_range_end_ip |
|-------------------+-----------------|
| 192.168.1.1       | 192.168.1.254   |
| 4.2.2.1           | 4.2.2.254       |
|-------------------+-----------------|
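
If the default prefix doesn’t suit you, the Doctrine 2.5 docs describe a columnPrefix option on @Embedded, and DQL can reach into the embedded object with dot notation. The fragment below is only a mapping sketch: the "range_" prefix and the query are illustrative, and the exact column names you end up with still depend on your naming strategy.

```php
<?php

class Server
{
    /**
     * columnPrefix replaces the default property-name prefix.
     *
     * @Embedded(class="IPRange", columnPrefix="range_")
     */
    protected $ipRange;
}

// Embedded fields are addressed with a dotted path in DQL.
$dql = 'SELECT s FROM Server s WHERE s.ipRange.startIP = :startIP';
```

You would hand that string to EntityManager::createQuery() and bind :startIP like any other parameter.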

Embeddables are now in Doctrine master and are slated to be released with 2.5 to much fanfare and acclaim. Once they’re released, these should replace DBAL types as your default method for dealing with Value Objects.

Multiple Value Objects (i.e. Collections)

So far, we’ve talked about single values as one or more columns, but what about when you have multiple values? For example, a record might have multiple IPRanges it should check for.

Unfortunately, neither DBAL types nor Embeddables work in these situations. They have no support for collections. Filthy as it is, I usually recommend a Fake Entity. Simply put, you create a fake entity that does nothing but contain your value object. This way, you can use Doctrine’s *ToMany relations on the fake entity and thus your value object.

The important rule here is to not expose these fake entities outside your object. If you have a list of IPRanges and those are encapsulated in IPRangeAssignments fake entities, then a getter should loop over and expose only the IPRanges, as these are the contract you want.

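use Doctrine\Common\Collections\ArrayCollection;
use Doctrine\Common\Collections\Collection;

// Identifier mappings (@Id) are omitted for brevity.

/**
 * @Entity
 */
class IPRangeAssignment
{
    /**
     * Owning side of the relation, which a @OneToMany requires.
     *
     * @ManyToOne(targetEntity="Server", inversedBy="ipRangeAssignments")
     */
    protected $server;

    /**
     * @Embedded/@Column/@Whatever method of storing the IPRange
     */
    protected $ipRange;

    public function __construct(Server $server, IPRange $ipRange)
    {
        $this->server = $server;
        $this->ipRange = $ipRange;
    }

    public function getIPRange()
    {
        return $this->ipRange;
    }
}

/**
 * @Entity
 */
class Server
{
    /**
     * @OneToMany(targetEntity="IPRangeAssignment", mappedBy="server", cascade={"persist"}, orphanRemoval=true)
     */
    protected $ipRangeAssignments;

    public function __construct()
    {
        $this->ipRangeAssignments = new ArrayCollection();
    }

    public function setIPRanges(Collection $ipRanges)
    {
        // Converts a list of IPRange objects to IPRangeAssignment
        $this->ipRangeAssignments = $ipRanges->map(function (IPRange $ipRange) {
            return new IPRangeAssignment($this, $ipRange);
        });
    }

    public function getIPRanges()
    {
        // Unwrap the IPRangeAssignments back to IPRanges
        return $this->ipRangeAssignments->map(function (IPRangeAssignment $ipRangeAssignment) {
            return $ipRangeAssignment->getIPRange();
        });
    }
}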

This method’s noise and boilerplate makes it less than pleasing but it works fairly well. You can clean this up by using custom helper collections or even a trait to reuse the code across multiple classes.

However, after some experience with it, I’ve come to regard it as a design smell. When clients want lists of Value Objects, they often later want to add metadata, such as priorities, dates or names. This is logical for the end user since Value Objects have no identity of their own.

Thus, the IPRangeAssignment may quickly go from fake entity to actual entity once you add the metadata. It would be best to avoid the work of breaking the contract and ask your client more up front.

Other Avenues (that I tend to avoid)

I’d be remiss if I didn’t talk about serialization. The quickest, easiest way to store a Value Object or collection in Doctrine is to just serialize it into a single column. You can do this with Doctrine’s built-in serialization support (these are just DBAL types), json_encode it, or just call serialize yourself. Again, as long as you keep this internal to the model, it’s not a massive issue.

The drawback is that you can’t effectively query this data. Technical debt isn’t just something that appears in your code. Your infrastructure, your processes, and yes, your database can all have technical debt. Simplifying your code at the expense of complicating your database is just moving the debt around.

So, I might serialize values into my database when I’m absolutely 100% sure that I’ll never want to run queries over them or change the data from anything other than my application. In all other cases, I prefer to spend the extra few minutes setting up a DBAL type, Embeddable or Fake Entity.
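
For completeness, here is roughly what the json_encode variant could look like when it stays internal to the entity. This is a sketch under assumptions: the property name $ipRangesAsJson and the 'start'/'end' keys are invented, and IPRange is again a stand-in that takes plain strings.

```php
<?php

// Stand-in for the article's IPRange; plain strings replace IPAddress objects.
final class IPRange
{
    private $startIP;
    private $endIP;

    public static function createFromIPAddresses($startIP, $endIP)
    {
        $range = new self();
        $range->startIP = $startIP;
        $range->endIP = $endIP;

        return $range;
    }

    public function getStartIP() { return $this->startIP; }
    public function getEndIP() { return $this->endIP; }
}

class Server
{
    /**
     * @Column(type="text")
     */
    protected $ipRangesAsJson = '[]';

    public function setIPRanges(array $ipRanges)
    {
        $rows = array();
        foreach ($ipRanges as $ipRange) {
            $rows[] = array(
                'start' => (string) $ipRange->getStartIP(),
                'end' => (string) $ipRange->getEndIP(),
            );
        }

        $this->ipRangesAsJson = json_encode($rows);
    }

    public function getIPRanges()
    {
        // Rebuild the Value Objects from the stored JSON on every call.
        $ipRanges = array();
        foreach (json_decode($this->ipRangesAsJson, true) as $row) {
            $ipRanges[] = IPRange::createFromIPAddresses($row['start'], $row['end']);
        }

        return $ipRanges;
    }
}
```

The public contract still deals only in IPRange objects, but notice that nothing in the database can filter on start or end any more, which is exactly the drawback described above.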

Another option would be to use Doctrine’s Event system to do the mapping yourself but external to the class. I’m a big fan of events but they’re probably not the best fit here: it’s more complex than the other options and you may still need to expose the internal state of your object to set things back inside it. If I wanted to make updates to other Entities, I might use the Event system but I’d be more likely to use a Domain Event than a Doctrine Event.

The Nuclear Option

Finally, Doctrine is great for the vast majority of applications but if you’ve got edge cases that are making your entity code messy, don’t be afraid to toss Doctrine out. Set up interfaces for your repositories so you can create an alternate implementation where you do the querying or mapping by hand. It might be a PITA but it might also be less frustration in the long run.
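
To make that concrete, here is a small sketch of a repository interface with a hand-mapped PDO implementation. Everything in it is hypothetical: the interface, the servers table and its column names, and the simplified Server and IPRange stand-ins.

```php
<?php

// Stand-in for the article's IPRange; plain strings replace IPAddress objects.
final class IPRange
{
    private $startIP;
    private $endIP;

    public static function createFromIPAddresses($startIP, $endIP)
    {
        $range = new self();
        $range->startIP = $startIP;
        $range->endIP = $endIP;

        return $range;
    }

    public function getStartIP() { return $this->startIP; }
    public function getEndIP() { return $this->endIP; }
}

// Simplified entity: a name plus its range.
class Server
{
    private $name;
    private $ipRange;

    public function __construct($name, IPRange $ipRange)
    {
        $this->name = $name;
        $this->ipRange = $ipRange;
    }

    public function getName() { return $this->name; }
    public function getIPRange() { return $this->ipRange; }
}

// The rest of the application depends only on this contract.
interface ServerRepository
{
    public function save(Server $server);
    public function findByName($name);
}

// Hand-written mapping; a Doctrine-backed class could implement the same interface.
final class PdoServerRepository implements ServerRepository
{
    private $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function save(Server $server)
    {
        $statement = $this->pdo->prepare(
            'INSERT INTO servers (name, start_ip, end_ip) VALUES (:name, :start_ip, :end_ip)'
        );
        $statement->execute(array(
            'name' => $server->getName(),
            'start_ip' => (string) $server->getIPRange()->getStartIP(),
            'end_ip' => (string) $server->getIPRange()->getEndIP(),
        ));
    }

    public function findByName($name)
    {
        $statement = $this->pdo->prepare(
            'SELECT name, start_ip, end_ip FROM servers WHERE name = :name'
        );
        $statement->execute(array('name' => $name));
        $row = $statement->fetch(PDO::FETCH_ASSOC);
        if ($row === false) {
            return null;
        }

        return new Server(
            $row['name'],
            IPRange::createFromIPAddresses($row['start_ip'], $row['end_ip'])
        );
    }
}
```

Because callers only see ServerRepository, swapping between a Doctrine implementation and this one is a wiring change rather than a rewrite.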

You can also do this partially, by letting Doctrine map the simple properties of your entity and then using the postLoad event to attach some complex object building at hydration time.
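
A minimal sketch of that partial approach, using Doctrine’s lifecycle callback annotations, might look like this. The method name rebuildIPRange() is made up, the columns are mapped as plain strings to keep things simple, and IPRange is a stand-in that takes strings.

```php
<?php

// Stand-in for the article's IPRange; plain strings replace IPAddress objects.
final class IPRange
{
    private $startIP;
    private $endIP;

    public static function createFromIPAddresses($startIP, $endIP)
    {
        $range = new self();
        $range->startIP = $startIP;
        $range->endIP = $endIP;

        return $range;
    }

    public function getStartIP() { return $this->startIP; }
    public function getEndIP() { return $this->endIP; }
}

/**
 * @Entity
 * @HasLifecycleCallbacks
 */
class Server
{
    /**
     * @Column(type="string")
     */
    protected $startIP;

    /**
     * @Column(type="string")
     */
    protected $endIP;

    // Not mapped: filled in once Doctrine has hydrated the columns above.
    protected $ipRange;

    /**
     * Doctrine calls this after loading the entity from the database.
     *
     * @PostLoad
     */
    public function rebuildIPRange()
    {
        $this->ipRange = IPRange::createFromIPAddresses($this->startIP, $this->endIP);
    }

    public function getIPRange()
    {
        return $this->ipRange;
    }
}
```

The complex object building lives in one callback, while Doctrine keeps handling the boring scalar columns.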

Finally, don’t lose sight of the real goal here: using Value Objects in your code. Don’t let this article scare you off with the complexity of some of the more extreme solutions; Value Objects can improve your codebase tremendously and they’re a solution worth investing in.

*Many thanks to Marco Pivetta, Rafael Dohms, Sandy Pleyte and Benjamin Eberlei for suggestions and corrections related to this article.
