With most web applications, there comes a time when your administrative users need reports. These statistics go beyond the typical domain entities and properties that you deal with day-to-day. They consist of aggregate functions (i.e. SUM and MAX) that return numeric scalar values. Lets look at a common MVC solution to expose this data in PHP, and explore how we might refactor that code into something that is maintainable. Our goal is to find the highest quality code that is simple and easy to test.
The source code and tests can be found on Github.
Suppose that you maintain a PHP web application that needs to expose meaningful company statistics to an administrative user. Think of how you might approach this problem. What SQL will you write? How will you separate your code with regards to the database and the UI?
Below is a typical solution using the MVC pattern with the Company Controller issuing a direct SQL query. For now lets focus on the UI code and how the interaction between the Controller and View layer impacts the overall maintainability. Setting aside the Model layer simplifies the following example.
<?php class CompanyController { public function action_view($companyId) { $sql = ' SELECT SUM(IF(employee.isActive=1,1,0)) AS totalActiveEmployees, SUM(IF(employee.isActive=0,1,0)) AS totalInactiveEmployees FROM employee WHERE company_id = ' . (int) $companyId; $companyStats = $conn->query($sql); $this->setVar('companyStats', $companyStats); $this->setVar('company', ...); } }
<div id="company-stats"> Company Name: ... Active Employees: <?=$companyStats['totalActiveEmployees']?><br> Inactive Employees: <?=$companyStats['totalInactiveEmployees']?> </div>
Here, arrays and strings are used to access the domain object values. This stringly-typed nature tightly couples our UI layer with the repository layer. A small change to any of the strings will cause another part of the system to malfunction.
To reduce our dependence on strings, lets introduce a Data Transfer Object (DTO) to strongly type the company statistics. This is a simple anemic PHP object with two getter methods.
<?php class CompanyStatsDTO { protected $totalActiveEmployees; protected $totalInactiveEmployees; public function __construct(int $totalActiveEmployees = null, int $totalInactiveEmployees = null) { $this->totalActiveEmployees = (int) $totalActiveEmployees; $this->totalInactiveEmployees = (int) $totalInactiveEmployees; } public function totalActiveEmployees(): int { return $this->totalActiveEmployees; } public function totalInactiveEmployees(): int { return $this->totalInactiveEmployees; } }
/** @var CompanyStatsDTO $companyStats */ Company Name: ... Active Employees: <?=$companyStats->totalActiveEmployees()?><br> Inactive Employees: <?=$companyStats->totalInactiveEmployees()?>
Now our UI layer only depends on the DTO. Any future refactoring of this object will be a safe operation. This object does one thing and is easy to test.
Next, we separate the repository layer from the controller. In doing so, we further remove our coupling to a relational database. Why should your Controller or View care that you are using a SQL database? First, we need an interface to code against.
<?php interface CompanyRepositoryInterface { public function getCompanyStats(int $companyId): CompanyStatsDTO; }
That was easy. There is one method that computes the stats for a company and returns a CompanyStatsDTO object. With this new API, the controller shrinks and delegates the operation to a more appropriate layer.
<?php class CompanyController { public function view($companyId) { $companyStats = $this->companyRepository->getCompanyStats($companyId); $this->setVar('companyStats', $companyStats); $this->setVar('company', ...); } }
Ok. I get it. Separation of concerns. Single responsibility. How do I actually use this?
Lets look at four ways you can implement this Company Repository Interface using Doctrine.
Below we implement the CompanyRepositoryInterface using the Doctrine Query Builder. This helps to construct the same raw SQL query from before using the Builder Pattern. Ignore the DQL language specifics for now, and notice how we return a new CompanyStatsDTO object using the query result array.
<?php class ManualHydratorCompanyRepository implements CompanyRepositoryInterface { public function getCompanyStats(int $companyId): CompanyStatsDTO { $companyStatsArray = $this->getEntityManager() ->createQueryBuilder() ->addSelect('SUM(IF(Employee.isActive=1,1,0)) AS totalActiveEmployees') ->addSelect('SUM(IF(Employee.isActive=0,1,0)) AS totalInactiveEmployees') ->from(Employee::class, 'Employee') ->where('Employee.company = :companyId') ->setParameter('companyId', $companyId) ->setMaxResults(1) ->getQuery() ->getArrayResult(); return new CompanyStatsDTO( (int) $companyStatsArray[0]['totalActiveEmployees'], (int) $companyStatsArray[0]['totalInactiveEmployees'] ); } }
This is somewhat painful. Maybe there is another way to do the same thing. Lets try the Result Set Mapping strategy Doctrine provides.
<?php class ResultSetMappingCompanyRepository implements CompanyRepositoryInterface { public function getCompanyStats(int $companyId): CompanyStatsDTO { $resultSetMapping = new ResultSetMapping; $resultSetMapping ->addScalarResult('sclr_0', 'totalActiveEmployees', 'integer') ->addScalarResult('sclr_1', 'totalInactiveEmployees', 'integer'); $companyStatsArray = $this->getEntityManager() ->createQueryBuilder() ->addSelect('SUM(IF(Employee.isActive=1,1,0)) AS totalActiveEmployees') ->addSelect('SUM(IF(Employee.isActive=0,1,0)) AS totalInactiveEmployees') ->from(Employee::class, 'Employee') ->where('Employee.company = :companyId') ->setParameter('companyId', $companyId) ->setMaxResults(1) ->getQuery() ->setResultSetMapping($resultSetMapping) ->getArrayResult(); return new CompanyStatsDTO( $companyStatsArray[0]['totalActiveEmployees'], $companyStatsArray[0]['totalInactiveEmployees'] ); } }
That was a bit too much. The Result Set Mapping strategy is good for complex results; however, we are only concerned with simple stats for a single company. Code should be short and simple. This adds quite a bit of extra effort with the advantage of having integers being hydrated into the query result array. Maybe there is an alternative.
<?php class CustomHydratorCompanyRepository implements CompanyRepositoryInterface { public function getCompanyStats(int $companyId): CompanyStatsDTO { $this->addCustomHydrationMode(CompanyStatsDTOHydrator::class); return $this->getEntityManager() ->createQueryBuilder() ->addSelect('SUM(IF(Employee.isActive=1,1,0)) AS totalActiveEmployees') ->addSelect('SUM(IF(Employee.isActive=0,1,0)) AS totalInactiveEmployees') ->from(Employee::class, 'Employee') ->where('Employee.company = :companyId') ->setParameter('companyId', $companyId) ->setMaxResults(1) ->getQuery() ->getResult(CompanyStatsDTOHydrator::class); } protected function addCustomHydrationMode(string $className) { $this->getEntityManager()->getConfiguration()->addCustomHydrationMode( $className, $className ); } }
<?php class CompanyStatsDTOHydrator extends AbstractHydrator { protected function hydrateAllData(): CompanyStatsDTO { $row = $this->_stmt->fetch(PDO::FETCH_ASSOC); return new CompanyStatsDTO( (int) $row['sclr_0'], (int) $row['sclr_1'] ); } }
Ok, that looks a lot like the Result Set Mapping approach. This isolates the less than ideal scalar to integer hydration to a separate class. The benefit is that our query result is now a CustomerStatsDTO object rather than an array. I still think we can do better. Maybe there is a better way.
<?php class DTOHydratorCompanyRepository implements CompanyRepositoryInterface { public function getCompanyStats(int $companyId): CompanyStatsDTO { return $this->getEntityManager() ->createQueryBuilder() ->select('NEW ' . CompanyStatsDTO::class . '( SUM(IF(Employee.isActive=1,1,0)), SUM(IF(Employee.isActive=0,1,0)) )') ->from(Employee::class, 'Employee') ->where('Employee.company = :companyId') ->setParameter('companyId', $companyId) ->getQuery() ->getOneOrNullResult(); } }
Yes, this looks nice. We leverage the “NEW” Operator Syntax with Doctrine (as of 2.4) to both hydrate and return the DTO object. Short and simple. Maintenance becomes easy with simplistic code. Code that only has one reason to change. It does one thing and is easy to test.
Quality is unmeasurable; it is subjective, highly personal, and often biased. Maybe we can all agree that short and simple code is an indicator of quality. Hopefully the above examples can help you to find quality in your code.
Part 1 | Part 2