Growing Gradually in PHP

With the release of the much-anticipated PHP 7.4 and the planned release of PHP 8.0 next year, there’s been discussion in some quarters of whether or not PHP is getting “too strict”. Some have even phrased it as an existential split between two fundamentally different views of the language, incompatible worldviews on strictness vs flexibility that are at odds and will tear the language apart bit by bit if nothing is done to address that schism.

I do not share that view. Quite the contrary, I find PHP’s balance in recent years between casual code style and more formal code style to be one of its most remarkable and strongest features, and one we should be both proud of and continue to encourage.

It’s true there really are different ways and styles to approach programming; not all of them are always valid, but most are valid in certain situations and contexts. When writing a one-off simple script, or a hit counter, or some other “casual” use case where the code base is small, it’s entirely true that a lot of the formalisms and stricter, more pedantic language features are unnecessary and can, in some cases, get in the way.

On the other hand, the larger and more complex your application, the more that formalism and strictness can save your butt. Rather than get in your way, it heads off bugs and design flaws early, before they can fester and become even more expensive to fix. I wouldn’t care if a simple side project app isn’t strictly typed all the way through; I wouldn’t trust an ecommerce application that isn’t.

But that’s the great thing about PHP: We can do both. While I encourage all developers to leverage PHP’s opt-in strictness and typing and other formal design tools as much as possible, they’re not required. If your use case really doesn’t call for them… don’t use them. If it does, use them.

Where some languages are engineered for big systems and thus have a high ramp-up to learn all the moving parts, and others are engineered for casual use cases and thus struggle to scale to large and complex use cases, PHP offers a “pretty good” version of both worlds. It’s easily approachable and easy to learn, but contains an increasing number of features to allow for a more robust and self-checking programming style.

We can (and frequently do) debate when using those stricter, more explicit features is appropriate; personally I prefer to use them even on tiny projects as it’s good practice, but not everyone does, and that’s OK. But the fact that PHP lets you gradually scale from untyped loosey-goosey code to partially typed to strictly typed code, with casual error handling to strict error handling, all in the same language, all in an inter-compatible way, is absolutely incredible and something the PHP core team should be proud of.
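As a rough illustration of that gradual path, here is one small calculation written at two levels of strictness, with the third level described in a comment. The function names and the price example are made up for this sketch; only the language behavior is PHP’s own.

```php
<?php
// Stage 1: no types at all. Anything goes, including surprising input.
function totalLoose($price, $quantity)
{
    return $price * $quantity;
}

// Stage 2: parameter and return types, still in PHP's default coercive mode.
// A numeric string such as "3" is quietly converted to the integer 3.
function totalTyped(int $price, int $quantity): int
{
    return $price * $quantity;
}

// Stage 3 (not enabled in this file): putting declare(strict_types=1); as the
// first statement of the calling file makes totalTyped(250, "3") throw a
// TypeError instead of converting the string.

var_dump(totalLoose(250, "3"));
var_dump(totalTyped(250, "3"));
```

All three stages can live side by side in the same code base, which is the point: each file opts in to as much checking as it needs.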

I’m all for adding more robust formal tools to the language; I’m excited about typed properties and short lambdas in PHP 7.4; I am looking forward to union types in PHP 8, and I’d be ecstatic if we could figure out how to wrap intersection types and generics into the mix. There are plenty of other language tools to help constrain large code bases and algorithms that we could consider, too. But by necessity all of those would be opt-in features, just like types are now; and that’s what’s great about them, and about PHP.
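For readers who have not tried the two PHP 7.4 features mentioned above, here is a minimal sketch of typed properties and arrow functions (the “short lambdas”). The Product class and its data are hypothetical and exist only for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical value object; the class and property names are illustrative.
final class Product
{
    // Typed properties (PHP 7.4): assigning a value of the wrong type
    // throws a TypeError instead of silently storing it.
    public string $name;
    public int $priceInCents;

    public function __construct(string $name, int $priceInCents)
    {
        $this->name = $name;
        $this->priceInCents = $priceInCents;
    }
}

$products = [new Product('Mug', 899), new Product('Poster', 1250)];

// Arrow functions (PHP 7.4): a single expression that captures variables
// from the enclosing scope by value, without a use (...) clause.
$threshold = 1000;
$expensive = array_filter(
    $products,
    fn (Product $p): bool => $p->priceInCents > $threshold
);
$names = array_values(array_map(fn (Product $p): string => $p->name, $expensive));
```

Both features are opt-in: a class may mix typed and untyped properties, and ordinary closures keep working alongside arrow functions.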

I’ve never agreed with the argument that type definitions or classes make the language inherently harder to learn, but PHP, fairly unique among languages I know, lets you learn them piece by piece or all at once at whatever your pace happens to be. That wasn’t a conscious, deliberate decision, but it’s how the language has evolved and we are all better off for it.

We don’t need to split up PHP. We don’t need divergent “modes” or “editions”. We need to continue doing what PHP has been so remarkably good at: building a language that scales with you, from casual loosey-goosey scripts to formally typed and structured and error checked designs that would make a Computer Science professor proud. Fitting all of those into one language is hard, but PHP has done it, and we should continue to do it.

And that’s something we can all be thankful for.

Thinking about Dependency Management

The last 8 years have seen a massive change in how we work with dependencies within PHP projects. The arrival and further advancement of Composer has changed how we decide which dependencies to use, how we maintain them and how we update and upgrade them. Few people would call this change anything but positive – overall this has been a big success story for the PHP community and environment.

Unfortunately, most blessings come with strings attached, and this one is no exception. Because it is so easy to add another dependency, we tend to add just one more without thinking about whether we really need it or whether another solution is available. You may not maintain the library that you just added as a dependency, but the moment you add it, it becomes your liability. Nowadays, it’s often not simply a matter of adding a single library – that library has dependencies of its own, which become part of your project and your liability as well. It may not be as bad as in the JavaScript world with npm (1, 2), but are we always really sure we want to have all those dependencies?

While we all enjoyed the new freedom made possible with Composer and reaped the fruits, I think we didn’t look into the less positive aspects as much as we should have. Are you sure you know about all packages that are in your application? Not just the direct ones you declared as a dependency, but the dependencies of those, and the dependencies of the dependencies of your dependencies? You could extend this chain even further, I guess.
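One way to answer that question is to compare what composer.json asks for with what composer.lock actually installs. The sketch below assumes the usual layout of those files (“require” and “require-dev” in the manifest, “packages” and “packages-dev” in the lock file); the function name is made up for the example. Composer’s own “composer show --tree” command gives a similar view on the command line.

```php
<?php
declare(strict_types=1);

/**
 * Returns the names of installed packages that composer.json does not
 * require directly, i.e. the dependencies of your dependencies.
 *
 * @param array $manifest decoded composer.json
 * @param array $lock     decoded composer.lock
 * @return string[]
 */
function transitivePackages(array $manifest, array $lock): array
{
    // Package names the project asks for explicitly.
    $direct = array_keys(array_merge(
        $manifest['require'] ?? [],
        $manifest['require-dev'] ?? []
    ));

    // Package names that actually ended up installed.
    $installed = array_column(
        array_merge($lock['packages'] ?? [], $lock['packages-dev'] ?? []),
        'name'
    );

    return array_values(array_diff($installed, $direct));
}

// Usage against a real project (the file paths are assumptions):
// $manifest = json_decode(file_get_contents('composer.json'), true);
// $lock     = json_decode(file_get_contents('composer.lock'), true);
// print_r(transitivePackages($manifest, $lock));
```

A long result is not a problem in itself, but it is the list of packages you trust without ever having chosen them.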

The issue here is trust. For example, if you decide to add a package from Symfony, you place trust in the maintainers of this package. For a package from Symfony, that may be easy to do. But does this apply to all the maintainers of all the packages you use? At least for making sure your dependencies don’t contain known security vulnerabilities there are tools available, like SensioLabs Security Checker and GitHub Security Alerts. But what about packages that are not so easy to trust? Theoretically, on each update you’d have to vet the new version to ensure it’s still OK for you to use.

Packages vary in size – for small packages of just a couple of files it may be worth considering not adding the package as a dependency, but simply copying the required code you just vetted (with respect to the license, of course!) into your application and not bothering with the dependency at all. This might be easier said than done, as it involves a tradeoff: you lose simple access to bugfixes and, more importantly, security fixes. However, the component might not be very susceptible to security flaws. The additional burden of maintaining this piece of code yourself might be worth it in contrast to the burden of caring about the dependency throughout the future.

On the other hand, adding it as a dependency will provide you with easy access to bugfixes and security fixes. But are you sure your dependencies are really updated often enough? Too often I’ve seen cases where developers don’t update and applications fall behind. Just because updating might be easy doesn’t mean it happens automatically. There can be reasons for this, like missing unit tests, so you can’t be sure that an update doesn’t break anything, even if it’s supposed to be a bugfix-only update. To ensure upgrades happen often, the existence of tests is necessary, but not sufficient. I believe dependency updates must be automated, and luckily we see the first steps in that direction, like automated security updates from GitHub. This is great, but I think we have the necessary ingredients in place to automate all bugfix updates, and probably minor updates, this way, and we definitely need this as the amount of software just grows bigger and bigger.

As always in software, there isn’t a definitive answer, and decisions taken vary from situation to situation. But being aware of what adding an additional dependency (or more than one if this one itself has additional dependencies) entails is definitely not a bad thing.

Lessons Learned from Testing and Refactoring Legacy

I remember when I first discovered automated testing. I immediately wanted to apply it to all the projects that I was working on, but it didn’t work as well as I expected. In fact, it was a disaster, and experiences like that are why so many developers shy away from tests after a few failed attempts. It turns out that adding tests to a project that never had any tests is a much bigger challenge than testing new software. People get thrown in at the deep end of the pool, then either learn to swim or get scarred for life.

The problem with legacy applications is that they almost never follow SOLID or clean code principles. This makes them hard to unit-test. You’d need to refactor the code before you can test it, but how do you know that the refactoring won’t break existing functionality? You’d need tests to ensure that the refactoring goes smoothly. We are in a deadlock… or are we?

Characterization Tests

When I want to refactor code, I don’t start with unit tests. I start with characterization tests, which are meant to characterize the current software’s behavior. I usually write them by calling an HTTP endpoint or batch script, then inspecting the output.

For example, I want to make sure that product prices are correctly synchronized based on a daily CSV file that we fetch from an FTP server. I will write a test for that, and when I refactor the underlying code, this test will still pass, because it does not concern itself with the internal workings of the code. It only cares that the end result is the same. Characterization tests will survive a refactoring operation.

Note that these tests are meant to ensure that the existing behavior doesn’t change, not that the behavior is correct. This means that if I uncover bugs, I’ll document them in tickets and continue testing.

Later, when refactoring is done, I can update these tests to capture the correct behavior that is expected, then fix the bugs that pop up, along with any unit tests that I have written during the refactoring. They will then become acceptance tests.
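The idea described in this section can be sketched in a few lines without any test framework: run the legacy code, record its output once, and compare every later run against that recording. Everything named here (legacyPriceReport, matchesSnapshot, the snapshot path) is hypothetical; in a real project the “legacy” function would be an HTTP call or a batch script run.

```php
<?php
declare(strict_types=1);

/**
 * Stand-in for the legacy entry point whose behavior we want to pin down.
 */
function legacyPriceReport(array $rows): string
{
    $out = '';
    foreach ($rows as $row) {
        $out .= $row['sku'] . ';' . number_format($row['price'], 2, '.', '') . "\n";
    }
    return $out;
}

/**
 * Compares the current output with a previously recorded snapshot.
 * On the first run the snapshot is recorded and the check passes.
 */
function matchesSnapshot(string $snapshotFile, string $actual): bool
{
    if (!is_file($snapshotFile)) {
        file_put_contents($snapshotFile, $actual);
        return true;
    }
    return file_get_contents($snapshotFile) === $actual;
}

$rows = [['sku' => 'A-1', 'price' => 9.5], ['sku' => 'B-2', 'price' => 12]];
$snapshot = sys_get_temp_dir() . '/price-report-' . uniqid('', true) . '.snapshot';

$firstRun  = matchesSnapshot($snapshot, legacyPriceReport($rows)); // records the snapshot
$secondRun = matchesSnapshot($snapshot, legacyPriceReport($rows)); // compares against it
unlink($snapshot);
```

Note that the snapshot captures whatever the code does today, bugs included, which matches the intent above: the check protects existing behavior, not correct behavior.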

Exploratory Testing

Once I have built myself a safety net with characterization tests, I need to understand how the existing code works. For this, I use exploratory testing. These are often throw-away tests that I only use to understand the code, its design and where things can be refactored.

I don’t refactor everything at once, as that would be a 2-year effort on some projects. I try to find smaller components and make them testable, but not so small that the refactoring will be pointless. With some practice, I eventually found the sweet spot.

To decide how to refactor, I look at difficult tests. The following pain points in tests usually indicate an underlying code design flaw.

Hard Dependencies

These are easy to spot. When my test makes me mock a static call or a new keyword, I know that I will need to move these things to constructor dependencies.

I will also extract an interface from these dependencies, so that my code would respect the “D” in SOLID: one should depend upon abstractions, not concretions. Dependency inversion is the first thing that I will introduce into any legacy to make my tests sane and any code changes a lot less painful.
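A minimal sketch of that move, under assumed names: RateProvider, PriceConverter and the commented-out “before” calls are all hypothetical and only show the shape of the change.

```php
<?php
declare(strict_types=1);

// Before (sketch): the class reaches for a static call and "new" internally,
// so a test cannot replace either of them.
//
//     $rate = CurrencyApi::fetchRate('EUR');   // static call
//     $mailer = new SmtpMailer();              // hard-wired instance

// After: depend on an abstraction that is passed in through the constructor.
interface RateProvider
{
    public function rateFor(string $currency): float;
}

final class PriceConverter
{
    private RateProvider $rates;

    public function __construct(RateProvider $rates)
    {
        $this->rates = $rates;
    }

    public function convert(int $cents, string $currency): int
    {
        return (int) round($cents * $this->rates->rateFor($currency));
    }
}

// In a test, a tiny fake implementation replaces the real, network-backed one.
$fixedRates = new class implements RateProvider {
    public function rateFor(string $currency): float
    {
        return 1.5;
    }
};

$converter = new PriceConverter($fixedRates);
```

The production code would pass in an implementation that talks to the real service; the class under test no longer knows or cares which one it gets.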

Uncontrolled Variables

Sometimes, my test will depend on things that I cannot control, like the current time or a user’s IP address. This is a great opportunity to refactor the code to not explicitly depend on these things.

  1. I will find all places that use mktime(), new DateTime(), etc.
  2. I will create and inject a Clock interface as a dependency for these classes.
  3. I will replace the time creation with $this->clock->now().
  4. I will write an implementation for it using DateTimeImmutable and another using a value that comes from a database, which I can later leverage for controlling the time in acceptance tests. The acceptance test writes a date to the DB and the application under test reads it.

Not only do these interfaces allow for robust and repeatable unit tests, but I can also simulate things like cache expiry or other interval-based logic.
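The steps above might look roughly like this. The Clock interface and the now() method come from the list; SystemClock, FrozenClock and CacheEntry are names assumed for the sketch, and the frozen clock is an in-memory stand-in for the database-backed variant mentioned in step 4.

```php
<?php
declare(strict_types=1);

interface Clock
{
    public function now(): DateTimeImmutable;
}

// Production implementation: simply asks the system for the current time.
final class SystemClock implements Clock
{
    public function now(): DateTimeImmutable
    {
        return new DateTimeImmutable();
    }
}

// Test implementation: time stands still until the test moves it.
final class FrozenClock implements Clock
{
    private DateTimeImmutable $now;

    public function __construct(DateTimeImmutable $now)
    {
        $this->now = $now;
    }

    public function now(): DateTimeImmutable
    {
        return $this->now;
    }

    public function advance(string $modifier): void
    {
        $this->now = $this->now->modify($modifier);
    }
}

// Example consumer with interval-based logic: a cache entry with a TTL.
final class CacheEntry
{
    private Clock $clock;
    private DateTimeImmutable $expiresAt;

    public function __construct(Clock $clock, int $ttlSeconds)
    {
        $this->clock = $clock;
        $this->expiresAt = $clock->now()->modify("+{$ttlSeconds} seconds");
    }

    public function isExpired(): bool
    {
        return $this->clock->now() >= $this->expiresAt;
    }
}
```

A test can now create a FrozenClock, build a CacheEntry with a 60-second TTL, call advance('+61 seconds') and check isExpired() without ever sleeping.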

Mixed Concerns

Sometimes, a class requires me to mock 15 dependencies, but only a tiny fraction of them is used in each method. This often happens in the context of an MVC framework where it’s common to have one controller class with many actions.

$controller = new ProductController($dep1, $dep2, ...);

Perhaps this class is mixing too many concerns and should be split into smaller classes. Group the methods by the dependencies that they have, even if it means one method per class. Instead of having a class called ProductController, you’ll have ProductListHandler, ProductViewHandler, etc. The resulting classes will be much easier to test, and the code will be easier to debug and modify.
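As a sketch of what such a split could look like: the handler names are the ones used above, while ProductCatalog, AuditLog and the handle() methods are assumptions made for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical collaborators; only their shape matters here.
interface ProductCatalog
{
    /** @return string[] product names keyed by product id */
    public function names(): array;
}

interface AuditLog
{
    public function record(string $message): void;
}

// Listing products needs the catalog and nothing else.
final class ProductListHandler
{
    private ProductCatalog $catalog;

    public function __construct(ProductCatalog $catalog)
    {
        $this->catalog = $catalog;
    }

    public function handle(): array
    {
        return array_values($this->catalog->names());
    }
}

// Viewing a product needs the catalog and the audit log.
final class ProductViewHandler
{
    private ProductCatalog $catalog;
    private AuditLog $log;

    public function __construct(ProductCatalog $catalog, AuditLog $log)
    {
        $this->catalog = $catalog;
        $this->log = $log;
    }

    public function handle(int $id): ?string
    {
        $this->log->record("viewed product {$id}");
        return $this->catalog->names()[$id] ?? null;
    }
}
```

Each test now sets up only the one or two collaborators its handler actually uses, instead of fifteen mocks for every case.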

Long Methods

Is a method so long that it requires 200 unit tests, each with a ridiculous setup? Ten years ago, I would have written several thousand lines of unit tests and then hoped to never have to understand them again.

Today, I will split that into small private methods, group them by dependencies and move whatever makes sense into separate classes.

Let’s say that each comment in this example corresponds to about 20 lines of code:

public function synchronizePrices(): void
{
    // load CSV from file
    // parse CSV
    // create product array
    // if product doesn't exist, throw exception
    // look up product in database
    // if price is different, update it
}

As a first step, I’ll extract the code into private methods and ensure that my characterization tests still pass:

public function synchronizePrices(): void
{
    // This depends on the filesystem.
    $csv = $this->loadCsvFromFile();
    $parsedCsv = $this->parseCsv($csv);
    $products = $this->getProductsFromArray($parsedCsv);

    foreach ($products as $product) {
        // These two depend on the database.
        $this->findProduct($product);
        $this->updatePrice($product);
    }
}

Now I should move the code from those methods into new classes that I’ll inject through the constructor. More than 100 lines of code turn into this:

private ProductRepository $sourceProductRepository;
private ProductRepository $targetProductRepository;

public function __construct(
    ProductRepository $sourceProductRepository,
    ProductRepository $targetProductRepository
) {
    $this->sourceProductRepository = $sourceProductRepository;
    $this->targetProductRepository = $targetProductRepository;
}

public function synchronizePrices(): void
{
    $products = $this->sourceProductRepository->getAll();

    foreach ($products as $product) {
        $this->targetProductRepository->updatePrice($product);
    }
}

Notice that both our repositories are implementations of the ProductRepository interface. Basically, I want to be able to synchronize between two repositories. This class doesn’t need to care how things are stored. I’ll just instantiate it with the CSV implementation on one side and the DB implementation on the other.
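To make the idea concrete, here is a self-contained sketch. The ProductRepository methods getAll() and updatePrice() and the synchronizePrices() method are taken from the snippets above; the Product class, the in-memory repository (standing in for both the CSV and the database implementations) and the short property names are assumptions for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical product entity.
final class Product
{
    public string $sku;
    public int $priceInCents;

    public function __construct(string $sku, int $priceInCents)
    {
        $this->sku = $sku;
        $this->priceInCents = $priceInCents;
    }
}

interface ProductRepository
{
    /** @return Product[] */
    public function getAll(): array;

    public function updatePrice(Product $product): void;
}

// In-memory implementation; a CSV- or database-backed class would
// implement the same two methods against its own storage.
final class InMemoryProductRepository implements ProductRepository
{
    /** @var Product[] keyed by SKU */
    private array $products = [];

    public function __construct(Product ...$products)
    {
        foreach ($products as $product) {
            $this->products[$product->sku] = $product;
        }
    }

    public function getAll(): array
    {
        return array_values($this->products);
    }

    public function updatePrice(Product $product): void
    {
        if (!isset($this->products[$product->sku])) {
            throw new RuntimeException("Unknown product {$product->sku}");
        }
        $this->products[$product->sku]->priceInCents = $product->priceInCents;
    }
}

final class Synchronizer
{
    private ProductRepository $source;
    private ProductRepository $target;

    public function __construct(ProductRepository $source, ProductRepository $target)
    {
        $this->source = $source;
        $this->target = $target;
    }

    public function synchronizePrices(): void
    {
        foreach ($this->source->getAll() as $product) {
            $this->target->updatePrice($product);
        }
    }
}
```

Swapping the source for a REST-backed repository later would mean writing one more class that implements ProductRepository; the Synchronizer itself would not change.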

Maybe the third party will one day stop uploading a CSV and I’ll need to fetch the prices from a REST API instead. I know that this requires just one additional implementation, which can be done in an afternoon or two.

Of course, nothing prevents me from splitting things further inside of those concrete classes if they are still too complex. I will end up with smaller, testable classes.

More Refactoring

This is by no means an exhaustive list of all the refactoring that can be done. For more ideas, please read “Clean Code” by Robert C. Martin, “Working Effectively with Legacy Code” by Michael Feathers and “Modernizing Legacy Applications in PHP” by Paul M. Jones.

However, don’t try to do everything at once. As soon as you have a small class that depends on just a handful of interfaces and a rather short method, like in the synchronizePrices example, go write unit tests.

Unit Tests

Refactoring will make the synchronizePrices method much easier to test. Here is what the exploratory test might have looked like:

protected function setUp(): void
{
    // 50 lines to mock hard dependencies
    $this->synchronizer = new Synchronizer();
}

public function testSynchronizePrices(): void
{
    // 50 lines to build CSV content
    // create the CSV file and upload to FTP
    // 200 lines of mocks for the ORM calls
}

Here is how I would generally write the test after the refactoring:

protected function setUp(): void
{
    $this->csvProductRepository = $this->createStub(ProductRepository::class);
    $this->databaseProductRepository = $this->createMock(ProductRepository::class);

    $this->synchronizer = new Synchronizer(
        $this->csvProductRepository,
        $this->databaseProductRepository
    );
}

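public function testSynchronizePrices_WithProducts_WillUpdatePrices(): void
{
    // Two product fixtures. Product is assumed to be the entity class
    // that the repositories work with; build them however your code does.
    $product1 = $this->createStub(Product::class);
    $product2 = $this->createStub(Product::class);

    $this->csvProductRepository
         ->method('getAll')
         ->willReturn([$product1, $product2]);

    // withConsecutive() takes one argument list per expected call.
    $this->databaseProductRepository
         ->expects($this->exactly(2))
         ->method('updatePrice')
         ->withConsecutive(
             [$product1],
             [$product2]
         );

    $this->synchronizer->synchronizePrices();
}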

In the unit test, I instantiate the class with a stub and a mock, then make sure to cover the two execution paths (with and without products). If I expect updatePrice to throw an exception, I can decide to add yet another test case and then implement a try/catch in the code.

As you gain a better understanding of what constitutes good code design, your tests will become increasingly easier to write.

Helpful Tools

In addition to all these tests, I also rely heavily on PhpStorm, which offers automated refactoring tools and a plethora of inspections that highlight any potential error that I’m making, like trying to call a method on a property that is possibly null: $entity->relationship->getName().
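For context, this is the kind of code such an inspection steers you toward. The Entity and Relationship classes are hypothetical; the property and method names follow the expression above.

```php
<?php
declare(strict_types=1);

final class Relationship
{
    private string $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function getName(): string
    {
        return $this->name;
    }
}

final class Entity
{
    // Nullable: an entity may exist without a relationship.
    public ?Relationship $relationship = null;
}

function relationshipName(Entity $entity): string
{
    // Guard against the nullable property before calling a method on it;
    // without this check, getName() would be called on null and fail.
    if ($entity->relationship === null) {
        return '(none)';
    }

    return $entity->relationship->getName();
}
```

The explicit null check is what satisfies the inspection; static analysers can then see that getName() is only reached with a real object.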

I go even further with PHPCS, PHPStan and Psalm, all at once, to break my CI in case I write something that has even a remote possibility of being incorrect. Because legacy code lights up my CI like a Christmas tree, these tools can be configured to apply only to files that you have edited.

I have also had some great experiences with RectorPHP, which can automatically improve things, like moving static calls to constructor dependencies, adding type declarations, etc. There are many great tools in there to give you a refactoring boost.

If you have a problem, there is a good chance that someone solved it years ago, wrote a book and maybe even created a tool for it. Keep exploring and have fun!

The grass could be a lot greener on both sides of the fence

It is the end of 2019, and the invitation to write an article for 24 Days in December is an excellent opportunity to reflect on what has happened this year – and maybe even look back a bit further.

Past

In 1990 I wrote my first line of code on an Atari Portfolio, and at the time, I would have never guessed that one day I would be building and maintaining software for a living.

In 1999 I got paid the first time for building a website with static HTML.

In 2001 I took up an internship in a startup in Berlin, and there I would write my first line of PHP code. In 2007, shortly before dropping out of business school, I started working full-time with PHP in another startup. In the years to follow, I worked for several startups, always with PHP.

In 2012 I found out that PHP developers meet up monthly to give and listen to talks about their experiences at so-called user groups, and I first attended the Berlin PHP user group.

From writing my first line of PHP code in 2001 to attending my first user group in 2012, it took me eleven years to realize that I am part of an actual community! A community of people who do not necessarily work in the same companies, but who share their experiences; people who are eager to learn from and eventually help each other to become better developers!

In the months and years to follow, a lot of things changed for me. Most of all, seeing that there is a community out there, and that it is easy to become a part of it, made me want to become a better developer.

Soon I started following other developers on Twitter. I made my first contributions to open-source software. I attended my first PHP conference. Attending the user groups more or less regularly, I got to know other developers, some of whom introduced me to other people in the PHP community. I took note of books recommended by speakers, and added them to an ever-growing reading list. I overcame my fear of writing tests. Eventually, I started writing tests first. I continued to make contributions to open-source software. I began working for a company with an actual build pipeline and a process that I still admire today. I overcame framework fanboy-ism and got hired over Twitter to work remotely for a company in New York. I attended conferences more regularly. Seeing a lot of developers over and over again, I found it rather easy to get in touch with them – and was surprised that they are very approachable. I became acquainted with more and more developer tools – tools that made my life as a PHP developer a lot easier. I started contributing to these tools, and even published a few small open-source libraries that have users other than me. I got around to writing blog posts, and at the moment, I’m struggling with writing this article here.

Probably none of this would have happened if I had never attended the meetings of the PHP user group – none of it. Up until the moment when I first joined the user group meeting, I was entirely concerned with making things run. By going to these meetings, I had opened a new chapter and began wondering how to make things right.

Present

In the last five years, the focus of my work has shifted from creating to maintaining and modernizing legacy applications. Thanks to excellent tooling and the experience I have gained, dealing with legacy applications is not a hard problem.

What I have come to realize in 2019, but have heard numerous times before, is that people are the hard problem – just as Jerry Weinberg, who passed away in August 2018, put it in The Secrets of Consulting:

No matter how it looks at first, it’s always a people problem.

Setting up a build pipeline, putting developer tools in place, refining a process, making it harder for developers to ship faulty code – this is all excellent.

Working on code, however, has only short-term effects: as soon as the project is over, developers are left alone with a code base they hardly understand. When developers have only learned to follow the rules enforced by automated systems, but have not learned why they have been put in place, these automated systems become an annoyance rather than a crutch.

Working with developers, on the other hand, can have long-term effects: by asking developers questions instead of giving them directions, by letting developers fail and helping them up again, by showing them alternative ways of developing software, and by encouraging them to question the status quo, they will eventually become better versions of themselves.

Future

Perhaps you are working on a legacy code base, with colleagues who do not care so much. Perhaps you are looking for a new job already. Of course, the grass always looks greener on the other side – but is it?

For some of your colleagues, being a software developer is just a 9-to-5 job. Others have already so much on their plate, maybe they really cannot and will not be able to do much more. However, there will always be interested developers who only need a little nudge or a pointer.

You could be the person reaching out. Go ahead and invite them to the next user group.

Why Can’t We All Just Get Along?

In the 1996 movie Mars Attacks, the president of the United States, played by Jack Nicholson, makes an impassioned plea to the Martian invaders.

The outcome is often how I feel when discussing what I do for a living with some members of my local PHP community.

I joined the WordPress community (by choice) after attending my first WordCamp in 2015. In the 15 years since I started my programming career, the closest I’d gotten to an open source community was following the blogs of folks who had written articles I found useful. To me, the people involved in open source were mostly programmers like myself, who spent their spare time only on submitting features and patches to the specific project they were involved in. Thus attending my first open source conference was a huge eye opener.

Here were a bunch of folks who not only contributed to their chosen open source project, but also met regularly to share knowledge and discuss all things around this project. Not only that, but the usual ‘brogrammer’ attitude I’d encountered for much of my journey did not seem to exist here. Everyone was friendly, helpful, and above all, positive. For someone who’d reached a point of feeling like an outsider in his own community, WordCamp was an experience I will never forget. It was at that event that I chose to learn WordPress as a foundation for my future development journey.

As time went by I started looking for other open source communities, specifically related to PHP. Having read the many online (usually negative) opinions of PHP and having had many, many conversations with opinionated programmers in other languages, who ALWAYS expressed how poor my language of choice was, I was looking forward to meeting and having conversations with other developers who used the same languages, frameworks and CMSs I did, and who would share my love of our often misunderstood language.

What I found surprised me. When folks asked me what I did/worked on for a living, and I mentioned WordPress, the general response was the same as when I used to tell Python developers I code in PHP. The general disdain and almost hatred for a CMS that powers a large majority of the online space amazed and astounded me. Just by being associated with WordPress, I felt as if other PHP developers looked down on me and whatever skills I had.

The irony is that WordPress has enabled me, in the last 4 years, to scale up in skill and knowledge at an astronomical rate. This is because, as those WordPress developers who’ve been active in the space since almost day one started delving into more advanced concepts like modern object-oriented programming, automated testing and continuous integration, they shared this knowledge with others. And because there are so many of them, the knowledge out there is abundant. I’ve lost count of how many times I’ve reached out to members of the WordPress community whom I look up to, and they’ve always taken the time to help me understand a difficult concept I was struggling with.

At the end of the day, by choosing PHP we are already taking flak from the so-called ‘real’ programmers of the world. Sure, PHP has come a long way, but it still has the stigma of being something that only ‘noobs’ and ‘kids’ use, even though it’s still one of the top 10 most popular languages on the web. Do we really need to shame each other for the choice we’ve made to use this ‘legacy’ framework, or that popular CMS, or procedural instead of object-oriented, or whatever, because we ‘have seen the light’ and ‘know better’?

Why can’t we all just get along?

So the next time someone is using something you deem ‘less worthy’, maybe instead of deriding them, reach out to them. Find out why they are using what they are using, and if they are experiencing any hurdles (they probably are) and share your own knowledge and experiences with them. If we can put aside our differences, focus on our similarities and work towards sharing our knowledge and experience with each other, regardless of what we work with every day, all of us will be in a better place.

Curiosity is a .. killed the cat

I was 7 years old when I came across the phone number of a local music school and asked my mum when she was going to sign me up. Eighteen years later I am a fully educated classical musician, working as an opera prompter in a local opera house.

As long as I can remember I had two wishes: to become a musician and to learn everything that can be learned.

First wish – done. I’m nailing this “life” thing.

Second wish? Well. I realised that’s not going to happen any time soon. And as if that wasn’t frustrating enough, over the years I have found more and more things I wish I never learned. Curiosity, huh? But that’s another story.

I’m also learning a lot of useful things. Being a loud and uncomfortably direct extrovert in the web development world can be lonely at times. So I’m developing my talking-to-introverts skills.

I can tell you right now: learning PHP is so much easier. If nothing else, error handlers report fewer casualties. And logging all exceptions requires a lot of memory. You start wondering how much memory you should allocate for an average social event. What is the limit? Will I ever learn? So many questions.

Curiosity..

But there are questions sent my way too. Looks like wonderment is mutual. “How difficult was it for you to switch from music to PHP?”

To be honest, I believe that music made learning to code far more intuitive for me than might have been expected. Musicians among you are nodding right now. And there are many musicians who write code.

No, it’s not weird.

Both music and web apps are time-based, event-focused products of the human attempt to communicate with the rest of the world.

In music performance we put a lot of thought into the right amount of expressing oneself, effortless technique and delivering the exact message without explaining how this particular piece of music is created and why.

In building a web page, we put a lot of thought into the right amount of necessary code and functionality, not overloading the server and delivering exactly what is expected on the page without explaining how it works and why.

Once the performance starts, once the web page starts loading, it is of crucial importance that every piece of it occurs in its specific time. Otherwise it can cause errors. Fatal even.

Too general a comparison?

  • Production – Live performance (concert).
  • Staging – General rehearsal (as for real).
  • Local dev environment – Practicing (piece by piece, repeating difficult parts, making connections between sections).
  • Bugs – Obviously bad practice, errors in performance, off key/tune.
  • Company team (design, frontend, backend) – Chamber music.
  • Open source – Orchestra.
  • Code review – Sight-reading.
  • Localisation and internationalization of strings – Lyrics and librettos can and have been localised as well.

Yeah, but specifics..

Visibility of a property:

Types:

I’m sure there’s more but my understanding of PHP is limited here. For someone who wanted to learn everything that can be learned I’m unexpectedly happy that I’ll never fully know PHP. Or understand the point of combinatoriality in music.

I’m just enjoying this ride of endless curiosity, randomly successful conversations with introverts and hugs. Lots of hugs.