Evolving PHP

As of 2022, I see the cost of PHP becoming prohibitive. Here’s why.

PHP continued to evolve in 2022. That’s a good thing. PHP also scored an “own goal” near the end of 2022. This latter concern is not at all obvious. Here’s my view of the situation.

Double-digit bugs

Do you remember the Year 2000 problem? With 2022 PHP, we have some similarities–and differences.

One of the most difficult problems was moving from two-digit years (such as 87 for 1987) to four-digit years (1987). The problem looks trivial, right? But even the largest and fastest computers in the world at the time ran on one-eighth of a megabyte of RAM for operating system, I/O buffers, everything. It was not trivial to find those extra two bytes of storage! Making more space “here” meant something else lost space “there.” We spent months and years on cascading re-designs.

Our measure of success was that as January 1, 2000, rolled across the planet from one time zone to the next… nothing happened.

The situation was a forced upgrade. We had no choice. When people wrote software according to current best practice in the 1960s, 1970s, and 1980s, none of us expected that the same code–literally, the same executable file–would still be running and producing revenue in the 2000s. The books 1984 and 2001: A Space Odyssey, in our minds, still described as distant a future as Star Trek.

“Decades from now” and “centuries from now” were both, in our minds, the same problem category. In 2015, NASA famously searched for a programmer fluent in 60-year-old languages because the Voyager 1 and Voyager 2 spacecraft just kept working. In 2020, IBM scrambled to find or train more COBOL programmers because various U.S. states still relied on mainframes running COBOL to manage their unemployment systems, and the pandemic had overloaded many of those systems.

When functionality must continue

We collectively had a large legacy code base to be sure, but that’s not the point of this analogy. For Y2K, we took years out of our lives for what was essentially a bugfix. There was no new functionality to be gained. The whole point was continuing functionality. Because the installed code base (i.e., nearly all software on the planet) was so large, businesses, government, and software vendors all had compelling interest in keeping that revenue-producing software running past January 1, 2000.

This investment was reluctant. Nobody wanted to invest time or salary in non-features. In my own experience, moving a running production codebase from PHP 5 to PHP 7 encountered similar reluctance. “Upgrade” time was a cost rather than a benefit. It carried the “opportunity cost” of time taken away from developing new features or reacting to business needs.

There was a similar situation with U.S. gas station pumps becoming EMV (“chipped” credit card) compliant. The banks refused to pay for upgrading point-of-sale equipment. Instead, the card issuers implemented a liability shift, assigning the problem to the gas pump owner.

As with Y2K software upgrades, the necessity was the continuing functionality. Customers expected to continue using credit cards for gasoline purchases, as had been possible since the 1920s.

PHP language changes

Nikita Popov opened a discussion nearly three years ago (February 2020):

In recent years, there has been an increasing tension in the PHP community on how to handle backwards-incompatible language changes. The PHP programming language has evolved somewhat haphazardly and exhibits many behaviors that are considered undesirable from a contemporary position.

Fixing these issues benefits development by making behavior more consistent, more predictable and less bugprone. On the other hand, every backwards-incompatible change to the PHP language may require adjustments in hundreds of millions of lines of existing code. This delays the migration to new PHP versions.

The general solution to this problem is to allow different libraries and applications to keep up with language changes at their own pace, while remaining interoperable.

Comments on Popov’s proposal describe a possible 4-7 year window for people migrating to the next major release. Given my experience with Y2K, I see that as tight but feasible.

Aftermath

I then watched, in horror, the discussions of a year ago (November 2021), concerning plans for future PHP releases. Branko Matić, for example, implored:

Give us a break, at least for a year or two. Stop updating and “improving” everything. The developer work is now 50% of time updating and compatibility fixes. So much time is lost for that, globally.

Matić was talking about Juliette Reinders Folmer’s thread concerning PHP 8.2 proposals.

Deprecations are not the problem

Brent, explaining deprecations, discloses the internal developers’ perspective:

Of course, one could ask: are these breaking changes and fancy features really necessary? Do we really need to change internal return types like IteratorAggregate::getIterator(): Traversable, do we really need to disallow dynamic properties?

In my opinion–and it’s shared by the majority of PHP internal developers–yes. We need to keep improving PHP, it needs to grow up further.

The problem is not the deprecations themselves. The problem is the shortened migration timeframe.
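
For readers who have not met them yet, here is a rough sketch of the two deprecations Brent names and the usual ways to deal with each. The class names are hypothetical; the attribute and the return type are the ones PHP 8.1 and 8.2 introduced, as far as I understand them.

```php
<?php
declare(strict_types=1);

// Legacy habit: $obj->title = '...' on a class that never declared $title.
// On PHP 8.2+ that emits "Deprecated: Creation of dynamic property ...".

// Fix 1: declare the property up front.
class Report
{
    public string $title = '';
}

// Fix 2 (stopgap): opt the class back in to dynamic properties.
// Before PHP 8.0 the attribute line is simply parsed as a "#" comment.
#[\AllowDynamicProperties]
class PatchedLegacyReport
{
}

// Since PHP 8.1, leaving off the ": \Traversable" return type below
// triggers a deprecation notice for classes implementing IteratorAggregate.
class ReportLines implements \IteratorAggregate
{
    private array $lines;

    public function __construct(array $lines)
    {
        $this->lines = $lines;
    }

    public function getIterator(): \Traversable
    {
        return new \ArrayIterator($this->lines);
    }
}

$report = new Report();
$report->title = 'Overnight totals';

$patched = new PatchedLegacyReport();
$patched->title = 'Overnight totals'; // no deprecation notice on PHP 8.2+

foreach (new ReportLines(['a', 'b']) as $line) {
    echo $line, PHP_EOL;
}
```

Neither fix is hard for one class. The cost comes from repeating it across every class in a ten-year-old code base.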

With Y2K, vendors and compiler writers were answerable to the installed customer base. Operating systems and database engines required Y2K fixes, as did the banks who were running most of the financial transactions around the planet.

PHP, on the other hand, is free and open source software. The PHP internal developers, an extremely talented group of people and mostly volunteers donating their time, are not generally answerable to PHP’s installed customer base.

Where is the “own goal”? It’s in the difference between that vision of 4-7 years for each migration path, and the reality of 1-2 years at most.

As Matić explains, the problem is not the pace of new features or even of deprecations. It’s the narrow window of time allowed for the forced upgrades.

The fundamental shift

Consider our own tiny little team, a probably-typical PHP shop with 3-4 people doing PHP. We’ve been developing PHP software full time for the past ten years or so.

As you can well imagine, once the code for a certain overnight process, or a certain report, works, there’s no need (or reason) to touch that code unless a business need changes. That code, developed once, continues to run in production. This situation is quite similar to the legacy code bases leading up to Y2K.

Our legacy code looks very PHP 4-ish, or at least very PHP 5.2-ish. That is, it was developed according to the way things were done back then.

Is there a difference? Yes, indeed! You will recall that, with the introduction of PHP 5.0, one of the “very big deals” was Object-Oriented Programming. The PHP Certification Exam even had Design Pattern questions.

I said at the time that, with PHP 5, in my view PHP had now become a “real” programming language. PHP, in my view, became a “mature” language with PHP 7. With PHP 8, PHP became… well… something else. What happened?

Up through PHP 7, we could keep our ten years’ worth of legacy code base and patch it up to continue running. It was not that difficult to change mysql_*() calls to mysqli_*() calls, for example. Big fat god object arrays still worked fine. “Loosey-goosey” sloppy coding continued to work. Duck typing worked.

When writing new code for new requirements or functionality, I strongly favor strong typing and automated tests. I can be obnoxious about it. I have a solid supply of “I told you so” and “this is why.”

But the thing is, our loosey-goosey code can’t make the jump to modern PHP. Most would argue that it shouldn’t make the trip; it needs a rewrite.
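
One small, concrete illustration of a “loosey-goosey” idiom that behaves differently on PHP 8. This is only a sketch with a hypothetical variable, not code from our own code base; the two behavior changes are the PHP 8.0 ones for number-to-string comparison and arithmetic on non-numeric strings.

```php
<?php

// Hypothetical legacy idiom: an unvalidated form field used as a number.
$quantity = 'abc';

// PHP 7: 'abc' == 0 is true, because 'abc' is converted to the number 0.
// PHP 8: 'abc' == 0 is false, because 0 is compared as the string '0'.
$looksEmpty = ($quantity == 0);

// PHP 7: 'abc' * 2 evaluates to 0 and emits a warning.
// PHP 8: 'abc' * 2 throws a TypeError.
try {
    $total = $quantity * 2;
    $outcome = 'computed ' . $total;
} catch (\TypeError $e) {
    $outcome = 'TypeError';
}

printf(
    "PHP %s: looksEmpty=%s, outcome=%s\n",
    PHP_VERSION,
    var_export($looksEmpty, true),
    $outcome
);
```

The same three lines of business logic take a different branch, or stop with an exception, depending on the major version. Code written around the old behavior has to be found and rethought, not just patched.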

Wait a minute! Let’s think about this. In November 2022, Brian Jackson wrote “Is PHP Dead? No! At Least Not According to PHP Usage Statistics”:

According to W3Techs’ data, PHP is used by 78.9% of all websites with a known server-side programming language.

PHP, as you and I know (that’s why we’re here!), built the modern World Wide Web. That’s why half the planet still runs on PHP, whether it’s a dead language or not.

These days, with PHP dying off (allegedly), the question becomes, “what do you call a good PHP programmer?”

And, these days, the answer remains, “employed.”

During the 1970s, 1980s, and 1990s, the banking and business systems of the world ran on COBOL, and for crypto, FORTRAN. Old code continued to run. Vendors made darn sure this was true, so they still made money. For example, IBM’s System/360 was announced in 1964, yet:

Application-level compatibility (with some restrictions) for System/360 software is maintained to the present day with the System z mainframe servers.

IBM supported binary-level compatibility for decades. Literally the same compiled modules ran for decades. IBM existed to make money.

But today, we run on free software. There’s no vendor. Those producing the free software, on their own time as volunteers in most cases, tell us, “you need to keep up.”

The result is that the software that built the internet won’t run anymore. The “old” PHP is no more. The “new” PHP not only encourages better-written code, it requires it. The loosey-goosey idioms no longer apply.

What’s the barrier? Strangely enough, the situation is similar to Y2K. Y2K required re-thinks, re-designs, rewrites. Even with minor compatibility changes, our PHP 5.2-era codebase still uses PHP 5.2-era techniques. The code is loosey-goosey because that’s how it was done back then.

There was a low barrier to entry with PHP. One could put a site up in five minutes. We did just that! Live edits, in production, were easy and quick. So we did!

A computer scientist would have done things differently and more reliably. But it wasn’t computer scientists who built the World Wide Web with PHP. In fact, to the best of my knowledge, for 5-10 more years, no 4-year computer science program even taught PHP as a programming language. We learned PHP from blogs, Stack Overflow, and eventually boot camps.

That’s why it’s meaningless to try to remain compatible with PHP 5.2 “best practices.” None of us had any idea what “best practices” were!

If we choose to remain compatible–and we chose otherwise, but play along–we need to remain compatible with our real-world legacy code bases. But why? Is this reasonable? Yes… and no. Let’s look at my own employer.

Our team of 3-4 developers, and our predecessors, built that legacy code base over the past 10+ years. It was not written or updated by computer scientists. This situation is, in my experience, absolutely typical.

We’re forced into a rewrite, or something very like a rewrite, while at the same time remaining in production and producing new features to deal with rapid growth. It’s a deadly combination. Certainly we all have technical debt, and we all need to consider rewrites.

Sam Newman, author of Building Microservices, explains:

The need to change our systems to deal with scale isn’t a sign of failure. It is a sign of success.

Running the numbers

Let’s do the math. We have the software developed by 3-4 people over the course of 10+ years. Can we do necessary PHP codebase upgrades over a period of 1-2 years? Yes, we should be able to. Can we do a complete rewrite with modern coding practices, in 1-2 years, of what took ten-plus years to write in the first place? That does not sound so likely, does it?

Meanwhile, what do we do about upcoming business needs, new features, and so on, while 100% of our development time is already engaged in that rewrite? We tried to handle both sets of needs… it didn’t go well.

With a 1-2 year time limit, the numbers show it just can’t be done. Remember, it’s not the deprecations–it’s that PHP 8 is no longer PHP.
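
Here is that back-of-the-envelope arithmetic as a sketch. It assumes person-years are a fair, if crude, measure of effort, and it takes “10+ years” as exactly 10; the function name is hypothetical.

```php
<?php

/**
 * Person-years available inside the migration window, as a share of the
 * person-years that went into writing the code base in the first place.
 */
function rewriteBudgetShare(int $developers, int $yearsWritten, int $windowYears): float
{
    $originalEffort  = $developers * $yearsWritten; // person-years already spent
    $availableEffort = $developers * $windowYears;  // person-years in the window

    return $availableEffort / $originalEffort;
}

$yearsWritten = 10; // "10+" in the text, so treat this as a lower bound

foreach ([3, 4] as $developers) {
    foreach ([1, 2] as $windowYears) {
        printf(
            "%d developers, %d-year window: %d person-years available vs %d spent (%.0f%%)\n",
            $developers,
            $windowYears,
            $developers * $windowYears,
            $developers * $yearsWritten,
            rewriteBudgetShare($developers, $yearsWritten, $windowYears) * 100
        );
    }
}
```

Whatever the team size, the window offers between a tenth and a fifth of the effort that went into the original code, and that is before any of that time goes to new features. A rewrite may well cost less than the original did, but it would have to cost five to ten times less.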

PHP 8 has, in my view, mandated that the way we design our PHP software must change for the better. How much time do we have to effect that change in our legacy code bases? On November 28, 2022, Official PHP achieved an “own goal”:

As of today, PHP 7.4, and with that PHP 7 is no longer supported.

The W3Techs report, as of December 2, 2022, states:

PHP is used by 77.5% of all the websites whose server-side programming language we know.

Their breakdown by PHP version is:

  • PHP 4, 0.2%
  • PHP 5, 22.8%
  • PHP 7, 70.4%
  • PHP 8, 6.7%

I believe it’s a remarkable achievement that 70% of installations made it to PHP 7. Meanwhile, though, we close out 2022 with the knowledge that 72.4% of the world’s websites (whose server-side language is known) have been abandoned by Official PHP. This sounds rather like when the credit card issuers created a “liability shift,” abandoning their own guarantees, shifting fraud costs onto the merchant.
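
In case the 72.4% figure looks like it came from nowhere, here is the arithmetic, using only the W3Techs numbers quoted above. The variable names are mine.

```php
<?php

// W3Techs figures as quoted above (December 2, 2022).
$phpShareOfKnownSites = 77.5; // percent of sites with a known server-side language
$versionShare = [
    4 => 0.2,
    5 => 22.8,
    7 => 70.4,
    8 => 6.7,
]; // percent of PHP sites, by major version, as published

// With PHP 7.4 out of support, only the PHP 8 branches remain supported.
$unsupportedShareOfPhp = $versionShare[4] + $versionShare[5] + $versionShare[7];

// Scale from "share of PHP sites" to "share of all sites with a known language".
$unsupportedShareOfKnownSites = $phpShareOfKnownSites * $unsupportedShareOfPhp / 100;

printf("%.1f%% of PHP sites run PHP 4, 5, or 7\n", $unsupportedShareOfPhp);
// prints "93.4% of PHP sites run PHP 4, 5, or 7"
printf("%.1f%% of sites with a known server-side language\n", $unsupportedShareOfKnownSites);
// prints "72.4% of sites with a known server-side language"
```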

What’s a reasonable migration path? Martin Fowler describes the Strangler Fig Application approach:

An alternative route is to gradually create a new system around the edges of the old, letting it grow slowly over several years until the old system is strangled.

If we had the luxury of staying on a “long term support” version of PHP 5 or even PHP 7, then we could rewrite one feature at a time, over the course of months and years.
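
To make the strangler fig idea concrete, here is a minimal sketch of the kind of dispatcher that sits in front of both systems. Every name in it is hypothetical, and a real one would more likely live in a front controller or a reverse proxy than in a single class.

```php
<?php
declare(strict_types=1);

/**
 * Minimal strangler-fig dispatcher (illustrative only).
 * Paths that have been rewritten go to new handlers; everything else
 * falls through to the untouched legacy entry point.
 */
final class StranglerRouter
{
    /** @var array<string, callable> handlers for features already rewritten */
    private array $migrated = [];

    /** @var callable fallback into the legacy code base */
    private $legacy;

    public function __construct(callable $legacy)
    {
        $this->legacy = $legacy;
    }

    public function migrate(string $path, callable $handler): void
    {
        $this->migrated[$path] = $handler;
    }

    public function handle(string $path): string
    {
        if (isset($this->migrated[$path])) {
            return ($this->migrated[$path])();
        }

        return ($this->legacy)($path);
    }
}

// Stand-in for the old code base; in practice this might include a legacy script.
$router = new StranglerRouter(function (string $path): string {
    return "legacy:$path";
});

// One feature rewritten so far.
$router->migrate('/reports/overnight', function (): string {
    return 'new:overnight report';
});

echo $router->handle('/reports/overnight'), PHP_EOL; // prints "new:overnight report"
echo $router->handle('/invoices'), PHP_EOL;          // prints "legacy:/invoices"
```

Each rewritten feature is one more call to migrate(), and the legacy fallback shrinks over time. The catch is that the legacy side still has to keep running somewhere for as long as that takes.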

With the close of 2022, those days are done.