Software Architecture Software Development Web Development

Software Architecture and the YAGNI principle

I consider myself a developer with a good appreciation for a well-designed and carefully implemented architecture. I like to do things the “right” way whenever possible, instead of the quick way. I try my best to make everything I write modular, with minimal dependencies, and adaptable to any platform. Over the years I’ve been programming, however, I’ve learned that while that is a noble goal, there are certain occasions where the effort to conform to the aforementioned goal is not actually worth it. The key driving principle behind what I am referring to is the “YAGNI principle.”

YAGNI, meaning “You Aren’t Gonna Need It,” is the idea that you should never add a feature to your code base that you aren’t certain that you are going to need soon, or even eventually. I think many software developers who are relatively new to the craft, and who begin to learn about all of the architectural design patterns that can be used to keep your code base from turning into something that keeps you awake at night, sweating with fear, there is a tendency to begin developing a fear of cutting any corners ever. There is a general sense that making things more “smart” is always worth the effort, regardless of the cost. But YAGNI says that is not so. This was a principle that I unfortunately had to learn the hard way about 3 years ago.

I was working on an (unfortunately closed source) web-based asset management system software for my previous employer. Believe it or not, this was the first piece of professional software that I worked on (as in, the first that I was getting paid to make). I started work on it in mid 2014, and at that time I was making it with a fairly difficult-to-maintain procedural PHP code base, with no libraries or frameworks whatsoever. No functionality was being delegated to JavaScript, it was just going to be PHP doing all the work and sending a big chunk of HTML. In other words, it was an atrocious mountain of spaghetti at that point.

During that year, I worked on a lot of other projects, and began to learn quite a bit more about the PHP language, and above all, I decided that I was high time I learned how to use Object Oriented Programming to make my code a lot more maintainable, “DRY” if you will. So I dug into some tutorials and messed around with it, and got a general sense of how it worked. I also started learning about basic template systems, AJAX, and RESTful principles.

Some time in 2015, I decided to take all of this new-found knowledge back to the old asset management system code base, and try to apply it everywhere I could. And so I spent a large amount of time modernizing the code base and getting it up to par with what I thought was excellent design. And most of the ideas I had stuck around through the remaining years, more or less, so I know I wasn’t completely misguided.

There was one idea that I had, however, that was the biggest waste of time, and one of my absolute biggest regrets in the entire project. This is the purpose of this postmortem: to learn from this mistake that cost countless hours of valuable development time.

I decided that it would be cool to make this piece of software as portable as possible, so it was easy to install on any system my employer would ever use. I followed a lot of the patterns that I’d seen in other self-hosted web applications. Despite the fact that all of our apps only used MySQL as a database server, I wanted to make it possible to run it on MySQL, PostgreSQL, and SQLite. I also wanted to make this feature a library that I could use in other projects, and other people could use in theirs, so I did release that code under the MIT license on GitHub.

That idea, in itself, is not always bad. There are truly some occasions where doing such a thing would actually be valuable. They are all relational databases, so despite some syntactical and feature differences, they operate on the same core concepts. The way I decided to implement this cross-sql-platform support was by abstracting the entire concept of a relational database into PHP, and having that PHP act as a code generator writing SQL for prepared statements and submitting the relevant data. Essentially the idea was that instead of writing SQL queries, I would have database objects with methods allowing me to modify the structure and data in the database. The library would have to be pretty intelligent in figuring out how to safely and efficiently write a SQL query to perform the requested operation, so there were certainly security and performance risks associated with it.

I spent months of time adding each feature and testing them one by one as I started to integrate it with the existing code base. For all of the features that I actually integrated and tested, it actually worked quite well. The problem, though, was the incredible complexity of the system. SQL is implemented as a full programming language because of that complexity, and somehow I didn’t grasp that at the beginning of the project.

At some point, well into the project and not making the kind of traction that I wanted to make, I kept the library code on GitHub as before, but I scrapped integrating it into the asset management system, instead opting for another database abstraction design that I had learned and found to be much more reasonable for a project that is unlikely to ever be moved from the RDBMS that it started on (but is still flexible enough if it does). That approach was to create a single static class representing the database, with very basic abstraction methods as private, utilized by the public methods representing more high-level operations like inserting a new item, or editing an existing department. This means that all of the database-related code is in this single file, so it’s easy to change it whenever needed, but I can safely assume that any operation can be written specifically for MySQL. If the RDBMS needs to be changed one day, then the queries in this file would need to be rewritten, but the rest of the code base would not need to be touched, as the interface would remain entirely the same.

Most of the benefits of the abstraction library, at a fraction of the complexity. This design took far fewer lines of code to implement, and took a much shorter time. From the ground up, all features implemented in this static class, took less than half the time that I spent on a still-incomplete full database abstraction.

So the lesson to be learned here is what? Don’t implement a full database abstraction because it’s a waste of time? Absolutely not. Like I alluded to before, there are certainly some projects that would benefit from such a thing, and in those cases, you’ll just have to suck it up and trudge through thousands of lines and dozens of classes to make it work. But the question you should be asking before you go off on such a costly mission is an honest estimate of how often you expect the code base to switch its RDBMS. Maybe you’re marketing your software to a lot of varied users who may not want to install your database server of choice. In that case, you might want to consider it. But in my case, simply asking myself that question, the answer would have clearly been “probably never, maybe once in 5 to 10 years”. I should not have done it. Frankly, the cost of rewriting the queries in the static class MAYBE once would have been much less than the amount of time that I spent writing that database abstraction library.

So remember the YAGNI principle during the design phase of your project. Whenever you see something potentially complex getting added to the design, make sure to ask yourself, “Are you really gonna need it?” and if the answer isn’t a certain “yes,” then just remember that it’s okay to rewrite things later on when you have more information. Refactoring your code later just might be a lot cheaper and better than worrying about the minute details so early on. The finished product is rarely the same as the finished concept anyway, so be an adaptable programmer first, and write an adaptable code base later.