February 2018 – Alex "Danger" Barber

Recently, I’ve read a lot of hate toward JavaScript (which is easy to explain, see my previous post titled “Programming Language Wars“), and specifically there are a number of people who would request that browser vendors and the web standards organizations implement additional/alternative programming languages. This is a very naive request that I believe stems from lack of respect for standardization and the challenges facing browser development.

First of all, “JavaScript”, AKA ECMAScript AKA ECMA-262 AKA ISO/IEC 16262, exists primarily as a standardization, and that’s the important part here. Despite its origins, JavaScript has been around for decades, and likely will persist as such for decades, maybe even centuries to come. It was selected to be the core of the programming side of the standardized web. This standardization has been of particular benefit to making the web more or less completely compatible between a variety of browsers. JavaScript is defined in clearly written specifications, and as a result, each implementation, from Firefox’s Spidermonkey, to Chrome’s V8, to the dozens of other complete implementations, can be implemented separately and therefore be mutually independent from one another, while at the same time each supporting the same code so long as the developers making these products follow the standards correctly. This also affords freedom for competition to create distinct implementations instead of relying on a centralized one, which cultivates innovation.

There are many different languages that have a similar standardization and specification process, but this is not the only important characteristic of JavaScript. As a language designed for the web, it cannot make any presumptions of the architecture on which it will run. This means it would need to be provided as a scripting language (which JavaScript is), or a language of virtual machine bytecode (which WebAssembly is, which I’ll get into later). In addition to this, the language is supposed to be entirely forward compatible, so that code written a long time ago is capable of running on newer engines without modification. On some languages, this requirement can be, and is, safely ignored. Allowing things to be routinely deprecated and changed breaks old code. Often in these cases, the solution is simply to run the code on an outdated interpreter. This is not an option for browser vendors, however, whose releases already push several hundred megabytes in size, to say nothing of the nightmare of deciding which versions are still to be shipped, or the support nightmare of requiring end-users to manually choose their interpreter versions. These characteristics rule out a number of other possible languages.

Yet another issue with implementing other languages is the prevalence of JavaScript. Whether or not you like it, which is really not of concern here, JavaScript is arguably the most popular programming language in the world right now. Its forgiving nature cultivates ease of use for the beginner, largely due to being untyped and having automatic garbage collection. Additionally, the massive amount of development done due to the importance of the web has made JavaScript very performant. It also has a very C-like syntax, which is very popular and familiar to many developers.

Some people are hasty to criticize a language or library for catering to the needs of the less-experienced. These people usually complain about the loss of quality by making programming more accessible. But as is usually the case, ease of use does not preclude advanced functionality. JavaScript is a very mature language, and as such, it has a large user base that demands access to more advanced features and the ability to optimize performance more than an entry-level user requires.

And finally, and in my opinion, the biggest reason against implementing new languages for the browser is maintainability. As I write this, the Chromium JavaScript implementation, V8, has 1350 open bug reports. This is the number for a single programming language. If the web standards were to mandate the addition of another language, this would have to be implemented in yet another system of roughly equal complexity, and this would cripple the progress of development teams that even now are hard at work steadily advancing the single language implementation they have to maintain already.

All of these arguments aside, however, I fully understand the desire for developers to work in a language that suits their own style. I know the struggle of being forced to work with a programming language that I don’t like. So if I believe that, why do I still deny that other languages should be implemented in the web standards? First of all, I believe the benefit is greatly outweighed by the detrimental effect it would have on the speed of development for the open web standards. Second of all, if new languages are implemented, sooner or later, even those languages will come under fire for essentially the same reasons.

What we need isn’t to explicitly modify those standards to include languages, but to provide a common foundation on which new languages can be founded without requiring modification to the central standard itself. The simple answer to this is an intermediate language or virtual machine.

So far, JavaScript itself has acted as this intermediate language. Languages such as TypeScript, CoffeeScript, and Dart transpile into simplified JavaScript with boilerplate to implement missing features. Emscripten has even allowed the transpiling of languages such as C or C++ into JavaScript.

More recently, however, there is a new standard known as WebAssembly, which I believe solves this problem even better. One of the main purposes of WebAssembly was to provide an avenue for higher performance computing in a browser environment. One of the lowest levels of abstraction of an execution environment is assembly language. It encodes the primitive computational operations of the hardware itself, and has very few, if any, high level features. This lower level of abstraction also means that instructions written in it have little to no overhead. Due to the requirement for web languages to be architecture-neutral and to have some level of access control security, this had to be implemented in a virtual machine architecture that is conceptually close to a CPU architecture, but with a universal specification, and the ability to restrict the execution context to resources that it should have access to. This design is remarkably similar to the design of the Java virtual machine.

In the future, I suspect that many more programming languages will have either integrated or third-party backends that allow software to be compiled to run in this environment. This will require a fair amount of work to get this completed for each language, but I think it is the best way forward. With this method, any language could become a web-compatible language in the future, without needing to be integrated into the web standards, and without requiring extra work by the browser developers. By funneling all languages into the same intermediate language, it also ensures compatibility between them. And finally, because this method requires a compilation step, it means that if the programming language is updated and breaking changes implemented, previously compiled code will continue to function as normal until recompiled. The breaking changes may therefore still impact the developer, but not the end-user.

So to sum up: NO, new languages should not be integrated into web standards, but YES, developers of other languages should utilize transpiling and WebAssembly to make their languages available for web development. Other web-friendly programming languages should not be competing with JavaScript, they should rather see JavaScript and WebAssembly as their natural choice for an intermediate compilation target.

I consider myself a developer with a good appreciation for a well-designed and carefully implemented architecture. I like to do things the “right” way whenever possible, instead of the quick way. I try my best to make everything I write modular, with minimal dependencies, and adaptable to any platform. Over the years I’ve been programming, however, I’ve learned that while that is a noble goal, there are certain occasions where the effort to conform to the aforementioned goal is not actually worth it. The key driving principle behind what I am referring to is the “YAGNI principle.”

YAGNI, meaning “You Aren’t Gonna Need It,” is the idea that you should never add a feature to your code base that you aren’t certain that you are going to need soon, or even eventually. I think many software developers who are relatively new to the craft, and who begin to learn about all of the architectural design patterns that can be used to keep your code base from turning into something that keeps you awake at night, sweating with fear, there is a tendency to begin developing a fear of cutting any corners ever. There is a general sense that making things more “smart” is always worth the effort, regardless of the cost. But YAGNI says that is not so. This was a principle that I unfortunately had to learn the hard way about 3 years ago.

I was working on an (unfortunately closed source) web-based asset management system software for my previous employer. Believe it or not, this was the first piece of professional software that I worked on (as in, the first that I was getting paid to make). I started work on it in mid 2014, and at that time I was making it with a fairly difficult-to-maintain procedural PHP code base, with no libraries or frameworks whatsoever. No functionality was being delegated to JavaScript, it was just going to be PHP doing all the work and sending a big chunk of HTML. In other words, it was an atrocious mountain of spaghetti at that point.

During that year, I worked on a lot of other projects, and began to learn quite a bit more about the PHP language, and above all, I decided that I was high time I learned how to use Object Oriented Programming to make my code a lot more maintainable, “DRY” if you will. So I dug into some tutorials and messed around with it, and got a general sense of how it worked. I also started learning about basic template systems, AJAX, and RESTful principles.

Some time in 2015, I decided to take all of this new-found knowledge back to the old asset management system code base, and try to apply it everywhere I could. And so I spent a large amount of time modernizing the code base and getting it up to par with what I thought was excellent design. And most of the ideas I had stuck around through the remaining years, more or less, so I know I wasn’t completely misguided.

There was one idea that I had, however, that was the biggest waste of time, and one of my absolute biggest regrets in the entire project. This is the purpose of this postmortem: to learn from this mistake that cost countless hours of valuable development time.

I decided that it would be cool to make this piece of software as portable as possible, so it was easy to install on any system my employer would ever use. I followed a lot of the patterns that I’d seen in other self-hosted web applications. Despite the fact that all of our apps only used MySQL as a database server, I wanted to make it possible to run it on MySQL, PostgreSQL, and SQLite. I also wanted to make this feature a library that I could use in other projects, and other people could use in theirs, so I did release that code under the MIT license on GitHub.

That idea, in itself, is not always bad. There are truly some occasions where doing such a thing would actually be valuable. They are all relational databases, so despite some syntactical and feature differences, they operate on the same core concepts. The way I decided to implement this cross-sql-platform support was by abstracting the entire concept of a relational database into PHP, and having that PHP act as a code generator writing SQL for prepared statements and submitting the relevant data. Essentially the idea was that instead of writing SQL queries, I would have database objects with methods allowing me to modify the structure and data in the database. The library would have to be pretty intelligent in figuring out how to safely and efficiently write a SQL query to perform the requested operation, so there were certainly security and performance risks associated with it.

I spent months of time adding each feature and testing them one by one as I started to integrate it with the existing code base. For all of the features that I actually integrated and tested, it actually worked quite well. The problem, though, was the incredible complexity of the system. SQL is implemented as a full programming language because of that complexity, and somehow I didn’t grasp that at the beginning of the project.

At some point, well into the project and not making the kind of traction that I wanted to make, I kept the library code on GitHub as before, but I scrapped integrating it into the asset management system, instead opting for another database abstraction design that I had learned and found to be much more reasonable for a project that is unlikely to ever be moved from the RDBMS that it started on (but is still flexible enough if it does). That approach was to create a single static class representing the database, with very basic abstraction methods as private, utilized by the public methods representing more high-level operations like inserting a new item, or editing an existing department. This means that all of the database-related code is in this single file, so it’s easy to change it whenever needed, but I can safely assume that any operation can be written specifically for MySQL. If the RDBMS needs to be changed one day, then the queries in this file would need to be rewritten, but the rest of the code base would not need to be touched, as the interface would remain entirely the same.

Most of the benefits of the abstraction library, at a fraction of the complexity. This design took far fewer lines of code to implement, and took a much shorter time. From the ground up, all features implemented in this static class, took less than half the time that I spent on a still-incomplete full database abstraction.

So the lesson to be learned here is what? Don’t implement a full database abstraction because it’s a waste of time? Absolutely not. Like I alluded to before, there are certainly some projects that would benefit from such a thing, and in those cases, you’ll just have to suck it up and trudge through thousands of lines and dozens of classes to make it work. But the question you should be asking before you go off on such a costly mission is an honest estimate of how often you expect the code base to switch its RDBMS. Maybe you’re marketing your software to a lot of varied users who may not want to install your database server of choice. In that case, you might want to consider it. But in my case, simply asking myself that question, the answer would have clearly been “probably never, maybe once in 5 to 10 years”. I should not have done it. Frankly, the cost of rewriting the queries in the static class MAYBE once would have been much less than the amount of time that I spent writing that database abstraction library.

So remember the YAGNI principle during the design phase of your project. Whenever you see something potentially complex getting added to the design, make sure to ask yourself, “Are you really gonna need it?” and if the answer isn’t a certain “yes,” then just remember that it’s okay to rewrite things later on when you have more information. Refactoring your code later just might be a lot cheaper and better than worrying about the minute details so early on. The finished product is rarely the same as the finished concept anyway, so be an adaptable programmer first, and write an adaptable code base later.