Saturday, October 28, 2017

Freedom

A big part of programming culture is the illusion of freedom. We pride ourselves on how we can tackle modern problems by creatively building new and innovative stuff. When we build for other programmers, we offer a mass of possible options because we don’t want to be the ones who restrict their freedoms.


We generally believe that absolutely every piece of code is malleable and that releasing code quickly is preferable over stewing on it for a long time. In the last decade, we have also added in that coding should be fun, as a fast-paced, dynamic response to the ever-changing and unknowable needs of the users. Although most of us are aware that the full range of programming knowledge far exceeds the capacity of our brains, we still express overconfidence in our abilities while contradictorily treating a lot of the underlying mechanics as magic.


Freedom is one of our deepest base principles. It appears to have its origin in the accelerated growth of the industry and the fact that every 5 to 10 years some brand new, fresh, underlying technology springs to life. So the past, most programmers believe, has very little knowledge to help with these new emerging technologies.


That, of course, is what we are told to believe and it has been passed down from generations of earlier programmers. Rather than fading, as the industry ages, it has been strengthening over the last few decades.


However, if we examine programming closely, it does not match that delusion. The freedoms that we proclaim and try so hard to protect are the very freedoms that keep us from moving forward. Although we’ve seen countless attempts to reverse this slide, as the industry grows most new generations of programmers cling to these beliefs and we actively ignore or push out those that don’t.


Quite obviously, if we want a program to even be mildly ‘readable’, we don’t actually have the freedom to name the variables anything we want. They are bound to the domain; being inconsistent or misusing terminology breeds confusion. It's really quite evident that if a piece of data flows through a single code base, that it should always have the same name everywhere, but each individual programmer doesn’t want to spend their time to find the name of existing things, so they call it whatever they want. Any additional programmers then have to cope with multiple shifting names for the exact same thing, thus making it hard to quickly understand the flow.


So there isn’t really any freedom to name variables ‘whatever’. If the code is going to be readable then names need to be correct, and in a large system, those names may have been picked years or even decades earlier. We’ve always talked a lot about following standards and conventions, but even cursory inspections of most large code bases show that it usually fails in practice.


We love the freedom to structure the code in any way we choose. To creatively come up with a new twist on structuring the solution. But that only really exists at the beginning of a large project. In order to organize and build most systems, as well as handle errors properly, there aren’t really many degrees of freedom. There needs to be a high-level structure that binds similar code together in an organized way. If that doesn’t exist, then its just a random pile of loosely associated code, with some ugly spaghetti interdependencies. Freely tossing code anywhere will just devalue it, often to the point of hopelessness. Putting it with other, similar pieces of code will make it easy to find and correct or enhance later.


We have a similar issue with error handling. It is treated as the last part of the effort, to be sloppily bolted on top. The focus is just on getting the main logic flow to run. But the constraints of adding proper error handling have a deep and meaningful impact on the structure of the underlying code. To handle errors properly, you can’t just keep patching the code at whatever level, whenever you discover a new bug. There needs to be a philosophy or structure, that is thought out in advance of writing the code. Without that, the error handling is just arbitrary, fragile and prone to not working when it is most needed. There is little freedom in properly handling errors, the underlying technology, the language and the constraints of the domain mostly dictate how the code must be structured if it is going to be reliable.


With all underlying technologies, they have their own inherent strengths and weaknesses. If you build on top of badly applied or weak components, they set the maximum quality of your system. If you, for instance, just toss some randomish denormalized tables into a jumble in an RDBMS, no matter how nice the code that sits on top, the system is basically an untrustworthy, unfixable, mess. If you fix the schema problems, all of the other code has to change. Everything. And it will cost at least twice as long to understand and correct it, as it would to just throw it away and start over. The value of code that ‘should’ be thrown away is near zero or can even be negative.


So we really don’t have the freedom to just pick any old technology and to use it in any old, creative, way we choose. That these technologies exist and can or should be used, bounds their usage and ultimately the results as well. Not every tech stack is usable for every system and picking one imposes severe, mostly unchangeable limitations. If you try to build a system for millions of users on top of a framework that can only really handle hundreds of them, it will fail. Reliably.


It might seem like we have the freedom to craft new algorithms, but even here it is exceptionally limited as well. If for example, you decide to roll your own algorithm from scratch, you are basically taking a huge step backward, maybe decades. Any existing implementations may not be perfect, but generally, the authors have based their work on previous work, so that knowledge percolates throughout the code. If they didn’t do that research then you might be able to do better, but you’ll still need to at least know and understand more than the original author. In an oversimplified example, one can read up on sorting algorithms and implement a new variant or just rely on the one in their language’s library. Trying to work it out from first principals, however, will take years.


To get something reasonable, there isn’t much latitude, thus no freedom. The programmers that have specialized in implementing algorithms have built up their skills by having deep knowledge of possibly hundreds of algorithms. To keep up, you have to find that knowledge, absorb it and then be able to practically apply it in the specific circumstance. Most programmers aren’t willing to do enough legwork, thus what they create is crude. At least one step backward. So, practically they aren’t free to craft their own versions (although they are chained by time and their own capabilities).


When building for other programmers it is fun to add in a whole boatload of cool options for them to play with, but that is only really useful if changing any subset of options is reliable. If the bulk of the permutations have not been tested and changing them is very likely to trigger bugs, then the options themselves are useless. Just wasted make-work. Software should always default to the most reasonable, expected options and the code that is the most useful fully encapsulates the underlying processing.


So, options are not really that useful, in most cases. In fact, in the name of sanity, most projects should mostly build on the closest deployments to the default. Reconfiguring the stuff to be extra fancy increases the odds of running into really bad problems later. You can’t trust that the underlying technology will hold any invariants over time, so you just can’t rely on them and it should be a very big and mostly unwanted decision to make any changes away from the default. Unfortunately, the freedom to fiddle with the configurations is tied to causing seriously bad long-term problems. Experience makes one extraordinary suspect of moving away from the default unless your back is up against the wall.


As far as interfaces go, the best ones are the ones that the user expects. A stupidly clever new alternative on a date selection widget, for example, is unlikely to be appreciated by the users, even if it does add some new little novel capability. That break from expectations generally does not offset a small increase in functionality. Thus, most of the time, the best interface is the one that is most consistent with the behaviors of the other software that the user uses. People want to get their work done, not to be confused by some new ‘odd’ way of doing things. In time, they do adjust, but for them to actually ‘like’ these new features, they have to be comfortably close to the norm. Thus, for most systems, most of the time, the user’s expectations trump almost all interface design freedoms. You can build something innovative, but don’t be surprised if the users complain and gradually over time they push hard to bring it back to the norm. Only at the beginning of some new technological shift is there any real freedom, and these are also really constrained by previous technologies and by human-based design issues.


If we dig deeply into software development, it becomes clear that it isn’t really as free as we like to believe. Most experienced programmers who have worked on the same code bases for years have seen that over time the development tends to be refined, and it is driven back towards ‘best practices’ or at least established practices. Believing there is freedom and choosing some eclectic means of handling the problems is usually perceived as ‘technical debt’. So ironically, we write quite often about these constraints, but then act and proceed like they do not exist. When we pursue false freedoms, we negatively affect the results of the work. That is, if you randomly name all of the variables in a system, you are just shooting yourself in the foot. It is a self-inflicted injury.


Instead, we need to accept that for the bulk of our code, we aren’t free to creatively whack it out however we please. If we want good, professional, quality code, we have to follow fairly strict guidelines, and do plenty of research, in order to ensure that result.


Yes, that does indeed making coding boring. And most of the code that programmers write should be boring. Coding is not and never will be the exciting creative part of building large systems. It’s usually just the hard, tedious, pedantic part of bringing a software system to life. It’s the ‘working’ part of the job.


But that is not to say that building software is boring. Rather, it is that the dynamic, creative parts of the process come from exploring and analyzing the domain, and from working around those boundaries in the design. Design and analysis, for most people, are the interesting parts. We intrinsically know this, because we’ve so often seen attempts to inject them directly into coding. Collectively, most programmers want all three parts to remain blurred together so that they don’t just end up as coding monkeys. My take is that we desperately need to separate them, but software developers should be trained and able to span all three. That is, you might start out coding from boring specs, learn enough to handle properly designing stuff and in time when you’ve got some experience in a given domain, go out and actually meet and talk to the users. One can start as a programmer, but they should grow into being a software developer as time progresses. Also, one should never confuse the two. If you promote a programmer to run a large project they don’t necessarily have all of the necessary design and analyze skills and it doesn’t matter how long they have been coding. Missing skills spells disaster.

Freedom is a great myth and selling point for new people entering the industry, but it really is an illusion. It is sometimes there, at very early stages, with new technologies, but even then it is often false. If we can finally get over its lack of existence, and stop trying to make a game out of the process, we can move forward on trying to collect and enhance our already massive amount of knowledge into something that is more reliably applied. That is, we can stop flailing at the keyboards and start building better, more trustworthy systems that actually do make our lives better, instead of just slowly driving everyone nuts with this continuing madness and sloppy results we currently deliver.