In Programming You Create The World

I’m personally convinced that computer science has a lot in common with physics. Both are about how the world works at a rather fundamental level. The difference, of course, is that while in physics you’re supposed to figure out how the world is made up, in computer science you create the world. Within the confines of the computer, you’re the creator. You get to ultimately control everything that happens.

Linus Torvalds, The Beauty of Programming

Programming, or software engineering, as a profession is relatively young in the grand scheme of things. I’m reminded of a quote I saw once that I haven’t been able to find the source of that said essentially ‘when civil engineering was as young a profession as software engineering is, they hadn’t even discovered the right triangle’. Sure, maybe it isn’t completely an apples to apples comparison, but we’re clearly still in a highly experimental phase with people trying many different things and strong disagreements in every direction. This is a feature not a bug.

Hash Tables

In fact, hash tables are so efficient that they can, at times, seem like magic. Think of the Babel fish in the Hitchhiker’s Guide to the Galaxy trilogy—something so impossibly useful that it really has no business existing in the first place.

Conrad Barski, Land of Lisp

I came across this post over the weekend and it reminded me of something I realized a few years ago about a certain class of engineering interview questions. Every once in a while some pattern, tool, or technique gets discovered that solves huge swathes of problems that you run into while programming and it becomes so important that people start using it to assess where candidates are in their development.

A lot of times during engineering interviews you get faced with an optimization problem. Maybe the initial, naive implementation runs in polynomial time, you optimize a bit more and you get it to linear time, and the interviewer asks you if you can solve it in constant time. I’ve found more often than not this is code for “how would you implement this with a hash table?”. Which as it turns out is also a great question to ask yourself when you’re struggling with a programming problem!

Yes, hash tables’ constant time lookup (on average, at least) is impossibly useful, so useful that for a while it made a lot of sense to make sure any new hires were comfortable with them. Even more evidence for how useful they are is that languages now build them in, with special syntax to make them easy to use. JavaScript can even be thought of as a language that essentially asks “what if everything is a hash?”
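
As a rough illustration of that interview move, here is a minimal sketch that checks whether any two numbers in a list add up to a target. The function names are made up for this example. The second version leans on the built-in `Set`, which engines typically back with a hash table, so each membership check is constant time on average and the whole thing drops from quadratic to linear.

```javascript
// Naive version: compare every pair, which is quadratic in the list length.
function hasPairWithSumNaive(numbers, target) {
  for (let i = 0; i < numbers.length; i++) {
    for (let j = i + 1; j < numbers.length; j++) {
      if (numbers[i] + numbers[j] === target) return true;
    }
  }
  return false;
}

// Hash-based version: one pass, remembering the values seen so far.
function hasPairWithSum(numbers, target) {
  const seen = new Set();
  for (const n of numbers) {
    // If the complement was seen earlier, the pair exists.
    if (seen.has(target - n)) return true;
    seen.add(n);
  }
  return false;
}
```

Both functions give the same answers; only the amount of work changes as the list grows.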

Map

Another example of an incredibly useful discovery is the map function (and really any of the other collection operations like reduce, filter, etc). Frequently while programming you’ll find yourself doing or wanting something equivalent to this:

function addOneToEvery(listOfNumbers) {
  for(var i = 0; i < listOfNumbers.length; i++) {
    listOfNumbers[i] += 1;
  }
  return listOfNumbers;
}

You find yourself with a bunch of things and you want to do something to each of them. Early on in programming you probably even come up with the above pattern basically on your own without really planning for it. What Map does is essentially the same thing, but returns a new, modified array after the operations are complete. Check out the implementation in underscore.js:

function map(obj, iteratee, context) {
  iteratee = cb(iteratee, context);
  var _keys = !isArrayLike(obj) && keys(obj),
      length = (_keys || obj).length,
      results = Array(length);
  for (var index = 0; index < length; index++) {
    var currentKey = _keys ? _keys[index] : index;
    results[index] = iteratee(obj[currentKey], currentKey, obj);
  }
  return results;
}

It is a little more complicated to make it more generic for any use case, but now that you have that function, you really only need to take the data you have and tell map what operation you want to run on each of the items and map spits out a new list of items for you.
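
For comparison, here is a small sketch of the earlier `addOneToEvery` rewritten with map. It uses JavaScript's built-in `Array.prototype.map` rather than the underscore.js version, and the variable names are just for illustration.

```javascript
// Same result as the loop version, but the input array is left untouched.
function addOneToEvery(listOfNumbers) {
  return listOfNumbers.map(function (n) {
    return n + 1;
  });
}

const original = [1, 2, 3];
const incremented = addOneToEvery(original);
console.log(incremented); // [2, 3, 4]
console.log(original);    // [1, 2, 3]
```

The loop bookkeeping disappears and the only thing left to read is the operation itself.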

Once you become comfortable with using it, a world of possibilities opens up for you. Problems that previously seemed hard or time consuming become trivial. In a similar way to the question you ask yourself about hash tables, you can also challenge yourself and ask "how would I solve this if I had to use Map here?"

Creating The World

If all you have is a hammer, everything looks like a nail

Maslow's Hammer

Which brings me to the way I understand what Linus is talking about when he says "...in computer science you create the world". In programming, nearly nothing is set in stone, and really the only limitation is your own creativity. As you learn to intuitively use more tools like Hash Tables and Map (and of course many others), you find they have the amazing property of being able to just cleanly solve your problem.

One mental barrier I had early on was the belief that when you run into a problem while programming, you are on a linear path toward solving it and everything behind you is fixed. But that isn't the case. As you uncover more of the problem you're trying to solve, all code and data structures leading up to that point are also candidates for change. Once you have a hammer like a Hash Table or Map or any of the many others out there, you can just decide to turn your problem into a nail and it becomes trivial to solve.
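
One way to picture "turning the problem into a nail" is reshaping the data you already have so that the hammer fits. The sketch below uses invented `users` and `orders` data: instead of searching the user list for every order, it rebuilds the list as a `Map` keyed by id once, and then each lookup is a single call.

```javascript
// Hypothetical data: a list of users and a list of orders that refer to them.
const users = [
  { id: 7, name: "Ada" },
  { id: 9, name: "Grace" },
];
const orders = [
  { userId: 9, total: 30 },
  { userId: 7, total: 12 },
];

// Reshape the list into a Map keyed by id, once.
const usersById = new Map(users.map((user) => [user.id, user]));

// Now "find the user for this order" is a single lookup, not a search.
const labels = orders.map(
  (order) => `${usersById.get(order.userId).name}: ${order.total}`
);
console.log(labels); // ["Grace: 30", "Ada: 12"]
```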

What does it mean for one programmer to be better than another?

Most programmers have only a vague notion of how competent they are at what they do for a living.

Steve Yegge – Being the Averagest

As you can probably tell from most of the posts on this blog, I’m a huge fan of Steve Yegge’s blogs (they are incredible, seriously, go read them if you haven’t). But that above quote put into words something that had been bugging me for a long time before I read the post, and then has been on my mind ever since.

Over the last few months, I’ve read through The Phoenix Project, The Goal, and Toyota Kata, and I’m starting to get somewhere with a useful framework for determining the competency levels of programmers and a systematic way to improve.

The Goal

or “What the hell are we doing here?”

To figure out how to become a better programmer, first we have to be able to define what exactly our goal is. What does it mean for one programmer to be “better” than another? If you can identify programmers as being better than one another, how does someone determine what those qualities are to focus on themselves or help others find ways to improve? Improve to what? Improve on what?

For engineers I’ve mentored in the past, it bothered me that I didn’t have a good framework for answering those questions, and I haven’t gotten an answer that satisfied me from anyone I’ve talked to. The best advice I’ve been able to come up with in the past has been essentially: find a co-worker who is noticeably better than you at something and learn from them how to get better at it. In the absence of a solid answer to what it means to be a good programmer, that seemed like a useful method to improve, but there is a much more effective way of improving that doesn’t rely on the luck of having good teammates and randomly picking useful attributes.

So, with that said, let’s move on to figuring out what that goal might be.

Potential goals

There are a handful of targets that I’ve seen people strive toward that seem like potential candidates for what our goal might be. If you have others, please let me know and I’ll add them here for discussion. As you can imagine, I don’t think these are the whole picture, but it is useful to talk through why that is.

To write the most elegant, beautiful, and clean code possible

When art critics get together they talk about form and structure and meaning. When artists get together they talk about where you can buy cheap turpentine. – Picasso/Renoir (apocryphal)

If you were to only read popular programming books and watch conference talks, you might be convinced that these things are what we should all be striving for. The measure of what makes a good programmer. What usually gets lost is that the advice you’re reading or listening to is usually a solution to a specific problem and not absolute commandments on what constitutes good code. Even Sandi Metz has said that her famous 5 Rules came about because of the project she was on: “They had the classic ‘many things too big’ problem – ‘many things too connected’ problem.” If things are generally going well on your project, you aren’t a failure because you wrote a 6 line method. Sarah Mei also has a tweetstorm about Livable Code that I feel expresses this better than I can.

I’m not saying the idea of “code quality” and all the different definitions of it going around have no merit, but I do feel that it is mostly accomplishment theater. It is really easy to apply the “extract method” refactor or make a class conform to the command object pattern and get a sense that you accomplished something or “cleaned up” the code base without needing to do the hard part of figuring out if that actually makes things better. What metric are you using to determine if one code design is better than another? (That it complies with the style guide more than the other isn’t the right answer.)

To write the most performant and scalable code


There are times and situations where this is important, but not only is a small percentage of your code actually performance sensitive, if you take Sarah Mei’s tweet above into consideration, even your performance sensitive areas are usually solved with relatively simple solutions. Some evidence has been found correlating a reduction in response time to an increase in sales, but you can find many companies in our industry who claim to hire great programmers and still end up having performance and scalability issues. You also have plenty of quotes all essentially saying that premature optimization is the root of all evil. So while this is a sliding scale and definitely a part of it, I can’t see having the idea of “10x programmers” floating around and being about people who write incredibly performant and scalable code if writing it only mattered a small percentage of the time.

To write completely, 100% bug free code

Can you name one product you’ve used where you’ve never encountered a bug? Facebook’s original mantra of “move fast and break things” comes to mind here. Facebook, a company that is known for hiring, and strives to hire, great programmers, wanted them to not worry about releasing bugs. If our ultimate goal were only about writing bug free code, they would be paying a lot of people a lot of money to ignore what makes them worth paying that amount of money.

This is another sliding scale like performance. There’s a threshold where you begin to get diminishing returns on time spent. You need to be able to write code that hits these thresholds for your product’s situation (if you’re writing pacemaker software you’ve got a different bug threshold than if you were working on a social media site for example), but like the Facebook saying implies, sometimes there are better uses of your time.

“Productivity ‘N’ Quality (PNQ)” or “Done and Gets Things Smart”

These two come from Steve Yegge blog posts, “Being the Averagest” and “Done and Gets Things Smart”. I feel like they come really close to what it is we’re trying to do. We’re trying to be productive at a high quality level and we’re trying to be done while getting things smart. These start to talk about more than just the code itself and bring a delivery parameter into the discussion. I think where this falls apart, and Steve mentions it in both posts, is that they don’t really give you a framework for measuring the impact. How do I get better at being done and getting things smart? What does “getting things smart” even mean?

The Goal from The Goal

If the above options aren’t the goal, then what is? I have a strong feeling that there is a rewording of the goal that they define in The Goal that captures some of what Steve is saying in the two lines above and also gives us a useful framework for figuring out somewhat of a measurement system.

The goal from The Goal is to maximize throughput while minimizing inventory and operating expense. A big idea from the book is that where a lot of people focus on reducing operating expense, you can really only reduce it to $0. Throughput, on the other hand, is unbounded. New languages, ideas, frameworks, and libraries allow you to produce exponentially more than without them. You’re also able to use these tools to create new ones, faster, enabling the creation and creativity needed for even further enhancements.

Throughput

Throughput is probably the easiest of the three to define, as we mostly already have rough measures for it. Most of us use tools like Pivotal Tracker, JIRA, Asana, etc. at work and spend time estimating the complexity of a task and/or the time to complete it. Even if estimation isn’t completely perfect, it is a measurement to start with that we can improve on as we get further. It is also the one I think will produce useful insights if we focus on it.

To kick things off, I’ll throw out an MVP of an answer to the question posed in the title of this post:

Given the same complexity, quality, bug rate, and performance output, the programmer who is “better” is the one who completes the work first.

This is a simplistic first version of an answer and makes a ton of assumptions, but it gives us somewhere to start. To illustrate this, let’s look at this chart of 6 different programmers.

[Figure: graph showing different points for programmers on a graph of quality to time spent]

Like I said, there are a lot of assumptions baked in here like technical skills being the only things that matter, being able to define a quality threshold, being able to quantify the quality/bugs/performance values for work, and being able to quantify complexity enough that we can normalize it. It is also one dimensional, but my plan over the next few posts is to propose some ways to measure what we’re assuming are “measured” here enough to give us this simple chart, and ways to use this definition to help us find ways to improve technical ability as programmers.

Even if we weren’t going to assume that throughput is one of our biggest goals, it should be clear that everyone on the chart is a better programmer than F. C is also in bad shape, but the situation isn’t dire: they probably only need to spend more time focusing on the areas that take them below the quality threshold. A, B, and D are the high performers here, and any kind of additional factor could make the case for one of them to be “the best”.

What has me most excited about the above definition is that helping each one of these different programmers improve technically becomes drastically easier.

A, B, and D should focus on ways to complete their work in less time. Maybe it is mechanical ability, becoming better at using their tools, automating repetitive tasks away, or even creating a new abstraction to simplify their work.

C can take more time, maybe talking through their implementation, pairing with A, B, or D, or trying to find a common thread in the bugs that are found in their code.

F’s situation might look dire, but we all started out where F is on this chart. They need to work on everything A-D need to, and they’re probably overwhelmed. The trick is to pick one thing and have them focus on it. Finding what is taking so much time will likely have the biggest return, since it allows them to go through iterations on everything else much faster.

E likely either needs to study YAGNI and DTSTTCPW, or they need help becoming more confident in the code they’ve written. They are able to write very high quality code, well above the team’s threshold, but are taking much longer than everyone else to do it.
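
To make the MVP definition a little more concrete, here is a minimal sketch of it as a comparison function. Everything in it is an assumption for illustration: the quality scores, the hours, and the threshold are invented rather than read off the chart, and treating quality as a simple pass/fail bar before comparing time is only one possible reading of the definition.

```javascript
// All numbers below are invented for illustration; they are not read off the chart.
const QUALITY_THRESHOLD = 70; // hypothetical minimum acceptable quality score

// Compare two programmers on the same task. Returns a negative number if
// `a` ranks ahead of `b`, following the Array.prototype.sort convention.
function compareProgrammers(a, b) {
  const aMeets = a.quality >= QUALITY_THRESHOLD;
  const bMeets = b.quality >= QUALITY_THRESHOLD;
  if (aMeets !== bMeets) return aMeets ? -1 : 1; // meeting the bar comes first
  return a.hours - b.hours; // otherwise, whoever finishes first
}

const results = [
  { name: "E", quality: 95, hours: 30 },
  { name: "A", quality: 80, hours: 10 },
  { name: "C", quality: 60, hours: 12 },
  { name: "F", quality: 40, hours: 35 },
];

const ranked = [...results].sort(compareProgrammers).map((p) => p.name);
console.log(ranked); // ["A", "E", "C", "F"]
```

Note that E's extra quality above the bar buys nothing under this comparison, which is exactly the situation described for E above.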

Conclusion

I could probably go on and on about this forever and in a lot of different directions, but I’ve been sitting on this post for months and this seems like a good place to stop. I have a bunch of ideas for future posts, and my rough outline for the next few posts is:

  1. What does it mean to be a better programmer than another?
  2. Improving throughput in isolation
  3. Using throughput to find areas of improvement
  4. Where Inventory and Operating Expense come in
  5. Code quality, the why, the when, and some measurements
  6. The added complexities of a team, and some measurements
  7. Higher order goals and implications on traditional career ladders and interview practices

If any of these seem particularly interesting, I’d love to chat with you about them.

TDD is Debugging Features Into Existence

Two books I’ve read recently, Making Software and Beyond Legacy Code, have touched on Test Driven Development and made some interesting points I hadn’t thought of before. In Making Software they bring up addictiveness:

Could it be addictive? Personal communications with TDD developers suggest that it is an addictive practice. It changes the way people think and their approach to coding in a way that is difficult to roll back. Therefore, leaving TDD practices may be as difficult as adopting them.

Oram, Andy; Greg Wilson (2010-10-14). Making Software: What Really Works, and Why We Believe It (p. 215). O’Reilly Media. Kindle Edition.

From my personal experience and the people I’ve talked to, TDD/BDD is absolutely addictive. I’ve been lucky to experience working professionally in two completely different environments in respect to testing. For the first programming job of my career, I knew of tests and what little I wrote, I wrote in a test-after fashion. The process there was your typical: think about a problem, come up with some solution, type it in, run it, see what doesn’t work, fix that, and repeat. As you can imagine, I got very good at debugging there since that was what I spent most of my day doing.

At my next job as the second engineer at SideTour, though, the entire app had been test driven from day one. I was even able to pair full time with the CTO and learned how to do it effectively. I picked it up quickly and was immediately hooked to the point that I test drive all of my side projects as well.

Bugs Are Missing Tests

I’ve wondered for a while now how I picked up such a different process so quickly, and it wasn’t until I came across Beyond Legacy Code that it really started coming together. In the chapter “Bugs Are Missing Tests”, David Bernstein says:

Every bug exists because of a missing test in a system. The way to fix bugs using TDD is first write a failing test that represents the bug and then fix the bug and watch the failing test turn green.

Most of the writing I’ve seen around TDD and BDD explains it from a feature perspective first, but if you look at it from a bugs perspective you can easily reframe features as debugging them into existence. Once I started thinking about it like this, it makes a lot of sense why I got so hooked. Being someone that enjoys and is pretty good at debugging, it is obvious that a process which turns programming into a constant debugging session would click. It is a bug that the feature doesn’t exist, so you need to write a failing test that exposes that bug.
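
Here is a minimal sketch of that reframing. It uses a tiny hand-rolled `assertEqual` instead of a real test framework, and `formatPrice` is a made-up feature. The test is the part you would write first, and at that point it fails because the feature does not exist yet; the implementation is the "bug fix" that turns it green.

```javascript
// Tiny stand-in for a test framework's assertion (hypothetical helper).
function assertEqual(actual, expected) {
  if (actual !== expected) {
    throw new Error(`expected ${expected} but got ${actual}`);
  }
}

// Step 1: the "bug" is that formatPrice does not exist yet.
// Written first, this test fails and exposes the missing feature.
function testFormatPrice() {
  assertEqual(formatPrice(1999), "$19.99");
  assertEqual(formatPrice(5), "$0.05");
}

// Step 2: write just enough code to turn the failing test green.
function formatPrice(cents) {
  const dollars = Math.floor(cents / 100);
  const remainder = String(cents % 100).padStart(2, "0");
  return `$${dollars}.${remainder}`;
}

testFormatPrice();
```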

What Your Text Editor Can Say About You

In my role as a manager on a growing engineering team, a lot of my time lately is spent reviewing resumes and conducting phone interviews. Resumes and LinkedIn profiles leave a lot to be desired when trying to evaluate applicants, but unfortunately they’re the only things we have. It is incredibly hard to gauge where a person is in their personal development as a software engineer just from reading a one or two page document and talking for an hour over the phone. Fortunately, one strong signal I’ve found is if the candidate has Vim or Emacs listed on their resume.

An important step in your personal development as a software engineer is the realization that you need to invest time in mastering your tools. In a similar vein to Steve Yegge writing about programmers needing to know how to type, programmers also need to put the time in toward mastery of their text editor. What makes Vim and Emacs such a strong signal is that to be productive in them, you have to have made the decision that you’re going to spend at least some time learning how to use them. They both have steep learning curves and could even be said to be hostile to beginners, but investing the time to completely grok one or both of them shows that you have grasped that the time spent learning a tool will pay itself back over your entire career.

Now I’m not saying that people who use different text editors haven’t put time into mastering them, just that having them on your resume gives very little, if any, signal. When you’re applying to jobs with a resume, the person screening the resumes is likely seeing many resumes a week on top of their standard day-to-day work, and is always looking for something in the resume that jumps out at them and suggests a deeper investigation.

Optimize for Debugging Speed

“If debugging is the process of removing software bugs, then programming must be the process of putting them in.” – Edsger Dijkstra

About as close as you can get to a guarantee in programming is that if you’re building a program of any real complexity, you’re going to spend a lot of time debugging it. Taking Eagleson’s Law* into account, the situation is even more dire – you’re almost always debugging code that might as well have been written by someone else. How would you prefer past-you to have written code so that current-you has an easy time fixing anything that comes up? Besides writing tests, of course, you would prefer that they wrote the code to make the process of debugging as easy and fast as possible.

“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.” – Brian Kernighan

The code needs to be simple and easy to understand. You can get some of the way there with tools like Code Climate, but the simplicity I’m talking about here goes beyond what static analysis can really quantify. Not only do you need to worry about the low level details of your methods and classes, but you also need to worry about your overall design. Imagine explaining the design to someone new on your team. If the explanation involves layers upon layers of new classes, while not obviously wrong, you should probably get up and walk around a little bit asking yourself if there is absolutely no other way to accomplish what you’re trying to accomplish.

Beyond a simple overall design, think about each line you write from the perspective of someone unfamiliar with the code reading it for the first time. Consider the `try` method in Rails. Any time someone encounters it being used, they now have to incorporate the thought that the receiver could be `nil` and have to think about the cases in which that could be true. Unfortunately, the `try` method doesn’t provide any of the information needed to determine when and whether the object is legitimately `nil`. You should write code that clearly explains to the reader why it is doing what it is doing.
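
To keep the examples in one language, here is a rough JavaScript analogue rather than Rails code: optional chaining (`?.`) behaves much like `try` in that it quietly yields `undefined` when the thing on its left is missing. The order data and function names are invented for this sketch.

```javascript
// Hypothetical data: some orders have no customer attached.
const guestOrder = { id: 1, customer: null };

// Optional chaining quietly yields undefined when customer is missing,
// but says nothing about why a missing customer is allowed.
function customerNameTerse(order) {
  return order.customer?.name;
}

// Spelling out the case tells the reader when a missing customer is expected.
function customerName(order) {
  if (order.customer === null) {
    return "Guest checkout"; // guest orders legitimately have no customer
  }
  return order.customer.name;
}
```

The second version is longer, but a reader debugging it learns which missing values are legitimate without having to go hunting.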

Debugging time increases as a square of the program’s size.
— Chris Wenham in Physics for Programmers

To optimize for debugging speed, after taking simplicity and readability into account, you need to write as little code as possible to accomplish your task. Steve Yegge argues that size is Code’s Worst Enemy, but it is also debugging’s worst enemy. There is plenty written about brevity being the first thing to optimize for, and from a perspective of making the code as easily debugged as possible you absolutely need to focus on expressing ideas in as little code as reasonably possible.

You’re always going to run into cases where, like Jeff Atwood mentions above, you’ve completely exhausted all your other options and have to write some code under duress. In these cases, just writing a small amount of code isn’t enough; there is something much more dangerous to watch out for. You need to think about the other code that people are going to need to write to interact with yours, and minimize future-code as much as possible too. Not only do you need to fight the urge to bring new lines of code into the world, but you need to make it easy for the users of your code to do the same. If you’ve just written a small readable class that now requires all its users to write hundreds of lines of setup just to use it, you’ve still failed. You are now responsible for bringing an immeasurable number of lines of code into the world over the lifetime of the project.
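
One small way to minimize that future-code is to give every option a sensible default, so the common case needs no setup at the call site. The `retry` helper below is a hypothetical example of the idea, not something from any particular library.

```javascript
// Hypothetical retry helper. Every option has a default, so the common
// case needs no setup code at the call site.
function retry(task, { attempts = 3, onError = () => {} } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return task(); // success: hand the result straight back
    } catch (error) {
      lastError = error;
      onError(error, i + 1); // optional hook, a no-op unless provided
    }
  }
  throw lastError; // every attempt failed
}
```

A caller can write `retry(loadConfig)` and be done, where `loadConfig` stands in for whatever function they already have; the options only appear in the rare cases that need them.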


* Eagleson's Law: Any code of your own that you haven't looked at for six or more months might as well have been written by someone else.

Why All Programmers Should Play MUDs

If you’re anything like me, you’ve wished it would be possible to get better at things that make you money by playing a game. I realized recently after reading Steve Yegge’s Programming’s Dirtiest Little Secret and reflecting on where I learned to type that such a game exists. It turns out that as programmers we are lucky to have such a game. These magical games are called MUDs. I spent most of my middle school to high school years playing MUDs and still hop in to see what’s going on occasionally. I didn’t realize until recently that I can attribute some of my success as a programmer to all the time that I put into them. They may not seem like much more than text-based World of Warcraft at this point, but that isn’t really the point of this post. The point of this post is to justify the thousands of hours I spent on them and talk about how spending some time playing one can actually improve your performance on the job as a programmer.

Typing

This one is probably obvious, but MUDs force you to become a fast and accurate typist. The only way to interact with the game is through text, so you won’t survive very long if you have to look down to type out the commands to heal yourself in the middle of a fight or are constantly hitting backspace to fix typos. Your progress and success is directly tied to your typing skills and as you spend more time in-game you naturally progress from a hunt-and-peck style where you think about individual letters to thinking in blocks of words and sentences that just seem to appear on screen with little effort.

Sure you can use some of the cool new web apps like typing.io which uses actual code and gives you detailed reports on how you did and where your problem areas lie, but MUDs can be incredibly addicting which is the main thing you need to get better at typing. You need to practice, a lot, and playing a game that makes you want to come back day after day is a great way to force yourself into practicing. I don’t really think it is worth spending time trying to explain why typing is important for a career as a programmer, but if you’re curious, Steve’s post above should be enough.

Scanning

The big realization I had that spurred this post was that MUDs also train you in scanning. Sure, it might be obvious that reading quickly is something you will pick up as you spend time mudding, but you also get better at not reading. With all the information that is scrolling on the screen constantly, it is just as important to know what you can ignore as it is to know what is important. Why this is particularly useful for programming is that we mostly have the same interface to our programs we’re writing as the MUD: a black screen with colored text. Unsurprisingly, being able to recognize and ignore unimportant text or quickly focus in on important text transfers incredibly well between the two domains.

Usually, this is something simple like quickly finding a method on screen, but occasionally you spot exactly what you are looking for in fast scrolling logs. You also find yourself being able to read in chunks rather than having to read every piece of the line. All of this really comes in handy when you’re debugging and/or tracing through multiple files – the less time you spend finding what you’re looking for the better. Speed Matters.

Of course, MUDs are also fun and not just for becoming a better typist and scanner, but if you’ve ever seen or paired with someone who navigated a codebase with what seemed like impossible speed, there are few things better to train that skill than picking one and playing it for a few months. I’m sure you will see progress almost immediately and you’ll have a lot of fun while you’re doing it.


This Code is Crap, it Needs Rewritten

A few weeks ago I stumbled on Peter Naur’s essay Programming As Theory Building and it helped clarify a lot of things that have been on my mind for a while. It will likely be the source of many posts on this blog as time goes on, but there was one big idea that I’ve been obsessed with after reading it:

Not only are you building a theory of a program as you are programming, but you’re also building a theory of building programs.

It turns out that Naur even touches on this idea in Pluralism In Software Engineering (p79):

“For me, program development is all about grasping the relationship between:

* the modellee (the real-world problem),
* the model (the program),
* the model building elements (the programming language constructs).”

I’m still figuring out all the implications of this and working out how usable of a thought it is for my career, but it does explain some of the things I’ve seen so far. One in particular is the topic of this post.

I’m sure we’ve all seen people join a team we’ve been on for a long time and immediately decide the program is crap and that it needs to be rewritten. I’m also sure that we’ve been that person many times as well. While there is a chance that it actually is crap, there’s even more of a chance that there is a mismatch in theories. There’s almost definitely a mismatch in the new programmer’s theory of how to solve the particular problem this program solves and the way the original programmers who have been working on the program since the beginning solved the problem. But, and this is the big thing, it is also just as likely that there is a mismatch in their different theories of building programs.

You can see this manifest itself in a bunch of different ways such as basic style differences, usage of different language idioms, usage of different programming patterns, and where the new person is on the software political axis versus the team who built the program. Unless you’re really lucky, you’re likely to have some significant differences in ways you program to solve problems – enough that you’ll run into something frustrating that someone else did before you.

The best use of this insight that I can think of is to come at all projects with an open mind – though in a different way than you probably think. You want to try as hard as possible to lose your attachment to the ways you have solved problems in the past no matter how much better you think they are. On one hand, you could get to more viscerally feel the cons of using a certain pattern or idiom – but on the other hand, you could also be learning something useful you’ll take with you the rest of your career. Either way, though, your goal on a new team isn’t to prove how smart you are. Your goal should be to minimize the amount of time it takes you to internalize the theory of the program you’re working on and the team’s theory of building programs. Being frustrated with the way the team works together and complaining about how bad the program is will just make that take longer. I’m not saying that if things really are bad that you just have to live with them, but you should be as comfortable in the program and with the theory of the program as the best on the team before you start suggesting changes that deviate far from current styles. The upside is that once you’ve internalized the new style, you should be able to incorporate and easily switch between any of the styles you’ve learned so far. So while you may need to keep your ego in check at first and treat a new style with a beginner’s mind, in the long run it will make you a much better programmer.