Arch-Engineer
Code better

Integration Continuity: Bringing Down Giants


In his famous "How to Become a Hacker," Eric S. Raymond declared "No problem should ever have to be solved twice." We usually do this by creating libraries. Unfortunately, libraries are built to be destroyed.

That is, given two components and enough money, one can nearly always make the system better by smashing them into one component. But I yearned for a world where Raymond's decree could become literally true, and saw automated program generation as the only path forward. Writing on this topic three years ago in this newsletter, I explained:
 
I used to believe fervently in aggressive reuse of small functionality. I spoke with disgust at how the myriad apps on my machine each had their own spellchecker and login-system, and the lost brainpower that had gone into creating them. I watched teams adopt prebuilt components enthusiastically, and then target them for replacement as soon as they realized they could get a 10%-smaller download size by building their own version with fewer features. I believed in the vision set forth in Doug McIlroy's famous essay "Mass Produced Software Components," which replaced one-size-fits-all software libraries with "library families," offering variants of a library custom-suited to an application. Ever used JSON to save something to disk, but thought that, if only you had more resources, you'd make a more-efficient binary format instead? Down with JSON; there should be a button that gives you that format.

Back then, I gave up on this dream, accepting instead a world of "giants," which solve the problem of wanting to tightly integrate many different components by instead providing one big one. But, now that I've watched what has been called "the most helpful lecture on (game) API design," Casey Muratori's "Designing and Evaluating Reusable Components" (talk, transcript), I dream again. This talk provides a set of guidelines for API design so that you as a user can go from "simple and easy to use" to "does exactly what you want and no more in the format you want."

Actually, that's a lie. I first watched this talk several years ago, journaled it as "rambly, but good," and continued on not-dreaming. When it was strongly recommended more recently, I got midway through before realizing I had already watched the talk. As I began writing my critique, I found myself making multiple false starts, where I'd start ripping into the idea behind a code example, only to realize I had incorrectly generalized the example. I believe the reason is that this talk, quite ironically, is missing "explanation continuity," and jumps straight from high-level and vague to nitty-gritty specific code examples. I started off ready to say this is generally good advice, then came to believe that there's some mysterious difference between the talk's game APIs and the business-y ones I was using as examples, and then ultimately found that, with some repackaging and polishing, there really are simple lessons here that anyone can use.

Anyway, many hours, multiple rewatches/rereads, and two major rewrites of this newsletter later, here is my explanation of its core lessons, adapted away from its original focus on game development.

The Five (Make That Four) Directives

The overarching concept of this talk is integration discontinuity. Typically, when you begin using an API, you do so in a very simple way, and libraries provide high-level APIs for immediately getting the benefit, often from just a single call. Over time, you begin to use more of its features and customize what it does, and you need lower-level operations. If adding a little bit of customization suddenly requires replacing large swaths of functionality to manually do things the library used to be doing for you (think: I want a custom animation on my button, and now I'm having to draw it manually even when the animation is not running), then you're facing an integration discontinuity.

 

The talk presents five directives to follow to help an API design attain minimal integration discontinuity. (Note that it does not say that minimizing this should be a goal of all APIs; most of these add surface area to an API.)

 

The first is granularity. High-level functions that do a lot are nice, but they should be decomposable into 2-3 smaller operations, which in turn can be broken down into yet smaller operations. For example, a good file API should have functions each for reading a whole file, an individual line, a fixed number of characters, and an individual character. If a file API only has operations for reading a whole file or a single character, then using it instantly creates an integration discontinuity.
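Here is a minimal Python sketch of what that layering might look like. The function names (read_char, read_chars, read_line, read_all) are my own inventions for illustration, not any real library's API. The point is only that each level is built from the one below it, so dropping down a level never means starting over.

```python
import io
from typing import TextIO


def read_char(stream: TextIO) -> str:
    """Lowest level: read one character ('' at end of input)."""
    return stream.read(1)


def read_chars(stream: TextIO, count: int) -> str:
    """Read up to `count` characters, built from read_char."""
    chars = []
    for _ in range(count):
        ch = read_char(stream)
        if ch == "":
            break
        chars.append(ch)
    return "".join(chars)


def read_line(stream: TextIO) -> str:
    """Read up to and including the next newline, built from read_char."""
    chars = []
    while True:
        ch = read_char(stream)
        if ch == "":
            break
        chars.append(ch)
        if ch == "\n":
            break
    return "".join(chars)


def read_all(stream: TextIO) -> str:
    """Highest level: read everything that is left, built from read_line."""
    lines = []
    while True:
        line = read_line(stream)
        if line == "":
            break
        lines.append(line)
    return "".join(lines)
```

A caller can start with read_all and later switch to read_line for just the part that needs custom handling, while the rest of the code keeps working on the same stream.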

 

The second is redundancy. Where there are multiple types that describe the same thing, a good API should be able to accept any of them. A file API should be able to take in files as both paths and as dedicated file objects, and output what it reads as both bytes and strings. A frontend web API which adds functionality to some HTML element should be able to take that element as either a selector or a DOM object. I'll call this invocation redundancy, as I don't think he'd be a fan of the Android Drawable API, which once required me to write code checking multiple places to determine if an image has been scaled or rotated.
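As a rough illustration of the file case, here is a Python sketch. read_bytes and read_text are hypothetical names of mine; the only real library calls are open, os.PathLike, and bytes.decode. The same operation accepts a path string, a path object, or an already-open binary file, and hands back either bytes or text.

```python
import io
import os
from typing import BinaryIO, Union

# Anything that can name or hold a file: a path string, a path object,
# or an already-open binary file object.
Source = Union[str, os.PathLike, BinaryIO]


def read_bytes(source: Source) -> bytes:
    """Read the whole source as bytes, whichever form the caller has."""
    if isinstance(source, (str, os.PathLike)):
        with open(source, "rb") as f:
            return f.read()
    return source.read()


def read_text(source: Source, encoding: str = "utf-8") -> str:
    """Same operation, but with the output as a string instead of bytes."""
    return read_bytes(source).decode(encoding)
```

Whichever representation the app already holds, it does not have to convert first just to satisfy the library.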

 

The third is coupling, which is very misnamed. This directive actually describes two things, both of which it advocates avoiding in all circumstances. The first is providing functions which do multiple things combined, such as reading and parsing a file, without more granular variants or alternate means of giving inputs. The second is having sequences of functions that share some hidden state and therefore must be called in a certain order and interact with all other calls. The strtok function in the C standard library is an infamous example of this.  The former I would not call coupling, and it seems redundant with the previous two directives; at best, it’s a call to focus on places where offering more granularity is more important. The latter is called temporal coupling, which I wrote about in one of the first newsletters, offering weak purity (functions only mutate their arguments) as an antidote. I'll be replacing this directive with temporal coupling in the rest of the discussion.
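To make the temporal-coupling half concrete, here is a Python imitation of the strtok style next to a weakly pure alternative. It is a sketch, not a faithful port: real strtok also skips empty tokens and takes a set of delimiters, and all the names here (hidden_first_token, hidden_next_token, Cursor, next_token) are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Style 1: hidden shared state, in the spirit of strtok.
_remaining: str = ""


def hidden_first_token(text: str, sep: str) -> Optional[str]:
    """Start tokenizing `text`, silently discarding any earlier tokenization."""
    global _remaining
    _remaining = text
    return hidden_next_token(sep)


def hidden_next_token(sep: str) -> Optional[str]:
    """Continue whichever tokenization was started most recently."""
    global _remaining
    if _remaining == "":
        return None
    token, _, _remaining = _remaining.partition(sep)
    return token


# Style 2: weak purity. The state is an explicit argument, and the
# function mutates nothing except that argument.
@dataclass
class Cursor:
    remaining: str


def next_token(cursor: Cursor, sep: str) -> Optional[str]:
    """Return the next token from `cursor`, or None when it is exhausted."""
    if cursor.remaining == "":
        return None
    token, _, cursor.remaining = cursor.remaining.partition(sep)
    return token
```

With the first style, starting a second tokenization before the first one finishes loses the rest of the first string. With the second, two cursors can be interleaved freely because nothing is shared between them.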

 

The fourth is retention. This term confused me greatly, as I've most commonly seen it used as a memory-management term (iOS programmers have nightmares about “retain cycles”), of little concern in garbage-collected languages. It has some relation to memory management, but it's actually about something more universal: whether the library maintains internal state that mirrors your own app's state. I later learned that this does connect to terminology used in graphics, retained-mode vs. immediate-mode APIs, which allegedly stems in part from Muratori’s own coinage. Still, to save me confusion, I'll call this mirroring.

 

The talk presents three forms of mirroring. The first is in APIs where some needed state for an operation is actually provided by a previous invocation. This overlaps with temporal coupling, but the complaint here is about providing (app-controlled) state in two batches, not that the method calls must be sequenced. A great offender of this is Java's standard Graphics class, where drawing a colored line is two calls: setColor() and drawLine(). In other words, the Graphics object has internal state mirroring the app's provided color. The remedy is to provide both this variant and another where drawing a colored line is a single call.
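A small Python sketch of that remedy follows. Canvas is a hypothetical stand-in that just records what it would draw; it is not Java's Graphics class or any real drawing library. It keeps the retained color for callers who like it, and also lets a single call carry the color itself.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]


class Canvas:
    """Hypothetical drawing surface that records the lines it is asked to draw."""

    def __init__(self) -> None:
        self._color = "black"  # library-side copy of state the app also knows
        self.commands: List[Tuple[str, Point, Point]] = []

    def set_color(self, color: str) -> None:
        """Retained variant: the color sticks until it is changed again."""
        self._color = color

    def draw_line(self, start: Point, end: Point,
                  color: Optional[str] = None) -> None:
        """Draw a line; `color`, if given, applies to this call only."""
        used = self._color if color is None else color
        self.commands.append((used, start, end))
```

Passing color to draw_line does not touch the retained color, so the one-off call and the two-step style can coexist without surprising each other.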

 

The second form of mirroring is a variation on this, except the state is global. Initializing a library with a language or locale is an example. If you're building an ebooks app and want the entire app to be in English except for foreign books, then it's nice to also have an option to override the locale on specific method calls.
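The global version looks much the same in code. In this sketch, set_default_locale and page_label are invented names, and the tiny translation table is only there to make the example run: the library keeps a global default, and any individual call can override it.

```python
from typing import Optional

# Illustrative data only; a real library would have a proper locale database.
_PAGE_WORD = {"en": "Page", "de": "Seite"}
_default_locale = "en"


def set_default_locale(locale: str) -> None:
    """Set the app-wide locale once, at startup."""
    global _default_locale
    _default_locale = locale


def page_label(number: int, locale: Optional[str] = None) -> str:
    """Format a page label, using `locale` for this call if one is given."""
    chosen = _default_locale if locale is None else locale
    return f"{_PAGE_WORD[chosen]} {number}"
```

The ebooks app would call page_label(3) everywhere in its own interface and page_label(3, locale="de") when rendering a German book, without flipping the global back and forth.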

 

But the third form is more exotic. Suppose you have a feature where, if the user holds down a key, something happens, and then stops happening when they lift the key. The talk's example is extending a grappling-hook in Rocket Jockey, a cute game I'd never heard of before. You do not want to write code that says "When they press the key, do the thing; if the key is not pressed and the thing is being done, stop." This code means that the library has a copy of some state (whether there's a grappling hook being shown on the screen), you have a copy of some similar state (whether the grappling hook should be extended), and your code is manually diff'ing these two states and synchronizing them. You'd much rather give the relation between the two and let them get synchronized automatically. I suppose a non-game example would be designing your own tooltips. If you want a tooltip which appears when the user hovers over an element and then disappears when they stop, it's so much nicer to just say "here's a tooltip, and display it if and only if the user is hovering over this other element" (something the CSS :hover selector does nicely), than to use showElement() and hideElement() functions. (But if you use :hover and then want some more manual control over tooltips, then you’ll face a great integration discontinuity.)
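Here is how I picture the two styles for the tooltip case, as a Python sketch. Tooltip, manual_sync, Ui, and show_while are all hypothetical; no real UI toolkit is being described. The first function is the manual diffing the paragraph complains about. The class lets the app state the relation once and leaves the synchronizing to the library.

```python
from typing import Callable, List, Tuple


class Tooltip:
    """Library-side object with its own copy of the visibility state."""

    def __init__(self) -> None:
        self.visible = False


def manual_sync(tooltip: Tooltip, hovering: bool) -> None:
    """Manual style: the app diffs its state against the library's."""
    if hovering and not tooltip.visible:
        tooltip.visible = True   # stand-in for showElement()
    elif not hovering and tooltip.visible:
        tooltip.visible = False  # stand-in for hideElement()


class Ui:
    """Declarative style: the app states the relation, the library syncs."""

    def __init__(self) -> None:
        self._bindings: List[Tuple[Tooltip, Callable[[], bool]]] = []

    def show_while(self, tooltip: Tooltip,
                   condition: Callable[[], bool]) -> None:
        """Declare that `tooltip` is visible if and only if `condition()` holds."""
        self._bindings.append((tooltip, condition))

    def update(self) -> None:
        """Called once per frame or event; re-derives every bound tooltip."""
        for tooltip, condition in self._bindings:
            tooltip.visible = condition()
```

In the declarative style the app never reads tooltip.visible at all, which is the sign that the mirrored copy has stopped being the app's problem.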

 

I'm not sure the various symptoms belong under the same heading, but here are the two cures: (1) allow all distinct operations to be performed in a single invocation, and (2) allow for automated synchronization of app and library state.

 

The fifth is flow control. It says that good APIs should not have callbacks because:

 
I call the library, maybe it calls me back one more time. This is obviously a negative thing because the more this happens the more complex it is to visualize in your head what's going on in your relationship to this library  

This part I don't understand at all — I'd only be concerned about where a function is getting called if there were timing concerns or lots of shared state involved (which maybe is the case in games). He also says some stuff about difficulty passing data through a library back to an app, which sounds like a symptom of not having closures or parametric polymorphism (generics). The best I can gather from discussion with others is that the actual complaint is about concurrency (meaning interleaving of code, not parallelism), which is a larger topic. I'll just mention that every single API mentioned in the next section has callbacks of some form (asynchronous fetches? webhooks? event handlers?), and continue on as if this fifth directive didn’t exist.

Serious Business

Muratori gave lots of concrete examples with game APIs. Let's try the directives on some non-game ones. I chose three APIs I'm familiar with for analysis:

  • Generic database connections (the API part, not the "SQL the language" part)

  • The PayPal API

  • The Apptimize Android API (which I worked on many years ago)

After looking for an example that I thought would make a better analogue for Muratori's physics-engine example, a small piece that controls an internal portion of some larger app, I added a fourth, to which I have very little prior exposure:

  • Intro.js (a library for building product walkthroughs)

 

We begin:

 

Granularity is a helpful lens to analyze all of these, except maybe Apptimize, which is by default quite granular. Database connection libraries can execute a query in one go or create a prepared statement that can be run multiple times. They can fetch all matching rows at once, or iterate through them one by one. But PayPal... I now see that granularity is the reason why getting basic checkout working is so easy (a special API for it), while I had a nasty time trying to implement a refund button.
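For the database case, Python's built-in sqlite3 module shows these levels in a few lines. The payments table and its contents are made up for the example. Note one assumption in how I map the text onto the module: sqlite3 has no separate prepared-statement object, so the closest analogue here is executemany, which runs one parameterized statement against many parameter sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, amount INTEGER)")

# One parameterized statement, run once per parameter tuple.
conn.executemany("INSERT INTO payments (amount) VALUES (?)",
                 [(5,), (20,), (75,)])

# Coarse: run the query and fetch every matching row in one go.
all_rows = conn.execute(
    "SELECT id, amount FROM payments ORDER BY id").fetchall()

# Finer: take the first row by hand, then iterate over the rest one by one.
cursor = conn.execute("SELECT id, amount FROM payments ORDER BY id")
first = cursor.fetchone()
rest = [row for row in cursor]

conn.close()
```

Moving from fetchall to row-by-row iteration changes one line and nothing else, which is what low integration discontinuity looks like in practice.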

 

Redundancy too, to a lesser extent. I see Intro.js succeeding in providing both a goToStepNumber() function and nextStep()/previousStep() functions, but failing in only accepting a selector for the HTML element it runs on. You can ask Apptimize which A/B testing variant a user is in, or you can just have it run the correct one. But most of the operations in these libraries act either on basic data types with only one reasonable input format (who complains when a Java API can take a String but not a char[]?), or on types specific to the API (you want a PayPal transaction, you give an ID). There are more superficial changes one can make to an API (like supplying DB query parameters in sequence or in bulk), but overall there's less to be altered.

 

Temporal coupling is less common outside of C. But the PayPal API has a system where you give it your API secret, and it gives you a temporary access token to authenticate all your future queries within a given time limit. I'll let you imagine the pain that comes with tracking this, and the extra network requests that come with not tracking it. If you need more: the (deprecated) PayPal PHP SDK uses global state to track this access token, and does not let you opt out of creating one.
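To give a feel for the bookkeeping involved, here is a Python sketch of the token tracking an app ends up writing. It assumes nothing about PayPal's actual endpoints or SDKs: TokenCache is a made-up class, and fetch_token is a caller-supplied function standing in for the network request, returning a token and its lifetime in seconds.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuses an access token until it expires, then fetches a new one."""

    def __init__(self,
                 fetch_token: Callable[[], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_token = fetch_token  # returns (token, lifetime_seconds)
        self._clock = clock              # injectable so tests can fake time
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, hitting the network only when necessary."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token, lifetime = self._fetch_token()
            self._expires_at = now + lifetime
        return self._token
```

Every authenticated call then starts with cache.get(). Skipping the cache means one extra token request per query, and keeping it means the app now owns a piece of state whose real source of truth lives on someone else's server.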

 

And then there's mirroring. Can we find internal library state that users would duplicate? For Intro.js, I did not find a way to get the current step of the tour that a user is on without supplying an onChange callback and tracking it manually.

 

And now for a much bigger one. Your app probably has a billing system that tracks transactions. PayPal has a database of transactions. Even if 100% of your payments go through PayPal, you probably don't want these to be the same. I guess maybe you'd want to declaratively specify the relation between your database and PayPal's and let the system maintain the correspondence? That sounds very hard to do over REST. But having this directive makes me think: why not? I can see myself paying for a 3rd party tool that automated this.

 

These are all pretty different from the examples in the talk. When I first started writing this section, it was actually about how I thought the talk's directives suggested terrible things like having PayPal use your database instead of its own (thereby letting you issue yourself infinite refunds). There are good lessons here, and I did not understand them until I wrote this newsletter. Properly digested, they are indeed quite general.

Conclusion

There's a lot more I could say about Muratori's talk. We can talk about differences between game and non-game programming (more relaxed but frequent deadlines in most software these days, and real-time constraints are rare outside embedded systems), or whether "data which doesn't have a reason for being opaque should be transparent" (I think he's either talking about dumb data structures used as input, which usually are pretty transparent, or saying that it should be theoretically possible but not easy to access arbitrary bytes on a complicated C++ object, which... it already is). And he gives lots of examples of forms of redundancy and granularity for which I’ve shown no analogue. But I'm at about 2000 words now, and think I'm doing a pretty good job of giving the same ideas as Muratori's 12,000.
 

Is this the best talk on API design? It's its own beast. Most API design things I've read (and written) are about ease of use, difficulty of misuse, and ease of API evolution. This one identifies a different kind of problem with API design and attempts guidelines to address it. Aside from the overall problem of integration discontinuity, the concept of granularity is my biggest takeaway. It's easy to understand, but now I'll be explicitly using it whenever I design or critique an API (with the other three being smaller deals). 

 

There's a lot more in the talk, and I hope that, if you choose to read it, it will be much easier to deeply understand after reading this newsletter. I'll leave you with my favorite quote from it:

 
Furthermore, if you're evaluating an API, you may think that the best thing to do is go read the documentation of the tools that you're potentially evaluating. Don't do that yet! Pretend you have the perfect one you want. Pretend to integrate it into the game you've got for a day or two, look at what you came up with, and now when you evaluate those components from the different vendors, go: "how close does this match what I'm going to do?" Don't think in their terms first! Think in your terms first, and then as you evaluate your APIs, go like: "which one of these things links up with me?" 
 

Thank you to Julian Ceipek for getting me to think harder about this talk, and for providing his transcript.

Next Cohort Starting June 23rd


The next cohort of the Advanced Software Design Web Course starts June 23rd.
 

Arch-Engineer Archive

A lot of good content has been written over the last 5 years for this newsletter.  I’ve frequently referred people to some of the oldest newsletters, but Mailchimp's default archive page only shows the last 20 issues, meaning you couldn't find the old newsletter on temporal coupling linked in the article above. Until now.

A New Name

 

When I registered "James Koppel Coaching, LLC" in 2017, I didn’t put much thought into the name. Now nearly every word is inaccurate: I'm more consistently called “Jimmy,” the company is growing beyond me,  and the web course has been our main offering since 2018; the “LLC” part is still accurate though. The next few newsletters may still have the old name, but, over the summer, we’ll be going through a rebranding. Look out for E-mails from us under our future name: Mirdin. The name has a double meaning — to be revealed later.


 
Copyright © 2022 Mirdin, All rights reserved.