What makes a good Internet API?

Before I get into what makes a good Internet API, I want to be clear on what good means to me in this context. An API is, by definition, an integration project, and integration is hard and can be time consuming. A good API reduces the time and complication of that integration. That results in engineers spending less time integrating, leaving more time for the rest of their development tasks.

If you’re writing an API, considering the distinctions and ideas in this post will make it easier for others to consume. If you’re accessing someone else’s API, point the developers who are writing it here and your job may get easier. If you’re wearing both of those hats – this should help you help yourself.

With that out of the way, here goes…

What is an Internet API?

  • An API is an Application Programming Interface, which is a way for one software system to provide a service or expose structured data to others. One system provides the API – we’ll refer to this system as the server. Other systems consume it – we’ll call a consumer a client.
  • An Internet API is an API that is consumed across the internet.

Protocol

  • There are many potential protocols, but the most ubiquitous is HTTP, since it is the foundation of the web.
  • The HTTP protocol is flexible in the data format being transmitted, so we must further determine the format of the data. JSON has become very prevalent due to its flexible structure. However, with that flexibility, we must take care to document the structure.

Perspectives

Server: I need to store data that others are interested in.

There are two perspectives to consider when designing an API. From the server’s perspective, there is data to be stored and controlled, while the client’s perspective is about the data it receives and what to do with it. Because of these two perspectives, there can be different ways to model the data or actions of the API.

Client: In my view, that data should be combined.

As an API implementer, it’s easiest to keep the data models on the server the same as those served by the API. For example, if the server has a list of users, it can expose them through the API using a /users/* route.

However, it sometimes makes more sense to expose a different logical model structure to the client. The server may have multiple models that are more convenient for the client to consume as a single model. For example, a user may have multiple phone numbers and belong to a department which, itself, belongs to a company. The server can pull data from all of those models when the client requests the user data and provide just one user record that includes the company name, the department name, and an array of phone numbers. Alternatively, the server could expose separate APIs for the user model, the department model, the phone number model, and the company model, and return references to the related records when the client requests the user data. The former is much more convenient if the client only needs user data, but the latter is much more flexible in that it allows the client to list companies and departments, for example.
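To make the two options concrete, here is a minimal Python sketch of both response shapes. Everything in it is an assumption made for illustration: the in-memory "tables", the field names, and the sample values are hypothetical and stand in for whatever the server really stores.

```python
# Hypothetical in-memory "tables"; every name and value here is illustrative.
COMPANIES = {1: {"id": 1, "name": "Acme"}}
DEPARTMENTS = {10: {"id": 10, "name": "Engineering", "company_id": 1}}
USERS = {100: {"id": 100, "name": "Pat", "department_id": 10}}
PHONE_NUMBERS = [
    {"id": 1000, "user_id": 100, "number": "555-0100"},
    {"id": 1001, "user_id": 100, "number": "555-0101"},
]


def combined_user(user_id):
    """One record that pulls in data from all four models."""
    user = USERS[user_id]
    department = DEPARTMENTS[user["department_id"]]
    company = COMPANIES[department["company_id"]]
    return {
        "id": user["id"],
        "name": user["name"],
        "department_name": department["name"],
        "company_name": company["name"],
        "phone_numbers": [
            p["number"] for p in PHONE_NUMBERS if p["user_id"] == user_id
        ],
    }


def referenced_user(user_id):
    """The user record plus references the client resolves with more requests."""
    user = USERS[user_id]
    return {
        "id": user["id"],
        "name": user["name"],
        "department_id": user["department_id"],
        "phone_number_ids": [
            p["id"] for p in PHONE_NUMBERS if p["user_id"] == user_id
        ],
    }
```

The combined record costs the client one request, while the referenced record leaves the department, company, and phone number models available to be listed on their own.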

These considerations become important as we consider the needs of the clients that will access the API.

Resources

A resource is a record of data exposed by the server that the server may allow the client to create, read, update, or delete.

Requests

The HTTP protocol has multiple “verbs,” or methods, for initiating a request. The ones most commonly used by APIs are GET, POST, PUT, PATCH, and DELETE. A well-structured API can leverage these verbs to simplify the routes. It may use GET requests for getting a resource or list of resources, while POST is usually used to create a resource. PUT is used to replace a resource, and PATCH is used to update part of one. DELETE is for, um, deleting them.
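As a rough sketch of how the verbs keep the routes simple, the following Python function serves a single users route and decides what to do from the verb alone. The in-memory store, the handle function, and its parameters are assumptions for illustration, not part of any framework.

```python
import itertools

# In-memory store standing in for the server's database (illustrative only).
_users = {}
_next_id = itertools.count(1)


def handle(method, user_id=None, body=None):
    """Dispatch one request against a single users route based on the verb."""
    if method == "GET":
        # No id means "list everything"; an id means "fetch one".
        return list(_users.values()) if user_id is None else _users[user_id]
    if method == "POST":
        new_id = next(_next_id)
        _users[new_id] = {**body, "id": new_id}
        return _users[new_id]
    if method == "PUT":
        # Replace the whole record with what the client sent.
        _users[user_id] = {**body, "id": user_id}
        return _users[user_id]
    if method == "PATCH":
        # Change only the fields the client supplied.
        _users[user_id].update(body)
        return _users[user_id]
    if method == "DELETE":
        # Remove the record and hand it back to the caller.
        return _users.pop(user_id)
    raise ValueError(f"unsupported method: {method}")
```

Every branch returns a user record in the same shape (or a list of them), so the client can treat the result of any verb the same way.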

Success

When the server is able to complete the request as expected, it is considered a success. The response should be a record that represents the resource or set of resources being acted upon. Even a DELETE request should return the now-deleted resource. It is ideal if all responses return the same structure for a given resource, simplifying the client’s job in handling the response.

Errors

The two hardest things in software development are naming, error handling, and off by one errors

Note how much longer this section is than the Success section – where do you suppose the most development time is spent?

Software development is an iterative process. That means that the developer implements a potential solution, tests it – either manually or with automated testing (a great topic for another post) – and then “fixes” it. This process is repeated until it works as expected under all of the required conditions.

Because of this structure, it is critical that the server return errors to the client so the developer can understand what is happening when there is an unexpected result (error).

There are two broad types of errors – those that should never happen in production, and those that may occur in production. For example, asking the server for a list of users should always return a list, even if it’s empty. However, the server may be misconfigured, and when it attempts to access the database, the database could return an error that triggers an exception on the server, which makes its way back to the client. In production, the server should be configured correctly, but during development, the situation is frequently changing and things can go wrong.

Errors that may occur in production are really just ways for the server to communicate to the client that something needs to be handled. For example, there may be a policy where the user must reauthenticate after being idle for a certain amount of time. The server could return an error indicating that the request is not authorized. That would trigger the client to prompt the user for a password and then retry the request. Usually, these errors are not exposed to the user, but they cause the client to take some action.
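The reauthenticate-and-retry flow can be sketched in a few lines of Python. The error code value, the dictionary shape of the error, and the two callables are all assumptions made for this example; a real client would plug in its own request and login code.

```python
# Assumed error code for "not authorized"; the real value is up to the API.
NOT_AUTHORIZED = 1000


def request_with_reauth(send_request, reauthenticate):
    """Send a request; on a not-authorized error, log in again and retry once.

    send_request and reauthenticate are hypothetical callables supplied by
    the client application.
    """
    response = send_request()
    if isinstance(response, dict) and response.get("error_code") == NOT_AUTHORIZED:
        reauthenticate()           # e.g. prompt the user for a password
        response = send_request()  # retry exactly once
    return response
```

Retrying only once is a deliberate choice in this sketch: if the second attempt also fails, the error goes back to the caller instead of looping.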

Many hours can be saved by anticipating all of the errors an API may encounter, documenting those errors, and returning them to the client in a defined structure. That structure can be as simple as an error code and a human readable string. The client developer can use the error code in the client software, and can see the human readable string while debugging without having to look up the code in the documentation. Note that it’s not generally a good idea to expose the human readable errors to the end user nor to rely on them never changing. It’s better to reference the error codes and have a localized list of error strings to present to the user when needed.
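One way to keep the end-user text keyed on the error code is a small lookup table on the client. This is only a sketch: the codes, locales, and strings below are invented for illustration.

```python
# Hypothetical error codes and translations kept on the client.
LOCALIZED_ERRORS = {
    "en": {1000: "Please sign in again.", 1001: "We could not find that user."},
    "es": {1000: "Inicia sesión de nuevo.", 1001: "No pudimos encontrar ese usuario."},
}
FALLBACK = {"en": "Something went wrong.", "es": "Algo salió mal."}


def user_message(error_code, locale="en"):
    """Pick the end-user text from the error code, not from the server's string."""
    strings = LOCALIZED_ERRORS.get(locale, LOCALIZED_ERRORS["en"])
    return strings.get(error_code, FALLBACK.get(locale, FALLBACK["en"]))
```

Because the lookup never reads the server’s human readable string, the server is free to reword that string without changing what users see.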

It may be useful to differentiate the errors that should not happen in production from those that may, but the client has to allow for any of them. The errors that should not happen in production would be considered “internal” errors, and could be exposed to the user as such with a code that will help the developers track down the bug.

HTTP response codes. It is tempting to use the HTTP response codes as error codes, but this is quite limiting and can be misleading. The HTTP response codes refer to the HTTP protocol itself, with some overlap into the resource space. An API that uses HTTP and JSON as its underlying protocol and data format is actually an extension of the HTTP protocol, and it turns out there is a more robust way to report errors. In the HTTP protocol, any status code in the 2xx range (200-299) indicates a successful request. When a client makes a request and the server encounters a situation where it needs to return an error, the communication between the client and the server was a success, so it’s appropriate for the HTTP response to carry a 2xx status code. To communicate the error, the server should package it up as a JSON structure that the client will examine to determine that it’s an error.

For example, the API could always return a JSON dictionary with two keys for errors, such as “error_code” and “error_message”.  When the client gets the response, it checks for a 2xx response and then looks at the JSON for a dictionary containing the error_code key.  If it’s there, it can handle the error.  If not, it can handle the data as expected.

For example:

{ "error_code": 1000, "error_message": "User is not authorized for this action" }
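On the client side, the check described above might look like the following Python sketch. The ApiError class and the parse_response function are names made up for this example; only the error_code and error_message keys come from the structure shown above.

```python
import json


class ApiError(Exception):
    """Raised when the JSON body carries an application-level error."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_response(status, body_text):
    """Return the decoded data, or raise if the body describes an error."""
    if not 200 <= status <= 299:
        # The HTTP exchange itself failed, so there is no API payload to trust.
        raise RuntimeError(f"HTTP-level failure: status {status}")
    payload = json.loads(body_text)
    if isinstance(payload, dict) and "error_code" in payload:
        raise ApiError(payload["error_code"], payload.get("error_message", ""))
    return payload
```

Client code can then branch on the code attribute of the exception and keep the message for logs and debugging.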

Security

Security is a big topic, but there are some fundamentals that show up in every API that need to be considered.

Usually the API is protected by some kind of API key. This just limits access to the API to authorized applications and informs the API provider who is using it. Consider having a specific API call that expects the API key and returns an authentication token that is supplied with every API call thereafter.

If you’re interfacing with mobile devices, it is often helpful to have an API that accepts the API key, a device identifier, and user credentials such as their login and password, and returns an authentication_token that can be used to identify them in the future. If the server ever needs to revoke or expire that user’s access to the resources, the authentication_token can be removed from the server’s storage, forcing the user to authenticate again.
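A minimal Python sketch of that token flow follows. The API key, the credential table, and the function names are hypothetical, and the passwords are stored in plain text only to keep the example short; a real server would store salted hashes and persist the tokens somewhere durable.

```python
import secrets

# Illustrative data only; a real server would not store plain-text passwords.
API_KEYS = {"demo-api-key"}
CREDENTIALS = {"pat": "correct horse"}
_tokens = {}  # authentication_token -> (username, device_id)


def login(api_key, device_id, username, password):
    """Return a new authentication_token, or None if anything does not match."""
    if api_key not in API_KEYS:
        return None
    stored = CREDENTIALS.get(username)
    if stored is None:
        return None
    # compare_digest avoids leaking how many characters matched via timing.
    if not secrets.compare_digest(stored.encode(), password.encode()):
        return None
    token = secrets.token_urlsafe(32)
    _tokens[token] = (username, device_id)
    return token


def user_for(token):
    """Look up who a token belongs to; None means the client must log in again."""
    entry = _tokens.get(token)
    return entry[0] if entry else None


def revoke(token):
    """Forget the token so the next request with it is rejected."""
    _tokens.pop(token, None)
```

Keeping the device identifier alongside the token is what would let the server revoke access for one device without logging the user out everywhere.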

The HTTP protocol is a text based protocol and is not encrypted or secure. By running HTTP over TLS (HTTPS), all of the data is encrypted in transit. This is a deep topic and out of the scope of this post, but I recommend that you use it for all of your production APIs and insist on it from anyone providing them to you.

Versioning

Usually bolted on after the fact is versioning. When writing an API, one thing is sure – it will change. When it does, it can cause problems for the clients. A strategy for handling that is to version the API, but make this the last resort, as versioning comes with some costs that I’ll discuss shortly.

Before versioning, consider that, if you’re using JSON to represent your data, you can generally add fields without breaking existing clients using the API.  In those cases, adding additional fields to the response doesn’t require a new version of the API.
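The reason added fields are usually safe is that a client typically reads only the keys it knows about. A short Python sketch, with a hypothetical User type and field names chosen for illustration:

```python
import json
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str


def parse_user(body_text):
    """Read only the fields this client knows about; extra keys are ignored."""
    data = json.loads(body_text)
    return User(id=data["id"], name=data["name"])
```

A client written this way parses a response with an extra "nickname" field exactly as it parsed the older response. The same is not true of removing or renaming a field, which is where a new version becomes necessary.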

If your needs are more complex than that, it’s time to create a new version of the API.

A simple approach is to add the version number to the URL – for example https://my.api.com/v1/users. If we need to change the nature of the result from the users API, but there are production clients using the current one, what do we do? We could add a new route, but that would pollute the namespace. With versioning, we can just add a new version – e.g. https://my.api.com/v2/users.
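In code, versioning by URL can be as simple as a route table with one handler per version. The sketch below is an assumption-laden illustration: the handlers, the sample data, and the change between versions (splitting the name into two fields) are all invented.

```python
def list_users_v1():
    """Original shape that existing production clients depend on."""
    return [{"id": 1, "name": "Pat Lee"}]


def list_users_v2():
    """Changed shape: the name is split, which would break v1 clients."""
    return [{"id": 1, "first_name": "Pat", "last_name": "Lee"}]


# Both versions live side by side under their own path prefix.
ROUTES = {
    ("GET", "/v1/users"): list_users_v1,
    ("GET", "/v2/users"): list_users_v2,
}


def dispatch(method, path):
    """Find the handler for a verb and path, then call it."""
    handler = ROUTES.get((method, path))
    if handler is None:
        raise LookupError(f"no route for {method} {path}")
    return handler()
```

Existing clients keep calling the v1 path and see no change, while new clients opt in to v2 when they are ready.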

Ok, so what are the costs? When adding a new version, consider if it’s more effective and maintainable to add the new version of the API to the existing code base or to have a separate code base that will grow while you maintain the original version for legacy purposes. Either decision involves additional concern for the code base – either more complexity in one code base with multiple API versions, or an additional code base per version, which requires patching each one for security updates and other maintenance issues.

Automated Testing

One last thing. When clients are using your API, their productivity will be ruined if the API doesn’t work. To be sure an API works, it must be tested, but testing can be hard since there is no user interface for the API. There are great tools like Swagger, which I discuss in the One Grape API post, to expose the API and make it testable by humans, but there is no substitute for automated testing, where you have a suite of tests that run against the API to prove that it works. If you miss a test case and get a report of a bug, add a test for it and it’ll never be back!
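As a small taste of what such a suite can look like, here is a sketch using Python’s built-in unittest module. The list_users function is a hypothetical stand-in for the handler behind a users route; a real suite would more likely send requests to a running test server.

```python
import unittest


def list_users(database):
    """Stand-in for the handler behind a users list route."""
    return [{"id": user_id, "name": name} for user_id, name in sorted(database.items())]


class ListUsersTests(unittest.TestCase):
    def test_empty_database_returns_empty_list(self):
        # Listing users should always return a list, even if it is empty.
        self.assertEqual(list_users({}), [])

    def test_every_record_has_the_same_keys(self):
        users = list_users({2: "Sam", 1: "Pat"})
        self.assertEqual([user["id"] for user in users], [1, 2])
        for user in users:
            self.assertEqual(set(user), {"id", "name"})


def run_suite():
    """Run the tests and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ListUsersTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

When a bug report comes in, the fix starts with a new test method that reproduces it, and that test stays in the suite from then on.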

Fin

 
