The Web API Trap

[X]aaS is great for proof-of-concept work, rapid prototyping, and other quick-turnaround development efforts, especially when it provides easy-to-use API endpoints. However, becoming over-dependent on these third-party vendors for your production systems can be a hidden liability to your product and business.

I recently had the honor of being a judge at HackGT, an annual hackathon put on by Georgia Tech students for college students.  It is a 36-hour coding sprint, and in the end it was incredible to see what everyone had built and how well many of the systems worked.  To get systems up and running quickly, many of the completed projects leveraged existing web APIs for features like mapping, computer vision, natural language processing (NLP), remote storage, etc., ad nauseam.  Many applications had a relatively high degree of polish and functionality after only 36 hours of work, which is a testament to the hackers as well as to the power of web APIs.

That’s what you want, right?

On the surface, you’d think the fast turnaround of a solution to a challenge would validate web APIs as the go-to solution for just about anything.  In some cases, I would agree.  For hackathons, engineering PoCs, and other short-lived projects, the ROI on web APIs is really hard to beat unless you happen to be a subject matter expert; even then it’s a few lines of code vs potentially thousands.  What sparked this post, though, was two comments I encountered at the HackGT event (paraphrased):

“Projects should be partially judged on code originality versus simply gluing together a bunch of web API calls.”

“We love our project, and we’re thinking about launching it on the app store after we get some sleep.”

The mixed blessing of web APIs

While web APIs make it easy to build something fast, they can wind up being a mixed blessing for an entrepreneurial venture.  This follows from:

  • Lack of intellectual property (IP)
  • Scalability
  • Vendor lock-in
  • Accountability

IP’s where it’s at!

As the judging-criteria comment above suggests, if you simply glue a bunch of web API calls together, you really haven’t created any defensible IP.  At best you have a copyright, but if you’re seriously interested in taking your project to the next level as a venture, you need to make sure you have some kind of original and defensible IP.  One way to tackle this might be to build custom implementations, where it makes sense, that provide domain-specific variants of the APIs used in your code, thus creating value for your venture.  Another might be to locate potential partners or professional organizations in your domain to see if there is a need for a domain-specific service rather than a general-purpose API, which might wind up being a second business proposition for your lean startup!

Will it scale?

Many web APIs out there that are free to use might not scale well.  Say you hit the jackpot, and your app lands on the top of Slashdot, Reddit, HN, and/or some other major online outlet.  Are your users going to get spinning icons and timeouts, or is your app going to just work?  Often web APIs have a free tier, but then stick it to you when you need to scale.  If your app is free, this can wind up costing you dearly.  Sometimes replicating an API service yourself is a way to reduce costs, and it lets you customize some of the behaviors to fit your needs – and keep customizing them going forward.  It also lets you manage the availability of your feature with more granularity based on your actual business needs.
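To put rough numbers on the free-tier point, here is a back-of-the-envelope sketch in Python.  The quota and the per-call price are made up for illustration and are not any vendor’s actual pricing, and `monthly_api_cost` is a hypothetical helper; plug in the figures from your own provider’s price sheet.

```python
def monthly_api_cost(calls: int, free_quota: int, price_per_1000: float) -> float:
    """Estimate a monthly bill for a metered web API.

    Assumes the simplest possible pricing model: a free quota, then a flat
    price per 1,000 calls. Real vendors often use tiers or minimums instead.
    """
    billable_calls = max(0, calls - free_quota)
    return billable_calls / 1000 * price_per_1000


# Hypothetical pricing: 50,000 free calls per month, then $2.00 per 1,000 calls.
FREE_QUOTA = 50_000
PRICE_PER_1000 = 2.00

# A weekend prototype: 10,000 calls in a month stays inside the free tier.
prototype_cost = monthly_api_cost(10_000, FREE_QUOTA, PRICE_PER_1000)

# After a front-page spike: 10,000 daily users x 20 calls each x 30 days.
viral_calls = 10_000 * 20 * 30
viral_cost = monthly_api_cost(viral_calls, FREE_QUOTA, PRICE_PER_1000)

print(f"prototype: ${prototype_cost:,.2f}  viral: ${viral_cost:,.2f}")
```

Under these assumed numbers the prototype costs nothing, while the same app at 6,000,000 calls a month runs to five figures, with no revenue to offset it if the app is free.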

Vendor lock-in – bad!

I’ve personally seen companies I’ve consulted with get stuck in vendor lock-in due to over-reliance on external web APIs.  Often the features provided by the API were not properly abstracted, and as such the entire codebase now depends on a specific external service.  Think of the great left-pad debacle, or of Superstorm Sandy taking down many datacenters on the east coast of the US: both show how overuse of external services and data providers can be a source of risk.  (Granted, the left-pad case was about the Node.js community assuming that a specific module would always be available to download, but nobody had a contingency plan in place when it disappeared.)

If your entire product depends on multiple external providers that you have no control over and no contractual guarantees with, your codebase is at risk.  If you do rely on an external web API or service, make sure your code can accommodate a switch in the event that costs skyrocket, functionality changes drastically, or in the worst case the API just disappears into the aether.  Always have an escape hatch from a vendor, and at a minimum some solid risk-management contingency plans.
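One way to build that escape hatch is to hide each vendor behind a small interface of your own, so the rest of the codebase never imports a vendor SDK directly.  The Python sketch below is a minimal illustration, not a production design.  Geocoding is just an example feature, every class name here (`Geocoder`, `VendorAGeocoder`, `LocalGeocoder`, `FailoverGeocoder`) is hypothetical, and the "vendor" is simulated with an outage rather than a real network call.

```python
from __future__ import annotations

from typing import Protocol, Sequence

Coordinates = tuple[float, float]


class Geocoder(Protocol):
    """The only interface the rest of the app is allowed to depend on."""

    def geocode(self, address: str) -> Coordinates: ...


class VendorAGeocoder:
    """Hypothetical adapter; a real one would call vendor A's web API here."""

    def geocode(self, address: str) -> Coordinates:
        # Simulated outage so the failover path can be seen without a network.
        raise ConnectionError("vendor A is unreachable")


class LocalGeocoder:
    """Hypothetical in-house fallback backed by a small lookup table."""

    def __init__(self, table: dict[str, Coordinates]) -> None:
        self._table = table

    def geocode(self, address: str) -> Coordinates:
        return self._table[address]  # raises KeyError for unknown addresses


class FailoverGeocoder:
    """Tries each provider in order and returns the first successful answer."""

    def __init__(self, providers: Sequence[Geocoder]) -> None:
        self._providers = list(providers)

    def geocode(self, address: str) -> Coordinates:
        last_error: Exception | None = None
        for provider in self._providers:
            try:
                return provider.geocode(address)
            except Exception as exc:  # a real adapter would catch narrower errors
                last_error = exc
        raise RuntimeError(f"no provider could geocode {address!r}") from last_error


# Usage: application code only ever sees a Geocoder, never a vendor SDK.
# The address and coordinates are placeholder values.
geocoder = FailoverGeocoder(
    [VendorAGeocoder(), LocalGeocoder({"1 Example Street": (12.5, -45.25)})]
)
location = geocoder.geocode("1 Example Street")
```

Swapping vendors then means writing one new adapter class and changing the list passed to `FailoverGeocoder`, instead of hunting down vendor-specific calls scattered through the codebase.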

But it wasn’t us!

Especially in the current AI accountability era, the use of external web APIs to access pre-trained machine learning (ML) models has exploded – but it can also be fraught with risk.  Explainable AI is still an open research problem, and nobody has really come up with a good way to audit so-called “black box” ML systems.  Since many of these models are proprietary and only allow access through limited APIs, you have no idea what biases (if any) or weaknesses (such as susceptibility to adversarial attacks through feature engineering) the models you are using might have.  If you have developed your own models, you can at the very least provide information regarding the data set(s) used, features used, data normalization/regularization techniques, ML method used, etc., if you are audited or claims of bias surface.  Otherwise you are at the mercy of the API vendor to bail you out.

Above and beyond AI and ML, not all data services are created equal, nor are they all unbiased.  Google’s Knowledge Graph provides results based on what it considers most relevant based on searches, which might not dovetail exactly with what your software needs.  Data can be missing, as with many vehicle VIN lookup systems, or in the worst case incorrect.  Navigation systems might use different algorithms; Garmin, TomTom and Google Maps have all been known to give me different “optimal” routes to the same destination.

Bottom line: If you aren’t in control of mission-critical data for your product, you lack accountability, transparency, and auditability.  It’s a risk that might not have a lot of impact in a free game, but might have significant impact when applied to real world problems.  Google’s infamous image classification snafu and its “fix” are both examples of how biases can get introduced into services and datasets.

Now what?

Web APIs and external services are a great boon to productivity, allowing developers (and in some cases non-developers) to build solutions quickly and efficiently.  That being said, depending on the ultimate use case (internal vs external, free vs commercialized, prototyping vs production) there are varying risks associated with their use.  Just be aware, plan accordingly, and at the end of the day build the best application you can with the best tools for the job – and hopefully have fun doing it!