API Deployment News

These are the news items I've curated in my monitoring of the API space that are related to deploying APIs and that I thought were worth including in my research. I'm using all of these links to better understand how APIs are being deployed across a diverse range of implementations.

Importing OpenAPI Definition To Create An API With AWS API Gateway

I've been learning more about AWS API Gateway, and wanted to share some of what I'm learning with my readers. The AWS API Gateway is a robust way to deploy and manage an API on the AWS platform. The concept of an API gateway has been around for years, but the AWS approach reflects the commoditization of API deployment and management, making it a worthwhile cloud API service to understand in more depth. With the acquisition of all the original API management providers in recent years, as well as Apigee's IPO, API management is now a default element of the major cloud providers. Since AWS is the leading cloud provider, AWS API Gateway will play a significant role in the deployment and management of a growing number of the APIs we see.

Using AWS API Gateway you can deploy a new API, or you can use it to manage an existing API–demonstrating the power of a gateway. What really makes AWS API Gateway reflect where things are going in the space is the ability to import and define your API using OpenAPI. When you first begin with the new API wizard, you can upload or copy / paste your OpenAPI definition, defining the surface area of the API, no matter how you are wiring up the backend. OpenAPI is primarily associated with publishing API documentation because of the success of Swagger UI, and secondarily associated with generating SDKs and code samples. However, the OpenAPI specification is increasingly also being used to deploy APIs and to define aspects of API management, which is in alignment with the AWS API Gateway approach.
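
To give a sense of what this looks like outside of the console wizard, here is a minimal sketch of importing an OpenAPI definition into AWS API Gateway using the boto3 SDK–the file name is a placeholder, and your definition may need API Gateway specific extensions depending on how the backend is wired up.

```python
# A minimal sketch: import an OpenAPI definition as a new API in AWS API Gateway.
# Assumes AWS credentials are configured; "openapi.json" is a placeholder file name.
import boto3

client = boto3.client("apigateway")

with open("openapi.json", "rb") as f:
    api = client.import_rest_api(body=f.read(), failOnWarnings=True)

print(api["id"], api["name"])
```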

I have code that will take an OpenAPI definition and generate the server side code needed to work with the database, and handle requests and responses using the Slim API framework. I'll keep doing this for many of my APIs, but for some of them I'm going to be adopting an AWS API Gateway approach to help standardize API deployment and management across the APIs I deliver. I have multiple clients right now whose APIs I am deploying and helping manage using AWS, so adopting AWS API Gateway makes sense. One client is already operating on AWS, which dictated that I keep things there, but the other client is brand new and looking for a host, which also makes AWS a good option for setting everything up and then passing control of the account and operations over to their on-site manager.

Both of my current projects are using OpenAPI as the central definition for all stops along the API lifecycle. One API is new, and the other is about proxying an existing API. Importing the central OpenAPI definition for each project into AWS API Gateway worked well to get both projects going. Next, I am exploring the staging features of AWS API Gateway, and the ability to overwrite or merge the next iteration of the OpenAPI with an existing API, allowing me to evolve each API forward in coming months, as the central definition gets added to. Historically, I haven't been a fan of API gateways, but I have to admit that the AWS API Gateway is somewhat changing my tune. Keeping the service doing one thing, and doing it well, with OpenAPI as the definition, really fits with my vision of where API management is headed as the API sector matures.
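
As I dig into the staging side of things, the iteration step I am describing looks roughly like the sketch below–pushing the next version of the central OpenAPI definition into an existing API with a merge (or overwrite), and then deploying it to a stage. The API ID and stage name are hypothetical placeholders.

```python
# A rough sketch of evolving an existing API Gateway API from an updated OpenAPI definition.
# The REST API ID and stage name are hypothetical placeholders.
import boto3

client = boto3.client("apigateway")
REST_API_ID = "abc123"

with open("openapi.json", "rb") as f:
    # mode can be "merge" to layer changes in, or "overwrite" to replace the definition
    client.put_rest_api(restApiId=REST_API_ID, mode="merge", body=f.read())

# Publish the updated definition to a named stage
client.create_deployment(restApiId=REST_API_ID, stageName="dev")
```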


Obfuscating The Evolving Code Behind My API

I'm dialing in a set of machine learning APIs that I use to obfuscate and distort the images I use across my storytelling. The code is getting a little more hardened, but there is still so much work ahead when it comes to making sure it does exactly what I need it to do, with only the dials and controls I need–nothing more. I'm the only consumer of these APIs, which I use daily, making updates to the code along the way, and evolving the request and response structures until they meet my needs. Eventually the APIs will be done (enough), and I'll stop messing with them, but that will take a couple more months of pushing the code forward.

While the code for these APIs is far from finished, I find the API helps obfuscate and diffuse the unfinished nature of things. The API maintains a single set of paths, and while I might still evolve the number of parameters it accepts, and the fields it outputs, the overall interface will keep a pretty tight surface area despite the perpetually unfinished backend. I like this aspect of operating APIs, and how they can be used as a facade, allowing you to maintain one narrative on the front-end, even with another occurring behind the scenes. I feel like API facades really fit with my style of coding. I'm not really an A grade programmer, more a B- level one, but I know how to get things to work–if I have a polished facade, things look good.
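
To illustrate the facade idea, here is a minimal sketch of the pattern–the route, parameters, and response shape stay stable while the function behind it keeps changing. The endpoint and field names are hypothetical, not my actual image APIs.

```python
# A minimal sketch of an API facade: the route and response shape stay stable
# while the backend function behind it keeps evolving. Names are hypothetical.
from flask import Flask, request, jsonify

app = Flask(__name__)


def distort_image(image_url, intensity=0.5):
    # The unfinished, evolving backend lives here (model loading, image processing, etc.)
    # and can change freely as long as it keeps returning the same shape of result.
    return {"url": image_url, "intensity": intensity, "status": "processed"}


@app.route("/images/distort", methods=["POST"])
def distort():
    payload = request.get_json(force=True)
    result = distort_image(payload["url"], payload.get("intensity", 0.5))
    # The response structure is the contract consumers see, regardless of backend churn
    return jsonify(result)
```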

Honestly, I'm pretty embarrassed by my wrappers for TensorFlow. I'm still figuring everything out, and finding new ways of manipulating and applying TensorFlow models, so my code is rapidly changing, maturing, and not always complete. When it comes to the API interface I am focused on not introducing breaking changes, and maintaining a coherent request and response structure for the image manipulation APIs. Even though the code will at some point be stabilized, with updates becoming less frequent, the code will probably not see the light of day on Github, but the API might eventually be shared beyond just me, providing a simple, usable interface someone else can use.

I wouldn't develop APIs like this outside a controlled group of consumers, but I find being both API provider and consumer pushes me to walk a line between an unfinished and evolving backend and a stable, simple API without any major breaking changes. The fewer fixes I have to do to my API clients, the more successful I've been at pushing the backend forward, while keeping the API in forward motion without the friction. I guess I just feel like the API makes for a nice wrapper for code, acting as a shock absorber for the ups and downs of evolving code, and getting it to do what you need. Providing a nice facade that obfuscates the evolving code behind my API.


Provide An Open Source Threat Information Database And API Then Sell Premium Data Subscriptions

I was doing some API security research and stumbled across vFeed, a “Correlated Vulnerability and Threat Intelligence Database Wrapper”, providing a JSON API of vulnerabilities from the vFeed database. The approach is a Python API, not a web API, but I think it provides an interesting blueprint for open source APIs. What I found (somewhat) interesting about the vFeed approach was the fact that they provide an open source API and database, but if you want a production version of the database with all the threat intelligence, you have to pay for it.

I would say their technical and business approach needs a significant amount of work, but I think there is a workable version of it in there. First, I would create Python, PHP, Node.js, Java, Go, and Ruby versions of the API, making sure it is a web API–see the sketch below. Next, remove the production restriction on the database, allowing anyone to deploy a working edition, just minus all the threat data. There is a lot of value in there being an open source set of threat intelligence sharing databases and APIs. Then after that, get smarter about having a variety of different free and paid data subscriptions, not just a single database–leverage the API presence.
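
To make the web API part of that suggestion concrete, here is a minimal sketch of what one of those versions could look like–a simple HTTP endpoint over a local copy of the open source threat database. The database file, table, and columns are hypothetical placeholders, not vFeed's actual schema.

```python
# A minimal sketch of a web API over a local copy of an open source threat database.
# The SQLite file, table, and columns are hypothetical placeholders.
import sqlite3
from flask import Flask, jsonify, request

app = Flask(__name__)
DB_PATH = "threats.db"


@app.route("/vulnerabilities")
def list_vulnerabilities():
    severity = request.args.get("severity")
    query = "SELECT id, cve_id, summary, severity FROM vulnerabilities"
    params = ()
    if severity:
        query += " WHERE severity = ?"
        params = (severity,)
    with sqlite3.connect(DB_PATH) as conn:
        conn.row_factory = sqlite3.Row
        rows = [dict(r) for r in conn.execute(query, params)]
    return jsonify(rows)
```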

You could also get smarter about how the database and API enable companies to share their threat data, plugging it into a larger network, making some of it free, and some of it paid–with revenue share all around. There should be a suite of open source threat information sharing databases and APIs, and a federated network of API implementations. Complete with a wealth of open data for folks to tap into and learn from, but also with some revenue generating opportunities throughout the long tail, helping companies fund aspects of their API security operations. Budget shortfalls are a big contributor to security incidents, and some revenue generating activity would be positive.

So, not a perfect model, but enough food for thought to warrant a half-assed blog post like this. Smells like an opportunity for someone out there. Threat information sharing is just one dimension of my API security research where I'm looking to evolve the narrative around how APIs can contribute to security in general. However, there is also an opportunity for enabling the sharing of API related security information, using APIs. Maybe also generating revenue along the way, helping feed the development of tooling like this, funding individual implementations and threat information nodes, or possibly even funding more storytelling around the concept of API security as well. ;-)


Looking At The 37 Apache Data Projects

I’m spending time investing in my data, as well as my database API research. I’ll have guides, with accompanying stories coming out over the next couple weeks, but I want to take a moment to publish some of the raw research that I think paints an interesting picture about where things are headed.

When studying what is going on with data and APIs you can't do any search without stumbling across an Apache project doing something or other with data. I found 37 separate projects at Apache that were data related, and wanted to publish them as a single list I could learn from.

  • Airavata - Apache Airavata is a micro-service architecture based software framework for executing and managing computational jobs and workflows on distributed computing resources including local clusters, supercomputers, national grids, academic and commercial clouds. Airavata is dominantly used to build Web-based science gateways and assist to compose, manage, execute, and monitor large scale applications (wrapped as Web services) and workflows composed of these services.
  • Ambari - Apache Ambari makes Hadoop cluster provisioning, managing, and monitoring dead simple.
  • Apex - Apache Apex is a unified platform for big data stream and batch processing. Use cases include ingestion, ETL, real-time analytics, alerts and real-time actions. Apex is a Hadoop-native YARN implementation and uses HDFS by default. It simplifies development and productization of Hadoop applications by reducing time to market. Key features include Enterprise Grade Operability with Fault Tolerance, State Management, Event Processing Guarantees, No Data Loss, In-memory Performance & Scalability and Native Window Support.
  • Avro - Apache Avro is a data serialization system.
  • Beam - Apache Beam is a unified programming model for both batch and streaming data processing, enabling efficient execution across diverse distributed execution engines and providing extensibility points for connecting to different technologies and user communities.
  • Bigtop - Bigtop is a project for the development of packaging and tests of the Apache Hadoop ecosystem. The primary goal of Bigtop is to build a community around the packaging and interoperability testing of Hadoop-related projects. This includes testing at various levels (packaging, platform, runtime, upgrade, etc…) developed by a community with a focus on the system as a whole, rather than individual projects. In short we strive to be for Hadoop what Debian is to Linux.
  • BookKeeper - BookKeeper is a reliable replicated log service. It can be used to turn any standalone service into a highly available replicated service. BookKeeper is highly available (no single point of failure), and scales horizontally as more storage nodes are added.
  • Calcite - Calcite is a framework for writing data management systems. It converts queries, represented in relational algebra, into an efficient executable form using pluggable query transformation rules. There is an optional SQL parser and JDBC driver. Calcite does not store data or have a preferred execution engine. Data formats, execution algorithms, planning rules, operator types, metadata, and cost model are added at runtime as plugins.
  • CouchDB - Apache CouchDB is a database that completely embraces the web. Store your data with JSON documents. Access your documents with your web browser, via HTTP. Query, combine, and transform your documents with JavaScript. Apache CouchDB works well with modern web and mobile apps. You can even serve web apps directly out of Apache CouchDB. And you can distribute your data, or your apps, efficiently using Apache CouchDB’s incremental replication. Apache CouchDB supports master-master setups with automatic conflict detection.
  • Crunch - The Apache Crunch Java library provides a framework for writing, testing, and running MapReduce pipelines. Its goal is to make pipelines that are composed of many user-defined functions simple to write, easy to test, and efficient to run.
  • DataFu - Apache DataFu consists of two libraries: Apache DataFu Pig is a collection of useful user-defined functions for data analysis in Apache Pig. Apache DataFu Hourglass is a library for incrementally processing data using Apache Hadoop MapReduce. This library was inspired by the prevalence of sliding window computations over daily tracking data. Computations such as these typically happen at regular intervals (e.g. daily, weekly), and therefore the sliding nature of the computations means that much of the work is unnecessarily repeated. DataFu’s Hourglass was created to make these computations more efficient, yielding sometimes 50-95% reductions in computational resources.
  • Drill - Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by Google’s Dremel.
  • Edgent - Apache Edgent is a programming model and micro-kernel style runtime that can be embedded in gateways and small footprint edge devices enabling local, real-time, analytics on the continuous streams of data coming from equipment, vehicles, systems, appliances, devices and sensors of all kinds (for example, Raspberry Pis or smart phones). Working in conjunction with centralized analytic systems, Apache Edgent provides efficient and timely analytics across the whole IoT ecosystem: from the center to the edge.
  • Falcon - Apache Falcon is a data processing and management solution for Hadoop designed for data motion, coordination of data pipelines, lifecycle management, and data discovery. Falcon enables end consumers to quickly onboard their data and its associated processing and management tasks on Hadoop clusters.
  • Flink - Flink is an open source system for expressive, declarative, fast, and efficient data analysis. It combines the scalability and programming flexibility of distributed MapReduce-like platforms with the efficiency, out-of-core execution, and query optimization capabilities found in parallel databases.
  • Flume - Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating and moving large amounts of log data from many different sources to a centralized data store.
  • Giraph - Apache Giraph is an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.
  • Hama - Apache Hama is an efficient and scalable general-purpose BSP computing engine which can be used to speed up a large variety of compute-intensive analytics applications.
  • Helix - Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes. Helix automates reassignment of resources in the face of node failure and recovery, cluster expansion, and reconfiguration.
  • Ignite - Apache Ignite In-Memory Data Fabric is designed to deliver uncompromised performance for a wide set of in-memory computing use cases, from high performance computing to the industry's most advanced data grid, in-memory SQL, in-memory file system, streaming, and more.
  • Kafka - A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients. Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of co-ordinated consumers. Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees. Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.
  • Knox - The Apache Knox Gateway is a REST API Gateway for interacting with Hadoop clusters. The Knox Gateway provides a single access point for all REST interactions with Hadoop clusters. In this capacity, the Knox Gateway is able to provide valuable functionality to aid in the control, integration, monitoring and automation of critical administrative and analytical needs of the enterprise.
  • Lens - Lens provides a Unified Analytics interface. Lens aims to cut the Data Analytics silos by providing a single view of data across multiple tiered data stores and optimal execution environment for the analytical query. It seamlessly integrates Hadoop with traditional data warehouses to appear like one.
  • MetaModel - With MetaModel you get a uniform connector and query API to many very different datastore types, including: Relational (JDBC) databases, CSV files, Excel spreadsheets, XML files, JSON files, Fixed width files, MongoDB, Apache CouchDB, Apache HBase, Apache Cassandra, ElasticSearch, OpenOffice.org databases, Salesforce.com, SugarCRM and even collections of plain old Java objects (POJOs). MetaModel isn’t a data mapping framework. Instead we emphasize abstraction of metadata and ability to add data sources at runtime, making MetaModel great for generic data processing applications, less so for applications modeled around a particular domain.
  • Oozie - Oozie is a workflow scheduler system to manage Apache Hadoop jobs. Oozie is integrated with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system specific jobs (such as Java programs and shell scripts).
  • ORC - ORC is a self-describing type-aware columnar file format designed for Hadoop workloads. It is optimized for large streaming reads, but with integrated support for finding required rows quickly. Storing data in a columnar format lets the reader read, decompress, and process only the values that are required for the current query.
  • Parquet - Apache Parquet is a general-purpose columnar storage format, built for Hadoop, usable with any choice of data processing framework, data model, or programming language.
  • Phoenix - Apache Phoenix enables OLTP and operational analytics for Apache Hadoop by providing a relational database layer leveraging Apache HBase as its backing store. It includes integration with Apache Spark, Pig, Flume, Map Reduce, and other products in the Hadoop ecosystem. It is accessed as a JDBC driver and enables querying, updating, and managing HBase tables through standard SQL.
  • REEF - Apache REEF (Retainable Evaluator Execution Framework) is a development framework that provides a control-plane for scheduling and coordinating task-level (data-plane) work on cluster resources obtained from a Resource Manager. REEF provides mechanisms that facilitate resource reuse for data caching, and state management abstractions that greatly ease the development of elastic data processing workflows on cloud platforms that support a Resource Manager service.
  • Samza - Apache Samza provides a system for processing stream data from publish-subscribe systems such as Apache Kafka. The developer writes a stream processing task, and executes it as a Samza job. Samza then routes messages between stream processing tasks and the publish-subscribe systems that the messages are addressed to.
  • Spark - Apache Spark is a fast and general engine for large-scale data processing. It offers high-level APIs in Java, Scala and Python as well as a rich set of libraries including stream processing, machine learning, and graph analytics.
  • Sqoop - Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
  • Storm - Apache Storm is a distributed real-time computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing real-time computation.
  • Tajo - The main goal of the Apache Tajo project is to build an advanced open source data warehouse system in Hadoop for processing web-scale data sets. Basically, Tajo provides SQL standard as a query language. Tajo is designed for both interactive and batch queries on data sets stored on HDFS and other data sources. Without hurting query response times, Tajo provides fault-tolerance and dynamic load balancing which are necessary for long-running queries. Tajo employs cost-based and progressive query optimization techniques for optimizing running queries in order to avoid the worst query plans.
  • Tez - Apache Tez is an effort to develop a generic application framework which can be used to process arbitrarily complex directed-acyclic graphs (DAGs) of data-processing tasks and also a reusable set of data-processing primitives which can be used by other projects.
  • VXQuery - Apache VXQuery will be a standards compliant XML Query processor implemented in Java. The focus is on the evaluation of queries on large amounts of XML data. Specifically the goal is to evaluate queries on large collections of relatively small XML documents. To achieve this queries will be evaluated on a cluster of shared nothing machines.
  • Zeppelin - Zeppelin is a modern web-based tool for the data scientists to collaborate over large-scale data exploration and visualization projects.

There is a serious amount of overlap between these projects. Not all of these projects have web APIs, while some of them are all about delivering a gateway or aggregate API across projects. There is a lot to process here, but I think listing them out provides an easier way to understand the big data explosion of projects over at Apache.

It is tough to understand what each of these does without actually playing with them, but that is something I just don't have the time to do, so next up I'll be doing independent searches for these project names, and finding stories from across the space regarding what folks are doing with these data solutions. That should give me enough to go on when putting them into specific buckets, and finding their place in my data and database API research.


Database To Database Then API, Instead Of Directly To API

I am working with a team to expose a database as an API. With projects like this there can be a lot of anxiety in exposing a database directly as an API. Security is the first one, but in my experience, most of the time security is just cover for anxiety about a messy backend. The group I’m working with has been managing the same database for over a decade, adding on clients, and making the magic happen via a whole bunch of databases and table kung fu. Keeping this monster up and running has been priority number one, and evolving, decentralizing, or decoupling has never quite been a priority.

The database team has learned the hard way, and they have the resources to keep things up and running, but never seem to have them when it comes to refactoring and thinking differently, let alone tackling the delivery of a web API on top of things. There will need to be a significant amount of education and training around REST, and doing APIs properly, before we can move forward, something there really isn't a lot of time or interest in doing. To help bridge the gap I am suggesting that we do an entirely new API, with its own database, and that we focus on database to database communication, since that is what the team knows. We can launch an Amazon RDS instance, with an EC2 instance running the API, and the database team can work directly with RDS (MySQL), which they are already familiar with.

We can have a dedicated API team handle the new API and database, and the existing team can handle the syncing from database to database. This also keeps the messy, aggregate, overworked database out of reach of the new API. We get an API. The database team's anxiety levels are lowered. It balances things out a little. Sure, there will still be some work between databases, but the API can be a fresh start, and it won't be burdened by the legacy. The database to database connection can carry this load. Maybe once this pilot is done, the database team will feel a little better about doing APIs, and be a little more involved with the next one.
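
To show what the database to database piece might look like in practice, here is a rough sketch of a sync script between the legacy database and the new RDS (MySQL) instance behind the API–the hosts, credentials, table, and columns are all hypothetical placeholders.

```python
# A rough sketch of a database to database sync from the legacy MySQL database
# to the new RDS (MySQL) instance behind the API. All names are hypothetical.
import pymysql

legacy = pymysql.connect(host="legacy.internal", user="sync", password="***", database="legacy_db")
rds = pymysql.connect(host="api-db.xxxxxx.us-east-1.rds.amazonaws.com", user="api", password="***", database="api_db")

with legacy.cursor() as src, rds.cursor() as dst:
    # Pull recently changed rows from the legacy system
    src.execute("SELECT id, name, updated_at FROM customers WHERE updated_at > NOW() - INTERVAL 1 DAY")
    for row in src.fetchall():
        # Upsert into the cleaner schema the new API is built on
        dst.execute(
            "INSERT INTO customers (id, name, updated_at) VALUES (%s, %s, %s) "
            "ON DUPLICATE KEY UPDATE name = VALUES(name), updated_at = VALUES(updated_at)",
            row,
        )
rds.commit()
```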

I am going to pitch this approach in the coming weeks. I'm not sure if it will be well received, but I'm hoping it will help bridge the new to the old a little bit. I know the database team likes to keep things centralized, which is one reason they have this legacy beast, so there might be some more selling to occur on that front. Doing APIs isn't always about the technical. It is often about the politics of how things get done on the ground. Many organizations have messy databases, which they worry will make them look bad when any of it is exposed as an API. I get it, we are all self-conscious about the way our backends look. However, sometimes we still need to find ways to move things forward, and find compromise. I hope this database to database, then to API approach does the trick.


Just Waiting The GraphQL Assault Out

I was reading a story on GraphQL this weekend, which I won't be linking to or citing because that is what they want, and they do not deserve the attention–it was just (yet) another hating on REST post. As I've mentioned before, GraphQL's primary strength seems to be the endless waves of bros who love to write blog posts hating on REST and web APIs. This particular post shows its absurdity by stating that HTTP is just a bad idea, wait…uh what? Yeah, you know that thing we use for the entire web, apparently it's just not a good idea when it comes to exchanging data. Ok, buddy.

When it comes to GraphQL, I'm still watching, learning, and will continue evaluating it as a tool in my API toolbox, but when it comes to the argument of GraphQL vs. web APIs, I will just be waiting out the current assault as I did with all the other haters. The linked data haters ran out of steam. The hypermedia haters ran out of steam. The GraphQL haters will also run out of steam. All of these technologies are viable tools in our API toolbox, but NONE of them are THE solution. These assaults on “what came before” are just a very tired tactic in the toolbox of startups–you hire young men, give them some cash (which doesn't last for long), get them all wound up, and let them loose talking trash on the space, selling your warez.

GraphQL has many uses. It is not a replacement for web APIs. It is just one tool in our toolbox. If you are following the advice of any of these web API haters you will wake up in a couple of years with a significant amount of technical debt, and probably also be very busy chasing the next wave of technology being pushed by vendors. My advice is that all API providers learn about the web, gain several years of experience developing web APIs, and learn about linked data, hypermedia, GraphQL, and even gRPC if you have some high performance, high volume needs. Don't spend much time listening to the haters, as they really don't deserve your attention. Eventually they will go away, find another job, and another technological kool-aid to drink.

In my opinion, there is (almost) always a grain of usefulness with each wave of technology that comes along. The trick is cutting through the bullshit, tuning out the haters, and understanding what is real and what is not real when it comes to the vendor noise. You should not be adopting every trend that comes along, but you should be tuning into the conversation and learning. After you do this long enough you will begin to see the patterns and tricks used by folks trying to push their warez. Hating on whatever came before is just one of these tricks. This is why startups hire young, energetic, and usually male voices to lead this charge, as they have no sense of history, and truly believe what they are pushing. Your job as a technologist is to develop the experience necessary to know what is real, and what is not, and keep a cool head as the volume gets turned up on each technological assault.


API Foraging And Wildcraft

I was in Colorado this last week at a CA internal gathering, listening to my friend Erik Wilde talk about APIs. One concept he touched on was what he called API gardening, where different types of API providers approach the planting, cultivating, and maintenance of their gardens in a variety of different ways. I really like this concept, and will be working my way through the slide deck from his talk to see what else I can learn from his work.

As he was talking I was thinking about a project I had just published to Github, which leverages the Google Sheets API and the Github API to publish data as YAML, so I could publish it as a simple set of static APIs. I'd consider this approach to be more about foraging and wildcrafting than about tending to my API garden. My API garden just happens to be spread across a variety of services, often free and public spaces, in many different regions. I will pay for a service when it is needed, but I'd rather tend smaller, wilder APIs that grow in existing spaces–in the cracks.

I use free services, and build upon the APIs for the services I am already using. I publish calendars to Google Calendar, then pull the data and publish APIs using the Google APIs. I use the Flickr and Instagram APIs for storing, publishing, sharing, and integrating images with my applications. I pull data from the Twitter and Reddit APIs, and store it in Google Sheets. All of this calendar, image, messaging, and link data will ultimately get published to Github as YAML, which is then shared as XML, JSON, Atom, and CSV via static APIs. I am always foraging for data using public services, then planting the seeds, and wildcrafting other APIs–in hopes something will eventually grow.
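
The publish side of this foraging looks roughly like the sketch below–taking whatever was pulled from a source API and committing it to a Github repository as YAML using the Github contents API. The repository, path, and token are hypothetical placeholders, and updating an existing file also requires passing its current sha.

```python
# A rough sketch of publishing foraged data to Github as YAML via the contents API.
# The repository, path, and token are hypothetical placeholders.
import base64
import requests
import yaml

data = [{"title": "Example entry", "url": "https://example.com"}]  # whatever was foraged
body = yaml.safe_dump(data, default_flow_style=False)

resp = requests.put(
    "https://api.github.com/repos/your-org/your-repo/contents/_data/links.yaml",
    headers={"Authorization": "token YOUR_GITHUB_TOKEN"},
    json={
        "message": "Publish foraged data as YAML",
        "content": base64.b64encode(body.encode("utf-8")).decode("utf-8"),
        # If the file already exists, its current sha must also be included here
    },
)
resp.raise_for_status()
```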

Right now Github is my primary jam, but since I'm just publishing static JSON, XML, YAML, and other media types, it can be hosted anywhere. Since it's Github, it can be integrated anywhere, directly via the static API paths I generate, or by actually cloning the underlying Github repository. I have a Jekyll instance that I can run on an AWS EC2 instance or in a Docker container, and I will be experimenting more with Dropbox, Google Drive, and other free or low cost storage solutions that have a public component. I'm looking to have a whole toolbox of deployment options when it comes to publishing my data and content APIs across the landscape. Most of this is low use, long tail data and content I'm using across my storytelling, so I'm not worried about things being heavily manicured, just providing fruit in a sustained way for as low of a price as I possibly can.


API Deployment Comes In Many Shapes And Sizes

Deploying an API is an interesting concept that I've noticed folks struggle with a little bit when I bring it up. My research into API deployment was born back in 2011 and 2012, when my readers would ask me which API management provider would help them deploy an API. How you actually deploy an API varies pretty widely from company to company. Some rely on gateways to deploy an API from an existing backend system. Some hand-craft their own API using open source API frameworks like Tyk and deploy it alongside their existing web real estate. Others rely on software-as-a-service solutions like Restlet and Dreamfactory to connect to a data or content source and deploy an API in the clouds.

Many folks I talk with simply see this as developing their APIs. I prefer to break out development into define, design, deploy, and then ultimately manage. In my experience, a properly defined and designed API can be deployed into a variety of environments. The resulting OpenAPI or other definition can be used to generate the server side code necessary to deploy an API, or maybe be used in a gateway solution like AWS API Gateway. For me, API deployment isn't just about the deployment of code behind a single API, it includes the question of where we are deploying the code, acknowledging that there are many places available when it comes to deploying our API resources.

API deployment can be permanent, ephemeral, or maybe just a sandbox. API deployment can be in a Docker container, which by default comes with APIs for controlling the underlying compute for the API you are deploying. Most importantly, API deployment should be a planned, well-honed event. APIs should be able to be redeployed, load-balanced, and taken down without friction. It can be easy to settle on one way of deploying APIs, maybe using practices similar to those surrounding web deployments, or dependent on one gateway or cloud service. Ideally, API deployment comes in many shapes and sizes, and is something that can occur anywhere in the clouds, on-premise, or on-device, pushing our boundaries when it comes to which platforms we use, which programming languages we speak, and what API deployment means.


Open Sourcing Your API Like VersionEye

I’m always on the hunt for healthy patterns that I would like to see API providers, and API service providers consider when crafting their own strategies. It’s what I do as the API Evangelist. Find common patterns. Understand the good ones, and the bad ones. Tell stories about both, helping folks understand the possibilities, and what they should be thinking about as they plan their operations.

One very useful API, VersionEye, which notifies you about security vulnerabilities, license violations, and out-dated dependencies in your Git repositories, has a nice approach to delivering their API, as well as the other components of their stack. You can either use VersionEye in the cloud, or you can deploy it on-premise.

VersionEye also has their entire stack available as Docker images, ready for deployment anywhere you need them. I wanted to have a single post that I can reference when talking about possible open source, on-premise, continuous integration approaches to delivering API solutions that actually have a sensible business model. VersionEye spans the areas that I think API providers should consider investing in, delivering SaaS or on-premise, while also delivering open source solutions, and generating sensible amounts of revenue.

Many APIs I come across do not have an open source version of their API. They may have open source SDKs, and other tooling on Github, but rarely does an API provider offer up an open source copy of their API, as well as Docker images. VersionEye's approach to operating in the cloud, and on-premise, while leveraging open source and APIs, as well as dovetailing with existing continuous integration flows, is worth bookmarking. I am feeling like this is the future of API deployment and consumption, but don't get nervous, there is still plenty of money to be made via the cloud services.


Professional API Deployment Templates

I wrote about the GSA API prototype the other day. It is an API prototype developed by the GSA, providing an API that is designed in alignment with GSA API design guidelines, complete with an API portal for delivering documentation, and other essential resources any API deployment will need. The GSA provides us with an important approach to delivering open source blueprints that other federal agencies can fork, reverse engineer and deploy as their own custom API implementation.

We need more of this in the private sector. We need a whole buffet of APIs that do a variety of different things, in every language, and on every platform or stack that we can imagine. Need a contact directory API, a document storage API, or maybe a URL shortener API? Here is a forkable, downloadable, open source solution you can put to work immediately. We need the WordPress for APIs. Not the CMS interface WordPress is known for, just a simple API that is open source, and can be easily deployed by anyone in any common hosting or serverless environment. Making the deployment of common API patterns a no-brainer, and something anyone can do, anywhere in the cloud, on-premise, or on-device.

Even though these little bundles of API deployment joy would be open source, there would be a significant amount of opportunity in routing folks to other add-on and premium services on top of any open source API deployment, as well as providing cloud deployment opportunities for developers–similar to the separation between WordPress.com and WordPress.org. I could see a variety of service providers emerge that could cater to the API deployment needs of various industries. Some could focus on more data specific solutions, while others could focus on content, or even more algorithmic, machine learning-focused API deployment solutions.

If linked data, hypermedia, gRPC, and GraphQL practitioners want to see more adoption of their chosen protocol, they should be publishing, evolving, and maintaining robust, reverse engineer-able, forkable, open source examples of the APIs they think companies, organizations, institutions, and government agencies should be deploying. People emulate what they know, and see out there. From what I can tell, many API developers are pretty lazy, and aren't always looking to craft the perfect API, but if they had a working example, tailored for exactly their use case (or close enough), I bet they would put it to work. I track on a lot of open source API frameworks, and I am going to see if I can find more examples like the GSA prototype out there, where providers aren't just delivering an API, they are delivering a complete package including API design, deployment, management, and a functional portal to act as the front door for API operations.

Delivering professional API deployment templates could prove to be a pretty interesting way to develop new customers who would also be looking for other API services that target other stops along the API lifecycle like API monitoring, testing, and continuous integration and deployment. I’ll keep my eye out for open source API deployment templates that fit the professional definition I’m talking about. I will also give a little nudge to some API service providers, consulting groups, and agencies I know who are targeting the API space, and see if I can’t stimulate the creation of some professional API deployment templates for some interesting use cases.


Containerized Microservices Monitoring Driving API Infrastructure Visualizations

While I track on what is going on with visualizations generated from data, I haven't seen much that is new and interesting when it comes to API driven visualizations, or specifically visualizations of API infrastructure. This week I came across an interesting example in a post from Netsil about mapping microservices so that you can monitor them. It is a pretty basic visualization of each database, API, and DNS element in your stack, but it does provide a solid example of visualizing not just the deployment of database and API resources, but also DNS and other protocols in your stack.

The Netsil microservices visualization is focused on monitoring, but I can see this type of visualization also being applied to design, deployment, management, logging, testing, and any other stop along the API lifecycle. I can see API lifecycle visualization tooling like this becoming more commonplace, and playing more of a role in making API infrastructure more observable. Visualizations are an important part of the storytelling around API operations that moves things beyond just IT and dev team monitoring, making operations more observable by all stakeholders.

I'm glad to see service providers moving the needle with helping visualize API infrastructure. I'd like to see more embeddable solutions deployed to Github emerge as part of API life cycle monitoring. I'd like to see what full life cycle solutions are possible when it comes to my partners, like deployment visualizations from the Tyk and Dreamfactory APIs, management visualizations with the 3Scale APIs, and monitoring and testing visualizations using Runscope. I'll play around with pulling data from these providers, and publishing it to Github as YAML, which I can then easily make available as JSON or CSV for use in some basic visualizations.

If you think about it, there really should be a wealth of open source dashboard visualizations that could be embedded on any public or private Github repository, for every API service provider out there. API providers should be able to easily map out their API infrastructure, using any of the API service providers they are already using to operate their APIs. Think of some of the embeddable API status pages we see out there already, and what Netsil is offering for mapping out infrastructure, but something for every stop along the API life cycle, helping deliver visualizations of API infrastructure no matter which stop you find yourself at.


Internet Connectivity As A Poster Child For How Markets Work Things Out

I have a number of friends who worship markets, and love to tell me that we should be allowing them to just work things out. They truly believe in the magical powers of markets, that they are great equalizers, and that they work out all the world's problems each day. ALL the folks who tell me this are dudes, with 90% being white dudes. From their privileged vantage point, markets are what bring balance and truth to everything–may the best man win. Survival of the fittest. May the best product win, and all of that delusion.

From my vantage point markets work things out for business leaders. Markets do not work things out for people. Markets don't care about people with disabilities. Markets don't see education and healthcare any differently than they see financial products and commodities–they just work to find the most profit they possibly can. Markets work so diligently and blindly towards this goal that they will even do this to their own detriment, while believers think this is just how things should be–the markets decided.

I see Internet connectivity as a great example of markets working things out. We've seen the consolidation of network connections into the hands of a few cable and telco giants. These market forces are looking to work things out and squeeze every bit of profit they can out of their networks, completely ignoring the opportunities that are available when networks operate at scale, and operate freely to protect everyone's benefit. Instead of paying attention to the bigger picture, these Internet gatekeepers are all about squeezing every nickel they can for every bit of bandwidth that is currently being transmitted over the network.

The markets that are working the Internet out do not care if the bits on the network are from a school, a hospital, or you playing an online game and watching videos–they just want to meter and throttle them. They may care just enough to understand where they can possibly charge more because it is a matter of life or death, or it is your child's education, so you are willing to pay more, but as far as actually equipping our world with quality Internet–they could care less. Cable providers and telco operators are in the profit making business, using the network that drives the Internet, even at the cost of the future–this is how short sighted markets are.

AT&T, Verizon, and Comcast do not care about the United States remaining competitive in a global environment. They care about profits. AT&T, Verizon, and Comcast do not care about folks in rural areas possessing quality broadband to remain competitive with metropolitan areas. They care about profits. In these games, markets may work things out between big companies, deciding who wins and loses, but markets do not work things out for people who live in rural areas, or depend on Internet for education and healthcare. Markets do not work things out for people, they work things out for businesses, and the handful of people who operate these businesses.

So, when you tell me that I should trust that markets will work things out, you are showing me that you do not care about people. Except for that handful of business owners whose club you are hoping to someday be in. Markets rarely ever work things out for average people, let alone people of color, people with disabilities, and beyond. When you tell me about the magic of markets, you are demonstrating to me that you don't see these layers of society. Which demonstrates your privilege, and your lack of empathy for the humans around you, while also demonstrating how truly sad your life must be, because it is lacking in meaningful interactions with a diverse slice of the life we are living on this amazing planet.


Challenges When Aggregating Data Published Across Many Years

My partner in crime is working on a large data aggregation project regarding ed-tech funding. She is publishing data to Google Sheets, and I'm helping her develop Jekyll templates she can fork and expand using Github when it comes to publishing and telling stories around this data across her network of sites. Like API Evangelist, Hack Education runs as a network of Github repositories, with a common template across them–we call the overlap between API Evangelist and Hack Education, Contrafabulists.

One of the smaller projects she is working on as part of her ed-tech funding research involves pulling the grants made by the Gates Foundation since the 1990s. This is similar to my story a couple of weeks ago about my friend David Kernohan, who was looking to pull data from multiple sources and aggregate it into a single, workable project. Audrey is looking to pull data from a single source, but because the data spans almost 20 years, it ends up being a lot like aggregating data from across multiple sources.

A couple of the challenges she is facing trying to gather the data and aggregate it as a common dataset are:

  • PDF - The enemy of any open data advocate is the PDF, and a portion of her research data is only available in PDF format, which translates into a good deal of manual work.
  • Search - Other portions of the data are available via the web, but obfuscated behind search forms, requiring many different searches to occur, with paginated results to navigate.
  • Scraping - The lack of APIs, CSV, XML, and other machine readable results raises the bar when it comes to aggregating and normalizing data across many years, making scraping a consideration, but because of PDFs, and obfuscated HTML pages behind a search, even scraping will come with significant costs.
  • Format - Even once you’ve aggregated data from across the many sources, there is a challenge with it being in different formats. Some years are broken down by topic, while others are geographically based. All of this requires a significant amount of overhead to normalize and bring into focus.
  • Manual - Ultimately Audrey has a lot of work ahead of her, manually pulling PDFs and performing searches, then copying and pasting data locally. Then she’ll have to roll up her sleeves to normalize all the data she has aggregated into a single, coherent vision of where the foundation has put its money.

Data research takes time, and is tedious, mind numbing work. I encounter many projects like hers where I have to make a decision between scraping or manually aggregating and normalizing data–each project will have its own pros and cons. I wish I could help, but it sounds like it will end up being a significant amount of manual labor to establish a coherent set of data in Google Sheets. Once she is done, though, she has all the tools in place to publish the data as YAML to Github, and get to work telling stories around it across her work using Jekyll and Liquid. I'm also helping her make sure she has a JSON representation of each of her data projects, allowing others to build on top of her hard work.

I wish all companies, organizations, institutions, and agencies would think about how they publish their data publicly. It’s easy to think that data stewards will have ill intentions when it comes to publishing data in a variety of formats like they do, but more likely it is just a change of stewardship when it comes to managing and publishing the data. Different folks will have different visions of what sharing data on the web needs to look like, and have different tools available to them, and without a clear strategy you’ll end up with a mosaic of published data over the years. Which is why I’m telling her story. I am hoping to possibly influence one or two data stewards, or would-be data stewards when it comes to the importance of pausing for a moment and thinking through your strategy for standardizing how you store and publish your data online.


Each Airtable Datastore Comes With A Complete API And Developer Portal

I see a lot of tools come across my desk each week, and I have to be honest, I don't always fully get what they are and what they do. There are many reasons why I overlook interesting applications, but the most common reason is that I'm too busy and do not have the time to fully play with a solution. One application I've been keeping an eye on as part of my work is Airtable, which, I have to be honest, I didn't really get at first–or really I just didn't notice because I was too busy.

Airtable is part spreadsheet, part database, operating as a simple, easy to use web application that you can publish an API from with the push of a button. You don't just get an API by default with each Airtable, you get a pretty robust developer portal for your API, complete with good looking API documentation. This allows you to go from an Airtable (spreadsheet / database) to an API and documentation–no coding necessary. Trust me. Try it out–anyone can create an Airtable and publish an API that any developer can visit and quickly understand what is going on.

As a developer, API deployment still feels like it can be a lot of work. Then, once I take off my programmer's hat, and put on my business user hat, I see that there are some very easy to use solutions like Airtable available to me. Knowing how to code is almost slowing me down when it comes to API deployment. Sure, the APIs that Airtable publishes aren't the perfectly designed, artisanally crafted APIs I make with my bare hands, but they work just as well as mine. Most importantly, they get business done. No coding necessary. Something that anyone can do without the burden of programming.
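
For a sense of what developers get on the other side of that button, here is a minimal sketch of reading records from an Airtable-generated API–the base ID, table name, and API key are hypothetical placeholders.

```python
# A minimal sketch of reading records from an Airtable-generated API.
# The base ID, table name, and API key are hypothetical placeholders.
import requests

BASE_ID = "appXXXXXXXXXXXXXX"
TABLE = "Companies"
API_KEY = "keyXXXXXXXXXXXXXX"

resp = requests.get(
    f"https://api.airtable.com/v0/{BASE_ID}/{TABLE}",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={"maxRecords": 10},
)
resp.raise_for_status()
for record in resp.json().get("records", []):
    print(record["id"], record.get("fields", {}))
```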

Airtable provides another solution I can recommend that my readers and clients consider using when managing their data, one that will also allow them to easily deploy an API for developers to build applications against. I also notice that Airtable has a whole API integration part of their platform, which allows you to integrate your Airtables with other APIs–something I will have to write about separately in a future post. I just wanted to make sure and take the time to properly add Airtable to my research, and write a story about them so that they are in my brain, available for recall when people are asking me for easy to use solutions that will help them deploy an API.


When You Publish A Google Sheet To The Web It Also Becomes An API

When you take any Google Sheet and choose to publish it to the web, you immediately get an API. Well, you get the HTML representation of the spreadsheet (shared with the web), and if you know the right way to ask, you also can get the JSON representation of the spreadsheet–which gives you an interface you can program against in any application.

The articles I curate, and the companies, institutions, organizations, government agencies, and everything else I track on, live in Google Sheets that are published to the web in this way. When you are viewing any Google Sheet in your browser you are viewing it using a URL like:

https://docs.google.com/spreadsheets/d/[sheet_id]/edit

Of course, [sheet_id] is replaced with the actual id for your sheet, but the URL demonstrates what you will see. Once you publish your Google Sheet to the web you are given a slight variation on that URL:

https://docs.google.com/spreadsheets/d/[sheet_id]/pubhtml

This is the URL you will share with the public, allowing them to view the data you have in your spreadsheet in their browsers. In order to get at a JSON representation of the data you just need to learn the right way to craft the URL using the same sheet id:

https://spreadsheets.google.com/feeds/list/[sheet_id]/default/public/values?alt=json

Ok, one thing I have to come clean on is that the JSON available for each Google Sheet is not the most intuitive JSON you will come across, but once you learn what is going on you can easily consume the data within a spreadsheet using any programming language. Personally, I use a JavaScript library called tabletop.js that quickly helps you make sense of a spreadsheet and get to work using the data in any (JavaScript) application.
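
If JavaScript isn't your thing, a rough sketch of consuming that JSON feed looks like this–the sheet ID is a placeholder, and the gsx$-prefixed field names depend on the column headers in your own sheet.

```python
# A rough sketch of pulling the public JSON feed for a published Google Sheet.
# The sheet ID is a placeholder; field names depend on your column headers.
import requests

SHEET_ID = "your-sheet-id-here"
url = f"https://spreadsheets.google.com/feeds/list/{SHEET_ID}/default/public/values?alt=json"

feed = requests.get(url).json()["feed"]
for entry in feed.get("entry", []):
    # Each column shows up as a gsx$<lowercased header> object with a $t value
    row = {k.replace("gsx$", ""): v["$t"] for k, v in entry.items() if k.startswith("gsx$")}
    print(row)
```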

The fastest, lowest cost way to deploy an API is to put some data in a Google Sheet, and hit publish to the web. Ok, it's not a full blown API, it's just JSON available at a public URL, but it does provide an interface you can program against when developing an application. I take all the data I have in spreadsheets and publish it to Github as YAML, and then make static APIs available using that YAML in XML, CSV, JSON, Atom, or any other format that I need. This takes the load off Google, creating a cached version at any point in time that runs on Github, in a versioned repository that anyone can fork or integrate into any workflow.


Being First With Any Technology Trend Is Hard

I first wrote about Iron.io back in 2012. They are an API-first company, and they were the first serverless platform. I've known the team since they first reached out back in 2011, and I consider them one of my poster children for why there is more to all of this than just the technology. Iron.io gets the technology side of API deployment, and they saw the need for enabling developers to go serverless, running small scalable scripts in the cloud, and offloading the backend worries to someone who knows what they are doing.

Iron.io is what I’d consider to be a pretty balanced startup, slowly growing, and taking only the sensible amounts of funding they needed to grow their business. The primary area where I would say Iron.io has fallen short is storytelling about what they are up to, and generally playing the role of a shiny startup everyone should pay attention to. They are great storytellers, but unfortunately the frequency and amplification of their stories has fallen short, allowing other strong players to fill the void–opening the door for Amazon to take the lion’s share of the conversation when it comes to serverless. It demonstrates that you can rock the technology side of things, but if you don’t also rock the storytelling and more theatrical side of things, there is a good chance you will come in second.

Storytelling is key to all of this. I always love the folks who push back on me, saying that nobody cares about these stories, and that the markets only care about strong, successful companies–when in reality, IT IS ALL ABOUT STORYTELLING! Amazon’s platform machine is good at storytelling. Not just their serverless group, but the entire platform. They blog, tweet, publish press releases, whisper in reporters’ ears, buy entire newspapers, publish science fiction patents, conduct road shows, and put on flagship conferences. Each AWS platform team can tap into this, participate, and benefit from the momentum, helping them dominate the conversation around their particular technical niche.

Being first with any technology trend will always be hard, but it will be even harder if you do not consistently tell stories about what you are doing, and what those who are using your platform are doing with it. Iron.io has been rocking it for five years now, and continues to define what serverless is all about–they just need to turn up the volume a little bit, and keep doing what they are doing. I’ll own a portion of this story, as I probably didn’t do my share to tell more stories about what they are up to, which would have helped amplify their work over the years–something I’m working to correct with a little storytelling here on API Evangelist.


Bringing The API Deployment Landscape Into Focus

I am finally getting the time to invest more into the rest of my API industry guides, which involves deep dives into core areas of my research like API definitions, design, and now deployment. The outline for my API deployment research has begun to come into focus and looks like it will rival my API management research in size.

With this release, I am looking to help onboard some of my less technical readers with API deployment–not the technical details, but the big picture–so I wanted to start with some simple questions to help prime the discussion around API deployment.

  • Where? - Where are APIs being deployed? On-premise and in the cloud, using traditional website hosting, as well as containerized and serverless API deployment.
  • How? - What technologies are being used to deploy APIs? From spreadsheets, document and file stores, or the central database, to thinking smaller with microservices, containers, and serverless.
  • Who? - Who will be doing the deployment? Of course, IT and developer groups will be leading the charge, but increasingly business users are leveraging new solutions to play a significant role in how APIs are deployed.

The Role Of API Definitions

While not every deployment will be auto-generated using an API definition like OpenAPI, API definitions are increasingly playing a lead role as the contract that doesn’t just deploy an API, but sets the stage for API documentation, testing, monitoring, and a number of other stops along the API lifecycle. I want to make sure to point out in my API deployment research that API definitions aren’t just overlapping with deploying APIs, they are essential to connect API deployments with the rest of the API lifecycle.
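To make the contract idea a little more concrete, here is a minimal sketch of letting an OpenAPI definition drive a deployment, assuming the AWS SDK for JavaScript (v2), credentials already configured in the environment, and a placeholder openapi.json file on disk–one way among many to put the definition to work.

    // Minimal sketch: import an OpenAPI definition into AWS API Gateway,
    // letting the definition act as the contract that deploys the API.
    // Assumes the AWS SDK for JavaScript (v2) and credentials configured
    // in the environment; openapi.json is a placeholder file name.
    const fs = require('fs');
    const AWS = require('aws-sdk');

    const apigateway = new AWS.APIGateway({ region: 'us-east-1' });

    const params = {
      body: fs.readFileSync('openapi.json'), // the OpenAPI definition as the contract
      failOnWarnings: true
    };

    apigateway.importRestApi(params, (err, data) => {
      if (err) {
        console.error('Import failed:', err);
      } else {
        console.log('Created API with id:', data.id);
      }
    });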

Using Open Source Frameworks

Early on in this research guide I am focusing on the most common way for developers to deploy an API: using an open source API framework. This is how I deploy my APIs, and there are an increasing number of open source API frameworks available out there, in a variety of programming languages. In this round I am taking the time to highlight at least six separate frameworks in the top programming languages where I am seeing sustained deployment of APIs using a framework. I don’t take a stance on any single API framework, but I do keep an eye on which ones are still active, and enjoying usage by developers.
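To ground what deploying with an open source framework looks like, here is a minimal sketch using Express, one commonly used Node.js framework–the /products resource and its data are placeholders, and the same pattern applies in whatever framework and language you prefer.

    // Minimal sketch: a tiny API deployed with Express, one example of an
    // open source API framework. The /products resource and data are
    // placeholders for whatever resources you are exposing.
    const express = require('express');
    const app = express();

    const products = [
      { id: 1, name: 'API Deployment Guide' },
      { id: 2, name: 'API Design Guide' }
    ];

    // Return the full list of products as JSON.
    app.get('/products', (req, res) => {
      res.json(products);
    });

    // Return a single product by id, or a 404 if it does not exist.
    app.get('/products/:id', (req, res) => {
      const product = products.find((p) => p.id === Number(req.params.id));
      if (!product) {
        return res.status(404).json({ error: 'Not found' });
      }
      res.json(product);
    });

    app.listen(3000, () => console.log('API listening on port 3000'));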

Deployment In The Cloud

After frameworks, I am making sure to highlight some of the leading approaches to deploying APIs in the cloud, going beyond just a server and framework, and leveraging the next generation of API deployment service providers. I want to make sure that both developers and business users know that there are a growing number of service providers who are willing to assist with deployment, and with some of them, no coding is even necessary. While I still like hand-rolling my APIs using my preferred framework, when it comes to simpler, more utility APIs, I prefer offloading the heavy lifting to a cloud service, saving me the time of getting my hands dirty.

Essential Ingredients for Deployment

Whether in the cloud, on-premise, or even on devices and out on the network, there are some essential ingredients to deploying APIs. In my API deployment guide I wanted to make sure and spend some time focusing on the essential ingredients every API provider will have to think about.

  • Compute - The base ingredient for any API, providing the compute under the hood. Whether it’s bare metal, cloud instances, or serverless, you will need a consistent compute strategy to deploy APIs at any scale.
  • Storage - Next, I want to make sure my readers are thinking about a comprehensive storage strategy that spans all API operations, and hopefully multiple locations and providers.
  • DNS - Then I spend some time focusing on the frontline of API deployment–DNS. In today’s online environment DNS is more than just addressing for APIs, it is also security.
  • Encryption - I also make sure encryption is baked into all API deployment by default, in both transit and storage–see the sketch below for what this can look like at the framework level.
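As a small, concrete example of baking encryption in transit into a deployment by default, here is a hedged sketch of an Express middleware that refuses plain HTTP and redirects everything to HTTPS–it assumes the API runs behind a proxy or load balancer that sets the x-forwarded-proto header, which is a common but not universal setup.

    // Minimal sketch: enforce encryption in transit by redirecting all plain
    // HTTP traffic to HTTPS. Assumes a proxy or load balancer in front of the
    // API that sets the x-forwarded-proto header.
    const express = require('express');
    const app = express();

    app.enable('trust proxy'); // trust x-forwarded-* headers from the proxy

    app.use((req, res, next) => {
      if (req.secure) {
        return next(); // already HTTPS, carry on
      }
      // Redirect anything arriving over plain HTTP to the HTTPS equivalent.
      res.redirect(301, `https://${req.headers.host}${req.originalUrl}`);
    });

    app.get('/status', (req, res) => res.json({ ok: true }));

    app.listen(3000);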

Some Of The Motivations Behind Deploying APIs

In previous API deployment guides I usually just listed the services, tools, and other resources I had been aggregating as part of my monitoring of the API space. Slowly I have begun to organize these into a variety of buckets that help speak to many of the motivations I encounter when it comes to deploying APIs. While not a perfect way to look at API deployment, it helps me think about the many reasons people are deploying APIs, craft a narrative, and provide a guide for others to follow that is potentially aligned with their own motivations.

  • Geographic - Thinking about the increasing pressure to deploy APIs in specific geographic regions, leveraging the expansion of the leading cloud providers.
  • Virtualization - Considering the fact that not all APIs are meant for production and there is a lot to be learned when it comes to mocking and virtualizing APIs.
  • Data - Looking at the simplest of Create, Read, Update, and Delete (CRUD) APIs, and how data is being made more accessible by deploying APIs.
  • Database - Also looking at how APIs are being deployed from relational, noSQL, and other data sources–providing the most common way for APIs to be deployed.
  • Spreadsheet - I wanted to make sure and not overlook the ability to deploy APIs directly from a spreadsheet, putting APIs within reach of business users.
  • Search - Looking at how document and content stores are being indexed and made searchable, browsable, and accessible using APIs.
  • Scraping - Another often overlooked way of deploying an API, from the scraped content of other sites–an approach that is alive and well.
  • Proxy - Evolving beyond early gateways, using a proxy is still a valid way to deploy an API from existing services.
  • Rogue - I also wanted to think more about some of the rogue API deployments I’ve seen out there, where passionate developers reverse engineer mobile apps to deploy a rogue API.
  • Microservices - Microservices has provided an interesting motivation for deploying APIs–one that potentially can provide small, very useful and focused API deployments.
  • Containers - One of the evolutions in compute that has helped drive the microservices conversation is the containerization of everything, something that complements the world of APIs very well.
  • Serverless - Augmenting the microservices and container conversation, serverless is motivating many to think differently about how APIs are being deployed.
  • Real Time - Thinking briefly about real time approaches to APIs, something I will be expanding on in future releases, and thinking more about HTTP/2 and evented approaches to API deployment.
  • Devices - Considering how APIs are being deployed on device, when it comes to Internet of Things and industrial deployments, as well as at the network level.
  • Marketplaces - Thinking about the role API marketplaces like Mashape (now RapidAPI) play in the decision to deploy APIs, and how other cloud providers like AWS, Google, and Azure will play in this discussion.
  • Webhooks - Thinking of API deployment as a two way street. Adding webhooks into the discussion and making sure we are thinking about how webhooks can alleviate the load on APIs, and push data and content to external locations.
  • Orchestration - Considering the impact of continuous integration and deployment on API deployment specifically, and looking at it through the lens of the API lifecycle.

I feel like API deployment is still all over the place. The mandate for API management was much better articulated by API service providers like Mashery, 3Scale, and Apigee, and nobody has taken the lead in the same way when it comes to API deployment. Service providers like DreamFactory and Restlet have kicked ass when it comes to not just API management, but making sure API deployment is also part of the puzzle. Newer API service providers like Tyk are also pushing the envelope, but I still don’t have the number of API deployment providers I’d like when it comes to referring my readers. It isn’t a coincidence that DreamFactory, Restlet, and Tyk are API Evangelist partners–it is because they have the services I want to be able to recommend to my readers.

This is the first time I have felt like my API deployment research has been in any sort of focus. I carved this layer of my research off of my API management research some years ago, but I really couldn’t articulate it very well beyond just open source frameworks, and the emerging cloud service providers. After I publish this edition of my API deployment guide I’m going to spend some time in the 17 areas of my research listed above. All these areas are heavily focused on API deployment, but I also think they are all worth looking at individually, so that I can better understand where they also intersect with other areas like management, testing, monitoring, security, and other stops along the API lifecycle.


The Growing Importance of Geographic Regions In API Operations

I have been revisiting my earlier work on an API rating system. One area that keeps coming up as I’m working is around the availability of APIs in a variety of regions, and the cloud platforms that are driving them. I have talked about regional availability of APIs for some time now, keeping an eye on how API providers are supporting multiple regions, as well as the expanding world of cloud computing that is powering these regional examples of providing and consuming APIs.

I have been watching Amazon rapidly expand their available regions, as well as Google and Microsoft racing to catch up. But I am starting to see API providers like Digital Ocean providing APIs for getting at geographic region information, and Amazon provides API methods for getting the available regions for Amazon EC2 compute–I will have to check if this is standard across all services. Twilio has regions for their API client, and Runscope has a region API for managing how you run API tests from a variety of regions. The role of geographic regions when it comes to providing APIs, as well as consuming APIs is increasingly part of the conversation when you visit the most mature API platforms, and something that keeps coming up on my radar.
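As one concrete example of what this looks like in practice, here is a minimal sketch that pulls the list of available regions for EC2 using the AWS SDK for JavaScript (v2)–it assumes credentials are already configured in the environment.

    // Minimal sketch: list the geographic regions available for Amazon EC2
    // using the AWS SDK for JavaScript (v2). Assumes credentials are already
    // configured in the environment.
    const AWS = require('aws-sdk');

    const ec2 = new AWS.EC2({ region: 'us-east-1' });

    ec2.describeRegions({}, (err, data) => {
      if (err) {
        return console.error('Could not fetch regions:', err);
      }
      data.Regions.forEach((region) => {
        console.log(region.RegionName, region.Endpoint);
      });
    });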

We are still far from the average company being able to easily deploy, deprecate, and migrate APIs seamlessly across cloud providers and geographic regions, but as APIs become smaller and more modular, and cloud providers add more regions, and APIs to support automation around these regions, we will begin to see more decisions being made at deploy and run time regarding where you want to deploy or consume your API resources. To be able to do this we are going to need a lot more data and common schema regarding what geographic regions are available for deployment, what services operate in which regions, and other key considerations about exactly where our resources should operate. This is why I’m revisiting this work, to see what I can do to get API service providers to share more data from either the API provider or consumer side of the equation.

I am considering adding an area of my research dedicated to API regions, aggregating examples of how geographic regions are playing a role in API operations. I’m thinking region availability will be playing just as significant a role as performance, plans, security, reliability, and other areas of the API lifecycle when it comes to deciding where you deploy or consume your APIs. It feels like another one of the aspects of API operations that will overlap with many stops along the API lifecycle–not just deployment. One of the areas of the API lifecycle I’m increasingly thinking about that will affect geographic API decisions is regulations, and how governments are dictating what is acceptable when it comes to the storage, transmission, and access of digital resources. It feels like early notions of what the World Wide Web has been for the last 25 years are about to be blown out of the water, with the influences of digital nationalism, regulation, or even the Internet moving off planet, and increasingly driven by satellite infrastructure.


Publishing Your API In The AWS Marketplace

I’ve been watching the conversation around how APIs are discovered since 2010, and I have been working to understand where things might be going beyond ProgrammableWeb, to the Mashape Marketplace, and even investing in my own API discovery format, APIs.json. It is a layer of the API space that feels very bipolar to me, with highs and lows, and a lot of meh in the middle. I do not claim to have “the solution” when it comes to API discovery and prefer just watching what is happening, and contributing where I can.

A number of interesting signals for API deployment, as well as API discovery, have been coming out of the AWS Marketplace lately. I find myself keeping a closer eye on the almost 350 API related solutions in the marketplace, and today I’m specifically taking notice of the Box API availability in the AWS Marketplace. I find this marketplace approach very interesting–not just API discovery via a marketplace, but API deployment as well. AWS isn’t just a marketplace of APIs, where you find what you need and integrate directly with that provider. It is where you find your API(s) and then spin up an instance within your AWS infrastructure that facilitates that API integration–a significant shift.

I’m interested in the coupling between API providers and AWS. AWS and Box have entered into a partnership, but their approach provides a possible blueprint for how this model of API integration and deployment can scale. How tightly coupled each API provider chooses to be–looser (a proxy calling the API) or tighter (deploying the API as an AMI)–will vary from implementation to implementation, but the model is there. The Box AWS Marketplace instance’s dependencies on the Box platform aren’t evident to me, but I’m sure they can easily be quantified–something I will encourage other API providers to articulate when publishing their API solutions to the AWS Marketplace.

AWS is moving towards earlier visions I’ve had of selling wholesale editions of an API, helping you manage the on-premise and private label API contracts for your platform, and helping you explore the economics of providing wholesale editions of your platform, either tightly or loosely coupled with AWS infrastructure–decompiling your API platform into small, deployable units of value that can be dropped into a customer’s existing AWS infrastructure, seamlessly integrating with existing AWS services.

I like where Box is going with their AWS partnership. I like how it is pushing forward the API conversation when it comes to using AWS infrastructure, and specifically the marketplace. I’ll keep an eye on where things are going. Box seems to be making all the right moves lately by going all in on the OpenAPI Spec, and decompiling their API platform, making it deployable and manageable from the cloud, but also much more modular and usable in a serverless way–providing us all with one possible blueprint for how we handle the technology and business of our API operations in the clouds.


API Providers Localizing Compute For Developers Using Serverless

Twilio launched Twilio Functions this last week, localizing serverless infrastructure for Twilio API consumers when it comes to powering key functionality that Twilio brings to the table. This seems like a logical move for mature API providers, keeping in tune with shifts in how developers are integrating with APIs, and deploying their applications in a DevOps, continuous integration world.
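For readers who haven’t seen one of these yet, here is a minimal sketch of what a function like this looks like–it follows the standard Node.js handler signature Twilio Functions use, and the reply text is just a placeholder.

    // Minimal sketch: a Twilio Function that replies to an inbound SMS.
    // Follows the standard handler signature for Twilio Functions; the
    // message text is a placeholder.
    exports.handler = function (context, event, callback) {
      const twiml = new Twilio.twiml.MessagingResponse();
      twiml.message('Thanks for your message, we will get back to you shortly.');
      // Return the TwiML response to Twilio, which delivers it to the sender.
      callback(null, twiml);
    };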

I could see other API providers following Twilio’s lead, jumping on the serverless bandwagon, and localizing compute within their API ecosystems. I can see this approach converging with other movements in the SDK space where service providers like APIMATIC are enabling the continuous deployment of SDKs, samples, and other scripts for API integration. Allowing developers to quickly deploy integration scripts, in the programming language of choice–all baked into their existing API platform developer arrangement.

It makes sense that some of the common approaches emerging across the API space–containerization, webhooks, serverless, evented and other real-time technologies–make their way to being baked into, or at least augmenting, existing API operations. I don’t think that every API provider should be following Twilio’s lead in every area, but they do provide a pretty interesting example to consider when we think about where the API space might be headed–I find the most mature API providers are just as important to keep an eye on as each wave of startups.

I’ll keep an eye on serverless being localized like this by other API providers. It seems like an opportunity for someone to develop a white label solution that helps API providers deliver scripting, events, webhooks, and other emerging ways to orchestrate and integrate with APIs, like Twilio is doing.


Craft Your API Design Guide So You Can Move To Other Areas of The Lifecycle

I am working on an API definition and design guide for my human services API work, helping establish a framework for approaching API design as part of the human services data and API specification, but also for implementers to follow in their own individual deployments. Every time I work on the subject of API design, I’m reminded of how far behind the API sector is when it comes to standardizing what it is we do.

Every month or so I see a new company publicly share their API design guide. When they do, my friend Arnaud always adds it to his API Stylebook, growing the wealth of information available in his work. I’m happy to see each API design guide release, but in reality, ALL API providers should have an API design guide, and they should also be open to publishing it publicly, showing their consumers they have their act together, and sharing with the wider API community the best practices in play.

The lack of companies sharing their API design practices and their API definitions is why we have such a deficiency when it comes to common API patterns in use. It is why we have so many variations of web APIs, as well as the underlying schema. We have an API industry because early practitioners like SalesForce, Amazon, eBay, Flickr, Delicious, Twitter, Youtube, and others were open with their API operations. People emulate what they see and know. Each wave of the API sector depends on the previous wave sharing what they do publicly–it is how this all works.

To demonstrate even further how deficient we are, I do not find companies sharing their guides for API deployment, management, testing, monitoring, clients, and other stops along the API lifecycle. I’m glad we are seeing an uptick in the number of API design guides, but we need this practice to spread to every other stop. We need successful providers to share how they deploy their APIs, and when any company hires a new developer, they should ALWAYS be given a standard guide for deploying, managing, and testing, as well as designing APIs.

It’s not rocket science, and honestly, it’s not even technical. It just means pausing for a moment, thinking about how we approach each stop in the API lifecycle, writing up an overview, publishing, and sharing it with API stakeholders, and even the wider API community. Every company doing APIs in 2017 should be crafting an API design guide so you can get to work on guides for the other areas of your lifecycle, thinking through and standardizing your approach, and making it known to every person involved–ideally, you are also being very public about all of this, and sharing your work with me and Arnaud, so we can get the word out about the good stuff you are up to! ;-)


My Google Sheet Driven Product API And Web Page

I am in the process of eliminating the MySQL backend behind much of my research, eliminating a business expense, as well as an unnecessary complexity in my architecture. There really is no reason for the data I use in my business to be in a database. Nothing I track on tends to go beyond 10K rows, with most of the tables actually being less than 100 rows–perfect for spreadsheets, and my new static approach to delivering APIs, and websites for my research.

The time had come to update some of the products on my website, and I thought my product page was a perfect candidate for this approach, providing me with the following elements:

  • Products Google Sheet - I have a simple spreadsheet with all of my products in it.
  • Jekyll YAML Data Store - I have a YAML data store in the _data folder for API Evangelist.
  • Google Sheet to YAML Sync - I have a JavaScript function that pulls the data from the Google Sheet, converts it to YAML, and writes to the _data folder in the Jekyll repository–see the sketch after this list for roughly what that looks like.
  • Products Web Page - I have a page that lists all the products in the YAML file as HTML using Liquid.
  • Products API - I have a JSON page that lists all the products in the YAML file as JSON using Liquid.
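My actual sync function isn’t published here, but a minimal sketch of the idea looks something like this–it assumes the same legacy public JSON feed shown earlier in this post, Node 18+ for the built-in fetch, and the js-yaml package for writing the rows into the Jekyll _data folder; the sheet id and file names are placeholders.

    // Minimal sketch: pull rows from a published Google Sheet, convert them
    // to YAML, and write them into a Jekyll _data folder. Assumes the legacy
    // public JSON feed format and the js-yaml package; names are placeholders.
    const fs = require('fs');
    const yaml = require('js-yaml');

    const sheetId = 'YOUR_SHEET_ID';
    const feedUrl = `https://spreadsheets.google.com/feeds/list/${sheetId}/default/public/values?alt=json`;

    async function syncProducts() {
      const response = await fetch(feedUrl); // Node 18+ has fetch built in
      const data = await response.json();

      // Flatten each gsx$ prefixed column into a plain key / value pair.
      const products = data.feed.entry.map((entry) => {
        const row = {};
        for (const key of Object.keys(entry)) {
          if (key.startsWith('gsx$')) {
            row[key.replace('gsx$', '')] = entry[key]['$t'];
          }
        }
        return row;
      });

      // Write the rows as YAML into the Jekyll _data folder.
      fs.writeFileSync('_data/products.yaml', yaml.dump(products));
    }

    syncProducts();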

This simple approach to publishing static APIs using Google Sheets and Github is working well for little data like this–I am all about the little data, while everyone else is excited about big data. ;-) I even have the beginning of some documentation and an updated APIs.json for my website.

Next, I’ll work through the rest of my projects, organizations, tools, and other data I track on as part of my API research. I’ll be publishing a complete snapshot of this data at API Evangelist, as well as subsets of it at each of the individual research projects. When I’m done I’ll have a nice static stack of APIs for all of my research, easily managed via Google Sheets, and YAML on Github.


If you think there is a link I should have listed here feel free to tweet it at me, or submit as a Github issue. Even though I do this full time, I'm still a one person show, and I miss quite a bit, and depend on my network to help me know what is going on.