API Best Practices Blog
API Product Management: Driving Success through the Value Chain (slides & video) »
Thanks to all who attended the API Product Management: Driving Success through the Value Chain Webinar.
Video and slides are here. Thanks to @jenmazzon, @sramji and a big thanks to our special guest host, Michael Hart @michaelhart. As always, we'd love more of your thoughts, insights, or questions on the api-craft forum.
API Adoption Tip: One simple, open method »
Whether you are creating an API for internal use, partners or the wide open world of application developers - here is a tip that could help with initial adoption of your API.
Create at least one simple, open API method.
What I mean by simple and open is an API satisfying these criteria:
- no signup required
- no authentication required
- uses HTTP GET
- returns JSON
A good example is the Twitter public timeline API method: no signup or authentication required.
Many API teams we work with are creating sophisticated, highly secure APIs with complex authentication and authorization requirements. The API's primary purpose is to access private, secure data.
Why would an API team working on such private stuff take the time to create at least one simple, open API method? Adoption.
If a developer can copy and paste a simple URL into his browser and see JSON coming back then he's already used your API. If he's already a user then getting him to sign up and figure out authentication will be easier for him to stomach.
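To make the four criteria concrete, here is a minimal sketch using only the Python standard library. The `/products.json` path, the `PRODUCTS` data, and the `serve` helper are illustrative assumptions, not part of any real API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical public data, e.g. something already listed on the company website.
PRODUCTS = [
    {"id": 1, "name": "Widget"},
    {"id": 2, "name": "Gadget"},
]


def handle_get(path):
    """Return (status, content_type, body) for an unauthenticated GET."""
    if path == "/products.json":
        return 200, "application/json", json.dumps({"products": PRODUCTS})
    return 404, "application/json", json.dumps({"error": "not found"})


class OpenApiHandler(BaseHTTPRequestHandler):
    """No signup, no authentication: every GET is answered with JSON."""

    def do_GET(self):
        status, content_type, body = handle_get(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Blocks forever; call it manually to try the endpoint."""
    HTTPServer(("localhost", port), OpenApiHandler).serve_forever()
```

After calling `serve()`, pasting `http://localhost:8000/products.json` into a browser should show the JSON, which is the whole point of the tip.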
How do you find an appropriate API method to create?
One place to start is your company's website. Maybe it has a listing of the products and services you sell? Or the people on the management team?
Find some information that is already publicly available and wrap an API around it. Your developers will thank you by using it, signing up and creating compelling apps.
For more on developer adoption - check out our new webinar video on Developer marketing and adoption How-Tos and best practices
Video and Slides: Is your API naked? API Platform and Ops Considerations »
Thanks to all that attended last week's API Best Practices Webinar #5 "Is your API Naked? API Platform and Operations Considerations" (and thanks to our presenters @gbrail and @landlessness). Video and slides are below.
Our next API webinar, "Your API Sucks! Why developers hang up and how to stop that" with @landlessness and @earth2marsh, is June 14th at 11am PST (sign up here!)
(And you can see all our API best practices webinars to date here)
Video and Slides: API Metrics - What to Measure »
Thanks to all that attended last week's API Best Practices Webinar #4, API Metrics - What to Measure (and thanks to our presenters @brianpagano and @landlessness). Video and slides are below.
Our next API webinar, "Is your API Naked? API Technology and Ops Considerations" with @landlessness and @gbrail, is June 14th at 11am PST (sign up here!)
Video: RESTful API Design (Pragmatic, not Dogmatic) »
Recently Brian Mulloy (@landlessness) and Marsh Gardiner (@earth2marsh) hosted a webinar on API design and Pragmatic REST. The video of the recording and the slides are below.
Our next API webinar - 10 Patterns in Successful API Programs - is this Thursday, May 19th at 11am PST, with Brian and Greg Brail @gbrail. Interested in the topic? Then you should sign up now!
Pragmatic PCI Compliance and APIs: Just Enough Process »
“If the minimum wasn’t good enough, it wouldn’t be the minimum.” - Keith W.
Wise words from one of my developers many years ago. When it comes to tackling PCI Compliance, it is advice well worth taking.
With leaks of sensitive customer information in the news, there’s an increased focus on compliance as more services shift to cloud computing and APIs.
If you are a merchant of any kind or deal with customer credit card information then you must be aware of PCI compliance regulations that are designed to protect consumer credit card information from exposure.
PCI compliance gets tricky as apps and services move to cloud services and APIs. If you’re heading down the path of PCI compliance or just trying to position yourself, your APIs and your internal systems better for the future, keeping it simple will help you be successful.
First, Document your Process
The PCI Data Security Standards (PCI DSS) establish the “what” but not the “how” of achieving compliance.
The how is up to you. But like most audit and process centric assessments, what is most important is being able to articulate what you do to support a particular DSS item - and then being able to show evidence to support that statement.
Identify all of the process standards that apply to you from the DSS and identify the proper owner of those processes as well. Put together a simple Process Description Template that everyone uses to document their individual processes and adopt a naming convention that calls out the DSS section. Centralize the storage of those documents and make sure everyone knows where they are.
Just focus on capturing the “how” of your processes in as lean a manner as possible. Your assessment team is not going to evaluate quality of the process or the documentation, only that it meets the requirements of the DSS.
With your processes documented "well enough" and easily mapped to the PCI DSS, you’ll discover gaps and strengths, and you’ll make your assessor’s life easier, which makes your life easier.
Next, we'll talk about the special challenges with PCI in cloud computing and APIs, and practices that you can apply to reduce your risk.
Morgan Hall is a project manager with Apigee. Previously he was Director of Business Architecture at TransUnion.
Universal Design Principles Applied to APIs (Part 2 of 4) »
Following Part 1, here are the slides and a video for Part 2 of our series on applying universal principles of design to APIs. In this part I cover 4 more universal design principles:
- Flexibility-Usability Tradeoff
- Hick’s Law
- 80/20 Rule
- Inverted Pyramid
And for more on API Design check out Teach a Dog to REST and REST for SQL Developers
Build & Run a Web API for FREE »
With the proliferation of free, cloud services it's possible to build and run interesting mobile and web projects from end-to-end for free--including an awesome web API.
Here are 10 steps to building and running a web API for free.
1. Prepare
Get up to speed on APIs - from their economic and business model rationale to design best practices and principles to developer community practices.
2. Design
Prioritize your resources (objects) in a Google spreadsheet and start to craft your URLs and HTTP methods.
3. Build
Build your API quickly using Ruby on Rails scaffolding. And here are some additional Rails API tips from @StefanSiebel.
Store your source code in GitHub.
4. Deploy
After you create your Rails-based API, deploy it to Heroku.
5. Proxy
Use Apigee's Free service for API analytics, debugging and rate limiting.
6. Document
Document your API using GitHub pages.
7. Engage
Engage developers by embedding an API Console into your docs using Apigee To-Go.
8. Launch
Grab a domain name for your project from a DNS provider like GoDaddy.
Set up CNAME records for:
- GitHub docs: dev.mydomain.com
- Apigee proxy: api.mydomain.com
9. Promote
Use Twitter, Blogger and Facebook to promote your API.
10. Support
Support your community with GetSatisfaction and manage tickets with GitHub's Issues feature.
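As a rough illustration of step 2 (Design), the spreadsheet of resources can be turned into a first draft of URLs and HTTP methods. This is only a sketch: the resource names and the five-route convention are assumptions, and a real design would prune or extend them.

```python
def crud_routes(resource):
    """Return (HTTP method, URL pattern, meaning) rows for one plural resource."""
    collection = f"/{resource}"
    item = f"/{resource}/{{id}}"
    return [
        ("GET", collection, "list"),
        ("POST", collection, "create"),
        ("GET", item, "read one"),
        ("PUT", item, "update one"),
        ("DELETE", item, "delete one"),
    ]


def route_table(resources):
    """Format the draft routes for every resource as aligned text lines."""
    lines = []
    for resource in resources:
        for method, url, meaning in crud_routes(resource):
            lines.append(f"{method:<7}{url:<20}{meaning}")
    return "\n".join(lines)


# Hypothetical prioritized resources from the design spreadsheet.
print(route_table(["products", "people"]))
```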
A couple of us (@earth2marsh & @landlessness) recently built an API with this set of products and had a lot of fun doing it. We'll talk more about that project in a future post.
How APIs encourage the separation of content and presentation »
From the dawn of the web, we've been entangling presentation with content. Which is, overall, a big suck. Why? Well, on one hand, it's fine for people to consume. But it's really rude to computers. And computers, after all, are doing all the work to keep the Internet running.
If computers can't easily parse data, they can't do anything meaningful with it. However, once computers can easily disentangle the layers of data, watch out: innovation happens like crazy.
In a series of blog posts I'm going to talk about all the beauty of separating content from presentation. And how that simple idea when amplified with proven, existing design patterns makes things really interesting. Let's start with Twitter as an example...
When you view tweets on twitter.com, you see shared links, hashtags, and @mentions, all of which are clickable. But if you query the Twitter API you get back plain text. Why?
Because Twitter understands the importance of separating the content layer from the presentation layer. Imagine seeing this on a phone:
Ran into <a href="http://twitter.com/shanley">@shanley</a> at <a href="http://twitter.com/#!/search?q=%23sxsw">#sxsw</a>!
… because the SMS app on that old phone didn't understand HTML! Furthermore, when every character counts, like in a 160-character SMS message, markup would end up dominating the content. You shouldn't assume that the consuming device will even know what to do with markup.
So instead, Twitter keeps statuses as simple as possible—text only. This is plain, vanilla content. No markup, just data (well, unless you count the "organic" markup conventions like hashtags and @mentions).
Everything else, all the rest of the CSS and HTML that make up the twitter.com experience—that's the presentation layer. It takes all that content and makes it look good.
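The split can be sketched in a few lines: the content stays plain text, and each client applies its own presentation. The function names below are hypothetical, and the link formats are simply copied from the example markup above, not taken from Twitter's own code.

```python
import html
import re

# One pass over the raw text, so escaping never interferes with matching.
TOKEN = re.compile(r"(?P<mention>@\w+)|(?P<hashtag>#\w+)")


def render_html(status_text):
    """Presentation layer for the web: turn @mentions and #hashtags into links."""
    parts = []
    last = 0
    for match in TOKEN.finditer(status_text):
        parts.append(html.escape(status_text[last:match.start()]))
        name = match.group(0)[1:]
        if match.group("mention"):
            parts.append(f'<a href="http://twitter.com/{name}">@{name}</a>')
        else:
            parts.append(
                f'<a href="http://twitter.com/#!/search?q=%23{name}">#{name}</a>'
            )
        last = match.end()
    parts.append(html.escape(status_text[last:]))
    return "".join(parts)


def render_sms(status_text, limit=160):
    """Presentation layer for SMS: plain text only, cut to the message limit."""
    return status_text[:limit]


content = "Ran into @shanley at #sxsw!"  # what the API returns: just content
print(render_html(content))
print(render_sms(content))
```

The same `content` string feeds both renderers, which is why the API can stay free of markup.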
APIs, at their most basic, are carriers of content. They are perfectly evolved for getting data from one place to another.
And this is super-important in today's world, where we consume content on multiple devices. One size no longer fits all. There is no single form-factor. Each device offers a different experience of the same content.
This is just one example why APIs are changing the way we develop, by demanding the best-practice approach of separating the content and presentation layers. Modern web applications use the same APIs that their official mobile apps do. And those are the same APIs that get exposed to 3rd party developers who innovate around the edges.
REST API design for SQL programmers »
@gbrail put together this short deck mapping SQL concepts to (pragmatic) RESTful API design. And for more on API design - see Teach a Dog to REST and Universal Design Principles Applied to APIs
REST API Design for SQL developers
Why XML won’t die: XML vs. JSON for your API »
Last week I wrote that if your API doesn't support JSON and JSONP - you're doing it wrong. I don't think that's terribly controversial.
But is JSON (and JSONP) perfect for everything you need to support with your API? Is XML dead?
JSON is especially good at representing programming-language objects. If you have a JavaScript or Java object, or even a C struct, the structure of the object and all its fields can be easily and quickly converted to JSON, sent over a network, and retrieved on the other end without too much difficulty and (usually) comes out the same on both ends.
But not everything in the world is a programming-language object. Sometimes to describe a complex real-world object we have to combine different descriptions and languages from different places, mash them up, and use them to describe even more complex things. The descriptions of these complex things need to be validated, they need to be commented on, they need to be shared and sometimes annotated with additional data that doesn't affect the original structure. When the world gets complicated and open-ended like that, what's needed is not a programming-language-format object, but an open-ended, extensible -- umm -- markup language. That's what we have today with XML.
For instance, the travel industry (through the Open Axis Group), the insurance industry (through ACORD) and the financial services industry (through FpML) have all spent many person-years developing standards that describe what they do in XML format. Each standard comes complete with a schema, which means that any client or server can validate a document to ensure it is correct enough before starting to parse it, and which makes it easier to edit the document using one of the many mature tools that are available.
Sure, parsing and understanding these documents is not simple, but they do not represent simple things. The ability to represent a complex travel itinerary, a life insurance policy, or an interest-rate swap in a standards-based format is a big deal and a triumph of XML technology.
Similarly, look at HTML. (Most HTML is not XML but both come from SGML and are very similar.) HTML works because it can combine both structured and unstructured content in various ways and mash up different standards into one document. In my opinion, XML will only be dead when the web has replaced HTML with JSON.
So for our APIs, let's embrace JSON -- it's small, simple, and easy to use. But when we have to collaborate on complex documents, pull information from different places, and define complex schemas to represent complex real-world concepts, let's also not forget about good old XML.
Not serving JSON AND JSONP? Then you’re doing it wrong! »
If you've used an API recently, you've probably seen that the popular APIs out there support JSON. JavaScript Object Notation is a standard defined a while back by Douglas Crockford from Yahoo. It uses a subset of the JavaScript syntax to simply and effectively describe an object.
In the last few years, JSON has taken its place alongside XML as the de facto way to describe API data. Today's leading APIs support JSON in addition to XML, and an increasing number support only JSON.
JSON is popular because it's simple. Programming-language objects map to and from JSON in a straightforward way that everyone can understand. You need a "JSON parser" to convert JSON into an object (unless you're working in JavaScript) but you don't need to know much about it other than how to make it go.
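As a small illustration of that mapping, here is a round trip in Python with the standard `json` module. The `Product` class and its fields are made up for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Product:
    """Hypothetical programming-language object to send over an API."""
    id: int
    name: str
    tags: list


original = Product(id=1, name="Widget", tags=["new", "sale"])

# Object -> JSON text (what goes over the network).
text = json.dumps(asdict(original))

# JSON text -> object again on the other end, using the "JSON parser".
restored = Product(**json.loads(text))

print(text)
print(restored == original)
```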
If you are thinking of building an API, JSON support is critical. Here's why:
JavaScript. JSON is JavaScript. That is, a "JSON object" is literally a small fragment of JavaScript that represents an object and its sub-objects. That means that creating a real JavaScript object based on some JSON text is simple and fast. Web programmers love JSON.
JSONP. This lets JavaScript running inside the browser invoke APIs that reside on a different host on the Internet. This doesn't sound like a big deal but it is actually huge because all browsers implement a "same-origin policy" that otherwise makes this impossible. JSONP is hard to implement, but libraries like jQuery make it easy for the client if the server already supports it. If you're not serving JSON AND JSONP you're doing it wrong!
Smaller. Smaller is better, especially in the mobile environment, and since JSON doesn't "say" every field name twice like XML does, JSON output is a lot smaller than XML.
Less complicated. JSON is free of namespaces, attributes, multiple "text" nodes, and other complexities of XML. The result is that JSON parsers exist for every language, they're small, and they're fast. Furthermore, if you need to write your own, it's not complicated. The same goes for security -- all that's necessary to prove that a JSON document is valid JSON is a simple regular expression check, which is easily available in nearly every programming environment.
Tools. An increasing number of tools support JSON. JSON support is not ubiquitous yet, but at the rate JSON is gaining it will be soon.
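On the server side, JSONP support can be as small as wrapping the JSON body in the function name the client asked for. The sketch below assumes the client passes that name as a `callback` query parameter (a common convention, but an assumption here); `render_response` and the allowed-name pattern are illustrative.

```python
import json
import re

# Only allow plain (optionally dotted) JavaScript identifiers as callbacks,
# so the parameter cannot be used to inject arbitrary script.
CALLBACK_NAME = re.compile(
    r"^[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*$"
)


def render_response(data, callback=None):
    """Return (content_type, body): plain JSON, or JSONP if a callback is given."""
    body = json.dumps(data)
    if callback is None:
        return "application/json", body
    if not CALLBACK_NAME.match(callback):
        raise ValueError("invalid JSONP callback name")
    return "application/javascript", f"{callback}({body});"


print(render_response({"ok": True}))
print(render_response({"ok": True}, callback="handleData"))
```

The browser loads the second form through a script tag, which is not subject to the same-origin restriction described above.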
Many APIs now support XML and JSON - like the Twitter API, where JSON is the default. Some APIs support only JSON (like Foursquare's V2 API).
But JSON isn't for everything. Next up: Why XML isn't dead yet!
Why basic auth and social security numbers both suck (and how OAuth helps for APIs) »
Eggs…
Around the time of the industrial revolution, there was a problem—too many ships were sinking. So in 1891, the Bulkhead Committee proposed a "Grade I subdivision" that, among other things, required passenger liners longer than 425 feet to be able to float even if two compartments were compromised. Accidents will always happen, but the committee wanted to minimize any loss of life when they did.
It was a simple idea: that a single hull breach should not sink a vessel.
Jump ahead 118 years to the month in which my son started kindergarten. It's going pretty well for him. But yesterday, a notice came home asking us to supply his social security number to identify him in a longitudinal data system in order to comply with federal funding requirements. While I appreciate the future value such data will have, I am not at all comfortable giving away that secret number.
If you are anything like me, you think twice before giving out your social security number. I've had mine compromised before (I still have the Secret Service letter describing how it was discovered by customs in a briefcase along with other people who had worked for the same employer, but that's another story). This totally sucked, and I now hesitate even more than I did before when presented with a rental application / insurance form / school form / etc.
- Every time I give out my social security number, I increase the chances of it being compromised.
- Every time I build ships without compartmentalization, I compromise the safety of their passengers.
- Every time I give out my username and password for a web service, I increase the chances of it being compromised.
… And Baskets
By now you can probably see where I'm going… because at Apigee we're obsessed with APIs (think Glenn Close in Fatal Attraction but without all the psychosis). Smart people are going to build web services. To grow those services, they're going to build APIs. And those APIs need to have some way to authenticate users.
Since the username/password is still the primary way in which people identify themselves to a web service, you might think that securing your API with usernames/passwords makes a lot of sense. But you are only as strong as your weakest link… and once that password is compromised, the whole ship sinks.
There are perfectly good ways to deal with this, of course. API tokens are one example: usually a custom token passed as a query or header parameter. These make most sense when sent over SSL to keep them safe. Some providers, like 37signals's API for Highrise, expect the token to be sent using HTTP basic authentication in lieu of the username or password.
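A sketch of the two token styles follows. The header name `X-Api-Token`, the token value, and the placeholder password are assumptions for illustration; check a provider's documentation for the exact convention it expects.

```python
import base64


def token_header(token, header_name="X-Api-Token"):
    """Pass the token as a custom request header (header name is hypothetical)."""
    return {header_name: token}


def token_as_basic_auth(token, placeholder_password="X"):
    """Send the token in the username slot of HTTP basic authentication.

    The password slot carries a throwaway value, so the user's real
    password never leaves their hands.
    """
    raw = f"{token}:{placeholder_password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


print(token_header("abc123"))
print(token_as_basic_auth("abc123"))
```

Either way, the token can be revoked without changing the account password, which is the compartment wall the ship analogy calls for.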
The OAuthpocalypse & The Unsinkable
Of course there are other ways. OAuth 1, popularized by Twitter as much as anyone, is a great example. At its best, it works like valet keys that users can give to applications to access their accounts on their behalf while retaining the option to revoke those permissions app by app. An additional benefit is that API analytics can be organized by application.
But OAuth has its own problems. It's harder to understand conceptually. It's harder to implement. Perhaps worst of all, it can disenfranchise users who must use proxies to get around the censorship walls put in place by governments like China and Iran.
OAuth 2 is different in some important ways. Facebook is already using a version of it with their Graph API, despite the spec still being in draft. Twitter appears to be using an OAuth 2 draft for @Anywhere. The best benefit is that it's simpler and easier for developers to start using, since, among other things, it does away with the signing of base strings.
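For a sense of what "signing of base strings" involves in OAuth 1, here is a simplified sketch of the HMAC-SHA1 method. It assumes the base URL is already normalized and that no parameter name repeats; the URL, parameters, and secrets are made up, and a production client should rely on a tested OAuth library instead.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode everything except unreserved characters."""
    return quote(str(value), safe="-._~")


def signature_base_string(method, base_url, params):
    """Join method, URL, and sorted, encoded parameters into the base string."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), pct(base_url), pct(normalized)])


def sign_hmac_sha1(base_string, consumer_secret, token_secret=""):
    """Sign the base string with the two secrets joined by '&'."""
    key = f"{pct(consumer_secret)}&{pct(token_secret)}".encode("ascii")
    digest = hmac.new(key, base_string.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical request and secrets.
base = signature_base_string("get", "http://example.com/r", {"b": "2", "a": "1 x"})
print(base)
print(sign_hmac_sha1(base, "consumer-secret", "token-secret"))
```

Every request has to be encoded, sorted, and signed this way, and any small mismatch makes the server reject it, which is much of why developers find OAuth 1 harder to implement.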
So OAuth does away with the password anti-pattern. But transitioning away from an authentication system like basic authentication to one like OAuth isn't ever easy. Change is hard—just look how much complaining follows every Facebook redesign. And things will break… and with the OAuthpocalypse, they did.
The Titanic was famously built to exceed the Bulkhead Committee's Grade I requirements. But that alone wasn't enough to save it. The hubris of excessive speed for the conditions meant the ship couldn't turn in time, and following the subsequent impact, five compartments flooded. It was an unprecedented disaster.
Twitter, on the other hand, knew up front that things were going to break. And when it became obvious that the deadline was coming too fast, they slowed down, delaying the event for two months. It was certainly painful for developers—though we still don't know the full extent to which this broke applications (though apparently not their own, thanks to a basic auth back door).
Overall, Twitter did an excellent job of handling the migration. They gave developers lots of time, and then they gave them more. As the date approached, they shut off basic auth for short periods—an iceberg ahead alarm bell, if you will. They slowly lowered the rate limits over two weeks. And most importantly, they communicated and coordinated well—Developer Advocates Taylor Singletary and Matt Harris, and other members of the Twitter API team worked tirelessly to provide tools and support devs as they struggled with the migration.
So far, things are looking pretty good for their great experiment. But only history will tell the full extent of the damage.
Darwin’s Finches, 20th Century Business, and APIs: Evolve Your Business Model »
What do APIs have in common with Darwin's observations on evolution, the 20th century garment district, and the Kobayashi Maru?
Sam Ramji makes the case for APIs in his much written about web 2.0 talk - watch and listen to the full talk or just flip through the slides - both below.
Don’t try to market to developers. Instead, solve their problems. »
Not long ago you could count the number of 'developer marketing' programs on one hand. Now there are hundreds of programs as Web companies and enterprises open APIs. These companies know that developer adoption will make their API strategy succeed or fail.
But Developer Marketing is an oxymoron. Developers hate marketing.
You cannot drive adoption by 'marketing to developers.' Sure, you can send offers to your developers but your mileage may vary.
A better formula - understand what's important to developers and give them what they need to reach these goals. Developers want to:
- build new skills that lead to the best projects and jobs. This is why new or proprietary tools and programming models are tough to get off the ground - it's a small market of new projects for the developer.
- increase their productivity. With good tools and by connecting developers with decent resources and each other for help. This is why sites like StackOverflow take off.
- be recognized for good work and see their products used. Focus on showcasing their work, not your product. It's not about you.
- get paid. Think App Store model, or affiliate marketing networks.
Talk to the folks that made the big developer networks successful and you'll hear these points over and over. Some others:
- Developers are not buyers, but are very strong influencers. There are superstars in the developer world - make them fans and that is the best marketing you'll ever get.
- You can't 'own' or 'use' developers because they have an account on your service. Developers have lots of options and switching costs might be low from your API.
- Act on their feedback. Developers are smart and listening and acting on their complaints and ideas is critical to your credibility.
- Developer communities are fragmented. For example, there is no such thing as an 'API developer', but instead there are Twitter or Facebook or Salesforce developers.
Once you have attracted a developer to use your service - they are like gold. So treat them with respect - don't try to 'use' developers or you might lose them!
Building Apigee for Multiple Clouds: 10 Cloud Portability Lessons Learned »
This is a repost of a piece I prepared for Shlomo Swidler's panel "Writing Code for Many Clouds" at CloudConnect 2010 and also posted on my own blog earlier this year. It's a long post, so later I created this short screenr of a cliffs notes version below.
At Sonoa, we have an enterprise product which we turned into a service called Apigee. From the first, we needed to move beyond just being packaged as a VM and “deployable anywhere” to really living in the cloud.
This is what we’ve learned so far – some of which we anticipated and some of which we reacted to. Build, deploy, and manage – of the three basic parts of running a service only deploy and manage really change. The big difference is in operationalization of the system.
Most recently we realized that we needed to be HA across providers and get total control of our latency, so we are building a new datacenter on Rackspace as well. This is a work in progress so I’ll be reporting from the front lines.
Finally, we’ve helped implement a multi-cloud architecture for ING which has taught us something about where multi-cloud services may be headed.
The first cloud: EC2
1. Build:
The first and biggest step for any system will be building it as a VM if you haven’t done so before. Once you have done this, you can practically drop it onto any box. You’ve become independent of the hardware and other aspects of the operating environment.
Beyond build, you have to focus on: setting up the network topology, configuring the virtual boxes once they’re up, and managing the result.
2. Deploy:
The next phase is figuring out how you bring up instances in your cloud platform. EC2 has its own interfaces for this, and Rackspace has different ones. RightScale normalizes these interfaces and provides a UI. There’s an open source package with no UI that we evaluated but aren’t using called libcloud.
Now that you’re hardware independent, you can run as many instances of your service’s components as you can afford. The main solutions here are Chef and Puppet, both open source. We use Capistrano for scripting automation.
Then you need to configure the topology of the different subsystems you’ve built. Here things get interesting. EC2 does not support multicasting across your default virtual network; this was tough for us and would be for anyone relying on clustering. VPN-Cubed from CohesiveFT let us build a private network within our EC2 environment and let us do the multicasting we needed.
Once your network is up and you can push software, it’s just the same as having your own private datacenter. You can connect from anywhere, manage instances, and get alerts and reports.
3. Manage:
That brings us to management. We use Nagios for monitoring our virtual boxes. We learned that we needed to have a separate machine outside of EC2 as a “monitor monitor” – a Nagios instance that monitored the health and responsiveness of the Nagios box in the cloud environment. We use RightScale for managing all of our accounts and instance creation. With this setup we’ve had zero downtime since our launch in late August of last year.
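The "monitor monitor" idea is independent of Nagios itself. A minimal sketch, in which `probe` is a hypothetical callable that asks the in-cloud monitor for its status and the time budget is an assumed value, could look like this:

```python
import time


def check_monitor(probe, max_seconds=5.0):
    """Return True if the monitored monitor answers correctly and in time.

    `probe` is any zero-argument callable supplied by the caller; it should
    return a truthy value when the in-cloud monitor reports itself healthy.
    """
    start = time.monotonic()
    try:
        healthy = bool(probe())
    except Exception:
        # An unreachable or crashing monitor counts as unhealthy.
        healthy = False
    elapsed = time.monotonic() - start
    return healthy and elapsed <= max_seconds


def broken_probe():
    raise ConnectionError("monitor unreachable")


print(check_monitor(lambda: True))
print(check_monitor(broken_probe))
```

The key design point from the text is where this runs: on a machine outside the cloud being watched, so a provider outage cannot silence both monitors at once.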
We realized at the outset that we wanted to build a service that would be portable, so we chose not to use the least portable features of AWS, such as S3. While it would have made our life simpler for some of the assets we were managing, there was no corollary (and Walrus, the Eucalyptus storage subsystem that mimics S3, does not count as a corollary, even though it really works). We did use EBS (Elastic Block Storage) which is so close to a SAN that we felt it was reasonably standard; and forcing our hand was the fact that we needed to solve for persistence and performance.
But the evils of cloud computing were present as well as the good. EC2 does not guarantee the availability of an instance, but the availability of a zone. As a result we found that the latency of our service had a high degree of jitter (between 5 and 15ms), which was acceptable but not ideal. The lack of control in this environment means that we’ve been buying instances ahead of our need in order to guarantee not just availability but performance. This is one of the headaches that cloud computing is supposed to transcend.
In a nutshell – “it’s elastic but you have to manage it.”
So in order to manage the network performance issues (achieve constant performance AND availability) we realized that we needed to go multi-cloud. We also realized that our core service principle – we’re a cloud service gateway and active proxy for people's API traffic – meant that we had to have a “strongest link” architecture so that no set of failures at a single cloud provider could take down our service.
We’re now building on what we’d anticipated and developing a new instance of our service at Rackspace. The big changes here are the level of control we have out of the gate for network topology, process isolation, CPU performance… and price, which is higher.
The second cloud: Rackspace
1. Build
Architecturally the big differences are database replication and cross-provider load balancing. This places really specific requirements on your networking design and technology as well as your database design.
One of the things our service does is store all of our customers’ cloud API traffic for their later use in analytics. Thinking about data modularly helps with replication. In a replicated world we need to break out types of datasets – such as customer information and service configuration – into smaller chunks that can meet higher-speed replication requirements cost-effectively, and break them away from all the historical traffic data. Even the traffic data needs to be handled differently in this world.
We are now sharding the database into circular tables, where the incoming data is always written to a write-only area, and revolves to the next area every five minutes. In our user base a 5-minute delay on analytics is more than acceptable (compare this with the SLA for Google Analytics), and the working set of data used for traffic management is handled separately in realtime. All of this means that we can have either a hot standby or live-live dual-cloud configuration without breaking our customer promise that they can tweak their service at any time at all, and that their analytics are consistently available. This will also let us evolve both sides of the service as it grows.
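One way to picture the circular tables is as a ring of areas indexed by time. The sketch below is not our actual schema: the ring size of 12 and the function names are assumptions, and only the five-minute rotation comes from the description above.

```python
ROTATION_SECONDS = 5 * 60  # from the text: move to the next area every five minutes
NUM_AREAS = 12             # assumption: number of areas in the ring


def write_area(epoch_seconds, num_areas=NUM_AREAS, rotation=ROTATION_SECONDS):
    """Index of the area that incoming traffic data is written to right now."""
    return (int(epoch_seconds) // rotation) % num_areas


def readable_areas(epoch_seconds, num_areas=NUM_AREAS, rotation=ROTATION_SECONDS):
    """Areas that are no longer being written and are safe for analytics."""
    current = write_area(epoch_seconds, num_areas, rotation)
    return [index for index in range(num_areas) if index != current]


print(write_area(0), write_area(299), write_area(300))
print(readable_areas(300))
```

Because analytics only reads areas that have stopped receiving writes, results lag by at most one rotation, which is the five-minute delay mentioned above.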
2. Deploy
Deployment tooling stays the same – our old friends RightScale and Capistrano are used to spin and configure instances.
On the networking side, obviously you need to connect your clouds securely in order to replicate between them as well as to exchange performance data which can be used for load balancing. We found again that VPN-Cubed helps us establish a trusted connection between our heterogeneous cloud environments.
3. Manage
Since we are using standard monitoring and management tools – Nagios, RightScale, and Capistrano – these all work in both environments, and our approach of using a “monitor monitor” doesn’t change, although now we need to monitor monitors in each cloud.
Is there an easier way?
For an infrastructure play like Apigee, we don’t think so. Given our customer promise of near-zero and predictable latency we need as much control as possible. For an application-level service play though, we think some parts can be easier. We’re built on Sonoa technology that manages all of our cloud API traffic processing, as is ING, a financial service company that’s moving to cloud. Their challenge is elasticity in financial modeling – specifically the Monte Carlo simulation workload which is compute-intensive and highly intermittent in use of resource. When you’re running the simulation, you need all the compute resource you can get. When you’re not running a simulation, you need almost none.
Cloud infrastructure like EC2 and Rackspace take care of the racking & stacking problem associated with scaling up for Monte Carlo. You still need to manage that with a tool like RightScale or libcloud plus your configuration and deployment tool of choice. But at the higher level where you’re load-balancing between clouds you don’t necessarily need a VPN, as there’s no data replication requirement. At this layer they’ve implemented a secure API which is called by internal clients, and then this API request is load-balanced by Sonoa’s API gateway. The gateway then calls the right cloud based on policies set by the monitoring and scheduling software. So in this situation you are monitoring your cloud instances and letting the API gateway handle the dirty work of dispatching and securing the calls.
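The dispatch decision itself can be sketched as a small policy function. Everything here is hypothetical: the `CloudStatus` fields, the capacity metric, and the "most free capacity wins" rule stand in for whatever the monitoring and scheduling software actually reports and decides.

```python
from dataclasses import dataclass


@dataclass
class CloudStatus:
    """Hypothetical snapshot reported by the monitoring software."""
    name: str
    healthy: bool
    free_capacity: int  # e.g. idle compute slots


def choose_cloud(statuses, needed):
    """Pick the healthy cloud with the most free capacity for this request."""
    candidates = [
        status for status in statuses
        if status.healthy and status.free_capacity >= needed
    ]
    if not candidates:
        raise RuntimeError("no cloud can take the request right now")
    return max(candidates, key=lambda status: status.free_capacity).name


statuses = [
    CloudStatus("ec2", healthy=True, free_capacity=40),
    CloudStatus("rackspace", healthy=True, free_capacity=75),
]
print(choose_cloud(statuses, needed=50))
```

Internal clients keep calling one secure API; only the gateway needs to know which cloud was chosen.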
10 Lessons Learned From Building to Multiple Clouds:
1. Get everyone comfortable with virtualization fundamentals, from developers to admins.
2. Limit your dependency on provider-specific APIs by using 3rd party tools that manage this for you.
3. There may be SLAs on your cloud instances but there are no SLAs on the APIs your cloud providers give you.
4. Refuse to use services that have no equivalent in other clouds. It will cost you more in rearchitecture than you gain by using them.
5. Understand the cost trade-offs for your business of the different clouds’ strengths – especially in the dimensions of availability, price, and performance.
6. Anticipate your needs for data replication and design your databases accordingly.
7. Pay attention to your networking requirements and network topology.
8. Consider the granularity of the requests that you need to load balance – is it at the service or API layer or is it finer-grained than that?
9. You’ll still buy more than you need but the waste ratio is much less in the cloud.
10. Monitor the monitors!
Right product at the right time: API Product Management »
Recently we were asked by a SaaS company exec "can't we just hire someone to come in here and build our API for us?"
Danger, Will Robinson. Just like any other product in your stable, your API needs to go through your product management practice. Successful APIs usually have a dedicated API product manager who creates the 'right product at the right time' by continually helping the team stay on target and driving the following:
1. What is the vision for the API? How do you go from an idea to a great product? Start by asking: what is your vision? If you were sitting around with your top 5 execs, would they agree? One good PM framework we've seen really focus an effort is explicitly defining the "VMSO" or "Vision, Mission, Strategy, Objectives" before every major release. For example:
- Vision - what is "the dream" (example: be the most widely used widget catalog on the planet)
- Mission - what do we do every day to achieve the dream? (example: have the easiest catalog API to learn and use)
- Strategy - what is our unique approach for achieving our mission? (example: have the smoothest sign up, clearest REST API and best community support)
- Objectives - what are our 1-3 key API metrics to determine if the strategy is working? (example: developer apps, API transactions, API revenue)
2. What is the target customer segment for the API? Mobile developers? Your top 10 partners? Affiliate marketers? Each segment may need different features, policies, or marketing approaches. Do a customer or developer segmentation analysis and force rank priority segments.
And if you ask your API team who their target segment is and the answer is 'everybody' - get worried.
3. Develop use cases. Ask how little, not how much, you can launch with your API. Taking back functionality is difficult once it's out there. Identify and prioritize the minimum set of use cases (or user scenarios - such as 'browse catalog information') and consider throwing out anything outside what's needed for each use case.
4. Iterate quickly. It's rare to find a successful API program where the PM doesn't say something like "and after we launched our customers took us in a completely different direction." Consider agile development techniques to help your team iterate quickly.
5. Differentiate your API. How is your API or content different than competing APIs in your vertical? Why should I drop what I'm doing now and use your API? Using a well-worn PM 'positioning framework' can help the team agree on this beforehand. For example:
| Positioning statement | Example |
|---|---|
| For the (target customer) | Mobile developers |
| Who needs (primary pain point or need) | A free and complete widget catalog for commerce apps |
| Our solution (our API is...) | The most comprehensive, open widget catalog that is incredibly easy to use |
| That (key benefit) | Provides comprehensive and accurate widget product data for 3rd party apps |
| Unlike (the competition) | 'For pay' catalog APIs, catalogs with low rate limits, inaccurate or incomplete catalogs, or APIs that are hard to use |
| Solution is (greatest differentiation) | Free and easy to get started with, with amazing community support |
What other PM processes do you recommend?
(and thanks to respres for the great photo.)
Moving the needle: Example API metrics »
It's an old cliche, but it's been said that you can't move the needle if you can't *see* the needle. So frequently we're asked "what are good metrics to measure an API program?"
While individual metrics are important - it might be as much about the 'process around metrics', or how metrics are evangelized and used to drive specific parts of the API product development pipeline. Specifically:
1. Get early buy-in on the 'top 3' - strong API product managers often focus in on 1-3 top level 'strategic' metrics and get early, wide agreement from all parts of the extended team - the sponsoring exec, PM, engineering, BD, and operations. If different stakeholders are measuring success with different metrics (say number of developer sign-ups vs. API traffic vs. revenue) this can pull resources in different directions.
2. Track against realistic projections. Set expectations early by modeling anticipated results and then track actuals against this estimate. For example, pick a 'comparable' or competitor's API to guess developer portal traffic, then model the expected developer sign-ups and conversions (for example, 10% of visitors might ask for a key, 20% of them might build an app, 10% of those apps might drive ongoing traffic, each of those apps might drive a certain volume of traffic, and so on).
3. Publish a weekly dashboard, religiously. Proactively call out how product updates and community activities do or don't move the needle so you can quickly adjust tactics and think of new ideas that might move the needle.
4. Create a metrics 'pipeline' - How do different metrics diagnose how each stage of the customer conversion process is working? For example, developer portal traffic might be a good metric to measure the marketing guys (that is, they might be responsible for getting developers *to* the portal). But whether or not a developer converts to ask for a key and then converts into an active API user might be a measure of how effectively the PM process is working to create a product that developers want to use. User-experienced bugs can measure development and product QA effectiveness, and so on.
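The projection in step 2 is simple enough to model in a few lines. The sketch below uses the example conversion rates from that step (10%, 20%, 10%); the 10,000 monthly portal visitors and the 50,000 calls per active app are made-up inputs, so substitute figures from your own 'comparable' API.

```python
def project_funnel(portal_visitors, key_rate=0.10, app_rate=0.20,
                   active_rate=0.10, calls_per_active_app=50_000):
    """Project each stage of the developer funnel from portal traffic."""
    keys = round(portal_visitors * key_rate)     # visitors who ask for a key
    apps = round(keys * app_rate)                # key holders who build an app
    active_apps = round(apps * active_rate)      # apps that drive ongoing traffic
    return {
        "visitors": portal_visitors,
        "keys_issued": keys,
        "apps_built": apps,
        "active_apps": active_apps,
        "api_calls": active_apps * calls_per_active_app,
    }


def variance(actual, projected):
    """Relative gap between an actual and its projection (-0.15 means 15% under)."""
    return (actual - projected) / projected


projection = project_funnel(10_000)   # assumed monthly portal visitors
print(projection)
print(variance(850, projection["keys_issued"]))  # e.g. 850 keys actually issued
```

Publishing the projected and actual numbers side by side each week makes it obvious which stage of the pipeline is under-performing.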
Here is an example of a metrics pipeline that we recently discussed with a customer.
| Category | Example Metric |
|---|---|
| Awareness (measure of marketing effectiveness) | Developer portal traffic: unique users, page views, and engagement (PVs/UU) |
| Signups (measure of portal messaging effectiveness) | Registrations (developer keys issued) |
| Adoption (measure of product fit) | Active developers, partners; applications (number, by app type, geo, partner 'tier'); app end users (such as mobile app users); traffic (volume and % API vs. non-API); developer retention (active developers lost) |
| Quality (measure of dev process) | User-experienced problems (errors returned); bugs reported; critical situations (P1 bugs or blocking bugs) |
| Community (measure of customer sat) | Community members; community forum activity and engagement; number of very active members; net promoter score |
| Financial (measure of business model fit) | Revenue; cost of data served (if licensed); profit and margin; market share |
(Thanks to seenoevil for the photo)
Cloud Security series - issues around PII, privacy, and audit compliance »
Greg recently sat down with Ryan Bagnulo, Security Architect for ASPECT-i, to discuss a number of cloud security concerns and issues.
We captured these discussions in six short videos, each focusing on a topic. Here are the first two, on PII, data filtering, and audit and regulatory concerns (see the full series here).
In this first video, Greg and Ryan set things up with discussions on:
- Challenges in deploying cloud, starting with: should you trust your cloud administrator?
- Good data for early cloud adoption (such as public data like news, stocks)
This 2nd short focuses on:
- issues around PII (personally identifiable information)
- counter-measures, such as de-identifying data with filtering, screening or access control
- privacy and regulatory risks around data stored in the cloud
- best practices for protecting data
- implications of violating security breach and privacy regulations
We'd love your thoughts and comments.
OAuth — Take care with those keys! »
A lot has been happening with OAuth recently. Earlier this year a security hole was discovered in the protocol which exposed it to potential “social engineering” attacks. However, the OAuth community is working on a revision to the spec that will eliminate this particular hole.
Last week we wrote a bit on OAuth as an option for API security. But today I wanted to bring up a related OAuth issue - how do you securely manage all those keys?
With traditional username / password authentication, good security practices require you don't just have a big database on the back end with a list of unencrypted passwords. Instead, a hash of the password is stored, preferably using a salt. So someone who can read the password file can verify they have the right password, but cannot see the actual password.
It is still critical to protect access to these hashed passwords. Otherwise, an attacker can mount a dictionary attack to try and crack them. However, even if someone gains access to your entire database of hashed passwords, they can still only easily gain access to lousy passwords. At least users who choose secure passwords are relatively safe. (It is also critical to protect access to the cleartext password, but at least this mechanism doesn’t require that it be stored in a database for all to see.)
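As a rough illustration of the salted-hash approach, here is a sketch using Python's standard library. PBKDF2 with SHA-256 and the iteration count are choices made for this example, not something prescribed above; any deliberately slow, salted password hash serves the same purpose.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password):
    """Return (salt, digest). Store both; never store the password itself."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash from the candidate password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected_digest)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
print(verify_password("password123", salt, digest))
```

Because each password gets its own salt, two users with the same password end up with different stored digests, which is what makes precomputed dictionary tables far less useful to an attacker.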
As networking and middleware people, we spend a lot of time thinking about the security of our network protocols, and especially ensuring that someone eavesdropping on a network cannot grab our passwords and other sensitive data as they fly by. But how many times have we heard of a security breach caused by a stolen laptop? I would argue that protecting so-called “data at rest” is just as important as, or maybe even more important than, protecting the data flying around your network.
Now, back to OAuth. Each “user” in OAuth holds something called an “access token,” which is like a username, and a “token secret,” which is like a password. When a request is sent over the network containing an OAuth authentication token, a bunch of data in the token is signed using the token secret (strictly speaking this is a keyed hash such as HMAC-SHA1 rather than encryption), but the secret itself is never sent over the network. That way, regardless of whether SSL is in use, there is no way to gain access to the token secret by sniffing the network.
However, on the server side, in order to validate the OAuth token, the server must make the same calculation that the client made when it signed the data to put in the token. That means that both the client side and the server side in OAuth must be able to read the unencrypted token secret from some sort of database. Without it, OAuth doesn’t work. There’s no standard way of storing those keys like there is for passwords, so presumably different implementations are storing them in different ways.
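The sketch below shows why the server cannot get away with storing only a hash of the secret. It is a simplified version of OAuth 1.0's HMAC-SHA1 signing: building the real signature base string and percent-encoding the secrets are left out, and the token store and all values are hypothetical.

```python
import base64
import hashlib
import hmac


def sign(base_string, consumer_secret, token_secret):
    """Keyed hash over the request data; the key is built from the cleartext secrets."""
    key = f"{consumer_secret}&{token_secret}".encode("utf-8")
    mac = hmac.new(key, base_string.encode("utf-8"), hashlib.sha1)
    return base64.b64encode(mac.digest()).decode("ascii")


# Hypothetical server-side store. Note that it holds the secret in readable form.
TOKEN_SECRETS = {"token-abc": "token-secret-value"}


def server_verify(token, base_string, signature, consumer_secret):
    """Repeat the client's calculation, which requires reading the token secret."""
    token_secret = TOKEN_SECRETS.get(token)
    if token_secret is None:
        return False
    expected = sign(base_string, consumer_secret, token_secret)
    return hmac.compare_digest(expected, signature)


request_data = "GET&https%3A%2F%2Fapi.example.com%2Fwidgets&page%3D1"  # made-up base string
client_sig = sign(request_data, "consumer-secret-value", "token-secret-value")
print(server_verify("token-abc", request_data, client_sig, "consumer-secret-value"))
```

Only the signature travels over the wire, but anyone who can read `TOKEN_SECRETS` can produce valid signatures for every account in it.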
As a result, any client and any server that uses OAuth has to take extra-special care with all those token secrets. Otherwise, anyone who gets access to the database of tokens and secrets used by the back end servers immediately has access to all the OAuth-enabled accounts.
I am not suggesting a change to the OAuth protocol here — it solves an important problem. However, I am suggesting that anyone who implements either the “service provider” or “consumer” side of OAuth take very special care of those tokens!
For instance:
- If they’re on a regular disk file, protect them using filesystem permissions, make sure that they’re encrypted, and hide the password well.
- If they’re in a database, encrypt the fields, store the key well, and protect access to the database itself carefully.
- If they’re in LDAP, do the same.
Come to think of it, perhaps the world needs a standard LDAP schema for storing OAuth secrets in a secure way. Anyone care to make a proposal?
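For the regular-disk-file case in the list above, here is a minimal sketch of the filesystem-permissions part only, assuming a POSIX system. The file name is hypothetical, and the bytes written are assumed to be already encrypted by whatever tool you use for that; this example does not do the encryption itself.

```python
import os
import stat
import tempfile


def write_secret_file(path, data):
    """Create the file readable and writable by its owner only (mode 0600)."""
    # O_EXCL refuses to reuse an existing file that may have looser permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)


def is_owner_only(path):
    """True if neither group nor others have any permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


with tempfile.TemporaryDirectory() as tmp:
    secrets_path = os.path.join(tmp, "oauth_secrets.bin")  # hypothetical file name
    write_secret_file(secrets_path, b"<already-encrypted token secrets>")
    print(is_owner_only(secrets_path))
```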
Tech Talk: API Visibility and Metrics »
Earlier this week, Greg speculated that Twitter might have benefited from digging deeper into API metrics and usage patterns, so we thought it would be a good time to put him on the spot with a tech talk he recorded on API visibility a couple weeks ago.
For more, here are some sample API metrics considerations and a demo of our own API Analytics solution.
So you want to open an API? »
Greg Brail, our CTO, took advantage of a break from O'Reilly Velocity and a new Flip HD ultra to record a series of four short whiteboard talks on issues many face when opening an API - visibility, security, traffic management, and more. Here is the first clip, and you can preview all of these and more API case studies on our Apigee YouTube channel.
We love feedback and these are quick to do - so if there are any more topics you'd like to see, please let us know.
Challenges when building APIs »
If you’re planning to build an API and expose it to the Internet then you’re going to have to face some challenges that you won’t necessarily find when building an internal web service. For instance:
Design. The best APIs are the simplest, but designing a simple API isn’t easy. Plus, what’s simple to one user population isn’t simple to another. A “REST-style” API like Twitter’s is great for AIR programmers or Perl hackers but someone accessing it from inside a big web app server stack might actually find it easier to use a SOAP web service with a WSDL. On the other hand, a SOAP-only API would have been death for Twitter because it would have meant that those tens of thousands of Perl hackers would have had a heck of a time using it in the first place.
Compatibility. Let’s say you don’t get the design right the first time (and how often does that happen?). How many “old” versions of your API can you afford to keep running to keep clients functioning? Are you willing to tell your users, “sorry, we changed the API and now you have to re-write your apps”?
Authentication and Authorization. What does your API do? If it just lets you look up public information, maybe you don’t need authentication. But are you planning on using it with more sensitive data? Will people be using your API to spend money? They’re going to expect that they have to authenticate using a username and password at the very least. There are quite a few ways to do that — which one(s) will you choose? How will you manage all those accounts?
Threat protection. Is there a possibility that a malformed API request can cause your servers to go off into la-la land, trying to execute an impossible query? Did you code everything right to prevent a SQL injection attack? What if a client sends your servers some bizarre XML: will they run out of memory or crash?
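On the SQL injection question, the usual defense is to pass request values to the database as bound parameters instead of pasting them into the query string. Here is a small sketch using Python's built-in `sqlite3` module; the `widgets` table and its contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widgets (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO widgets (name) VALUES (?)", [("sprocket",), ("gear",)])


def find_widgets(conn, name):
    """Look up widgets by name taken straight from an API request."""
    # The ? placeholder binds `name` as data, so it is never parsed as SQL.
    cur = conn.execute("SELECT id, name FROM widgets WHERE name = ?", (name,))
    return cur.fetchall()


print(find_widgets(conn, "gear"))              # ordinary lookup
print(find_widgets(conn, "gear' OR '1'='1"))   # injection attempt is treated as a literal name
```

The second call returns no rows, because the whole string is compared against the `name` column rather than being interpreted as part of the query.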
Latency. Since the goal of your API is to provide a service over the Internet, then you will have to live with anywhere up to several hundred milliseconds of latency just to get to and from your API. If each API request takes hundreds more milliseconds, or even several seconds, to run, then how will that affect the perception of your service?
Visibility. Who is using your API? How often? How do the patterns change over time? What kind of latency are they seeing? How many errors do they get? Do different users see a higher error rate? Is the user you signed up last week actually using the API? These are all questions you will want to answer in order to serve your customers better.
Rate limiting. How do you plan to limit user access to your API? Sometimes the right answer is to do nothing — and this is often the right answer for an internal system, where saying “no” is not an option. But for a public, Internet-based API, you owe it to yourself to at least protect your API against disaster — a user who decides today’s a great day to see if they can call your API 100 or 1000 times per second, or one that makes a programming mistake and codes up an infinite loop, or worse. And if you’re planning on a larger user population, then a formal set of quotas makes a lot of sense, which is why Twitter, Yahoo, Google, Amazon, and others all put limits on how much you’re allowed to use their APIs before you give them a call and let them know what you’re up to.
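One common way to implement this kind of protection is a token bucket per user. The sketch below is one possible approach, not a description of how any of the providers named above do it; the rate and burst numbers are arbitrary.

```python
import time


class TokenBucket:
    """Allow `rate_per_sec` requests on average, with bursts of up to `burst`."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)   # start with a full bucket
        self.clock = clock           # injectable so the limiter can be tested
        self.last = clock()

    def allow(self):
        """Refill based on elapsed time, then spend one token if available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


buckets = {}  # one bucket per API key or user id


def allow_request(user_id, rate_per_sec=10, burst=20):
    """Check the caller's bucket, creating it on first use."""
    bucket = buckets.setdefault(user_id, TokenBucket(rate_per_sec, burst))
    return bucket.allow()


print(allow_request("user-42"))
```

A client stuck in an infinite loop drains its own bucket quickly and starts getting refusals, while other users' buckets are unaffected.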
Next, I hope to dive into what we're seeing for each of these and more in detail -Greg



