There’s been an interesting discussion over at JSMentors.com about JSONP and how to make it safer. This is a good thing, not least because it forced me to take a deeper look and come up with a (sort of) counter-proposal of my own.
We’ll start with an overview of JSON basics, including the EcmaScript 5 JSON API, and then discuss cross-domain JSON retrieval via JSONP. Finally I’ll introduce a simple and relatively safe JSONP framework and show how to use it to fetch tweets from the Twitter database.
What is JSON?
JSON (JavaScript Object Notation) is a lightweight data interchange format based on the JavaScript literal representation of Objects, Arrays, Strings, Numbers and Booleans. A variation of JSON is supported by most modern languages and it now competes with XML as a data protocol for web services, http and system configuration.
JSON was formalized and popularized by Douglas Crockford starting around 2001. The specification is described in RFC 4627.
OK, OK, I can get that from Wikipedia. We want examples
OK – so here’s some cookies (the good kind) expressed in JSON…
    {
        "cookies": {
            "oatmeal": {
                "ingredients": [ "flour", "sugar", "oats", "butter" ],
                "calories": 430,
                "eatBy": "2010-12-05",
                "kosher": true
            },
            "chocolate": {
                "ingredients": [ "flour", "sugar", "butter", "chocolate" ],
                "calories": 510,
                "eatBy": "2010-12-03",
                "kosher": true
            }
        }
    }
…this is equivalent to the following xml expression…
    <cookies>
        <oatmeal>
            <ingredients>flour</ingredients>
            <ingredients>sugar</ingredients>
            <ingredients>oats</ingredients>
            <ingredients>butter</ingredients>
            <calories>430</calories>
            <eatBy>2010-12-05</eatBy>
            <kosher>true</kosher>
        </oatmeal>
        <chocolate>
            <ingredients>flour</ingredients>
            <ingredients>sugar</ingredients>
            <ingredients>butter</ingredients>
            <ingredients>chocolate</ingredients>
            <calories>510</calories>
            <eatBy>2010-12-03</eatBy>
            <kosher>true</kosher>
        </chocolate>
    </cookies>
So JSON is just like JavaScript?
Not exactly. Although JSON looks a lot like JavaScript, it is further constrained by the following rules:
- JSON represents six value types: objects, arrays, numbers, strings, booleans and the literal null
- Dates are not recognized as a unique value type
- The concept of a JavaScript identifier is not understood by JSON. All key names must be JSON strings
- JSON strings must be wrapped by double quotes.
- JSON numbers cannot have leading zeros (unless adjacent to a decimal point)
Moreover, since JSON is intended to be language independent, JSON objects should be considered as generic strings, not JavaScript objects.
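The rules above are easy to check against a real parser. Here is a minimal sketch using JSON.parse (introduced later in this article); the helper name isValidJSON is made up for illustration and is not part of any library.

```javascript
// Hypothetical helper (not part of any library): true if text parses as JSON.
function isValidJSON(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false; // JSON.parse throws a SyntaxError on invalid input
  }
}

isValidJSON('{"calories": 430}');   // true:  double-quoted key
isValidJSON("{calories: 430}");     // false: bare identifier as key
isValidJSON("{'calories': 430}");   // false: single-quoted strings
isValidJSON('{"calories": 0430}');  // false: leading zero
isValidJSON('{"calories": 0.43}');  // true:  zero adjacent to a decimal point
isValidJSON('null');                // true:  the literal null
```

Every rejected string is a perfectly good JavaScript expression once wrapped in parentheses, which is exactly the gap between the two notations.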
Using JSON in JavaScript
JSON is a useful format in which to receive server responses from XHR requests. Presumably this response will be in the form of a string. One way to convert a JSON string to a JavaScript object is by supplying it as an argument to the eval function:
    var myCookies = eval('(' + cookieJSON + ')');
    myCookies.cookies.chocolate.ingredients[1]; //"sugar"
(The extra parentheses are necessary because of ambiguity in the way JavaScript interprets a leading curly bracket)
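To see the ambiguity in action, here is a small sketch. The cookieJSON value is a made-up stand-in for a server response.

```javascript
var cookieJSON = '{"calories": 510}'; // stand-in for a server response

// Without parentheses the leading "{" opens a block statement, and
// "calories": 510 is not a valid statement, so eval throws a SyntaxError.
var bareFailed = false;
try {
  eval(cookieJSON);
} catch (e) {
  bareFailed = (e.name === 'SyntaxError');
}
bareFailed; // true

// With parentheses the text is parsed as an expression: an object literal.
var parsed = eval('(' + cookieJSON + ')');
parsed.calories; // 510
```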
Regular XHR transactions are subject to the same domain constraint so you can be pretty sure that the response is coming from your own server. Nevertheless the paranoid amongst us will fret about the consequences of a server error or malicious redirect, and indeed a blind eval of whatever gremlins your server coughs up might just get you into trouble one day.
Luckily ES5 is looking out for you…
JSON.parse and JSON.stringify
ES5 specifies a new built-in object called JSON with two useful functions based on an API originally developed by Douglas Crockford.
JSON.parse performs a “safe eval” of supposed JSON strings. (Crockford’s original json2.js library screens the text with regular expressions before handing it to eval; native implementations generally parse the text directly and never evaluate it as code.) If the string is not valid JSON, a SyntaxError exception is thrown and nothing gets evaluated. There is a second optional argument, reviver, a function that takes two parameters (key and value). If supplied, the reviver function is applied to every key/value pair produced by the parse, which may cause certain values to be modified according to the function’s logic. A typical use of the reviver is to reconstitute date values from strings (though it’s worth noting that ES5 also specifies a Date.prototype.toJSON function):
    function dateReviver(key, value) {
        if (typeof value === 'string') {
            var a = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
            if (a) {
                return new Date(Date.UTC(+a[1], +a[2] - 1, +a[3]));
            }
        }
        return value;
    }

    var myCookies = JSON.parse(cookieJSON, dateReviver);
    myCookies.cookies.oatmeal.eatBy; //Sat Dec 04 2010 16:00:00 GMT-0800 (Pacific Standard Time)
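Since Date.prototype.toJSON got a mention, here is a rough sketch of a full round trip: a Date goes out through JSON.stringify and comes back through a reviver. The names isoPattern and isoDateReviver are my own for this illustration, and the pattern only recognizes the exact string shape that toJSON emits.

```javascript
// JSON.stringify calls Date.prototype.toJSON for Date values,
// which produces an ISO 8601 string in UTC.
var original = { eatBy: new Date(Date.UTC(2010, 11, 5)) };
var text = JSON.stringify(original);
// text is '{"eatBy":"2010-12-05T00:00:00.000Z"}'

// Hypothetical reviver: turns strings of that exact shape back into Dates.
var isoPattern = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z$/;
function isoDateReviver(key, value) {
  if (typeof value === 'string' && isoPattern.test(value)) {
    return new Date(value);
  }
  return value;
}

var restored = JSON.parse(text, isoDateReviver);
restored.eatBy.getTime() === original.eatBy.getTime(); // true
```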
JSON.stringify does the opposite. The value argument is required and can be any JavaScript value (though it’s typically an object or an array). The result of invoking stringify is a JSON string. There are also two optional arguments, replacer and space. If replacer is a function then it basically acts as a reviver in reverse; however it can also be an array, in which case it acts as a whitelist of object properties to be serialized. The space argument is a formatting device; its value can be either a number or a string. If a number is supplied, it represents the number of spaces with which to indent each level. If the argument is a string (typically ‘\t’), then the return-value text is indented with the characters in the string at each level.
    JSON.stringify(myCookies, ['cookies','oatmeal','chocolate','calories'], '\t')
    /*
    '{
        "cookies": {
            "oatmeal": {
                "calories": 430
            },
            "chocolate": {
                "calories": 510
            }
        }
    }'
    */
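The example above uses the array form of replacer and a string for space. For completeness, here is a sketch of the other two forms: a replacer function and a numeric space. The cookie object and the dropStrings function are invented for this illustration.

```javascript
var cookie = { name: 'oatmeal', calories: 430, eatBy: '2010-12-05' };

// Hypothetical replacer: drop every string value, keep everything else.
// Returning undefined omits the property from the output.
function dropStrings(key, value) {
  return typeof value === 'string' ? undefined : value;
}

var filtered = JSON.stringify(cookie, dropStrings);
// filtered is '{"calories":430}'

// A numeric space argument indents each level by that many spaces.
var indented = JSON.stringify({ calories: 430 }, null, 2);
// indented is '{\n  "calories": 430\n}'
```

Note that the replacer is first called with an empty key and the whole object as the value, so a replacer that filters by type needs to let objects through.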
Both functions are implemented by all modern browsers (but not IE7). Asen Bozhilov is compiling a compatibility table which exposes differences in how vendors interpret JSON.parse.
JSONP
We’ve seen that we can use JSON to transport data between server and client, and that we can do so relatively safely. But what about fetching data from other domains? I happen to know Twitter has a rich API for grabbing historical tweet data, but I’m constrained by the same origin policy. That is, unless my client is in the twitter.com domain, using a regular XHR get will get me nothing more than an HTTP error.
A standard workaround is to make use of Cross Origin Resource Sharing (CORS) which is now implemented by most modern browsers. Yet many developers find this a heavyweight and somewhat pedantic approach.
JSONP (first documented by Bob Ippolito in 2005) is a simple and effective alternative that makes use of the ability of script tags to fetch content from any server.
This is how it works: a script tag has a src attribute which can be set to any resource path, such as a URL, and need not return a JavaScript file. Thus I can easily stream my Twitter feed to my client as JSON.
    var scriptTag = document.createElement('SCRIPT');
    scriptTag.src = "http://www.twitter.com/status/user_timeline/angustweets.json?count=5";
    document.getElementsByTagName('HEAD')[0].appendChild(scriptTag);
This is great news except it has absolutely no effect on my web page, other than to bulk it out with a bunch of unreachable JSON. To make use of Script tag data we need it to interact with our existing JavaScript. This is where the P (or “padding”) part of JSONP comes in. If we can get the server to wrap its response in one of our own functions we can make it useful.
Ok here goes:
    var logIt = function(data) {
        //print last tweet text
        window.console && console.log(data[0].text);
    }

    var scriptTag = document.createElement('SCRIPT');
    scriptTag.src = "http://www.twitter.com/status/user_timeline/angustweets.json?count=5&callback=logIt";
    document.getElementsByTagName('HEAD')[0].appendChild(scriptTag);

    /* console will log:
    @marijnjh actually I like his paren-free proposal (but replacing global w/ modules seems iffy) JS needs to re-assert simplicity as an asset
    */
Whoa – how on earth did I do that? Well, not without a lot of help from twitter, who along with many other APIs now support JSONP style requests. Notice the extra request parameter: callback=logIt. This tells the server (twitter) to wrap their response in my function (logIt).
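To make the padding step concrete, here is a rough sketch of what a cooperating server might do with the callback parameter. I have no idea how Twitter actually implements it: padResponse is a hypothetical helper, and a Function constructor stands in for the script tag so the example can run without a network.

```javascript
// Hypothetical server-side helper: pads a JSON payload with the
// callback name taken from the request's callback parameter.
function padResponse(callbackName, data) {
  return callbackName + '(' + JSON.stringify(data) + ');';
}

var body = padResponse('logIt', [{ text: 'hello' }]);
// body is 'logIt([{"text":"hello"}]);'

// When the browser loads that body through a script tag it simply runs it.
// Here a Function stands in for the script tag, with logIt passed in.
var logged = [];
var logIt = function (data) { logged.push(data[0].text); };
new Function('logIt', body)(logIt);
// logged is now ['hello']
```

The thing to notice is that the response is a program, not data: the client runs whatever comes back.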
JSONP looks pretty nifty. Why all the fuss?
OK so, finally, we’re caught up and ready to check out the JSMentors.com discussion I referenced at the top of the article. Peter Van der Zee, Kyle Simpson (a.k.a. Getify) and others are concerned about the safety of JSONP and understandably so. Why? Because whenever we make a JSONP call we are going to invoke whatever code the server puts in our hands, no questions asked, no going back. It’s a bit like going to a restaurant with a blindfold on and asking them to shovel food into your mouth. Some places you trust, some you don’t.
Peter recommends stripping the function padding from the response and implementing it manually only after the response has been verified as pure JSON. The idea is basically sound but he goes into few implementation details. He also regrets the current requirement that a global variable be supplied. Kyle’s proposal is similar: he too advocates a post response verification based on the mime type of the Script tag – he suggests introducing a new JSONP specific mime type (e.g. “application/json-p”) which would trigger such a validation.
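Neither proposal comes with code, so what follows is only my own guess at what "strip the padding, then verify" could look like. It assumes the raw response text is somehow available, which a cross-domain script tag does not give you today (that is precisely what the proposals ask browser vendors to fix). The helper name unwrapJSONP is hypothetical.

```javascript
// Hypothetical helper: accepts only text of the exact form
//   expectedCallback( <valid JSON> )
// and returns the parsed payload; anything else throws.
function unwrapJSONP(text, expectedCallback) {
  var match = /^\s*([A-Za-z_$][\w$]*)\s*\(([\s\S]*)\)\s*;?\s*$/.exec(text);
  if (!match || match[1] !== expectedCallback) {
    throw new SyntaxError('response is not a call to ' + expectedCallback);
  }
  // JSON.parse throws a SyntaxError if the payload is not pure JSON.
  return JSON.parse(match[2]);
}

unwrapJSONP('logIt({"text":"hi"});', 'logIt').text; // "hi"

// Both of these throw instead of running anything:
//   unwrapJSONP('runMaliciousCode();logIt({})', 'logIt')
//   unwrapJSONP('logIt(1);evil()', 'logIt')
```

Only after this check passes would the real callback be invoked with the parsed data, so no server-supplied code is ever executed.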
My JSONP Solution
I agree with the spirit of both Kyle and Peter’s arguments. Here is a lightweight JSONP framework that might address some of their concerns. The function evalJSONP is a callback wrapper which uses a closure to bind the custom callback to the response data. The custom callback can be from any scope and, as in the following example, can even be an anonymous function created on the fly. The evalJSONP wrapper ensures that the callback will only get invoked if the JSON response is valid.
    var jsonp = {
        callbackCounter: 0,

        fetch: function(url, callback) {
            var fn = 'JSONPCallback_' + this.callbackCounter++;
            window[fn] = this.evalJSONP(callback);
            url = url.replace('=JSONPCallback', '=' + fn);

            var scriptTag = document.createElement('SCRIPT');
            scriptTag.src = url;
            document.getElementsByTagName('HEAD')[0].appendChild(scriptTag);
        },

        evalJSONP: function(callback) {
            return function(data) {
                var validJSON = false;
                if (typeof data == "string") {
                    try {validJSON = JSON.parse(data);} catch (e) { /*invalid JSON*/}
                } else {
                    validJSON = JSON.parse(JSON.stringify(data));
                    window.console && console.warn(
                        'response data was not a JSON string');
                }
                if (validJSON) {
                    callback(validJSON);
                } else {
                    throw("JSONP call returned invalid or empty JSON");
                }
            }
        }
    }
(Update: at the suggestion of Brian Grinstead and Jose Antonio Perez I tweaked the util to support concurrent script loads)
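The concurrency support comes down to the naming step at the top of fetch. Isolated from the DOM work it looks roughly like this; prepareRequest is a made-up name for this sketch and the example.com URLs are placeholders.

```javascript
var callbackCounter = 0;

// Mirrors the naming step of jsonp.fetch: each request swaps the
// JSONPCallback placeholder for a name no other request is using.
function prepareRequest(url) {
  var fn = 'JSONPCallback_' + callbackCounter++;
  return { callbackName: fn, url: url.replace('=JSONPCallback', '=' + fn) };
}

var first = prepareRequest('http://example.com/a.json?callback=JSONPCallback');
var second = prepareRequest('http://example.com/b.json?callback=JSONPCallback');
// first.url  is 'http://example.com/a.json?callback=JSONPCallback_0'
// second.url is 'http://example.com/b.json?callback=JSONPCallback_1'
```

Because every request gets its own global function, it no longer matters which response arrives first.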
Here are some usage examples:
    //The U.S. President's latest tweet...
    var obamaTweets = "http://www.twitter.com/status/user_timeline/BARACKOBAMA.json?count=5&callback=JSONPCallback";
    jsonp.fetch(obamaTweets, function(data) {console.log(data[0].text)});

    /* console logs:
    From the Obama family to yours, have a very happy Thanksgiving. http://OFA.BO/W2KMjJ
    */

    //The latest reddit...
    var reddits = "http://www.reddit.com/.json?limit=1&jsonp=JSONPCallback";
    jsonp.fetch(reddits , function(data) {console.log(data.data.children[0].data.title)});

    /* console logs:
    You may remember my kitten Swarley wearing a tie. Well, he's all grown up now, but he's still all business. (imgur.com)
    */
Note that sites such as twitter.com actually return unquoted JSON which causes the Script tag to load a JavaScript object. In such cases it’s the JSON.stringify method that actually does the validation by removing any non-JSON compliant attributes, after which the JSON.parse test is sure to pass. This is unfortunate because even though I can cleanse the object of non JSON data I will never know for sure whether the server was trying to send me malicious content (short of writing a horrendous equals method to compare the original streamed object with the stringified and parsed version). The best I can do is log a warning in the console.
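Here is a small sketch of that cleansing step on its own. The response object is invented for the example; it stands in for whatever a script tag might hand to the callback.

```javascript
// Hypothetical response object, as a script tag might hand it to the callback.
var response = {
  text: 'hello',
  sneaky: function () { return 'not data'; },
  missing: undefined,
  when: new Date(Date.UTC(2010, 10, 25))
};

var cleansed = JSON.parse(JSON.stringify(response));

typeof cleansed.sneaky;   // "undefined": functions are dropped
'missing' in cleansed;    // false: undefined values are dropped
cleansed.when;            // "2010-11-25T00:00:00.000Z": Dates become strings
cleansed.text;            // "hello": plain data survives untouched
```

The round trip silently discards or converts anything that is not JSON, which is why it cleanses the data but cannot tell me what was there before.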
To clarify: this is safer, not safe. If the server provider simply chooses to ignore your request to wrap its response in your function then you’re still left wide open, but if nothing else, what I’ve presented should make using JSONP a breeze. It’s also gisted here. Hope it’s useful 😉
Further Reading
Douglas Crockford: Introducing JSON
Peter Van der Zee: Proposal for safe jsonp part 1, part 2
Kyle Simpson: Defining Safer JSON-P
Matt Harris: Twitter API
ECMA-262 5th Edition 15.12: The JSON Object
Thanks for the article. I left a response at http://www.briangrinstead.com/blog/safe-jsonp
Basically, I made a couple of changes to allow multiple requests at once by creating a new JSONCallback function for each request. This makes it more robust by handling the case where “Request 1” gets sent before “Request 2”, but “Request 2” responds first.
One thing that this method doesn’t provide is protection from a site that doesn’t return JSONP at all – it is still able to execute arbitrary scripts. This is a weakness inherent in adding a script tag to your page from a source you do not control, and I suppose that fixing this would require using a different method entirely.
Thanks Brian. I incorporated the multiple request code from your version. (also at //gist.github.com/722562)
(And that’s the only thing my proposal really just aims for; preventing arbitrary script execution. The supplied callback (rather than global) just seems like a logical sugar.).
As I said on Twitter, your solution doesn’t (and can’t) address that. Vendors really do need to implement a new tool for that. The rest of your script is indeed what my proposal would ask for when the data is in 🙂
And also keep in mind that the 3rd party can return a script that will call your callback once it has finished its dirty work, say, injecting another tracking beacon into your site. Maybe checking the text content of the inserted script tag periodically or on load could detect such evilness, although at best it could only warn you or the user that a malicious or suspicious script was injected into the site.
Right now, JSON.parse() appears to be quite safe for me, much safer than eval(). Could you show me an example where the 3rd party could do evil things with its response? (this question is directed to Peter and anybody who can answer).
Think of any XSS attack you can imagine. The basic idea is to get some JavaScript that you control to a client who trusts a site. That script could load more scripts, change the site, replace links, hijack the user’s session, read other cookie values, etc. There is plenty to read about XSS, and it can be used to do serious damage to the users and to a company that does not care about XSS.
Hi Thomas
I agree that JSON.parse is quite safe. I think the point is there is no guarantee that the recipient of the JSONP request will wrap their response in your function as defined by the contract. If the wrapping function doesn’t get called then neither does the enclosed JSON.parse.
Thanks for your answers, I see now. Basically, if the 3rd party response is `eval(/*some malicious code + a call to the right callback with real data*/)`, the attack could be completely transparent.
In fact, the 3rd party does not need to do any trickery to get their code to run. Since you added their script directly onto your page, their response can be anything.
If the response was simply
runMaliciousCode();
there would be absolutely no way to prevent or even detect that activity from the client side. This is a weakness inherent in the JSONP technique, since it requires adding a script to your page.
Put another way, if mysite.com/index.html adds a reference to yoursite.com/api.js, and api.js returns alert(window.location.href), it would alert “http://mysite.com”. When you add a script to your site, you are giving it the exact same context and permissions you give your own scripts.
Hi,
I have written an article about this topic on my website in which I have translated a lot of your content into Spanish.
The url is:
http://www.etnassoft.com/2010/12/30/tutorial-json/
I hope that it may be useful for someone.
Best regards and great job!
Thanks Etnas – I’ve linked your article at the top of mine
It’s important to note that IE6 JS doesn’t have those JSON methods so if you want to use JSON.parse and JSON.stringify in IE6 then you’ll first need to load the JSON utility script from JSON.org.
Angus, thanks so much for this nice approach. I’ve been testing it and it seems to succeed nicely.
But: Why does this work when I use PHP’s $_GET[‘callback’] to output “JSONPCallback(…json…)”, but when I simply try to use “datafile.json” or “datafile.js” files with the exact same output (including the function), it does not work. Is there a specific HTTP header I need to use?
Hello, thank you for the nice and clean tutorial. I am kind of new to JSONP can you explain, do I have to modify the JSON file wrapping the json content with for example foo( //json here ); then callback=foo. That was the only way to make it work around. Either way I leave it plain json it just does not work, even if I pass it to an eval function.
Hi Angus. Any chance you could update the JSMentors.com link you’re referring to?
Thanks.