Tuesday, July 29, 2014

URL Bending

URL means "Universal Resource Location", and thus named the construct finds much use as a substitute for a lot of structures, many if not most of which have nothing to do with resource locations.

When any term such as a URL gets assigned multiple meanings, whether these are different people's interpretations of the same purpose or the usages originate as means for different ends, that term becomes a homonym. 

In the context of Web applications, we're told that a URL does just one thing: locate a resource. But the "universal" constructs that URLs attempt to address via a one dimensional string, are multidimensional and contain subspaces that link pervasively between one another.  We are faced with many purposes, many decompositions of the data, many formats, many relationships, and so on... and somehow all those facets are supposed to be encoded as a single human readable identifier. 

Even on something as conceptually straightforward as a topological map, we use discrete coordinates to differentiate the dimensional components of an address. More generally, an address is an N-tuple. That N-tuple can (or must) be represented as a string, but the representation does not usually utilize nesting or containment - the primary dimensions are orthogonal and vary independently of one another.  Yet in a URL most often the string is read left-to-right, and path segments form an implicit hierarchy. Or they don't. There is no single interpretation that is actually canonical in the sense that everyone actually follows it.

Here is, syntactically, where URLs break down: we cannot both, at once, infer hierarchy where it was intended to be implied and not infer hierarchy where it was not, without overloading the URL with a one-off domain specific syntax.

So we see a proliferation of syntactical forms appearing, starting with "?query+parameters".  We argue over meaningless forms - should it be /new/user or /user/new or /users (PUT) or /user (PUT) or whatnot - and the amount of argument is inversely proportional to the triviality of the distinctions to be made. A sound, common grammar is a necessity.

A URL isn't really an address in a sense analogous to a cartesian coordinate. It is a parameter binding. In trying to represent multiple twists and turns by way of a single mangled string, in effect we are tying a knot. Or a bend or hitch, if you will, depending upon the object and subject being tied. The moves in tying this knot form a sort of grammar, which for lack of a better term and because it sounds like binding, I'll call "URL Bending".  For reference, a bend is a knot used to join two or more lengths of rope. 

A grammar for bending could be formalized, I suppose. We would need to grok the distinct dimensions along which resources are addressed in various bounded contexts represented in the solution space. (Determining open addressing models is a heavy focus of ISO/IEC 10744:1997, aka "HyTime".) We would need to grasp whether those bits should be included in the knot tying, or should more opaquely be mapped to components of the transaction concomitant with the use of the URL, like the HTTP method or POST or PUT content payloads or HTTP headers. 
Post a Comment