Wednesday, November 12, 2014

Version Control and Codependent Relationships

In the midst of others' conversations at a dinner meetup recently, I asked one quiet young designer/developer sitting across from me where she was from. She mentioned her home state, and that she worked for a certain marketing agency in that area. Then, with a hint of wistfulness, she offered the ironic observation that although they were a leading agency with a huge backlog, they were not up to speed with modern software development practices such as version control.

I also heard an undertone of fatalism, and frustration about how to communicate, and I'd heard that many times before. In Web Design especially, modern software engineering practices have only begun to really take root as a professional practice. This is because only recently have Web Designers begun to recognize themselves as serious software professionals.

That is not to say that they weren't serious before, or that they weren't software developers. It is just that in their daily practice, their brains had yet (and to some extent have yet) to converge on a common cultural recognition that they are professionals with a professional discipline.

There are many reasons for this:
  • personal immaturity - a person is "just not there" yet, and may not see the value in reinvesting effort in skill building
  • "fire fighter" mentality - fire fighters don't have to be concerned about building structures, they just try to keep the flames at bay
  • stress - people who feel under the gun have much less presence of mind for reflection, self-improvement, or process improvement
  • management reactivity - this contributes to stress too and IMHO is the most important root cause in a small business environment

Management Reactivity

The young developer mentioned that the idea of version control prevalent in their office was to yell over a cubicle wall, "Hey, I'm going to edit FuBar.html, is anyone else editing it?" Yet this isn't even rudimentary version control such as copying to snapshot folders or renaming .bak files - it is just a verbal semaphore.

The reason for this immaturity of practice? Ostensibly, it is that they do not have the time to pick up a new practice and put it in place, while also getting the backlog worked on.  

The root cause is reactivity in management. I do not mean "knee jerk" reactions, although that is a visible sign of reactivity.  It may alternatively be that management is poorly trained and possibly even incompetent. By reactivity I mean any practice that undermines a continuous improvement process by constantly misaligning the goals and the actual values expressed to the team. 

Lumped together, you might simply say it is bad management. Other signs:
  • the company does not allocate a sufficient amount of resources for continued professional skill building
  • calculated risk taking is discouraged; the level of proof required to bring in new techniques or technologies is set higher than the level of proof required to keep the existing known poor practices and technologies with persistent defects
  • supervisors are not actively contributing to work output, but are all mere overseers
  • heavy emphasis on documentation in planning, with little reference or use of those documents by the team performing the work
  • frequent use of the word "just," "only," or other hedging language that diminishes the cost/effort/time/importance/complexity/thinking required to move forward in a sensible direction
  • "Continuous improvement" is an oft-repeated cliche, but there is no practical way for developers to start down any path that changes the toolchain or tactics.

Co-Dependence

Now, here's the thing: that developer is young and that developer is smart,  so that developer has the power to effect change. Period. And that should be the End of Discussion.

But it isn't the end of the discussion. That developer is also inexperienced and is fearful or at least risk averse, and it is the employer who has the money. There is a real power imbalance when the developer sees herself as the one who needs the money more than anyone else needs her skills. 

By postponing skill building, the developer puts herself in a position to be used reactively. 

By foregoing process and technology improvement - and suppressing the adoption of modern software practices - the employer keeps the developer in a co-dependent posture. 

The tactics the developer learns to deal with problems reactively are employer-specific, and thus much less transferable. At best, they fail to make the developer more attractive to another potential employer. The employer can pay a co-dependent developer less, because the developer lacks confidence and lacks opportunities. Modern practices, on the other hand, make the developer more attractive to competitors and help equalize the balance of power.

You get the idea. The sad thing is, co-dependence hurts all parties in a relationship. The employer will fall behind competitors, and so will the employee. 

Sunday, August 3, 2014

Namespaces in Ruby

Ruby is a very plastic language. By plastic, I don't mean "fake"; I mean easily manipulated. I was considering namespaces as they are in PHP and a number of languages derived syntactically from C:

namespace \MyOrg\MyDomain\MyApp\MyPackage\Foo;



I was thinking of Ruby. In Ruby, there is no single namespace declaration; instead, the language provides the Module construct to more-or-less accomplish the same goals. The difficulty is that modules involve rather more syntax, not less.
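
For comparison, the conventional nested-module equivalent of that single PHP namespace declaration looks something like this (a minimal sketch, carrying over the names from the example above):

module MyOrg
  module MyDomain
    module MyApp
      module MyPackage
        module Foo
          # namespace members would go here
        end
      end
    end
  end
end
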
Poking around Google, I came across this little gist in which Justin Herrick describes a short DSL he made to get a nice, brief, Clojure-like syntax:

ns 'MyOrg.MyDomain.MyApp.MyPackage.Foo' do
   def fluggelduffel
      ...
   end
end


Herrick's solution takes advantage of Ruby's seemingly limitless ability to modify the module environment. And it works, with one limitation: constants referenced in a method like fluggelduffel, or anywhere in the do block for that matter, throw a NameError unless const_set is used:

ns 'MyOrg.MyDomain.MyApp.MyPackage.Foo' do
   const_set("A","FUBAR")
   def fluggelduffel
      puts A
   end
end


I played around with the code a bit to add an options hash:

ns 'MyOrg.MyDomain.MyApp.MyPackage.Foo', {  :constants=>{ :A=>"FUBAR" } } do
   def fluggelduffel
      puts A
   end
end


The code simply calls const_set in a different place. The constant A is there in module Foo, but it isn't visible in the lexical scope in which puts is referencing A. We can address A explicitly via MyOrg::MyDomain::MyApp::MyPackage::Foo::A, but how ugly is that? We can also use const_get('A') but that is pretty ugly too.

The problem is that bare references to constants are resolved in the lexical scope in which the block was created. It has nothing to do with the scope the constant is defined in. What to do?

There isn't a lot that can be done. If you're using unqualified constants, that's pretty ugly in itself... polluting your code with global references and all. If you really need that (dis)ability, const_get('A') follows the nesting chain all the way up. I've found that self::A works fine for the globals I've defined locally using const_set, though I'm uncertain if there are any side-effects or weird interactions. In this way, constants can be defined dynamically, and attached to the initial namespace definition.
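
For anyone who wants to poke at the underlying behavior without the DSL, here is a minimal sketch of my own (not Herrick's gist; the module name Target is made up) that reproduces the same constant-resolution quirk with a plain module_eval:

module Target; end

Target.const_set("A", "FUBAR")

Target.module_eval do
  def self.show
    # puts A             # NameError: the bare reference resolves against the lexical
                         # scope where this block was written (top level), not Target
    puts const_get("A")  # works: explicit lookup on the receiving module
    puts self::A         # works here because self is Target, which is a module
  end
end

Target.show   # prints FUBAR twice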

Saturday, August 2, 2014

HTML is BAD, and YOU SHOULD FEEL BAD for using it

No, I don't really think this, but that's a catchy headline, isn't it?

On the other hand, there is a part of me that thinks that HTML represents a sort of dishonesty, a kind of technological plagiarism. 

The Not Invented Here (NIH) reinventing of wheels is often the standard of practice across the business world - reinforced with Intellectual Property portfolios and litigation. Technologists thrive on NIH. The behavior may persist in part simply because technology-oriented humans enjoy tinkering with something we perceive as being new.

It is further promoted by broad-based illiteracy among practitioners. The Internet helps people self-educate, but as masses of people learn the rudimentary basics of programming they are apt to stop when they learn just enough to be dangerous, that is, just enough to earn some money from a skill. Those with any real interest in the science will be doomed to wander through parts of the discipline that were already well explored decades ago.

Yet the most corrosive aspect of NIH on platforms-substituted-as-standards such as HTML is intellectual dishonesty. The same kind of intellectual dishonesty that pervades business advertising, the posturing of vendors towards clients in fixed bid contracts, and that lawyers and politicians seem to consider acceptable in love and war.  Even if they were self-aware of their own agendas, they would not admit to it; and it corrupts both the culture and the outcomes at once. 

Don't get me wrong. Society as a whole benefited greatly from the worse-is-better approach embodied in HTML. The world's peoples gained experience in a domain previously occupied by a few brave geeky souls. We got cool toys and new ways of doing medicine - and innumerable other benefits from exploring the space with just enough technology, even if it was a bit broken.

The problem space will eventually press in on the field. We see pseudo-standards such as micro-formats and Web Components competing to represent multiple parallel domains of information in HTML encoded resources. Technologically, they are neat, and I have little doubt that they serve to further the interests of Google and various social media manipulators. But they also work at cross-purposes to the original intent of markup, which is to represent information with integrity and to make it accessible and open over the long term for all stakeholders.   

As people move forward with Web Components, I'm reminded that XML offered us the ability to use our own tag names to represent information content. A Web Component can be designed as a Graphical User Interface widget, but the higher use is to isolate or entirely occlude for-the-Browser behaviors behind elements that express only the problem domain's semantics. Otherwise, we're just back to writing 4GL applications again, and we did that back in the '80s.





Tuesday, July 29, 2014

URL Bending

URL means "Uniform Resource Locator", and the construct thus named finds much use as a substitute for a lot of structures, many if not most of which have nothing to do with resource locations.

When any term such as a URL gets assigned multiple meanings, whether these are different people's interpretations of the same purpose or the usages originate as means for different ends, that term becomes a homonym. 

In the context of Web applications, we're told that a URL does just one thing: locate a resource. But the "universal" constructs that URLs attempt to address via a one-dimensional string are multidimensional and contain subspaces that link pervasively between one another. We are faced with many purposes, many decompositions of the data, many formats, many relationships, and so on... and somehow all those facets are supposed to be encoded as a single human-readable identifier.

Even on something as conceptually straightforward as a topographic map, we use discrete coordinates to differentiate the dimensional components of an address. More generally, an address is an N-tuple. That N-tuple can (or must) be represented as a string, but the representation does not usually utilize nesting or containment - the primary dimensions are orthogonal and vary independently of one another. Yet in a URL the string is most often read left-to-right, and path segments form an implicit hierarchy. Or they don't. There is no single interpretation that is actually canonical in the sense that everyone actually follows it.

Here is, syntactically, where URLs break down: we cannot both, at once, infer hierarchy where it was intended to be implied and not infer hierarchy where it was not, without overloading the URL with a one-off domain specific syntax.

So we see a proliferation of syntactical forms appearing, starting with "?query+parameters". We argue over meaningless forms - should it be /new/user or /user/new or /users (PUT) or /user (PUT) or whatnot - and the amount of argument is inversely proportional to the importance of the distinctions to be made. A sound, common grammar is a necessity.

A URL isn't really an address in a sense analogous to a Cartesian coordinate. It is a parameter binding. In trying to represent multiple twists and turns by way of a single mangled string, in effect we are tying a knot. Or a bend or hitch, if you will, depending upon the object and subject being tied. The moves in tying this knot form a sort of grammar, which for lack of a better term and because it sounds like binding, I'll call "URL Bending". For reference, a bend is a knot used to join two or more lengths of rope.
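
To make the "parameter binding" view concrete, here is a small Ruby sketch (the route pattern, names, and URL are hypothetical) that untangles one such knot back into the N-tuple of bindings it carries:

require 'uri'

PATTERN = %r{\A/users/(?<user_id>\d+)/orders/(?<order_id>\d+)\z}

def unbend(url)
  uri   = URI.parse(url)
  match = PATTERN.match(uri.path) or return nil
  query = URI.decode_www_form(uri.query.to_s).to_h
  # The "address" is really an N-tuple of named bindings; some dimensions were
  # folded into the path hierarchy, the rest rode along as query parameters.
  { user_id: match[:user_id], order_id: match[:order_id], format: query["format"] }
end

unbend("http://example.com/users/42/orders/7?format=json")
# => {:user_id=>"42", :order_id=>"7", :format=>"json"}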

A grammar for bending could be formalized, I suppose. We would need to grok the distinct dimensions along which resources are addressed in various bounded contexts represented in the solution space. (Determining open addressing models is a heavy focus of ISO/IEC 10744:1997, aka "HyTime".) We would need to grasp whether those bits should be included in the knot tying, or should more opaquely be mapped to components of the transaction concomitant with the use of the URL, like the HTTP method or POST or PUT content payloads or HTTP headers. 

Monday, July 21, 2014

Separation of Concerns

If you are a developer and you work long enough on business applications, you start to sense the corrosive effects when divergent interests and viewpoints are forced into a single representation.

It may be a vague suspicion - a code smell. You don't necessarily know precisely what those conflicting concerns may be, or why they were conflated in the first place, or the possible consequences of trying to separate them. But you know that a valid stakeholder concern can be addressed only if it is identified. Separating concerns is a necessary, but not sufficient, step in the right direction.

The volatility of your codebase - the rate at which changes tend to grow - depends on how well matched the codebase is to the concerns it seeks to address. If the code tends to cover a mere fraction of one concern with each coding unit, its volume will blow out, and with it the difficulty of managing all the moving pieces. If the code tends to cover many concerns in a few monolithic coding units, it may be too terse for a human to sensibly and reliably decode, and have so few points of articulation that the simplest of local changes gives rise to a cascade of far-reaching fissures.
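
As a rough sketch of the coarse-grained extreme (the class names are invented for illustration, not taken from any real codebase), compare one unit that carries persistence, presentation, and business policy at once with units that each carry a single concern:

# Monolithic: a change to the tax rules, the HTML layout, or the database
# schema all land in the same coding unit.
class Invoice
  def save_to_database
    # ...
  end

  def render_html
    # ...
  end

  def apply_tax_policy
    # ...
  end
end

# Separated: each unit covers one concern, and the points of articulation
# between them are explicit.
class InvoiceRecord
  def save(invoice)
    # ...
  end
end

class InvoicePresenter
  def to_html(invoice)
    # ...
  end
end

class TaxPolicy
  def apply(invoice)
    # ...
  end
end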

Cohesion is the term often used to describe the continuum between these extremes, but the success of biological systems calls this dogma into question. Cohesion is neither necessary nor sufficient for a dynamically stable, long-running, self-maintaining system; so I think it is not really necessary or sufficient even for our crude software approximations of real world processes.  A better metaphor is the optic system, principally the concept of focal length.

In optical systems, the focal length is a measure of how strongly a lens bends light, determining among other things the magnification and angular field of view. It also influences the degree to which an image blurs when cast upon a surface parallel to the principal plane of the lens. When the distance is just right, the projected image is sharp and all detail is to scale. When the distance is too close or too far away, the projected image is blurred and/or skewed.

Being cohesive isn't enough. A coding unit that does not represent an optimal fraction of concerns is either out of focus, skewed, or both blurred and skewed.

Sunday, July 20, 2014

Conservation of Information

Information must always be available, but it is not necessarily usable. Like energy, information is conserved - it cannot really be lost or created, only changed in form. Entropy really has to do with the energy required to put the bits together into a suitable form for a decision to be made. 

We can account for the costs of information through various means, one of which is the human work hours (or some other unit of time) expended to express and reformulate the mechanisms used to move the bits around.

Information flowing between domains always causes a loss of usability of some finite fraction of the information, unless sufficient energy input is present to counteract the small-scale distinctions and fine-grained anisomorphisms introduced at the boundaries of the domains. Information content is conserved, but some bits may be masked, mutated, or combined in some manner that is intractable given extant means and methods.  

Relativistic effects also come into play. Two or more highly localized bounded contexts of information necessarily give rise to complex distortions of views between the respective reference frames. Sense making only occurs when one takes into account one's own reference frame and those being observed. 

[I know this is probably a bit of BS. It was edited from a brainstorming journal entry originally made 1-2-1997]

Friday, May 30, 2014

One Language Puritanism for Acceptance Testing?

A good Web acceptance test - or, to be more precise, a tool chain that supports end-to-end tests well - is apparently quite difficult to achieve.

The problem is not in the high-level languages. Domain Specific Languages like Gherkin, and hosted language syntaxes, have been used productively.

The main problem is not in the deployment techniques, although lack of closure over the environment is the first major stumbling block to doing any kind of meaningful testing. Whether it is local isolated deployment through Ruby on Rails' bundler, rake, and tools like RVM; or CI tool chains like Travis CI, Jenkins, or Bamboo; or just a skunkworks set of scripts, config files, and Git practices; there are many paths to creating Closure Over a Deterministic Environment (CODE).
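
For example, on the Bundler path, a pinned Gemfile (this one is a made-up minimal sketch, not from any real project) plus its committed Gemfile.lock goes a long way toward nailing down the gem side of the environment:

source "https://rubygems.org"

ruby "2.1.2"              # pin the interpreter version

gem "rails", "~> 4.1.0"   # pessimistic version constraints keep upgrades deliberate

group :test do
  gem "rspec-rails", "~> 3.0"
  gem "capybara", "~> 2.4"
end

Committing the generated Gemfile.lock and running a plain bundle install on every machine is what actually closes the loop.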

A slightly more interesting problem is generating meaningful and valuable configurations of test data. I've worked with Fixtures and Factories - both are fragile and both take a considerable fraction of the development time to work with consistently. They don't scale all that well, but there is an effective process that can be started and applied in substantive ways across large swaths of functionality.
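
For what it's worth, a factory in the factory_girl style (the User model and its attributes here are hypothetical) at least centralizes the test data so the fragility lives in one place:

FactoryGirl.define do
  factory :user do
    name  "Alice Example"
    email "alice@example.com"

    factory :admin do
      admin true
    end
  end
end

# In a test:
user  = FactoryGirl.create(:user)   # persisted record with the defaults above
admin = FactoryGirl.build(:admin)   # unsaved instance with the admin flag set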

An even more interesting problem is getting rigorous (meaning: formally translatable) specifications of the expected data and objects. The difficulty is that this takes a lot of long, deep thinking and reflective writing, tactics that were prized activities in decades past but which have met with extreme disfavor in Agile Internet time. Present first-world Homo Sapiens seem to have trained their working memories so that they are barely able to consider meaning in a full length Tweet, let alone a missive as long as this rant.

But I digress. Formal and complete specifications are only plausible, if at all, as an end result of a long development process.  We could expect to possibly accrue such comprehensive specifications over the duration of a project, as an executable or translatable artifact: code; test specification; or literate programming via configuration parameters represented in markup documents. The point is, specifications don't just pop into existence, they are a side-effect of long term thinking and communication processes.

Whether we choose to value that content higher than working project code (which is likely a waste unless it is a research project); or devalue that content to the point of immediately discarding it like so many candy wrappers and beer bottles on the ground (which is a likely waste unless it is a trivial project); or find a way to make that content an integral aspect of the artifacts under construction (literate programming, configuration by convention, specification translation); so long as the team can maintain an adequate level of context and understanding, there is a path forward.

No, the devil in acceptance testing is in the details: our tools on the client side are immature, and one reason that people are stuck on stupid is an irrational obsession with using one, and only one, language to solve the problem.

Don't get me wrong: that language you use, be it JavaScript or PHP or Gherkin or whatever, is just fine. But that language choice, for end-to-end acceptance testing, is barely even relevant. The specific language choice is neither necessary nor sufficient. What matters is whether the tooling can actually perform a simple step like input a select option and observe a change happen as a result of a back-end transaction. I recently ran into trouble with Codeception/Behat/Mink because of this issue. Most of the acceptance testing frameworks I've seen give the developer way too much grief to work around these scenarios, if they can even work with them correctly at all.
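
For contrast, here is a sketch of exactly that select-and-observe step in Capybara/RSpec (the page, labels, and URLs are hypothetical, and the app under test could just as well be a PHP application served separately) - the sort of low-friction step a usable toolchain should make routine:

require 'capybara/rspec'

Capybara.run_server     = false                    # we are not booting a Ruby app
Capybara.app_host       = 'http://localhost:8080'  # e.g. the PHP app under test
Capybara.default_driver = :selenium                # a JavaScript-capable driver

feature 'Plan selection', js: true do
  scenario 'changing the plan updates the account summary' do
    visit '/account'
    select 'Premium', from: 'Plan'                 # drive the select input
    click_button 'Update'
    expect(page).to have_content('Current plan: Premium')  # observe the back-end result
  end
end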

I'd rather use JavaScript (or better yet, CoffeeScript) to write acceptance tests for a PHP application, if it means that the test engine will be able to smoothly traverse the client-side interactions. Codeception is a nice integration of PHP toolkits - one of the best I've seen - but the fact that a test framework uses PHP exclusively as a testing language means LESS THAN NOTHING to me. That a client-side test engine which understands JavaScript can be bridged to a library of modules in a second language (PHP) simply presents ongoing risk of flaws with no opportunity for accruing benefits.

It really amounts to introducing multiple layers of indirection in order to avoid using a native language of the actual test platform. As always, the problem this strategy creates is too many layers of indirection.