Friday, January 27, 2012

Why we should care about server side JavaScript

A software practitioner recently asked me what are the benefits of a functional language, and why would one want to use, say, JavaScript for an MVC Web application stack. A stack to consider would be Ruby on Rails for instance. Clearly, developers in the Node.js community have their eyes set on that problem space.

I wasn't prepared to answer his question, at least not adequately.

JavaScript certainly offers interesting things like closures and anonymous functions, which make event-driven programming interesting.  Closures, my friend pointed out, are like deep class hierarchies in an OO system, and can obscure the flow of control. It is true that closures come at a cost, but there are benefits as well in the reduction of code and less time spent in allocating temporary variables and in avoidance of copying. But closures are just one construct among several primitives in functional programming that work together to form an extensible system of logic. A grammar, if you will.

Why should we care?

Improved engines, especially Google's V8, have brought speed characteristics to rival those of C++. If JavaScript were still as slow and memory intensive as, say, Java, it might have survived only in the browser space. Yet V8 brings JavaScript benchmarks into the same order of magnitude as statically compiled languages. It still isn't as fast as C or Perl5, but it is on par with PHP and C++ and edges out Ruby (sorry Ruby, I do love you, but you're not quite as fast as V8). That is one characteristic that has some Web app developers all hot and bothered about Node.js.

And for whatever faults it has, JavaScript has less cruft than PHP, C++, or Java. Functional primitives combined with a kind of Lamarckian inheritance and minimal data types make it a very cohesive language despite its half-baked flaws. It doesn't have classes, but then again, it doesn't have classes. JavaScript does have prototypal inheritance. Modules and packages have much more practical benefit, and you can make those in JavaScript. With cleaned up syntaxes such as that offered by CoffeeScript, some of the worst flaws can be entirely elided from the coding experience while making programs even shorter.

As a language, JavaScript leaves some things to be desired. The lack of tail-call optimization, while generally treated as YAGNI, prohibits some interesting implementation techniques. The widely used long-running script detection makes sense for browsers and servers, but for background processing tasks and monitoring not so much. There are some interesting ways of managing blocks of code, formulating methods, and handling control flow in Ruby and Python that I sometimes wish were in JavaScript.  The weirdness of falsiness and truthiness and == is eye-rollingly campy.

But in spite of its flaws, JavaScript is still a much saner language than Java and unlike Java/Ruby/Python/C++/Perl/PHP/whatever, it is part of almost every Web browser. Node opened up the path to a coding experience in which the seams between deployment environments are much cleaner and tighter, allowing them to be increasingly well-defined, well-factored, well-integrated, and well-tested with less code.

JavaScript is a spiritual descendant of Scheme, so it is difficult to stay mad at it for very long, even when the browser environment makes simple tasks grueling; eventually the elegance and simplicity of the language still draw you in.

Approachable beauty intrinsically engenders creativity and productivity. That in a nutshell, in my very humble and only poorly-informed opinion, is why programmers are noticing JavaScript.



Thursday, January 26, 2012

Class Variables versus Class Instance Variables in Ruby

I'm going to do a code dump and annotate.

#!/usr/bin/env ruby

class Something
  @@class_variable = 0

  def initialize( name )
    @@class_variable += 1
    @name = name
  end
  def value
    "#{@name},#{@@class_variable}"
  end
end

class SomethingElse < Something
  def initialize( name )
    super(name)
  end
end

joe = Something.new("Joe")
puts "Joe = #{joe.value}"
mary = Something.new("Mary")
puts "Mary = #{mary.value}"
sam = SomethingElse.new("Sam")
puts "Sam = #{sam.value}"
puts "Finished creating. Now Joe = #{joe.value}, Mary = #{mary.value}, and Sam = #{sam.value}"


Class variables are shared among all subclasses.

class SomethingEntirelyDifferent < SomethingElse
  @@class_variable = 0
  def initialize( name )
    super(name)
  end
end

puts
ferdinand = SomethingEntirelyDifferent.new("Ferdinand")
puts "Ferdinand = #{ferdinand.value}"
puts "Created subclass with same classvariable. Now Joe = #{joe.value}, Mary = #{mary.value}, and Sam = #{sam.value}"


Class variables are not shadowed. They are scoped wrt the inheritance chain.
Class variables are candidates for unintentional side-effects.

class SomethingAgain < SomethingElse
  @class_instance_variable = 0

  class << self; attr_accessor :class_instance_variable end

  attr_accessor :instance_variable

  def initialize( name )
    self.class.class_instance_variable += 1
    super(name)
  end
end

albert = SomethingAgain.new("Albert")
puts "Albert = #{albert.value}"
puts "Albert's class = #{albert.class.name} && class instance = #{albert.class.class_instance_variable}"
jane = SomethingAgain.new("Jane")
puts "Created subclass with class instance variable, and two instances."
puts "Jane's class = #{jane.class.name} && class instance = #{jane.class.class_instance_variable}"
puts "Albert's class = #{albert.class.name} && class instance = #{albert.class.class_instance_variable}"


Class instance variables are attached to an object's class object.

class SomethingMore < SomethingAgain
  @class_instance_variable = 1138
  def initialize( name)
    super(name)
  end
end

mike = SomethingMore.new("Mike")
puts "Mike = #{mike.value}"
puts "Mike's class = #{mike.class.name} && class instance = #{mike.class.class_instance_variable}"
puts "Created subclass of class with class instance variable, with its own class instance variable"


Class instance variables aren't visible to subclasses.
Class instance variables must be defined on subclasses when base-class methods read or write them.

puts "Albert's class = #{albert.class.name} && class instance = #{albert.class.class_instance_variable}"


Class instance variables are not visible to other subclasses in the inheritance chain.


All in all, the @@class variables encourage collusive coding and appear to carry a high risk of causing race conditions and other unintentional side-effects. The @class instance variables carry a somewhat lesser risk. Barring some obscure trick, a class instance variable is always associated with precisely one class. But even with class instance variables, multiple object instances can gain access to the variable through their own "class" property, with the potential for unintended side-effects.
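
The note above about defining class instance variables on subclasses can be sketched in isolation. This is a minimal illustration, not part of the original listing: Counter, PlainChild, and PreparedChild are hypothetical names. It assumes the same pattern as SomethingAgain, where a base-class initialize increments a class instance variable through self.class.

```ruby
# Hypothetical classes, separate from the ones in the listing above.
class Counter
  @created = 0                      # class instance variable on Counter itself

  class << self; attr_accessor :created end

  def initialize
    self.class.created += 1         # reads the variable on the receiver's own class
  end
end

class PlainChild < Counter
  # No @created defined here, so PlainChild.created is nil.
end

class PreparedChild < Counter
  @created = 0                      # the subclass supplies its own starting value
end

Counter.new
PreparedChild.new

begin
  PlainChild.new                    # nil + 1 raises NoMethodError
rescue NoMethodError => e
  puts "PlainChild.new failed: #{e.class}"
end

puts "Counter = #{Counter.created}, PreparedChild = #{PreparedChild.created}"
```

The accessor is inherited, but the value is not: each class object holds its own @created. One way to avoid repeating the initialization by hand would be an inherited hook on the base class that sets the variable for every new subclass.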

Wednesday, January 25, 2012

Simpler than possible

A scientific theory should be as simple as possible, but no simpler
- A. Einstein

I'm not sure of the precise context of Einstein's words, but they seem to have had to do with deflecting criticism of one of the relativity theories.

A kid with a magnifying glass intuitively understands the meaning: the focal length of the lens being a theory, too close in or too far away both give rise to fuzzy representations that aren't too bright.

By DrBob via Wikimedia Commons
The question is one of focus. Literally, not figuratively.

Considering that vision originates in the brain and its purpose is to create a predictive theory of the world around us, it is unremarkable that lenses are incorporated into our biology. The lens reduces the scale of the external problem visual field while concentrating signals in the process, and makes a projection onto a concavely curved surface covered with photoreceptors. The lens is an image transfer device. 

The brain then, is a device onto which images are transferred. Into, onto, it is hard to express: the memories modify the fine grained structure of neural dendrites, which incorporate the sensory inputs in analog gradients, and do so more or less as a whole. 

So I propose an idea of Focal Distance and the degree of Focal Alignment when considering how fit a software language, idiom, framework, system, or platform is to a particular purpose or set of stakeholder needs.

This assumes that the needs are in some manner, self-consistent -- they lie along a parallel trajectory. It may be that due to conflicting interests between stakeholders, the solutions deemed acceptable will never, ever, approach Focal Alignment. There could be orthogonal components to the needs, causing the lens -- and by extension the solution-image -- to skew. There could be absolute differences in stakeholder positions along the same trajectory or orientation in opposite directions along the same trajectory, giving a compromised Focal Distance and solutions that are blurry. 


Thursday, January 19, 2012

What Windows POSIX Compliance Teaches Us: a Wink and a Nod


Many years ago, the Federal government put out a series of requirements, the FIPS standards. IIRC, the POSIX specs are part of FIPS.  To generalize, Linux is an implementation of POSIX. 

POSIX was pushed because of a few factors:
  • vendors' operating systems are divergent, making it harder to migrate programs between systems
  • vendors go out of business, shut down product lines, and make radical changes to them, so development using their APIs is an unsound investment over time
  • government institutions have to carry the burden of systems they buy into for decades

FIPS are procurement standards. This means that in order to sell a computer solution to the Federal government, a vendor must satisfy the POSIX requirements. The intent is clear: to safeguard public investments. An improper balance of power in the hands of a supplier inevitably leads to deleterious actions against consumers. POSIX has the effect of making investors out of consumers.  

What happened afterward is a travesty: Microsoft Windows/DOS based PC clones had picked up steam in the consumer market, and NT was being aimed at enterprises including government. Microsoft gave NT a partial POSIX subsystem, which just about nobody used, to get a rubber-stamp for sale into government accounts. A panel of judges gave the nod, and forced the Coast Guard to accept Windows based proposals in a 1995 case.

Apparently, the NT POSIX subsystem has been replaced a few times, and was crippled from the start. That's why everyone and their brother uses Cygwin, UnixUtils, or MinGW for porting Unix apps to Windows. But due to Windows' non-compliance, it doesn't run Unix style applications all that well. 

credit: Zombie classified by bloodredrapture on Flickr
Microsoft only added the DOA POSIX subsystem so they could claim compliance, when their compliance was a sham on its face. The subsystem was virtually a zombie interface.  

The Coast Guard lawyers in the 1995 case would have done well to ask a multilingual colleague to assist, using a deliberately broken grammar in a non-English language to present some portion of their argument. Prior to being cited for contempt, they could then argue that their compliance with requirements for language interfaces in court was similar in kind to NT's conformance to POSIX interface requirements, and with identical outcomes. 

Government purchases of closed systems like Microsoft Windows amount to a collusion with vendors in constructive non-compliance: apparently in conformance, but with precisely the opposite effects to those intended by the standards authors. Such is the power of the judiciary to rewrite law.


One way to re-approach the original intent of FIPS is to reformulate compliance in terms of public trusts, or something akin to a credit union in which software is the primary asset. We have very good institutional precedents in the form of non-profit organizations, like the Apache Software Foundation and the Mozilla Foundation. (Indeed, these two alone account for  a huge amount of the Web infrastructure that drives our economy.) 

For software to be purchased by a government entity, its assets and its dependencies should be escrowed in the public trust, largely if not completely. In the case of open source software using distributed version control services, this could be accomplished easily: just identify yourself and the repositories. Private concerns would have to accept that the public's interest in not losing access to the intellectual property outweighs their interest in keeping it private, and trust the escrow service to not leak their IP prematurely; or choose not to play in the public space.

Those institutions that adopted closed systems early got quick benefits, particularly in the predictability of the user interface and plug-and-play commodity hardware peripherals.  But those same institutions are  now bumping up against the inevitable consequences of the strategy. Adopting a strategy of developing for privately held operating systems is a good way to disadvantage yourself over the long term. A nod is as good as a wink to a blind horse. 

Monday, January 16, 2012

Compass imports

Yeah, some articles on this blog are a dumping ground for when a crib is needed.
The Compass docs are not particularly easy to scan through quickly.

Compass imports
ex: @import "compass/layout"

compass/

  grid-background
  sticky-footer
  stretching

  css3
    appearance – Specify the CSS3 appearance property.
    background clip – Specify the background clip for all browsers.
    background origin – Specify the background origin for all browsers.
    background size – Specify the background size for all browsers.
    border radius – Specify the border radius for all browsers.
    box – This module provides mixins that pertain to the CSS3 Flexible Box.
    box shadow – Specify the box shadow for all browsers.
    box sizing – Specify the box sizing for all browsers.
    columns – Specify a columnar layout for all browsers.
    font face – Specify a downloadable font face for all browsers.
    gradient – Specify linear gradients and radial gradients for all browsers.
    images – Specify linear gradients and radial gradients for many browsers.
    inline block – Declare an element inline block for all browsers.
    opacity – Specify the opacity for all browsers.
    text shadow – Specify the text shadow for all browsers.
    transform – Specify transformations for many browsers.
    transition – Specify a style transition for all browsers.

  typography
    links
      hover-link
      link-colors
      unstyled-link

    lists
      bullets
      horizontal-list
        bullets
        clearfix
        float
      inline-block-list
        bullets
        inline-block
        float
        horizontal-list
      inline-list

    text
      ellipsis
      force-wrap
      no-wrap
      text-replacement
    vertical-rhythm

  utilities
    links – Tools for styling anchor links. (from typography)
    lists – Tools for styling lists. (from typography)
    text – Style helpers for your text. (from typography)

    color – Utilities for working with colors.
      color-contrast

    general – Generally useful utilities that don't fit elsewhere.
      Clearfix – Mixins for clearfixing.
      Float – Mixins for cross-browser floats.
      Hacks – Mixins for hacking specific browsers.
      Minimums – Mixins for cross-browser min-height and min-width.
      Reset – Mixins for resetting elements (old import).
      Tag Cloud – Mixin for styling tag clouds.
    sprites – Sprite mixins.
      sprite-image
    tables – Style helpers for your tables.
      table-striping
      table-borders
      table-scaffolding

Closure as a property of a package management system

I filed a bug today for a simple OSX Homebrew recipe for git-hg. Basically, git-hg allows you to clone and work with mercurial repositories using a git repo.  Upon installing the recipe though, cloning resulted in an empty directory, and a fetch (performed out of curiosity) got:


Traceback (most recent call last):
  File "/usr/local/Cellar/git-hg/HEAD/bin/../fast-export/hg-fast-export.py", line 6, in <module>
    from mercurial import repo,hg,cmdutil,util,ui,revlog,node
ImportError: No module named mercurial

Aha, so the immediate problem is that the Python mercurial module is missing.

The actual problem is a little more pervasive: the git-hg recipe didn't close over all of its environmental settings and software dependencies.

A further example of faulty closures was seen in the associated Python install.  Why was the module missing?  The first place to look is the Python distribution's site-packages folder ( /usr/local/lib/python2.7/site-packages for the Homebrew example).

This turned out to be a puzzler because OSX also comes with an older version of Python installed. Thus, as the Homebrew and Python page explains, you have some fiddling to do with your PATH and PYTHONPATH environmental settings.   While one cannot predict or completely automate environmental dependencies, it would have been very nice to be reminded of the potential conflict by the Python recipe.

The omission constitutes a lack of closure of the recipe over its dependencies.
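
As a rough sketch of what "closing over dependencies" could mean in practice, the snippet below walks a dependency graph transitively and reports anything required but not installed. The graph, the package names, and the installed set are all made up for illustration; this is not how Homebrew actually resolves recipes.

```ruby
require "set"

# Hypothetical dependency data, for illustration only.
REQUIRES = {
  "git-hg"           => ["git", "hg-fast-export"],
  "hg-fast-export"   => ["python", "mercurial-module"],
  "mercurial-module" => ["python"],
  "git"              => [],
  "python"           => []
}

# Collect every direct and indirect dependency of a package.
def dependency_closure(name, graph, seen = Set.new)
  graph.fetch(name, []).each do |dep|
    next if seen.include?(dep)
    seen << dep
    dependency_closure(dep, graph, seen)
  end
  seen
end

# Anything in the closure that is not installed is a gap in the recipe.
def missing_dependencies(name, graph, installed)
  dependency_closure(name, graph) - installed
end

installed = Set["git", "hg-fast-export", "python"]
missing = missing_dependencies("git-hg", REQUIRES, installed)
puts "Missing for git-hg: #{missing.to_a.join(', ')}"
```

A recipe that ran a check like this at install time could at least warn about the gap, even if it could not fix the environment itself.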

Sunday, January 15, 2012

JavaScript is Dead. Long Live JavaScript

OK, this is going to be light on fact and heavy on impressions. I admit to not having much to go on, other than my own memory of history. I'm simply spewing intuition here.

I just took a whiff of a new programming language. Something smelled wrong.  It reminded me of 
  • Fully blown Corba enterprise standards
  • C++ for business applications
  • Java for the Web
  • W3C XML Schema Language 
  • Web Services Standards (pick just about any)
It is not that there isn't some merit to each of these, but each egregiously forces a practice of making many decisions early in the design process; they do so in the name of performance; and they all suck because they embed too much cruft in the deployed systems.

The language I smelled was Google's Dart. It smells an awful lot like Java, from the tooling (Eclipse) to the static typing (uh, "optional" static typing). Then there's the 17 thousand + line of code "Hello World" example. But OK, even gcc compiles Hello World to around 8k on an OSX machine. (Then again, that's machine code gcc is compiling into on a desktop, not scripted source in a Web browser.) 

If Google wants to push an alternative to Java, more power to them. But they should not have done it under the pretense of killing JavaScript, or of making open Web scripting easier with enterprise tooling. That's just a little bit "evil". 

Thursday, January 12, 2012

SVN Cribs

(Cheetah credit: Wikipedia Commons)

svn co https://some.where.no/path/trunk main --username username # "username" is from the server


svnadmin create reponame # a local repo (the repo itself, do not modify!)


mkdir proj
touch proj/somefile.txt


svn import proj file:///Users/yourname/reponame/proj -m "Initial import"


svn checkout file:///Users/yourname/reponame/proj proj-copy
svn export file:///Users/yourname/reponame/proj proj-copy # export instead of checkout


cd proj-copy


svn status


svn update # Pull changes from server to local; update working copy
svn resolve --accept working # claim that conflicts were resolved 

# change, create files

svn add newfile.txt # only new, unversioned files need to be added
svn move oldfilename newfilename
svn copy oldfilename newfilename
svn delete filename


svn commit -m "made this change" # commit and push changes to server
svn ci -m "changes comment" filename # commit for one file

svn log
svn log filename # look at log for file


svn diff -r priorversionnumber
svn diff -r ver filename # diff only for this file


svn update -r priorversionnumber # roll back to a prior repo state
svn revert FILE


Global options:
  --username ARG           : specify a username ARG
  --password ARG           : specify a password ARG

more at http://svnbook.red-bean.com/

SVN is dead. Long Live SVN

I prefer GIT, but SVN is widespread too. SVN is not very old (only 12 years), but it represents a clone of, rather than a re-imagining of, the antiquated CVS tool (released around 1989). CVS was itself a reworking of the (at the time) popular RCS tool first seen around 1982. RCS inherited its good looks from SCCS, created in 1972 for an IBM System/370.


(Dodo credit Wikimedia Commons)

So the core concepts that frame up SVN are about forty years old. The workflows have changed with each re-implementation, but the basic approach to version control is pretty similar. Vaguely, the way these tools work can be characterized by a few features:

  1. there is a single central point of update 
  2. changes are tracked one file at a time
    • it is left to the user to use labeling and other tool features to ensure that sets and subsets of files corresponding to systems and modules are self-consistent and sane configurations 
    • histories must be computed in a more or less unbroken chain
  3. filesystem level changes, and special file types, are not tracked well, or not at all
    • renames are not well tracked
    • file history can easily be lost 
  4. higher level constructs for configuration versioning are missing 
    • no symbolic tagging 
  5. branching is clumsy, heavyweight, long-lived, and error-prone
  6. data and metadata access is rigid and fragile, but fast
    • repositories are exposed to corruption from trivial user level actions
    • SVN is both faster and more robust, but still trivially easy to corrupt by network errors and faulty clients
OK, maybe this is being a bit unfair, incomplete, abstract, and irrelevant as judged by an SVN expert.  I admit to having completely lost interest in the SCCS evolutionary branch of VCS's when Tom Lord's Arch (tla) arrived on the scene in 2001.
(Original finch image credit Wikipedia Commons)

The vine that was TLA died but not before sparking a vibrant ecosystem of competing Distributed Revision Control systems.  My recollection could have gaps, but to my mind Tom Lord never really got the credit he deserved for this innovation. Mercurial and GIT are two modern representatives of this evolutionary genus. 

(Cheetah credit Wikimedia Commons)

While the SCCS->RCS->CVS->SVN branch has thrived in the open source community, it remains a rather stilted lineage. This isn't because SVN is a mature tool now (although it is).  Rather, it is because SCCS had fixed in place a large number of decision points that were not (or could not be) revisited even when its descendants broadened and matured the design. Up to a gross approximation of workflow, SCCS == RCS == CVS == SVN. Further diversification has a very high cost and low payback, and so no richness of form will arise from that fount.

For a more positive spin, read about the advantages of a distributed VCS on ThinkVitamin.

Friday, January 6, 2012

Check out the Node Beginner Book

Check out http://nodebeginner.org/, the Node Beginner Book.

This is an unprovoked, unpaid, unrequited plug, just because I enjoyed looking through the site and working out the example. There is ample room to tweak the idiomatic style and play with the example, which is completely straightforward. The writing is a tiny bit sugary, but it presents a very good narrative. This guy needs to write more user guides.

Sunday, January 1, 2012

Reading Blocks in POW and Node.js

A writer's block is when you've got some serious motivation to write, but can't seem to find the inspiration to get started or the cleverness to get past a simple difficulty.  Readers can face blocks too, especially when the writer has done a poor job of weaving the threads of the story and bringing them together.

The same problems exist for programmers when writing and reading code, except much more so. As programmers, we often need to tease out the answers to questions about where some control path came from, and where it is going.

I'm thinking now of a snag hit while trying to convert my personal site to use HTML5's caching manifests:

Application Cache Error event: Invalid manifest mime type (application/octet-stream) http://mitch.amiano.agilemarkup.com.dev/manifest.cache

I was using the 37Signals' POW server to do my local testing.  It looks like POW serves up the manifest file as an octet stream, instead of the recommended type (text/cache-manifest). I searched for the mime type handling in POW. It is not there. As a reader, I face a small block.

POW is reusing modules defined in Node.js, the V8 evented I/O library. So naturally, there is nothing present in the POW source code base that will clarify how it resolves missing mime types.

More precisely, POW is calling a connect.static method. There is no "connect" module in POW or Node.js, and nothing in the source code base that would suggest what "connect" is, other than an anonymous Node.js module. So we have to search elsewhere, elusively.

Implicit in using Node.js is use of a flat global namespace for modules, implemented by npm, the Node Package Manager. It isn't clear to me how (or whether) npm manages versions, or what people will do if and when a competing package manager is released, or when packages adopt similar or identical names. It seems as if git repo urls are implicit in the packaging, but not in any way that is definitive in the client source code.

Now, a programmer who is experienced in a given library will become a priest, familiar with all the dangling threads of the righteous library. But that's no excuse for leaving threads dangling. This is the stuff of a cult-like priesthood, not of a profession. For everyone else it is all a bit of a guessing game.

[edit: turns out that Node uses package.json for version/dependency management; however, unless you've downloaded all the sources, UNIX tools like "find" or "grep" will obviously not find anything.]

Connect is (very probably) some version (which one???) of a Node.js-based middleware framework, specified by package.json in the root of the source tree. The git repo says that static is a middleware module packaged with connect. The static middleware calls mime.lookup(path). The mime package and version are not anywhere to be seen in the Connect source code. I'm seeing a pattern here, or rather an anti-pattern.

References should not be more exact than necessary, but nor should they be so ambiguous that they contain insufficient information to find the referenced entity.

So I locally clone Connect, and Node.js, and POW, and use OSX' find command to sniff out the possible locations of the mime type handling, to see if there is an idiomatic way of adding a new type. The mime module is a package included with Node.js, down in deps/npm/node_modules/request/mimetypes.js.

I'm repeating myself, but the mime module is not part of the Connect source code base, and I have to assume that it isn't referring to some other module also called "connect".  This case is simple enough - mimetypes.js is just an associative array with a lookup function - but in the general case, which version might we be linking to, and who is the owner?
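
To make "an associative array with a lookup function" concrete, here is a minimal sketch of that shape. It is written in Ruby to match the other code on this blog rather than in JavaScript, and it is not the contents of the real mimetypes.js: the table entries, the lookup signature, and the ".cache" mapping are assumptions for illustration.

```ruby
# Illustrative table only; NOT the contents of the real mimetypes.js.
MIME_TYPES = {
  ".html" => "text/html",
  ".css"  => "text/css",
  ".js"   => "application/javascript"
}

# Fallback used when the extension is unknown.
DEFAULT_TYPE = "application/octet-stream"

# Look up a MIME type by file extension, falling back to the default.
def lookup(path, table = MIME_TYPES)
  table.fetch(File.extname(path).downcase, DEFAULT_TYPE)
end

# An unknown extension falls through to the octet-stream default,
# which is the symptom reported in the Application Cache error.
puts lookup("manifest.cache")

# Adding one entry to the table is the whole fix in this sketch.
with_manifest = MIME_TYPES.merge(".cache" => "text/cache-manifest")
puts lookup("manifest.cache", with_manifest)
```

The fix itself is trivial once the table is found; the difficulty described here is entirely in locating which table, in which package and version, is the one being consulted.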

Writing like this isn't just an interruption to the reader, it is a failure of the writer to pull together the threads of the story. That makes the story more difficult to follow than necessary, and it isn't sufficient to deliver a reliable piece of software except by fortunate accident.

Avoiding searches for ambiguous references is what Integrated Development Environments were designed for.  But using an IDE is missing the point: it shouldn't be necessary when reading sources to go through an unreliable search for every dependency.


I'll chalk this up to my own inexperience with Node.js.


Maybe I don't grok Node.js and npm well enough yet. Maybe there are some conventions that make the resolution more definitive and repeatable. The convention requires that you use npm like one would use bundler and gems in Ruby.

But writing code that requires mental long-jumps to anonymously named, un-versioned modules seems like a very stupid way to program.  Perhaps Node.js needs its own version of bundler.


[edit: Yeah, node.js has its own means of packaging components -- I had missed that in my haste. I still think there is a disconnect between where entities are defined and where they are used by clients... the dependencies wind a path through multiple packages, give opportunity for mysteriously similar names amongst package methods and variables, and generally make piecing together the story more difficult than it needs to be.]