Wednesday, October 7, 2009

Languages and projects

Perhaps it will be helpful for me to think about projects I have done, rather than recent disappointments. I'll consider languages, and mention projects summarily.

GnitPick, a Jython tool which integrated the NUX XML library with others within a story editor interface, complete with a Swing-based tabbed interface and cross-tab XML query capability. The tool was an attempt to evaluate Jython as a replacement for a 4GL interface builder, and I think it did quite well. Menus were built using Jython callbacks organized via XML definitions, and the models were maintained as XML trees.
I also started on a Google App Engine site called SplotchUp, using Python and the Web2Py framework. Knowing exactly how to get the job done tells me very little about what direction to go with it, so for now I'm stumped -- I need an objective problem to solve in order to keep going. No user stories means no tests, and no tests means no code.
I know I've done more Python before GnitPick, but it was mixed up with support related work. GnitPick itself was part of a larger project to bundle Java libraries together into an AJAX-like app framework, which was never published: at the time, the Java clients for HTML sucked. For that matter, they still suck.
Barcoding scripts using PDF::API2. Scripts to re-write the coding of PostScript files to work with proprietary RIPs on the Xeikon platform. Perl reporting code, literally: Perl modules to create output reports for a System Verification Test test-case tracking tool. Various Perl CGI scripts to extract data using the Pear interfaces for business reports. Perl for support purposes for DDTS.
Mentioned below, I also maintained a Perl-based Wiki, Onaima, forked from Dolphin Wiki and hosted on SourceForge. Four groups within the organization used this Wiki: a test team, a development team, the quality/process group, and our engineering support team. For the quality group, I combined the Wiki with a corporate portal. The Wiki succeeded where a concurrent effort to roll out QuickPlace fell completely flat.
Block Print Paginator was my first real C project; BPP counted lines with a one page lookahead algorithm which determined how to best split the lines so as to utilize the page without leaving pages with orphaned lines or too small a margin.
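To make the idea concrete, here is a minimal Python sketch of page splitting that avoids stranded lines. It is not the original C code: the function name `paginate`, the `min_keep` parameter, and the greedy strategy are all assumptions made for illustration, and it ignores margins and blank lines between paragraphs.

```python
def paginate(paragraphs, page_height, min_keep=2):
    """Split paragraphs (lists of lines) into pages of at most page_height
    lines, never leaving fewer than min_keep lines of a split paragraph at
    the bottom of one page or the top of the next (hypothetical helper)."""
    if page_height < 2 * min_keep:
        raise ValueError("page_height must be at least 2 * min_keep")
    pages, page = [], []
    for para in paragraphs:
        remaining = list(para)
        while remaining:
            room = page_height - len(page)
            if len(remaining) <= room:
                # The rest of the paragraph fits on the current page.
                page.extend(remaining)
                remaining = []
                continue
            # The paragraph must be split: look ahead at what each choice
            # would leave behind on the next page.
            take = room
            if len(remaining) - take < min_keep:
                take = len(remaining) - min_keep
            if take < min_keep:
                # Too few lines would stay here, so push everything over.
                take = 0
            page.extend(remaining[:take])
            remaining = remaining[take:]
            pages.append(page)
            page = []
    if page:
        pages.append(page)
    return pages
```

For example, with a five-line page, a four-line paragraph followed by a three-line paragraph ends up on two pages, because splitting the second paragraph would strand a single line at the bottom of the first page.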
The next was a Project Take-Off application, written for the event-driven GEM user interface library on the Atari ST platform. GEM turned out to be not unlike X/Motif. The application itself was a spreadsheet-like system which presented contractors' yellow-pad process for estimating by take-off.
I did lots of ETL programs for Nortel under contract. Puerto Rico 911's conversion was mine, start to finish. I asked for an ASCII version of the specification, cleaned it up with AWK, then used my scripting skills to incorporate the spec as an input to a Make-based build process, generating the validation and conversion regular expressions along with unit test stubs in an object-based framework (C-structs-with-pointers-to-methods) I devised myself. It was absolute perfection. When a question came up about the stability, we could both look at the spec to answer the question (the spec was indeed at fault) and demonstrate functional performance by running the built-in tests, which had been coded first.
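The spec-as-build-input approach can be illustrated with a small Python sketch. This is not the original AWK/Make/C tooling: the three-column spec layout, the field names, and the `build_validators` and `validate` helpers are hypothetical, chosen only to show how validation regular expressions might be generated from a plain-text specification.

```python
import re

# Hypothetical plain-text spec: field name, character class, fixed width.
SPEC = """\
# field   type    width
phone     digit   10
state     alpha   2
zip       digit   5
"""

# Assumed mapping from spec type names to regex character classes.
CLASSES = {"digit": r"[0-9]", "alpha": r"[A-Z]"}

def build_validators(spec_text):
    """Generate one compiled regex per field described in the spec."""
    validators = {}
    for line in spec_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, kind, width = line.split()
        validators[name] = re.compile(rf"{CLASSES[kind]}{{{int(width)}}}")
    return validators

def validate(record, validators):
    """Return the names of fields that do not match their generated regex."""
    return [name for name, rx in validators.items()
            if not rx.fullmatch(record.get(name, ""))]
```

Because the validators are derived from the spec text, a failing record points back to a specific line of the specification, which is what makes it possible to settle disputes by reading the spec.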
I also did ETL programs for Bell Atlantic and another telecom, working in a team of three -- though it would have been faster, more reliable, and better documented had I been allowed to take all the responsibility.
Another project was SMSR: Single Member Set Replacement, in another team of three. Actually, two of us did most of the work; the third was the lead and did virtually nothing except set us back, and eventually try to put me into a compromising position with my employer. I did a transactional update interface for SMSR using Lex/Yacc to define the grammar of the input files and C to define the actions.
A small tool I wrote at Kodak was an SCM user interface using the X/Motif library to interact with sccs. I called it XCCS.
After working on the IRTS project, I worked on a system I dubbed the All In One Search Engine. I began writing this in multi-threaded C++ using inverted indexes constructed from case-shifted words. After a substantial amount of work I determined that I could show results faster by redirecting to the C-based Ht/Dig engine. I hacked the search algorithm and provided extended indexers. The method was to extract XML representations from various enterprise quality databases, indexing and saving links to these XML forms, while transforming them into HTML representations. Searches were keyed on selected semantic XML structures, but results were presented using the HTML representations. Links to the XML and to the original databases (if still available) were also integrated into the HTML results. The system allowed enterprise users in Hong Kong to access quality records for products built and formerly supported in Raleigh, and transferred to China; it also allowed the US operations to discontinue the original quality databases, saving a 50k license immediately and all future support costs, since records were fully accessible in XML format. In other words, it wasn't just a search engine, it was a quality records clearinghouse. Observe that this engine was integrated in the 2000-2001 timeframe.
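For readers unfamiliar with inverted indexes, a minimal Python sketch follows. It is not the original C++ design: it assumes "case-shifted" means folding words to lower case, and the `build_index` and `search` helpers and the document identifiers are invented for the example.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[A-Za-z0-9]+")

def build_index(documents):
    """Map each lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in WORD.findall(text):
            index[word.lower()].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every word in the query."""
    terms = [w.lower() for w in WORD.findall(query)]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A query is answered by intersecting the posting sets of its words, so lookup cost depends on the number of matching documents rather than on the size of the whole collection.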

Look, HTML isn't a programming language. See my other comments on AJAX, XML, and JavaScript. I have done a number of Web sites now, some in Plain Old HTML, but many in Joomla.

While at Merlin, I was the principal author of the Deluxe Business Systems Imaging Composition Standard Specification. The standard was produced to define a markup language approach to interchanging order data for personalized print applications. The content included financial transaction information and document design specifications. The specification was meant to cover a variety of order capture situations, across all of the DBS businesses. This was all at a time when XML was not even a first draft, so we were well ahead of the curve. I produced DTDs and semantic descriptions, as well as a process specification to show how document instances would be used in a product production chain.
Microsoft had access to the SGML-for-EDI communications, I'm sure, because they were working with the DBS CTO at the time (for whom we produced the spec). Also, one of the partners later founded PODi which in turn published the PPML standard, surfacing many of the print-on-demand markup application ideas we discussed in the form of PPML. So I'm confident I had a real, if not very visible or widely acknowledged, impact on the industry.
I joined Alcatel rather than going to Deluxe, because I wanted a little bit lower key position. My family, especially my wife, was pining for time with me. Being an "ordinary engineer" instead of an "executive consultant" gave them what they wanted.
At Alcatel, I eventually got back into a role as a management consultant and XML evangelist, having seen how markup would impact the fundamental coding practices of engineers. This was difficult, because many in management had lived insular lives. They were embedded in Telecom, and didn't understand the broader software market. Similarly, many engineers were unfamiliar with the technology and had difficulty understanding how it could impact them. Still, I gained clients. One network management group began basing their architecture on XML with XSLT; I provided XSLT and XML consulting to them. I also convinced the ADSL test group to adopt XML for test case automation; they said later in a memo -- after I had been laid off -- that I had "anticipated the organizations needs a year in advance". This was in response to a Systems group's recognition that their specification documents were a mess: the test group already had recaptured much of the necessary content and was using it in productive work.
An interesting project at Alcatel involved a configuration tool which emulated a provisioned rack and provided soundness checks as well as quick access to technical reference materials and quality databases. Two key aspects were that the "tool" was actually a set of XML documents created by an AJAX enabled administration interface, and converted to JSON notation by a publishing transformation. The documents were actually checked in to a document repository which provided consistency of identifiers and a means of subscribing to the transformations as a service. The schemas for the XML used DTD notation.
There were others as well. Another "configurator" type schema I helped define was for a sprinkler manufacturer configuration system. Generally, my approach has been to create sample instances and use Trang to create drafts of the schemas in RelaxNG, iterating between the instance and Trang until the RelaxNG schema is structurally representative of what is desired. Then I refine the RelaxNG schema and begin forking off sample instances as test cases to show usage and conformance, dropping into WXS when I've got most of the semantics worked out and need to verify that I haven't introduced something WXS cannot represent.
I've done a number of stylesheet transformation projects, some big and some not so big. Among the first were cookbook stylesheets I wrote for training classes to introduce Alcatel engineers to XML and XSLT. They were way behind the curve on markup languages, and I collected a number of techniques from my previous experience with DSSSL and participation on the XSLT mailing list. I put it all together in a Wiki, another technology I introduced to engineering at a time before Wikipedia came onto the scene.
A typical project was for SPAWAR to demonstrate a "robohelp" like framed output, with a left-pane holding context menus, a right-pane holding content, a title pane, and a notification/branding/status pane at the bottom. Often CSS and/or JavaScript can be used to add effects. Note that JavaScript is not a hard requirement since XSLT mappings provide fixed URLs without the need to perform AJAX type interactions. One thing people ask for repeatedly is how to get XSLT to generate mappings between disparate files -- they often assume the XML "ID" attribute is meant for linking, and are somewhat disoriented when it does not suffice for cross-linking.
One interesting project was for Tekelec. BrightPath Solutions brought me in to provide XSLT, ANT, and DITA training. Following this up, I designed a DITA-OT transformation queuing system and provided an interface to integrate it with the Vasont client. The point was to delegate builds to high performance remote servers, with orderly dispatching of requests. Apparently Sarbanes-Oxley was an issue as well, since controlled access to sources and deliverables was required as well.
My exposure to Java is still limited. I worked a bit with it in Eclipse when integrating the Java libraries for GnitPick. But Java can give nothing like the productivity of a properly defined 4GL environment. It retains many of the worst aspects of C++, in which Object Orientation, rather than making code more coherent and comprehensible, makes it instead more disjoint and idiomatic. I supported the XSLT transformation engines at Alcatel with some Java glue coding. I also played with a Swing based "ants" simulation, a kind of cellular automaton in which colored tiles are seeded and mutated according to a set of global rules, exhibiting surprisingly complex emergent behaviors. I used Swing buttons and a simple layout manager, rather than a canvas, for the ants simulation -- a poor choice for performance but it was easier to code.

One of the first samples I wrote for training was a use of a Javascript-driven HTML shell to format and display the XSLT, XML, and XPath specifications in various ways, using an associative array to organize the possible transformations and Javascript to manage the user interface and delegating the calls.
The configuration tool mentioned under the XML topic was an early but very robust AJAX-style application. The system featured a completely Web-based administration interface which managed the loading of card-specific JSON profiles (created by transforming XML documents) through Javascript. Balloon help popups were provided, as were dynamically composed editing forms with semi-automatic layout. Documents were retrieved as JSON, edited, submitted through a CGI interface to a document repository, and subsequently used to regenerate the JSON representation(s). The front-end consisted of the same JSON files, loaded via a frame-based, Javascript-animation enabled interface. The UI supported right-mouse context menus to fill in plug-in card slots, and red-yellow-green status displays on each card as the method of reporting compatibility issues. The rack itself was defined using HTML block-level elements and tables, with a carefully orchestrated sequence of event triggers to contextualize how objects in a class framework were populated as cards and software options were selected. Also, we provided an alternative pop-up dialog method of accessing the card information at a lower level, including access to quality databases and reference documentation. This tool succeeded at effectively managing an information set which five experienced managers had failed to organize using spreadsheets, and went well beyond by providing a coherent delivery framework and two working, AJAX-based analytical tools, within three months of effort. Keep in mind, too, that the browsers at this time were not mature and did not provide the ease of use of the AJAX libraries in use today.
For Deluxe, one of the development projects we performed was to demonstrate a feasible bridge from the Smalltalk business objects framework to a standard SGML parser library. We selected the YASP parser, and used IBM SOM to bridge the parser into a Smalltalk environment, building objects to represent the information content of our markup.
As mentioned, the search engine I began working on was also in C++.

Maple and C++
Another small project I did was for a college professor, implementing an algorithm for efficient search of dependent parameter subsets among datasets of oncogene features, using STL containers. Although I finished work on substantial parts converting the algorithm from Maple to C++, the professor went off for Europe at the same time I graduated and I did not hear back. It was apparently not all that important to her, but honestly, it was among the more challenging projects I've worked on.

Unix Shell
I've written little HTTP servers in Korn shell, just because I could, and a plethora of other scripts, many simple, some very, very involved.
To support a Web application, I wrote a document repository and publishing system using a suite of Korn Shell scripts, written using an object-oriented approach. Scripts represented classes and arguments provided method signatures. The system could register new document types as well as provide pre-check-in validation and post-check-in indexing, abstracting services, and transformations to create AJAX-style applications. Sadly, management was clueless about the methodology and utterly ambivalent about the whole business division. The system was layered, providing a user-level interface to the Repository and delegation of functions to the document type handlers.
During some down time, between writing ETL programs for Nortel, I did an educational project I called NopGen (nop==no operation, a machine instruction which does nothing). NopGen was a code generator designed around the metaphor of plumbing pipes and couplings. The metaphor arose from a paper I wrote about the factoring of programming dependencies, and a paper on Bassett Frames in one of Tom DeMarco's books. I simply extended the UNIX pipes metaphor with a framework to organize the pipes and to parameterize blocks of code as templates. A Bourne shell prototype worked well but was slow and had no consistent way to organize the templates. Most of the functionality was implemented using a 1000-line AWK script, based on an idea of using special character strings $- and -$ to mark the start and end of chunks, with generic labels and named parameters. (That is, it was a toy LISP-like language, with features of XML markup and XSLT.) A college professor said it belonged in an ACM paper, but I had no clue how to get that accomplished at the time.
I've also written a lot of Make build scripts, as well as several "install kits" based on UNIX shell with or without Make.

Pro*C/Oracle/PL/SQL or ESQL/C
State tax reports for Paychex. What can you say about that? It was pretty routine stuff.
Utilities to format XML output from database queries.
A lot of SMSR was actually written in Pro*C.
After creating the NopGen language, I worked on a database course at SUNY and produced a project I called GROG: Generalized Relational-Object Generator. GROG was a sort of CORBA-like IDL generator. Pro*C was used to access the Oracle System tables, which had been annotated by the use of views to specify aggregated types. The tool output was object-based C-with-structures-and-pointers, and instantiated specialized instances to establish a kind of meta-object protocol. I also added stubs for pre-and-post validation procedures, as well as CRUD style accessors. Then I wrote a user guide and technical reference manual for it, as well as release notes and a shell/make-based installation kit. The instructor informed me that in his mind it was graduate-level work, well beyond what was expected, and re-assigned the course as "Advanced Database" to give me additional recognition.
Additionally, I did a lot of work with my Informix projects with ESQL/C, particularly for the Kodak Gateway project. These included data archival programs, raw data loading programs, and scanner alarm monitors. (Although ESQL/C was an Informix product it was not that different from Pro*C.)

One of my first jobs was with a small Unix reseller, thrown to the wolves as it were on a ground-up accounting software project for a Furrier's operation. The app and database were done with Informix 4GL. I also did a Sales and Support Lead Tracking system for a window-and-door manufacturer, as well as a Manufacturing Job Tracking system, also using Informix 4GL, from the ground up to fit the client's operation. Later I worked on a POS system with another guy, using a more GUI oriented version of the same 4GL; and on a Real Estate Multiple Listing Service Database and financial reports system for another client. All required attention to detail on the database, since most had been started by people who hadn't a clue about database analysis, normalization, or query optimization.
Later I worked at Kodak on the Gateway project, writing a report queue daemon in Informix 4GL and the embedded SQL/C product.
Alcatel's IRTS ISO9001 incident/workflow tracking application was done with Oracle and the Ideo SmartGL language, as well as some C and non-Ideo scripting in Perl for the report and notifications daemon. Their Change Control tool was similar. I also supported a System Verification Test tool which managed test cases with a SmartGL interface, and hooks to LabView, with Perl to do reporting. Again, I had to step in there to fix up the indexes, which had been constructed improperly, and the internal clients were very appreciative.
The SMSR project for Nortel used Oracle Forms as a user interface, which I had to seriously work around to get it to behave: the designer foolishly assumed it could handle hierarchical data sets without difficulty. Remember, this was Oracle 6.

I focus on components and templates for content management systems. My biggest project was for a Bosch Security Systems marketing program using Mambo -- an ordering site for branded distributor collaterals. It required substantial modification to the com_content component to integrate personalized previews, the com_users for personalization fields and logo uploads, shopping cart integrations, rewriting a credit card gateway for, and order "ticket" packaging for a pre-press production process. The templates also had to match the Bosch site styles precisely, which was not a trivial task. I continue to produce Joomla components. I wrote one component for bridging QuickBooks customers to Joomla users, modified contact components for various purposes, and adapted a number of other off-the-shelf components for specialized purposes. For CMS sites it is rarely economical to custom program entire components from the ground up since GPL components can usually be adapted with much less effort. See and for two of the best examples.

I have repaired and evaluated technical maintenance issues on ASP hosted sites. The VBScript on them is not any more complicated than PHP, but the closed database environment of MS Access and the "jet" database files can make these sites less productive to work with than LAMP environments. I spent a lot of time supporting the Print On Demand Initiative, an industry consortium, now at The chief need there was first to stabilize the site -- it had been left with database problems -- and then to work with the marketing team to put up landing pages for targeted email marketing campaigns. To let them track the campaigns I included a URL-encoded parameter which then populated a hidden form field when the user clicked-through. Eventually the client wanted to overhaul the entire site, but I was not interested in being a liaison to the foreign team they had in mind so we parted ways amicably. Another site I did support for was a wallpaper pattern ordering site. A client, Merlin, was considering purchasing it. The code and database were a mess however, and would not allow customers to complete orders. It was also not bringing in much business -- building that up was really an SEO/SEM problem -- so after learning about the business and digging into the code I advised against a purchase.
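The campaign-tracking trick described above can be sketched in a few lines of Python. The original site was ASP with VBScript, so this is only an illustration: the parameter name `campaign`, the `/signup` action, and the `landing_form` helper are hypothetical.

```python
from html import escape
from urllib.parse import parse_qs, urlparse

def landing_form(url, param="campaign"):
    """Render a landing-page form whose hidden field carries the campaign
    value taken from the URL the visitor clicked through on."""
    query = parse_qs(urlparse(url).query)  # also decodes %xx escapes
    value = query.get(param, [""])[0]
    name_attr = escape(param, quote=True)
    value_attr = escape(value, quote=True)  # guard against markup injection
    return (
        '<form method="post" action="/signup">\n'
        f'  <input type="hidden" name="{name_attr}" value="{value_attr}">\n'
        '  <input type="email" name="email">\n'
        '</form>'
    )
```

When the form is submitted, the hidden field arrives alongside the visitor's details, so each response can be attributed to the email campaign that produced it.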

I sometimes do MS Access work. The front-end is a VB GUI layout designer, the back-end is SQL based, and the language is VBA, so it is really just a 4GL environment, albeit a rather poor one. I say that because it is easy for naive programmers to get into a lot of trouble creating MS Access based applications that look great but destroy the very data they were designed to capture and protect. One of the bigger efforts was a course evaluations reporting system for St. Augustine's College; the consultant left them high-and-dry with an Access database, from which had been stripped all the report generation code. Only the report formats existed. I wrote import procedures and queries to cross-tabulate and summarize the raw data, and to implement business rules as the client specified. I also wrote validation queries to check the integrity of the data before, and of the report counts afterward. I still help them prepare the course evaluation reports every semester, refining the checks and adding a few features here and there to improve the stability and confidence in the process. I also worked on an insurance rates calculation program while at STI, which was done in VBA under MS Access, and a couple of other forgettable Access database projects.

I did a lot of serious reworking of the SBT Accounting Modules in my first job, for a client (Utica Steel). The boss sent me for SBT training. The coding was all in dBase with simple sequential record files. I was good at it, but the dBase/SBT environment was not conducive to maintaining the modifications: you were basically forking the codebase.

In view of this reflection, part of my difficulty is in connecting at a social level, and part in treating myself as being at a "tradesman" level. It inherently limits my options.
