August 19, 2009

Balisage 2009 - Best Practices - or Not?


{If it could talk, this blog entry would be begging for your comments via the "comments" link below the article.}

On August 13th, the Balisage Conference 2009 hosted a spirited panel discussion featuring David Chesnutt, Chet Ensign, Betty Harvey, Electronic Commerce Connection, Laura Kelly, National Library of Medicine, and Mary McRae, OASIS. Harvey shared her "Top 10 Mistakes in DTDs" from 1998, most of which are still applicable today. We learned a bit about the standardization process at OASIS from Mary McRae, who differentiated between the codified rules which are enforced, informative guidelines which are essentially only recommendations, and very informal oral exchange of practices (which she emphasized should be recorded in written form).

McRae mentioned the OASIS Technical Committee Process which, among many other things, defines the OASIS Naming Guidelines in two parts: Part 1: Filenames, URIs, Namespaces and Part 2: Metadata and Versioning. For example, OASIS requires a RDDL-like file at the URI given for a namespace. This is in accord the W3C’s Architecture of the World Wide Web “good practice” for namespace documents:
The owner of an XML namespace name SHOULD make available material intended for people to read and material optimized for software agents in order to meet the needs of those who will use the namespace vocabulary.
McRae mentioned Adoption Technical Committees, I believe in reference to DITA Help, DITA Localization, UDDI, SAML, and OpenDocument Format (ODF). She also indicated that all OASIS specifications are either published in DocBook, Word, OpenOffice, DITA or XHTML.

An issue that generated divergent opinions was the question of which version of an XML Schema is the normative one -- the version stored in a separate file that can be validated and directly used, or the version that has been pasted into a Word document or other word processing format? Of course, the two should be identical and ideally pulled from the same source (i.e., the schema file could be imported into the document), but this isn’t always the case. The separate has the advantage of lending itself to code review, whereas the version embedded in a specification that is normative gives the impression that it too is normative.

Does imposing best practices on developers restrict creativity and productive competition? Are Naming and Design Rules (NDRs) inherently evil? Again, this question solicited differing opinions, although it seems that the majority who offered comments were against such impositions.

Laura Kelly's experience at the US National Library of Medicine taught her that the scope of any project is smaller than we like to admit, so perhaps draconian rules are counter-productive. When it comes to XML encoding, focus on giving the users what they should be asking for -- how to markup data so it will work best in their systems. Developers really only want to know "have I tagged my data correctly?" and "what does the data look like?"

David R. Chesnutt offered some lessons learned from his SGML work with the Model Editions Partnership (MEP) circa 1997. This project used a "subset of the SGML markup system developed by the Text Encoding Initiative (TEI)".

{Post comments below.}

No comments: