Showing posts with label Mark Logic. Show all posts

August 12, 2009

Balisage 2009 - Streamability of XProc Pipelines


Norm Walsh (Mark Logic) gave a talk on streamability of XProc pipelines. XProc lets users define a sequence of atomic operations to apply to a series of documents, using control structures similar to conditionals, iteration, and exception handlers. XProc: An XML Pipeline Language is presently a W3C Candidate Recommendation that is near and dear to Norm since he’s been working on it for a while. He hinted it should become a Recommendation this fall or certainly by Christmas. As per W3C policy, there must be 2 implementations before a specification is finalized. One of those implementations, XML Calabash, is by Walsh himself and is built on Saxon 9.

Streaming would provide a sliding window in a single pass with output beginning before all input has been seen. Little is said about streaming in the spec, but it is clear it could improve end-to-end performance in certain situations and would be essential for processing documents larger than physical memory. Although there are no explicit requirements for steps to be streaming in the spec, implementations will add value by enabling this.
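To make the single-pass idea concrete, here is a minimal sketch in Python (not XProc, and not how XML Calabash works internally). It assumes a hypothetical input made of many `record` elements carrying an `id` attribute, and uses the standard library's incremental parser so that results are handed to the caller as each record finishes parsing, rather than after the whole document has been loaded. The function name and the element and attribute names are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def stream_record_ids(source, record_tag="record"):
    """Yield the id attribute of each record element in one forward pass.

    `source` is a filename or a binary file-like object. The tag and
    attribute names are hypothetical, chosen only for this illustration.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            # Hand the result downstream as soon as this record is complete.
            yield elem.get("id")
            # Drop the record's content so memory use stays roughly flat.
            elem.clear()


if __name__ == "__main__":
    sample = io.BytesIO(b"<log><record id='a'/><record id='b'/></log>")
    for record_id in stream_record_ids(sample):
        print(record_id)
```

A step written this way never needs the full document in memory, which is the property that matters for inputs larger than physical memory. A step that has to see the entire tree before producing anything cannot be restructured like this.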

Norm indicated that certain XProc instructions such as p:count are streamable, whereas others such as p:exec, p:http-request, p:validate-with-relaxng, p:validate-with-schematron, p:validate-with-xml-schema, p:xquery, and p:xslt cannot be streamed. His paper discusses data collected by XML Calabash between 21 Dec 2008 and 11 Jul 2009, representing more than 294,000 pipeline runs. (His implementation has an opt-out, phone-home feature so he can collect certain usage data.) In his submitted paper, Walsh concluded:
The preliminary analysis performed when this paper was proposed suggested that less than half “real world” pipelines would benefit from a streaming implementation.
The data above seems to indicate that the benefits may be considerably larger than that. Although it is clear that there are pipelines for which streaming wouldn't offer significant advantages, it's equally clear that for essentially any set of pipelines of a given length, there are pipelines which would be almost entirely streamable.
Perhaps the most interesting aspect of this analysis is the fact that as pipeline runs grow longer, they appear to become more and more amenable to streaming. That is to say, it appears that a pipeline that runs to 300 steps is, on average, more likely to benefit from streaming than one that's only 100 steps long. We have not yet had a chance to investigate why this is the case.

Balisage 2009 - Beer and Demo


Q: What is the preferred accompaniment for demos at a technical conference?
A: Why, beer and free food, of course! (Save your divergent opinions about that statement, guys!)

On August 11th, Mark Logic was generous enough to provide great quantities of liquid and solid nourishment at the Brewtopia pub in Montreal. The demo format was simple: 5 minutes each to plug in and go. Over a dozen eager folks braved the cramped space and a hot room, not to mention an increasingly rowdy audience (funny how beer contributes to that). The contestants and the names of their demos follow:
  • Micah Dubinko: Zero to App in 5 minutes
  • Michael Sokolov: Biblical Studies
  • Bruce Bauman: Conceptual Models to XML Schema
  • Josh Lubell: Quality of Design
  • Mohamed Zergaoui: XML Prague and XProc Designer
  • Uche Ogbuji: Freemix
  • Markos Z(?): XQuery in the Browser
  • David Lee: One-Line Web Server
  • Quinn Dombrowski: Visualizing Bulgarian Dialect Data
  • Betty Harvey: Archival Description (NARA)
  • Steve Newcomb: IEML Parser
  • John Snelson: Higher Order Functions in XQuery 1.1
[Also, Vyacheslav Zholudev volunteered to demo Presentational OMDoc but unfortunately couldn't get a working laptop in the allotted time.]

Betty Harvey (Electronic Commerce Connection, Inc.) and Quinn Dombrowski received identical cheers twice in succession (as measured by the highly scientific decibel meter) so they were declared co-winners, splitting the cash prize. By sheer coincidence, they were the only two female demonstrators. Draw your own conclusions. Everyone who participated was awarded a Mark Logic t-shirt.

Thanks to Mark Logic, especially host Norm Walsh and his colleagues, for a fun and "educational" evening. Thanks also to the Brewtopia wait staff who had to wend their way through the tightly packed crowd all night.

August 11, 2009

Balisage 2009 - Opening Remarks and Sponsors


The Balisage 2009 Conference Committee -- B. Tommie Usdin (chair), Deborah A. Lapeyre, James David Mason, Steven R. Newcomb, C. M. Sperberg-McQueen -- opened the 4-day XML conference. One of the well-received announcements was their determination to make Balisage conference proceedings persistent. Unlike some other XML conferences which shall remain nameless (but not blameless), you’ll always be able to find all papers in the series. An ISBN has been assigned to each volume (2 per year), the entire series has an ISSN, and each individual paper has its own DOI (digital object identifier). How cool! Thank you, Mulberry Technologies!

The co-chairs acknowledged the two main sponsors: Mark Logic and the FLWOR Foundation. The FLWOR Foundation is dedicated to providing middleware and clients to simplify the use of XQuery. They have 3 open source projects under an Apache license: Zorba, an XQuery processor supporting XQuery 1.1, the update facility, scripting, and REST extensions; XQIB (XQuery in the Browser), a browser plugin for Internet Explorer that allows execution of client-side XQuery to navigate and update the DOM; and an Eclipse plugin (XQVT?).

Of course, we all know Mark Logic because they've given us: MarkLogic Server, a native XML database that implements XQuery for CRUD functionality with full-text and structured search; MarkLogic Application Services, which includes Application Builder, an intuitive, browser-based user interface for creating applications without writing XQuery code; and MarkMail.org, a public email search site built using Mark Logic App Builder, which currently archives over 40 million searchable emails.

And Mark Logic is now also known for sponsoring a “Beer and Demo” (more about that later).

And let’s not forget those cool ergonomic pens donated by Patrick Russo (sp?). Don’t confuse them with a tuning fork or wishbone ;-)