August 10, 2009

Balisage 2009 - Review and Summary: Processing XML Efficiently

Mike Kay summarized and reviewed the symposium and added his own observations. He was a little surprised that there was nothing about Binary XML and little about speeding up XSLT or XQuery. Kay made the following main points:

Performance against other objectives:

  • Saxon: standards conformance first, then usability, then performance
  • performance is not always the most important consideration
  • David Wheeler (invented the subroutine): “Optimize the code that users actually write.”
  • Should you optimize for Joe User or the expert?

Performance metrics:

  • Response time / latency
  • throughput
  • resource cost
  • scalability
  • Jim Robinson: “Good enough for us” -- your own requirements, not the world’s best solution
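
To make the first two metrics concrete, here is a minimal sketch (not from the talk) of measuring latency and throughput for an arbitrary task. The helper name `measure` and its parameters are hypothetical, chosen only for illustration; it uses nothing beyond the Python standard library.

```python
import statistics
import time


def measure(task, items, repeats=5):
    """Run task(item) over all items `repeats` times.

    Returns the median per-call latency in seconds and the overall
    throughput in calls per second. `measure` is a hypothetical helper,
    not part of any library.
    """
    latencies = []
    start = time.perf_counter()
    for _ in range(repeats):
        for item in items:
            t0 = time.perf_counter()
            task(item)
            latencies.append(time.perf_counter() - t0)
    # Guard against a zero elapsed time on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-12)
    return {
        "calls": len(latencies),
        "median_latency_s": statistics.median(latencies),
        "throughput_per_s": len(latencies) / elapsed,
    }


result = measure(lambda n: sum(range(n)), [1000, 2000], repeats=3)
print(result)
```

In the spirit of “good enough for us”, the numbers only mean something when compared against your own target, for example a latency budget you have agreed on for your application.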

Performance methodology:

  • measure
  • understand - figure out where the bottlenecks are and what can be done to improve them
  • focus on critical components
  • improve - but if a change doesn’t help, don’t leave it in
  • repeat until ok - and decide what counts as ok

Where are the bottlenecks?

  • application-level glue
  • parsing vs. query?
  • validation?
  • conversion to/from XML?
  • serialization?
  • query/transformation?
  • too many phases in the pipeline?
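
One way to start answering these questions is to time each phase separately. The sketch below is an illustration only, not something shown in the talk: it assumes a toy pipeline of parse, query and serialize built on Python's standard `xml.etree.ElementTree`, and the function name `time_phases` and the sample document are made up for the example.

```python
import time
import xml.etree.ElementTree as ET


def time_phases(xml_text, path):
    """Time the parse, query and serialize phases of a toy XML pipeline.

    `time_phases` is a hypothetical helper; a real pipeline would have
    its own phases (validation, transformation, and so on).
    """
    timings = {}

    t0 = time.perf_counter()
    root = ET.fromstring(xml_text)          # parsing
    timings["parse"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    matches = root.findall(path)            # query (limited XPath subset)
    timings["query"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    output = ET.tostring(root, encoding="unicode")  # serialization
    timings["serialize"] = time.perf_counter() - t0

    return timings, len(matches), output


# Made-up sample document with 1000 empty <item> elements.
doc = "<items>" + "".join(f"<item id='{i}'/>" for i in range(1000)) + "</items>"
timings, count, _ = time_phases(doc, "./item")
slowest = max(timings, key=timings.get)
print(f"{count} matches; slowest phase: {slowest}")
```

Which phase comes out slowest depends on the document, the query and the library, so a single run like this is only a starting point for the measure-and-understand steps.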

What improvements can we expect?

  • faster parsing has been promised?
  • faster and more scalable transformation
  • optimized pipelines?
  • smarter users?
  • faster hardware
