Higher-Order Functions in XQuery 3.0

Introduction

Higher-order functions are probably the most notable addition to the XQuery language in version 3.0 of the specification. While it may take some time to understand their full impact, higher-order functions certainly open a wide range of new possibilities, and are a key feature in all functional languages.

As of April 2012, eXist-db completely supports higher-order functions, including features like inline functions, closures and partial function application. This article will quickly walk through each feature before we put them all together in a practical example.

Function References

A higher-order function is a function which takes another function as parameter or returns a function. So the first thing you'll need in order to pass a function around is a way to obtain a reference to a function.

In older versions of eXist-db we had an extension function for this, called util:function, which expected a name as first argument, and the arity of the function as second. The arity corresponds to the number of parameters the target function takes. Name and arity are required to uniquely identify a function within a module.

XQuery 3.0 now provides a literal syntax for referencing a function statically. It also consists of the name and the arity of the function to look up, separated by a hash sign:

let $f := my:func#2

This assumes you already know the function when writing the query. However, there's also a dynamic variant, a function called "function-lookup", which is part of the standard library:

let $f := function-lookup("my:func", 2)

Dynamic Function Calls

Now we have a reference, we can actually call the function. In XQuery 3.0, a dynamic function call is a primary expression followed by an argument in parenthesis:

let $f := upper-case#1 return $f("Hello")

In this case the function reference is obtained from variable $f, but we could actually use any primary expression here, even another function call. Now we're able to write our first higher-order function, i.e. one which takes another function as parameter:

xquery version "3.0"; declare namespace ex="http://exist-db.org/xquery/ex"; declare function ex:map($func, $list) { for $item in $list return $func($item) }; (: Create an inline function and assign it to $f :) let $f := upper-case#1 return ex:map($f, ("Hello", "world!"))
Edit

Our function, ex:map, simply applies the passed function to every item in the list it received. This is actually a very common operation, so the XQuery 3.0 standard library does already provide a function, fn:map, and we don't need to write our own.

Inline Functions and Closures

Instead of explicitly declaring a function in the module, we may also use an inline function (or anonymous function), which we can directly pass around. To put it simple: an inline function is a function without a name:

map(function($x) { $x * $x }, 1 to 5)
Edit

Inline functions have an important feature: within the function body, you can not only access the function parameters, but also any variable which was in scope at the time the inline function was created. This is commonly called a closure in functional languages. The closure allows the function body to access variables outside its immediate lexical scope. We'll see concrete examples in the larger example following below.

Inline functions also inherit the static context from the module in which they were created.

Partial Function Applications

Partial function applications are a powerful feature in functional programming languages. They allow us to fill in parts of a function call, leaving some parameters open. This is useful if information has to be passed across a hierarchy of function calls. The calling function can fill in some parameters and leaves it to the called function to provide the rest.

A partial function application looks like a normal (dynamic or static) function call using a question mark (?) as placeholder for some parameters:

declare function local:add($lhs as xs:integer, $rhs as xs:integer) as xs:integer { $lhs + $rhs }; let $f := local:add(10, ?) return $f(32)
Edit

The partial function application actually creates a new function with the same name, but removing all fixed parameters from the parameter list.

Putting it all Together

For a practical exercise we choose a frequent task when working with eXist: scan a collection hierarchy recursively and apply some operation to every collection or resource encountered. This can be written as a one-time solution using a simple recursive function. However, we would rather like to design a set of functions which can be reused for arbitrary tasks on collections and resources.

To start with, we create a function to recursively scan the collection hierarchy, calling a provided function for each collection encountered:

xquery version "3.0"; module namespace dbutil="http://exist-db.org/xquery/dbutil"; (:~ : Scan a collection tree recursively starting at $root. Call : $func once for each collection found :) declare function dbutil:scan-collections($root as xs:anyURI, $func as function(xs:anyURI) as item()*) { $func($root), for $child in xmldb:get-child-collections($root) return dbutil:scan-collections(xs:anyURI($root || "/" || $child), $func) };

The function takes two parameters:

  • the path to the root collection
  • a function which accepts one parameter: the path to the current collection

The callback function is called once for every collection we encounter while scanning the collection tree. We then walk each child collection, calling dbutil:scan-collections recursively. The || operator is also new in XQuery 3.0 and concatenates strings. It is a shorthand for the fn:concat function.

Next, let's add another function for scanning all the resource paths stored in a given collection and pass them to another callback function. This is straightforward:

(:~ : List all resources contained in a collection and call the : supplied function once for each resource with the complete : path to the resource as parameter. :) declare function dbutil:scan-resources($collection as xs:anyURI, $func as function(xs:anyURI) as item()*) { for $child in xmldb:get-child-resources($collection) return $func(xs:anyURI($collection || "/" || $child)) };

We're now ready to combine the two operations: create a function which recursively walks the collection tree and calls a provided function for each collection and resource encountered.

(:~ : Scan a collection tree recursively starting at $root. Call : the supplied function once for each resource encountered. : The first parameter to $func is the collection URI, the : second the resource path (including the collection part). :) declare function dbutil:scan($root as xs:anyURI, $func as function(xs:anyURI, xs:anyURI?) as item()*) { dbutil:scan-collections($root, function($collection as xs:anyURI) { $func($collection, ()), (: scan-resources expects a function with one parameter, so we use a partial application to fill in the collection parameter :) dbutil:scan-resources($collection, $func($collection, ?)) }) };

The callback function for the collection/resource scan takes two parameters:

  • the path to the collection
  • the full resource path if the current item is a resource, or the empty sequence if it is a collection

We start by calling dbutil:scan-collections with an inline function. This gets the path to the current collection and first calls the user provided callback with the collection path and an empty sequence for the resource parameter. Note that the variable $func, which we received as parameter to the outer call, is visible within the inline function due to the closure!

Now to the interesting code: for each collection, we call dbutil:scan-resources. Remember that it expects a callback function with just one parameter (the resource path), but the user-supplied $func takes two parameters. We could now work around this by changing dbutil:scan-resources, but there's an easier way: we pass a partial function application, providing $collection as a fixed parameter while leaving the resource parameter open!

We now have all our library functions in place and are ready to use them. For a first test, here's a function which scans the collection hierarchy for resources having a given mime type:

declare function dbutil:find-by-mimetype($collection as xs:anyURI, $mimeType as xs:string) { dbutil:scan($collection, function($collection, $resource) { if (exists($resource) and xmldb:get-mime-type($resource) = $mimeType) then $resource else () }) };

If our inline function receives a $resource parameter, we check its mime type and return the resource path if it matches. In all other cases, the empty sequence is returned. We can now test if our library works as expected:

import module namespace dbutil="http://exist-db.org/xquery/dbutil" at "xmldb:exist:///db/codebin/dbutils.xql"; dbutil:find-by-mimetype(xs:anyURI("/db/demo"), "application/xquery")
Edit

You can view the code for the complete module.

Another operation you will probably need when migrating your database from eXist-db version 1.4.x to the forthcoming 2.0: traverse the collection tree and fix permissions on all collections and XQuery resources. Contrary to previous versions, eXist-db 2.0 closely follows the Unix security model. To access a collection, it is no longer sufficient to have read permissions on it: you also need the execute permission in order to see the collection contents. Likewise, stored XQuery modules need the execute flag to be set in order to be executed through eXist-db.

Here's a piece of code to globally reset permissions:

xquery version "3.0"; import module namespace dbutil="http://exist-db.org/xquery/dbutil" at "xmldb:exist:///db/codebin/dbutils.xql"; (: You need to run this as an admin user, so it won't work on the demo server :) let $perms := "g+x,u+x,o+x" return dbutil:scan(xs:anyURI("/db"), function($collection, $resource) { if ($resource and xmldb:get-mime-type($resource) = "application/xquery") then sm:chmod($resource, $perms) else sm:chmod($collection, $perms) })
Edit

Availability and Backwards Compatibility

Complete support for XQuery 3.0 higher-order functions is available in eXist-db trunk (2.1dev) starting with revision 16248 (April 14, 2012). It will also be part of the final 2.0 release. We're currently working hard to complete the last bits and pieces for the release.

eXist-db provided ways to dynamically call functions since several years, based on the two extension functions util:call and util:function. They are used in several modules (e.g. the KWIC module).

We made sure the new XQuery 3.0 higher-order function implementation is backwards compatible (though a lot more powerful), so old code will continue to work and migration is smooth.

1.4.2 Maintenance Release

While we're working hard on polishing version 2.0 for the final release, we should not forget that there have been quite a few important bug fixes in the 1.4.x branch as well. We are thus happy to announce another production quality release in the stable branch: 1.4.2. Version 1.4.2 is the last release in the 1.4.x branch before 2.0. After 2.0 has been released in May, it will become the new stable branch.

Stable releases only contain hand-selected bug fixes which were ported back from the development trunk. There are no new features. Notable changes include:

  • Deadlock fixes
  • Lots of updates to the rewritten WebDAV interface
  • Performance improvements when using the Lucene index
  • Backup to file system produced bad directory structure
  • Support circular module imports
  • XQuery processing fixes

eXist-db version 1.4.2 is now available on sourceforge.

One of the outstanding new features in the development version of eXist is the ability to index and query the content of binary resources, which are stored inside the database. This is made possible through a new XQuery module for content extraction and some basic, but powerful additions to the Lucene-based full text indexing. Our goal was to provide as much flexibility to the user as possible. The new functions are thus rather generic and can be used in a wide range of scenarios.

Read article ...

Demo of the new app repository

eXist's development version (to become v1.5) provides a number of new features to simplify the process of creating, deploying and distributing XQuery-based apps. An "app" in this context is a self-contained package, which can be downloaded from a public or private repository and installed into any instance of eXist-db with a few mouse clicks. The app may just package a bunch of XQuery library modules or (REST-style) interfaces, or it may contain an entire, complex web application.

There are many different paths to create an application with eXist, which is good. But this also makes it difficult for new users to find their way. The new app repository as well as eXide try to simplify the process for people to get started (just keep in mind that not every app will fit into this framework).

Upon request, I created a short screencast to demonstrate how simple it is to use the package repository to install entire applications into eXist. This is just a teaser and does not explain how to actually create app packages. I have a longer video in the pipeline which explains just that (eXide actually handles most of the setup work for you).

For the next release of eXist, we plan to ship all example code and parts of the documentation as apps, which can be installed on demand. This will lead to a cleaner installation and make it easier for people to find their way through the examples.

Release 1.4.1

Monday, 15th August 2011

Dear Ladies and Gentlemen,

The eXist-db team are very proud to announce the release of eXist-db version 1.4.1.

Version 1.4.1 is not your average point release and concludes almost two years of hard work from the developers, contributors and the community. There are more bug-fixes and stability improvements that you can shake a very pointy stick at, and we believe that it is the best version of eXist-db yet.

There is far too much to pack into a short list, the full change list can be found here but, the highlights include:

  1. Referential Integrity – no more vanishing index entries, so no more failed document updates and inconsistent documents.
  2. Database Shutdown and Crash Protection – numerous improvements by us, which means less unnecessary recovery runs by you.
  3. WebDAV – completely new implementation, based on the solid Milton WebDAV Server library. Enabled by default, but don't panic, the old version is still present should you need it!
  4. Lucene Full-Text Indexing – upgraded to version 2.9.2 for a performance boost, configurable analyzers, configurable parameters, better match highlighting and additional functions for working with Lucene document fields.
  5. XForms – The betterFORM server-side XForms 1.1 engine now ships with eXist-db and is enabled by default. The included XSLTForms XForms engine has been updated.
  6. EXPath – The EXPath projects HTTP Client XQuery module has been implemented as a first step towards EXPath portable XQuery.
  7. Improved transaction handling – scheduled and direct system tasks no longer need exclusive access to the database. e.g. the Backup system task can operate whilst your database is still online.
  8. Indexing architecture – many improvements for both performance and stability.
  9. Resolution of all reported database lock contention issues.

Heads Up! → As we mentioned, this is no ordinary point release, and as such the legacy full-text index in eXist-db 1.4 has been disabled by default in 1.4.1. The legacy full-text index can of course be re-enabled through conf.xml, however there are several integrity issues that we will not fix, instead we suggest moving to the newer Lucene based full-text index.

We have tried to ensure that upgrading from eXist-db 1.4 to 1.4.1 is as easy as possible, however should you need some assistance or support above and beyond what the community provides then please consider eXist Solutions, without whom this release would not have been possible.

Version 1.4.1 will most likely be the last major release in the 1.4.x line. Whilst we are also planning to release a preview of eXist-db 1.6 in the next couple of months, 1.4.1 should be considered the current stable, suitable for production use, version of eXist-db.

Finally, we would like to thank the eXist-db community, contributors and developers for all their hard work on eXist-db 1.4.1.

Thanks and Enjoy :-)

eXist-db version 1.4.1 is available here - http://www.exist-db.org/download.html

Thank you

The eXist-db team

Original message appeared on the exist-open mailing list.