Demo of the new app repository

eXist's development version (to become v1.5) provides a number of new features to simplify the process of creating, deploying and distributing XQuery-based apps. An "app" in this context is a self-contained package, which can be downloaded from a public or private repository and installed into any instance of eXist-db with a few mouse clicks. The app may just package a bunch of XQuery library modules or (REST-style) interfaces, or it may contain an entire, complex web application.

There are many different paths to create an application with eXist, which is good. But this also makes it difficult for new users to find their way. The new app repository as well as eXide try to simplify the process for people to get started (just keep in mind that not every app will fit into this framework).

Upon request, I created a short screencast to demonstrate how simple it is to use the package repository to install entire applications into eXist. This is just a teaser and does not explain how to actually create app packages. I have a longer video in the pipeline which explains just that (eXide actually handles most of the setup work for you).

For the next release of eXist, we plan to ship all example code and parts of the documentation as apps, which can be installed on demand. This will lead to a cleaner installation and make it easier for people to find their way through the examples.

Release 1.4.1

Monday, 15th August 2011

Dear Ladies and Gentlemen,

The eXist-db team are very proud to announce the release of eXist-db version 1.4.1.

Version 1.4.1 is not your average point release and concludes almost two years of hard work from the developers, contributors and the community. There are more bug-fixes and stability improvements than you can shake a very pointy stick at, and we believe that it is the best version of eXist-db yet.

There is far too much to pack into a short list (the full change list can be found here), but the highlights include:

  1. Referential Integrity – no more vanishing index entries, so no more failed document updates and inconsistent documents.
  2. Database Shutdown and Crash Protection – numerous improvements by us, which means fewer unnecessary recovery runs by you.
  3. WebDAV – completely new implementation, based on the solid Milton WebDAV Server library. Enabled by default, but don't panic, the old version is still present should you need it!
  4. Lucene Full-Text Indexing – upgraded to version 2.9.2 for a performance boost, configurable analyzers, configurable parameters, better match highlighting and additional functions for working with Lucene document fields.
  5. XForms – The betterFORM server-side XForms 1.1 engine now ships with eXist-db and is enabled by default. The included XSLTForms XForms engine has been updated.
  6. EXPath – The EXPath project's HTTP Client XQuery module has been implemented as a first step towards EXPath portable XQuery.
  7. Improved transaction handling – scheduled and direct system tasks no longer need exclusive access to the database, e.g. the Backup system task can operate whilst your database is still online.
  8. Indexing architecture – many improvements for both performance and stability.
  9. Resolution of all reported database lock contention issues.

Heads Up! → As we mentioned, this is no ordinary point release, and as such the legacy full-text index in eXist-db 1.4 has been disabled by default in 1.4.1. The legacy full-text index can of course be re-enabled through conf.xml; however, it has several integrity issues that we will not fix. Instead, we suggest moving to the newer Lucene-based full-text index.

We have tried to ensure that upgrading from eXist-db 1.4 to 1.4.1 is as easy as possible. However, should you need assistance or support above and beyond what the community provides, please consider eXist Solutions, without whom this release would not have been possible.

Version 1.4.1 will most likely be the last major release in the 1.4.x line. Whilst we are also planning to release a preview of eXist-db 1.6 in the next couple of months, 1.4.1 should be considered the current stable version of eXist-db, suitable for production use.

Finally, we would like to thank the eXist-db community, contributors and developers for all their hard work on eXist-db 1.4.1.

Thanks and Enjoy :-)

eXist-db version 1.4.1 is available here - http://www.exist-db.org/download.html

Thank you

The eXist-db team

Original message appeared on the exist-open mailing list.

Akismet in XQuery

So after receiving lots of comment Spam on my personal blog, I switched from using reCaptcha to Asirra, both of which I had implemented as small XQuery Modules.

I had assumed that the Spam was the result of a Robot, that was brute force cracking the reCaptcha Captchas via image transformation and OCR. As such, I envisaged that moving from reCaptcha to Asirra would solve this issue, as Asirra is much much tougher for a Robot to solve.

Unfortunately the move from reCaptcha to Asirra did not completely stop the spam, although the quantity is now much less. From this I am concluding that the Spammers are actually Human and that, because Asirra is more time consuming than reCaptcha, this has just slowed them down.

Now, I am well versed in email Spam Filtering, as in the past I have configured plenty of Postfix mail servers with SpamAssassin and various DNS Black/White Lists. The thought occurred to me that there must be a similar service for blog comments, and a quick Google revealed both Akismet and TypePad AntiSpam.

Akismet appears to be the more established player, however their terms of use are quite limiting, for example whilst personal use is free, you have to pay for commercial use. On the other hand TypePad AntiSpam are the young upstart and have very liberal terms of use. The good news is that TypePad AntiSpam implements exactly the same API as Akismet, so by just changing the hostname of the server you are contacting, you can choose to use either Akismet or TypePad AntiSpam.
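To illustrate the point about only the hostname changing, here is a minimal sketch of how the service URI could be built for either provider. The function local:service-uri is a hypothetical helper written for this illustration (the module listed further down uses a module-level endpoint variable instead); the two hostnames and the 1.1/comment-check path are the ones used in that module, and "your-api-key" is a placeholder.

```xquery
xquery version "1.0";

(: Hypothetical helper for illustration only: builds the URI of a service
   from the provider hostname, the API key and the service path. :)
declare function local:service-uri(
    $endpoint as xs:string,
    $api-key as xs:string,
    $service as xs:string
) as xs:string {
    fn:concat("http://", $api-key, ".", $endpoint, "/", $service)
};

(
    (: Akismet :)
    local:service-uri("rest.akismet.com", "your-api-key", "1.1/comment-check"),
    (: TypePad AntiSpam: same API, different hostname :)
    local:service-uri("api.antispam.typepad.com", "your-api-key", "1.1/comment-check")
)
```

Evaluating this returns two strings that differ only in the hostname part, which is all that needs to change when switching between the two services.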

So I decided to implement TypePad AntiSpam filtering of comments submitted to my blog, and guess what? I implemented it as a reusable XQuery Module (downloadable from here), which makes use of the EXPath HttpClient functions, so whilst this will work on eXist-db, it should also be useable on any XQuery processor that supports EXPath.

Example (X)HTML Page (example.html)

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>Akismet Example</title>
    </head>
    <body>
        <form action="example.xql" method="post" id="commentform">
            <fieldset>
                <label for="comment_name">Name</label>
                <br/>
                <input id="comment_name" name="name" type="text" size="40"/>
                <br/>
                <label for="comment_email">email address</label> (will not be shown)<br/>
                <input id="comment_email" name="email" type="text" size="40"/>
                <br/>
                <label for="comment_website">Website</label>
                <br/>
                <input id="comment_website" name="website" type="text" size="60"/>
                <br/>
                <label for="comment_comments">Comments</label>
                <br/>
                <textarea id="comment_comments" name="comments" rows="12" cols="55"> </textarea>
            </fieldset>
            <input type="submit"/>
        </form>
    </body>
</html>

Example XQuery handler (example.xql)

xquery version "1.0";

import module namespace request = "http://exist-db.org/xquery/request";
import module namespace akismet = "http://akismet.com/xquery/api" at "xmldb:exist:///db/akismet.xqm";

declare variable $local:akismet-api-key := "your-akismet-or-typepad-api-key-goes-here";

declare function local:is-comment-spam() as xs:boolean {
    akismet:comment-check(
        $local:akismet-api-key,
        <akismet:comment>
            <akismet:blog>http://www.adamretter.org.uk/blog.xql</akismet:blog>
            <akismet:user_ip>{request:get-header("X-Real-IP")}</akismet:user_ip>
            <akismet:user_agent>{request:get-header("User-Agent")}</akismet:user_agent>
            <akismet:referrer>{request:get-header("Referer")}</akismet:referrer>
            <akismet:permalink>http://www.adamretter.org.uk/{request:get-parameter("comment",())}</akismet:permalink>
            <akismet:comment_type>comment</akismet:comment_type>
            <akismet:comment_author>{request:get-parameter("name", ())}</akismet:comment_author>
            {
                if(request:get-parameter("email",()))then
                    <akismet:comment_author_email>{request:get-parameter("email", ())}</akismet:comment_author_email>
                else(),
                if(request:get-parameter("website",()))then
                    <akismet:comment_author_url>{ request:get-parameter("website", ()) }</akismet:comment_author_url>
                else()
            }
            <akismet:comment_content>{request:get-parameter("comments", ())}</akismet:comment_content>
        </akismet:comment>
    )
};

if(local:is-comment-spam())then
    <result>
        <it-was-spam/>
    </result>
else
    <result>
        <not-spam/>
    </result>

Akismet XQuery Module (akismet.xqm)

xquery version "1.0";

(:~
 : XQuery Module implementation for the Akismet API - http://akismet.com/development/api/
 :
 : Can be used with either Akismet or the TypePad AntiSpam service
 :
 : @author Adam Retter <adam@exist-db.org>
 : @date 2011-06-24T21:26:00+02:00
 :)
module namespace akismet = "http://akismet.com/xquery/api";

import module namespace http = "http://expath.org/ns/http-client";

declare variable $akismet:HTTP-OK := 200;

declare variable $akismet:endpoint := "api.antispam.typepad.com"; (: for TypePad :)
(: declare variable $akismet:endpoint := "rest.akismet.com"; :) (: for Akismet :)

declare variable $akismet:comment-check-service := "1.1/comment-check";
declare variable $akismet:submit-spam-service := "1.1/submit-spam";
declare variable $akismet:submit-ham-service := "1.1/submit-ham";

(:~
 : Calls the Akismet comment check service
 :
 : @param api-key Your Akismet API key
 : @param comment
 :  <comment xmlns="http://akismet.com/xquery/api">
 :      <blog> The front page or home URL of the instance making the request. For a blog or wiki this would be the front page. Note: Must be a full URI, including http://. </blog> (required)
 :      <user_ip> IP address of the comment submitter. </user_ip> (required)
 :      <user_agent> User agent string of the web browser submitting the comment - typically the HTTP_USER_AGENT cgi variable. Not to be confused with the user agent of your Akismet library. </user_agent> (required)
 :      <referrer> The content of the HTTP_REFERER header should be sent here. </referrer> (note spelling)
 :      <permalink> The permanent location of the entry the comment was submitted to. </permalink>
 :      <comment_type> May be blank, comment, trackback, pingback, or a made up value like "registration". </comment_type>
 :      <comment_author> Name submitted with the comment </comment_author>
 :      <comment_author_email> Email address submitted with the comment </comment_author_email>
 :      <comment_author_url> URL submitted with comment </comment_author_url>
 :      <comment_content> The content that was submitted. </comment_content>
 :  </comment>
 :
 : @return true() or false() indicating if the comment is spam or not
 :)
declare function akismet:comment-check($api-key as xs:string, $comment as element(akismet:comment)) as xs:boolean? {
    let $http-request :=
        <http:request href="{akismet:_get-service-uri($api-key, $akismet:comment-check-service)}" method="post" http="1.0" override-media-type="text/plain">
            <http:header name="User-Agent" value="eXist-db/1.5 | Hermes/0.2"/>
            <http:body media-type="application/x-www-form-urlencoded">{
                akismet:_params-xml-to-form-urlencoded($comment)}</http:body>
        </http:request>
    return
        let $http-result := http:send-request($http-request) return
            if(xs:integer($http-result[1]/http:response/@status) eq $akismet:HTTP-OK)then
                (: the response body is the plain text "true" or "false" :)
                let $akismet-result := $http-result[2] return
                    $akismet-result eq "true"
            else
                fn:error(xs:QName("akismet:error"), fn:concat("Akismet service responded with http code: ", $http-result[1]/http:response/@status))
};

(:~
 : Calls the Akismet submit spam service
 :
 : @param api-key Your Akismet API key
 : @param spam-comment
 :  <comment xmlns="http://akismet.com/xquery/api">
 :      <blog> The front page or home URL of the instance making the request. For a blog or wiki this would be the front page. Note: Must be a full URI, including http://. </blog> (required)
 :      <user_ip> IP address of the comment submitter. </user_ip> (required)
 :      <user_agent> User agent string of the web browser submitting the comment - typically the HTTP_USER_AGENT cgi variable. Not to be confused with the user agent of your Akismet library. </user_agent> (required)
 :      <referrer> The content of the HTTP_REFERER header should be sent here. </referrer> (note spelling)
 :      <permalink> The permanent location of the entry the comment was submitted to. </permalink>
 :      <comment_type> May be blank, comment, trackback, pingback, or a made up value like "registration". </comment_type>
 :      <comment_author> Name submitted with the comment </comment_author>
 :      <comment_author_email> Email address submitted with the comment </comment_author_email>
 :      <comment_author_url> URL submitted with comment </comment_author_url>
 :      <comment_content> The content that was submitted. </comment_content>
 :  </comment>
 :
 : @return true() or false() indicating if the spam was submitted or not
 :)
declare function akismet:submit-spam($api-key as xs:string, $spam-comment as element(akismet:comment)) as xs:boolean {
    let $http-request :=
        <http:request href="{akismet:_get-service-uri($api-key, $akismet:submit-spam-service)}" method="post" http="1.0" override-media-type="text/plain">
            <http:header name="User-Agent" value="eXist-db/1.5 | Hermes/0.2"/>
            <http:body media-type="application/x-www-form-urlencoded">{
                akismet:_params-xml-to-form-urlencoded($spam-comment)}</http:body>
        </http:request>
    return
        let $http-result := http:send-request($http-request) return
            (: cast the status to an integer before comparing, as in comment-check :)
            xs:integer($http-result[1]/http:response/@status) eq $akismet:HTTP-OK
};

(:~
 : Calls the Akismet submit ham service
 :
 : @param api-key Your Akismet API key
 : @param ham-comment
 :  <comment xmlns="http://akismet.com/xquery/api">
 :      <blog> The front page or home URL of the instance making the request. For a blog or wiki this would be the front page. Note: Must be a full URI, including http://. </blog> (required)
 :      <user_ip> IP address of the comment submitter. </user_ip> (required)
 :      <user_agent> User agent string of the web browser submitting the comment - typically the HTTP_USER_AGENT cgi variable. Not to be confused with the user agent of your Akismet library. </user_agent> (required)
 :      <referrer> The content of the HTTP_REFERER header should be sent here. </referrer> (note spelling)
 :      <permalink> The permanent location of the entry the comment was submitted to. </permalink>
 :      <comment_type> May be blank, comment, trackback, pingback, or a made up value like "registration". </comment_type>
 :      <comment_author> Name submitted with the comment </comment_author>
 :      <comment_author_email> Email address submitted with the comment </comment_author_email>
 :      <comment_author_url> URL submitted with comment </comment_author_url>
 :      <comment_content> The content that was submitted. </comment_content>
 :  </comment>
 :
 : @return true() or false() indicating if the ham was submitted or not
 :)
declare function akismet:submit-ham($api-key as xs:string, $ham-comment as element(akismet:comment)) as xs:boolean {
    let $http-request :=
        <http:request href="{akismet:_get-service-uri($api-key, $akismet:submit-ham-service)}" method="post" http="1.0" override-media-type="text/plain">
            <http:header name="User-Agent" value="eXist-db/1.5 | Hermes/0.2"/>
            <http:body media-type="application/x-www-form-urlencoded">{
                akismet:_params-xml-to-form-urlencoded($ham-comment)}</http:body>
        </http:request>
    return
        let $http-result := http:send-request($http-request) return
            (: cast the status to an integer before comparing, as in comment-check :)
            xs:integer($http-result[1]/http:response/@status) eq $akismet:HTTP-OK
};

(: Builds the service URI: http://{api-key}.{endpoint}/{service} :)
declare function akismet:_get-service-uri($api-key as xs:string, $service as xs:string) as xs:string {
    fn:concat("http://", $api-key, ".", $akismet:endpoint, "/", $service)
};

(: Serialises the child elements of $params as name=value pairs joined by ampersands :)
declare function akismet:_params-xml-to-form-urlencoded($params as element()) as xs:string {
    fn:string-join(
        for $param in $params/child::element() return
            fn:concat(fn:local-name($param), "=", fn:encode-for-uri($param/text()))
        ,
        "&amp;"
    )
};
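To make the request body more concrete, the following sketch runs the same name=value encoding as akismet:_params-xml-to-form-urlencoded on a small sample comment. The logic is copied into a local function so that it can be tried without importing the module or contacting any service; the blog URL, IP address and comment text are made-up sample values.

```xquery
xquery version "1.0";

declare namespace akismet = "http://akismet.com/xquery/api";

(: Copy of the module's encoding logic, kept local so the query is standalone :)
declare function local:to-form-urlencoded($params as element()) as xs:string {
    fn:string-join(
        for $param in $params/child::element()
        return
            fn:concat(fn:local-name($param), "=", fn:encode-for-uri($param/text())),
        "&amp;"
    )
};

(: Sample values only; a real comment would carry all the required fields :)
local:to-form-urlencoded(
    <akismet:comment>
        <akismet:blog>http://www.example.org/blog</akismet:blog>
        <akismet:user_ip>192.0.2.1</akismet:user_ip>
        <akismet:comment_content>Hello World</akismet:comment_content>
    </akismet:comment>
)
```

This should return the string blog=http%3A%2F%2Fwww.example.org%2Fblog&user_ip=192.0.2.1&comment_content=Hello%20World. Note that fn:encode-for-uri escapes a space as %20 rather than the + that HTML forms typically produce.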

And so far so good: since switching reCaptcha for Asirra and adding TypePad AntiSpam filtering, I haven't received any spam comments. But, now that I have written this...

Asirra in XQuery

Previously I wrote an XQuery Module for handling reCaptcha Captchas, as I wanted to protect my personal blog from being spammed.

Unfortunately in the long term reCaptcha did not really work out, as the Spammers were still posting to the comments section of my blog. It's a shame really, as I agree with reCaptcha's efforts of digitising books.

I have read several articles about reCaptcha Captchas being cracked, so I decided to try and find a more robot proof approach. After a little Googling, I found Asirra.

Asirra is another Captcha system, but rather than asking you to compute a sum or enter the words that appear in a deformed image, it instead shows you 12 pictures, some of Cats and some of Dogs. You have to correctly select all the Cats. This seems to me like a harder problem to solve with a robot, and so I decided to replace my reCaptcha with Asirra.

I wrote a small reusable XQuery module (downloadable from here), which makes use of the EXPath HttpClient functions, so whilst this will work on eXist-db, it should also be useable on any XQuery processor that supports EXPath.

Example (X)HTML Page (example.html)

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>Asirra Example</title>
    </head>
    <body>
        <form action="example.xql" method="post" id="commentform" onsubmit="return MySubmitForm();">

            <!-- start Client API Asirra code -->
            <div id="asirra_auth">
                <a id="asirra_logo" href="http://research.microsoft.com/en-us/um/redmond/projects/asirra/">
                    <img src="http://research.microsoft.com/en-us/um/redmond/projects/asirra/AsirraLogoWithName-Medium.png"/>
                </a>
                <script type="text/javascript" src="http://challenge.asirra.com/js/AsirraClientSide.js"/>
                <script type="text/javascript">
                <![CDATA[
                    // You can control where the big version of the photos appear by
                    // changing this to top, bottom, left, or right
                    asirraState.SetEnlargedPosition("top");

                    // You can control the aspect ratio of the box by changing this constant
                    asirraState.SetCellsPerRow(6);
                ]]>
                </script>
                <script type="text/javascript">
                <![CDATA[
                    var passThroughFormSubmit = false;

                    function MySubmitForm() {
                        if(passThroughFormSubmit) {
                            return true;
                        }

                        // Do site-specific form validation here, then...
                        Asirra_CheckIfHuman(HumanCheckComplete);
                        return false;
                    }

                    function HumanCheckComplete(isHuman) {
                        if(!isHuman) {
                            alert("Please correctly identify the cats.");
                        } else {
                            passThroughFormSubmit = true;
                            var formElt = document.getElementById("commentform");
                            formElt.submit();
                        }
                    }
                ]]>
                </script>
            </div>
            <!-- end Client API Asirra code -->

            <input type="submit"/>
        </form>
    </body>
</html>

Example XQuery handler (example.xql)

xquery version "1.0";

import module namespace request = "http://exist-db.org/xquery/request";
import module namespace asirra = "http://asirra.com/xquery/api" at "xmldb:exist:///db/asirra.xqm";

asirra:validate-ticket(request:get-parameter("Asirra_Ticket",()))

Asirra XQuery Module (asirra.xqm)

xquery version "1.0";

(:~
 : XQuery Module implementation for the Asirra API - http://research.microsoft.com/en-us/um/redmond/projects/asirra/
 :
 : @author Adam Retter <adam@exist-db.org>
 : @date 2011-06-24T21:26:00+02:00
 :)
module namespace asirra = "http://asirra.com/xquery/api";

import module namespace http = "http://expath.org/ns/http-client";

declare variable $asirra:HTTP-OK := 200;

declare variable $asirra:validation-endpoint := "http://challenge.asirra.com/cgi/Asirra?action=ValidateTicket&amp;ticket=";

(:~
 : Validate an Asirra Ticket
 :
 : @param $asirra-ticket The Asirra ticket to validate
 :
 : @return true() or false() indicating whether the ticket was valid
 :)
declare function asirra:validate-ticket($asirra-ticket as xs:string) as xs:boolean {
    let $url := fn:concat($asirra:validation-endpoint, $asirra-ticket) return
        let $http-result := http:send-request(<http:request href="{$url}" method="get"/>) return
            if(xs:integer($http-result/http:response/@status) eq $asirra:HTTP-OK)then
                let $asirra-result := $http-result[2] return
                    $asirra-result/AsirraValidation/Result eq "Pass"
            else
                false()
};
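The check on the validation response can be tried in isolation, without calling the Asirra service. The sketch below is only an illustration: the sample documents are assumptions whose shape is inferred from the AsirraValidation/Result path used in the module, "Fail" is a made-up non-passing value, and local:is-pass is a hypothetical helper. It uses a general comparison (=) rather than eq so that a response with no Result element yields false() instead of an empty sequence.

```xquery
xquery version "1.0";

(: Hypothetical helper: true() only when the response contains
   /AsirraValidation/Result with the value "Pass". :)
declare function local:is-pass($response as document-node()) as xs:boolean {
    $response/AsirraValidation/Result = "Pass"
};

(
    (: assumed shape of a passing response :)
    local:is-pass(document { <AsirraValidation><Result>Pass</Result></AsirraValidation> }),
    (: assumed shape of a failing response :)
    local:is-pass(document { <AsirraValidation><Result>Fail</Result></AsirraValidation> }),
    (: a response without any Result element :)
    local:is-pass(document { <AsirraValidation/> })
)
```

Evaluating this returns true, false, false.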

Pre-release 1.4.1 rev14769

Today the development team released another pre-release version of eXist-db, rev14769. It contains a number of backports of "trunk". Highlights:

  • bugfix: NPE when serialization options param was zero length string. Port of rev 14690
  • performance: Faster sequence constructors in XQuery: old code parsed (1, 2, 3) into (1, (2, 3)). Processing this recursively eventually caused a stack overflow and was slow. Port of rev 13874, rev 13875
  • bugfix: Local XMLDB API set permissions on the wrong collection - looks like this is an old bug. Port of rev 14735
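As a side note on the sequence constructor item above: XQuery sequences never nest, so (1, 2, 3) and (1, (2, 3)) denote the same value, and the change described there concerns how the expression is parsed and evaluated rather than its result. A small sketch to check this on any XQuery processor:

```xquery
xquery version "1.0";

(
    (: nested sequences are flattened, so both forms are deep-equal :)
    fn:deep-equal((1, 2, 3), (1, (2, 3))),
    (: and the nested form still contains three items :)
    fn:count((1, (2, 3)))
)
```

This returns true followed by 3.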

The revision can be downloaded as an installer jar, exe and as a war file. Please share your experiences (bug reports, general feedback) on the exist-open mailing list so we can release a final version soon!