New Full Text Index is Based on Lucene

The current SVN trunk version of eXist features a new full text indexing module that could become the foundation for a faster, more configurable and feature-rich alternative to eXist's built-in full text index. The new search facility in AtomicWiki (check the Quick Search box to the right) is based on it, so you can see the index in action right away by running a search here.

Eclipse Plugin for eXist

Eclipse is one of the most popular development platforms, so we looked for a convenient way to access the eXist database directly from it. The result is the eXist Eclipse Plugin.

eXist 1.2.4 Released

Besides fixing critical bugs in the storage backend, the 1.2.4 release mainly reduces the memory consumption of queries on large document sets. Major changes include:

  • new node set implementation that is much more memory efficient than previous approaches. The old implementation consumed a lot of memory when used with larger sets of documents, which hurt overall performance.
  • reduced memory consumption of documents constructed during a query: if you have a query which creates thousands of small XML fragments, each of those fragments used to have its own document context with its own name pool and various fields which may never have been needed. Large parts of the document context are now shared between fragments and we make more use of lazy initialization, which reduces the memory consumption of in-memory fragments dramatically (in my tests, this saved up to 100 MB of memory when creating a few thousand XML fragments in one query).
  • fixed fatal B-tree bugs leading to index corruption (which usually caused an ArrayIndexOutOfBoundsException). The bugs were more likely to occur when indexing large string keys, but they may also have happened in other situations. The failure damaged the index and rendered the db unusable (though it could be repaired).
  • fixed concurrency issues leading to an ArrayIndexOutOfBoundsException or NoSuchElementException when querying for attributes
  • fixed a memory leak: we observed that the Xerces XML parser builds some internal data structures when validating a document, and these are not properly cleared afterwards. This is a major problem since eXist pools the XML parser instances. To work around this, eXist no longer pools XML parsers which were used on larger documents.
  • fixed an endless loop that caused eXist to hang when full text and ngram indexes were used at the same time
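
The parser pool workaround from the list above can be illustrated with a minimal sketch. This is not eXist's actual code (eXist is written in Java; Python is used here only for brevity). The class names, the `release` signature and the size threshold are all hypothetical, since the release notes do not say what counts as a "larger" document.

```python
from collections import deque

# Hypothetical threshold: the release notes do not state the real limit.
LARGE_DOCUMENT_BYTES = 10 * 1024 * 1024


class Parser:
    """Stand-in for a validating XML parser that accumulates internal state."""

    def __init__(self):
        self.documents_parsed = 0

    def parse(self, data: bytes) -> int:
        self.documents_parsed += 1
        return len(data)


class ParserPool:
    """Reuses parsers, but discards any parser that handled a large document."""

    def __init__(self, large_threshold: int = LARGE_DOCUMENT_BYTES):
        self._idle = deque()
        self._large_threshold = large_threshold

    def borrow(self) -> Parser:
        # Reuse an idle parser if there is one, otherwise create a new one.
        return self._idle.pop() if self._idle else Parser()

    def release(self, parser: Parser, document_size: int) -> None:
        if document_size > self._large_threshold:
            # Do not pool it: dropping the reference lets the garbage
            # collector reclaim whatever the parser built up internally.
            return
        self._idle.append(parser)

    def idle_count(self) -> int:
        return len(self._idle)
```

The trade-off is that large documents always pay the cost of creating a fresh parser, while small documents still benefit from pooling.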

The release is now available for download.

Note: all releases in the 1.2 branch are bug fix releases and can be considered stable. They only contain hand-selected changes which were ported back from the main development version.

Warning: Bad Memory Settings in 1.2.2 and 1.2.3

As reported by users, the 1.2.2 and 1.2.3 releases shipped with a bad memory configuration: in the main configuration file (conf.xml), the cacheSize parameter was set to 256M:

<db-connection cacheSize="256M" collectionCache="24M" database="native" files="webapp/WEB-INF/data" pageSize="4096">

However, Java is started with a maximum of only 128M of memory, so using 256M for caches will sooner or later cause eXist to run out of memory.

The problem here is that the effects of an OutOfMemory error are somewhat unpredictable and may lead to unnoticed corruption in the database. Java doesn't give much warning before it runs out of memory. All you usually get is a message on stderr.

In general, the cacheSize parameter in conf.xml should never be set to more than 1/3 of the maximum memory available to Java. Please adjust cacheSize accordingly or increase Java's max memory (usually set through the -Xmx parameter, which has to be passed on the java command line; see bin/functions.d/eXist-settings.sh or bin/startup.bat).
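
As a rough illustration of the one-third rule, the following sketch checks a cacheSize value against a given -Xmx value. The helper names are hypothetical and not part of eXist, and the accepted size syntax (a number with an optional K, M or G suffix) is an assumption made for this example.

```python
import re

# Assumed size suffixes; eXist's own parsing rules may differ.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a size such as '256M' or '128m' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG])?", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS.get(unit, 1)


def max_cache_size(xmx: str) -> int:
    """Largest cacheSize, in bytes, allowed by the one-third rule."""
    return parse_size(xmx) // 3


def cache_size_ok(cache_size: str, xmx: str) -> bool:
    """True if cache_size is at most one third of the Java max memory."""
    return parse_size(cache_size) <= max_cache_size(xmx)


if __name__ == "__main__":
    mb = 1024 ** 2
    # The settings shipped with 1.2.2 and 1.2.3: cacheSize=256M, -Xmx128M.
    print(cache_size_ok("256M", "128M"))   # False
    print(max_cache_size("128M") // mb)    # 42
```

With the default 128M of Java memory, the rule therefore allows a cacheSize of roughly 42M at most; keeping cacheSize="256M" would require raising -Xmx to at least 768M.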

Installer Issues

We still had some issues with the installer in the 1.2.2 release. This was a major problem for some users who redistribute eXist with their own application. Version 1.2.3 has been uploaded to solve those issues.

It also fixes the consistency checker, which was introduced with 1.2.1 and unfortunately triggered a false alarm in some cases.

Updating is not really necessary unless you had problems with the installer or rely on the consistency check service. The 1.2 branch is maintained separately from the development branch. This allows us to release selected bug fixes and improvements much more frequently.