Outerthought is now NGData

We’re very happy to announce the release of Lily 1.2. It contains a number of improvements and exciting new features, both on the open source and the enterprise side of things.

Lily Core 1.2

First of all, Lily has been upgraded and verified to be compatible with CDH3u3. We’re also closely tracking the evolution of CDH4 and are confident that we will be among the first to ship a CDH4-compliant version later this year.

The two most important new features of Lily Core 1.2 improve support for HBase-style operations: Scanners and Map/Reduce.

Lily Scanners

Similar to HBase Scanners, Lily Scanners allow you to scan over records, making use of the natural ordering of records by their row key or record ID. As you might know, Lily supports both system-generated and user-specified IDs, and experience has taught us that the latter are commonly used. With user-specified IDs, you can influence the storage order of records in Lily, and Lily Scanners then let you iterate over them by ID or by range of IDs (start/stop). Because the scan follows the stored key order, it behaves like an indexed lookup, i.e. it is fast.
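To illustrate why key design matters here, below is a minimal conceptual sketch in plain Java. It does not use the Lily API: a sorted `TreeMap` stands in for the repository, and the class name, method names and ID format are all made up for the example. It only shows the general idea of a start/stop scan over keys that are stored in sorted order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Conceptual sketch only: a sorted in-memory map stands in for the record store.
// None of these names are part of the Lily API.
public class OrderedScanSketch {

    // Build a small store keyed by hypothetical user-specified IDs.
    // The customer number comes first, so one customer's records sort together.
    static NavigableMap<String, String> sampleStore() {
        NavigableMap<String, String> store = new TreeMap<>();
        store.put("customer-0002.order-2012-01-15", "order C");
        store.put("customer-0001.order-2012-04-02", "order B");
        store.put("customer-0003.order-2012-02-20", "order D");
        store.put("customer-0001.order-2012-03-30", "order A");
        return store;
    }

    // Return the IDs in the range [startId, stopId), in key order.
    // subMap() jumps straight to startId instead of visiting every entry.
    static List<String> scan(NavigableMap<String, String> store,
                             String startId, String stopId) {
        return new ArrayList<>(store.subMap(startId, true, stopId, false).keySet());
    }

    public static void main(String[] args) {
        // Scan all records of customer 0001 by choosing start and stop IDs
        // that bracket that customer's key prefix.
        for (String id : scan(sampleStore(), "customer-0001.", "customer-0002.")) {
            System.out.println(id);
        }
    }
}
```

Running `main` prints the two `customer-0001` IDs in key order. Had the IDs started with the order date instead, the same customer's records would be scattered across the key space and a single start/stop range could not select them.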

Even better, Scanners support filters that select records based on filter expressions. This means that if you’re smart about key design, you might no longer need a separate search index on Lily content (such as Solr) for simple search and retrieval operations.

You can return only a subset of record fields while scanning, and filter based on RecordType and record ID prefix. Filters also make it possible to rebuild a (Solr) index for only part of a data set.

Lily 1.2 ships with a CLI tool for running simple scans from the command line.

Lily Map/Reduce

Even though Map/Reduce was already being used for scalable batch rebuilding of search indexes, it wasn’t generally available to people who wanted to process Lily data in map-reduce style for other purposes.

With the 1.2 release, Lily supports a LilyScanInputFormat which allows you to use Scanners (and filters) to specify the input to a Lily-based map/reduce job. Lily ships with Maven artefacts for generating a skeleton M/R job, too.

Lily testing framework

If you use the Lily test framework to develop Lily client applications, you will find the startup and tear-down times of the ‘Lily Embed’ tool to be much shorter. Having been software engineers ourselves for a long time, we care a lot about the engineer’s experience when developing against Lily. With this improvement, building and testing on a Lily development workbench running on a laptop or workstation has become much more efficient.

Lily Enterprise 1.2

The 1.2 release of Lily Enterprise contains two major improvements: a better, more stable cluster installer, and support for Pentaho Kettle for ETL operations.

Pentaho Data Integration support

Lily is often used as the central data aggregation tier in enterprise deployments, and we want to make the task of bringing data into Lily as easy as possible.

Lily Enterprise 1.2 contains a Kettle LilyOutput plugin. Kettle, or Pentaho Data Integration, is a powerful ETL tool that connects to many different data sources. Using a simple, UI-based configuration, the Lily Kettle integration allows you to effortlessly, and most importantly without any coding, set up a transformation and loading pipeline to inject data into the Lily Data Repository from a variety of sources, such as relational databases, static files, and even enterprise back-end systems.


Starting with this release, we are committing to regular releases of Lily Core and Enterprise. We will have three major releases per year, and Lily Enterprise customers will have access to the release schedule and planned features.

Thanks for reading this far, and we hope you will enjoy using Lily 1.2 as much as we enjoyed building it.

categories: lily news release
by Steven Noels on 4/19/12