<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6666528949054177156</id><updated>2012-01-27T21:58:41.595-08:00</updated><category term='advanced search'/><category term='bookmarkable URL'/><category term='intelligent web'/><category term='hibernate'/><category term='content intelligence'/><category term='EntityManager'/><category term='jsf'/><category term='openei'/><category term='collaborative filtering'/><category term='seam'/><category term='facebook connect'/><category term='ajax'/><category term='jboss seam'/><category term='term highlighting'/><category term='apache mahout'/><category term='spring security'/><category term='lucene'/><category term='lucene 2.9'/><category term='type-ahead suggestion'/><category term='recommendation engine'/><category term='mapreduce'/><category term='hadoop'/><category term='FullTextEntityManager'/><category term='facelets'/><category term='spring'/><category term='social graph'/><category term='richfaces'/><category term='nsrdb'/><category term='ci'/><category term='social plug-in'/><category term='server-side paging'/><category term='hibernate search'/><category term='distributedcache'/><category term='amazon elastic mapreduce'/><category term='mahout'/><category term='spell correction'/><title type='text'>thelabdude</title><subtitle type='html'>Knowledgebase about Seam, Spring, JSF / RichFaces, Lucene / Solr, Hibernate, and Collective Intelligence.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://thelabdude.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/6666528949054177156/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://thelabdude.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>thelabdude</name><uri>http://www.blogger.com/profile/08032248788175889303</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>8</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6666528949054177156.post-3069797037270423457</id><published>2011-07-08T16:57:00.000-07:00</published><updated>2011-07-26T11:38:47.207-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='openei'/><category scheme='http://www.blogger.com/atom/ns#' term='amazon elastic mapreduce'/><category scheme='http://www.blogger.com/atom/ns#' term='nsrdb'/><category scheme='http://www.blogger.com/atom/ns#' term='mapreduce'/><category scheme='http://www.blogger.com/atom/ns#' term='distributedcache'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop'/><title type='text'>Analyzing the National Solar Radiation Database (NSRDB) with Hadoop</title><content type='html'>&lt;style&gt;
.todo {
  font:bold 14px verdana;
  color:#FF0000;
  padding: 15px;
}
.annotation {
  font:normal 12px monospace;
  color:#0033CC;
}
.seam-comp {
  font:bold 12px monospace;
  color:#660099;
}
.sectionHdr {
  font-size:15px;
  font-weight:bold;
  color:#CC6633;
  margin-top:20px;
  margin-bottom:15px;
}
.subHdr {
  font-size:13px;
  font-weight:bold;
  color:#CC6633;
  margin-top:15px;
  margin-bottom:10px;
}
.sub2Hdr {
  font-size:12px;
  font-weight:bold;
  font-style:italic;
  color:#CC6633;
  margin-top:10px;
  margin-bottom:5px;
}
.file-path {
  color:#336600;
}
.code {
  font:normal 12px monospace;
  color:#333333;
}
.tip {
  padding: 5px;
  background: #FFFFCC;
  border: 1px solid #F0F0F0;
  width: 600px;
  margin:20px;
}
.java-class {
  font:bold 12px monospace;
  color:#003399;
}
.java-intf {
  font:italic 12px monospace;
  color:#003399;
}
.java-kw {
  font:normal 12px monospace;
  color:#0000FF;
}
.java-id {
  font:normal 12px monospace;
  color:#FF0000;
}
.java-str {
  font:normal 12px monospace;
  color:#FF00CC;
}
.java-cmt {
  font:normal 12px monospace;
  color:#339900;
}
.el {
  font:normal 12px monospace;
  color:#990000;
}
.xml-elm {
  font:normal 12px monospace;
  color:#999933;
}
.hashLink {
  font-size:11px;
  font-style:italic;
}
.screenshot {
  margin: 20px;
}
td {
font-family:verdana;font-size:12px;
}
&lt;/style&gt; &lt;span style="font-family:verdana;font-size:12px;"&gt; &lt;div class="sectionHdr"&gt;Introduction&lt;/div&gt;&lt;p&gt;In this posting, I provide a basic framework and some ideas for analyzing the National Solar Radiation Data Base (NSRDB) provided by OpenEI using MapReduce, Hadoop, and the Amazon cloud. First, a brief introduction to OpenEI is in order; from the &lt;a href="http://en.openei.org/wiki/Main_Page" target="_blank"&gt;openei.org&lt;/a&gt; Web site:  &lt;div class="tip"&gt;Open Energy Information (OpenEI) is a knowledge-sharing online community dedicated to connecting people with the latest information and data on energy resources from around the world. Created in partnership with the United States Department of Energy and federal laboratories across the nation, OpenEI offers access to real-time data and unique visualizations that will help you find the answers you need to make better, more informed decisions. And because the site is designed on a linked, open-data platform, your contributions are both welcomed and encouraged.&lt;/div&gt;Sounds pretty cool ... let's roll up our sleeves and see what we can do with this data using Hadoop. If you need a refresher on Hadoop, have a look at &lt;a href="http://developer.yahoo.com/hadoop/tutorial/module1.html" target="_blank"&gt;Yahoo's Hadoop Tutorial&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;The current NSRDB data set contains more than 191 million observations collected from solar stations around the US from 1991 to 2005. Of course, this data set is tiny by Hadoop standards, but it is only a subset of the available solar data. Also, there are other &lt;b&gt;larger&lt;/b&gt; solar-related data sets to which the approach described in this posting applies equally well: distributing expensive computations across large numbers of commodity servers using Hadoop. Moreover, this data can be joined with other data sets and analyzed using various Hadoop sub-projects, such as Hive or Pig. 
So this posting is mostly about how to get OpenEI data into a format that can be analyzed using MapReduce jobs (and other tools, e.g. Hive or Pig) in a Hadoop compute cluster. Specifically, I provide examples with source code for the following topics: &lt;ul&gt;&lt;li&gt;File size considerations for storing data in HDFS and S3&lt;/li&gt;
&lt;li&gt;Loading static metadata into Hadoop's DistributedCache, and the special handling needed to make this work with S3 and Elastic MapReduce&lt;/li&gt;
&lt;li&gt;Running a MapReduce job locally in pseudo-distributed mode and on Amazon's Elastic MapReduce&lt;/li&gt;
&lt;li&gt;Using MRUnit to unit test the MapReduce jobs before deploying to Hadoop&lt;/li&gt;
&lt;li&gt;Using Hadoop Counters to help you diagnose unexpected job results and make writing unit tests easier&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;&lt;div class="sectionHdr"&gt;Building the Application&lt;/div&gt;&lt;p&gt;First, you need to download the source code from &lt;a href="https://github.com/thelabdude/nsrdb" target="_blank"&gt;github/thelabdude/nsrdb&lt;/a&gt; &lt;i&gt;(note: I don't reproduce all the source code in this blog post, so you'll have to download it to get the full details)&lt;/i&gt;. Next, build the project with Maven 2.2.x using: &lt;div class="code"&gt;&lt;pre&gt;mvn clean package
&lt;/pre&gt;&lt;/div&gt;I use the Maven &lt;a href="http://maven.apache.org/plugins/maven-assembly-plugin/" target="_blank"&gt;assembly plug-in&lt;/a&gt; to create an executable JAR with all dependencies (nsrdb-0.0.2-jar-with-dependencies.jar) and the Hadoop Job JAR with dependencies (nsrdb-0.0.2-hadoop-jobs.jar).&lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;  &amp;lt;build&amp;gt;
    &amp;lt;plugins&amp;gt;
      &amp;lt;plugin&amp;gt;
        &amp;lt;artifactId&amp;gt;maven-assembly-plugin&amp;lt;/artifactId&amp;gt;
        &amp;lt;version&amp;gt;2.2&amp;lt;/version&amp;gt;
        &amp;lt;executions&amp;gt;
          &amp;lt;execution&amp;gt;
            &amp;lt;id&amp;gt;make-exe-jar&amp;lt;/id&amp;gt;
            &amp;lt;phase&amp;gt;package&amp;lt;/phase&amp;gt;
            &amp;lt;goals&amp;gt;
              &amp;lt;goal&amp;gt;single&amp;lt;/goal&amp;gt;
            &amp;lt;/goals&amp;gt;
            &amp;lt;configuration&amp;gt;
              &amp;lt;descriptorRefs&amp;gt;
                &amp;lt;descriptorRef&amp;gt;jar-with-dependencies&amp;lt;/descriptorRef&amp;gt;
              &amp;lt;/descriptorRefs&amp;gt;
              &amp;lt;archive&amp;gt;
                &amp;lt;manifest&amp;gt;
                  &amp;lt;mainClass&amp;gt;thelabdude.nsrdb.GetData&amp;lt;/mainClass&amp;gt;
                &amp;lt;/manifest&amp;gt;
              &amp;lt;/archive&amp;gt;
            &amp;lt;/configuration&amp;gt;
          &amp;lt;/execution&amp;gt;
          &amp;lt;execution&amp;gt;
            &amp;lt;id&amp;gt;make-jobs-jar&amp;lt;/id&amp;gt;
            &amp;lt;phase&amp;gt;package&amp;lt;/phase&amp;gt;
            &amp;lt;goals&amp;gt;
              &amp;lt;goal&amp;gt;single&amp;lt;/goal&amp;gt;
            &amp;lt;/goals&amp;gt;
            &amp;lt;configuration&amp;gt;
              &amp;lt;descriptors&amp;gt;
                &amp;lt;descriptor&amp;gt;src/main/assembly/hadoop-jobs-jar.xml&amp;lt;/descriptor&amp;gt;
              &amp;lt;/descriptors&amp;gt;
            &amp;lt;/configuration&amp;gt;
          &amp;lt;/execution&amp;gt;
        &amp;lt;/executions&amp;gt;
      &amp;lt;/plugin&amp;gt;
    &amp;lt;/plugins&amp;gt;
  &amp;lt;/build&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;/p&gt;&lt;div class="sectionHdr"&gt;Download the Data&lt;/div&gt;&lt;p&gt;Next, we need to download the data from the OpenEI.org site using some good ol' screen scraping. The data is contained in 1,454 GZipped TAR files (tar.gz) on &lt;a href="http://en.openei.org/datasets/files/39/pub/" target="_blank"&gt;http://en.openei.org/datasets/files/39/pub/&lt;/a&gt;. The process I used is as follows: &lt;ol&gt;&lt;li&gt;Parse the contents of the data listing HTML page to extract links matching a specified regular expression, in this case &lt;span class="el"&gt;^\d.*\.tar\.gz$&lt;/span&gt;.&lt;/li&gt;
&lt;li&gt;Download each data file and save it locally so that I don't have to return to the site; build in a 2-second delay between requests so that the app doesn't overload OpenEI.&lt;/li&gt;
&lt;li&gt;Uncompress each GZipped TAR file; each TAR holds 15 CSV files, one per year of data for a single station.&lt;/li&gt;
&lt;li&gt;Append the contents of each entry in the TAR file into a single Hadoop SequenceFile that is stored in HDFS.&lt;/li&gt;
&lt;/ol&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Motivation for Using a Single Hadoop SequenceFile&lt;/div&gt;My current goal is to store the data in a format that makes writing MapReduce jobs easy. Hadoop works better with a small number of large files that can be split and distributed by HDFS than with a large number of small files. By small, I mean smaller than the HDFS block size (64MB is the default). Since the NSRDB is not actually that big, I chose to merge all data into a single, compressed &lt;a href="http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/io/SequenceFile.html" target="_blank"&gt;SequenceFile&lt;/a&gt;. Internally, Hadoop will split the SequenceFile into &lt;b&gt;replicated blocks&lt;/b&gt; that are distributed around the HDFS cluster. This means that each Map task will work on a subset of data that is most likely stored on the same server that the task is running on (i.e. &lt;a href="http://developer.yahoo.com/hadoop/tutorial/module1.html#data" target="_blank"&gt;data locality&lt;/a&gt;). For example, on Elastic MapReduce, the default block size is 128MB, so Hadoop will run 18 map tasks for our 2.1GB of input data. I should mention that if you store this SequenceFile in S3, then you do not benefit from data locality automatically, so I'll return to this discussion below.&lt;/p&gt;&lt;p&gt;I use a custom class for the key in my SequenceFile: &lt;span class="java-class"&gt;thelabdude.nsrdb.StationYearWritable&lt;/span&gt;. This class holds a &lt;span class="code"&gt;long&lt;/span&gt; station ID and an &lt;span class="code"&gt;int&lt;/span&gt; year and implements Hadoop's &lt;a href="http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/io/Writable.html" target="_blank"&gt;Writable&lt;/a&gt; interface for fast serialization. The value for each key in the file is simply the entire contents of each TAR entry stored in a BytesWritable object. 
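To make the key format concrete, here is a minimal sketch of the write/read round trip that a key like this performs. It is an illustration under stated assumptions, not the project's actual class: &lt;span class="java-class"&gt;StationYearKey&lt;/span&gt; is a hypothetical stand-in that mirrors the method shape of Hadoop's Writable (write and readFields) using only java.io, so it runs without Hadoop on the classpath. &lt;div class="code"&gt;&lt;pre&gt;import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for StationYearWritable (the real class is in the GitHub project).
// It mirrors the write/readFields shape of Hadoop's Writable without depending on Hadoop.
public class StationYearKey {
  private long stationId;
  private int year;

  // No-arg constructor so an empty instance can be created and then filled by readFields.
  public StationYearKey() {
  }

  public StationYearKey(long stationId, int year) {
    this.stationId = stationId;
    this.year = year;
  }

  // Fields are written in a fixed order: 8 bytes for the station ID, 4 bytes for the year.
  public void write(DataOutput out) throws IOException {
    out.writeLong(stationId);
    out.writeInt(year);
  }

  // Fields must be read back in exactly the order they were written.
  public void readFields(DataInput in) throws IOException {
    stationId = in.readLong();
    year = in.readInt();
  }

  public long getStationId() {
    return stationId;
  }

  public int getYear() {
    return year;
  }

  public static void main(String[] args) throws IOException {
    // Serialize a key into an in-memory buffer.
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    new StationYearKey(722010L, 1991).write(new DataOutputStream(buffer));

    // Deserialize it into a fresh instance.
    StationYearKey copy = new StationYearKey();
    copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));

    // prints: 722010 1991 (12 bytes)
    System.out.println(copy.getStationId() + " " + copy.getYear() + " (" + buffer.size() + " bytes)");
  }
}
&lt;/pre&gt;&lt;/div&gt;Only the key gets this fixed-width treatment; the value side stays an opaque array of bytes. 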
In other words, I'm storing the CSV data as one large BLOB per station and year (1,454 stations * 15 years = 21,810 records). We'll save the line-by-line parsing of each CSV file for the MapReduce jobs so we can distribute the processing across a cluster.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;GetData in Action&lt;/div&gt;So now that you understand the basic process I'm using to get the data into Hadoop, let's see how to run it. First, make sure Hadoop is set up in your environment. For now, I recommend using the pseudo-distributed configuration (see &lt;a href="http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html" target="_blank"&gt;Hadoop 0.20.2 QuickStart&lt;/a&gt;). Once Hadoop is running, run the GetData application; its output is saved to the HDFS instance you configured in conf/core-site.xml: &lt;div class="code"&gt;&lt;pre&gt;(java -jar nsrdb-0.0.2-jar-with-dependencies.jar \
  -nsrdbIndexUrl http://en.openei.org/datasets/files/39/pub/ \
  -downloadDir /mnt/openei/nsrdb/input \
  -hadoopDataPath hdfs://localhost:9000/openei/nsrdb/nsrdb.seq &amp;&gt;getdata.log &amp;)
&lt;/pre&gt;&lt;/div&gt;I'll refer you to the source code for &lt;span class="java-class"&gt;thelabdude.nsrdb.GetData&lt;/span&gt; for all the details. This application took about 2 hours to download the 1,454 tar.gz files from OpenEI.org and merge them into a single SequenceFile stored in HDFS (tail the getdata.log file to see the application's progress). While this is slow, you should only have to do this once. When it finishes, you can verify that the file is available in HDFS using: &lt;div class="code"&gt;&lt;pre&gt;$ bin/hadoop fs -lsr /openei/nsrdb
&lt;/pre&gt;&lt;/div&gt;You should see the following output: &lt;div class="code"&gt;&lt;pre&gt;/openei/nsrdb/nsrdb.seq (2346388001 bytes)
&lt;/pre&gt;&lt;/div&gt;&lt;i&gt;Note: Don't delete the raw data files from the download directory just yet as you may want to re-run the GetData application (with -skipDownloadStep) to create more than one SequenceFile if you plan to use Amazon S3 and Elastic MapReduce (see below).&lt;/i&gt; &lt;/p&gt;&lt;div class="sectionHdr"&gt;Analyzing NSRDB using MapReduce&lt;/div&gt;&lt;p&gt;Now that the data is in HDFS, it can be used as input to MapReduce jobs. Let's take a look at a few simple jobs that illustrate how to process the data.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Maximum Radiation Measured by Year&lt;/div&gt;Which station is the sunniest in the winter months of each year? While this probably doesn't mean much to energy researchers, and I'm using the term "sunniest" very loosely, it does give us a simple example to exercise the NSRDB data using MapReduce. Here's how to run the job in your local Hadoop: &lt;div class="code"&gt;&lt;pre&gt;bin/hadoop jar nsrdb-0.0.2-hadoop-jobs.jar thelabdude.nsrdb.MaxValueByYearJob \
  /openei/nsrdb/nsrdb.seq /openei/nsrdb/maxValueByYear \
  /home/ubuntu/NSRDB_StationsMeta.csv 1
&lt;/pre&gt;&lt;/div&gt;Notice that you need to pass the location of the NSRDB_StationsMeta.csv file to the job. Download the file from &lt;a href="http://en.openei.org/datasets/files/39/pub/documentation/NSRDB_StationsMeta.csv" target="_blank"&gt;OpenEI&lt;/a&gt; and save it to the local filesystem where you run the job. The job uses Hadoop's DistributedCache to replicate the static station metadata CSV to all nodes in the cluster. Using Hadoop 0.20.2 in pseudo-distributed mode on a small Amazon EC2 instance (32-bit Ubuntu), the job finished in about 50 minutes (of course, we'll improve on this time in a real cluster). Once it completes, look at the results using: &lt;div class="code"&gt;&lt;pre&gt;hadoop fs -text /openei/nsrdb/maxValueByYear/part-r-00000
&lt;/pre&gt;&lt;/div&gt;Let's look at the Map and Reduce steps in detail; the MapReduce job flow looks like this:&lt;br /&gt;
&lt;b&gt;Map(k1,v1) &amp;rarr; list(k2,v2)&lt;/b&gt; where k1 is Station ID + Year, v1 is a CSV file stored as an array of bytes, k2 is a year, such as 2001, and v2 is the station ID and the sum of a solar radiation field. Using the new Hadoop API, the mapper class is defined as:&lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;static class MaxValueByYearMapper
  extends Mapper&amp;lt;StationYearWritable, BytesWritable, IntWritable, StationDataWritable&amp;gt; {
  ...
}
&lt;/pre&gt;&lt;/div&gt;&lt;b&gt;Reduce(k2, list(v2)) &amp;rarr; list(v3)&lt;/b&gt; where k2 and v2 are the same as what was output from the mapper and v3 is metadata for the station that measured the most radiation in that year. All v2 values for the same key k2 are grouped by Hadoop during the shuffle and sent to the same reducer. The reducer class is defined as:&lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;static class MaxValueByYearReducer
  extends Reducer&amp;lt;IntWritable, StationDataWritable, IntWritable, Text&amp;gt; {
  ...
}
&lt;/pre&gt;&lt;/div&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="sub2Hdr"&gt;Map Step&lt;/div&gt;We need to sum up all the radiation measured by each station during the winter each year. In the end, we will have 15 stations identified in our reduce output (one for each year in our dataset). Recall that keys in our SequenceFile are &lt;b&gt;StationYearWritable&lt;/b&gt; objects and values are &lt;b&gt;BytesWritable&lt;/b&gt; objects. Thus, the map function signature is: &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="java-kw"&gt;public void&lt;/span&gt; map(&lt;b&gt;StationYearWritable&lt;/b&gt; key, &lt;b&gt;BytesWritable&lt;/b&gt; value, Context context)
        &lt;span class="java-kw"&gt;throws&lt;/span&gt; IOException, InterruptedException
&lt;/pre&gt;&lt;/div&gt;Now is the time to process the CSV data line-by-line, since presumably our Map task is distributed in a Hadoop cluster. Before processing the CSV data for a station, the Map step consults the station metadata stored in Hadoop's DistributedCache to see if the station is of interest to this job. For this job, I'm only interested in class 1 or 2 stations in the lower 48 US states.&lt;/p&gt;&lt;p&gt;Next, the Map step parses each line from the stored CSV file.&lt;br /&gt;
&lt;b&gt;IMPORTANT&lt;/b&gt;: Be careful when reading from BytesWritable objects because the byte[] returned from getBytes() may have more bytes than expected. You need to read only up to index getLength()-1:&lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;// getBytes() returns the whole backing array; only the first getLength() bytes are valid data
ByteArrayInputStream bytesIn = new ByteArrayInputStream(value.getBytes(), 0, value.getLength());
reader = new BufferedReader(new InputStreamReader(bytesIn, UTF8));
&lt;/pre&gt;&lt;/div&gt;For this job, I'll sum up the values in the &lt;b&gt;Modeled Global Horizontal&lt;/b&gt; field to find my maximum. I'm also borrowing the idea behind a common MapReduce technique, the Combiner, by doing some of the summarization work in the Map step (sometimes called in-mapper combining); this works because each value coming into the map function contains all the data we need for a given station and year. Combiners reduce the amount of disk and network I/O needed between the Map and Reduce steps by reducing the number of key/value pairs that leave the map side. Hadoop allows you to configure a separate Combiner class for your job, but since the logic for this job is so simple, I can just do the summarization in the mapper.&lt;/p&gt;&lt;p&gt;&lt;div class="sub2Hdr"&gt;A few words about Hadoop's DistributedCache&lt;/div&gt;Hadoop's DistributedCache replicates static data needed by your job at runtime to the nodes in the cluster that run its tasks, which nicely addresses my job's need to get access to station metadata. However, using the DistributedCache can be tricky as the current documentation is not clear on how to set it up correctly, especially when working with S3 and Elastic MapReduce. In the job's run method, I have the following code to add the station metadata CSV file to Hadoop's DistributedCache: &lt;div class="code"&gt;&lt;pre&gt;    URI localUri = stationMetadataLocal.toUri();
    if ("s3".equals(localUri.getScheme()) || "s3n".equals(localUri.getScheme())) {
      copyFromS3File(conf, stationMetadataLocal, stationMetadataHdfs);
    } else {
      FileSystem fs = FileSystem.get(conf);
      fs.copyFromLocalFile(false, true, stationMetadataLocal, stationMetadataHdfs);
    }
    DistributedCache.addCacheFile(stationMetadataHdfs.toUri(), conf);
&lt;/pre&gt;&lt;/div&gt;You'll notice that I have to handle files in S3 differently than files on the local file system where the job is run. Essentially, if your static data is in S3, you need to copy it to HDFS using two separate file systems, S3N and HDFS (see my copyFromS3File method). &lt;/p&gt;&lt;p&gt;&lt;div class="sub2Hdr"&gt;Reduce Step&lt;/div&gt;The reduce step is simple: it receives a list of values per year and sums them for each station. During an internal phase called the "shuffle", the framework collects and sorts the values for each year before invoking the reducer. Once all station data has been summed up, the station with the maximum sum is written to the output. As with the Map step, my Reducer consults the DistributedCache to get station metadata for its output, which is easier to interpret than just a station ID. &lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Group by Year&lt;/div&gt;I've also included a simple job to transform the data to group values by year (see &lt;span class="java-class"&gt;thelabdude.nsrdb.GroupByYearJob&lt;/span&gt;). This example shows that you do not need a Reducer for your job if you just want to transform the input data. The output of this job is useful as input for Hadoop's streaming interface. &lt;/p&gt;&lt;div class="sectionHdr"&gt;Running in the Cloud&lt;/div&gt;&lt;p&gt;Now, let's run this same job on an Amazon Elastic MapReduce cluster. You'll need a few command-line tools to do the work in this section. &lt;ul&gt;&lt;li&gt;s3cmd: &lt;a href="http://s3tools.org/s3cmd" target="_blank"&gt;http://s3tools.org/s3cmd&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;elastic-mapreduce-ruby: &lt;a href="http://aws.amazon.com/developertools/2264?_encoding=UTF8&amp;jiveRedirect=1" target="_blank"&gt;http://aws.amazon.com/developertools/2264?_encoding=UTF8&amp;jiveRedirect=1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;On Debian-based systems, you can do: &lt;div class="code"&gt;&lt;pre&gt;sudo apt-get install s3cmd
sudo apt-get install openssl
sudo apt-get install ruby1.8
sudo apt-get install libopenssl-ruby1.8
sudo apt-get install libruby1.8-extras
wget http://elasticmapreduce.s3.amazonaws.com/elastic-mapreduce-ruby.zip
&lt;/pre&gt;&lt;/div&gt;Once installed, you'll need your Amazon Access Key and Secret Access Key values to configure s3cmd using: &lt;span class="code"&gt;s3cmd --configure&lt;/span&gt;. Follow the instructions for elastic-mapreduce to get that set up as well. If you need a refresher on Amazon access credentials, please read: &lt;a href="http://alestic.com/2009/11/ec2-credentials" target="_blank"&gt;Understanding Access Credentials for AWS/EC2&lt;/a&gt;. &lt;div class="tip"&gt;Be aware of the implications of the region setting in the elastic-mapreduce &lt;b&gt;credentials.json&lt;/b&gt; file as it determines which region your Elastic MapReduce instances launch in. If you store your data in a US Standard S3 bucket, then I recommend launching your EMR instances in the "us-east-1" region as there are no data transfer fees between your EMR instances and S3. On the other hand, if you launch your cluster in the "us-west-1" region, you will incur data transfer fees if your data is in a US Standard S3 bucket, so only use "us-west-1" if your S3 bucket is in Northern California.&lt;/div&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;File Size Revisited for Amazon S3&lt;/div&gt;Previously, I made the statement that Hadoop works better with a small number of large files vs. a large number of small files. While this is still true with EMR, we need to bend this rule slightly when working with files stored in S3, as using only one large file will lead to a major degradation in performance. Your job may actually run slower than it does in pseudo-distributed mode on a single machine since you'll also have Hadoop job management and S3 latency involved. Intuitively, this makes sense because Hadoop sets the number of map tasks based on the number of splits. So no matter how big your cluster is, only one node can process the input from one file on S3. The solution is to split the data into smaller chunks; a reasonable target is roughly twice as many chunks as there are nodes in your cluster. 
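As a back-of-the-envelope check on how many pieces a given size produces, here is a small sketch. It assumes pieces are cut purely by size in megabytes and uses the 2,346,388,001-byte file size reported by the HDFS listing earlier; &lt;span class="java-class"&gt;SplitMath&lt;/span&gt; and piecesNeeded are hypothetical names, and the real tools may round differently. &lt;div class="code"&gt;&lt;pre&gt;public class SplitMath {

  // Ceiling division for positive values: how many fixed-size pieces cover totalBytes.
  static long piecesNeeded(long totalBytes, long pieceBytes) {
    return (totalBytes + pieceBytes - 1) / pieceBytes;
  }

  public static void main(String[] args) {
    long totalBytes = 2346388001L; // size of nsrdb.seq from the HDFS listing
    long mb = 1024L * 1024L;

    // 128MB blocks (the EMR default mentioned earlier): prints 18
    System.out.println(piecesNeeded(totalBytes, 128 * mb));

    // 256MB pieces (an assumed chunk size): prints 9
    System.out.println(piecesNeeded(totalBytes, 256 * mb));
  }
}
&lt;/pre&gt;&lt;/div&gt;The first figure matches the 18 map tasks mentioned earlier for a 128MB block size; the second shows that cutting the same data at 256MB leaves a handful of files rather than one. 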
The GetData tool I developed allows you to specify the number of SequenceFiles to create using the chunkSize parameter. For a cluster with 4 nodes, I used &lt;span class="code"&gt;chunkSize=256&lt;/span&gt; which resulted in 9 SequenceFiles instead of 1. The key take-away here is that now I can have 9 map tasks running in parallel instead of only one.&lt;/p&gt;&lt;p&gt;Copy the SequenceFile(s) created by GetData from your local HDFS to an S3 bucket you have access to using Hadoop's distcp: &lt;div class="code"&gt;&lt;pre&gt;hadoop distcp /openei/nsrdb s3n://ACCESS_KEY:SECRET_KEY@BUCKET[/PREFIX]
&lt;/pre&gt;&lt;/div&gt;Notice that I pass my Amazon ACCESS_KEY and SECRET_KEY in the S3 URI and that I'm using s3n as the protocol (native S3); see &lt;a href="http://wiki.apache.org/hadoop/AmazonS3" target="_blank"&gt;Hadoop and Amazon S3&lt;/a&gt;. The &lt;a href="http://hadoop.apache.org/common/docs/r0.20.2/distcp.html" target="_blank"&gt;distcp&lt;/a&gt; command uses MapReduce to distribute a large copy around your Hadoop cluster. The files copied from HDFS will be the input to your MapReduce jobs on EMR.&lt;/p&gt;&lt;p&gt;Next, you need to put the Jobs JAR file in S3; for this I like to use s3cmd: &lt;div class="code"&gt;&lt;pre&gt;s3cmd put FILE s3://BUCKET[/PREFIX]
&lt;/pre&gt;&lt;/div&gt;For example: &lt;div class="code"&gt;&lt;pre&gt;s3cmd put nsrdb-0.0.2-hadoop-jobs.jar s3://thelabdude/openei/
&lt;/pre&gt;&lt;/div&gt;Now, you're all set for running the job in EMR. The approach I like to take is to start a job flow without any jobs configured (actually, this is pretty much the standard approach when working with EMR). Once the cluster is running, I add job flow steps to it, which is more efficient because if the job fails due to an invalid or missing parameter, I can correct the parameter and re-run the step without waiting for a new cluster to start. To start a job flow, do the following: &lt;div class="code"&gt;&lt;pre&gt;elastic-mapreduce --create --alive \
  --log-uri s3n://BUCKET/openei/logs/ \
  --key-pair YOUR_KEY_PAIR_NAME \
  --slave-instance-type m1.small \
  --master-instance-type m1.small \
  --num-instances 9 \
  --name openei.nsrdb
&lt;/pre&gt;&lt;/div&gt;This will create a new Job Flow named "openei.nsrdb" with 1 master node and 8 data nodes (m1.small instances); it can take a few minutes before the job flow is started. Take note of the Job Flow ID returned (passed as JOB_ID in the commands below), as you will need it to access logs and add job steps. You can add a new "step" (i.e. a Hadoop job) once you have the Job Flow ID: &lt;div class="code"&gt;&lt;pre&gt;elastic-mapreduce --jar s3n://BUCKET/openei/nsrdb-0.0.2-hadoop-jobs.jar \
  --main-class thelabdude.nsrdb.MaxValueByYearJob \
  --arg s3n://BUCKET/openei/nsrdb-chunks \
  --arg s3n://BUCKET/openei/maxValueByYear/ \
  --arg s3n://BUCKET/openei/NSRDB_StationsMeta.csv \
  --arg 8 \
  -j JOB_ID
&lt;/pre&gt;&lt;/div&gt;Notice that I'm now requesting 8 reducers; the Hadoop documentation suggests 0.95 or 1.75 multiplied by (number of nodes * mapred.tasktracker.reduce.tasks.maximum). Also, I'm reading the input data directly from s3n://BUCKET/openei/nsrdb-chunks. If you're going to run multiple jobs on the data in the same EMR JobFlow, then I recommend using distcp to copy the data from S3 to the local HDFS for your Elastic MapReduce cluster using:  &lt;div class="code"&gt;&lt;pre&gt;elastic-mapreduce --jar s3n://elasticmapreduce/samples/distcp/distcp.jar \
  --arg s3n://BUCKET/openei/nsrdb-chunks \
  --arg /openei/ \
  -j JOB_ID
&lt;/pre&gt;&lt;/div&gt;This will help your jobs run much faster because the input data is already loaded into the local HDFS of your EMR cluster (data locality). To view the logs for your Job Flow, simply do: &lt;div class="code"&gt;&lt;pre&gt;elastic-mapreduce --logs -j JOB_ID
&lt;/pre&gt;&lt;/div&gt;With an 8-node cluster of EC2 small instances, the job completed in roughly 8 minutes (after using distcp to copy the data from S3). The output from the job is included here: &lt;div class="code"&gt;&lt;pre&gt;2011-07-08 23:15:45,796 INFO org.apache.hadoop.mapred.JobClient (main): Job complete: job_201107082258_0002
2011-07-08 23:15:45,804 INFO org.apache.hadoop.mapred.JobClient (main): Counters: 19
2011-07-08 23:15:45,804 INFO org.apache.hadoop.mapred.JobClient (main):   Job Counters 
2011-07-08 23:15:45,804 INFO org.apache.hadoop.mapred.JobClient (main):     Launched reduce tasks=8
2011-07-08 23:15:45,804 INFO org.apache.hadoop.mapred.JobClient (main):     Rack-local map tasks=5
2011-07-08 23:15:45,804 INFO org.apache.hadoop.mapred.JobClient (main):     Launched map tasks=19
2011-07-08 23:15:45,804 INFO org.apache.hadoop.mapred.JobClient (main):     Data-local map tasks=14
2011-07-08 23:15:45,804 INFO org.apache.hadoop.mapred.JobClient (main):   thelabdude.nsrdb.MaxValueByYearJob$StationDataCounters
2011-07-08 23:15:45,804 INFO org.apache.hadoop.mapred.JobClient (main):     STATION_IGNORED=10095
2011-07-08 23:15:45,804 INFO org.apache.hadoop.mapred.JobClient (main):   FileSystemCounters
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     FILE_BYTES_READ=131196
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     HDFS_BYTES_READ=2347266412
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     FILE_BYTES_WRITTEN=276357
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     HDFS_BYTES_WRITTEN=1307
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):   Map-Reduce Framework
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     Reduce input groups=15
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     Combine output records=0
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     Map input records=21810
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     Reduce shuffle bytes=141561
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     Reduce output records=15
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     Spilled Records=23430
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     Map output bytes=281160
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     Combine input records=0
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     Map output records=11715
2011-07-08 23:15:45,805 INFO org.apache.hadoop.mapred.JobClient (main):     Reduce input records=11715
&lt;/pre&gt;&lt;/div&gt;Notice that the job ignored 673 stations (thelabdude.nsrdb.MaxValueByYearJob$StationDataCounters.STATION_IGNORED: 10095 / 15 years). The reducer output looks like: &lt;div class="code"&gt;&lt;pre&gt;1991    722016,2,false,MARATHON AIRPORT,FL,24.733,-81.05,2,-5,24.733,-81.05,2,350400.0
1992    722010,1,false,KEY WEST INTL ARPT,FL,24.55,-81.75,1,-5,24.55,-81.75,1,332856.0
1993    722010,1,false,KEY WEST INTL ARPT,FL,24.55,-81.75,1,-5,24.55,-81.75,1,356962.0
1994    722016,2,false,MARATHON AIRPORT,FL,24.733,-81.05,2,-5,24.733,-81.05,2,343829.0
1995    722016,2,false,MARATHON AIRPORT,FL,24.733,-81.05,2,-5,24.733,-81.05,2,354387.0
1996    722010,1,false,KEY WEST INTL ARPT,FL,24.55,-81.75,1,-5,24.55,-81.75,1,377977.0
1997    722010,1,false,KEY WEST INTL ARPT,FL,24.55,-81.75,1,-5,24.55,-81.75,1,354811.0
1998    722695,2,false,LAS CRUCES INTL,NM,32.283,-106.917,1393,-7,32.283,-106.917,1393,354944.0
1999    722735,2,false,DOUGLAS BISBEE-DOUGLAS INTL A,AZ,31.467,-109.6,1249,-7,31.467,-109.6,1249,370871.0
2000    722735,2,false,DOUGLAS BISBEE-DOUGLAS INTL A,AZ,31.467,-109.6,1249,-7,31.467,-109.6,1249,371146.0
2001    722016,2,false,MARATHON AIRPORT,FL,24.733,-81.05,2,-5,24.733,-81.05,2,374057.0
2002    722016,2,false,MARATHON AIRPORT,FL,24.733,-81.05,2,-5,24.733,-81.05,2,354146.0
2003    722016,2,false,MARATHON AIRPORT,FL,24.733,-81.05,2,-5,24.733,-81.05,2,363305.0
2004    722010,1,false,KEY WEST INTL ARPT,FL,24.55,-81.75,1,-5,24.55,-81.75,1,362566.0
2005    722016,2,false,MARATHON AIRPORT,FL,24.733,-81.05,2,-5,24.733,-81.05,2,373423.0
&lt;/pre&gt;&lt;/div&gt;Not surprisingly, every yearly maximum comes from a station in Florida, Arizona, or New Mexico!&lt;br /&gt;
&lt;br /&gt;
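To see which stations and states show up without eyeballing the output, each reducer line can be split into its fields. The following is only a minimal sketch: &lt;span class="java-class"&gt;ReducerOutputLine&lt;/span&gt; is a hypothetical helper that is not part of the project, and the column positions (station name at index 3, state at index 4, the maximized value in the last column) are assumptions inferred from the sample output above. &lt;div class="code"&gt;&lt;pre&gt;public class ReducerOutputLine {
    final int year;
    final String stationName;
    final String state;
    final double value;

    ReducerOutputLine(int year, String stationName, String state, double value) {
        this.year = year;
        this.stationName = stationName;
        this.state = state;
        this.value = value;
    }

    // Parses one output line: the year key, whitespace, then the comma-separated station record.
    static ReducerOutputLine parse(String line) {
        // A limit of 2 keeps the spaces inside the station name intact.
        String[] keyAndRecord = line.trim().split("\\s+", 2);
        String[] fields = keyAndRecord[1].split(",");
        return new ReducerOutputLine(
            Integer.parseInt(keyAndRecord[0]),
            fields[3],                                       // assumed: station name
            fields[4],                                       // assumed: state
            Double.parseDouble(fields[fields.length - 1]));  // assumed: maximized value
    }

    public static void main(String[] args) {
        ReducerOutputLine r = parse(
            "1998    722695,2,false,LAS CRUCES INTL,NM,32.283,-106.917,1393,-7,32.283,-106.917,1393,354944.0");
        // prints: 1998 LAS CRUCES INTL (NM) 354944.0
        System.out.println(r.year + " " + r.stationName + " (" + r.state + ") " + r.value);
    }
}
&lt;/pre&gt;&lt;/div&gt;A comma inside a station name would break this simple split, so treat it as a starting point only.&lt;br /&gt;
&lt;br /&gt;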
&lt;b&gt;IMPORTANT&lt;/b&gt;: Be sure to terminate your job flow when you're finished working with it so that you don't incur unnecessary expenses. &lt;div class="code"&gt;&lt;pre&gt;elastic-mapreduce --terminate -j JOB_ID
&lt;/pre&gt;&lt;/div&gt;&lt;div class="tip"&gt;For more tips about running jobs in Elastic MapReduce, check out my contributions to the &lt;a href="http://mahout.apache.org/" target="_blank"&gt;Mahout&lt;/a&gt; wiki based on some benchmarking work I did for &lt;a href="http://www.manning.com/ingersoll/" target="_blank"&gt;Taming Text&lt;/a&gt;. See &lt;b&gt;Building Vectors for Large Document Sets&lt;/b&gt; on &lt;a href="https://cwiki.apache.org/confluence/display/MAHOUT/Mahout+on+Elastic+MapReduce" target="_blank"&gt;Mahout on Elastic MapReduce&lt;/a&gt;. Alternatively, you can spin up your own Hadoop cluster instead of using EMR by following the notes I compiled for &lt;a href="https://cwiki.apache.org/MAHOUT/use-an-existing-hadoop-ami.html" target="_blank"&gt;Mahout on Amazon EC2 &amp;gt; Use an Existing Hadoop AMI&lt;/a&gt;.&lt;/div&gt;&lt;/p&gt;&lt;div class="sectionHdr"&gt;Testing Hadoop Jobs with MRUnit&lt;/div&gt;&lt;p&gt;Lastly, I've included a test case (&lt;span class="java-class"&gt;thelabdude.nsrdb.MaxValueByYearMRUnitTest&lt;/span&gt;) to test my MR job using &lt;a href="https://repository.cloudera.com/content/repositories/releases/com/cloudera/hadoop/hadoop-mrunit/0.20.2-737/" target="_blank"&gt;MRUnit&lt;/a&gt;. The MRUnit library provides the mock objects and job drivers that simulate the Hadoop runtime environment for testing your job outside of Hadoop. MRUnit drivers allow you to specify expected output from Mapper and Reducer classes given specific inputs. Notice that my job uses Hadoop Counters to keep track of decisions it makes, such as skipping a station or hitting an error while parsing the data. Counters can help you diagnose unexpected job results and make writing unit tests easier since you can assert expected counts based on your test input data. For example, the following unit test checks whether the job handles invalid input records correctly: &lt;div class="code"&gt;&lt;pre&gt;@Test
    public void testMapperIgnoreParseError() throws Exception {
      // data that produces parse errors
      String csv = ...

      BytesWritable value = new BytesWritable(csv.getBytes(MaxValueByYearJob.UTF8));              
      mapDriver.withInput(new StationYearWritable(690190,2005), value)
               .runTest();      

      Counters counters = mapDriver.getCounters();
      Counter counter = counters.findCounter(MaxValueByYearJob.StationDataCounters.DATE_PARSE_ERROR);
      assertTrue("Expected 1 date parse error", counter.getValue() == 1);

      counter = counters.findCounter(MaxValueByYearJob.StationDataCounters.DATA_PARSE_ERROR);
      assertTrue("Expected 1 data parse error", counter.getValue() == 1);
    }
&lt;/pre&gt;&lt;/div&gt;Both CSV-formatted input records contain data formatting errors. Thus, I expect the MaxValueByYearJob.StationDataCounters.DATE_PARSE_ERROR and DATA_PARSE_ERROR Counters to report 1 parse error each. Try to avoid overloading the meaning of each counter; instead, have a specific counter for each possible error or event in your job, as it will help you troubleshoot data problems when running in a large distributed cluster. For information on Hadoop Counters, see &lt;a href="http://www.umiacs.umd.edu/~jimmylin/cloud9/docs/content/counters.html" target="_blank"&gt;Cloud9: Working with counters&lt;/a&gt;.&lt;/p&gt;&lt;div class="sectionHdr"&gt;Conclusion&lt;/div&gt;&lt;p&gt;So now we have a basic framework for loading OpenEI data into a Hadoop cluster and have seen how to run MapReduce jobs against it. As you can tell, there are many moving parts, and it takes some non-trivial Java coding to do even simple analysis on the data with Hadoop. While energy researchers may benefit from being able to analyze huge solar data sets using MapReduce and Hadoop, they will need a higher level of abstraction than what was presented here. Moreover, 8 minutes really isn't very fast for having run such a simple job on 8 machines with such a small data set, so I'll also need to address performance issues. Consequently, I plan to follow up this posting with an example of working with this data using Apache &lt;a href="http://hive.apache.org/" target="_blank"&gt;Hive&lt;/a&gt; and &lt;a href="http://pig.apache.org/" target="_blank"&gt;Pig&lt;/a&gt; to see if these data warehouse frameworks make it easier to analyze the NSRDB. 
I also plan to compare the results with Hive / Pig to using a MySQL database for analyzing this data.&lt;/p&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6666528949054177156-3069797037270423457?l=thelabdude.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thelabdude.blogspot.com/feeds/3069797037270423457/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://thelabdude.blogspot.com/2011/07/analyzing-national-solar-radiation.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/3069797037270423457'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/3069797037270423457'/><link rel='alternate' type='text/html' href='http://thelabdude.blogspot.com/2011/07/analyzing-national-solar-radiation.html' title='Analyzing the National Solar Radiation Database (NSRDB) with Hadoop'/><author><name>thelabdude</name><uri>http://www.blogger.com/profile/08032248788175889303</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6666528949054177156.post-1908737213798752296</id><published>2010-09-23T20:44:00.000-07:00</published><updated>2011-07-26T11:38:10.164-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mahout'/><category scheme='http://www.blogger.com/atom/ns#' term='ci'/><category scheme='http://www.blogger.com/atom/ns#' term='recommendation engine'/><category scheme='http://www.blogger.com/atom/ns#' term='content intelligence'/><category scheme='http://www.blogger.com/atom/ns#' term='apache mahout'/><category 
scheme='http://www.blogger.com/atom/ns#' term='collaborative filtering'/><title type='text'>VinWiki Part 4: Making Recommendations with Mahout</title><content type='html'>&lt;style&gt;
.todo {
  font:bold 14px verdana;
  color:#FF0000;
  padding: 15px;
}
.annotation {
  font:normal 12px monospace;
  color:#0033CC;
}
.seam-comp {
  font:bold 12px monospace;
  color:#660099;
}
.sectionHdr {
  font-size:15px;
  font-weight:bold;
  color:#CC6633;
  margin-top:20px;
  margin-bottom:15px;
}
.subHdr {
  font-size:13px;
  font-weight:bold;
  color:#CC6633;
  margin-top:15px;
  margin-bottom:10px;
}
.sub2Hdr {
  font-size:12px;
  font-weight:bold;
  font-style:italic;
  color:#CC6633;
  margin-top:10px;
  margin-bottom:5px;
}
.file-path {
  color:#336600;
}
.code {
  font:normal 12px monospace;
  color:#333333;
}
.tip {
  padding: 5px;
  background: #FFFFCC;
  border: 1px solid #F0F0F0;
  width: 500px;
  margin:20px;
}
.java-class {
  font:bold 12px monospace;
  color:#003399;
}
.java-intf {
  font:italic 12px monospace;
  color:#003399;
}
.java-kw {
  font:normal 12px monospace;
  color:#0000FF;
}
.java-id {
  font:normal 12px monospace;
  color:#FF0000;
}
.java-str {
  font:normal 12px monospace;
  color:#FF00CC;
}
.java-cmt {
  font:normal 12px monospace;
  color:#339900;
}
.el {
  font:normal 12px monospace;
  color:#990000;
}
.xml-elm {
  font:normal 12px monospace;
  color:#999933;
}
.hashLink {
  font-size:11px;
  font-style:italic;
}
.screenshot {
  margin: 20px;
}
td {
font-family:verdana;font-size:12px;
}
&lt;/style&gt; &lt;span style="font-family:verdana;font-size:12px;"&gt; &lt;div class="sectionHdr"&gt;Introduction&lt;/div&gt;&lt;p&gt;This is the final post in a four-part series about a wine rating and recommendation Web application built using open source Java technology. The purpose of this series is to document key design and implementation decisions that can be applied to other Web applications. Please read the &lt;a href="http://thelabdude.blogspot.com/2010/05/building-intelligent-web-app-using-seam.html"&gt;first&lt;/a&gt;, &lt;a href="http://thelabdude.blogspot.com/2010/06/vinwiki-part-2-full-text-search-with.html"&gt;second&lt;/a&gt;, and &lt;a href="http://thelabdude.blogspot.com/2010/07/vinwiki-part-3-authentication-with.html"&gt;third&lt;/a&gt; posts to get up to speed. You can download the project (with source) from &lt;a href="http://vinwiki.com/thelabdude/vinwiki/vinwiki.zip"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In this posting, I lay the foundation for making recommendations using &lt;a href="http://mahout.apache.org/" target="_blank"&gt;Apache Mahout&lt;/a&gt; version 0.3. For a thorough introduction to Mahout, I recommend &lt;a href="http://www.manning.com/owen/" target="_blank"&gt;Mahout in Action&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Collaborative Filtering in Mahout&lt;/div&gt;Mahout's main goal is to provide scalable machine-learning libraries for classification, clustering, frequent itemset mining, and recommendations. Classification assigns a category (or class) from a fixed set of known categories to an uncategorized document. For example, some feed readers assign articles to broad categories like Sports or Politics using classification techniques. Clustering assigns documents to groups of similar documents using some notion of similarity between the documents in the group. For example, Google News uses clustering to group articles from different publishers that cover the same basic story. 
Frequent itemset mining determines which items, such as products in a shopping cart, typically occur together.&lt;/p&gt;&lt;p&gt;In this posting, I use the &lt;a href="http://en.wikipedia.org/wiki/Collaborative_filtering" target="_blank"&gt;collaborative filtering&lt;/a&gt; features of Mahout to make wine recommendations based on ratings by VinWiki users. Collaborative filtering produces recommendations based on user preferences for items and does not require knowledge of the specific properties of the items. In contrast, content-based recommendation produces recommendations based on detailed knowledge of the properties of items. This implies, of course, that content-based recommendation engines are domain-specific, whereas Mahout's collaborative filtering approach can work in any domain provided it has sufficient user-item preference data to work with.&lt;/p&gt;&lt;p&gt;For VinWiki, I experimented with three basic types of Mahout Recommenders: &lt;ol&gt;&lt;li&gt;User Similarity&lt;/li&gt;
&lt;li&gt;Item Similarity&lt;/li&gt;
&lt;li&gt;SlopeOne&lt;/li&gt;
&lt;/ol&gt;&lt;div class="tip"&gt;Check out the Mahout Web site for information about other, more experimental recommenders, such as one based on singular value decomposition (SVD).&lt;/div&gt;&lt;br /&gt;
&lt;br /&gt;
To decide which one of these recommenders is best for your application, you need to consider four key questions: &lt;ol&gt;&lt;li&gt;How do you represent a user's preference for an item?&lt;/li&gt;
&lt;li&gt;What is the ratio of items to users?&lt;/li&gt;
&lt;li&gt;How do you determine the similarity between users or between items?&lt;/li&gt;
&lt;li&gt;If using UserSimilarity, what is the size of a user neighborhood?&lt;/li&gt;
&lt;/ol&gt;As I'll demonstrate below, Mahout provides a framework that lets you answer these questions by analyzing your data. &lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;User Preferences&lt;/div&gt;What constitutes a user preference for an item in your application? Is it a boolean "like" or "dislike", or does the preference have a strength, such as &lt;i&gt;"I like Chardonnay but like Sauvignon Blanc better"&lt;/i&gt;? The structure of user-item preference data used in Mahout is surprisingly simple: userID, itemID, score, where score represents the strength of the user's preference for the item (see &lt;a href="https://hudson.apache.org/hudson/job/Mahout-Quality/javadoc/" target="_blank"&gt;org.apache.mahout.cf.taste.model.Preference&lt;/a&gt;). There are two concrete implementations of the Preference interface in Mahout: BooleanPreference and GenericPreference. For VinWiki, I use GenericPreference because I chose to allow users to give a score for a wine.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Basic Structure of a UserSimilarity Recommender&lt;/div&gt;Let's take a look at the basic approach Mahout takes to make a UserSimilarity-based recommendation using VinWiki nomenclature: &lt;div class="code"&gt;&lt;pre&gt;For every wine W that user A has NOT expressed a preference for
  For every other user B (in A's neighborhood) that has expressed a preference for W
    Compute the similarity S between users A and B
    Add B's preference X for W, weighted by S, to a running average preference for W
Sort the wines by weighted average preference
Return the top R wines from the sorted collection as recommendations
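&lt;/pre&gt;&lt;/div&gt;The steps above can be sketched in plain Java. This is only an illustration under stated assumptions, not Mahout's implementation: the &lt;span class="java-class"&gt;UserBasedSketch&lt;/span&gt; class and its ratings matrix are hypothetical, every other user is treated as part of A's neighborhood, the similarity is a simple stand-in (1 / (1 + Euclidean distance over co-rated wines)), and only the single best wine is returned (R = 1). &lt;div class="code"&gt;&lt;pre&gt;public class UserBasedSketch {

    // Stand-in similarity: 1 / (1 + Euclidean distance over wines both users rated).
    // NaN marks "no rating"; users with no wines in common get a similarity of 0.
    static double similarity(double[] a, double[] b) {
        double sumOfSquares = 0.0;
        int shared = 0;
        for (int wine = 0; wine &lt; a.length; wine++) {
            if (Double.isNaN(a[wine]) || Double.isNaN(b[wine])) continue;
            double diff = a[wine] - b[wine];
            sumOfSquares += diff * diff;
            shared++;
        }
        return shared == 0 ? 0.0 : 1.0 / (1.0 + Math.sqrt(sumOfSquares));
    }

    // Similarity-weighted average of the other users' ratings for one wine (NaN if none).
    static double estimate(double[][] ratings, int user, int wine) {
        double weightedSum = 0.0;
        double totalSimilarity = 0.0;
        for (int other = 0; other &lt; ratings.length; other++) {
            if (other == user || Double.isNaN(ratings[other][wine])) continue;
            double s = similarity(ratings[user], ratings[other]);
            weightedSum += s * ratings[other][wine];
            totalSimilarity += s;
        }
        return totalSimilarity == 0.0 ? Double.NaN : weightedSum / totalSimilarity;
    }

    // Index of the unrated wine with the highest estimate, or -1 if there is none.
    static int recommendOne(double[][] ratings, int user) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int wine = 0; wine &lt; ratings[user].length; wine++) {
            if (!Double.isNaN(ratings[user][wine])) continue; // already rated
            double score = estimate(ratings, user, wine);
            if (score &gt; bestScore) { // always false when score is NaN
                best = wine;
                bestScore = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Rows are users, columns are wines; the scores are made up.
        double[][] ratings = {
            {90, 85, Double.NaN},
            {91, 86, 92},
            {70, 95, 60}
        };
        // prints 2, the only wine user 0 has not rated
        System.out.println(recommendOne(ratings, 0));
    }
}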
&lt;/pre&gt;&lt;/div&gt;&lt;/p&gt;&lt;p&gt;Intuitively, this approach makes sense. From the pseudo-code above, it should be clear that we need a way to calculate the similarity S between two Users A and B, which is represented in Mahout as a &lt;a href="https://hudson.apache.org/hudson/job/Mahout-Quality/javadoc/" target="_blank"&gt;org.apache.mahout.cf.taste.similarity.UserSimilarity&lt;/a&gt;. Also, notice that the algorithm weights recommendations by user similarity, which means that the more similar a user is to you, the more heavily their preferences count in making recommendations. Consequently, the choice of similarity calculation strongly affects the quality of the recommendations. Mahout provides a number of concrete implementations of the UserSimilarity interface; see the org.apache.mahout.cf.taste.impl.similarity package.&lt;/p&gt;&lt;p&gt;In practice, most systems that need to produce recommendations have many users, and calculating a similarity between all users is too computationally expensive. Thus, Mahout uses the concept of a &lt;i&gt;user neighborhood&lt;/i&gt; to limit the number of similarity calculations to a smaller subset of similar users. This introduces another question that needs to be answered when building your recommender: What is the optimal size of the user neighborhood for my data?&lt;/p&gt;&lt;div class="tip"&gt;Mahout also allows you to make recommendations based on similarity between Items. Don't confuse Mahout's Item-based recommender with content-based recommenders, since it is still based on user-item interactions and not the content of items.&lt;/div&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Using Mahout in VinWiki&lt;/div&gt;The main service for creating recommendations at runtime is the MahoutWineRecommender, which is an application-scoped Seam component. The MahoutWineRecommender has two dependencies injected during initialization: &lt;ul&gt;&lt;li&gt;DataModelProvider&lt;/li&gt;
&lt;li&gt;RecommenderConfig&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;DataModel&lt;/div&gt;The org.vinwiki.recommender.DataModelProvider Seam component (configured in components.xml) provides a Mahout DataModel to the recommender. For now, I'm using Mahout's FileDataModel, which, as you might have guessed, reads preference data from a flat file. During startup, if this file doesn't exist, then the DataModelProvider reads wine ratings from the database and writes them to a new file.&lt;/p&gt;&lt;p&gt;&lt;div class="sub2Hdr"&gt;Sample Wine Ratings Data&lt;/div&gt;As this is just an example Web application, I don't have real wine ratings data. Consequently, I generated some fake data that leads Mahout to recommend wines to sample users based on the first letter of the user name. For example, the data will cause Mahout to recommend wines that start with the letter "A" to the "A_test0" user. Here is some log output to demonstrate how the sample ratings data works: &lt;div class="code"&gt;&lt;pre&gt;NearestNUserNeighborhood[3,0.6,0.8,EuclideanDistanceSimilarity] 
  recommended [1887, 286, 1120, 1350, 520, 1905] wines to A_test0
     Neighbor(43) A_test30
         rated Wine 1120 91.0 pts
         rated Wine 1350 87.0 pts
     Neighbor(33) A_test20
         rated Wine 1887 88.0 pts
         rated Wine 1350 88.0 pts
     Neighbor(63) A_test50
         rated Wine 1350 90.0 pts
&lt;/pre&gt;&lt;/div&gt;Notice that the user names of A_test0's neighbors also start with "A_". When I created the sample ratings data, I had users rate wines that begin with the same letter as their user name a little higher than they rated other wines. You can try this yourself after deploying the application to JBoss 4.2.3 by logging in with username "A_test0" and password "P@$$w0rD" (without the quotes, of course).&lt;/p&gt;&lt;p&gt;&lt;div class="sub2Hdr"&gt;Refreshing the Recommender&lt;/div&gt;When I first started working with Mahout, it wasn't clear how to handle data model changes at runtime because most of the built-in Mahout examples work with static, pre-existing datasets. In VinWiki, rating wines is a primary activity, so preference data will be changing frequently. Moreover, if a user provides several new ratings in a session, then they'll expect to see some recommendations based on those new ratings; otherwise, they will think the site is broken and probably not return. Consequently, it's very important for this application to incorporate recent user activity into recommendations in near real time.&lt;/p&gt;&lt;p&gt;Whenever a user rates a wine, the &lt;span class="seam-comp"&gt;ratingHome&lt;/span&gt; component raises the &lt;span class="el"&gt;App.WINE_RATED_BY_USER&lt;/span&gt; event. The MahoutWineRecommender component observes this event and passes it to the DataModelProvider. &lt;div class="code"&gt;&lt;pre&gt;@Observer(App.WINE_RATED_BY_USER)
@Asynchronous
public void onWineRatedByUser(Rating r) {
    // Let the model provider know that data has changed ...
    if (dataModelProvider.updateDataModel(r.getUser().getId(), r.getWine().getId(), r.getScore())) {
        // provider indicates that we should refresh the recommender
        recommender.refresh(null);
    }
}
&lt;/pre&gt;&lt;/div&gt;In response to this event, the DataModelProvider component can choose to update its internal state to reflect the change. In my current implementation, the DataModelProvider uses a nice feature provided by Mahout's FileDataModel by writing updates to a smaller "delta" file. The FileDataModel will load these additional "delta" files when it is refreshed. So that covers updating the DataModel, but what about the Recommender and its other dependencies, such as UserSimilarity and UserNeighborhood? In my implementation, the DataModelProvider makes the decision of whether the Recommender should be refreshed. This allows a more sophisticated DataModelProvider implementation to batch up changes so that the recommender is not refreshed too often as refreshing a recommender and its dependencies can be an expensive operation for large data sets.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Accounting for User Preferences&lt;/div&gt;Users can de-select wines they are not interested in using the Preferences dialog. Changes to a user's preferences should be reflected in recommendations. For example, if a user indicates that they are not interested in white wine, then we should not recommend any white wines to them. Mahout allows you to inject this type of filtering on recommendations using an &lt;span class="java-intf"&gt;org.apache.mahout.cf.taste.recommender.IDRescorer&lt;/span&gt;.&lt;/p&gt;&lt;p&gt;In VinWiki, filtering recommendations by preferences is provided by the &lt;span class="java-class"&gt;org.vinwiki.recommender.PreferencesIDRescorer&lt;/span&gt; class. If you revisit the pseudo-code above, then it should be obvious that the IDRescorer may need to evaluate the filter on a large number of wines. 
Thus, the IDRescorer should be implemented in an efficient manner; I used the Lucene native API to iterate over all wines to build and cache a Mahout FastIDSet of wine Ids that can be recommended to the current user.&lt;/p&gt;&lt;div class="code"&gt;&lt;pre&gt;&lt;span class="java-cmt"&gt;// Using Lucene to initialize a Mahout FastIDSet for rescoring&lt;/span&gt;
int maxDoc = reader.maxDoc();
for (int docId = 0; docId &lt; maxDoc; docId++) {
    if (reader.isDeleted(docId))
        continue;

    try {
        doc = reader.document(docId, getFieldSelector());
    } catch (Exception zzz) {
        ...
    }
    if (doc == null)
        continue;

    Long wineId = Long.valueOf(doc.get(ID));
    String type = doc.get(TYPE);
    String style = doc.get(STYLE);
    Long regionId = Long.valueOf(doc.get(REGION));

    // ask the User's Preferences object if this wine is enabled
    if (prefs.checkWineFilter(wineId, type, style, regionId)) {
        idSet.add(wineId);
    }
}
&lt;/pre&gt;
&lt;/div&gt;&lt;p&gt;There is one subtle aspect to the current implementation in that it does not refresh during the user's session as new wines are added to the search index. In other words, you are not going to see any code that tries to update the rescorer after new wines are added to the system. Remember that our recommendations are based on user-item interactions and new wine objects are not going to have enough (if any) ratings to impact the current user's session. However, the rescorer is refreshed if the user changes their preferences.&lt;/p&gt;&lt;div class="tip"&gt;Tip: using an IDRescorer is a simple form of content-based recommendations in that we're using specific properties of the wine objects to influence our recommendations.&lt;/div&gt;&lt;p&gt;&lt;div class="subHdr"&gt;RecommenderConfig&lt;/div&gt;&lt;span class="java-class"&gt;org.vinwiki.recommender.RecommenderConfig&lt;/span&gt; is an application-scoped Seam component (configured in components.xml) that supports common options for configuring the behavior of a Recommender.&lt;/p&gt;&lt;p&gt;At startup, the MahoutWineRecommender uses the DataModel and RecommenderConfig to initialize a Recommender. The Recommender is held in application-scope because it is expensive to build and should be re-used for all recommendation requests from FetchRecommended objects (see &lt;a href="http://thelabdude.blogspot.com/2010/05/building-intelligent-web-app-using-seam.html#nav"&gt;Server-side Pagination&lt;/a&gt; from the first posting in this series). The following code snippet gives you an idea of how to construct a User-based recommender with Mahout:&lt;/p&gt;&lt;div class="code"&gt;&lt;pre&gt;&lt;span class="java-cmt"&gt;// see RecommenderConfig.java&lt;/span&gt;
    UserSimilarity userSimilarity = createUserSimilarity(dataModel);
    UserNeighborhood neighborhood = createUserNeighborhood(userSimilarity, dataModel);
    return new GenericUserBasedRecommender(dataModel, neighborhood, userSimilarity);
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Here is an example configuration from components.xml. &lt;b&gt;NOTE: You must set the fileDataModelFileName to a valid path on your server before running the sample!&lt;/b&gt;
&lt;div class="code"&gt;&lt;pre&gt;&amp;lt;component name="dataModelProvider" auto-create="true" scope="application" 
        class="org.vinwiki.recommender.DataModelProvider"&amp;gt;
    &amp;lt;property name="fileDataModelFileName"&amp;gt;/home/thelabdude/thelabdude-blog-dev/jboss-4.2.3/bin/recommender/ratings.txt&amp;lt;/property&amp;gt;
    &amp;lt;property name="updateFileSizeThresholdKb"&amp;gt;10&amp;lt;/property&amp;gt;
  &amp;lt;/component&amp;gt;

  &amp;lt;component name="recommenderConfig" auto-create="true" scope="application" 
        class="org.vinwiki.recommender.RecommenderConfig"&amp;gt;
    &amp;lt;property name="recommenderType"&amp;gt;USER_SIMILARITY&amp;lt;/property&amp;gt;
    &amp;lt;property name="similarityClassName"&amp;gt;org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity&amp;lt;/property&amp;gt;
    &amp;lt;property name="neighborhoodSize"&amp;gt;2&amp;lt;/property&amp;gt;
    &amp;lt;property name="minSimilarity"&amp;gt;0.7&amp;lt;/property&amp;gt;
    &amp;lt;property name="samplingRate"&amp;gt;0.2&amp;lt;/property&amp;gt;    
  &amp;lt;/component&amp;gt;
&lt;/pre&gt;&lt;/div&gt;Looks easy enough, but what about all those parameters to the recommenderConfig? Thankfully, Mahout provides a tool to help you determine good values for each of these settings for your data: the RecommenderEvaluator.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Evaluating a Mahout Recommender&lt;/div&gt;It should be clear that finding the optimal recommender for your data requires experimentation with how you represent preferences, how you calculate user or item similarity, and how large the user neighborhood is. Mahout provides an easy way to compare the results for different configuration options using a RecommenderEvaluator. Currently, there are two concrete RecommenderEvaluator implementations:
&lt;ul&gt;&lt;li&gt;AverageAbsoluteDifferenceRecommenderEvaluator - computes the average absolute difference between predicted and actual ratings for users.&lt;/li&gt;
&lt;li&gt;RMSRecommenderEvaluator - computes the "root mean squared" difference between predicted and actual ratings for users.&lt;/li&gt;
&lt;/ul&gt;I chose to use the RMSRecommenderEvaluator because squaring the differences penalizes large prediction errors more heavily than the AverageAbsoluteDifference evaluator does. When comparing evaluations, the lowest score is best. Notice in the code snippet below how the RecommenderConfig (part of VinWiki) helps you run evaluations:&lt;/p&gt;&lt;div class="code"&gt;&lt;pre&gt;RecommenderConfig config = new RecommenderConfig();
config.setRecommenderType(RecommenderType.USER_SIMILARITY);
config.setSimilarityClassName(simClass.getName());
config.setNeighborhoodSize(c);
config.setMinSimilarity(minSimilarity);
config.setSamplingRate(samplingRate);

RecommenderBuilder builder = config.getBuilder();
double score = evaluator.evaluate(builder, 
                 null, // no DataModelBuilder
                 recommenderDataModel, 
                 0.8, // training data pct
                 1); // use all users
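&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;To make the difference between the two evaluators concrete, here is a minimal sketch of both metrics in plain Java. It does not use Mahout: &lt;span class="java-class"&gt;EvaluatorSketch&lt;/span&gt; and the ratings in it are made up for illustration. Both sets of predictions have the same average absolute difference, but the set with one large miss gets the worse (higher) RMS score.&lt;/p&gt;&lt;div class="code"&gt;&lt;pre&gt;public class EvaluatorSketch {

    // Mean of |predicted - actual| over all ratings.
    static double averageAbsoluteDifference(double[] predicted, double[] actual) {
        double sum = 0.0;
        for (int i = 0; i &lt; predicted.length; i++) {
            sum += Math.abs(predicted[i] - actual[i]);
        }
        return sum / predicted.length;
    }

    // Square root of the mean of (predicted - actual)^2; squaring magnifies large errors.
    static double rootMeanSquared(double[] predicted, double[] actual) {
        double sumOfSquares = 0.0;
        for (int i = 0; i &lt; predicted.length; i++) {
            double diff = predicted[i] - actual[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / predicted.length);
    }

    public static void main(String[] args) {
        double[] actual = {90, 90, 90, 90};      // hypothetical ratings
        double[] evenlyOff = {88, 92, 88, 92};   // every prediction is off by 2
        double[] oneBigMiss = {90, 90, 90, 82};  // one prediction is off by 8

        // Average absolute difference is 2.0 for both sets of predictions ...
        System.out.println(averageAbsoluteDifference(evenlyOff, actual));
        System.out.println(averageAbsoluteDifference(oneBigMiss, actual));
        // ... but RMS is 2.0 for the first set and 4.0 for the second.
        System.out.println(rootMeanSquared(evenlyOff, actual));
        System.out.println(rootMeanSquared(oneBigMiss, actual));
    }
}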

&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For VinWiki, I developed a Seam ComponentTest to run evaluations. At this point, the output is not as important as the process, since the results are based on simulated ratings data (VinWiki is not yet a live application with real users). This is a problem facing any new application that uses machine-learning algorithms that require real user input. One idea to get real user input is to use Amazon's &lt;a href="https://www.mturk.com/mturk/welcome" target="_blank"&gt;Mechanical Turk&lt;/a&gt; service to hire users to create real user-item interactions for your application. Regardless of how you seed your application with real user data, the approach in &lt;span class="file-path"&gt;src/test/org/vinwiki/RecommenderTest.java&lt;/span&gt; should still be useful to you.&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Conclusion&lt;/div&gt;So this concludes the four-part series on VinWiki. As you can see, integrating Mahout is easy, but it does require experimentation and tuning. You also have to be cognizant of scalability issues, such as how often to refresh your recommender. The framework I added to VinWiki should be useful for your application too. 
Please leave comments on my blog if you have questions or would like to suggest improvements to any of the features I discussed in any of the four posts.&lt;/p&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6666528949054177156-1908737213798752296?l=thelabdude.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thelabdude.blogspot.com/feeds/1908737213798752296/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://thelabdude.blogspot.com/2010/09/vinwiki-part-4-making-recommendations.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/1908737213798752296'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/1908737213798752296'/><link rel='alternate' type='text/html' href='http://thelabdude.blogspot.com/2010/09/vinwiki-part-4-making-recommendations.html' title='VinWiki Part 4: Making Recommendations with Mahout'/><author><name>thelabdude</name><uri>http://www.blogger.com/profile/08032248788175889303</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6666528949054177156.post-7245762356043798339</id><published>2010-07-08T16:48:00.001-07:00</published><updated>2011-07-26T11:37:28.178-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='facebook connect'/><category scheme='http://www.blogger.com/atom/ns#' term='social graph'/><category scheme='http://www.blogger.com/atom/ns#' term='social plug-in'/><title type='text'>VinWiki Part 3: Authentication with Facebook Connect and Sharing Content with Friends</title><content 
type='html'>&lt;style&gt;
.todo {
  font:bold 14px verdana;
  color:#FF0000;
  padding: 15px;
}
.annotation {
  font:normal 12px monospace;
  color:#0033CC;
}
.seam-comp {
  font:bold 12px monospace;
  color:#660099;
}
.sectionHdr {
  font-size:15px;
  font-weight:bold;
  color:#CC6633;
  margin-top:20px;
  margin-bottom:15px;
}
.subHdr {
  font-size:13px;
  font-weight:bold;
  color:#CC6633;
  margin-top:15px;
  margin-bottom:10px;
}
.sub2Hdr {
  font-size:12px;
  font-weight:bold;
  font-style:italic;
  color:#CC6633;
  margin-top:10px;
  margin-bottom:5px;
}
.file-path {
  color:#336600;
}
.code {
  font:normal 12px monospace;
  color:#333333;
}
.tip {
  padding: 5px;
  background: #FFFFCC;
  border: 1px solid #F0F0F0;
  width: 500px;
  margin:20px;
}
.java-class {
  font:bold 12px monospace;
  color:#003399;
}
.java-intf {
  font:italic 12px monospace;
  color:#003399;
}
.java-kw {
  font:normal 12px monospace;
  color:#0000FF;
}
.java-id {
  font:normal 12px monospace;
  color:#FF0000;
}
.java-str {
  font:normal 12px monospace;
  color:#FF00CC;
}
.java-cmt {
  font:normal 12px monospace;
  color:#339900;
}
.el {
  font:normal 12px monospace;
  color:#990000;
}
.xml-elm {
  font:normal 12px monospace;
  color:#999933;
}
.hashLink {
  font-size:11px;
  font-style:italic;
}
.screenshot {
  margin: 20px;
}
td {
font-family:verdana;font-size:12px;
}
&lt;/style&gt; &lt;span style="font-family:verdana;font-size:12px;"&gt; &lt;div class="sectionHdr"&gt;Introduction&lt;/div&gt;&lt;p&gt;This is the third post in a four part series about a wine rating and recommendation Web application built using open source Java technology. The purpose of this series is to document key design and implementation decisions that can be applied to other Web applications. Please read the &lt;a href="http://thelabdude.blogspot.com/2010/05/building-intelligent-web-app-using-seam.html"&gt;first&lt;/a&gt; and &lt;a href="http://thelabdude.blogspot.com/2010/06/vinwiki-part-2-full-text-search-with.html"&gt;second&lt;/a&gt; posts to get up-to-speed. You can download the project (with source) from &lt;a href="http://vinwiki.com/thelabdude/vinwiki/vinwiki.zip"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In this posting, I implement several common features that should help users find and contribute to your application. Specifically, I integrate with Facebook Connect (based on &lt;a href="http://wiki.oauth.net/OAuth-2" target="_blank"&gt;OAuth 2&lt;/a&gt;) to allow Facebook users to instantly register and authenticate using their Facebook profile. From there, I integrate the Facebook Like social plug-in which allows users to share content in your application with their friends on Facebook.&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Authentication using Facebook Connect&lt;/div&gt;&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/fb_connect_login.png" alt="Login with Facebook Connect"/&gt;&lt;br /&gt;
The Facebook team has made integrating Facebook Connect into your application very easy. There are open source Java libraries available for integrating with Facebook; however, I found it easier to use the JavaScript SDK. Let's go through the process in five simple steps:&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;I. Register Your Application with Facebook&lt;/div&gt;First, you need to &lt;a href="http://developers.facebook.com/setup/" target="_blank"&gt;register an application&lt;/a&gt; with Facebook to get a unique Application ID (please use something other than "VinWiki" for your application name since I'll be using that one in the near future). Update &lt;span class="file-path"&gt;resources/WEB-INF/components.xml&lt;/span&gt; to set the &lt;b&gt;facebookAppId&lt;/b&gt; property on the &lt;span class="seam-comp"&gt;app&lt;/span&gt; component: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;component name="app" auto-create="true" scope="application" class="org.vinwiki.App"&amp;gt;
    ...
    &amp;lt;property name="facebookAppId"&amp;gt;ENTER_YOUR_FB_APPLICATION_ID_HERE&amp;lt;/property&amp;gt;
  &amp;lt;/component&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;II. Initialize the Facebook JavaScript Library&lt;/div&gt;Second, you need to initialize the Facebook JavaScript library when your page loads. For this, I created a new Facelets include file &lt;span class="file-path"&gt;view/WEB-INF/facelets/facebook.xhtml&lt;/span&gt; and loaded it into the footer in my layout template &lt;span class="file-path"&gt;view/layout/template.xhtml&lt;/span&gt;: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;h:panelGroup rendered="#{app.isFacebookEnabled()}"&amp;gt;
    &amp;lt;ui:include src="/WEB-INF/facelets/facebook.xhtml"/&amp;gt;
  &amp;lt;/h:panelGroup&amp;gt;
&lt;/pre&gt;&lt;/div&gt;In facebook.xhtml, I have the following JavaScript: &lt;div class="code"&gt;&lt;pre&gt;(function() {
    var e = document.createElement('script'); e.async = true;
    e.src = document.location.protocol + '//connect.facebook.net/en_US/all.js';
    document.getElementById('fb-root').appendChild(e);
  }());
&lt;/pre&gt;&lt;/div&gt;This function, borrowed from the Facebook developer documentation, asynchronously loads the Facebook JavaScript file into your page. &lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;III. Register a JavaScript callback handler for Facebook session events&lt;/div&gt;Once the JavaScript library is loaded, the &lt;span class="code"&gt;window.fbAsyncInit&lt;/span&gt; function is called automatically. &lt;div class="code"&gt;&lt;pre&gt;&lt;b&gt;window.fbAsyncInit&lt;/b&gt; = function() {    
    FB.init({appId:'#{app.facebookAppId}', status:true, cookie:true, xfbml:true});
    FB.Event.subscribe('auth.sessionChange', function(response) {
      if (response.session) {
        &lt;span class="java-cmt"&gt;// Login successful&lt;/span&gt;
        var uid = response.session.uid;
        FB.api('/me', function(resp) {
          onFbLogin(uid, resp.email, resp.name, resp.first_name, resp.last_name, resp.gender);
        });
      } else {
        &lt;span class="java-cmt"&gt;// The user has logged out, and the cookie has been cleared&lt;/span&gt;
        onFbLogout();
      }
    });
  };
&lt;/pre&gt;&lt;/div&gt;After initializing the library (see &lt;a href="http://developers.facebook.com/docs/reference/javascript/FB.init" target="_blank"&gt;FB.init&lt;/a&gt;), the application registers a listener for &lt;b&gt;auth.sessionChange&lt;/b&gt; events (login or logout). On login, I use Facebook's Graph API to get some basic information about the current user (see &lt;a href="http://developers.facebook.com/docs/reference/javascript/FB.api" target="_blank"&gt;FB.api&lt;/a&gt;). In the response callback handler for &lt;span class="code"&gt;FB.api('/me')&lt;/span&gt;, I invoke a JavaScript function &lt;span class="code"&gt;onFbLogin&lt;/span&gt; that executes the &lt;span class="el"&gt;#{guest.onFbLogin()}&lt;/span&gt; action: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;a4j:form prependId="false"&amp;gt;
  &lt;b&gt;&amp;lt;s:token/&amp;gt;&lt;/b&gt;
  &amp;lt;a4j:jsFunction immediate="true" name="onFbLogin" ajaxSingle="true" action="#{guest.onFbLogin()}"&amp;gt;
    ... params here ...
  &amp;lt;/a4j:jsFunction&amp;gt;
&amp;lt;/a4j:form&amp;gt;
&lt;/pre&gt;&lt;/div&gt;Notice that I'm using Seam's &lt;a href="http://docs.jboss.org/seam/2.2.1.CR1/reference/en-US/html_single/#d0e29211" target="_blank"&gt;&amp;lt;s:token/&amp;gt;&lt;/a&gt; tag to prevent cross-site request forgery since the &lt;span class="el"&gt;#{guest.onFbLogin()}&lt;/span&gt; action simply trusts that the user was authenticated by Facebook on the client-side. For an overview of the thinking behind the &lt;span class="xml-elm"&gt;&amp;lt;s:token/&amp;gt;&lt;/span&gt; tag, I refer you to Dan Allen's post at &lt;a href="http://seamframework.org/Community/NewComponentTagStokenAimedToGuardAgainstCSRF" target="_blank"&gt;http://seamframework.org/Community/NewComponentTagStokenAimedToGuardAgainstCSRF&lt;/a&gt;. However, please realize that you must set &lt;span class="code"&gt;javax.faces.STATE_SAVING_METHOD&lt;/span&gt; to "server" in web.xml for this method to secure your forms: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;context-param&amp;gt;
    &amp;lt;param-name&amp;gt;javax.faces.STATE_SAVING_METHOD&amp;lt;/param-name&amp;gt;
    &amp;lt;param-value&amp;gt;server&amp;lt;/param-value&amp;gt;
  &amp;lt;/context-param&amp;gt;
&lt;/pre&gt;&lt;/div&gt;The action handler on the server side is straightforward because Seam supports alternative authentication mechanisms out of the box. Specifically, all you need to do is invoke the &lt;span class="code"&gt;acceptExternallyAuthenticatedPrincipal&lt;/span&gt; method of the org.jboss.seam.security.Identity object. I utilized the existing &lt;span class="seam-comp"&gt;guest&lt;/span&gt; component because it handles other guest-related actions such as register and login (see &lt;span class="file-path"&gt;src/hot/org/vinwiki/user/GuestSupport.java&lt;/span&gt;): &lt;div class="code"&gt;&lt;pre&gt;Identity.instance().acceptExternallyAuthenticatedPrincipal(new FacebookPrincipal(user.getUserName()));
    Contexts.getSessionContext().set("currentUser", user);
    Events.instance().raiseEvent(Identity.EVENT_POST_AUTHENTICATE, Identity.instance());
&lt;/pre&gt;&lt;/div&gt;I also raise the &lt;span class="code"&gt;Identity.EVENT_POST_AUTHENTICATE&lt;/span&gt; event manually so that my &lt;span class="seam-comp"&gt;nav&lt;/span&gt; component can re-configure the default view for the authenticated user instead of showing the guest view.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;IV. Show Facebook Connect Button on Login Panel&lt;/div&gt;Next, we need to let users know that they can log in (or register) using their Facebook credentials. This is accomplished with the &lt;span class="xml-elm"&gt;&amp;lt;fb:login-button&amp;gt;&lt;/span&gt; tag (see &lt;span class="file-path"&gt;view/WEB-INF/facelets/guestSupport.xhtml&lt;/span&gt;): &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;fb:login-button perms="email,publish_stream"&amp;gt;
    &amp;lt;fb:intl&amp;gt;Login with Facebook&amp;lt;/fb:intl&amp;gt;
  &amp;lt;/fb:login-button&amp;gt;
&lt;/pre&gt;&lt;/div&gt;The new RichFaces Login dialog looks like: &lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/fb_login.png" alt="Login Dialog with Facebook Connect"/&gt;&lt;br /&gt;
Notice that VinWiki requests access to the &lt;b&gt;email&lt;/b&gt; and &lt;b&gt;publish_stream&lt;/b&gt; &lt;a href="http://developers.facebook.com/docs/authentication/permissions" target="_blank"&gt;extended permissions&lt;/a&gt;. The &lt;b&gt;publish_stream&lt;/b&gt; permission allows users to share wines of interest found on VinWiki with their friends on Facebook. When accessing VinWiki for the first time, users will see a dialog that allows them to grant permissions to the application: &lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/fb_ext_perms.png" alt="Facebook Extended Permissions"/&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;V. Logout&lt;/div&gt;It's doubtful whether many users will ever explicitly log out of your site unless they are accessing it from a public computer. Consequently, you'll want to keep your session timeout value as low as possible. That said, you still need to offer the ability to log out. In &lt;span class="file-path"&gt;view/WEB-INF/facelets/headerControls.xhtml&lt;/span&gt;, the logout action is implemented using a simple JSF commandLink: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;h:commandLink rendered="#{nav.isFbSession()}" onclick="FB.logout();" styleClass="hdrLink"&amp;gt;
  &amp;lt;h:outputText value="Logout"/&amp;gt;
&amp;lt;/h:commandLink&amp;gt;
&lt;/pre&gt;&lt;/div&gt;The magic is in our sessionChange event listener discussed above. The Facebook JavaScript function &lt;span class="code"&gt;FB.logout()&lt;/span&gt; triggers an &lt;b&gt;auth.sessionChange&lt;/b&gt; event, which in turn calls &lt;span class="code"&gt;onFbLogout()&lt;/span&gt; to execute the &lt;span class="el"&gt;#{guest.onFbLogout()}&lt;/span&gt; action on the server.&lt;/p&gt;&lt;p&gt;So that covers authentication using Facebook Connect. There are many other ways to integrate your application with Facebook. In the next section, I'll implement a way for users to share content in your application with their friends on Facebook. For this feature, it is helpful to have bookmarkable URLs.&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Sharing Content with Friends&lt;/div&gt;In VinWiki, users may want to share specific wines of interest with their friends on Facebook. For this, I used Facebook's &lt;a href="http://developers.facebook.com/docs/reference/plugins/like" target="_blank"&gt;Like&lt;/a&gt; social plug-in. There are other options, including just posting a shared link to the user's activity stream. There's not much to integrating the Like social plug-in into your page once you've accounted for bookmarkable URLs (see posting 1). The &lt;span class="xml-elm"&gt;&amp;lt;fb:like&amp;gt;&lt;/span&gt; tag will use the URL of the current page if you don't specify an &lt;u&gt;href&lt;/u&gt; attribute. However, I want to make sure the URL that is shared with Facebook is as clean as possible. Thus, I introduced a new setting for the &lt;span class="seam-comp"&gt;org.vinwiki.App&lt;/span&gt; component named &lt;b&gt;baseUrl&lt;/b&gt;. You should change this to match your server in &lt;span class="file-path"&gt;resources/WEB-INF/components.xml&lt;/span&gt;: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;component name="app" auto-create="true" scope="application" class="org.vinwiki.App"&amp;gt;
    ...
    &amp;lt;property name="baseUrl"&amp;gt;http://192.168.1.2:8080/vinwiki/&amp;lt;/property&amp;gt;
  &amp;lt;/component&amp;gt;
&lt;/pre&gt;&lt;/div&gt;I also decided to add the &lt;a href="http://developers.facebook.com/docs/opengraph" target="_blank"&gt;Open Graph&lt;/a&gt; meta tags to the header of my wine details page, &lt;span class="file-path"&gt;view/wine.xhtml&lt;/span&gt;: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;meta property="og:title" content="#{currentWine.fullName}"/&amp;gt;
  &amp;lt;meta property="og:type" content="drink"/&amp;gt;
  &amp;lt;meta property="og:url" content="#{viewWine.openGraphUrl}"/&amp;gt;
  &amp;lt;meta property="og:site_name" content="VinWiki"/&amp;gt;
  &amp;lt;meta property="og:description" content="#{jsf.truncate(currentWine.description,100,'...')}"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;Presumably, when you share a page that supports the Open Graph protocol, Facebook is able to present this additional metadata to your friends. Of course, the URL needs to be public before this will actually work ;-)&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;What's Next?&lt;/div&gt;In the next and final post in this series, I'll integrate Mahout for making wine recommendations and discuss some considerations for scaling the application.&lt;/p&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6666528949054177156-7245762356043798339?l=thelabdude.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thelabdude.blogspot.com/feeds/7245762356043798339/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://thelabdude.blogspot.com/2010/07/vinwiki-part-3-authentication-with.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/7245762356043798339'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/7245762356043798339'/><link rel='alternate' type='text/html' href='http://thelabdude.blogspot.com/2010/07/vinwiki-part-3-authentication-with.html' title='VinWiki Part 3: Authentication with Facebook Connect and Sharing Content with Friends'/><author><name>thelabdude</name><uri>http://www.blogger.com/profile/08032248788175889303</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6666528949054177156.post-6540483731847943950</id><published>2010-06-14T13:34:00.000-07:00</published><updated>2011-07-26T11:36:41.630-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EntityManager'/><category scheme='http://www.blogger.com/atom/ns#' term='lucene 2.9'/><category scheme='http://www.blogger.com/atom/ns#' term='advanced search'/><category scheme='http://www.blogger.com/atom/ns#' term='lucene'/><category scheme='http://www.blogger.com/atom/ns#' term='type-ahead suggestion'/><category scheme='http://www.blogger.com/atom/ns#' term='term highlighting'/><category scheme='http://www.blogger.com/atom/ns#' term='FullTextEntityManager'/><category scheme='http://www.blogger.com/atom/ns#' term='spell correction'/><category scheme='http://www.blogger.com/atom/ns#' term='hibernate search'/><title type='text'>VinWiki Part 2: Full-text Search with Hibernate Search and Lucene</title><content type='html'>&lt;style&gt;
.todo {
  font:bold 14px verdana;
  color:#FF0000;
  padding: 15px;
}
.annotation {
  font:normal 12px monospace;
  color:#0033CC;
}
.seam-comp {
  font:bold 12px monospace;
  color:#660099;
}
.sectionHdr {
  font-size:15px;
  font-weight:bold;
  color:#CC6633;
  margin-top:20px;
  margin-bottom:15px;
}
.subHdr {
  font-size:13px;
  font-weight:bold;
  color:#CC6633;
  margin-top:15px;
  margin-bottom:10px;
}
.sub2Hdr {
  font-size:12px;
  font-weight:bold;
  font-style:italic;
  color:#CC6633;
  margin-top:10px;
  margin-bottom:5px;
}
.file-path {
  color:#336600;
}
.code {
  font:normal 12px monospace;
  color:#333333;
}
.tip {
  padding: 5px;
  background: #FFFFCC;
  border: 1px solid #F0F0F0;
  width: 500px;
  margin:20px;
}
.java-class {
  font:bold 12px monospace;
  color:#003399;
}
.java-intf {
  font:italic 12px monospace;
  color:#003399;
}
.java-kw {
  font:normal 12px monospace;
  color:#0000FF;
}
.java-id {
  font:normal 12px monospace;
  color:#FF0000;
}
.java-str {
  font:normal 12px monospace;
  color:#FF00CC;
}
.java-cmt {
  font:normal 12px monospace;
  color:#339900;
}
.el {
  font:normal 12px monospace;
  color:#990000;
}
.xml-elm {
  font:normal 12px monospace;
  color:#999933;
}
.hashLink {
  font-size:11px;
  font-style:italic;
}
.screenshot {
  margin: 20px;
}
td {
font-family:verdana;font-size:12px;
}
&lt;/style&gt; &lt;span style="font-family:verdana;font-size:12px;"&gt; &lt;div class="sectionHdr"&gt;Introduction&lt;/div&gt;&lt;p&gt;This is the second post in a four-part series about a wine recommendation Web application built using open source Java technology. The purpose of this series is to document key design and implementation decisions that can be applied to other Web applications. Please read the first &lt;a href="http://thelabdude.blogspot.com/2010/05/building-intelligent-web-app-using-seam.html"&gt;post&lt;/a&gt; for an overview of the project. You can download the project (with source) from &lt;a href="http://vinwiki.com/thelabdude/vinwiki/vinwiki.zip"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In this posting, I leverage Hibernate Search and Lucene to provide full-text search for wines of interest. This post is merely an attempt to complement the already existing documentation on Lucene and Hibernate Search. If you are familiar with Hibernate and are new to Lucene, then I recommend starting out with the online manual for &lt;a target="_blank" href="http://docs.jboss.org/hibernate/search/3.1/reference/en/html_single/"&gt;Hibernate Search 3.1.1 GA&lt;/a&gt;. In addition, I highly recommend reading &lt;a href="http://www.manning.com/bernard/" target="_blank"&gt;Hibernate Search in Action&lt;/a&gt; and &lt;a href="http://www.manning.com/hatcher3/" target="_blank"&gt;Lucene in Action (Second Edition)&lt;/a&gt;; both are extremely well-written books by experts in each field.&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Basic Requirements&lt;/div&gt;Full-text search allows users to find wines using simple keywords, such as "zinfandel" or "dry creek valley". Users have come to expect certain features from a full-text search engine. 
The following table summarizes the key features our wine search engine will support: &lt;table width="100%" cellpadding="3" cellspacing="0"&gt;&lt;tr bgcolor="#cccccc"&gt;&lt;td width="200"&gt;&lt;b&gt;Feature&lt;/b&gt;&lt;/td&gt;&lt;td&gt;&lt;b&gt;Description&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#ffffff"&gt; &lt;td&gt;Paginated search results&lt;br /&gt;
(see first &lt;a href="http://thelabdude.blogspot.com/2010/05/building-intelligent-web-app-using-seam.html"&gt;post&lt;/a&gt; for solution)&lt;/td&gt; &lt;td&gt;Return the first 5-10 results, ordered by relevance on the first page of the search results. Allow users to retrieve more results by advancing the page navigator.&lt;br /&gt;
Allow users to set a bounded preference for how many items to show on each page.&lt;/td&gt; &lt;/tr&gt;
&lt;tr bgcolor="#f0f0f0"&gt; &lt;td&gt;Query term highlighting&lt;br /&gt;
(&lt;a href="#highlighting" class="hashLink"&gt;view solution&lt;/a&gt;)&lt;/td&gt; &lt;td&gt;Highlight query terms in search results to allow users to quickly scan the results for the most relevant items.&lt;/td&gt; &lt;/tr&gt;
&lt;tr bgcolor="#ffffff"&gt; &lt;td&gt;Spell correction&lt;br /&gt;
(&lt;a href="#spell" class="hashLink"&gt;view solution&lt;/a&gt;)&lt;/td&gt; &lt;td&gt;Ability to detect and correct for misspelled terms in the query.&lt;/td&gt; &lt;/tr&gt;
&lt;tr bgcolor="#f0f0f0"&gt; &lt;td&gt;Type-ahead suggestions&lt;br /&gt;
(&lt;a href="#suggestions" class="hashLink"&gt;view solution&lt;/a&gt;)&lt;/td&gt; &lt;td&gt;Show a drop-down selection box containing suggested search terms after the user has typed at least 3 characters.&lt;/td&gt; &lt;/tr&gt;
&lt;tr bgcolor="#ffffff"&gt; &lt;td&gt;Advanced search form&lt;br /&gt;
(&lt;a href="#advanced" class="hashLink"&gt;view solution&lt;/a&gt;)&lt;/td&gt; &lt;td&gt;Allow users to fine-tune the search with an advanced search form.&lt;/td&gt; &lt;/tr&gt;
&lt;/table&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Setup&lt;/div&gt;From the previous post, you'll recall that I'm using Hibernate to manage my persistent objects. As such, I've chosen to leverage &lt;a href="http://www.hibernate.org/subprojects/search.html" target="_blank"&gt;Hibernate Search&lt;/a&gt; (referred to as &lt;b&gt;HS&lt;/b&gt; hereafter) to integrate my Hibernate-based object model with Lucene. For now, I'm using Hibernate Search v. &lt;b&gt;3.1.1 GA&lt;/b&gt; with Hibernate Core 3.3.1 GA. On the Lucene side, I'm using version 2.9.2 and a few classes from Solr 1.4. In the last posting in this series, we'll upgrade to the latest Hibernate Search and Core, which rely on JPA 2 and thus will require a move up to JBoss 6.x. As far as I know, the latest HS and Core classes have trouble on JBoss 4.2 and 5.1 because of their dependency on JPA 2 (&lt;i&gt;let me know if you get it working&lt;/i&gt;).&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Why not use the database to search?&lt;/div&gt;Some may be wondering why I'm using Lucene instead of doing full-text search in my database. While possible, I feel the database is not the best tool for full-text searching. For an excellent treatment of the mismatch between relational databases and full-text search, I recommend reading &lt;a href="http://www.manning.com/bernard/" target="_blank"&gt;Hibernate Search in Action&lt;/a&gt;. The first chapter provides a strong case for using Lucene instead of your database for full-text search. Just in terms of scalability, the database is the most expensive and complex layer to scale in most applications. Offloading searches to a local Lucene index running on each node in your cluster reduces the amount of work your database is performing and ensures searches return quickly. 
Moreover, distributing a Lucene index across a cluster of app servers is almost trivial since Hibernate Search has built-in clustering support.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Project Configuration Changes&lt;/div&gt;The following JAR files need to be included in your Web application's LIB directory (/WEB-INF/lib): &lt;div class="code"&gt;&lt;pre&gt;hibernate-search.jar
    lucene-core-2.9.2.jar
    lucene-highlighter-2.9.2.jar
    lucene-misc-2.9.2.jar
    lucene-snowball-2.9.2.jar
    lucene-spatial-2.9.2.jar
    lucene-spellchecker-2.9.2.jar
    lucene-memory-2.9.2.jar
    solr-core-1.4.0.jar
    solr-solrj-1.4.0.jar
&lt;/pre&gt;&lt;/div&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;How does Seam know to use Hibernate Search's FullTextEntityManager?&lt;/div&gt;Seam's &lt;span class="java-class"&gt;org.jboss.seam.persistence.HibernatePersistenceProvider&lt;/span&gt; automatically detects if Hibernate Search is available on the classpath. If HS is available, then Seam uses the &lt;span class="java-class"&gt;org.jboss.seam.persistence.FullTextEntityManagerProxy&lt;/span&gt; instead of the default EntityManagerProxy, meaning that you will have access to a FullTextEntityManager wherever you have a Seam-injected EntityManager. You also need to add a few more properties to your persistence deployment descriptor (&lt;span class="file-path"&gt;resources/META-INF/persistence-*.xml&lt;/span&gt;): &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;property name="hibernate.search.default.directory_provider" value="org.hibernate.search.store.FSDirectoryProvider"/&amp;gt;
  &amp;lt;property name="hibernate.search.default.indexBase" value="lucene_index"/&amp;gt;
  &amp;lt;property name="hibernate.search.reader.strategy" value="shared"/&amp;gt;
  &amp;lt;property name="hibernate.search.worker.execution" value="sync"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;We'll make some adjustments to these settings as the project progresses, but these will suffice for now. &lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Indexing&lt;/div&gt;The first step in providing full-text search capabilities is to index the content you want to search. In most cases, our users will want to find Wine objects and to a lesser extent Winery objects. Let's start by indexing Wine objects.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Indexing Wine Objects&lt;/div&gt;I'll tackle indexing Wine objects in a few passes, progressively adding features, so let's start with the basics. First, we tell HS to index Wine objects using the &lt;span class="annotation"&gt;@Indexed&lt;/span&gt; annotation on the class. As for what to search, I tend to favor using a single field to hold all searchable text for each object because it simplifies working with other Lucene extensions, such as the &lt;b&gt;More Like This&lt;/b&gt; and &lt;b&gt;term highlighting&lt;/b&gt; features. &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="annotation"&gt;@Field(name=DEFAULT_SEARCH_FIELD, index=Index.TOKENIZED)
    @Analyzer(definition="wine_stem_en")&lt;/span&gt;
    public String getContent() {
        // return one string containing all searchable text for this object
    }
&lt;/pre&gt;&lt;/div&gt;Notice that the content field is &lt;b&gt;Index.TOKENIZED&lt;/b&gt; during indexing. This means that the String value returned by the &lt;span class="code"&gt;getContent&lt;/span&gt; method will be broken into a stream of tokens using a Lucene Analyzer; each token returned by the Analyzer will be searchable. Your choice of Analyzer depends on the type of text you are processing, which in our case is English text provided by our users when creating Wine objects. Hibernate Search leverages the text analysis framework provided by &lt;a href="http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters" target="_blank"&gt;Solr&lt;/a&gt;. You can specify your Analyzer using the &lt;span class="annotation"&gt;@Analyzer&lt;/span&gt; annotation. Behind the scenes, Hibernate Search creates the Analyzer using the Factory defined by an &lt;span class="annotation"&gt;@AnalyzerDef&lt;/span&gt; class annotation. In our case &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="annotation"&gt;@AnalyzerDef(name = "wine_stem_en",
        tokenizer = @TokenizerDef(factory = org.vinwiki.search.Lucene29StandardTokenizerFactory.class),
        filters = {
          @TokenFilterDef(factory = StandardFilterFactory.class),
          @TokenFilterDef(factory = LowerCaseFilterFactory.class),
          @TokenFilterDef(factory = StopFilterFactory.class),
          @TokenFilterDef(factory = EnglishPorterFilterFactory.class)
    })&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;This looks more complicated than it is ... Let's take a closer look at the &lt;span class="annotation"&gt;@AnalyzerDef&lt;/span&gt; annotation: &lt;table&gt;&lt;tr&gt; &lt;td&gt;&lt;pre&gt;  name = "wine_stem_en"
&lt;/pre&gt;This gives our analyzer definition a name so we can refer to it in the &lt;span class="annotation"&gt;@Analyzer&lt;/span&gt; annotation. This will also come in handy when we start parsing user queries using Lucene's QueryParser, which also requires an Analyzer.&lt;/td&gt; &lt;/tr&gt;
&lt;tr&gt; &lt;td&gt;&lt;pre&gt;  tokenizer = 
  @TokenizerDef(factory = org.vinwiki.search.Lucene29StandardTokenizerFactory.class)
&lt;/pre&gt;Specify a factory for Tokenizer instances. In this case, I'm supplying a custom factory class that creates a Lucene 2.9.x StandardTokenizer as the default factory provided by Solr creates a Lucene 2.4 StandardTokenizer (most likely for backwards compatibility).&lt;/td&gt; &lt;/tr&gt;
&lt;tr&gt; &lt;td&gt;&lt;pre&gt;  filters = {
    @TokenFilterDef(factory = StandardFilterFactory.class),
    @TokenFilterDef(factory = LowerCaseFilterFactory.class),
    @TokenFilterDef(factory = StopFilterFactory.class),
    @TokenFilterDef(factory = EnglishPorterFilterFactory.class)
  }
&lt;/pre&gt;A tokenizer can pass tokens through a filter-chain to perform various transformations on each token. In this case, we're passing the tokens through four filters, in the order listed above.&lt;br /&gt;
&lt;ol&gt;&lt;li&gt;StandardFilter: Removes dots from acronyms and 's from the end of tokens. Works only on typed tokens, i.e., those produced by StandardTokenizer or equivalent.&lt;/li&gt;
&lt;li&gt;LowerCaseFilter: Lowercases letters in tokens&lt;/li&gt;
&lt;li&gt;StopFilter: Discards common English words like "an", "the", and "of".&lt;/li&gt;
&lt;li&gt;EnglishPorterFilter: Extracts the stem for each token using Porter Stemmer implemented by the Lucene Snowball extension.&lt;/li&gt;
&lt;/ol&gt;&lt;/td&gt; &lt;/tr&gt;
&lt;/table&gt;You can index other fields, such as the wine type and style, using the &lt;span class="annotation"&gt;@Field&lt;/span&gt; annotation. &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="annotation"&gt;@Field(name="type",index=Index.UN_TOKENIZED,store=Store.YES)&lt;/span&gt;    
    public WineType getType() {
        return this.type;
    }
&lt;/pre&gt;&lt;/div&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Building the Index&lt;/div&gt;If you worked through the first post, then you should already have a database containing 2,025 wines. Hibernate Search will automatically index new Wine objects for us when the insert transaction commits, but we need to manually index the existing wines. During startup, the &lt;span class="seam-comp"&gt;org.vinwiki.search.IndexHelper&lt;/span&gt; component counts the number of documents in the index and re-builds the index from the objects in the database if needed. When a re-build is triggered, you should see log output similar to: &lt;pre&gt;[IndexHelper] Observed event org.vinwiki.event.REBUILD_INDEX from Thread QuartzScheduler1_Worker-8
[IndexHelper] Re-building Wine index for 2025 objects.
[IndexHelper] Flushed index update 300 from Thread Quartz ...
[IndexHelper] Flushed index update 600 from Thread Quartz ...
[IndexHelper] Flushed index update 900 from Thread Quartz ...
[IndexHelper] Flushed index update 1200 from Thread Quartz ...
[IndexHelper] Flushed index update 1500 from Thread Quartz ...
[IndexHelper] Flushed index update 1800 from Thread Quartz ...
[IndexHelper] Took 4094 (ms) to re-build the index containing 2025 documents.
&lt;/pre&gt;Alright, with a few lines of code and some clever annotations, we now have a full-text search index for Wine objects. Let's do some searching! &lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Basic Search Box&lt;/div&gt;From the first post, we already have a server-side pagination framework in place for displaying Wine objects. To integrate full-text search capabilities, we simply need to implement the &lt;span class="java-intf"&gt;org.vinwiki.action.PagedDataFetcher&lt;/span&gt; interface in terms of Hibernate Search. Here is a simple implementation (we'll add more features later): &lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/search_java.jpg" alt="Search.java source listing"/&gt;&lt;br /&gt;
This is sufficient to get us searching quickly.&lt;/p&gt;&lt;p&gt;Next, we need to pass the user's query from the search box to our &lt;span class="seam-comp"&gt;nav&lt;/span&gt; component. In &lt;span class="file-path"&gt;view/WEB-INF/facelets/searchControls.xhtml&lt;/span&gt;, update the JSF inputText tag to bind the user's query to an Event-scoped component named &lt;span class="seam-comp"&gt;searchCriteria&lt;/span&gt;: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;h:inputText id="basicSearchQuery" styleClass="searchField" value="&lt;span class="el"&gt;#{searchCriteria.basicSearchQuery}&lt;/span&gt;"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;The &lt;span class="seam-comp"&gt;searchCriteria&lt;/span&gt; component is injected into &lt;span class="seam-comp"&gt;nav&lt;/span&gt;, which passes it on to an instance of &lt;span class="code"&gt;org.vinwiki.search.Search&lt;/span&gt;, which implements the PagedDataFetcher interface. You may be thinking that having a separate component for a single input field is overkill, but the &lt;span class="seam-comp"&gt;searchCriteria&lt;/span&gt; component will also come in very handy once we add an advanced search form. Here is what happens in &lt;span class="seam-comp"&gt;nav&lt;/span&gt;: &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="annotation"&gt;@In(required=false)&lt;/span&gt; private SearchCriteria searchCriteria;
    ...
    public void doSearch() {
        cleanup();
        dataFetcher = new FeatureRichSearch(log, &lt;b&gt;searchCriteria&lt;/b&gt;);
        initFirstPageOfItems(); &lt;span class="java-cmt"&gt;// fetches the first page of Items&lt;/span&gt;
    }
&lt;/pre&gt;&lt;/div&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Execute Search on Enter&lt;/div&gt;There is the obligatory &lt;b&gt;Search&lt;/b&gt; button next to my search input field, but most users will just hit enter to execute the search. RichFaces makes this trivial to support using the &lt;span class="xml-elm"&gt;&amp;lt;rich:hotKey&amp;gt;&lt;/span&gt; tag: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;rich:hotKey key="return" selector="#basicSearchQuery" 
          handler="#{rich:element('searchBtn')}.onclick();return false;"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;The &lt;span class="xml-elm"&gt;&amp;lt;rich:hotKey&amp;gt;&lt;/span&gt; tag binds a JavaScript event listener for the search box to submit the form when the user hits "enter". Be sure to use the &lt;b&gt;selector&lt;/b&gt; attribute to limit the listener to the search box and not all input text boxes on the page! Any valid jQuery selector will do ...&lt;/p&gt;&lt;p&gt;At this point, re-compile and re-deploy these changes (at the command-line: &lt;b&gt;seam clean reexplode&lt;/b&gt;). A search for "zinfandel" results in: &lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/zinfandel.png" alt="Search results for zinfandel"/&gt;&lt;br /&gt;
Hopefully, you get something similar in your environment ;-) Yes, those images are terribly ugly right now ... They were pulled dynamically from Freebase; if this were a real production application, then I'd work on getting better images. Next, we'll add some more features to the search engine, such as query term highlighting, spell correction, more like this, and filters. These additional features are implemented in the &lt;span class="java-class"&gt;org.vinwiki.search.FeatureRichSearch&lt;/span&gt; class.&lt;/p&gt;&lt;a name="highlighting"&gt;&amp;nbsp;&lt;/a&gt; &lt;p&gt;&lt;div class="sectionHdr"&gt;Highlighting Query Terms in Results&lt;/div&gt;Highlighting query terms in results is a common and very useful feature of most search engines. The &lt;span class="file-path"&gt;contrib/highlighter&lt;/span&gt; module in Lucene makes this feature easy to implement. There are two aspects to consider when displaying results to the user. First, we want to display the best "fragment" from the document text for the specific query. In other words, rather than just showing the first X characters of the description, we should show the best section of the description matching the user's query. Second, highlight the query terms in the chosen fragment. However, most of the description text for our Wine objects is short enough to display the full description for each result. Based on my current UI layout, 390 is a safe maximum length for fragments as it is long enough to provide useful information about each wine, yet short enough to keep from cluttering up the screen with text. This is something you'll have to work out for your application.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Highlighting Fragments with Hibernate Search&lt;/div&gt;Recall that we stuff all the searchable text information about a Wine into a single searchable field "content". While good for searching, it's probably not something you want to display to your users directly. 
Instead, I've chosen to highlight terms in the wine description only. Here is how to construct a Highlighter (from the Lucene &lt;span class="code"&gt;org.apache.lucene.search.highlight&lt;/span&gt; package): &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="java-cmt"&gt;// Pass the Lucene Query to a Scorer that extracts matching spans for the query
    // and then uses these spans to score each fragment&lt;/span&gt;    
    QueryScorer scorer = new QueryScorer(luceneQuery, Wine.DEFAULT_SEARCH_FIELD);
    &lt;span class="java-cmt"&gt;// Highlight using a CSS style&lt;/span&gt;    
    SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("&amp;lt;span class='termHL'&amp;gt;", "&amp;lt;/span&amp;gt;");
    Highlighter highlighter = new Highlighter(formatter, scorer);
    highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer, Nav.MAX_FRAGMENT_LEN));
&lt;/pre&gt;&lt;/div&gt;Also notice that I get a reference to the "wine_stem_en" Analyzer using: &lt;div class="code"&gt;&lt;pre&gt;Analyzer analyzer = searchFactory.getAnalyzer("wine_stem_en");
&lt;/pre&gt;&lt;/div&gt;While iterating over the results, we pass the Analyzer and the actual description text to the Highlighter. The observant reader will notice that I didn't "stem" the description text during indexing, but now I am stemming the description text for highlighting. You'll see why I'm taking this approach in the next section when I add spell correction. &lt;div class="code"&gt;&lt;pre&gt;highlightedText = highlighter.getBestFragment(analyzer, Wine.SPELL_CHECK_SEARCH_FIELD, description);
&lt;/pre&gt;&lt;/div&gt;FeatureRichSearch saves the dynamically highlighted fragments in a Map because fragments are specific to each query. If the query changes, then so must the fragments for the results. The &lt;span class="code"&gt;getResultDisplayText&lt;/span&gt; method on the &lt;span class="seam-comp"&gt;nav&lt;/span&gt; component interfaces with the FeatureRichSearch dataFetcher to get fragments for search results. &lt;/p&gt;&lt;a name="spell"&gt;&amp;nbsp;&lt;/a&gt; &lt;p&gt;&lt;div class="sectionHdr"&gt;Spell Correction&lt;/div&gt;For spell correction, I'm using the &lt;span class="file-path"&gt;contrib/spellchecker&lt;/span&gt; module in Lucene (see &lt;a target="_blank" href="http://wiki.apache.org/lucene-java/SpellChecker"&gt;http://wiki.apache.org/lucene-java/SpellChecker&lt;/a&gt;), which is a good place to start, but you should realize that handling spelling mistakes in search is a complex problem so this is by no means the "best" solution.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Spell Correction Dictionary&lt;/div&gt;To begin, you need a dictionary of correctly spelled terms from which to base spell corrections on. The spellchecker module builds a supplementary index from the terms in your main index using fields based on &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Ngram"&gt;n-grams&lt;/a&gt; of each term. Internally, SpellChecker creates several n-gram fields for each word depending on word length. Here is a screenshot of the word "zinfandel" in the SpellChecker index courtesy of &lt;a target="_blank" href="http://code.google.com/p/luke/"&gt;Luke&lt;/a&gt;: &lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/spell_luke.jpg" alt="SpellChecker Index in Luke"/&gt;&lt;br /&gt;
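To make the n-gram idea concrete, here is a minimal, stand-alone sketch in plain Java (no Lucene required) that splits a word into overlapping character n-grams. The class and method names are hypothetical, and the gram sizes 3 and 4 are only an illustration; this is not SpellChecker's actual code, which chooses its own gram sizes based on word length. &lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;
```java
// Hypothetical illustration only; not part of Lucene or VinWiki.
final class NGramSketch {

    // Returns every contiguous substring of length n, in order of appearance.
    // Assumes n is positive.
    static String[] ngrams(String word, int n) {
        // A word of length L has L - n + 1 n-grams, or none if it is shorter than n.
        int count = Math.max(0, word.length() - n + 1);
        String[] grams = new String[count];
        for (int i = 0; i != count; i++) {
            grams[i] = word.substring(i, i + n);
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", ngrams("zinfandel", 3)));
        System.out.println(String.join(" ", ngrams("zinfandel", 4)));
    }
}
```
&lt;/pre&gt;&lt;/div&gt;
For "zinfandel", this prints the 3-grams zin inf nfa fan and nde del, followed by the 4-grams zinf infa nfan fand ande ndel. A misspelling such as "zinfandal" still shares most of these grams with the correct word, which is what lets an n-gram index find candidate corrections quickly. &lt;br /&gt;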
Recall that we're stemming terms in our default search field "content". If you build the spell checker index from this field, the dictionary will only contain stemmed terms. This has the undesired side effect that the corrected term will look misspelled to the user. For example, if you search for "strawbarry", the spell checker will probably recommend "strawberri", which is good, but we really want to show the user &lt;i&gt;"Did you mean 'strawberry'?"&lt;/i&gt;. Thus, we need to base our spell checker index off non-stemmed text. When we ask the spell checker to suggest a term for "strawbarry", then it will return "strawberry". When we query the search index, we need to query for "strawberri", which is why I pass the "wine_stem_en" Analyzer to the QueryParser after applying the spell correction process.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Hibernate Search Automatic Indexing and Spell Correction&lt;/div&gt;Lucene's SpellChecker builds a supplementary index from the terms in your main index. If your main index changes, e.g. after adding a new entity, then you need to update the spell correction index. From discussing this within the HS community, there's nothing built into HS to help you determine when to update the spell checker index, especially if you're using &lt;span class="code"&gt;hibernate.search.worker.execution=async&lt;/span&gt;. In other words, you don't know when Hibernate Search is finished updating the Lucene index. You have a couple of options to consider here: 1) update the spell index incrementally as new content is added (or updated), or 2) update the index periodically in a background job. This depends on your requirements and how much you trust the source of the changes to the main index. 
For now, I'm using Seam Events to incrementally update the spell index after a new Wine is added or an existing Wine is updated (see &lt;span class="file-path"&gt;src/hot/org/vinwiki/search/IndexHelper.java&lt;/span&gt; for details).&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Edit Distance as Measure of Similarity Between Terms&lt;/div&gt;Lucene's SpellChecker uses the notion of "edit distance" to measure the similarity between terms. When you call &lt;span class="code"&gt;spellChecker.suggestSimilar(wordToRespell, ...)&lt;/span&gt;, the checker consults an instance of &lt;span class="java-intf"&gt;org.apache.lucene.search.spell.StringDistance&lt;/span&gt; to score hits from the spell checker index. Here's how a SpellChecker is constructed: &lt;div class="code"&gt;&lt;pre&gt;Directory dir = FSDirectory.open(new java.io.File("lucene_index/spellcheck"));
    SpellChecker spell = new SpellChecker(dir);
    spell.setStringDistance(new LevensteinDistance());
    spell.setAccuracy(0.8f);
    String[] suggestions = spell.suggestSimilar(wordToRespell, 10,
                             indexReader, Wine.SPELL_CHECK_SEARCH_FIELD, true);
&lt;/pre&gt;&lt;/div&gt;The &lt;a target="_blank" href="http://en.wikipedia.org/wiki/Levenshtein_distance"&gt;LevensteinDistance&lt;/a&gt; class implements StringDistance by calculating the minimum number of edits needed to transform one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a single character.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Spell Correction Heuristics&lt;/div&gt;The spellchecker module does a fine job of suggesting possible terms, but we still need to decide how to handle these suggestions depending on the structure of the user's query. To keep things simple, our first heuristic applies to single term queries where the mis-spelled term &lt;u&gt;does not exist&lt;/u&gt; in our spell checker index. In this case, we simply re-issue the search using the best suggestion from the spell checker. Here is a screen shot of how we present the results to the user: &lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/strawbarry.png" alt="Spell correction for single term"/&gt;&lt;br /&gt;
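The single-term heuristic above can be sketched in plain Java as follows. This is a simplified illustration, not the actual &lt;span class="code"&gt;checkSpelling&lt;/span&gt; method: the class name, the tiny in-memory dictionary, and the choice to pick the closest word by raw edit distance are all assumptions, and unlike Lucene's SpellChecker there is no accuracy threshold. &lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;
```java
// Hypothetical illustration only; not the VinWiki or Lucene implementation.
final class SpellHeuristicSketch {

    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions, or substitutions needed to turn a into b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j != prev.length; j++) {
            prev[j] = j; // cost of building b's prefix from an empty string
        }
        for (int i = 1; i != a.length() + 1; i++) {
            curr[0] = i; // cost of deleting the first i characters of a
            for (int j = 1; j != b.length() + 1; j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // A term found in the dictionary is returned unchanged; an unknown term
    // is replaced by the dictionary word with the smallest edit distance.
    static String correct(String term, String[] dictionary) {
        String best = term;
        int bestDistance = Integer.MAX_VALUE;
        for (String word : dictionary) {
            if (word.equals(term)) {
                return term; // the term exists, so do not assume it is misspelled
            }
            int d = editDistance(term, word);
            if (Math.min(d, bestDistance) != bestDistance) { // d is strictly smaller
                bestDistance = d;
                best = word;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] dictionary = {"strawberry", "raspberry", "tannins"};
        System.out.println(correct("strawbarry", dictionary));
    }
}
```
&lt;/pre&gt;&lt;/div&gt;
With this sample dictionary, "strawbarry" is one substitution away from "strawberry", so that is the suggestion the search would be re-issued with. &lt;br /&gt;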
However, we can't be sure that a term is mis-spelled if it exists in the spell checker index. Thus, if a single mis-spelled term exists in the spell correction index, then you need to decide if you are going to just "OR" the mis-spelled and suggested terms together or let the user decide (it seems like Google doesn't always do one or the other, so their implementation is a bit more advanced than the one I'll provide here). &lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/velvety_tanins.jpg"/&gt;&lt;br /&gt;
In the screenshot above, the user queried for "velvety tanins"; both terms exist in the spell correction index, but "tanins" is mis-spelled. During query processing, the SpellChecker suggested "tannins" for the mis-spelled word "tanins", which is correct. Thus, we search for "velvety tanins", but also suggest a spell-corrected query "velvety tannins", allowing the user to click on the suggested correction to see better results (hopefully). Please refer to the &lt;span class="code"&gt;checkSpelling&lt;/span&gt; method in the &lt;span class="file-path"&gt;src/hot/org/vinwiki/search/FeatureRichSearch.java&lt;/span&gt; for details on how these simple heuristics are implemented.&lt;/p&gt;&lt;a name="suggestions"&gt;&amp;nbsp;&lt;/a&gt; &lt;p&gt;&lt;div class="sectionHdr"&gt;Type-ahead Search Suggestions&lt;/div&gt;Most users appreciate when an application spares them un-necessary effort, such as typing common phrases into the search box. Thus, we'll use RichFaces &lt;span class="xml-elm"&gt;&amp;lt;rich:suggestionbox&amp;gt;&lt;/span&gt; tag to provide a drop-down suggestion list of known search terms after the user types 3 or more characters. While trivial, this feature can help users be more productive with your search interface and helps minimize spelling mistakes. &lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/type_ahead.jpg" alt="Type-ahead Suggestion List"/&gt;&lt;br /&gt;
In &lt;span class="file-path"&gt;/WEB-INF/facelets/searchControls.xhtml&lt;/span&gt;, we attach the &lt;span class="xml-elm"&gt;&amp;lt;rich:suggestionbox&amp;gt;&lt;/span&gt; to the search input text field using: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;rich:suggestionbox id="suggestionBox" &lt;b&gt;for="basicSearchQuery"&lt;/b&gt; ignoreDupResponses="true"
                    immediate="true" limitToList="true" height="100" width="250" minChars="3"
                    usingSuggestObjects="true" suggestionAction="&lt;span class="el"&gt;#{nav.suggestSearchTerms}&lt;/span&gt;"
                    selfRendered="true" first="0" var="suggestion"&amp;gt;
  &amp;lt;h:column&amp;gt;&amp;lt;h:outputText value="#{suggestion}"/&amp;gt;&amp;lt;/h:column&amp;gt;
&amp;lt;/rich:suggestionbox&amp;gt;
&lt;/pre&gt;&lt;/div&gt;In the first posting, I discussed queue and traffic flood protection using RichFaces' Global Default Queue. You should definitely activate an AJAX request queue if you are using type-ahead suggestions. I've also added an animated GIF and loading message using the &lt;span class="xml-elm"&gt;&amp;lt;a4j:status&amp;gt;&lt;/span&gt; tag.&lt;/p&gt;&lt;p&gt;On the Java side, I've resorted to a simple LIKE query to the database, but you can also query a field analyzed with Lucene's &lt;span class="code"&gt;org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter&lt;/span&gt; in your full-text index.&lt;/p&gt;&lt;a name="advanced"&gt;&amp;nbsp;&lt;/a&gt; &lt;p&gt;&lt;div class="sectionHdr"&gt;Advanced Search&lt;/div&gt;Sometimes a single search field isn't enough to pin-point the information you're looking for; advanced search addresses this, albeit uncommon, need for users to formalize complex queries. The advanced search form is application specific, but most allow the user to construct a query composed of AND, OR, exact phrase, and NOT clauses, as shown in the following screen shot: &lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/advanced_search.png" alt="Advanced Search Form"/&gt;&lt;br /&gt;
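As a rough idea of how a form like this maps onto Lucene's query syntax, here is a small stand-alone sketch that assembles a query string from the four kinds of input. It is not the VinWiki SearchCriteria code: the class, method, and parameter names are hypothetical, and it assumes the terms contain no characters that are special to the query parser (a real form would need to escape them). &lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;
```java
// Hypothetical illustration only; not the VinWiki SearchCriteria class.
final class AdvancedQuerySketch {

    // Builds a query-parser string: "+" marks a required clause,
    // "-" a prohibited one, and double quotes an exact phrase.
    static String build(String[] allOf, String[] anyOf, String phrase, String[] noneOf) {
        StringBuilder q = new StringBuilder();
        for (String term : allOf) {
            q.append('+').append(term).append(' ');                      // AND
        }
        if (anyOf.length != 0) {
            // Require at least one of the optional terms by grouping them.
            q.append("+(").append(String.join(" ", anyOf)).append(") "); // OR
        }
        if (!phrase.isEmpty()) {
            q.append("+\"").append(phrase).append("\" ");                // exact phrase
        }
        for (String term : noneOf) {
            q.append('-').append(term).append(' ');                      // NOT
        }
        return q.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(build(
                new String[] {"zinfandel"},
                new String[] {"jammy", "spicy"},
                "dry creek",
                new String[] {"oak"}));
    }
}
```
&lt;/pre&gt;&lt;/div&gt;
The example prints +zinfandel +(jammy spicy) +"dry creek" -oak, which could then be handed to a QueryParser. Building the clauses programmatically with a BooleanQuery is an equally valid design and avoids the escaping problem altogether. &lt;br /&gt;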
As mentioned above, the &lt;span class="java-class"&gt;org.vinwiki.search.SearchCriteria&lt;/span&gt; class manages the advanced search form data. I'll refer you to the source code for further details about advanced search. Notice that I'm using a RangeQuery to implement the Date field on the search form.&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Testing Search&lt;/div&gt;Be careful when building tests involving Lucene because &lt;span class="file-path"&gt;lib/test/thirdparty-all.jar&lt;/span&gt; contains an older version of Lucene. To remedy this, I added the following list of JARs to the top of the &lt;b&gt;test.path&lt;/b&gt; path in build.xml: &lt;pre&gt;&amp;lt;fileset dir="${lib.dir}"&amp;gt;
    &amp;lt;include name="lucene-core-2.9.2.jar"/&amp;gt;
    &amp;lt;include name="lucene-highlighter-2.9.2.jar"/&amp;gt;
    &amp;lt;include name="lucene-misc-2.9.2.jar"/&amp;gt;
    &amp;lt;include name="lucene-snowball-2.9.2.jar"/&amp;gt;
    &amp;lt;include name="lucene-spatial-2.9.2.jar"/&amp;gt;
    &amp;lt;include name="lucene-spellchecker-2.9.2.jar"/&amp;gt;
    &amp;lt;include name="lucene-memory-2.9.2.jar"/&amp;gt;
    &amp;lt;include name="solr-core-1.4.0.jar"/&amp;gt;
    &amp;lt;include name="solr-solrj-1.4.0.jar"/&amp;gt;
  &amp;lt;/fileset&amp;gt;
&lt;/pre&gt;A new test case was added for testing search, see &lt;span class="file-path"&gt;src/test/org/vinwiki/test/SearchTest.java&lt;/span&gt;. While fairly trivial, this class helped me root out a few issues (like updating the spell correction index after an update) while developing the code for this post. So test-driven development shows its worth once again!&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Future Enhancements to the Search Engine&lt;/div&gt;There are still a few search-related features I think this application needs, including tagging, phrase handling, and synonyms. Specifically, users should be able to add tags like "jammy" or "flabby" when rating wines. The application should be able to render a tag cloud from these user-supplied tags as another form of navigation. User tags should also be fed into the search index (using caution of course). Phrase detection complements user-supplied tagging by recognizing multi-term tags during text analysis. How you handle stop words also affects exact phrase matching. The &lt;span class="file-path"&gt;contrib/shingles&lt;/span&gt; module helps speed up phrase searches involving common terms, so I'd definitely like to investigate its applicability for this application. Lastly, synonyms help supply the search index (and/or search queries) with additional terms that mean the same thing as other words in your documents. If time allows, I'll try to add a fifth post dealing with tags, phrases, and synonyms. I'd also like to hear from the community on other features that might be helpful.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;TODO: Evaluating Search Quality with contrib/benchmark&lt;/div&gt;The &lt;span class="file-path"&gt;contrib/benchmark&lt;/span&gt; module is one extension all Lucene developers should be familiar with; benchmark allows you to run repeatable performance tests against your index. I'll refer you to the JavaDoc for using benchmark for performance testing. 
In the future, I'd like to use benchmark's &lt;span class="code"&gt;quality&lt;/span&gt; package to measure relevance of search results. The classes in the quality package allow us to measure &lt;u&gt;precision&lt;/u&gt; and &lt;u&gt;recall&lt;/u&gt; for our search engine. Precision is a measure of "exactness" that tells us the proportion of the top N results that are relevant to the search. Recall is a measure of "completeness" that tells us the proportion of all relevant documents included in the top results. In other words, if there are 100 relevant documents for a query and the results return 80 of them, then recall is 0.8. Sometimes, the two metrics conflict with each other in that as you return more results (to increase recall), you can introduce irrelevant documents, which decreases precision. The &lt;span class="code"&gt;quality&lt;/span&gt; package will give us a way to benchmark precision and recall and then tune Lucene to improve these as much as possible. If I get time, then I'll post my results (and source).&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;A last word about scalability&lt;/div&gt;While Lucene is very fast, you can run into search performance issues if you are constantly updating the index, which requires Hibernate Search to close and re-open its shared IndexReader. Hibernate Search has built-in support for a Master / Slave architecture where updates occur on a single master server and searches are executed on replicated, read-only slave indexes in your cluster. I use this configuration in my day job and it scales well. However, you'll need a JMS queue and Message-driven EJB to allow slave nodes to send updates to the master node, so you'll have to deploy your application as an EAR instead of a WAR (to deploy the MDB). 
Please refer to the online documentation provided with Hibernate Search for a good discussion on how to setup a master / slave cluster.&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;What's Next?&lt;/div&gt;In the next &lt;a href="http://thelabdude.blogspot.com/2010/07/vinwiki-part-3-authentication-with.html"&gt;post&lt;/a&gt;, I'll add authentication and registration using &lt;a href="http://developers.facebook.com/docs/authentication/" target="_blank"&gt;Facebook Connect&lt;/a&gt;.&lt;/p&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6666528949054177156-6540483731847943950?l=thelabdude.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thelabdude.blogspot.com/feeds/6540483731847943950/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://thelabdude.blogspot.com/2010/06/vinwiki-part-2-full-text-search-with.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/6540483731847943950'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/6540483731847943950'/><link rel='alternate' type='text/html' href='http://thelabdude.blogspot.com/2010/06/vinwiki-part-2-full-text-search-with.html' title='VinWiki Part 2: Full-text Search with Hibernate Search and Lucene'/><author><name>thelabdude</name><uri>http://www.blogger.com/profile/08032248788175889303</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6666528949054177156.post-1932020035980285742</id><published>2010-05-24T15:17:00.000-07:00</published><updated>2011-07-26T11:35:50.675-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mahout'/><category scheme='http://www.blogger.com/atom/ns#' term='bookmarkable URL'/><category scheme='http://www.blogger.com/atom/ns#' term='facelets'/><category scheme='http://www.blogger.com/atom/ns#' term='facebook connect'/><category scheme='http://www.blogger.com/atom/ns#' term='lucene'/><category scheme='http://www.blogger.com/atom/ns#' term='richfaces'/><category scheme='http://www.blogger.com/atom/ns#' term='recommendation engine'/><category scheme='http://www.blogger.com/atom/ns#' term='jboss seam'/><category scheme='http://www.blogger.com/atom/ns#' term='server-side paging'/><category scheme='http://www.blogger.com/atom/ns#' term='intelligent web'/><category scheme='http://www.blogger.com/atom/ns#' term='hibernate search'/><title type='text'>VinWiki Part 1: Building an intelligent Web app using Seam, Hibernate, RichFaces, Lucene and Mahout</title><content type='html'>&lt;style&gt;
.todo {
  font:bold 14px verdana;
  color:#FF0000;
  padding: 15px;
}
.annotation {
  font:normal 12px monospace;
  color:#0033CC;
}
.seam-comp {
  font:bold 12px monospace;
  color:#660099;
}
.sectionHdr {
  font-size:15px;
  font-weight:bold;
  color:#CC6633;
  margin-top:20px;
  margin-bottom:15px;
}
.subHdr {
  font-size:13px;
  font-weight:bold;
  color:#CC6633;
  margin-top:15px;
  margin-bottom:10px;
}
.sub2Hdr {
  font-size:12px;
  font-weight:bold;
  font-style:italic;
  color:#CC6633;
  margin-top:10px;
  margin-bottom:5px;
}
.file-path {
  color:#336600;
}
.code {
  font:normal 12px monospace;
  color:#333333;
}
.tip {
  padding: 5px;
  background: #FFFFCC;
  border: 1px solid #F0F0F0;
  width: 500px;
  margin:20px;
}
.java-class {
  font:bold 12px monospace;
  color:#003399;
}
.java-intf {
  font:italic 12px monospace;
  color:#003399;
}
.java-kw {
  font:normal 12px monospace;
  color:#0000FF;
}
.java-id {
  font:normal 12px monospace;
  color:#FF0000;
}
.java-str {
  font:normal 12px monospace;
  color:#FF00CC;
}
.java-cmt {
  font:normal 12px monospace;
  color:#339900;
}
.el {
  font:normal 12px monospace;
  color:#990000;
}
.xml-elm {
  font:normal 12px monospace;
  color:#999933;
}
.hashLink {
  font-size:11px;
  font-style:italic;
}
.screenshot {
  margin: 20px;
}
&lt;/style&gt; &lt;span style="font-family:verdana;font-size:12px;"&gt; &lt;div class="sectionHdr"&gt;Introduction&lt;/div&gt;&lt;p&gt;This is the first post in a four-part series about a wine rating and recommendation Web application, named VinWiki, built using open source technology. The purpose of this series is to document key design and implementation decisions, which may be of interest to anyone wanting to build an intelligent Web application using Java technologies. The end result will not be a 100% functioning Web application, but will have enough functionality to prove the concepts. Specifically, here is a basic roadmap of the concepts covered in each post: &lt;ol&gt;&lt;li&gt;&lt;b&gt;Introduction&lt;/b&gt; &lt;i&gt;(May 24, 2010)&lt;/i&gt;: Covers project setup, primary domain objects, and basic UI constructs such as server-side pagination, dynamic menus, and bookmarkable URLs with JSF / RichFaces / Facelets.&lt;/li&gt;


&lt;li&gt;&lt;b&gt;Full-text Search&lt;/b&gt; &lt;i&gt;(&lt;a href="http://thelabdude.blogspot.com/2010/06/vinwiki-part-2-full-text-search-with.html"&gt;June 14, 2010&lt;/a&gt;)&lt;/i&gt;: Implements full-text search using Hibernate Search with &lt;a href="http://lucene.apache.org/java/docs/index.html" target="_blank"&gt;Lucene&lt;/a&gt; 2.9.2.&lt;/li&gt;


&lt;li&gt;&lt;b&gt;Integrating with the Web&lt;/b&gt; &lt;i&gt;(&lt;a href="http://thelabdude.blogspot.com/2010/07/vinwiki-part-3-authentication-with.html"&gt;July 8, 2010&lt;/a&gt;)&lt;/i&gt;: Authentication and registration using &lt;a href="http://developers.facebook.com/docs/authentication/" target="_blank"&gt;Facebook Connect&lt;/a&gt; and sharing/liking bookmarkable URLs on Facebook.&lt;/li&gt;


&lt;li&gt;&lt;b&gt;Recommendations&lt;/b&gt; &lt;i&gt;(&lt;a href="http://thelabdude.blogspot.com/2010/09/vinwiki-part-4-making-recommendations.html"&gt;Sept 23, 2010&lt;/a&gt;)&lt;/i&gt;: Provide wine recommendations using &lt;a href="http://mahout.apache.org" target="_blank"&gt;Apache Mahout&lt;/a&gt;.&lt;/li&gt;


&lt;/ol&gt;The idea for VinWiki was born out of two interests in my life: wine and collective intelligence. I'm a wine connoisseur and love red wines from Piedmont (Italy) and the Dry Creek and Alexander Valleys in Sonoma. I don't, however, drink expensive wine. Rather, I'm always on the lookout for that $10-15/bottle wine that just tastes good (who isn't, right?). My strategy is to know a lot about a little. For example, I know the major varietals in Piedmont, their characteristics, which years were good and which were not, etc. I know these things because I want to be able to drink great inexpensive wines and help my friends do the same. Wouldn't it be nice if you had access to all this knowledge in my head the next time you go out for Italian food in North Beach?&lt;/p&gt;&lt;p&gt;One of my other interests is figuring out how to harvest intelligence from the interactions and contributions of users on a Web site and then apply that intelligence to improve the personal experience with the application as well as the application as a whole. The science behind this is called Collective Intelligence. To keep things simple, I consider a Web application to be "intelligent" if it has the following key ingredients: &lt;ol&gt;&lt;li&gt;automatically improves its capabilities as more users contribute to it,&lt;/li&gt;
&lt;li&gt;scalable (machine learning algorithms do better with large amounts of data),&lt;/li&gt;
&lt;li&gt;mashable (simple Web services that expose data and services to other applications), and&lt;/li&gt;
&lt;li&gt;doesn't pretend to be more than it is!&lt;/li&gt;
&lt;/ol&gt;There's a wealth of knowledge available about how to implement the first ingredient using machine learning. And, as we'll see in the 4th post in this series, there is a wonderful open source project named Mahout that provides scalable implementations of many popular machine learning algorithms, such as clustering, classification, and recommendations. However, please keep in mind that it takes rigorous testing and experimentation to ensure these algorithms produce good results for &lt;u&gt;your data&lt;/u&gt;. Don't assume that an algorithm will produce good results in your application just because it sounds sophisticated. Hence, the last key ingredient! In this application, intelligence comes in the form of making recommendations for wines from previous ratings using collaborative filtering. Presumably, as more users contribute wines and ratings, the application will improve for all users.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Toolset&lt;/div&gt;The solution is built using the following open-source technologies: &lt;ul&gt;&lt;li&gt;&lt;a href="http://www.jboss.org/jbossas" target="_blank"&gt;JBoss AS 4.2.3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.seamframework.org" target="_blank"&gt;Seam 2.2.0&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.hibernate.org/" target="_blank"&gt;Hibernate&lt;/a&gt; (Core, EntityManager, Annotations, Validator &amp;amp; Search)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://java.sun.com/javaee/javaserverfaces/reference/api/index.html" target="_blank"&gt;JSF 1.2&lt;/a&gt; / &lt;a href="http://www.jboss.org/richfaces" target="_blank"&gt;RichFaces 3.3.2 SR1&lt;/a&gt; / &lt;a href="https://facelets.dev.java.net/" target="_blank"&gt;Facelets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://lucene.apache.org/java/docs/index.html" target="_blank"&gt;Lucene 2.9.2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://mahout.apache.org" target="_blank"&gt;Mahout 0.3&lt;/a&gt; / &lt;a href="http://hadoop.apache.org/" target="_blank"&gt;Hadoop 0.20.2&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Recommended Reading List&lt;/div&gt;Here are a few excellent books that introduce you to the subject of intelligent Web applications: &lt;ul&gt;&lt;li&gt;&lt;a href="http://www.manning.com/marmanis/" target="_blank"&gt;Algorithms of the Intelligent Web&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.manning.com/alag/" target="_blank"&gt;Collective Intelligence in Action&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.manning.com/owen/" target="_blank"&gt;Mahout in Action&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://oreilly.com/catalog/9780596529321" target="_blank"&gt;Programming Collective Intelligence&lt;/a&gt; &lt;i&gt;(code samples in Python, but still a great read and Python is a fun scripting language to know anyway ;-)&lt;/i&gt;&lt;/li&gt;
&lt;/ul&gt;If you are interested in full-text search, the following two books are invaluable assets: &lt;ul&gt;&lt;li&gt;&lt;a href="http://www.manning.com/bernard/" target="_blank"&gt;Hibernate Search in Action&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.manning.com/hatcher3/" target="_blank"&gt;Lucene in Action (Second Edition)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;&lt;p&gt;&lt;a name="reqts"&gt;&lt;/a&gt;&lt;div class="sectionHdr"&gt;Basic Requirements&lt;/div&gt;There are three primary activities you can do with this application: &lt;ol&gt;&lt;li&gt;Rate wines you've already tried&lt;/li&gt;


&lt;li&gt;Find and read ratings for wines you may like to try&lt;/li&gt;


&lt;li&gt;Receive recommendations of new wines you should try based on your previous ratings&lt;/li&gt;
&lt;/ol&gt;These activities equate to the following use cases: &lt;ul&gt;&lt;li&gt;Register New User Account &amp;nbsp;&amp;nbsp;&lt;a href="#register" class="hashLink"&gt;view solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Basic Navigation: Browse Wine by Region, Date, Tag, or Popularity &amp;nbsp;&amp;nbsp;&lt;a href="#nav" class="hashLink"&gt;view solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;View detailed information for a specific wine &amp;nbsp;&amp;nbsp;&lt;a href="#viewWine" class="hashLink"&gt;view solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Add Rating for an Existing Wine &amp;nbsp;&amp;nbsp;&lt;a href="#ratingHome" class="hashLink"&gt;view solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Add New Wine &amp;nbsp;&amp;nbsp;&lt;a href="#wineHome" class="hashLink"&gt;view solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Edit Wine &amp;nbsp;&amp;nbsp;&lt;a href="#editWine" class="hashLink"&gt;view solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Search for Wine &amp;nbsp;&amp;nbsp;&lt;a href="http://thelabdude.blogspot.com/2010/06/vinwiki-part-2-full-text-search-with.html" class="hashLink"&gt;view solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Share wines and ratings with Facebook friends &amp;nbsp;&amp;nbsp;&lt;a href="http://thelabdude.blogspot.com/2010/07/vinwiki-part-3-authentication-with.html" class="hashLink"&gt;view solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;View Recommendations &amp;nbsp;&amp;nbsp;&lt;a href="http://thelabdude.blogspot.com/2010/09/vinwiki-part-4-making-recommendations.html" class="hashLink"&gt;view solution&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;Here is a screenshot of the home page (note: this is the best I could do with the image quality since Blogger limits the width to 800).&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/home.jpg" alt="Home page"/&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Domain Objects&lt;/div&gt;The following UML Class Diagram depicts the primary domain objects in our model.&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/model.png" alt="UML Class Diagram for the org.vinwiki.model package"/&gt;&lt;br /&gt;
&lt;table style="font-family:verdana;font-size:12px;" cellpadding="5"&gt;&lt;tr bgcolor="#f0f0f0"&gt;&lt;td&gt;&lt;b&gt;JPA Entity&lt;/b&gt;&lt;/td&gt;&lt;td&gt;&lt;b&gt;Description&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#ffffff"&gt;&lt;td&gt;Region&lt;/td&gt;&lt;td&gt;Encapsulates information about a geographical area that produces wine, such as Bordeaux. Implements the &lt;span class="java-intf"&gt;org.vinwiki.model.Composite&lt;/span&gt; interface to represent a hierarchical tree of regions and sub-regions.&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#f0f0f0"&gt;&lt;td&gt;Varietal&lt;/td&gt;&lt;td&gt;Encapsulates information about a grape variety used to make wine, such as Chardonnay.&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#ffffff"&gt;&lt;td&gt;VarietalPct&lt;/td&gt;&lt;td&gt;Represents the percentage of a varietal in a specific wine.&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#f0f0f0"&gt;&lt;td&gt;Winery&lt;/td&gt;&lt;td&gt;Encapsulates information about a wine producer, such as Robert Mondavi. A Winery may have a latitude and longitude specified, which would allow us to show the winery on a map of wineries in a region.&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#ffffff"&gt;&lt;td&gt;Wine&lt;/td&gt;&lt;td&gt;Encapsulates information about a specific wine, uniquely identified by name, winery, and vintage (typically the year the grapes were harvested).&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#f0f0f0"&gt;&lt;td&gt;User&lt;/td&gt;&lt;td&gt;Encapsulates information about a registered user in the application.&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#ffffff"&gt;&lt;td&gt;Preferences&lt;/td&gt;&lt;td&gt;Encapsulates user-supplied settings used to personalize the application.&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#f0f0f0"&gt;&lt;td&gt;Tag&lt;/td&gt;&lt;td&gt;Holds information for a keyword created by a user for rating a wine.&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#ffffff"&gt;&lt;td&gt;UserTag&lt;/td&gt;&lt;td&gt;Associates a tag with a user's rating.&lt;/td&gt;&lt;/tr&gt;
&lt;tr bgcolor="#f0f0f0"&gt;&lt;td&gt;Rating&lt;/td&gt;&lt;td&gt;Represents a specific user's rating (score and comments) of a specific wine.&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;&lt;div class="subHdr"&gt;Relational Database Model&lt;/div&gt;Here is the Hibernate model translated into a MySQL database model:&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/eer.png" alt="Relational Database Model"/&gt;&lt;br /&gt;
&lt;div class="subHdr"&gt;Seed Data&lt;/div&gt;Since an application like this isn't very interesting without some real data, I extracted some wine-related entities from &lt;a href="http://www.freebase.com" target="_blank"&gt;Freebase&lt;/a&gt; using their RESTful Web Services API, along with some manual tweaking to better fit the desired data model. Specifically, the seed data contains: &lt;ul&gt;&lt;li&gt;104 wine regions / sub-regions&lt;/li&gt;
&lt;li&gt;487 grape varietals&lt;/li&gt;
&lt;li&gt;772 wineries&lt;/li&gt;
&lt;li&gt;2,025 wines&lt;/li&gt;
&lt;/ul&gt;So this should get us started with some real data, but as we'll see below, users will also be able to add new wines as needed.&lt;/p&gt;&lt;p&gt;&lt;div class="subHdr"&gt;Persistence Annotations&lt;/div&gt;Most of the JPA annotations in the domain model are straightforward, so I'll refer you to the source code for further study. However, I'm using a few annotations that warrant further discussion, including:&lt;br /&gt;
&lt;br /&gt;
&lt;div class="sub2Hdr"&gt;Hierarchical Regions (composite pattern)&lt;/div&gt;A wine Region can have sub-regions; for example, France contains Bordeaux and Alsace. JPA makes modeling composites really easy using &lt;span class="annotation"&gt;@ManyToOne&lt;/span&gt; with &lt;span class="annotation"&gt;@JoinColumn&lt;/span&gt;: &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="annotation"&gt;@ManyToOne(fetch = FetchType.LAZY)&lt;/span&gt;
    &lt;span class="annotation"&gt;@JoinColumn(name = "parent_region", nullable = true)&lt;/span&gt;
    public Region getParentRegion() {
        return this.parentRegion;
    }
&lt;/pre&gt;&lt;/div&gt;Yep! It really is that easy ... &lt;div class="sub2Hdr"&gt;@javax.persistence.MappedSuperclass&lt;/div&gt;Entities typically share a common set of fields, such as ID, name, and description, so it is common to encapsulate these fields in a base class. The &lt;a href="http://java.sun.com/javaee/5/docs/api/javax/persistence/MappedSuperclass.html" target="_blank"&gt;javax.persistence.MappedSuperclass&lt;/a&gt; annotation designates a class whose mapping information is applied to the entities that inherit from it. As you can see from the class diagram, the &lt;span class="java-class"&gt;org.vinwiki.model.AbstractItemBase&lt;/span&gt; class serves as the &lt;span class="annotation"&gt;@MappedSuperclass&lt;/span&gt; in our model.&lt;/p&gt;&lt;p&gt;&lt;div class="sub2Hdr"&gt;@org.hibernate.annotations.Cache&lt;/div&gt;If you think about it, most of the wine-related entities in this model are primarily read-only in nature. The main user activity in this application is to add ratings to existing wine objects. Thus, the Wine, Winery, Varietal, and Region entities are good candidates for caching in what is called the &lt;b&gt;second-level cache&lt;/b&gt;. We use the &lt;span class="annotation"&gt;@org.hibernate.annotations.Cache&lt;/span&gt; annotation to define the cache concurrency strategy for each entity we want to cache. I'm using &lt;span class="code"&gt;org.hibernate.annotations.CacheConcurrencyStrategy.NONSTRICT_READ_WRITE&lt;/span&gt; because the application may need to update these entities occasionally, but in most cases they are used as read-only objects. For more information about cache concurrency strategies, please refer to &lt;a href="http://docs.jboss.org/hibernate/core/3.3/reference/en/html/performance.html#performance-cache" target="_blank"&gt;The Second Level Cache&lt;/a&gt; section in the Hibernate Core documentation.&lt;br /&gt;
&lt;br /&gt;
Of course, we also need to enable the second-level cache in &lt;span class="file-path"&gt;persistence.xml&lt;/span&gt;. I found it easiest to use Ehcache initially but you should research the other providers, e.g. JBoss Cache, to determine the most appropriate solution for your application. I'll revisit this decision when I tackle clustering this application in the cloud. For now, here are the salient properties specified in &lt;span class="file-path"&gt;persistence.xml&lt;/span&gt;: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;property name="hibernate.cache.provider_class" value="net.sf.ehcache.hibernate.EhCacheProvider"/&amp;gt;
  &amp;lt;property name="hibernate.cache.use_query_cache" value="true"/&amp;gt;
  &amp;lt;property name="hibernate.cache.use_second_level_cache" value="true"/&amp;gt;
  &amp;lt;property name="hibernate.cache.region_prefix" value=""/&amp;gt;
  &amp;lt;property name="hibernate.generate_statistics" value="true"/&amp;gt;
  &amp;lt;property name="hibernate.session_factory_name" value="SessionFactories/vinwikiSF"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;Notice that I'm enabling Hibernate Statistics using the &lt;span class="code"&gt;hibernate.generate_statistics&lt;/span&gt; property. During startup, I deploy the Hibernate StatisticsService MBean using: &lt;div class="code"&gt;&lt;pre&gt;hibernateMBeanName = &lt;span class="java-kw"&gt;new&lt;/span&gt; &lt;span class="java-id"&gt;ObjectName&lt;/span&gt;(&lt;span class="java-str"&gt;"Hibernate:type=statistics,application=vinwiki"&lt;/span&gt;);
    &lt;span class="java-id"&gt;StatisticsService&lt;/span&gt; mBean = &lt;span class="java-kw"&gt;new&lt;/span&gt; &lt;span class="java-id"&gt;StatisticsService&lt;/span&gt;();
    mBean.setSessionFactoryJNDIName(&lt;span class="java-str"&gt;"SessionFactories/vinwikiSF"&lt;/span&gt;);
    &lt;span class="java-id"&gt;ManagementFactory&lt;/span&gt;.getPlatformMBeanServer().registerMBean(mBean, hibernateMBeanName);
&lt;/pre&gt;&lt;/div&gt;I also use Ehcache as the Seam cache provider, in &lt;span class="file-path"&gt;resources/WEB-INF/components.xml&lt;/span&gt;: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;cache:eh-cache-provider/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;This will come in handy when we start displaying tag clouds and other expensive data structures to our users. &lt;div class="tip"&gt;Be sure to include &lt;b&gt;ehcache.jar&lt;/b&gt; in your deployed-jars.list file!&lt;/div&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Solutions&lt;/div&gt;In this section, I walk you through the solutions to each use case described &lt;a href="#reqts"&gt;above&lt;/a&gt;. &lt;br /&gt;
&lt;br /&gt;
&lt;a name="register"&gt;&lt;/a&gt;&lt;div class="subHdr"&gt;Register New User Account&lt;/div&gt;For this requirement, I leveraged the solution from my previous &lt;a href="http://thelabdude.blogspot.com/2009/05/user-registration-solution-using-jboss.html" target="_blank"&gt;blog post&lt;/a&gt;. We'll see in post #3 in this series how to allow Facebook users to automatically register and authenticate using their Facebook account. However, I did make a few minor improvements to the handling of user preferences so be sure to review the code in the &lt;span class="code"&gt;org.vinwiki.user&lt;/span&gt; package after reading the aforementioned blog post. &lt;br /&gt;
&lt;br /&gt;
&lt;a name="nav"&gt;&lt;/a&gt;&lt;div class="subHdr"&gt;Basic Navigation&lt;/div&gt;There are a number of ways a user can browse for wines in the application, including by region, most recent, best rated, as well as search. Moreover, a user can switch between these mechanisms at any time. The &lt;span class="seam-comp"&gt;nav&lt;/span&gt; component (&lt;span class="java-class"&gt;org.vinwiki.action.Nav&lt;/span&gt;) deployed in &lt;b&gt;session scope&lt;/b&gt; supports user navigation.&lt;br /&gt;
&lt;br /&gt;
The &lt;span class="seam-comp"&gt;nav&lt;/span&gt; component can be in one of two states during its lifecycle: A) guest mode when &lt;span class="el"&gt;#{identity.loggedIn}&lt;/span&gt; is &lt;span class="java-kw"&gt;false&lt;/span&gt; or B) authenticated user mode when &lt;span class="el"&gt;#{identity.loggedIn}&lt;/span&gt; is &lt;span class="java-kw"&gt;true&lt;/span&gt;. The same component instance has to handle both states because Seam does not create a new session component after a user is authenticated.&lt;br /&gt;
&lt;br /&gt;
When a new session is created, we need to generate a default view of the wine data; a list of the &lt;b&gt;Most Recent&lt;/b&gt; Wines added to the application seems like a good choice. I use Seam's &lt;span class="annotation"&gt;@Factory&lt;/span&gt; annotation to provide the default view (the Seam documentation calls this pull-style MVC): &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="annotation"&gt;@Factory("mainMenu")&lt;/span&gt;
    public void initMainMenu() {
        if (mainMenu == null) {
            mainMenu = buildMainMenu();
        }
        &lt;span class="java-cmt"&gt;// Ensure the current PagedDataFetcher is in-sync with the current request&lt;/span&gt;
        &lt;b&gt;syncDataFetcherWithRequest();&lt;/b&gt;
    }
&lt;/pre&gt;&lt;/div&gt;To handle the switch from state A to B, &lt;span class="seam-comp"&gt;nav&lt;/span&gt; observes the &lt;span class="code"&gt;Identity.EVENT_POST_AUTHENTICATE&lt;/span&gt; event: &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="annotation"&gt;@Observer(Identity.EVENT_POST_AUTHENTICATE)&lt;/span&gt;
    public void postAuthenticate(Identity identity) {
        cleanupDataProvider();
        &lt;span class="java-cmt"&gt;// after login, display the user menu instead of the guest menu&lt;/span&gt;
        mainMenu = getMainMenu();
        syncDataFetcherWithRequest();
    }
&lt;/pre&gt;&lt;/div&gt;&lt;div class="sub2Hdr"&gt;Dynamic RichFaces PanelMenu&lt;/div&gt;One method of navigating is to browse wines by region. Recall that Region implements the &lt;span class="java-class"&gt;org.vinwiki.model.Composite&lt;/span&gt; interface so we can compose a hierarchical tree structure of regions. I chose the RichFaces PanelMenu for this example, but you could also adapt the code to build another type of menu. Menu items are dynamic so we need to programmatically build a &lt;a href="http://docs.jboss.org/richfaces/latest_3_3_X/en/apidoc/index.html?org/richfaces/component/html/HtmlPanelMenu.html" target="_blank"&gt;org.richfaces.component.html.HtmlPanelMenu&lt;/a&gt; using the RichFaces API. Because the menu is a form of navigation, the &lt;span class="seam-comp"&gt;nav&lt;/span&gt; component builds the menu. On the home page, we have a &lt;span class="xml-elm"&gt;&amp;lt;rich:panelMenu&amp;gt;&lt;/span&gt; tag with the binding set to "mainMenu".&lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;&amp;lt;a4j:form prependId="false"&amp;gt;
    &amp;lt;rich:panelMenu id="mainMenu" &lt;b&gt;binding="#{mainMenu}"&lt;/b&gt;/&amp;gt;
  &amp;lt;/a4j:form&amp;gt;
&lt;/pre&gt;&lt;/div&gt;Seam invokes the &lt;span class="code"&gt;initMainMenu&lt;/span&gt; method of our &lt;span class="seam-comp"&gt;nav&lt;/span&gt; component to resolve the binding. All guests share the same menu, but each user can have their own menu. You could imagine allowing users to hide specific regions they are not interested in, which will impact the rendering of the main menu. One caveat is that you &lt;u&gt;should&lt;/u&gt; manually set the ID for all the UI components your binding creates or you will most likely encounter duplicate ID exceptions, especially after hot deployments.&lt;br /&gt;
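&lt;br /&gt;
To illustrate the ID caveat, here is a minimal sketch in plain Java of walking a region tree and deriving an explicit, repeatable ID for every node. The &lt;span class="java-class"&gt;Node&lt;/span&gt; class and the &lt;span class="code"&gt;assignIds&lt;/span&gt; method are hypothetical stand-ins for illustration only; they are not the &lt;span class="java-intf"&gt;org.vinwiki.model.Composite&lt;/span&gt; interface or the RichFaces API. In the real menu builder, each derived ID would presumably be passed to &lt;span class="code"&gt;setId&lt;/span&gt; on the menu group or item created for that region: &lt;div class="code"&gt;&lt;pre&gt;import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MenuIdSketch {

    /** Hypothetical stand-in for a Region-like composite node. */
    static class Node {
        final String name;
        final List&amp;lt;Node&amp;gt; children = new ArrayList&amp;lt;Node&amp;gt;();

        Node(String name) {
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Walks the tree depth-first and maps an explicit component ID to each node name. */
    static Map&amp;lt;String, String&amp;gt; assignIds(Node root) {
        Map&amp;lt;String, String&amp;gt; ids = new LinkedHashMap&amp;lt;String, String&amp;gt;();
        walk(root, "menu", ids);
        return ids;
    }

    private static void walk(Node node, String parentId, Map&amp;lt;String, String&amp;gt; ids) {
        // Keep only letters and digits so the result is safe to use as a component ID.
        String id = parentId + "_" + node.name.replaceAll("[^A-Za-z0-9]", "");
        ids.put(id, node.name);
        for (Node child : node.children) {
            walk(child, id, ids);
        }
    }

    public static void main(String[] args) {
        Node france = new Node("France").add(new Node("Bordeaux")).add(new Node("Alsace"));
        for (Map.Entry&amp;lt;String, String&amp;gt; e : assignIds(france).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
&lt;/pre&gt;&lt;/div&gt;Because each ID is derived from the node's position in the tree, rebuilding the menu yields the same IDs every time. Two sibling names that differ only in stripped characters would collide, so a database ID is probably the safer key in practice.&lt;br /&gt;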
&lt;br /&gt;
&lt;div class="sub2Hdr"&gt;Bookmarkable Navigation URLs&lt;/div&gt;It would be nice if the application allowed users to bookmark a specific filter like "Most Recent" or region like "California". For example, in the screen shot below, notice that the URL in the browser address bar reflects the fact that the user clicked on California in the main menu. The user can bookmark this page in their browser to instantly view the list of Californian wines available in VinWiki.&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/california.jpg" alt="Bookmarkable URL"/&gt;&lt;br /&gt;
Seam makes this possible using URL re-writing and page parameters. My solution is largely based on the &lt;a href="http://docs.jboss.org/seam/2.2.1.CR1/reference/en-US/html_single/#blog" target="_blank"&gt;Blog example&lt;/a&gt; provided in the Seam documentation. When the menu was constructed, I set the action for the California region menu item to be &lt;span class="el"&gt;/region.xhtml?r=California&lt;/span&gt;. In pages.xml, I link the "r" request parameter to the region property of an event-scoped component &lt;span class="seam-comp"&gt;bookmarkable&lt;/span&gt;, see &lt;span class="code"&gt;org.vinwiki.action.BookmarkableRequestInfo&lt;/span&gt;. In addition, I use a rewrite rule to make the URL more intuitive (&lt;b&gt;/region/California&lt;/b&gt; instead of &lt;b&gt;/region.seam?r=California&lt;/b&gt;): &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;page view-id="/region.xhtml"&amp;gt;
    &amp;lt;rewrite pattern="/region/{r}"/&amp;gt;
    &amp;lt;param name="r" value="#{bookmarkable.region}"/&amp;gt;
  &amp;lt;/page&amp;gt;
&lt;/pre&gt;&lt;/div&gt;This is where the aforementioned &lt;span class="code"&gt;syncDataFetcherWithRequest&lt;/span&gt; method comes into play; this method uses the injected &lt;span class="seam-comp"&gt;bookmarkable&lt;/span&gt; component to determine which data to show to the user, which in this case is the list of Californian wines.&lt;br /&gt;
&lt;br /&gt;
One drawback to allowing navigation requests to be bookmarked in this manner is that I had to duplicate the contents of &lt;span class="file-path"&gt;view/home.xhtml&lt;/span&gt; into filter.xhtml because it seems that adding multiple rewrite patterns on a single page leads to some weird URLs. I'm sure this could be overcome with some work, but since I'm using Facelets, the amount of duplication is minimal (and of course the UI is not final so the filter page may end up being different anyway). In the next posting, I'll show you how to make search results bookmarkable as well.&lt;br /&gt;
&lt;br /&gt;
&lt;div class="sub2Hdr"&gt;Server-side Pagination&lt;/div&gt;Server-side pagination is essential for gracefully handling a large number of objects. The basic idea is to only display a small subset of the total results to the user at one time. The key, of course, is to extend the subset concept to the server and only load a small subset of data from the database at a time. Doing only client-side pagination will become a major performance issue as the number of objects in your database increases. Thankfully, RichFaces does most of the work for us! Here is a screenshot of VinWiki's scrollable data table backed by server-side pagination:&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/ss_page_ui.png" alt="scrollable data table screenshot"/&gt;&lt;br /&gt;
And, the corresponding JSF syntax (from &lt;span class="file-path"&gt;view/WEB-INF/facelets/itemTable.xhtml&lt;/span&gt;):&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/ss_page_jsf.png" alt="scrollable data table JSF syntax"/&gt;&lt;br /&gt;
&lt;ol&gt;&lt;li&gt;My solution is based largely on the sample code provided with RichFaces (see org.richfaces.demo.extendeddatamodel.AuctionDataModel). Essentially, my &lt;span class="xml-elm"&gt;&amp;lt;rich:dataTable&amp;gt;&lt;/span&gt; interacts with an event-scoped component &lt;span class="seam-comp"&gt;navDataModel&lt;/span&gt; that extends &lt;span class="java-class"&gt;org.ajax4jsf.model.SerializableDataModel&lt;/span&gt;. Under the covers, the DataModel (see &lt;span class="java-class"&gt;org.vinwiki.action.PagedDataModelBase&lt;/span&gt;) component does everything needed to support AJAX-driven pagination except the actual loading of data. To load data, the DataModel delegates to a &lt;span class="java-intf"&gt;org.vinwiki.action.PagedDataProvider&lt;/span&gt;, whose implementation is a session-scoped component that loads and caches items from the database.&lt;/li&gt;
&lt;li&gt;Under the covers, the number of rows displayed to the user per page comes from the user's preferences.&lt;/li&gt;
&lt;li&gt;The RichFaces &lt;span class="xml-elm"&gt;&amp;lt;rich:datascroller&amp;gt;&lt;/span&gt; provides the paging mechanism for our paged data table.&lt;/li&gt;
&lt;/ol&gt;Here is a UML class diagram showing the relationship between the DataModel and DataProvider (most of these classes can be used in your project as they have nothing to do with Wine):&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/pagination.png" alt="UML Class Diagram of server-side pagination"/&gt;&lt;br /&gt;
As you might expect, the concrete implementation of &lt;span class="java-intf"&gt;org.vinwiki.action.PagedDataProvider&lt;/span&gt; is our handy &lt;span class="seam-comp"&gt;nav&lt;/span&gt; component which extends &lt;span class="java-class"&gt;org.vinwiki.action.PagedDataProviderBase&lt;/span&gt;. PagedDataProviderBase does most of the work except the actual &lt;u&gt;fetching&lt;/u&gt; of a page of data from the database, which is provided by a &lt;span class="java-intf"&gt;org.vinwiki.action.PagedDataFetcher&lt;/span&gt;. There is a concrete implementation of &lt;span class="java-intf"&gt;org.vinwiki.action.PagedDataFetcher&lt;/span&gt; for each type of navigation. For example, the &lt;span class="java-class"&gt;org.vinwiki.action.FetchRegion&lt;/span&gt; class provides Wine objects for a specific region to the data provider. Notice that my current PagedDataFetcher implementations are &lt;u&gt;not&lt;/u&gt; Seam components and expect the EntityManager and User ID to be passed to them. I chose this approach to make it really easy to build new types of navigation queries; all you have to do is provide a query to return a page of data and a query to return the total number of items available to the current user. &lt;br /&gt;
&lt;br /&gt;
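To make the "two queries" contract concrete, here is a minimal sketch in plain Java. The names below (&lt;span class="java-intf"&gt;PageFetcher&lt;/span&gt;, &lt;span class="code"&gt;fetchPage&lt;/span&gt;, &lt;span class="code"&gt;countItems&lt;/span&gt;) are hypothetical and are not the actual &lt;span class="java-intf"&gt;org.vinwiki.action.PagedDataFetcher&lt;/span&gt; signatures, and an in-memory list stands in for the EntityManager queries so the example runs on its own: &lt;div class="code"&gt;&lt;pre&gt;import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FetcherSketch {

    /** Hypothetical contract: one call for a page of data, one for the total count. */
    interface PageFetcher&amp;lt;T&amp;gt; {
        List&amp;lt;T&amp;gt; fetchPage(int firstRow, int maxRows);
        int countItems();
    }

    /** In-memory stand-in for a fetcher that would normally run two JPA queries. */
    static class ListFetcher&amp;lt;T&amp;gt; implements PageFetcher&amp;lt;T&amp;gt; {
        private final List&amp;lt;T&amp;gt; items;

        ListFetcher(List&amp;lt;T&amp;gt; items) {
            this.items = items;
        }

        public List&amp;lt;T&amp;gt; fetchPage(int firstRow, int maxRows) {
            // Clamp the requested window to the rows that actually exist.
            int from = Math.max(0, Math.min(firstRow, items.size()));
            int to = (int) Math.min((long) items.size(), (long) from + Math.max(0, maxRows));
            return new ArrayList&amp;lt;T&amp;gt;(items.subList(from, to));
        }

        public int countItems() {
            return items.size();
        }
    }

    public static void main(String[] args) {
        PageFetcher&amp;lt;String&amp;gt; fetcher = new ListFetcher&amp;lt;String&amp;gt;(
                Arrays.asList("Merlot", "Syrah", "Malbec", "Zinfandel", "Riesling"));
        System.out.println(fetcher.fetchPage(2, 2)); // third and fourth items
        System.out.println(fetcher.countItems());    // total across all pages
    }
}
&lt;/pre&gt;&lt;/div&gt;In a JPA-backed fetcher, the page window would typically be applied with &lt;span class="code"&gt;Query.setFirstResult&lt;/span&gt; and &lt;span class="code"&gt;Query.setMaxResults&lt;/span&gt;, and the total would come from a separate count query.&lt;br /&gt;
&lt;br /&gt;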
The PagedDataProviderBase maps each starting index to a List of Item objects. A Map is a good solution if you are using the &lt;span class="xml-elm"&gt;&amp;lt;rich:datascroller&amp;gt;&lt;/span&gt; because the user can jump from page 1 to 10 without going through pages 2-9 first. If you're using a sequential paging mechanism then a simple List should suffice. Of course the Map could grow very big if the user visits many pages in the scroller. I'll leave it as an exercise for the reader to solve using SoftReferences. &lt;br /&gt;
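&lt;br /&gt;
As a starting point for that exercise, here is a minimal sketch of a page cache whose values are held through &lt;span class="java-class"&gt;java.lang.ref.SoftReference&lt;/span&gt;, so the garbage collector may reclaim cached pages when memory runs low. The &lt;span class="java-class"&gt;SoftPageCache&lt;/span&gt; class is hypothetical and not part of the VinWiki source; it is also not thread-safe: &lt;div class="code"&gt;&lt;pre&gt;import java.lang.ref.SoftReference;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical page cache keyed by the starting row index of each page. */
public class SoftPageCache&amp;lt;T&amp;gt; {

    private final Map&amp;lt;Integer, SoftReference&amp;lt;List&amp;lt;T&amp;gt;&amp;gt;&amp;gt; pages =
            new HashMap&amp;lt;Integer, SoftReference&amp;lt;List&amp;lt;T&amp;gt;&amp;gt;&amp;gt;();

    public void put(int firstRow, List&amp;lt;T&amp;gt; page) {
        pages.put(firstRow, new SoftReference&amp;lt;List&amp;lt;T&amp;gt;&amp;gt;(page));
    }

    /** Returns the cached page, or null if it was never cached or has been reclaimed. */
    public List&amp;lt;T&amp;gt; get(int firstRow) {
        SoftReference&amp;lt;List&amp;lt;T&amp;gt;&amp;gt; ref = pages.get(firstRow);
        if (ref == null) {
            return null;
        }
        List&amp;lt;T&amp;gt; page = ref.get();
        if (page == null) {
            // The garbage collector cleared the referent; drop the stale entry.
            pages.remove(firstRow);
        }
        return page;
    }

    public static void main(String[] args) {
        SoftPageCache&amp;lt;String&amp;gt; cache = new SoftPageCache&amp;lt;String&amp;gt;();
        List&amp;lt;String&amp;gt; firstPage = Arrays.asList("Merlot", "Syrah");
        cache.put(0, firstPage);
        System.out.println(cache.get(0));  // still cached while firstPage is strongly held
        System.out.println(cache.get(20)); // never cached, so the caller must fetch it
    }
}
&lt;/pre&gt;&lt;/div&gt;A &lt;span class="java-kw"&gt;null&lt;/span&gt; result simply means the provider should ask its fetcher for that page again and re-cache it.&lt;br /&gt;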
&lt;br /&gt;
&lt;a name="viewWine"&gt;&lt;/a&gt;&lt;div class="subHdr"&gt;View Wine Details&lt;/div&gt;Assuming the user finds a wine of interest while browsing the application, they may want to view more information for the wine as well as scroll through other users' ratings. &lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/view_wine.png" alt="wine details screenshot"/&gt;&lt;br /&gt;
&lt;i&gt;You didn't know the &lt;a href="http://en.wikipedia.org/wiki/Rat_pack" target="_blank"&gt;Rat Pack&lt;/a&gt; liked wine and wrote Latin did you ;-)&lt;/i&gt;&lt;br /&gt;
&lt;br /&gt;
There are a number of possible activities the wine details view can offer the user, including: &lt;ul&gt;&lt;li&gt;Add / edit rating&lt;/li&gt;
&lt;li&gt;View a list of similar wines (MoreLikeThis)&lt;/li&gt;
&lt;li&gt;Contextual display Ads&lt;/li&gt;
&lt;li&gt;Navigation controls to view the next wine in the results&lt;/li&gt;
&lt;li&gt;Edit information about the wine itself&lt;/li&gt;
&lt;/ul&gt;Consequently, it makes sense to implement a new page for viewing wine details ( &lt;span class="file-path"&gt;view/wine.xhtml&lt;/span&gt; ). From the home page, we take the user to the wine details page using a Seam &lt;span class="xml-elm"&gt;&amp;lt;s:link&amp;gt;&lt;/span&gt; tag, which allows us to pass the Wine ID as a request parameter:&lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;&amp;lt;s:link view="/wine.xhtml" value="#{item.fullName}" style="font-size:14px;"&amp;gt;
    &amp;lt;f:param name="wid" value="#{item.id}"/&amp;gt;
  &amp;lt;/s:link&amp;gt;
&lt;/pre&gt;&lt;/div&gt;In pages.xml, I map a request parameter to the wineId property of the &lt;span class="seam-comp"&gt;viewWine&lt;/span&gt; component: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;page view-id="/wine.xhtml"&amp;gt;
    &amp;lt;rewrite pattern="/wine/{wid}"/&amp;gt;
    &amp;lt;param name="wid" value="#{viewWine.wineId}"/&amp;gt;
  &amp;lt;/page&amp;gt;
&lt;/pre&gt;&lt;/div&gt;Notice that I'm also re-writing the URL. Linking the "wid" parameter to &lt;span class="el"&gt;#{viewWine.wineId}&lt;/span&gt; begins a new conversation for viewing the wine: &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="annotation"&gt;@Begin(flushMode = FlushModeType.MANUAL, join = true)&lt;/span&gt;
    public void setWineId(Long wineId) {
       ...
    }
&lt;/pre&gt;&lt;/div&gt;&lt;div class="tip"&gt;There are a number of ways to begin a conversation so be sure to research the best approach for your application in the Seam documentation.&lt;/div&gt;The wine details page can be bookmarked as well. In this case, however, I'm using the push-style MVC approach discussed in the Seam documentation. I had to get clever with the &lt;b&gt;Home&lt;/b&gt; link on the details page because if the user comes through a bookmark, history.go(-1) won't work for returning the user to the VinWiki home page. Thus, I rely on an AJAX call to end the conversation and then invoke the &lt;span class="code"&gt;backToHome&lt;/span&gt; JavaScript function: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;a4j:commandLink value="#{messages.backToHome}" action="#{viewWine.endViewWine()}" oncomplete="backToHome()" style="font-size:14px;"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;From &lt;span class="file-path"&gt;view/layout/template.xhtml&lt;/span&gt;: &lt;div class="code"&gt;&lt;pre&gt;function backToHome() {
    var currentHost = document.location.protocol+"//"+document.location.host;
    if (document.referrer &amp;&amp; document.referrer.startsWith(currentHost)) {
        history.back();
    } else {
        window.location.replace("/vinwiki/");
    }
}
&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;a name="ratingHome"&gt;&lt;/a&gt;&lt;div class="subHdr"&gt;Add Rating for Existing Wine&lt;/div&gt;There are two ways a user can add a rating: &lt;ol&gt;&lt;li&gt;look up a wine on-the-fly from the home page, or &lt;/li&gt;


&lt;li&gt;view details for a specific wine and then add the rating from the wine details view.&lt;/li&gt;
&lt;/ol&gt;&lt;div class="sub2Hdr"&gt;Rating with on-the-fly Lookup&lt;/div&gt;For authenticated users, the header menu on the home page includes a link to "Add Rating". Clicking on the &lt;b&gt;Add Rating&lt;/b&gt; link triggers the &lt;span class="el"&gt;#{ratingHome.beginRatingWine()}&lt;/span&gt; action using AJAX. When the action completes, the &lt;b&gt;addRatingPanel&lt;/b&gt; is shown to the user.&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/add_rating_lookup.jpg" alt="Add rating from Home Page, requires lookup of wine on-the-fly."/&gt;&lt;br /&gt;
From &lt;span class="file-path"&gt;view/WEB-INF/facelets/headerControls.xhtml&lt;/span&gt;: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;a4j:commandLink value="Add Rating" action="#{ratingHome.beginRatingWine()}" reRender="addRatingPanel"
            oncomplete="#{rich:component('addRatingPanel')}.show()" styleClass="hdrLink"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;The conversational &lt;span class="seam-comp"&gt;ratingHome&lt;/span&gt; component (&lt;span class="java-class"&gt;org.vinwiki.action.RatingHome&lt;/span&gt; based on Seam's &lt;a href="http://docs.jboss.org/seam/2.2.0.GA/en-US/html_single/#framework" target="_blank"&gt;Application Framework&lt;/a&gt;) manages adding and editing Rating objects by users.&lt;br /&gt;
&lt;br /&gt;
So now let's see how on-the-fly lookup works ... from &lt;span class="file-path"&gt;view/WEB-INF/facelets/addRating.xhtml&lt;/span&gt;&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/lookup.png" alt="JSF syntax for auto-complete on wine field."/&gt; &lt;ol&gt;&lt;li&gt;Decorate the form field as a required field using Seam's &lt;span class="xml-elm"&gt;&amp;lt;s:decorate&amp;gt;&lt;/span&gt; tag and a Facelets template. If a validation error occurs, Seam will decorate the field with error information.&lt;/li&gt;


&lt;li&gt;When the "onblur" event fires on the input field, send an AJAX request to the server to determine if the user selected a known wine.&lt;/li&gt;


&lt;li&gt;Attach a RichFaces suggestion box &lt;span class="xml-elm"&gt;&amp;lt;rich:suggestionbox&amp;gt;&lt;/span&gt; to the input field to auto-complete the wine name as the user types. On the server, the RichFaces suggestion box invokes the &lt;span class="el"&gt;#{ratingHome.autoCompleteWine}&lt;/span&gt; action. I'll discuss the RichFaces request queue in more detail below, but for now, you should realize that it is dangerous to have an auto-complete component without some sort of request flood control in place.&lt;/li&gt;
&lt;/ol&gt;So assuming the user has successfully filled-in the rating form, let's see what happens when the form is submitted:&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/add_rating_submit.png" alt="JSF syntax to submit the add rating form."/&gt;&lt;br /&gt;
There's a fair amount of complex machinery going on in this one tag! Let's analyze it step-by-step: &lt;ol&gt;&lt;li&gt;A RichFaces &lt;span class="xml-elm"&gt;&amp;lt;a4j:commandButton&amp;gt;&lt;/span&gt; submits the form using AJAX to invoke the &lt;span class="el"&gt;#{ratingHome.saveRating}&lt;/span&gt; action within the same conversation started when the user clicked on Add Rating.&lt;/li&gt;


&lt;li&gt;The &lt;i&gt;reRender&lt;/i&gt; attribute tells RichFaces to re-render the &lt;b&gt;addRatingPanel&lt;/b&gt; and &lt;b&gt;itemTable&lt;/b&gt; components in the component tree and update them in the browser DOM after the AJAX response is completed. This ensures that the panel will display any errors that occur on the server.&lt;/li&gt;


&lt;li&gt;If there are no errors, close the modal panel. Notice that I'm calling the &lt;span class="code"&gt;hasJsfErrMsg&lt;/span&gt; JavaScript function, which returns true if there are any error messages queued in the FacesContext. The &lt;span class="code"&gt;hasJsfErrMsg&lt;/span&gt; function is defined in my main Facelets layout template (&lt;span class="file-path"&gt;view/layout/template.xhtml&lt;/span&gt;) and relies on an &lt;span class="xml-elm"&gt;&amp;lt;a4j:outputPanel&amp;gt;&lt;/span&gt; to update the value of a hidden field with ID 'jsfMsgMaxSev' on every AJAX request. The value for this field is pulled from the FacesContext.getMaximumSeverity() method.&lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;&amp;lt;a4j:outputPanel ajaxRendered="true"&amp;gt;
    &amp;lt;a4j:form prependId="false"&amp;gt;
      &amp;lt;h:inputHidden id="jsfMsgMaxSev" value="#{jsf.maxMsgSev}"/&amp;gt;
    &amp;lt;/a4j:form&amp;gt;
  &amp;lt;/a4j:outputPanel&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;div class="sub2Hdr"&gt;Rating from Details&lt;/div&gt;When a user views a wine, a new conversation is started with the &lt;span class="seam-comp"&gt;viewWine&lt;/span&gt; component.&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/add_rating_details.jpg" alt="Add rating from wine details, wine to rate is already selected."/&gt;&lt;br /&gt;
Recall that we are already in a conversation when viewing the wine. Thus, the Add Rating operation will occur in a &lt;b&gt;nested conversation&lt;/b&gt; using the &lt;span class="seam-comp"&gt;ratingHome&lt;/span&gt; component.&lt;br /&gt;
&lt;div class="code"&gt;&lt;pre&gt;&lt;span class="annotation"&gt;@Begin(&lt;b&gt;nested=true&lt;/b&gt;, flushMode=FlushModeType.MANUAL)&lt;/span&gt;
    public void beginRatingWine(Wine currentWine) {
        &lt;span class="java-cmt"&gt;// attach the objects needed to create a rating to the
        // extended PersistenceContext for this conversation&lt;/span&gt;
        user = getEntityManager().merge(currentUser);
        wine = getEntityManager().find(Wine.class, currentWine.getId());
        updateStateForWine();
        info("User {0} has started rating wine {1} in NESTED conversation {2}.", user.getUserName(), wine.getFullName(), Conversation.instance().getId());
    }
&lt;/pre&gt;&lt;/div&gt;Upon starting the nested conversation, you need to "attach" the User and Wine objects to the extended PersistenceContext (&lt;span class="code"&gt;entityManager&lt;/span&gt;). &lt;br /&gt;
&lt;br /&gt;
&lt;a name="wineHome"&gt;&lt;/a&gt;&lt;div class="subHdr"&gt;Add New Wine&lt;/div&gt;What type of wiki would this be if users could not add new content on the fly? Of course, this application is not a full-featured wiki (see the &lt;b&gt;wiki&lt;/b&gt; project in the Seam examples), but any authenticated user can add a new Wine (and dependent objects like Winery and Region) to the application. The &lt;span class="seam-comp"&gt;wineHome&lt;/span&gt; component does most of the heavy lifting for adding a new Wine to the system (see &lt;span class="java-class"&gt;org.vinwiki.action.WineHome&lt;/span&gt;). WineHome provides persistence operations for Wine entities by extending &lt;span class="java-class"&gt;org.jboss.seam.framework.EntityHome&lt;/span&gt; from the Seam Application Framework. The implementation is mostly uninteresting from a re-use perspective, except that it serves as a &lt;span class="annotation"&gt;@Factory&lt;/span&gt; for a component named &lt;span class="seam-comp"&gt;wine&lt;/span&gt;. &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="annotation"&gt;@Factory("wine")&lt;/span&gt;
    public Wine initWine() {
        return getInstance();
    }
&lt;/pre&gt;&lt;/div&gt;This allows EL expressions in /admin.xhtml to refer to the new Wine object simply as &lt;span class="seam-comp"&gt;wine&lt;/span&gt;, such as &lt;span class="el"&gt;#{wine.type}&lt;/span&gt;. The &lt;span class="seam-comp"&gt;wine&lt;/span&gt; is managed in conversation scope along with &lt;span class="seam-comp"&gt;wineHome&lt;/span&gt;. &lt;div class="tip"&gt;We'll need to address the threat of spam and cross-site scripting (XSS) hacks before we can release this application on the Web, which I'll address in post #3.&lt;/div&gt;&lt;br /&gt;
&lt;br /&gt;
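To illustrate the &lt;span class="annotation"&gt;@Factory("wine")&lt;/span&gt; mapping described above, here is a hypothetical form fragment (not the actual contents of /admin.xhtml). It assumes the standard JSF &lt;b&gt;h&lt;/b&gt; tag library and relies on the &lt;span class="code"&gt;persist()&lt;/span&gt; method that &lt;span class="seam-comp"&gt;wineHome&lt;/span&gt; inherits from EntityHome; treating &lt;span class="el"&gt;#{wine.type}&lt;/span&gt; as a plain text value is also an assumption. &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;!-- hypothetical fragment; the form and field IDs are made up for illustration --&amp;gt;
&amp;lt;h:form id="wineForm"&amp;gt;
  &amp;lt;!-- "wine" resolves through the factory method to wineHome.getInstance() --&amp;gt;
  &amp;lt;h:inputText id="type" value="#{wine.type}"/&amp;gt;
  &amp;lt;h:commandButton value="Save" action="#{wineHome.persist}"/&amp;gt;
&amp;lt;/h:form&amp;gt;
&lt;/pre&gt;&lt;/div&gt;Because the factory returns the instance managed by &lt;span class="seam-comp"&gt;wineHome&lt;/span&gt;, the value bound in the form and the entity saved by the action are the same object.&lt;br /&gt;
&lt;br /&gt;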
&lt;a name="editWine"&gt;&lt;/a&gt;&lt;div class="subHdr"&gt;Edit Wine&lt;/div&gt;Edit wine also uses the &lt;span class="seam-comp"&gt;wineHome&lt;/span&gt; component and /admin.xhtml. &lt;br /&gt;
&lt;br /&gt;
So that about covers the specific use cases for this posting. Now, let's look at some specific UI implementation details that will help you build better apps with Seam and RichFaces. &lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Other Useful UI Implementation Details&lt;/div&gt;&lt;div class="subHdr"&gt;Facelets&lt;/div&gt;I'm using &lt;a href="https://facelets.dev.java.net/" target="_blank"&gt;Facelets&lt;/a&gt; for templating and UI component re-use. The main layout template is &lt;span class="file-path"&gt;view/layout/template.xhtml&lt;/span&gt;. Each view references this template in the root Facelets &lt;span class="xml-elm"&gt;&amp;lt;ui:composition ... template="layout/template.xhtml"&amp;gt;&lt;/span&gt; element.&lt;br /&gt;
&lt;br /&gt;
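As a minimal sketch of that pattern (not copied from the project), a view that plugs into the layout template might look like the following. The &lt;b&gt;body&lt;/b&gt; region name is an assumption; it has to match whatever &lt;span class="xml-elm"&gt;&amp;lt;ui:insert&amp;gt;&lt;/span&gt; name the template actually declares. &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;!-- hypothetical view; "body" is an assumed ui:insert name in template.xhtml --&amp;gt;
&amp;lt;ui:composition xmlns="http://www.w3.org/1999/xhtml"
    xmlns:ui="http://java.sun.com/jsf/facelets"
    xmlns:h="http://java.sun.com/jsf/html"
    template="layout/template.xhtml"&amp;gt;
  &amp;lt;ui:define name="body"&amp;gt;
    &amp;lt;h:outputText value="Content rendered inside the shared layout"/&amp;gt;
  &amp;lt;/ui:define&amp;gt;
&amp;lt;/ui:composition&amp;gt;
&lt;/pre&gt;&lt;/div&gt;Facelets ignores anything outside &lt;span class="xml-elm"&gt;&amp;lt;ui:composition&amp;gt;&lt;/span&gt;, so the view only supplies the regions it wants to fill in.&lt;br /&gt;
&lt;br /&gt;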
&lt;div class="subHdr"&gt;RichFaces Semantic Layouts&lt;/div&gt;I've started experimenting with the semantic layouts support provided by RichFaces (see &lt;span class="file-path"&gt;view/layout/template.xhtml&lt;/span&gt;). I think the layouts help reduce the CSS you need to manage for an application, so they're definitely worth checking out.&lt;br /&gt;
&lt;br /&gt;
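For readers who have not seen the layout components, here is a rough sketch of how a three-region page could be declared with &lt;span class="xml-elm"&gt;&amp;lt;rich:layout&amp;gt;&lt;/span&gt; and &lt;span class="xml-elm"&gt;&amp;lt;rich:layoutPanel&amp;gt;&lt;/span&gt;. It is an illustration only, not the contents of my template; it assumes the &lt;b&gt;rich&lt;/b&gt; prefix is bound to http://richfaces.org/rich, and the widths and region names are made up. &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;!-- hypothetical layout; panel widths and ui:insert names are assumptions --&amp;gt;
&amp;lt;rich:layout&amp;gt;
  &amp;lt;rich:layoutPanel position="top"&amp;gt;
    &amp;lt;ui:insert name="header"/&amp;gt;
  &amp;lt;/rich:layoutPanel&amp;gt;
  &amp;lt;rich:layoutPanel position="left" width="25%"&amp;gt;
    &amp;lt;ui:insert name="sidebar"/&amp;gt;
  &amp;lt;/rich:layoutPanel&amp;gt;
  &amp;lt;rich:layoutPanel position="center" width="75%"&amp;gt;
    &amp;lt;ui:insert name="body"/&amp;gt;
  &amp;lt;/rich:layoutPanel&amp;gt;
&amp;lt;/rich:layout&amp;gt;
&lt;/pre&gt;&lt;/div&gt;The positioning comes from the component attributes rather than hand-written float and width rules, which is where the CSS savings come from.&lt;br /&gt;
&lt;br /&gt;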
&lt;div class="subHdr"&gt;RichFaces ModalPanel, Forms, and Seam Conversations&lt;/div&gt;The RichFaces &lt;a href="http://docs.jboss.org/richfaces/latest_3_3_X/en/devguide/html/rich_modalPanel.html" target="_blank"&gt;ModalPanel&lt;/a&gt; component is an excellent tool for allowing the user to perform a quick operation without losing their current context with your application. For example, the &lt;b&gt;Add Rating&lt;/b&gt; ModalPanel allows the user to add a rating to the wine they are currently viewing without going to a separate page. Unfortunately, ModalPanel can also be a major source of frustration if not handled correctly, especially when working with an AJAX-driven form on the panel. In this section, I hope to spare you some of that frustration by tackling some of the troublesome areas you may encounter when working with ModalPanels. &lt;div class="sub2Hdr"&gt;Opening the ModalPanel (and beginning a Conversation)&lt;/div&gt;For the complete code for this section, please see &lt;span class="file-path"&gt;view/WEB-INF/facelets/userPreferences.xhtml&lt;/span&gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/mp_show.png" alt="JSF syntax to open a ModalPanel."/&gt;&lt;br /&gt;
Here are some tips to keep in mind when opening a ModalPanel within a conversation: &lt;ol&gt;&lt;li&gt;The &lt;span class="code"&gt;reRender&lt;/span&gt; attribute should include the ID of the panel you are opening. This ensures the UI components in the panel reflect the state of the conversation.&lt;/li&gt;
&lt;li&gt;When the action completes, use JavaScript to show the panel via &lt;span class="code"&gt;oncomplete="#{rich:component('prefPanel')}.show()"&lt;/span&gt;.&lt;/li&gt;
&lt;li&gt;In the action handler on the server, be sure to load any entities needed by the ModalPanel into the extended PersistenceContext for the conversation: &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="java-cmt"&gt;// Out-ject the User object that we're updating, vs. the one that came in ...&lt;/span&gt;
    &lt;span class="annotation"&gt;@Out&lt;/span&gt; protected User updatingUser = null;

    &lt;span class="java-cmt"&gt;// Just a handy reference to the preferences object we're updating ...&lt;/span&gt;
    &lt;span class="annotation"&gt;@Out&lt;/span&gt; protected Preferences updatingPrefs = null;

    &lt;span class="annotation"&gt;@Begin(flushMode = FlushModeType.MANUAL, join = true)&lt;/span&gt;
    public void beginEditPreferences() {
        &lt;span class="java-cmt"&gt;// load the User entity to edit preferences for into the extended PC for this conversation&lt;/span&gt;
        updatingUser = getEntityManager().find(User.class, currentUser.getId());
        updatingPrefs = updatingUser.getPreferences();
        initDob();
    }
&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;&lt;br /&gt;
&lt;/ol&gt;&lt;div class="sub2Hdr"&gt;Processing the Form on the ModalPanel (and ending the Conversation)&lt;/div&gt;So now you have a visible ModalPanel with a form displayed, within a Seam conversation. When the user submits the form, we need to validate the input and display any errors that occur. If no errors occur, then close the panel when the action completes. &lt;img class="screenshot" src="http://vinwiki.com/thelabdude/vinwiki/mp_submit.png" alt="JSF syntax to process a form on ModalPanel."/&gt; &lt;ol&gt;&lt;li&gt;The action handler should end the conversation only if it completes successfully. So be careful with using &lt;span class="annotation"&gt;@End&lt;/span&gt; with AJAX action handlers of type void. I prefer to use &lt;span class="code"&gt;Conversation.instance().end()&lt;/span&gt; when dealing with AJAX actions as it makes it more explicit when the conversation ends.&lt;/li&gt;
&lt;li&gt;Be sure to re-render the form after submit so validation errors are displayed correctly.&lt;/li&gt;
&lt;li&gt;As mentioned previously, use a simple JavaScript function to determine if there are FacesMessages with severity ERROR or greater queued in the FacesContext; close the panel if there are no errors.&lt;/li&gt;
&lt;/ol&gt;&lt;div class="subHdr"&gt;Queue and Traffic Flood Protection using the Global Default Queue&lt;/div&gt;I've enabled the global queue for this application in web.xml: &lt;div class="code"&gt;&lt;pre&gt;&amp;lt;context-param&amp;gt;
    &amp;lt;param-name&amp;gt;org.richfaces.queue.global.enabled&amp;lt;/param-name&amp;gt;
    &amp;lt;param-value&amp;gt;true&amp;lt;/param-value&amp;gt;
  &amp;lt;/context-param&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;Project Setup&lt;/div&gt;&lt;div class="subHdr"&gt;Building&lt;/div&gt;You can download the project from &lt;a href="http://vinwiki.com/thelabdude/vinwiki/vinwiki.zip"&gt;here&lt;/a&gt;.
&lt;br /&gt;The project should build with minimal effort using Seam 2.2.0 and Java 6, either from the command line or from Eclipse (I used v. 3.4.2). 

Once deployed to your JBoss 4.2.3 server, you can register or log in as "vinwiki" with password "S00ner$1". However, you should also register your own account to see how the registration system works. &lt;div class="subHdr"&gt;Database Setup&lt;/div&gt;I've included a mysqldump file (vinwiki.sql) in the root directory which contains some wine-related objects pulled from Freebase. To install this database on your MySQL 5.1.x server, do the following: &lt;div class="code"&gt;&lt;pre&gt;mysql&amp;gt; create database vinwiki;
mysql&amp;gt; grant all privileges on vinwiki.* to 'vinwiki'@'localhost' identified by 'vinwiki';
mysql&amp;gt; use vinwiki;
mysql&amp;gt; source vinwiki.sql;
&lt;/pre&gt;&lt;/div&gt;&lt;div class="subHdr"&gt;Testing&lt;/div&gt;The tests are configured to run against a MySQL database named &lt;b&gt;vinwiki_test&lt;/b&gt; so that tests do not affect the development database. The JDBC connection parameters are specified in: &lt;span class="file-path"&gt;bootstrap/deploy/hsqldb-ds.xml&lt;/span&gt;. From the MySQL command-line, do the following: &lt;div class="code"&gt;&lt;pre&gt;mysql&amp;gt; create database vinwiki_test;
mysql&amp;gt; grant all privileges on vinwiki_test.* to 'vinwiki'@'localhost' identified by 'vinwiki';
&lt;/pre&gt;&lt;/div&gt;&lt;/p&gt;&lt;p&gt;&lt;div class="sectionHdr"&gt;What's Next?&lt;/div&gt;So by now you should have a good understanding of the basic structure of this application and be able to build it and run the unit tests. Please continue on to the &lt;a href="http://thelabdude.blogspot.com/2010/06/vinwiki-part-2-full-text-search-with.html"&gt;second post&lt;/a&gt; which covers Hibernate Search and Lucene to help users find wines of interest.&lt;/p&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6666528949054177156-1932020035980285742?l=thelabdude.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thelabdude.blogspot.com/feeds/1932020035980285742/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://thelabdude.blogspot.com/2010/05/building-intelligent-web-app-using-seam.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/1932020035980285742'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/1932020035980285742'/><link rel='alternate' type='text/html' href='http://thelabdude.blogspot.com/2010/05/building-intelligent-web-app-using-seam.html' title='VinWiki Part 1: Building an intelligent Web app using Seam, Hibernate, RichFaces, Lucene and Mahout'/><author><name>thelabdude</name><uri>http://www.blogger.com/profile/08032248788175889303</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6666528949054177156.post-2296953254013005418</id><published>2009-05-26T20:37:00.000-07:00</published><updated>2011-07-26T11:35:05.406-07:00</updated><title type='text'>Minor updates to the Seam-based user registration example ...</title><content type='html'>&lt;span style="font-family:verdana;font-size:12px;"&gt;&lt;p&gt;Based on feedback from the community, I made the following updates to the &lt;a href="http://thelabdude.blogspot.com/2009/05/user-registration-solution-using-jboss.html"&gt;Seam example&lt;/a&gt; (nothing too major except verifying that the solution works with the latest RichFaces release)&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Tested with RichFaces 3.3.1 GA -- works great!&lt;/li&gt;
&lt;li&gt;Updated the GuestSupport class to use Seam's &lt;a href="http://docs.jboss.org/seam/2.1.1.GA/api/org/jboss/seam/international/StatusMessage.html" target="_blank"&gt;StatusMessages&lt;/a&gt; instead of working directly with FacesMessage(s).&lt;/li&gt;
&lt;li&gt;Updated the registration form to use Seam's &amp;lt;s:decorate&amp;gt; tag and a Facelets template, which really simplified the form.&lt;/li&gt;
&lt;li&gt;Moved the solution-specific text messages from i18n.properties into the Seam default messages.properties ... got tired of having two separate message bundles at work ...&lt;/li&gt;
&lt;li&gt;Cleaned up a few JPA annotations to be more in line with the spec.&lt;/li&gt;
&lt;li&gt;Localized the last few Hibernate Validator error messages that had hard-coded English text in them.&lt;/li&gt;
&lt;li&gt;Created the Gender and ProfilePolicy enums used by the Preferences entity.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The updated source code is available &lt;a href="http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg.zip" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;NOTE: If you want to upgrade to RichFaces 3.3.1, then you need to remove the version information from the JAR filenames:&lt;/p&gt;&lt;pre&gt;richfaces-api-3.3.1.GA.jar rename to richfaces-api.jar
richfaces-impl-3.3.1.GA.jar rename to richfaces-impl.jar
richfaces-ui-3.3.1.GA.jar rename to richfaces-ui.jar
&lt;/pre&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6666528949054177156-2296953254013005418?l=thelabdude.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thelabdude.blogspot.com/feeds/2296953254013005418/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://thelabdude.blogspot.com/2009/05/updates-to-seam-user-registration.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/2296953254013005418'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/2296953254013005418'/><link rel='alternate' type='text/html' href='http://thelabdude.blogspot.com/2009/05/updates-to-seam-user-registration.html' title='Minor updates to the Seam-based user registration example ...'/><author><name>thelabdude</name><uri>http://www.blogger.com/profile/08032248788175889303</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6666528949054177156.post-6514085227545807991</id><published>2009-05-13T19:39:00.001-07:00</published><updated>2011-07-26T11:34:02.353-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hibernate'/><category scheme='http://www.blogger.com/atom/ns#' term='facelets'/><category scheme='http://www.blogger.com/atom/ns#' term='ajax'/><category scheme='http://www.blogger.com/atom/ns#' term='richfaces'/><category scheme='http://www.blogger.com/atom/ns#' term='seam'/><category scheme='http://www.blogger.com/atom/ns#' term='jboss seam'/><category scheme='http://www.blogger.com/atom/ns#' term='jsf'/><title type='text'>User 
registration solution using JBoss Seam with JSF / RichFaces and Hibernate</title><content type='html'>&lt;span style="font-family:verdana;font-size:12px;"&gt; &lt;p&gt;In this article, I show you how to implement a Web 2.0 style user registration solution using JBoss Seam with JSF/Facelets/RichFaces and Hibernate. This article evolved from a real Web 2.0 site I'm building. You can learn more about me on my &lt;a href="http://www.linkedin.com/in/thelabdude" target="_blank"&gt;LinkedIn&lt;/a&gt; profile. As for what's in it for you, I've provided the &lt;a href='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg.zip' target='_blank'&gt;source code&lt;/a&gt; and a working solution using:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="https://javaserverfaces.dev.java.net/" target="_blank"&gt;JSF 1.2 (Mojarra)&lt;/a&gt; with &lt;a href="https://facelets.dev.java.net/" target="_blank"&gt;Facelets&lt;/a&gt; and &lt;a href="http://www.jboss.org/jbossrichfaces/" target="_blank"&gt;RichFaces 3.2.2 SR1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.hibernate.org/" target="_blank"&gt;Hibernate&lt;/a&gt; (EntityManager, Annotations, and Validator)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.seamframework.org" target="_blank"&gt;Seam 2.1.1.GA&lt;/a&gt; on JBoss 4.2.2.GA&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;In future articles, I plan to incorporate Hibernate Search and some collective intelligence techniques such as tag clouds, clustering, collaborative filtering, and recommendations. While writing this article, I assumed the reader would have a basic understanding of JSF and Hibernate. However, I do spend extra time explaining some of the key features of Seam and RichFaces.&lt;/p&gt;&lt;p&gt;I'd like to give credit to Dan Allen as I've used many of the ideas he teaches in &lt;a href="http://www.manning.com/dallen/" target="_blank"&gt;Seam In Action&lt;/a&gt; in my own Seam development; his book is one that all Seam developers should own!&lt;/p&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Basic Requirements&lt;/div&gt;&lt;p&gt;Here are the high-level requirements this solution satisfies. Notice that most of these requirements apply to many Web 2.0 sites seen on the Web today.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;1. Public-facing Web site with read-only access to Guests&lt;/strong&gt;&lt;br /&gt;
The site should be useful without logging in and the content should be accessible to search engines. Since casual surfers will be visiting the site, it must load fast and be easy to use on all major browsers (IE 6 / 7, FireFox 3.x, and Safari 3.x).&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/site.png' alt='Site'/&gt;&lt;/p&gt;&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;a href="#Solution1"&gt;&lt;i&gt;Jump to solution ...&lt;/i&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2. Open registration process with email verification&lt;/strong&gt;&lt;br /&gt;
Guests should be able to sign up for a new account using a simple form that collects minimal information from the user; CAPTCHAs should be used to prevent bots from creating accounts. Each user must agree to the site's terms of use before registering. After registering, the user will also need to click on an activation link sent to their email, which ensures they have access to the email account provided on the form.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/register.png' alt='Registration Form'/&gt;&lt;/p&gt;&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;a href="#Solution2"&gt;&lt;i&gt;Jump to solution ...&lt;/i&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;3. Remember Me&lt;/strong&gt;&lt;br /&gt;
Registered users can request the site to remember their credentials for automatic login for future visits.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/signin.png' alt='Sign-In Form'/&gt;&lt;/p&gt;&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;a href="#Solution3"&gt;&lt;i&gt;Jump to solution ...&lt;/i&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;4. Ability to recover forgotten passwords&lt;/strong&gt;&lt;br /&gt;
Since we have a valid email for every user, users can request a new temporary password to be emailed to them. Upon logging in with the temporary password, the user should be prompted to change their password to a permanent value.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;table&gt;&lt;tr&gt; &lt;td&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/forgot.png' alt='Recover Password Form'/&gt;&lt;/td&gt; &lt;td&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/password_reset.png' alt='Password Reset Confirmation Dialog'/&gt;&lt;/td&gt; &lt;td&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/change_password.png' alt='Change Password Dialog'/&gt;&lt;/td&gt; &lt;/tr&gt;
&lt;/table&gt;&lt;/p&gt;&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;a href="#Solution4"&gt;&lt;i&gt;Jump to solution ...&lt;/i&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;5. User Preferences&lt;/strong&gt;&lt;br /&gt;
Registered users can share personal information, upload a photo, and set up site preferences, such as whether they are interested in receiving emails about special promotions from the site.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/preferences.png' alt='User Preferences'/&gt;&lt;/p&gt;&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;a href="#Solution5"&gt;&lt;i&gt;Jump to solution ...&lt;/i&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;6. Strong Server-side Validation&lt;/strong&gt;&lt;br /&gt;
Since we're exposing the site to the Wild Wild Web, server-side validation is essential; client-side validation using JavaScript is a nice-to-have, but is not sufficient. It's trivial for a hacker to bypass client-side validation, so you want to make sure your server code is protected as well. Where appropriate, use AJAX to give immediate feedback to the user when they've entered invalid data.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/ajax.png' alt='AJAX Validation'/&gt;&lt;/p&gt;&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;a href="#Solution6"&gt;&lt;i&gt;Jump to solution ...&lt;/i&gt;&lt;/a&gt;&lt;/p&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Seam-based Architecture&lt;/div&gt;&lt;p&gt;In a &lt;a href="http://thelabdude.blogspot.com/2009/04/user-authentication-registration-with.html" target="_blank"&gt;previous blog posting&lt;/a&gt;, I used Spring with JSF / RichFaces to satisfy these requirements and advocated a layered architecture. While that is still a valid approach, I take a different approach in this article to highlight key features of Seam. Specifically, instead of covering each layer, I describe how I implemented each requirement. The layers are there if you need them, but with Seam you're not &lt;i&gt;forced&lt;/i&gt; to think in terms of stateless layers. What initially attracted me to Seam was its get-down-to-business approach. First off, it comes with a tool, &lt;b&gt;seam-gen&lt;/b&gt;, which creates the skeleton project for your application, including generating JPA-based Entity code by reverse engineering a database schema. With Seam, you can get right to work building a UI that interacts directly with your entities. In contrast, with my Spring-based solution, I had to flesh out the persistence and business-service layers consisting of the following interfaces and classes:&lt;/p&gt;&lt;pre&gt;example.user.UserService (interface)
example.user.impl.UserServiceImpl (class)
example.user.dao.UserDao (interface)
example.user.dao.RoleDao (interface)
example.user.dao.impl.HibernateUserDao (class)
example.user.dao.impl.HibernateRoleDao (class)
&lt;/pre&gt;&lt;p&gt;Seam, on the other hand, treats the persistence manager as the DAO and provides a framework for CRUD components (see &lt;a href="http://docs.jboss.com/seam/2.1.1.GA/reference/en-US/html_single/#framework" target="_blank"&gt;Seam Application Framework&lt;/a&gt;). I'll address the Seam Application Framework in a future article; for now, I'm going to stay focused on the requirements at hand. I just wanted to make sure you knew that I wasn't throwing the persistence and business-service layers out the window by using Seam. They are there when you need them but are not in the way otherwise.&lt;/p&gt;&lt;p&gt;Here are some of the key aspects of Seam that I'll be using to satisfy the requirements:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;POJOs with bi-directional dependency injection known as &lt;b&gt;bijection&lt;/b&gt; (note: I'm not using EJB3 for this project, but Seam makes it really easy to use EJBs if you need them)&lt;/li&gt;
&lt;li&gt;Object-Relational Mapping (ORM) using the Java Persistence API (JPA) and Hibernate under the covers.&lt;/li&gt;
&lt;li&gt;Declarative transaction management using Seam annotations&lt;/li&gt;
&lt;li&gt;JSF with Facelets and RichFaces&lt;/li&gt;
&lt;li&gt;Test-driven development based on TestNG&lt;/li&gt;
&lt;/ul&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Seam Project Organization&lt;/div&gt;&lt;p&gt;Here is the &lt;a href='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg.zip' target='_blank'&gt;link&lt;/a&gt; to the Seam project created by &lt;b&gt;seam-gen&lt;/b&gt;. Before digging too deeply into the details of the configuration, take a moment to familiarize yourself with the organization of the Seam project:&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/seam_project.png' alt='Seam Project Structure'/&gt;&lt;/p&gt;&lt;p&gt;NOTE: You'll need to copy the lib directory from your Seam 2.1.1.GA directory to my project directory as I excluded the lib directory to minimize the size of the download.&lt;/p&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Object Persistence via JPA&lt;/div&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Domain Entities&lt;/div&gt;&lt;p&gt;There are three primary domain entities: User, Preferences, and Role, which should be self-explanatory given the requirements outlined above. The entities are POJOs, apart from the Seam, JPA, and Hibernate annotations used in a few key places. For example, in the following code snippet, we annotate the example.user.User class as a JPA Entity and map it to the ex_user table in the database:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Entity
@Table(name = "ex_user", uniqueConstraints = @UniqueConstraint(columnNames = "user_name"))
public class User implements Serializable ...
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notice that I'm using &lt;a href="http://en.wikipedia.org/wiki/Surrogate_key" target="_blank"&gt;surrogate keys&lt;/a&gt; for my domain object identifiers, which are auto-generated by Hibernate using the best approach for the specific database-type:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Id @GeneratedValue(generator="native")
@GenericGenerator(name="native", strategy = "native")
@Column(name = "user_id", unique = true, nullable = false)
public Long getUserId() {
    return userId;
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I'm also using the &lt;a href="http://docs.jboss.org/hibernate/stable/validator/reference/en/html_single/" target="_blank"&gt;Hibernate Validator&lt;/a&gt; annotations to restrict the values for some of the members, such as the user's email address:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Column(name = "email", nullable = false, length = 60)
@NotNull
@Length(max = 60)
&lt;span style="color:#cc0000;"&gt;@Email&lt;/span&gt;
public String getEmail() {
    return this.email;
}
&lt;/pre&gt;&lt;/div&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Domain Object Associations&lt;/div&gt;&lt;p&gt;There is a one-to-one relationship between the Preferences and User objects. The Preferences object is loaded with the User object (fetch=FetchType.EAGER) because it contains information that is needed to render the UI for the user.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@OneToOne(cascade = CascadeType.ALL,fetch = FetchType.EAGER)
@PrimaryKeyJoinColumn
public Preferences getPreferences() {
    return this.preferences;
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;There is a bidirectional many-to-many relationship between the User and Role objects using a join table ex_user_role.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@UserRoles
@ManyToMany(targetEntity = Role.class,cascade = {CascadeType.PERSIST,CascadeType.MERGE})
@JoinTable(name = "ex_user_role", joinColumns = @JoinColumn(name = "user_id"), inverseJoinColumns = @JoinColumn(name = "role_id"))
public Set&amp;lt;Role&amp;gt; getUserRoles() {
    if (userRoles == null) {
        userRoles = new HashSet&amp;lt;Role&amp;gt;();
    }
    return userRoles;
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;On the Role object, we have:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@ManyToMany(cascade={CascadeType.PERSIST,CascadeType.MERGE},mappedBy="userRoles",targetEntity=User.class)
public Set&amp;lt;User&amp;gt; getRoleUsers() {
    if (roleUsers == null) {
        roleUsers = new HashSet&amp;lt;User&amp;gt;();
    }
    return roleUsers;
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This solution comes directly from the Hibernate documentation, &lt;a href="http://www.hibernate.org/hib_docs/reference/en/html/assoc-bidirectional-join.html#assoc-bidirectional-join-m2m" target="_blank"&gt;section 7.5.3&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Notice that my entity classes are in the &lt;b&gt;src/main&lt;/b&gt; directory as these classes cannot be incrementally hot-deployed by Seam. This means that if you change the code in these classes, then you need to recycle the Web application before the change is activated. On the other hand, classes in the &lt;b&gt;src/hot&lt;/b&gt; directory are updated by Seam on-the-fly &lt;b&gt;&lt;i&gt;without&lt;/i&gt;&lt;/b&gt; recycling the Web application. Consequently, you want to put as many of your classes in the &lt;b&gt;src/hot&lt;/b&gt; directory as possible (see: &lt;a target="_blank" href="http://docs.jboss.com/seam/2.1.1.GA/reference/en-US/html_single/#gettingstarted-hotdeployment"&gt;2.8. Seam and incremental hot deployment&lt;/a&gt;).&lt;/p&gt;&lt;p&gt;Now that you understand the key domain entities involved in this solution, you're ready to see how I implemented each requirement.&lt;/p&gt;&lt;a name="Solution1"&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Solution for Requirement #1 (Public-facing Web site with read-only access to Guests)&lt;/div&gt;&lt;/a&gt; &lt;p&gt;Much like many Web 2.0 sites today, my site will be useful to casual guests of the site. So it needs to load quickly and look attractive on all major browsers (IE 6-7, FireFox 3.x, and Safari 3.x). Thankfully, JSF and RichFaces satisfy these requirements out-of-the-box. Here are some of the reasons why I like JSF with RichFaces:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Modern, professional looking skin out-of-the-box (I don't like to fool with CSS).&lt;/li&gt;
&lt;li&gt;Extensive set of easy-to-use, yet flexible UI components, such as trees, data grids, tabbed panes, etc.&lt;/li&gt;
&lt;li&gt;Easy to create reusable custom components.&lt;/li&gt;
&lt;li&gt;Large community of users and developers building real apps.&lt;/li&gt;
&lt;li&gt;Performance: loads and runs fast on all modern browsers.&lt;/li&gt;
&lt;li&gt;Minimizes hand-coded JavaScript.&lt;/li&gt;
&lt;li&gt;Templating with Facelets.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;b&gt;Key design decision:&lt;/b&gt; I decided to use the RichFaces &lt;b&gt;ModalPanel&lt;/b&gt; component for my forms because it helps the user keep their current orientation with the site. Instead of navigating to a different page, I simply open a dialog over the current view. Consequently, I didn't have any need for JSF navigation rules in this example because there is only one page.&lt;/p&gt;&lt;a name="Solution2"&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Solution for Requirement #2 (Open registration process with email verification)&lt;/div&gt;&lt;/a&gt; &lt;p&gt;For this requirement, it's useful to break it down into several detailed steps:&lt;/p&gt;&lt;ol style="list-style-type: lower-alpha;"&gt;&lt;li&gt;Guests are presented with a link to register.&lt;/li&gt;
&lt;li&gt;Simple form to collect minimal user information.&lt;/li&gt;
&lt;li&gt;Require the user to solve a CAPTCHA to prevent bots from creating accounts.&lt;/li&gt;
&lt;li&gt;Require the user to agree to the site's terms of use.&lt;/li&gt;
&lt;li&gt;Validate the user's email address before activating the account.&lt;/li&gt;
&lt;li&gt;Activate the account when the user clicks on a link in the activation email.&lt;/li&gt;
&lt;/ol&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Solution for Requirement #2a (Register Link)&lt;/div&gt;&lt;p&gt;In the &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;view/WEB-INF/facelets/headerControls.xhtml&lt;/span&gt; file, I open the &lt;b&gt;registerPanel&lt;/b&gt; dialog using a &lt;b&gt;rich:componentControl&lt;/b&gt; which calls the &lt;b&gt;show&lt;/b&gt; method of the &lt;b&gt;rich:modalPanel&lt;/b&gt; component when the link is clicked:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;h:outputLink value="#" id="registerLink"&amp;gt;
  &amp;lt;h:outputText value="#{i18n.register}"/&amp;gt;
  &lt;span style="color:#cc0000;"&gt;&amp;lt;rich:componentControl for="registerPanel" attachTo="registerLink" operation="show" event="onclick"/&amp;gt;&lt;/span&gt;
&amp;lt;/h:outputLink&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Solution for Requirement #2b (Simple form to collect minimal user information)&lt;/div&gt;&lt;p&gt;In the &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;view/WEB-INF/facelets/guestSupport.xhtml&lt;/span&gt; file, I use a &lt;b&gt;rich:modalPanel&lt;/b&gt; to hold the registration form. My registration form only requires the user to supply a screen name, password, email, first, and last name. Notice that the form fields are bound to properties of a &lt;b&gt;guest&lt;/b&gt; component, for example:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;h:inputText label="#{i18n.login_username}" id="userName" required="true" 
value="&lt;span style="color:#cc0000;"&gt;#{guest.registrationScreenName}&lt;/span&gt;" size="12" maxlength="12" redisplay="true"&amp;gt;
  ...
&amp;lt;/h:inputText&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;b&gt;guest&lt;/b&gt; object is a Seam component &lt;b&gt;example.action.GuestSupport&lt;/b&gt; that handles the registration process, see: &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;src/hot/example/action/GuestSupport.java&lt;/span&gt;. Notice the annotations on the GuestSupport class:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Name("guest")
@Scope(ScopeType.EVENT)
public class GuestSupport ...
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;b&gt;@Name&lt;/b&gt; annotation means that this is a Seam component that can be referenced in the EL using the identifier &lt;b&gt;guest&lt;/b&gt;, such as: &lt;b&gt;#{guest.registrationScreenName}&lt;/b&gt;. The &lt;b&gt;@Scope(ScopeType.EVENT)&lt;/b&gt; annotation means that the component is managed in Seam's &lt;b&gt;EVENT&lt;/b&gt; scope, so the component exists from the Restore View phase until the end of the Render Response phase in the JSF lifecycle. Essentially, the guest component exists only to handle a guest action and is then released by the Seam container. If no guest actions are performed, then this component is never created.&lt;/p&gt;&lt;p&gt;The form is bound to the &lt;b&gt;guest.doRegister&lt;/b&gt; method:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;a:commandButton type="button" id="registerButton" value="#{i18n.register}" action="&lt;span style="color:#cc0000;"&gt;#{guest.doRegister}&lt;/span&gt;"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notice that I'm using a RichFaces AJAX form &lt;b&gt;&amp;lt;a:form&amp;gt;&lt;/b&gt; which means the register action will be submitted as an AJAX request without reloading the entire page.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Solution for Requirement #2c (CAPTCHA)&lt;/div&gt;&lt;p&gt;In my Spring solution, I integrated with reCAPTCHA. However, Seam comes with a CAPTCHA solution built-in so I decided to give it a try. Here is how to embed the Seam CAPTCHA solution into your form:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;s:decorate id="verifyCaptchaField" template="captcha.xhtml"&amp;gt;
  &amp;lt;h:graphicImage id="captchaChallenge" value="/seam/resource/captcha" styleClass="captchaChallenge"/&amp;gt;
  &amp;lt;h:inputText id="verifyCaptcha" value="#{captcha.response}" required="true" size="3"/&amp;gt;
&amp;lt;/s:decorate&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The default solution is to show an image of a simple addition problem, which unlike reCAPTCHA is much easier to read and is probably pretty safe unless your site is for preschoolers who might not be able to solve the addition problem.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/captcha.png' alt='Seam CAPTCHA on Registration Form'/&gt;&lt;/p&gt;&lt;p&gt;From what I've seen, it's also pretty easy to extend the built-in solution to provide your own image creation solution. You should note that if you use Seam's CAPTCHA solution, then you must make sure the Seam Resource Servlet is activated in your web.xml file (done automatically by seam-gen).&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Solution for Requirement #2d (Terms of Use)&lt;/div&gt;&lt;p&gt;I built a placeholder HTML file to hold the site's legal text, see: &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;view/WEB-INF/facelets/legal.html&lt;/span&gt;. If the user submits the form without agreeing to the terms of use, then a nice error message is returned. This is handled by the following code in my &lt;b&gt;doRegister&lt;/b&gt; method:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;if (!agreedToTermsOfUse) {
    String msg = facesSupport.getMessage("please_agree_to_terms", extCtxt.getRequestLocale());
    jsf.addMessage("registerForm:registerButton", new FacesMessage(FacesMessage.SEVERITY_ERROR, msg, null));
    return;
}
&lt;/pre&gt;&lt;/div&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Solution for Requirement #2e (Email Verification)&lt;/div&gt;&lt;p&gt;The &lt;b&gt;doRegister&lt;/b&gt; method creates the user but does not activate the account. Instead, it creates an activation key which is an MD5 hash of the user's information and the current timestamp. The timestamp is needed because there is a very slight chance of getting the same hash code for two different users. The activation key is sent to the user as a link back to the site that will activate their account.&lt;/p&gt;&lt;p&gt;Seam uses Facelet templates as email templates. For account verification, I created the &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;view/WEB-INF/facelets/email/activation.xhtml&lt;/span&gt; template. Notice that I can use the EL to reference my Seam components just as I would for an HTML template:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;m:from name="JSF / RF Seam Example" address="support@saastk.com"/&amp;gt;
&amp;lt;m:to name="#{inactiveNewUser.email}"&amp;gt;&lt;span style="color:#cc0000;"&gt;#{inactiveNewUser.email}&lt;/span&gt;&amp;lt;/m:to&amp;gt;
&amp;lt;m:subject&amp;gt;JSF / RF Seam Example Account Activation&amp;lt;/m:subject&amp;gt;
&amp;lt;m:body&amp;gt;
  &amp;lt;html&amp;gt;
    &amp;lt;body&amp;gt;
      &amp;lt;p&amp;gt;Dear #{inactiveNewUser.displayName},&amp;lt;/p&amp;gt;
      &amp;lt;p&amp;gt;Please click on the following link to activate your account:&amp;lt;/p&amp;gt;
      &amp;lt;p&amp;gt;&amp;lt;a href="&lt;span style="color:#cc0000;"&gt;#{inactiveNewUser.activationLink}&lt;/span&gt;"&amp;gt;#{inactiveNewUser.activationLink}&amp;lt;/a&amp;gt;&amp;lt;/p&amp;gt;
    &amp;lt;/body&amp;gt;
  &amp;lt;/html&amp;gt;
&amp;lt;/m:body&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;b&gt;registrationMailer&lt;/b&gt; component processes the email templates and sends the email, see: &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;src/hot/example/action/RegistrationMailer.java&lt;/span&gt;. Notice the annotations on the RegistrationMailer class:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Name("registrationMailer")
@AutoCreate
@Scope(ScopeType.APPLICATION)
public class RegistrationMailer ...
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;b&gt;@Name&lt;/b&gt; annotation means that this is a Seam component that can be referenced in the EL using the identifier &lt;b&gt;registrationMailer&lt;/b&gt;. The &lt;b&gt;@Scope(ScopeType.APPLICATION)&lt;/b&gt; annotation means that the component is managed in Seam's &lt;b&gt;APPLICATION&lt;/b&gt; scope, so the component exists for as long as the Web application is active. Essentially, there will be only one registrationMailer used by the Seam container. &lt;b&gt;@AutoCreate&lt;/b&gt; means that Seam will instantiate this component for us automatically the first time it is needed.&lt;/p&gt;&lt;p&gt;I decided to send the email as part of the transaction, meaning that if the email fails, then no account is created. The following code puts an instance of the aptly named &lt;b&gt;InactiveNewUser&lt;/b&gt; bean into EVENT scope and then raises a synchronous application event:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;Contexts.getEventContext().set("inactiveNewUser",
  new InactiveNewUser(newUser.toString(), newUser.getUserName(), newUser.getEmail(), newUserLink));
Events.instance().raiseEvent(&lt;span style="color:#cc0000;"&gt;"userRegistered"&lt;/span&gt;);
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The RegistrationMailer is an observer of this event:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&lt;span style="color:#cc0000;"&gt;@Observer("userRegistered")&lt;/span&gt;
public void sendActivationEmail() {
    renderer.render("/WEB-INF/facelets/email/activation.xhtml");
}
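&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Rendering the template only results in a delivered email if Seam has a mail session to send it through. A minimal sketch of that piece of &lt;b&gt;components.xml&lt;/b&gt;, assuming an SMTP server listening on localhost port 25 (both values are placeholders, not taken from the example project):&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;components xmlns="http://jboss.com/products/seam/components"
            xmlns:mail="http://jboss.com/products/seam/mail"&amp;gt;
  &amp;lt;!-- assumption: replace host and port with your own SMTP server --&amp;gt;
  &amp;lt;mail:mail-session host="localhost" port="25"/&amp;gt;
&amp;lt;/components&amp;gt;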
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Supposedly, you can send the email asynchronously, but I ran into trouble so I stayed with the synchronous approach for now. I'll post to the Seam forum to see what I can find out ...&lt;/p&gt;&lt;p&gt;NOTE: Be sure to configure a &lt;b&gt;&amp;lt;mail:mail-session host="???" port="25"/&amp;gt;&lt;/b&gt; in the Seam configuration file: &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;resources/WEB-INF/components.xml&lt;/span&gt;. See &lt;a href="http://docs.jboss.com/seam/2.1.1.GA/reference/en-US/html_single/#mail" target="_blank"&gt;Chapter 21. Email&lt;/a&gt;&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Solution for Requirement #2f (Account Activation)&lt;/div&gt;&lt;p&gt;Assuming all goes well and the email provided by the user is valid, then they will receive an email with a link back to the site, such as:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;http://localhost:8989/jsf_rf_seam_user_reg/home.seam?&lt;span style="color:#cc0000;"&gt;act=19eb2e7567913c47a931e305cdeb6d311242247928343&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When the user clicks on the link, we need to process the &lt;b&gt;act&lt;/b&gt; request parameter. Fortunately, Seam makes this really easy to do using a page action. In &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;resources/WEB-INF/pages.xml&lt;/span&gt; I declare an action for the &lt;b&gt;home.xhtml&lt;/b&gt; page:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;action execute="#{guest.doActivate(request.getParameter('act'))}" 
  if="#{request.getParameter('act') != null}"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In plain English, Seam invokes the &lt;b&gt;guest.doActivate&lt;/b&gt; method if the request contains the &lt;b&gt;act&lt;/b&gt; parameter. If activation is successful, then I trigger the Login form to open up with the newly activated username pre-populated.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Transactional
public void doActivate(String activationKey) {
    Query q = entityManager.createQuery("from User u where u.active=0 AND u.activationKey=:activationKey");
    q.setParameter("activationKey", activationKey);
    User activatedUser = (User)q.getSingleResult();
    activatedUser.setActive(true);
    activatedUser.setActivationKey(null);
    loginUserId = activatedUser.getUserName();
    credentials.setUsername(loginUserId);
    log.info("User {0} activated successfully.", loginUserId);
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;NOTE: You may want to have a background process that cleans up accounts that haven't been activated after so many days.&lt;/p&gt;&lt;a name="Solution3"&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Solution for Requirement #3 (Remember Me)&lt;/div&gt;&lt;/a&gt; &lt;div style="font-size:110%;font-weight:bold;"&gt;Login&lt;/div&gt;&lt;p&gt;Before I can tackle the &lt;b&gt;Remember Me&lt;/b&gt; requirement, I need to make sure you know how login works with Seam. If you search my code, you'll see that there is no &lt;b&gt;login&lt;/b&gt; method. Instead, I rely on Seam's &lt;a href="http://docs.jboss.com/seam/2.1.1.GA/reference/en-US/html_single/#d0e8804" target="_blank"&gt;Identity Management&lt;/a&gt; framework to do the login work for me. The Identity Management framework requires an identity store in &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;resources/WEB-INF/components.xml&lt;/span&gt;:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;security:jpa-identity-store user-class="example.user.User" role-class="example.user.Role"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notice that I'm reusing my User and Role Entities that I've already developed. My login form is bound to Seam's &lt;b&gt;identity.login&lt;/b&gt; method:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;h:commandButton type="submit" id="loginBtn" &lt;span style="color:#cc0000;"&gt;action="#{identity.login}"&lt;/span&gt; value="#{i18n.login}"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;See loginPanel in &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;view/WEB-INF/facelets/guestSupport.xhtml&lt;/span&gt;&lt;/p&gt;&lt;p&gt;If login is successful, then Seam raises the &lt;b&gt;Identity.EVENT_LOGIN_SUCCESSFUL&lt;/b&gt; event; if login fails, then Seam raises the &lt;b&gt;Identity.EVENT_LOGIN_FAILED&lt;/b&gt; event. Thus, my GuestSupport component is configured as an &lt;b&gt;@Observer&lt;/b&gt; for these events:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Observer(Identity.EVENT_LOGIN_SUCCESSFUL)
public void onLogin() {
    &lt;span style="color:#cc0000;"&gt;this.currentUser = byUserName(credentials.getUsername());&lt;/span&gt;
    log.info("User {0} logged in successfully.", this.currentUser.getUserName());
}

@Observer(Identity.EVENT_LOGIN_FAILED)
public void onLoginFailed() {
    this.currentUser = null;
    FacesContext jsf = FacesContext.getCurrentInstance();
    String msg = facesSupport.getMessage("login_error", jsf.getExternalContext().getRequestLocale());
    jsf.addMessage("loginForm:loginBtn", new FacesMessage(FacesMessage.SEVERITY_ERROR, msg, null));
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;b&gt;onLogin&lt;/b&gt; method shows Seam's outjection (&lt;b&gt;@Out&lt;/b&gt;) in action. If login succeeds, then the &lt;b&gt;currentUser&lt;/b&gt; object is outjected into Seam's SESSION scope, as declared by:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&lt;span style="color:#cc0000;"&gt;@Out(required=false, scope=ScopeType.SESSION)&lt;/span&gt;
private User currentUser;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;So, unlike Spring, Seam can easily export runtime objects into the container using simple annotations! Once outjected, the &lt;b&gt;currentUser&lt;/b&gt; component can be used seamlessly in our JSF UI.&lt;/p&gt;&lt;p&gt;At this point, you may be wondering how Seam verifies the credentials supplied by the user against the database. There is a little annotation magic going on here that is not immediately apparent. If you study the example.user.User class, you'll see that I'm using a few annotations from the &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;"&gt;org.jboss.seam.annotations.security.management&lt;/span&gt; package:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Column(name = "user_name",unique = true,nullable = false,length = 16)
@NotNull
@Length(min = 4,max = 16)
@Pattern(regex = "^[a-zA-Z\\d_]{4,12}$",message = "Invalid screen name.")
&lt;span style="color:#cc0000;"&gt;@UserPrincipal&lt;/span&gt;
public String getUserName() {
    return this.userName;
}

...

@Column(name = "password",nullable = false,length = 128)
@NotNull
@Length(max = 128)
&lt;span style="color:#cc0000;"&gt;@UserPassword(hash = "SHA")&lt;/span&gt;
public String getPasswordHash() {
    return this.passwordHash;
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;b&gt;@UserPrincipal&lt;/b&gt; and &lt;b&gt;@UserPassword&lt;/b&gt; annotations are used by the Seam Identity Management framework to verify the credentials during login. With these annotations, Seam has all the information it needs to validate credentials against a one-way SHA-1 hash stored in the database.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Remember Me&lt;/div&gt;&lt;p&gt;Seam supports two approaches to remembering returning users, a feature aptly named &lt;a href="http://docs.jboss.com/seam/2.1.1.GA/reference/en-US/html_single/#d0e8588" target="_blank"&gt;Remember Me&lt;/a&gt;: 1) only the username is remembered, or 2) a persistent access token is stored in a cookie to allow full automatic login. If you read the Seam documentation, you are strongly advised against the latter approach due to cross-site scripting vulnerabilities. While I understand those concerns, it is still a really nice feature to offer users. Consequently, I'm going to show you how I implemented the persistent token approach; if you decide not to use it, it is very easy to roll back to the username-only solution. I actually think the username-only solution is pretty slick if implemented correctly. LinkedIn is a good example of the username-only solution: it recognizes you and shows your profile in read-only mode, and only requires you to log in if you try to make a change to your profile.&lt;/p&gt;&lt;p&gt;Enabling automatic login takes four steps:&lt;/p&gt;&lt;p&gt;First, you need to activate Seam's &lt;b&gt;rememberMe&lt;/b&gt; component in the &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;resources/WEB-INF/components.xml&lt;/span&gt; file and set the mode to &lt;b&gt;autoLogin&lt;/b&gt;:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;security:jpa-token-store token-class="example.security.AutoLoginToken"/&amp;gt;
  &amp;lt;security:remember-me mode="autoLogin"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;b&gt;rememberMe&lt;/b&gt; component needs a token store. Seam provides a JPA-based implementation &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;"&gt;org.jboss.seam.security.JpaTokenStore&lt;/span&gt; that stores the token in your database, but you need to provide the Entity class, see &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;src/main/example/security/AutoLoginToken.java&lt;/span&gt;. The &lt;b&gt;AutoLoginToken&lt;/b&gt; implementation came directly from &lt;a href="http://docs.jboss.com/seam/2.1.1.GA/reference/en-US/html_single/#d0e8588" target="_blank"&gt;15.3.5.1. Token-based Remember-me Authentication&lt;/a&gt; in the Seam documentation.&lt;/p&gt;&lt;p&gt;Second, you need to add a checkbox on the login form bound to the &lt;b&gt;rememberMe&lt;/b&gt; component's enabled property:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;h:selectBooleanCheckbox id="rememberMe" value="&lt;span style="color:#cc0000;"&gt;#{rememberMe.enabled}&lt;/span&gt;"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Third, you need to invoke the &lt;b&gt;identity.tryLogin&lt;/b&gt; method when the home page is requested. As with the activation link processing, I configured a simple page action for home.xhtml in pages.xml:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;action execute="#{identity.tryLogin}" if="#{not identity.loggedIn}"/&amp;gt;
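&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Put together with the activation action shown earlier, the &lt;b&gt;home.xhtml&lt;/b&gt; entry in pages.xml might look roughly like the sketch below. The &lt;b&gt;view-id&lt;/b&gt; value and the order of the two actions are assumptions for illustration, not copied from the example project:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;page view-id="/home.xhtml"&amp;gt;
  &amp;lt;!-- attempt a token-based auto-login, but only for anonymous requests --&amp;gt;
  &amp;lt;action execute="#{identity.tryLogin}" if="#{not identity.loggedIn}"/&amp;gt;
  &amp;lt;!-- handle an activation link when the act parameter is present --&amp;gt;
  &amp;lt;action execute="#{guest.doActivate(request.getParameter('act'))}"
    if="#{request.getParameter('act') != null}"/&amp;gt;
&amp;lt;/page&amp;gt;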
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The Seam documentation mentioned using the following event declarations in components.xml, but that did not work for me:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;event type="org.jboss.seam.security.notLoggedIn"&amp;gt;
  &amp;lt;action execute="#{redirect.captureCurrentView}"/&amp;gt;
  &amp;lt;action execute="#{identity.tryLogin()}"/&amp;gt;
&amp;lt;/event&amp;gt;
&amp;lt;event type="org.jboss.seam.security.loginSuccessful"&amp;gt;
  &amp;lt;action execute="#{redirect.returnToCapturedView}"/&amp;gt;
&amp;lt;/event&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Lastly, you need to observe the Identity.EVENT_POST_AUTHENTICATE event and out-ject the &lt;b&gt;currentUser&lt;/b&gt; component into Session scope if auto-login was successful:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Observer(Identity.EVENT_POST_AUTHENTICATE)
public void postAuthenticate(Identity identity) {
    if (identity != null &amp;&amp; identity.isLoggedIn() &amp;&amp; this.currentUser == null) {
        String userName = identity.getUsername();
        if (userName == null) userName = identity.getPrincipal().getName();
        if (userName != null) {
            &lt;span style="color:#cc0000;"&gt;this.currentUser = byUserName(userName);&lt;/span&gt;
            log.info("User {0} logged in successfully.", this.currentUser.getUserName());
        }
    }
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The Identity.EVENT_LOGIN_SUCCESSFUL event was not raised by the &lt;b&gt;rememberMe&lt;/b&gt; component upon successful auto-login, which may be a bug? I'll ask the Seam folks ... for now, you have to have this additional @Observer in place :-(&lt;/p&gt;&lt;a name="Solution4"&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Solution for Requirement #4 (Ability to recover forgotten passwords)&lt;/div&gt;&lt;/a&gt; &lt;p&gt;If a user forgets their password, then they can request the site to email a temporary password by submitting their screen name and email.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/forgot.png' alt='Recover Password Form'/&gt;&lt;/p&gt;&lt;p&gt;Since this is an action performed by a guest of the site, the form is processed by the &lt;b&gt;guest&lt;/b&gt; component:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Transactional
public void doRecoverLostPassword() throws Exception {
    ...
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;b&gt;@Transactional&lt;/b&gt; annotation tells Seam that this method must be invoked within the context of a transaction. If the method completes successfully, then Seam will commit the work for you regardless of the underlying transaction manager your application is using.&lt;/p&gt;&lt;p&gt;When the user logs in to the site with their temporary password (sent in an email), the site needs to prompt them to change the password to a permanent value. As before, to keep the user's orientation with the site, I pop up a modal dialog that requires the user to change their password; see &lt;b&gt;chngPswdPanel&lt;/b&gt; in &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;view/WEB-INF/facelets/userPreferences.xhtml&lt;/span&gt;.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/change_password.png' alt='Change Password Dialog'/&gt;&lt;/p&gt;&lt;p&gt;The modal panel will only show itself if the user's password is temporary:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;rich:modalPanel id="chngPswdPanel" width="320" height="240" rendered="#{identity.loggedIn}" &lt;span style="color:#cc0000;"&gt;showWhenRendered="#{currentUser.temporaryPassword}"&lt;/span&gt;&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The change password form is handled by the &lt;b&gt;changePassword&lt;/b&gt; component, see: &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;src/hot/example/action/ChangePasswordAction.java&lt;/span&gt;&lt;/p&gt;&lt;a name="Solution5"&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Solution for Requirement #5 (User Preferences)&lt;/div&gt;&lt;/a&gt; &lt;p&gt;Once logged in, the user can click on the Preferences link to edit their profile and control how they want the site to interact with them. The &lt;b&gt;prefPanel&lt;/b&gt; dialog in userPreferences.xhtml is activated using the following AJAX Link &lt;b&gt;a:commandLink&lt;/b&gt; in headerControls.xhtml:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;a:commandLink id="prefLink" action="&lt;span style="color:#cc0000;"&gt;#{updatePreferences.beginEditPreferences(currentUser.userId)}&lt;/span&gt;" oncomplete="#{rich:component('prefPanel')}.show()" value="#{i18n.preferences}"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Look closely at the EL for the action attribute; I'm using a new component &lt;b&gt;updatePreferences&lt;/b&gt;, see: &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;src/hot/example/action/UpdatePreferencesAction.java&lt;/span&gt;.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Name("updatePreferences")
&lt;span style="color:#cc0000;"&gt;@Scope(ScopeType.CONVERSATION)&lt;/span&gt;
@Restrict("#{identity.loggedIn}")
@Transactional
public class UpdatePreferencesAction implements java.io.Serializable ...
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The Scope of this component is &lt;b&gt;ScopeType.CONVERSATION&lt;/b&gt;, which means this component will exist for the duration of a conversation, possibly spanning multiple requests from the user. The Seam documentation equates a &lt;a href="http://docs.jboss.com/seam/2.1.1.GA/reference/en-US/html_single/#d0e3567" target="_blank"&gt;conversation&lt;/a&gt; with a single unit-of-work from the perspective of the user (not the database). In this case, the unit-of-work is to edit preferences, which may include uploading a photo as well as submitting an AJAX form one or more times. So the conversation begins when the user opens the preferences dialog and ends when they close it. Here is the &lt;b&gt;beginEditPreferences&lt;/b&gt; method:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Begin(flushMode = FlushModeType.MANUAL, join = true)
public void beginEditPreferences(Long userId) {
    &lt;span style="color:#cc0000;"&gt;currentUser = entityManager.find(User.class, userId);&lt;/span&gt; &lt;span style="color:#339900;"&gt;// attach to this EntityManager&lt;/span&gt;
    log.info("User {0} has started editing preferences.", currentUser.getUserName());
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When the conversation begins, I use the &lt;b&gt;entityManager&lt;/b&gt; to get the User Entity for the current user, identified by the &lt;b&gt;userId&lt;/b&gt; parameter. Of course, there is a &lt;b&gt;currentUser&lt;/b&gt; component in Session Scope (out-jected during the login process). However, it is not attached to the updatePreferences component's entityManager (it is a detached Entity that is only useful for reading data about the current user). When the &lt;b&gt;beginEditPreferences&lt;/b&gt; method completes, the currentUser stays attached to the entityManager for the duration of the conversation.&lt;/p&gt;&lt;p&gt;While the preferences dialog is open, the user can submit several requests to the server, such as uploading an image. For image upload, I employ the RichFaces &lt;b&gt;rich:fileUpload&lt;/b&gt; component:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;rich:fileUpload fileUploadListener="#{updatePreferences.fileUploadListener}" maxFilesQuantity="1" id="uploadPhoto" immediateUpload="true" acceptedTypes="jpg,gif,png" allowFlash="false" listHeight="110"&amp;gt;
  &amp;lt;a:support event="onuploadcomplete" reRender="prefPanel,uploadPhotoPanel"/&amp;gt;
&amp;lt;/rich:fileUpload&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;b&gt;fileUploadListener&lt;/b&gt; method participates in the conversation accessing the &lt;b&gt;currentUser&lt;/b&gt; that was attached to the entityManager when the conversation began.&lt;/p&gt;&lt;p&gt;When the user closes the preferences dialog, the conversation ends by invoking an AJAX call to &lt;b&gt;updatePreferences.endEditPreferences&lt;/b&gt; which is annotated with @End:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&lt;span style="color:#cc0000;"&gt;@End&lt;/span&gt;
public void endEditPreferences() {
    &lt;span style="color:#cc0000;"&gt;entityManager.flush();&lt;/span&gt; &lt;span style="color:#339900;"&gt;// flush the changes to the database&lt;/span&gt;
    log.info("User {0} finished editing preferences.", currentUser.getUserName());
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notice that the updatePreferences component out-jects the &lt;b&gt;currentUser&lt;/b&gt; component back into Session scope (as the guest component did after a successful login). This is needed so that changes are propagated back to the UI, such as the user's display name shown in the top right corner of the application.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/conversation.png' alt='Conversation'/&gt;&lt;/p&gt;&lt;p&gt;It may be hard to appreciate Seam conversations with such simplistic objects at play. However, as your object associations become more complex, conversations are immensely powerful in helping you manage updates to your entities.&lt;/p&gt;&lt;a name="Solution6"&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Solution for Requirement #6 (Strong Server-side Validation using AJAX)&lt;/div&gt;&lt;/a&gt; &lt;p&gt;With Seam, you don't need any JavaScript validation--all validation can be done efficiently on the server. For example, notice how I ensure the user's password is strong:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;h:inputSecret label="#{i18n.login_password}" id="password" required="true" value="#{passwordSupport.password}" size="12" maxlength="12"&amp;gt;
  &amp;lt;f:validateLength minimum="8" maximum="12"/&amp;gt;
  &amp;lt;rich:beanValidator/&amp;gt;
&amp;lt;/h:inputSecret&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;On the server, the password is validated against a regular expression using Hibernate Validator:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@NotEmpty
@Length(min=8,max=12)
@Pattern(regex="(?=^.{8,}$)((?=.*\\d)|(?=.*\\W+))(?![.\\n])(?=.*[A-Z])(?=.*[a-z]).*$", message="Password is not secure enough!")
private String password = null;
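// How the pattern above reads, group by group (a sketch, not an exhaustive spec):
//   (?=^.{8,}$)             at least 8 characters on a single line
//   ((?=.*\\d)|(?=.*\\W+))  at least one digit or one non-word character
//   (?![.\\n])              the first character is not '.' or a newline
//   (?=.*[A-Z])(?=.*[a-z])  at least one upper-case and one lower-case letter
// Together with @Length(min=8,max=12), "Passw0rd" is accepted, while
// "password1" is rejected because it has no upper-case letter.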
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;See: &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;src/hot/example/action/PasswordSupport.java&lt;/span&gt;&lt;/p&gt;&lt;p&gt;Most of the forms used in this solution are submitted using AJAX requests. The login form and the image upload form are the only two forms that redisplay the entire page when submitted. Here are some other areas where I've used the AJAX features of RichFaces:&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Country / Region&lt;/div&gt;&lt;p&gt;One of the nice things about RichFaces is that you can attach AJAX-driven logic to any JSF component without writing any JavaScript. For example, suppose we want to limit the list of regions based on the selected country on the Preferences form. This is trivial using the &amp;lt;a4j:support&amp;gt; tag provided by RichFaces:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;a4j:support event="onchange" reRender="region" ajaxSingle="true"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This fires an AJAX request when the country is changed and upon success, the region component is re-rendered with a country-specific list of regions.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;User name check with AJAX&lt;/div&gt;&lt;p&gt;During registration, if the user's desired screen name is already taken by another user, we should let them know immediately. To do this, we use the &amp;lt;a4j:support&amp;gt; tag provided by RichFaces on the userName &amp;lt;h:inputText&amp;gt; field:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;h:inputText label="#{i18n.login_username}" id="userName" required="true" value="#{guest.registrationScreenName}" size="12" maxlength="12" redisplay="true"&amp;gt;
  &amp;lt;f:validateLength minimum="4" maximum="12"/&amp;gt;
  &amp;lt;rich:beanValidator summary="#{i18n.invalid_screen_name}"/&amp;gt;
  &amp;lt;a:support event="onblur" reRender="userName" ajaxSingle="true"/&amp;gt;
&amp;lt;/h:inputText&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When the onblur event occurs, RichFaces sends an AJAX request to update the &lt;b&gt;guest.registrationScreenName&lt;/b&gt; property. If the desired screen name is already taken, a FacesMessage is created to be displayed by the &amp;lt;rich:message&amp;gt; tag associated with the username field.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;rich:message for="userName"&amp;gt;
  &amp;lt;f:facet name="errorMarker"&amp;gt;&amp;lt;h:graphicImage value="img/error.gif"/&amp;gt;&amp;lt;/f:facet&amp;gt;
&amp;lt;/rich:message&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Test Driven Development with Seam&lt;/div&gt;&lt;p&gt;One of the major attractions of POJO development and frameworks like Spring is the ease with which you can test your code as you develop it outside the container. Seam offers a similar solution, but does require the JBoss micro-container to execute tests. Other than the tests running much slower than similar Spring-based tests, I found testing with Seam to be very easy and effective. On the other hand, I wasn't able to find a simple JSF testing solution for Spring anyway, so just having the SeamTest framework in place is a benefit over Spring, even if it is slower. Seam tests are executed by &lt;a href="http://testng.org/doc/documentation-main.html" target="_blank"&gt;TestNG&lt;/a&gt;. I've implemented a test that performs all the actions needed to support the requirements described above, see: &lt;span style="font-family:courier new;font-size:95%;font-weight:bold;color:#336600;"&gt;src/test/example/test/UserRegistrationTest.java&lt;/span&gt;. Notice that you use the same EL as your JSF Facelets do!&lt;/p&gt;&lt;p&gt;Lastly, I want to mention one thing you should do to fix an issue with incremental hot deploy. The &lt;b&gt;seam explode&lt;/b&gt; command is supposed to hot-deploy your Facelets and Java classes from the &lt;b&gt;hot&lt;/b&gt; directory. However, it also updates the JCA ConnectionFactory in the JBoss deploy directory. This can cause issues (see: https://jira.jboss.org/jira/browse/JBSEAM-3844?focusedCommentId=12442804). My solution was to add a new target, &lt;b&gt;hotdeploy&lt;/b&gt;, to my project's build.xml file; it is the same as the explode target except that datasource is no longer a dependency:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;target name="hotdeploy" depends="stage" description="Deploy the exploded archive but not the datasource"&amp;gt;
  &amp;lt;fail unless="jboss.home"&amp;gt;
    jboss.home not set
  &amp;lt;/fail&amp;gt;
  &amp;lt;mkdir dir="${war.deploy.dir}"/&amp;gt;
  &amp;lt;copy todir="${war.deploy.dir}"&amp;gt;
    &amp;lt;fileset dir="${war.dir}"/&amp;gt;
  &amp;lt;/copy&amp;gt;
&amp;lt;/target&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You also have to update the build.xml file in the seam-gen directory of your Seam distribution:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;target name="hotdeploy" depends="validate-project" description="Deploy the project as an exploded directory"&amp;gt;
  &amp;lt;echo message="Deploying project '${project.name}' to JBoss AS as an exploded directory"/&amp;gt;
  &amp;lt;ant antfile="${project.home}/build.xml" target="hotdeploy" inheritall="false"/&amp;gt;
&amp;lt;/target&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;b&gt;IMPORTANT&lt;/b&gt;: I excluded the lib directory from the download zip to reduce the file size. You simply need to copy the lib directory from your seam installation to my project folder to get the build working.&lt;/p&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6666528949054177156-6514085227545807991?l=thelabdude.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thelabdude.blogspot.com/feeds/6514085227545807991/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://thelabdude.blogspot.com/2009/05/user-registration-solution-using-jboss.html#comment-form' title='43 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/6514085227545807991'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/6514085227545807991'/><link rel='alternate' type='text/html' href='http://thelabdude.blogspot.com/2009/05/user-registration-solution-using-jboss.html' title='User registration solution using JBoss Seam with JSF / RichFaces and Hibernate'/><author><name>thelabdude</name><uri>http://www.blogger.com/profile/08032248788175889303</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>43</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6666528949054177156.post-6473217683954305768</id><published>2009-04-16T15:13:00.001-07:00</published><updated>2011-07-26T11:32:40.765-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hibernate'/><category scheme='http://www.blogger.com/atom/ns#' term='spring'/><category scheme='http://www.blogger.com/atom/ns#' 
term='richfaces'/><category scheme='http://www.blogger.com/atom/ns#' term='spring security'/><category scheme='http://www.blogger.com/atom/ns#' term='jsf'/><title type='text'>User authentication / registration with JSF / RichFaces, Spring, and Hibernate</title><content type='html'>&lt;span style="font-family:verdana;font-size:12px;"&gt; &lt;p&gt;In this article, I show you how to implement a Web 2.0-style user registration solution using JSF/RichFaces, Spring, and Hibernate. This article evolved from a real Web 2.0 site I'm building; I hope to document more lessons learned as I have time. You can learn more about me on my &lt;a href="http://www.linkedin.com/in/thelabdude" target="_blank"&gt;LinkedIn&lt;/a&gt; profile. As for what's in it for you, I've provided the &lt;a href='http://vinwiki.com/thelabdude/jsf_rf_spring_user_reg_src.zip' target='_blank'&gt;source code&lt;/a&gt; and a working solution using:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="https://javaserverfaces.dev.java.net/" target="_blank"&gt;JSF 1.2 (Mojarra)&lt;/a&gt; with &lt;a href="http://www.jboss.org/jbossrichfaces/" target="_blank"&gt;RichFaces 3.3.1&lt;/a&gt; (using JSP not Facelets)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.hibernate.org/" target="_blank"&gt;Hibernate&lt;/a&gt; Core 3.3.1, Hibernate Annotations 3.4.0, Hibernate Validator 3.1.0&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.springsource.org/about" target="_blank"&gt;Spring Framework 2.5.6&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://static.springsource.org/spring-security/site/index.html" target="_blank"&gt;Spring Security 2.0.4&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;In future articles, I also plan to incorporate Hibernate Search and implementations of collective intelligence techniques such as tag clouds, clustering, collaborative filtering, and recommendations.&lt;/p&gt;&lt;p&gt;While writing this article, I assumed the reader would have a basic understanding of the technologies listed above and focused more on how to integrate them to satisfy the requirements discussed in the next section.&lt;/p&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Basic Requirements&lt;/div&gt;&lt;p&gt;Here are the high-level requirements this solution satisfies. Notice that most of these requirements apply to many Web 2.0 sites seen on the Web today.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;1. Public-facing Web site with read-only access to Guests&lt;/strong&gt;&lt;br /&gt;
The site should be useful without logging in and the content should be accessible to search engines. Since casual surfers will be visiting the site, it must load fast and be easy to use on all major browsers (IE 6 / 7, FireFox 3.x, and Safari 3.x).&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/site.png' alt='Site'/&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2. Open registration process with email verification&lt;/strong&gt;&lt;br /&gt;
Guests should be able to sign up for a new account using a simple form that collects minimal information from the user; CAPTCHAs should be used to prevent bots from creating accounts. Each user must agree to the site's terms of use before registering. After registering, the user will also need to click on an activation link sent to their email, which ensures they have access to the email account provided on the form.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/register.png' alt='Registration Form'/&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;3. Remember Me&lt;/strong&gt;&lt;br /&gt;
Registered users can request the site to remember their credentials for automatic login for future visits.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/signin.png' alt='Sign-In Form'/&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;4. Ability to recover forgotten passwords&lt;/strong&gt;&lt;br /&gt;
Since we have a valid email for every user, users can request a new temporary password to be emailed to them. Upon logging in with the temporary password, the user should be prompted to change their password to a permanent value.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;table&gt;&lt;tr&gt; &lt;td&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/forgot.png' alt='Recover Password Form'/&gt;&lt;/td&gt; &lt;td&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/password_reset.png' alt='Password Reset Confirmation Dialog'/&gt;&lt;/td&gt; &lt;td&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/change_password.png' alt='Change Password Dialog'/&gt;&lt;/td&gt; &lt;/tr&gt;
&lt;/table&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;5. User Preferences&lt;/strong&gt;&lt;br /&gt;
Registered users can share personal information, upload a photo, and set up site preferences, such as whether they are interested in receiving emails about special promotions from the site.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/preferences.png' alt='User Preferences'/&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;6. Strong Server-side Validation&lt;/strong&gt;&lt;br /&gt;
Since we're exposing the site to the Wild Wild Web, server-side validation is essential; client-side validation using JavaScript is nice to have but not sufficient. It's trivial for a hacker to bypass client-side validation, so you want to make sure your server code is protected as well. Where appropriate, use AJAX to give immediate feedback to the user when they've entered invalid data.&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_seam_user_reg/ajax.png' alt='AJAX Validation'/&gt;&lt;/p&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Layered Architecture&lt;/div&gt;&lt;p&gt;To satisfy these requirements, I've implemented a layered architecture depicted in the diagram below:&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_spring_user_reg/layers.png' alt='Layered Architecture Diagram'/&gt;&lt;/p&gt;&lt;p&gt;For simple authentication, Spring Security may seem like overkill. However, as the application evolves, I'm also going to need authorization. In addition, Spring Security supports OpenID, which is also of interest for this site.&lt;/p&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Web Application Organization&lt;/div&gt;&lt;p&gt;Here is the &lt;a href='http://vinwiki.com/thelabdude/jsf_rf_spring_user_reg_src.zip' target='_blank'&gt;link&lt;/a&gt; to the source code and WAR file. A simple Ant build script is provided so you can build the source. 
Before digging too deeply into the details of the configuration, take a moment to familiarize yourself with the organization of the Web application:&lt;/p&gt;&lt;p style="text-align:center;"&gt;&lt;img src='http://vinwiki.com/thelabdude/jsf_rf_spring_user_reg/dir_struct.png' alt='Web Application Directory Structure'/&gt;&lt;/p&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Persistence Layer&lt;/div&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Domain Objects&lt;/div&gt;&lt;p&gt;There are three primary domain objects: User, Preferences, and Role, which should be self-explanatory given the requirements outlined above. The domain objects are POJOs, apart from Hibernate annotations in a few key places. For example, in the following code snippet, we annotate the User class as a Hibernate Entity and map it to the ex_user table in the database:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Entity
@Table(name="ex_user")
public class User implements Serializable ...
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notice that we are using &lt;a href="http://en.wikipedia.org/wiki/Surrogate_key" target="_blank"&gt;surrogate keys&lt;/a&gt; for our domain objects that are auto-generated by Hibernate using the best approach for the specific database-type:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Id
@GeneratedValue(strategy = GenerationType.AUTO)
public Long getUserId() {
    return userId;
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We also define the userName property to be the natural ID for the User entity. While rare, you can allow the userName to change, such as if a user gets married and wants to change her name.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@NaturalId(mutable=true)
public String getUserName() {
    return userName;
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I'm also using the Hibernate Validator annotations to restrict the values for some of the members, such as the userName:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@NotEmpty
@Length(min=4,max=16)
@Pattern(regex="^[a-zA-Z\\d_]{4,12}$", message="Invalid screen name.")
private String userName;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Unfortunately, the default localized message for the &lt;b&gt;@Pattern&lt;/b&gt; annotation isn't very user-friendly so I've had to resort to a hard-coded English message until I figure out a cleaner solution.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Domain Object Associations&lt;/div&gt;&lt;p&gt;There is a one-to-one relationship between the Preferences and User objects. The Preferences object is loaded with the User object (fetch=FetchType.EAGER) because it contains information that is needed to render the UI for the user.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@OneToOne(cascade=CascadeType.ALL, fetch=FetchType.EAGER)
@PrimaryKeyJoinColumn
public Preferences getPreferences() {
    return preferences;
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;There is a bidirectional many-to-many relationship between the User and Role objects using a join table ex_user_role.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@ManyToMany(targetEntity=example.user.Role.class, cascade={CascadeType.PERSIST, CascadeType.MERGE})
@JoinTable(name="ex_user_role",joinColumns=@JoinColumn(name="userId"), inverseJoinColumns=@JoinColumn(name="roleId"))
protected Set&amp;lt;Role&amp;gt; getUserRoles() {
    if (userRoles == null) userRoles = new HashSet&amp;lt;Role&amp;gt;();
    return userRoles;
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;On the Role object, we have:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@ManyToMany(cascade = {CascadeType.PERSIST, CascadeType.MERGE}, mappedBy="userRoles", targetEntity=User.class)
public Set&amp;lt;User&amp;gt; getRoleUsers() {
    if (roleUsers == null) roleUsers = new HashSet&amp;lt;User&amp;gt;();
    return roleUsers;
}
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This solution comes directly from the Hibernate documentation, &lt;a href="http://www.hibernate.org/hib_docs/reference/en/html/assoc-bidirectional-join.html#assoc-bidirectional-join-m2m" target="_blank"&gt;section 7.5.3&lt;/a&gt;.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Data Access Object (DAO)&lt;/div&gt;&lt;p&gt;The persistence layer adopts the &lt;a href="http://en.wikipedia.org/wiki/Data_Access_Object" target="_blank"&gt;Data Access Object (DAO)&lt;/a&gt; design pattern to hide the details of the underlying persistence mechanism from the business logic of the application. For this solution, there are two DAO interfaces defined: UserDAO and RoleDAO.&lt;/p&gt;&lt;p&gt;The HibernateUserDao implements the UserDAO interface and extends Spring's &lt;a href="http://static.springframework.org/spring/docs/2.0.x/api/org/springframework/orm/hibernate/support/HibernateDaoSupport.html" target="_blank"&gt;HibernateDaoSupport&lt;/a&gt; base class. HibernateDaoSupport uses the &lt;a href="http://en.wikipedia.org/wiki/Template_method_pattern" target="_blank"&gt;Template method pattern&lt;/a&gt; to greatly simplify the process of using Hibernate correctly. Additionally, all checked Hibernate exceptions are translated into unchecked exceptions that extend from &lt;a href="http://static.springsource.org/spring/docs/2.5.x/api/org/springframework/dao/DataAccessException.html" target="_blank"&gt;org.springframework.dao.DataAccessException&lt;/a&gt;. This allows developers to handle most persistence exceptions, which are non-recoverable, only in the appropriate layers, without having annoying boilerplate catch-and-throw blocks and exception declarations in the DAOs. For example, an attempt to insert a new record that violates a primary key constraint can only be resolved by changing the values the end-user is providing, such as a User ID during registration. 
With the Spring DataAccessException strategy, the DAO will throw a DataIntegrityViolationException and allow the presentation layer to catch it and prompt the end-user to change their input accordingly. Moreover, the Spring DataAccessException strategy insulates our business-logic layer from Hibernate specific classes. This is helpful if the underlying persistence mechanism needs to change for your application.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Hibernate&lt;/div&gt;&lt;p&gt;Each DAO implementation bean is injected with an instance of &lt;a href="http://static.springframework.org/spring/docs/2.5.x/api/org/springframework/orm/hibernate3/annotation/AnnotationSessionFactoryBean.html" target="_blank"&gt;AnnotationSessionFactoryBean&lt;/a&gt;. This Spring FactoryBean provides a shared Hibernate SessionFactory and supports JDK 1.5+ annotation metadata for mappings, which alleviates the need to have hbm.xml files for our persistent objects.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;bean id="hbSessionFactory" class="org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean"&amp;gt;
  &amp;lt;property name="dataSource" ref="jdbcDs"/&amp;gt;
  &amp;lt;property name="annotatedClasses"&amp;gt;
    &amp;lt;list&amp;gt;
      &amp;lt;value&amp;gt;example.user.User&amp;lt;/value&amp;gt;
      &amp;lt;value&amp;gt;example.user.Role&amp;lt;/value&amp;gt;
      &amp;lt;value&amp;gt;example.user.Preferences&amp;lt;/value&amp;gt;
    &amp;lt;/list&amp;gt;
  &amp;lt;/property&amp;gt;
  &amp;lt;property name="hibernateProperties"&amp;gt;
    &amp;lt;value&amp;gt;hibernate.dialect=example.user.dao.hibernate.HSQLDialect_HHH_1598&amp;lt;/value&amp;gt;
  &amp;lt;/property&amp;gt;
  &amp;lt;property name="schemaUpdate" value="true"/&amp;gt;
&amp;lt;/bean&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The hbSessionFactory bean is configured to auto-generate the database schema and is injected with our Spring DataSource. The custom hibernate.dialect property is provided to fix an issue with HSQLDB (see: &lt;a href="http://forum.hibernate.org/viewtopic.php?p=2324183" target="_blank"&gt;http://forum.hibernate.org/viewtopic.php?p=2324183&lt;/a&gt;).&lt;/p&gt;&lt;p&gt;Under the covers, we use an Apache Commons DBCP connection pool that is exposed as a javax.sql.DataSource in the JNDI tree in Tomcat. A javax.sql.DataSource is defined by the JDBC specification to hide connection pooling and transaction management issues from application code. The &lt;span style="font-family:courier new;font-size:90%;font-weight:bold;"&gt;web-application-config.xml&lt;/span&gt; file defines a reference to the javax.sql.DataSource using:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;bean class="org.springframework.jndi.JndiObjectFactoryBean" id="jdbcDs"&amp;gt;
  &amp;lt;property name="jndiName" value="java:comp/env/jdbc/exampleDs"/&amp;gt;
&amp;lt;/bean&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The database specific properties for the pool are configured in the &lt;span style="font-family:courier new;font-size:90%;font-weight:bold;"&gt;META-INF/context.xml&lt;/span&gt; file:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;Resource name="jdbc/exampleDs"
          auth="Container"
          type="javax.sql.DataSource"
          username="sa"
          password=""
          driverClassName="org.hsqldb.jdbcDriver"
          url="jdbc:hsqldb:file:example"
          initialSize="0"
          maxActive="10"
          maxIdle="10"
          minIdle="0"
          maxWait="30000"
          poolPreparedStatements="true"
          /&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Be sure to put the hsqldb.jar file into the tomcat/lib directory if you use HSQLDB.&lt;/p&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Service Layer&lt;/div&gt;&lt;p&gt;One could allow the presentation layer beans to interact directly with the DAOs. However, this approach leads to your business logic becoming entangled with presentation logic, which makes it difficult to support other interfaces, such as Web services. The Service Layer sits between your presentation layer and persistence layer. The Service Layer is where you implement the &lt;em&gt;business-logic&lt;/em&gt; of your application.&lt;/p&gt;&lt;p&gt;For this solution, we have only one Spring bean in the Service Layer: &lt;strong&gt;UserService&lt;/strong&gt;. The UserService provides a high-level interface to the operations provided by the UserDAO and RoleDAO. UserService is an application of the &lt;a href='http://java.sun.com/blueprints/corej2eepatterns/Patterns/SessionFacade.html' target='_blank'&gt;Session Facade&lt;/a&gt; design pattern in that it aggregates fine-grained data access operations into a single high-level transaction-aware service. Moreover, the UserService keeps business-logic out of the DAO implementations, which are only concerned with efficient object storage and retrieval.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Transactions&lt;/div&gt;&lt;p&gt;As mentioned above, the UserService is transaction-aware. This could be accomplished with a Spring TransactionProxyFactoryBean working with the HibernateTransactionManager. However, since we're using Spring 2.5.x and Java 5, we can use a more compact, annotation-driven approach for making our UserServiceImpl transactional. First, you need to include the following element in your Spring configuration:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;tx:annotation-driven/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Second, you need to mark your service's concrete implementation class with the @Transactional annotation:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@Transactional
public class UserServiceImpl implements UserService ...
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For more information about declarative transactions with Spring, see: &lt;a href="http://static.springsource.org/spring/docs/2.5.x/reference/transaction.html#transaction-declarative" target="_blank"&gt;Declarative transaction management&lt;/a&gt;.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Mail Service&lt;/div&gt;&lt;p&gt;As described in the requirements, we'll need to send emails from time to time. For this, we'll be using Spring's &lt;a href='http://static.springsource.org/spring/docs/2.5.x/api/org/springframework/mail/javamail/JavaMailSender.html' target='_blank'&gt;JavaMailSender&lt;/a&gt;. The environment-specific settings for your SMTP server are pulled from the &lt;span style="font-family:courier new;font-size:90%;font-weight:bold;"&gt;example-env.properties&lt;/span&gt; file.&lt;/p&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;Presentation Layer&lt;/div&gt;&lt;p&gt;I chose JSF with RichFaces for this project based on the following criteria:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Modern, professional-looking skin out-of-the-box (I don't want to fool with CSS).&lt;/li&gt;
&lt;li&gt;Extensive set of easy-to-use, yet flexible UI components, such as trees, data grids, tabbed panes, etc.&lt;/li&gt;
&lt;li&gt;Easy to create reusable custom components.&lt;/li&gt;
&lt;li&gt;Large community of users and developers building real apps.&lt;/li&gt;
&lt;li&gt;Performance: loads and runs fast on all modern browsers.&lt;/li&gt;
&lt;li&gt;Minimizes hand-coded JavaScript.&lt;/li&gt;
&lt;/ul&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;RichFaces ModalPanel&lt;/div&gt;&lt;p&gt;I decided to use the RichFaces ModalPanel component &amp;lt;rich:modalPanel&amp;gt; for my dialogs because it helps the user keep their current orientation with the site. Instead of navigating to a different page, I simply open a dialog over the current view. Consequently, I didn't have any need for Spring Web Flow or JSF navigation rules in this example.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Spring-managed JSF Controller Beans&lt;/div&gt;&lt;p&gt;For this solution, there are two Spring-managed JSF controller beans of interest: &lt;b&gt;example.jsf.GuestSupport&lt;/b&gt; and &lt;b&gt;example.jsf.UserSession&lt;/b&gt;. Here is the definition from the &lt;span style="font-family:courier new;font-size:90%;font-weight:bold;"&gt;example-jsf.xml&lt;/span&gt; file:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;bean id="guest" class="example.jsf.GuestSupport" scope="request"&amp;gt;
&amp;lt;property name="userService" ref="userService"/&amp;gt;
&amp;lt;property name="facesSupport" ref="facesSupport"/&amp;gt;
&amp;lt;/bean&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The GuestSupport bean (&lt;b&gt;&lt;i&gt;request-scoped&lt;/i&gt;&lt;/b&gt;) is the controller for actions performed by guests, such as login and register. Spring creates a new instance per request (if needed).&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;bean id="userSession" class="example.jsf.UserSession" scope="session"&amp;gt;
&amp;lt;property name="userService" ref="userService"/&amp;gt;
&amp;lt;property name="facesSupport" ref="facesSupport"/&amp;gt;
&amp;lt;/bean&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The UserSession bean (&lt;b&gt;&lt;i&gt;session-scoped&lt;/i&gt;&lt;/b&gt;) is the controller for actions performed by authenticated users, such as managing preferences. There will be one instance of UserSession per HttpSession; Spring makes sure a new UserSession bean is created whenever a new HttpSession is created, which happens before the user is logged in. Thus, there are two possible states for the UserSession bean:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;User has not been authenticated ( remoteUser == null )&lt;/li&gt;
&lt;li&gt;User has logged in ( remoteUser != null )&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;Consequently, make sure not to use this bean in your JSF EL unless the user is logged in.&lt;/p&gt;&lt;div style="font-weight:bold;"&gt;Registration&lt;/div&gt;&lt;p&gt;When a user clicks on the Register link on the home page, a RichFaces &amp;lt;rich:componentControl&amp;gt; is used to open the RichFaces ModalPanel with id &lt;em&gt;registerPanel&lt;/em&gt;.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;rich:componentControl for="registerPanel" attachTo="registerLink" operation="show" event="onclick"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The form is bound to the doRegister method on our GuestSupport bean using a standard JSF commandButton. Under the covers, JSF uses the &lt;a href="http://static.springsource.org/spring/docs/2.5.x/api/org/springframework/web/jsf/el/SpringBeanFacesELResolver.html" target="_blank"&gt;SpringBeanFacesELResolver&lt;/a&gt; to get an instance of the GuestSupport bean from Spring. This is configured in &lt;span style="font-family:courier new;font-size:90%;font-weight:bold;"&gt;faces-config.xml&lt;/span&gt;:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;application&amp;gt;
  &amp;lt;el-resolver&amp;gt;org.springframework.web.jsf.el.SpringBeanFacesELResolver&amp;lt;/el-resolver&amp;gt;
&amp;lt;/application&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Registration requires the user to solve a CAPTCHA. The easiest way to add this to your site is to use reCAPTCHA. The best way to get started with reCAPTCHA is to read their developer's guide at &lt;a href="http://recaptcha.net/whyrecaptcha.html" target="_blank"&gt;http://recaptcha.net/whyrecaptcha.html&lt;/a&gt;. I had to use a custom JSF tag to embed reCAPTCHA JavaScript into the registration form because the standard JSF script tag was being written to the page header instead of inline. This is due to the following setting in the &lt;span style="font-family:courier new;font-size:90%;font-weight:bold;"&gt;web.xml&lt;/span&gt;:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;context-param&amp;gt;
  &amp;lt;param-name&amp;gt;com.sun.faces.externalizeJavaScript&amp;lt;/param-name&amp;gt;
  &amp;lt;param-value&amp;gt;true&amp;lt;/param-value&amp;gt;
&amp;lt;/context-param&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This allows browsers to cache our JSF JavaScript instead of JSF having to write it into the page on every request, which presumably will help our pages load faster. However, we need the reCAPTCHA script to be inline and not in the header. Fortunately, JSF makes it easy to create custom tags; the registration form uses one to write the script inline:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;example:script src="#{uiSupport.recaptchaScriptUrl}"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;div style="font-weight:bold;"&gt;Strong Password Validation&lt;/div&gt;&lt;p&gt;Notice that the registration form binds the password to the &lt;b&gt;guest.passwordSupport.password&lt;/b&gt;.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;h:inputSecret label="#{i18n.login_password}" id="password" required="true" value="#{guest.passwordSupport.password}" size="12" maxlength="12"&amp;gt;
  &amp;lt;rich:beanValidator/&amp;gt;
&amp;lt;/h:inputSecret&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The form binds to a helper bean rather than to the User directly because the User password is expected to be a one-way MD5 hash instead of plain text. The &lt;b&gt;example.user.PasswordSupport&lt;/b&gt; utility bean validates the plain-text password using Hibernate Validator:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;@NotEmpty
@Length(min=8,max=12)
@Pattern(regex="(?=^.{8,}$)((?=.*\\d)|(?=.*\\W+))(?![.\\n])(?=.*[A-Z])(?=.*[a-z]).*$", message="Password is not secure enough!")
private String password = null;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Upon successful registration, the guest.newUserName and guest.newUserEmail properties are updated on the guest bean. This activates the &lt;b&gt;regPending&lt;/b&gt; modal panel to notify the user that their registration must be activated by clicking on an activation link sent to their email. However, if registration fails, then the registerPanel is displayed with the appropriate error message(s).&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;a:commandButton id="registerButton" value="#{i18n.register}"
    action="#{guest.doRegister}" styleClass="rich-fileupload-button" reRender="regPending"
    oncomplete="if (document.getElementById('jsfMsgMaxSev').value != '2') { #{rich:component('registerPanel')}.hide(); #{rich:component('regPending')}.show(); }"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;b&gt;oncomplete&lt;/b&gt; attribute will close the registerPanel and open the regPending panel if no errors are present in the AJAX response. This relies on an A4J &amp;lt;a:outputPanel&amp;gt; (see index.jsp).&lt;/p&gt;&lt;div style="font-weight:bold;"&gt;Account Activation&lt;/div&gt;&lt;p&gt;The activation link includes the &lt;em&gt;act&lt;/em&gt; parameter containing a hex-encoded MD5 hash code linked to the pending registration. The &lt;b&gt;guest.getUser()&lt;/b&gt; method checks for the &lt;em&gt;act&lt;/em&gt; parameter and, if it is present, tries to activate the user. If successful, the login panel is displayed with the new user name pre-populated.&lt;/p&gt;&lt;div style="font-weight:bold;"&gt;Login&lt;/div&gt;&lt;p&gt;For authentication, we delegate to Spring Security using an ExternalContext dispatch to &lt;em&gt;/j_spring_security_check&lt;/em&gt;. This handler expects two parameters provided by our Login form: j_username and j_password. Notice that the login form is configured with prependId="false" so that JSF does not add &lt;em&gt;loginForm:&lt;/em&gt; to the parameter names. The request to &lt;em&gt;/j_spring_security_check&lt;/em&gt; is handled by the springSecurityFilterChain configured in &lt;span style="font-family:courier new;font-size:90%;font-weight:bold;"&gt;web.xml&lt;/span&gt;:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;filter&amp;gt;
  &amp;lt;filter-name&amp;gt;springSecurityFilterChain&amp;lt;/filter-name&amp;gt;
  &amp;lt;filter-class&amp;gt;org.springframework.web.filter.DelegatingFilterProxy&amp;lt;/filter-class&amp;gt;
&amp;lt;/filter&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Spring Security, configured in &lt;span style="font-family:courier new;font-size:90%;font-weight:bold;"&gt;example-security.xml&lt;/span&gt;, uses a userDetailsService-based authentication provider. In this case, we configure the &lt;a href="http://static.springframework.org/spring-security/site/reference/html/authentication-common-auth-services.html#jdbc-service" target="_blank"&gt;org.springframework.security.userdetails.jdbc.JdbcDaoImpl&lt;/a&gt; with two simple SQL queries based on the Hibernate schema.&lt;/p&gt;&lt;p&gt;Authenticate by Username and Password:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;SELECT userName, password, active FROM ex_user WHERE userName=?
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Get Roles for User:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;SELECT ij.userId, r.name FROM ex_role r INNER JOIN (SELECT u.userId, ur.roleId FROM ex_user_role ur, ex_user u WHERE ur.userId=u.userId AND u.userName=?) ij ON r.roleId=ij.roleId
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If authentication fails, Spring Security stores the exception in the session under the key &lt;a href="http://static.springsource.org/spring-security/site/apidocs/org/springframework/security/ui/AbstractProcessingFilter.html" target="_blank"&gt;AbstractProcessingFilter.SPRING_SECURITY_LAST_EXCEPTION_KEY&lt;/a&gt;. In the JSF layer, we need to check for this exception and, if it is present, display a generic JSF error message to the user. It's a good idea to show a generic message because the actual exception may contain information an attacker could use to break into the site. Unfortunately, I had to resort to invoking a static function &lt;span style="font-family:courier new;font-size:95%;"&gt;GuestSupport.setLoginPanelVisibility&lt;/span&gt; from a JSP Scriptlet in auth.jspf to check for the Spring Security exception.&lt;/p&gt;&lt;div style="font-weight:bold;"&gt;Remember Me&lt;/div&gt;&lt;p&gt;Recall that one of our requirements was to support a Remember Me feature so that returning users are automatically logged into the site. With Spring Security, this is trivial; the JSF login form simply needs to pass the &lt;span style="font-family:courier new;font-size:95%;"&gt;_spring_security_remember_me&lt;/span&gt; parameter. Under the covers, Spring Security is configured to use the simple hash-based token approach. However, you can also use a more secure solution described &lt;a href="http://static.springframework.org/spring-security/site/reference/html/remember-me.html" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;div style="font-size:125%;font-weight:bold;"&gt;JavaScript and AJAX Tricks&lt;/div&gt;&lt;p&gt;In this section, I demonstrate a few of the JavaScript and AJAX tricks used by the solution.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;Country / Region&lt;/div&gt;&lt;p&gt;One of the nice things about RichFaces is that you can attach AJAX-driven logic to any JSF component without writing any JavaScript. 
For example, suppose we want to limit the list of regions based on the selected country on the Preferences form. This is trivial using the &amp;lt;a4j:support&amp;gt; tag provided by RichFaces:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;a4j:support event="onchange" reRender="region" ajaxSingle="true"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This fires an AJAX request when the country is changed and, upon success, the region component is re-rendered with a country-specific list of regions.&lt;/p&gt;&lt;div style="font-size:110%;font-weight:bold;"&gt;User name check with AJAX&lt;/div&gt;&lt;p&gt;During registration, if the user's desired screen name is already taken by another user, we should let them know immediately. To do this, we use the &amp;lt;a4j:support&amp;gt; tag provided by RichFaces on the userName &amp;lt;h:inputText&amp;gt; field:&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;a4j:support event="onblur" reRender="userName" ajaxSingle="true"/&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When the onblur event occurs, RichFaces sends an AJAX request to update the guest.registrationScreenName property. If the desired screen name is already taken, a FacesMessage is created to be displayed by the &amp;lt;rich:message&amp;gt; tag associated with the userName field.&lt;/p&gt;&lt;div style="font-family:courier new;font-size:95%;margin-left:25px;"&gt;&lt;pre&gt;&amp;lt;rich:message for="userName"&amp;gt;
  &amp;lt;f:facet name="errorMarker"&amp;gt;&amp;lt;h:graphicImage value="img/error.gif"/&amp;gt;&amp;lt;/f:facet&amp;gt;
&amp;lt;/rich:message&amp;gt;
&lt;/pre&gt;&lt;/div&gt;&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://thelabdude.blogspot.com/feeds/6473217683954305768/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://thelabdude.blogspot.com/2009/04/user-authentication-registration-with.html#comment-form' title='30 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/6473217683954305768'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6666528949054177156/posts/default/6473217683954305768'/><link rel='alternate' type='text/html' href='http://thelabdude.blogspot.com/2009/04/user-authentication-registration-with.html' title='User authentication / registration with JSF / RichFaces, Spring, and Hibernate'/><author><name>thelabdude</name><uri>http://www.blogger.com/profile/08032248788175889303</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>30</thr:total></entry></feed>
