<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Solr 'n Stuff</title>
	<atom:link href="http://yonik.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://yonik.wordpress.com</link>
	<description></description>
	<lastBuildDate>Tue, 07 Jul 2009 01:43:09 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<cloud domain='yonik.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/061b26197487f31416b0eebb44d3b7e8?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>Solr 'n Stuff</title>
		<link>http://yonik.wordpress.com</link>
	</image>
			<item>
		<title>Ranges over Functions in Solr 1.4</title>
		<link>http://yonik.wordpress.com/2009/07/06/ranges-over-functions-in-solr-1-4/</link>
		<comments>http://yonik.wordpress.com/2009/07/06/ranges-over-functions-in-solr-1-4/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 01:37:42 +0000</pubDate>
		<dc:creator>yonik</dc:creator>
				<category><![CDATA[lucene]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[frange]]></category>
		<category><![CDATA[function query]]></category>
		<category><![CDATA[qparser]]></category>
		<category><![CDATA[query syntax]]></category>
		<category><![CDATA[range query]]></category>

		<guid isPermaLink="false">http://yonik.wordpress.com/?p=37</guid>
		<description><![CDATA[Solr 1.4 contains a new feature that allows range queries or range filters over arbitrary functions.  It&#8217;s implemented as a standard Solr QParser plugin, and thus easily available for use any place that accepts the standard Solr Query Syntax by specifying the frange query type.  Here&#8217;s an example of a filter specifying the lower and [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Solr 1.4 contains a new feature that allows range queries or range filters over arbitrary functions.  It&#8217;s implemented as a standard <a href="http://lucene.apache.org/solr/api/org/apache/solr/search/FunctionRangeQParserPlugin.html">Solr QParser plugin</a>, and thus easily available for use any place that accepts the standard <a href="http://wiki.apache.org/solr/SolrQuerySyntax">Solr Query Syntax</a> by specifying the <strong>frange </strong>query type.  Here&#8217;s an example of a filter specifying the lower and upper bounds for a function:</p>
<p><code>fq={!frange l=0 u=2.2}log(sum(user_ranking,editor_ranking))</code></p>
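<p>When a client sends this filter over HTTP, the braces, spaces and <code>!</code> need URL-encoding.  Here is a minimal sketch using only Python&#8217;s standard library.  The host, port, request path and the <code>*:*</code> main query are assumptions for illustration, and <code>frange_filter</code> and <code>select_url</code> are hypothetical helper names rather than part of any Solr client:</p>
<pre>
```python
from urllib.parse import urlencode

# Hypothetical Solr location; adjust to your own install.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def frange_filter(function, lower=None, upper=None):
    """Build the value of an fq parameter that uses the frange query type."""
    bounds = []
    if lower is not None:
        bounds.append("l=%s" % lower)
    if upper is not None:
        bounds.append("u=%s" % upper)
    return "{!frange %s}%s" % (" ".join(bounds), function)


def select_url(main_query, filter_query):
    """URL-encode q and fq so the local-params syntax survives transport."""
    params = urlencode({"q": main_query, "fq": filter_query})
    return SOLR_SELECT_URL + "?" + params


fq = frange_filter("log(sum(user_ranking,editor_ranking))", lower=0, upper=2.2)
url = select_url("*:*", fq)
```
</pre>
<p>The un-encoded <code>fq</code> value built here is the same string as in the example above; only the transport encoding differs.</p>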
<p>The other interesting use for frange is to trade off memory for speed when doing range queries on any type of single-valued field.  For example, one can use <strong>frange</strong> on a string field, provided that each document has only one value for that field and that numeric functions are avoided.</p>
<p>For example, here is a filter that only allows authors between martin and rowling, specified using a standard range query:<br />
<code>fq=author_last_name:[martin TO rowling]</code></p>
<p>And the same filter using a function range query (<strong>frange</strong>):<br />
<code>fq={!frange l=martin u=rowling}author_last_name</code></p>
<p>This can lead to significant performance improvements for range queries with many terms between the endpoints, at the cost of the memory needed to hold the un-inverted form of the field (i.e. a FieldCache entry &#8211; same as would be used for sorting).  If the field in question is already being used for sorting or other function queries, there won&#8217;t be any additional memory overhead.</p>
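<p>To make the trade-off concrete, here is a toy model of the two strategies in Python.  It is a simplification under stated assumptions, not Lucene&#8217;s data structures: the documents are made up, the inverted index is a plain dict, and the un-inverted field is a list indexed by document id.</p>
<pre>
```python
from bisect import bisect_left, bisect_right

# Toy data: doc id -> single author_last_name value (hypothetical).
DOCS = ["adams", "martin", "nabokov", "orwell", "rowling", "tolkien", "martin"]


def build_inverted(docs):
    """Return the sorted term list and a term -> doc ids mapping."""
    postings = {}
    for doc_id, value in enumerate(docs):
        postings.setdefault(value, []).append(doc_id)
    return sorted(postings), postings


def range_query(terms, postings, lower, upper):
    """Standard range query: visit every term between the endpoints and
    union its postings.  Work grows with the number of terms covered."""
    matches = set()
    for term in terms[bisect_left(terms, lower):bisect_right(terms, upper)]:
        matches.update(postings[term])
    return matches


def frange_filter(field_cache, lower, upper):
    """frange-style check: compare each document's value, read from the
    un-inverted doc -> value array, against the two bounds."""
    return {doc_id for doc_id, value in enumerate(field_cache)
            if value >= lower and upper >= value}


TERMS, POSTINGS = build_inverted(DOCS)
by_terms = range_query(TERMS, POSTINGS, "martin", "rowling")
by_docs = frange_filter(DOCS, "martin", "rowling")
```
</pre>
<p>Both functions select the same documents.  The first does work per term in the range, while the second does work per document and needs the whole doc-to-value array in memory.</p>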
<p>The following table shows the results of a test of frange queries vs standard range queries on a string field with 200,000 unique values.  For example, frange was about 14 times faster when executing a range query / range filter that covered 20% of the terms in the field.  For narrower ranges that matched less than 5% of the values, the traditional range query performed better.</p>
<table border="1">
<tbody>
<tr>
<th>Percent of terms covered</th>
<th>Fastest implementation</th>
<th>Speedup (how many times faster)</th>
</tr>
<tr>
<td>100%</td>
<td>frange</td>
<td>43.32</td>
</tr>
<tr>
<td>20%</td>
<td>frange</td>
<td>14.25</td>
</tr>
<tr>
<td>10%</td>
<td>frange</td>
<td>8.07</td>
</tr>
<tr>
<td>5%</td>
<td>frange</td>
<td>1.337</td>
</tr>
<tr>
<td>1%</td>
<td>normal range query</td>
<td>3.59</td>
</tr>
</tbody>
</table>
<p>Of course, Solr 1.4 also contains the new <a href="http://www.lucidimagination.com/blog/2009/05/13/exploring-lucene-and-solrs-trierange-capabilities/">TrieRange </a>functionality that will generally have the best time/space profile for range queries over numeric fields.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://yonik.wordpress.com/2009/07/06/ranges-over-functions-in-solr-1-4/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d826abbc3ebe028c7db08a03a159503f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">yonik</media:title>
		</media:content>
	</item>
		<item>
		<title>Filtered query performance increases for Solr 1.4</title>
		<link>http://yonik.wordpress.com/2009/05/27/filtered-query-performance-increases-for-solr-1-4/</link>
		<comments>http://yonik.wordpress.com/2009/05/27/filtered-query-performance-increases-for-solr-1-4/#comments</comments>
		<pubDate>Wed, 27 May 2009 18:12:49 +0000</pubDate>
		<dc:creator>yonik</dc:creator>
				<category><![CDATA[lucene]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[filtered query]]></category>
		<category><![CDATA[solr 1.4]]></category>
		<category><![CDATA[solr performance]]></category>

		<guid isPermaLink="false">http://yonik.wordpress.com/?p=30</guid>
		<description><![CDATA[One of the many performance improvements in the upcoming Solr 1.4 release is faster filtering.  Solr 1.4 filters are faster to compute (intersections are anywhere from 30% to 80% faster, depending on configuration), take less memory (40% smaller), and are applied more efficiently to the query during a search.
In previous Solr releases, filters [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>One of the many performance improvements in the upcoming Solr 1.4 release is faster filtering.  Solr 1.4 filters are faster to compute (intersections are anywhere from 30% to 80% faster, depending on configuration), take less memory (40% smaller), and are applied more efficiently to the query during a search.</p>
<p>In previous Solr releases, filters were applied after the main query and thus had little impact on overall query performance.  Filters are now checked in parallel with the query, so the fewer documents match the filters, the greater the speedup.</p>
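<p>One way to picture the difference is to count how many documents have to be scored.  The Python sketch below is an illustration under simplifying assumptions (sorted doc id lists, a made-up score function, hypothetical function names), not Solr&#8217;s actual implementation:</p>
<pre>
```python
from bisect import bisect_left


def post_filter(query_docs, filter_docs, score):
    """Older style: score every query match, then drop filtered-out docs."""
    allowed = set(filter_docs)
    scored = [(doc, score(doc)) for doc in query_docs]
    return [(doc, s) for doc, s in scored if doc in allowed]


def leapfrog(query_docs, filter_docs, score):
    """Advance query and filter together over sorted doc ids, scoring
    only the documents that both sides agree on."""
    hits = []
    i = j = 0
    while len(query_docs) > i and len(filter_docs) > j:
        q, f = query_docs[i], filter_docs[j]
        if q == f:
            hits.append((q, score(q)))
            i += 1
            j += 1
        elif f > q:
            i = bisect_left(query_docs, f, i + 1)   # skip query docs below f
        else:
            j = bisect_left(filter_docs, q, j + 1)  # skip filter docs below q
    return hits


def run(strategy, query_docs, filter_docs):
    """Return the hits plus how many documents had to be scored."""
    scored = []

    def score(doc):
        scored.append(doc)
        return 1.0 / (1 + doc)

    return strategy(query_docs, filter_docs, score), len(scored)
```
</pre>
<p>With a query matching every second document and a filter matching every tenth, both strategies return the same hits, but the second one only scores the documents that pass the filter.</p>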
<p>Example: Adding a filter that matched 10% of a large index resulted in a 300% performance increase for a dismax query consisting of three words on a single field with proximity boost.</p>
<p>Related issues:</p>
<p><a href="https://issues.apache.org/jira/browse/SOLR-1169">https://issues.apache.org/jira/browse/SOLR-1169</a></p>
<p><a href="https://issues.apache.org/jira/browse/SOLR-1179">https://issues.apache.org/jira/browse/SOLR-1179</a></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://yonik.wordpress.com/2009/05/27/filtered-query-performance-increases-for-solr-1-4/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d826abbc3ebe028c7db08a03a159503f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">yonik</media:title>
		</media:content>
	</item>
		<item>
		<title>Solr scalability improvements</title>
		<link>http://yonik.wordpress.com/2008/12/01/solr-scalability-improvements/</link>
		<comments>http://yonik.wordpress.com/2008/12/01/solr-scalability-improvements/#comments</comments>
		<pubDate>Tue, 02 Dec 2008 02:53:48 +0000</pubDate>
		<dc:creator>yonik</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[cache]]></category>
		<category><![CDATA[LRU]]></category>
		<category><![CDATA[NIO]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://yonik.wordpress.com/?p=20</guid>
		<description><![CDATA[With the number of CPU cores constantly increasing, there has been some major work done in Lucene/Solr to increase the scalability under multi-threaded load.
Read-only IndexReaders
One bottleneck was synchronization around the checking of deleted docs in a Lucene IndexReader.  Since another thread could delete a document at any time, the IndexReader.isDeleted() call was synchronized.  It&#8217;s a very quick call, [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>With the number of CPU cores constantly increasing, there has been some major work done in <a href="http://lucene.apache.org/solr/">Lucene/Solr</a> to increase the scalability under multi-threaded load.</p>
<h2>Read-only IndexReaders</h2>
<p>One bottleneck was synchronization around the checking of deleted docs in a Lucene IndexReader.  Since another thread could delete a document at any time, the IndexReader.isDeleted() call was <em>synchronized</em>.  It&#8217;s a very quick call, simply checking if a bit is set in a BitVector, but the problem was that it could be called millions of times in the process of satisfying a single query. The Read-only IndexReader feature allowed for the removal of this synchronization by prohibiting deletion.</p>
<h2>Use of NIO to read index files</h2>
<p>The standard method for Lucene to read index files is via Java&#8217;s RandomAccessFile.  Reading a part of the file involves two calls, a <strong>seek()</strong> to position the file pointer followed by a <strong>read()</strong> to get the data.  For multiple threads to share the same RandomAccessFile instance, this obviously involves synchronization to avoid one thread changing the file pointer before another thread gets to read at the file position it set.  If the data to be read isn&#8217;t in the operating system cache, it&#8217;s even worse news&#8230; the synchronization causes all other reads to block while the data is retrieved from disk, even if some of those reads could have been quickly satisfied.</p>
<p>The preferred solution would be to have a method on RandomAccessFile that accepted an offset to read from.  This could easily be implemented by the JVM via a <strong>pread()</strong> system call.  But since Sun has not provided this functionality, we need to use something else.  NIO&#8217;s FileChannel <em>does </em>have the type of method we are looking for:  <strong>FileChannel.read(ByteBuffer dst, long position)</strong></p>
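<p>The same contrast can be sketched outside Java.  On POSIX systems Python exposes the <strong>pread()</strong> system call as <code>os.pread</code>, which plays the role of the positional <code>FileChannel.read</code> here.  This is an analogue for illustration only, not Lucene code; the class names are hypothetical and the example will not run on Windows:</p>
<pre>
```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


class SeekThenRead:
    """Shared file pointer: seek() and read() must be guarded by a lock."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._lock = threading.Lock()

    def read_at(self, offset, length):
        with self._lock:  # every reader queues up here
            self._file.seek(offset)
            return self._file.read(length)

    def close(self):
        self._file.close()


class PositionalRead:
    """os.pread() takes the offset as an argument, so no lock is needed."""

    def __init__(self, path):
        self._fd = os.open(path, os.O_RDONLY)

    def read_at(self, offset, length):
        return os.pread(self._fd, length, offset)

    def close(self):
        os.close(self._fd)


def read_concurrently(reader, offsets, length):
    """Issue one read per offset from a small thread pool."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda off: reader.read_at(off, length), offsets))


def demo():
    data = bytes(range(256)) * 16  # 4096 bytes of known content
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    offsets = [0, 100, 1000, 4000]
    results = []
    for reader_type in (SeekThenRead, PositionalRead):
        reader = reader_type(tmp.name)
        try:
            results.append(read_concurrently(reader, offsets, 16))
        finally:
            reader.close()
    os.remove(tmp.name)
    expected = [data[off:off + 16] for off in offsets]
    return results, expected
```
</pre>
<p>Both readers return the same bytes.  The difference is that the first serializes all threads on one lock, including while a read waits on the disk, and the second does not.</p>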
<p>Solr now uses the non-synchronizing NIO method of reading index files (via Lucene&#8217;s NIOFSDirectory) by default if you are on a non-Windows platform.  Windows systems default to the older method since it turns out to be faster than the new method &#8211; the reason being a long-standing &#8220;bug&#8221; in Java that still synchronizes internally even when using FileChannel.read().</p>
<h2>Non blocking caches</h2>
<p>Solr&#8217;s standard LRU cache implementation uses a synchronized LinkedHashMap.  A single cache could be checked hundreds or thousands of times during the course of a single request that involves faceting.  A non-blocking ConcurrentLRUCache was developed as an alternative implementation, and is now the default for Solr&#8217;s filter cache.  One user indicated that this has doubled their query throughput under ideal circumstances.</p>
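<p>For readers who have not seen the blocking design, here is a rough Python analogue of a synchronized, access-ordered LinkedHashMap.  It only illustrates where the contention comes from and is not the ConcurrentLRUCache algorithm; the class name is hypothetical:</p>
<pre>
```python
import threading
from collections import OrderedDict


class SynchronizedLRUCache:
    """LRU cache in which every operation takes the same lock."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._entries = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:  # even a plain lookup has to wait for the lock
            if key not in self._entries:
                return default
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]

    def put(self, key, value):
        with self._lock:
            self._entries[key] = value
            self._entries.move_to_end(key)
            if len(self._entries) > self._capacity:
                self._entries.popitem(last=False)  # evict least recently used


cache = SynchronizedLRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" is now the most recently used entry
cache.put("c", 3)   # evicts "b"
```
</pre>
<p>Note that <code>get</code> reorders the entries, so reads mutate shared state and cannot simply skip the lock.  With thousands of lookups per faceting request, all threads end up queuing on it.</p>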
<h2>Where to find this scalability goodness?</h2>
<p><a href="http://www.apache.org/dyn/closer.cgi/lucene/solr">Solr 1.3</a> has read-only IndexReaders, but for the other scalability improvements, including the improved faceting, you&#8217;ll have to grab a <a href="http://hudson.zones.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/dist/">nightly Solr build</a>.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://yonik.wordpress.com/2008/12/01/solr-scalability-improvements/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d826abbc3ebe028c7db08a03a159503f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">yonik</media:title>
		</media:content>
	</item>
		<item>
		<title>Solr Faceted Search Performance Improvements</title>
		<link>http://yonik.wordpress.com/2008/11/25/solr-faceted-search-performance-improvements/</link>
		<comments>http://yonik.wordpress.com/2008/11/25/solr-faceted-search-performance-improvements/#comments</comments>
		<pubDate>Tue, 25 Nov 2008 05:25:54 +0000</pubDate>
		<dc:creator>yonik</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[Faceted Search]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://yonik.wordpress.com/?p=8</guid>
		<description><![CDATA[Having performance issues with Solr&#8217;s faceted search and certain types of fields?  Help has arrived in the form of a new Solr faceting algorithm!  This new faceting implementation dramatically improves the performance of faceted search, making it suitable for a much wider range of applications.
The existing multivalued field faceting algorithm (where each document may have [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Having performance issues with Solr&#8217;s faceted search and certain types of fields?  Help has arrived in the form of a new Solr faceting algorithm!  This new faceting implementation dramatically improves the performance of faceted search, making it suitable for a much wider range of applications.</p>
<p>The existing multivalued field faceting algorithm (where each document may have multiple values) steps over each term in the index for that field.  For each term, the set of documents that match that term is retrieved from the filterCache, and an intersection count is calculated with the set of documents that match the query.  This works well for fields with a limited number of terms (fewer than 1,000), but not so great for fields with many terms.</p>
<p>The new method works by un-inverting the indexed field to be faceted, allowing quick lookup of the terms in the field for any given document.  It&#8217;s actually a hybrid approach &#8211; to save memory and increase speed, terms that appear in many documents (over 5%) are not un-inverted; instead, the traditional set intersection logic is used to get their counts.</p>
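<p>The two approaches can be compared on a toy index.  The Python sketch below uses plain dicts and sets, so it shows the shape of the algorithms rather than the real memory layout.  The function names and the sample data are hypothetical, and the sample call uses a 50% threshold instead of 5% only so that a ten-document index exercises both paths:</p>
<pre>
```python
from collections import Counter, defaultdict


def facet_by_intersection(postings, matching_docs):
    """Term-at-a-time: one set intersection per term in the field."""
    counts = {}
    for term, docs in postings.items():
        n = len(docs.intersection(matching_docs))
        if n:
            counts[term] = n
    return counts


def uninvert(postings, num_docs, big_term_fraction=0.05):
    """Keep frequent terms as doc sets; un-invert the rest into a
    doc -> terms lookup."""
    big_terms = {}
    doc_terms = defaultdict(list)
    for term, docs in postings.items():
        if len(docs) > big_term_fraction * num_docs:
            big_terms[term] = docs
        else:
            for doc in docs:
                doc_terms[doc].append(term)
    return big_terms, doc_terms


def facet_hybrid(big_terms, doc_terms, matching_docs):
    """Count un-inverted terms per matching document, then fall back to
    set intersection for the few frequent terms."""
    counts = Counter()
    for doc in matching_docs:
        counts.update(doc_terms.get(doc, ()))
    for term, docs in big_terms.items():
        n = len(docs.intersection(matching_docs))
        if n:
            counts[term] = n
    return dict(counts)


# Hypothetical index of 10 documents with a multivalued "tag" field.
POSTINGS = {
    "java": {0, 1, 2, 3, 4, 5, 6, 7},
    "solr": {1, 3},
    "lucene": {2, 3, 9},
    "nio": {8},
}
MATCHING = {1, 2, 3, 8}

BIG_TERMS, DOC_TERMS = uninvert(POSTINGS, num_docs=10, big_term_fraction=0.5)
old_counts = facet_by_intersection(POSTINGS, MATCHING)
new_counts = facet_hybrid(BIG_TERMS, DOC_TERMS, MATCHING)
```
</pre>
<p>The first function does work proportional to the number of terms in the field.  The hybrid one does work proportional to the number of matching documents, plus one intersection per frequent term.</p>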
<p><strong>Results: up to 5000% increase in queries per second and up to 700% improvement in memory utilization.<br />
</strong></p>
<p>More gory details and detailed benchmark results can be found at <a href="https://issues.apache.org/jira/browse/SOLR-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel" target="_blank">http://issues.apache.org/jira/browse/SOLR-475</a></p>
<p>Try it now with a Solr <a title="nightly/test development build" href="http://people.apache.org/builds/lucene/solr/nightly/" target="_blank">nightly/test development build</a> dated 11/25/2008 or later.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://yonik.wordpress.com/2008/11/25/solr-faceted-search-performance-improvements/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d826abbc3ebe028c7db08a03a159503f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">yonik</media:title>
		</media:content>
	</item>
		<item>
		<title>lookup3ycs : a standard high performance string hash</title>
		<link>http://yonik.wordpress.com/2008/06/14/lookup3ycs-a-standard-high-performance-string-hash/</link>
		<comments>http://yonik.wordpress.com/2008/06/14/lookup3ycs-a-standard-high-performance-string-hash/#comments</comments>
		<pubDate>Sat, 14 Jun 2008 16:25:25 +0000</pubDate>
		<dc:creator>yonik</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Bob Jenkins]]></category>
		<category><![CDATA[hash function]]></category>
		<category><![CDATA[lookup3]]></category>
		<category><![CDATA[lookup3ycs]]></category>
		<category><![CDATA[string hash]]></category>

		<guid isPermaLink="false">http://yonik.wordpress.com/?p=5</guid>
		<description><![CDATA[I was surprised to discover that there isn&#8217;t a good cross-platform hash function defined for strings. MD5, SHA, FNV, etc. all define hash functions over bytes, meaning that they are under-specified for strings.
So I set out to create a standard 32 bit string hash that would be well defined for implementation in all languages, have very [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I was surprised to discover that there isn&#8217;t a good cross-platform hash function defined for strings. MD5, SHA, FNV, etc. all define hash functions over bytes, meaning that they are under-specified for strings.</p>
<p>So I set out to create a standard 32 bit string hash that would be well defined for implementation in all languages, have very high performance, and have very good hash properties such as distribution.  After evaluating all the options, I settled on using Bob Jenkins&#8217; <a href="http://burtleburtle.net/bob/c/lookup3.c">lookup3</a> as a base.  It&#8217;s a well-studied and very fast hash function, and the hashword variant can work with 32 bits at a time (perfect for hashing Unicode code points).  It&#8217;s also even faster on the latest JVMs which can translate pairs of shifts into native rotate instructions.</p>
<p>The only problem with using lookup3 hashword is that it includes a length in the initial value.  This would suck some performance out since directly hashing a UTF-8 or UTF-16 string (Java) would require a pre-scan to get the actual number of Unicode code points.  The solution was to simply remove the length factor, which is equivalent to biasing initVal by -(numCodePoints*4).  This slightly modified lookup3 I define as lookup3ycs.</p>
<h3><span style="color:#00ccff;">So the definition of the cross-platform string hash <strong>lookup3ycs is</strong></span>:</h3>
<p><strong><span style="color:#000000;">The hash value of a character sequence (a string) is defined to be the hash of its Unicode code points, according to lookup3 hashword, with the initval biased by -(length*4).</span></strong></p>
<p>So by definition<br />
<code>lookup3ycs(k,offset,length,initval) == lookup3(k,offset,length,initval-(length*4))</code></p>
<p>AND</p>
<p><code>lookup3ycs(k,offset,length,initval+(length*4)) == lookup3(k,offset,length,initval)</code></p>
<p>An obvious advantage of this relationship is that you can use lookup3 if you don&#8217;t have an implementation of lookup3ycs.</p>
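<p>The relationship can be checked with a small Python sketch.  The mixing steps below are transcribed from lookup3.c&#8217;s <code>hashword</code> and should be verified against that reference before relying on any specific hash value; no expected values are given here for that reason.  The signatures are simplified assumptions (no <code>offset</code> argument), and this is not the optimized Java version linked below:</p>
<pre>
```python
M = 2 ** 32  # all arithmetic is modulo 2**32, like uint32_t in C


def rot(x, k):
    """32-bit left-rotate, written with multiply and floor-divide."""
    return ((x * 2 ** k) | (x // 2 ** (32 - k))) % M


def mix(a, b, c):
    """Six rounds of 'a -= c; a ^= rot(c, k); c += b', with the roles of
    a, b and c rotating after each round."""
    for k in (4, 6, 8, 16, 19, 4):
        a = ((a - c) % M) ^ rot(c, k)
        c = (c + b) % M
        a, b, c = b, c, a
    return a, b, c


def final(a, b, c):
    c = ((c ^ b) - rot(b, 14)) % M
    a = ((a ^ c) - rot(c, 11)) % M
    b = ((b ^ a) - rot(a, 25)) % M
    c = ((c ^ b) - rot(b, 16)) % M
    a = ((a ^ c) - rot(c, 4)) % M
    b = ((b ^ a) - rot(a, 14)) % M
    c = ((c ^ b) - rot(b, 24)) % M
    return c


def hashword(k, initval, length_bias):
    """Hash a list of 32-bit words; length_bias is what gets added to the
    initial state on top of initval."""
    a = b = c = (0xDEADBEEF + length_bias + initval) % M
    i, remaining = 0, len(k)
    while remaining > 3:
        a = (a + k[i]) % M
        b = (b + k[i + 1]) % M
        c = (c + k[i + 2]) % M
        a, b, c = mix(a, b, c)
        i += 3
        remaining -= 3
    if remaining == 0:
        return c
    if remaining == 3:
        c = (c + k[i + 2]) % M
    if remaining >= 2:
        b = (b + k[i + 1]) % M
    a = (a + k[i]) % M
    return final(a, b, c)


def lookup3(k, initval):
    """lookup3 hashword: the word count times 4 goes into the initial state."""
    return hashword(k, initval, 4 * len(k))


def lookup3ycs(text, initval=0):
    """Same mixing over the string's code points, with no length term, so
    the code points can be hashed in a single pass."""
    return hashword([ord(ch) for ch in text], initval, 0)
```
</pre>
<p>Because the two functions differ only in the initial state, <code>lookup3ycs(s, initval)</code> equals <code>lookup3(code_points, initval - 4 * len(code_points))</code> for any string, which is the relationship stated above.</p>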
<p>Here&#8217;s my <a href="http://people.apache.org/~yonik/code/hash/">optimized version for Java</a></p>
<p>Update: I&#8217;ve also included a 64 bit version called <strong>lookup3ycs64</strong></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://yonik.wordpress.com/2008/06/14/lookup3ycs-a-standard-high-performance-string-hash/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d826abbc3ebe028c7db08a03a159503f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">yonik</media:title>
		</media:content>
	</item>
		<item>
		<title>Distributed Search for Solr</title>
		<link>http://yonik.wordpress.com/2008/02/27/distributed-search-for-solr/</link>
		<comments>http://yonik.wordpress.com/2008/02/27/distributed-search-for-solr/#comments</comments>
		<pubDate>Wed, 27 Feb 2008 19:34:55 +0000</pubDate>
		<dc:creator>yonik</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[distributed search]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://yonik.wordpress.com/?p=4</guid>
		<description><![CDATA[A new chapter in Solr scalability has been opened with the addition of distributed search!
http://wiki.apache.org/solr/DistributedSearch
Distributed Search splits an index into multiple shards, and queries across all the shards, combining the results and presenting a single merged response that looks like it came from a single server.
Solr&#8217;s current implementation uses SolrJ (the Solr Java client) to [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>A new chapter in Solr scalability has been opened with the addition of distributed search!</p>
<p><a href="http://wiki.apache.org/solr/DistributedSearch">http://wiki.apache.org/solr/DistributedSearch</a></p>
<p>Distributed Search splits an index into multiple shards, and queries across all the shards, combining the results and presenting a single merged response that looks like it came from a single server.</p>
<p>Solr&#8217;s current implementation uses SolrJ (the Solr Java client) to talk to other Solr servers via HTTP, in two main phases.  The first phase collects matching document ids and scores, as well as doing any requested faceting.  The second phase retrieves the stored fields for selected documents, does highlighting, and may include additional faceting requests to nail down exact facet counts.</p>
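<p>The two phases can be sketched with in-memory &#8220;shards&#8221; in Python.  Everything here is a simplification for illustration: there is no HTTP or SolrJ, the score is a made-up term count, document ids are assumed to be unique across shards, and faceting and highlighting are left out.</p>
<pre>
```python
import heapq


def score(doc, term):
    """Toy relevance score: how often the term occurs in the text."""
    return doc["text"].split().count(term)


def search_shard(shard, term, rows):
    """Phase 1 on one shard: return only (score, id) pairs for its best hits."""
    hits = [(score(doc, term), doc_id) for doc_id, doc in shard.items()]
    return heapq.nlargest(rows, [hit for hit in hits if hit[0] > 0])


def fetch_stored_fields(shard, doc_ids):
    """Phase 2 on one shard: return stored fields for the requested ids."""
    return {doc_id: {"title": shard[doc_id]["title"]} for doc_id in doc_ids}


def distributed_search(shards, term, rows):
    # Phase 1: gather ids and scores from every shard, then merge.
    candidates = []
    for shard_no, shard in enumerate(shards):
        for hit_score, doc_id in search_shard(shard, term, rows):
            candidates.append((hit_score, doc_id, shard_no))
    winners = heapq.nlargest(rows, candidates)

    # Phase 2: ask only the owning shards for the winners' stored fields.
    wanted = {}
    for _, doc_id, shard_no in winners:
        wanted.setdefault(shard_no, []).append(doc_id)
    stored = {}
    for shard_no, doc_ids in wanted.items():
        stored.update(fetch_stored_fields(shards[shard_no], doc_ids))

    return [{"id": doc_id, "score": hit_score, **stored[doc_id]}
            for hit_score, doc_id, _ in winners]


# Hypothetical two-shard index.
SHARDS = [
    {"a1": {"title": "Solr faceting", "text": "solr facets solr"},
     "a2": {"title": "NIO notes", "text": "java nio"}},
    {"b1": {"title": "Solr scaling", "text": "solr solr solr shards"},
     "b2": {"title": "Solr intro", "text": "solr"}},
]

results = distributed_search(SHARDS, "solr", rows=2)
```
</pre>
<p>Each shard only has to return its own top <code>rows</code> candidates in the first phase, and stored fields are fetched just for the documents that survive the merge.</p>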
</div>]]></content:encoded>
			<wfw:commentRss>http://yonik.wordpress.com/2008/02/27/distributed-search-for-solr/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d826abbc3ebe028c7db08a03a159503f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">yonik</media:title>
		</media:content>
	</item>
		<item>
		<title>Solr at Web 2.0 Expo Berlin</title>
		<link>http://yonik.wordpress.com/2007/10/26/solr-at-web-20-expo-berlin/</link>
		<comments>http://yonik.wordpress.com/2007/10/26/solr-at-web-20-expo-berlin/#comments</comments>
		<pubDate>Fri, 26 Oct 2007 20:31:58 +0000</pubDate>
		<dc:creator>yonik</dc:creator>
				<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://yonik.wordpress.com/2007/10/26/solr-at-web-20-expo-berlin/</guid>
		<description><![CDATA[I&#8217;ll be giving a Solr presentation Nov 8th in Berlin, titled &#8220;Add Powerful Full Text Search to Your Web App with Solr&#8221;.  Should be fun, just wish I had more free time while in Berlin&#8230;]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;ll be giving a Solr presentation Nov 8th in Berlin<span class="AtAGlanceMore">, titled &#8220;Add Powerful Full Text Search to Your Web App with Solr</span>&#8221;.  Should be fun, just wish I had more free time while in Berlin&#8230;</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://yonik.wordpress.com/2007/10/26/solr-at-web-20-expo-berlin/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d826abbc3ebe028c7db08a03a159503f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">yonik</media:title>
		</media:content>
	</item>
	</channel>
</rss>