FieldCache Implementations

Does anyone out there using Lucene implement their own version of We are proposing to make it an abstract class which violates our general rule about back-compatibility (see https

can we do partial optimization?

Hello, I am very new to Lucene. I am facing one problem. I have one very large index which is constantly getting updated (add and delete) at a regular interval, after which I am optimizing the whole ind

FSDirectory Again

This is from Lucene's CHANGES.txt: LUCENE-773: Deprecate the FSDirectory.getDirectory(*) methods that take a boolean "create" argument. Instead you should use IndexWriter's "create" argum

Applying SpellChecker to a phrase

Suppose I have an index containing the terms impostor, imposter, fraud, and fruad; then presumably, regardless of whether I spell impostor and fraud correctly, Lucene SpellChecker will offer the improp

SpellChecker performance and usage

My question is for anyone who has experience with Lucene's SpellChecker, especially around its performance characteristics/ramifications. 1. Given the fact that SpellChecker expands a query by adding

lucene-core-2.2.0.jar broken? CorruptIndexException?

> I'll see if I can get back to this over the weekend. I got a chance to copy my corpus to another G4 and try indexing with Lucene 2.2. This one seems OK! Same texts. So now I'm inclined to believ

multireader vs multisearcher

Hi, what is the difference between using 1) MultiReader reader .... // create multi reader from different indexes IndexSearcher searcher = new IndexSearcher(reader) v

BooleanQuery TooManyClauses in wildcard search

Erick/John, thank you so much for the reply. I have gone through the mailing list you have redirected me to. I know I need to read more, but some quick questions. Please bear with me if they appear

IndexReader locking index

I am using MoreLikeThis functionality in my code. This code is running on four separate servers. When I ran tests it seemed to be fine, but it looks like under heavy use the index file is always lo

fieldcache gives OOM. Does an LRU-style fieldcache exist?

First, my question: Is there an (experimental / patch-version) lucene-fieldcache available which uses some kind of eviction-strategy (LRU or whatever) so that OOMs would never happen in my case bu

get original term for synonym

Hi there. Currently I am trying to get synonyms to work. I have gotten as far as injecting them into the index as Token.type SYNONYM. Lucene then finds the original word and synonym and points to t

best way to share cookie info (user search history, etc.) between two load b

Ah! There are as many ways to do this as there are questions unanswered in your mail. What kind of load balancer are you going to install? Will you be replicating the complete Lucene index on

best way to share cookie info (user search history, etc.) between two load balan

Hi everyone, we are planning on scaling our current web server by adding a machine with a similar specification. Both machines will be running Lucene searches. What we plan to do is add a load balancer

Optimizing index takes too long

Hi, optimizing my index of 1.5 million documents takes days and days. I have a collection of 10 million documents that I am trying to index with Lucene. I've divided the collection into chunks of a

CheckIndex tool

I just used the CheckIndex tool to try to salvage a corrupt index (http // It's a great tool, thanks! I'm wondering about adding support for t

problem understanding the hits.score

Thank you for your reply. The thing is, I am trying to implement a weight for a word from indexing HTML web pages. The is like *50% + Weight(word in doc d) *20% + * 10% + ... the code i

round robin search results with same score

Hi, I am quite new to Lucene. I've read most of the documentation and can't find what I need. Basically, for any documents returned from a search, if they have the same score I need them to be return

restoring a corrupt index?

Using Solr, we have been running an indexing process for a while, and when I checked on it today it spits out an error: java.lang.RuntimeException /path/to/index/_cf

Comparing Two Indexes

Hi, I wanted to compare two indexes. Please recommend an algorithm which takes all the factors into account, such as versions of software being used by Lucene and the application, which has an effect on

Obtaining the number of segments in an index?

Hi, is there a way to get the number of segments in an index? I looked at the APIs for the reader, writer, and searcher but didn't find anything. Thanks, Lucifer

- lock improvement suggestion

I have briefly reviewed the SimpleFSLock of Lucene 2.1 and 2.2. I see that the lock release mechanism does not check the return value of delete: public void release() { lockFile.delete()
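
The concern raised in this post can be illustrated with a minimal sketch. It uses only `java.io.File` and is not Lucene's actual `SimpleFSLock` code; the class name `CheckedLockRelease`, the method signature, and the exception message are all hypothetical. The only fact relied on is that `File.delete()` returns `false` when the deletion fails.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of Lucene: shows a release step that
// checks the boolean returned by File.delete() instead of discarding it.
public class CheckedLockRelease {

    // Deletes the lock file. If the file exists but cannot be deleted,
    // the failure is reported to the caller instead of being ignored.
    static void release(File lockFile) throws IOException {
        if (lockFile.exists() && !lockFile.delete()) {
            throw new IOException("Cannot delete lock file: " + lockFile);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temp file stands in for the lock file (assumption for the demo).
        File lock = File.createTempFile("write", ".lock");
        release(lock);
        System.out.println("lock file still exists: " + lock.exists());
    }
}
```

Whether a failed delete should throw, be retried, or only be logged is a design choice this sketch does not settle; it only shows where the return value would be checked.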

Create and populate a field when indexing

Grant Ingersoll-6 wrote > > When you are indexing the file and adding the Document you will need > to parse out your filename per your regular expression and then > create the appropriate fi

TermDocs.skipTo error

I have posted before about a problem with TermDocs.skipTo() but never managed to reproduce it. I have now got it to fail using the following program; please can someone try it and see if they get the

Chinese Segmentation with Phrase Query

Hi, we are having an issue while indexing Chinese documents in Lucene. Some background first: since CJK languages don't have spaces between words, we first have to determine the words from sentence

SV: OutOfMemory-problems with SortComparatorSource / ScoreDocComparator

Hi Tobias, I had a similar problem with Lucene custom sorting about two years ago. Please take a look at these two email threads http // http /

OutOfMemory-problems with SortComparatorSource / ScoreDocComparator

Hi, we have implemented a custom sort following the pattern in Lucene in Action. Unfortunately this has led to quite serious memory problems. When analyzing those (with a profiler) it seems that ther

Office 2007

Hello, I know this has gone around a bit, but has anyone had any success with pulling text from Office 2007 files? Any recommendations? Thanks, Michael

Sorting with MultiSearcher

Hi, I have a few indexes with the same structure. I'm using MultiSearcher to search those indexes, and when I try to sort the result by field the result is sorted by field and by index (we have all re

why Term variable text can not be interned?

In the Term object there are variables "field" and "text". My question is why the variable "text" can not be intern()ed? Wouldn't it save some memory, especially in the FieldCache? -- Chris Lu

how can i store lucene results from a webpage to a oracle database

I want to retrieve Lucene search results from the web page and put them into an Oracle database through JDBC, and after some manipulation display the results again after fetching them from dat
