BooleanQuery TooManyClauses in wildcard search

2007-12-01       - By Ruchi Thakur



 Erick/John, thank you so much for the reply. I have gone through the mailing
list you redirected me to. I know I need to read more, but I have some quick
questions. Please bear with me if they appear too simple.
 Below is the code snippet of my current search. I also need the score
of each document returned by the search, as I display the search results in
order of score.
 {
   Directory fsDir = FSDirectory.getDirectory(aIndexDir, false);
   IndexSearcher is = new IndexSearcher(fsDir);
   ELSAnalyser elsAnalyser = new ELSStopAnalyser();
   Analyzer analyzer = elsAnalyser.getAnalyzer();
   QueryParser parser = new QueryParser(aIndexField, analyzer);
   Query query = parser.parse(aSearchStr);
   hits = is.search(query);
 }
 
 Now, as I understand it from the mail archives you suggested, below
is what we need to do.
 1) The second was to build a *Filter* that uses WildcardTermEnum -- not a
Query.
 Because it's a filter, the scoring aspects of each document are taken out of
the equation (I am worried about this, as I need scoring info).
 
 2) Once you have a "WildcardFilter", wrapping it in a ConstantScoreQuery would
give you a drop-in replacement for WildcardQuery that would sacrifice the TF/IDF
scoring factors for speed and guaranteed execution on any pattern in any
index regardless of size. (Does that mean it will solve my scoring issue and I
will get scoring info?)
 
 Also it suggests "SpanNearQuery on a wildcard". I am rather confused about which
approach should actually be used. Please suggest. In the meantime I
am studying more about it. Thanks a lot for your help on this.
 
 Best Regards,
 Ruchika
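
On question 2, a rough sketch may help show why a filter avoids the clause limit but gives up ranking. The code below is a simplified, stdlib-only model and not Lucene's implementation: the class `WildcardFilterSketch`, the `postings` map (term to document ids), and both method names are hypothetical. It enumerates the matching terms and ORs their documents into one bit set, so no per-term clause is ever built. Every match then receives the same constant score, which is the trade-off described for ConstantScoreQuery.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Simplified model of a prefix-wildcard filter; NOT Lucene's implementation.
public class WildcardFilterSketch {

    // Collects ids of all documents containing any term that starts with prefix.
    // Matches are OR-ed into one bit set, so the number of matching terms
    // does not matter and no clause limit applies.
    public static BitSet matchingDocs(SortedMap<String, int[]> postings, String prefix) {
        BitSet docs = new BitSet();
        // tailMap starts at the first term >= prefix in sorted order.
        for (Map.Entry<String, int[]> entry : postings.tailMap(prefix).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break; // sorted order: no later term can match
            }
            for (int docId : entry.getValue()) {
                docs.set(docId);
            }
        }
        return docs;
    }

    // Constant scoring: every matching document gets the same score,
    // so the scores carry no ranking information.
    public static float score(BitSet docs, int docId, float constant) {
        return docs.get(docId) ? constant : 0.0f;
    }

    public static void main(String[] args) {
        SortedMap<String, int[]> postings = new TreeMap<>();
        postings.put("state", new int[] {0, 2});
        postings.put("states", new int[] {2, 5});
        postings.put("stop", new int[] {1});

        BitSet docs = matchingDocs(postings, "stat");
        System.out.println(docs);                 // {0, 2, 5}
        System.out.println(score(docs, 2, 1.0f)); // 1.0
        System.out.println(score(docs, 1, 1.0f)); // 0.0
    }
}
```

In this model documents 0, 2 and 5 all score 1.0 even though document 2 matches two terms, so results ordered by score would come back in an arbitrary order among the matches.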
 
Erick Erickson <erickerickson@(protected)> wrote:
 John's answer is spot-on. There's a wealth of information in the user group
archives that you should be able to search on discussing ways of providing
the functionality. One thread titled "I just don't get wildcards at all"
is one where the folks who know generously helped me out.

Once you find out how to search for that you'll know you're in the right
place.
Here's the searchable archive.....

http://www.gossamer-threads.com/lists/engine?do=search;search_forum=forum_2;;list=lucene

Make sure you select the "java user" from the top drop-down labeled
"Search".

Best
Erick

On Nov 30, 2007 2:07 PM, John Byrne wrote:

> Hi,
>
> Your problem is that when you do a wildcard search, Lucene expands the
> wildcard term into all possible terms. So, searching for "stat*"
> produces a list of terms like "state", "states", "stating" etc. (It only
> uses terms that actually occur in your index, however). These terms are
> all added as OR clauses of a boolean query.
>
> The thing is, by default, there is a limit of 1024 clauses for a boolean
> query. If your wildcard term expands into more than this (which happens
> very easily), you get the exception you described. You can solve the
> issue by setting the maximum clause count yourself, using
>
> BooleanQuery.setMaxClauseCount(int maxClauseCount)
>
> See
>
> http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/core/index.html
> for more info.
>
> Bear in mind that putting a wildcard near the start of the term results
> in a large number of boolean clauses, which increases memory usage. This
> is the reason for the default limit. This limit will also affect fuzzy
> queries, because they are expanded in the same way.
>
> Regards,
> JB
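
The expansion John describes above can be sketched roughly as follows. This is a simplified, stdlib-only model and not Lucene's code: the class `WildcardExpansionSketch`, the `expand` method, and the `TooManyClausesException` type are hypothetical stand-ins, and only trailing-wildcard patterns such as "stat*" are handled. The idea is that each indexed term matching the pattern becomes one OR clause, and the expansion fails once the configured maximum is exceeded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Simplified model of prefix-wildcard expansion; NOT Lucene's implementation.
public class WildcardExpansionSketch {

    // Hypothetical stand-in for BooleanQuery.TooManyClauses.
    public static class TooManyClausesException extends RuntimeException {
        public TooManyClausesException(int maxClauses) {
            super("more than " + maxClauses + " clauses");
        }
    }

    // Expands "prefix*" into every indexed term that starts with prefix,
    // one OR clause per term, failing once maxClauses would be exceeded.
    public static List<String> expand(SortedSet<String> indexedTerms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // tailSet starts at the first term >= prefix in sorted order.
        for (String term : indexedTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match
            }
            if (clauses.size() == maxClauses) {
                throw new TooManyClausesException(maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(
                Arrays.asList("star", "state", "states", "stating", "stop"));

        // Only terms present in the index are produced.
        System.out.println(expand(terms, "stat", 1024)); // [state, states, stating]

        // A shorter prefix matches more terms and can exceed a small limit.
        try {
            expand(terms, "st", 4);
        } catch (TooManyClausesException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Raising the limit, as with BooleanQuery.setMaxClauseCount in Lucene, lets a broader pattern through at the cost of holding more clauses in memory.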
>
> Ruchi Thakur wrote:
> >
> > Hi there.
> > I am a new Lucene user and I have been searching the group archives but
> > couldn't solve the problem. I have just joined a project that uses Lucene.
> > We use the StandardAnalyzer for indexing our documents and our query is as
> > follows when we issue a search string of t* for example:
> > +t* +cont_type:pa
> >
> > We get the following exception when we issue some of our wildcard text
> > searches:
> > org.apache.lucene.search.BooleanQuery$TooManyClauses Exception : Max clause if 1024
> >
> > Please suggest.
> >
> > Regards,
> > Ruchi