Create and populate a field when indexing


2007-11-09 - By KR




Grant Ingersoll-6 wrote:
>
> When you are indexing the file and adding the Document, you will need  
> to parse out your filename per your regular expression, and then  
> create the appropriate field:
>
> Document doc = new Document();
> String cat = getCategoryFromFileName(inputFileName);
> doc.add(new Field("category", cat, ...));
> // do the rest of your adds
>
> Just locate where in the demo the Document add is taking place (I  
> forget the exact spot) and then add in the appropriate stuff from  
> above.  Obviously, you need to implement the method I stubbed called  
> getCategoryFromFileName.
>
> HTH,
> Grant
>

Thanks, Grant. That was just the hint I needed.

I found that the fields are populated in HTMLDocument.

I added:

doc.add(new Field("category", "test", Field.Store.YES, Field.Index.TOKENIZED));

and then used Luke to verify that this field had been added. It had.

Now I am trying to get a quick-and-dirty way of setting the field based on
the filename, but I'm running into problems that I don't really understand
well enough to fix quickly.

I have only very limited experience of Java programming, so I might be using
the wrong terms, but I think the problem is variable scope. I get a
compilation error:

HTMLDocument.java:86: cannot find symbol
symbol  : variable url
location: class org.apache.lucene.demo.HTMLDocument
       if (url.indexOf("-ov-") != -1) {


I thought I'd be able to use a simple mechanism based on indexOf() to check
the existence of a short sequence of characters within the filename. For
example, "-sys-". I know that this sequence, if it exists anywhere in the
full path, must be in the filename.

So I put in a series of if statements like this:

  if (url.indexOf("-sys-") != -1) {
    string category = "system";
  }

then right at the end:
doc.add(new Field("category", category, Field.Store.YES, Field.Index.TOKENIZED));
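
A note on the snippet above, as a rough sketch rather than a definitive fix: a variable declared inside the braces of an if statement only exists inside those braces, so category would not be visible at the later doc.add(...) line. Java's type is also spelled String, with a capital S. One way around both points is to declare the variable before the if statements and only assign to it inside them. The class name, the method name categoryFor, the "unknown" default and the "overview" label below are all made up for illustration; only the "-sys-" and "-ov-" markers come from the message.

```java
class CategoryScopeSketch {

    // Hypothetical helper, not part of the Lucene demo.
    static String categoryFor(String url) {
        // Declared before the if statements, so it is still in scope
        // after them (unlike a variable declared inside an if block).
        String category = "unknown"; // assumed default when no marker matches

        if (url.indexOf("-sys-") != -1) {
            category = "system";
        } else if (url.indexOf("-ov-") != -1) {
            category = "overview"; // assumed label for the "-ov-" marker
        }
        return category;
    }

    public static void main(String[] args) {
        // Made-up path, purely for demonstration.
        System.out.println(categoryFor("docs/guide-sys-01.html"));
    }
}
```

The returned value could then be passed to the existing doc.add(new Field("category", ...)) call in place of the local variable.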

Am I right in thinking that the variable url is undefined at this point in
the code? It certainly seems to be defined earlier on in the file:

 public static String uid2url(String uid) {
   String url = uid.replace('\u0000', '/');    // replace nulls with slashes
   return url.substring(0, url.lastIndexOf('/')); // remove date from end
 }

Is there some way for me to perhaps chop down to the filename here, and make
that available later in the code?
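
For what it's worth, a sketch of what seems to be going on: url in uid2url is a local variable, so it exists only inside that method and cannot be seen from anywhere else in the class. The method that builds the Document needs to be given the path itself, for example as a parameter or from whatever file object it already receives (I am assuming it has one; check your copy of HTMLDocument.java). Chopping a path down to the filename can be done with lastIndexOf, the same call uid2url uses. The class and method names below are hypothetical and the path is made up.

```java
class FileNameSketch {

    // Hypothetical helper: keeps only the part after the last '/'.
    // If there is no '/', lastIndexOf returns -1 and the whole string is kept.
    static String fileNameOf(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Hypothetical helper: true if the filename part contains the marker.
    static boolean hasMarker(String path, String marker) {
        return fileNameOf(path).indexOf(marker) != -1;
    }

    public static void main(String[] args) {
        String path = "docs/manual/intro-sys-02.html"; // made-up example path
        System.out.println(fileNameOf(path));
        System.out.println(hasMarker(path, "-sys-"));
    }
}
```

Because the path is passed in as an argument, the helpers do not depend on any variable that lives inside another method.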

K.
--
View this message in context: http://www.nabble.com/Create-and-populate-a-field-when-indexing-tf4713018.html#a13667927
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

