Java Mailing List Archive

fop-users Digest 31 Jan 2013 19:12:04 -0000 Issue 2645

2013-01-31


Topics (messages 36219 through 36225)

Re: list-item overflows body-region
 36219 by: Pascal Sancho

Re: FOP 1.1 - Unable to copy/paste text is not working
 36220 by: Neeraj
 36221 by: Glenn Adams
 36223 by: Luis Bernardo

FOP 1.1 - How to stop DCTDecode compression for images?
 36222 by: Neeraj
 36224 by: Luis Bernardo

Re: how to merge PDFs
 36225 by: Chris Bowditch

Administrivia:

---------------------------------------------------------------------
To post to the list, e-mail: fop-users@(protected)
To unsubscribe, e-mail: fop-users-digest-unsubscribe@(protected)
For additional commands, e-mail: fop-users-digest-help@(protected)

----------------------------------------------------------------------


Attachment: fop-users_36219.eml (zipped)
Hi,

Mixing "elastic" space-* properties with keep-with-next reveals a FOP bug.

You should report this issue in Jira, attaching both the test case (FO)
and the result (PDF).

Notes:
- I get the same result with either keep-with-next or keep-with-next.within-line
- keep-with-next.within-line shouldn't be taken into account on fo:list-item,
since fo:list-item generates only block area(s)

Workaround:
- remove either the keep-with-next property or the .optimum, .minimum, and
.maximum components of the space-* properties

HTH,

2013/1/29 <markus.sticker.epos@(protected)>:
> Dear fop-users,
>
> I am puzzled by the result FOP produces when I use list-items.
> Maybe somebody can spot my mistake.
> I spent the whole day trying to find a proper workaround,
> so I would be very thankful for help.
> I have attached my FO file for testing.
>
> Kind regards
> Markus Sticker


--
pascal


Attachment: fop-users_36220.eml (zipped)
Hi Luis,

Thanks for the reply.

Yes, my editor can handle the font used.

> If you highlight the text in the editor and set the font to Arial do you see any
> glyph?

For the text copied from the PDF: no.

As for embedding, I may have added the full embedding mode later, after generating
the PDF, but the result is the same in both cases.

The issue I reported was for a non-Base14 font. You are using Arial, which is a
Base14 font, and FOP has full support for these kinds of fonts.

As you suggested, I tried the same thing with the Arial font as well and found the same
issue in a different form.

Original Arabic text - هذا تعليق الاختبار. تتم كتابة الكلمات بشكل صحيح
PDF Arabic text    - ھذا تعلیق الاختبار. تتم كتابة الكلمات بشكل صحیح

If I compare the PDF and MS Word files, they look exactly the same, but when I copy
the text into an editor (one that supports the font), the words look different
(glyphs are missing). You can check the text above.

Why am I losing text when I copy and paste?
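
Two strings that render almost identically can still differ in their underlying code points, so one way to check a copy/paste result is to compare it with the source text code point by code point. The following is a minimal sketch using only the Java standard library. The class name `CodePointDiff` and the three-letter sample strings in `main` are illustrative assumptions, not taken from the files discussed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

final class CodePointDiff {

    /** Returns one entry per position where the two strings differ, compared by code point. */
    static List<String> diff(String expected, String actual) {
        int[] a = expected.codePoints().toArray();
        int[] b = actual.codePoints().toArray();
        List<String> out = new ArrayList<>();
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            // A missing position is reported as "(none)" so length differences show up too.
            String left = i < a.length ? String.format("U+%04X", a[i]) : "(none)";
            String right = i < b.length ? String.format("U+%04X", b[i]) : "(none)";
            if (!left.equals(right)) {
                out.add("index " + i + ": " + left + " -> " + right);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative input only: HEH, THAL, ALEF versus HEH DOACHASHMEE, THAL, ALEF.
        String original = "\u0647\u0630\u0627";
        String pasted = "\u06BE\u0630\u0627";
        diff(original, pasted).forEach(System.out::println);
    }
}
```

Running it on the text typed into the source document and the text pasted from the PDF would list exactly which characters changed, which is easier to attach to a bug report than a visual comparison.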






Attachment: fop-users_36221.eml (zipped)

On Wed, Jan 30, 2013 at 6:44 AM, Neeraj <neerajiiita@gmail.com> wrote:

> Yes, my editor can handle the font used.
> If you highlight the text in the editor and set the font to Arial do you see any
> glyph? For the text copied from the PDF: no.
>
> As for embedding, I may have added the full embedding mode later, after generating
> the PDF, but the result is the same in both cases.
>
> The issue I reported was for a non-Base14 font. You are using Arial, which is a
> Base14 font, and FOP has full support for these kinds of fonts.
>
> As you suggested, I tried the same thing with the Arial font as well and found the same
> issue in a different form.
>
> Original Arabic text - هذا تعليق الاختبار. تتم كتابة الكلمات بشكل صحيح
> PDF Arabic text      - ھذا تعلیق الاختبار. تتم كتابة الكلمات بشكل صحیح
>
> If I compare the PDF and MS Word files, they look exactly the same, but when I copy
> the text into an editor (one that supports the font), the words look different
> (glyphs are missing). You can check the text above.
>
> Why am I losing text when I copy and paste?

One thing to keep in mind is that some fonts do not include entries in the CMAP table for all glyphs that can be referenced by performing the character-to-glyph transformation process. In this case, FOP synthesizes a CMAP entry, which is used in the embedded font, and this entry uses a dynamically generated Unicode value in the PUA (Private Use Area). This is necessary since PDF requires specifying *some* character code (and not a glyph index directly) when drawing text.

If you then attempt to copy this text and paste it into another editor that isn't aware of this dynamic mapping in the embedded font's CMAP, you may lose that mapping information. One possible way to fix this, which I haven't investigated in detail, is to provide a separately encoded Unicode string that contains the original, pre-transformed text, and to associate this string with the displayed post-transformed character string that may contain these dynamic PUA characters. The PDF viewer would then need to make use of the pre-transformed string when performing copy operations. However, I haven't researched this to see if PDF supports it.

Anyway, I suspect this is what is causing your problem. I've opened a bug on this at [1].

[1] https://issues.apache.org/jira/browse/FOP-2204
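
If synthesized PUA values are involved, they should be visible in the pasted text. As a rough check, the sketch below scans a string for code points in Unicode's private use ranges, using only the Java standard library. The class name `PuaScanner` and the sample string are hypothetical; nothing here calls into FOP.

```java
import java.util.ArrayList;
import java.util.List;

final class PuaScanner {

    /** True if the code point lies in one of Unicode's three private use ranges. */
    static boolean isPrivateUse(int cp) {
        return (cp >= 0xE000 && cp <= 0xF8FF)        // BMP Private Use Area
            || (cp >= 0xF0000 && cp <= 0xFFFFD)      // Supplementary Private Use Area-A
            || (cp >= 0x100000 && cp <= 0x10FFFD);   // Supplementary Private Use Area-B
    }

    /** Returns the code point indexes (not char indexes) of private use characters. */
    static List<Integer> privateUsePositions(String text) {
        int[] cps = text.codePoints().toArray();
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < cps.length; i++) {
            if (isPrivateUse(cps[i])) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical pasted text with one private use character at index 2.
        String pasted = "ab\uE001c";
        System.out.println(privateUsePositions(pasted));
    }
}
```

An empty result on text pasted from the PDF would suggest that the substitution seen on copy has a different cause than a synthesized PUA mapping.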


 

Attachment: fop-users_36223.eml (zipped)

The two lines look the same to me. Maybe you copied and pasted the same
content twice?

The only reason I suggested Arial was because I didn't have your font
and I know Arial has Arabic glyphs and it is known by all text editors.

If you use a text editor (say, OpenOffice) and export to PDF, are you
then able to copy and paste from that PDF? If so, can you send that PDF
and the one generated by FOP (with full embedding) so that we can
compare them?

On 1/30/13 1:44 PM, Neeraj wrote:
> Hi Luis,
>
> Thanks for the reply.
>
> Yes, my editor can handle the font used.
> If you highlight the text in the editor and set the font to Arial do you see any
> glyph? For the text copied from the PDF: no.
>
> As for embedding, I may have added the full embedding mode later, after generating
> the PDF, but the result is the same in both cases.
>
> The issue I reported was for a non-Base14 font. You are using Arial, which is a
> Base14 font, and FOP has full support for these kinds of fonts.
>
> As you suggested, I tried the same thing with the Arial font as well and found the same
> issue in a different form.
>
> Original Arabic text - هذا تعليق الاختبار. تتم كتابة الكلمات بشكل صحيح
> PDF Arabic text    - ھذا تعلیق الاختبار. تتم كتابة الكلمات بشكل صحیح
>
> If I compare the PDF and MS Word files, they look exactly the same, but when I copy
> the text into an editor (one that supports the font), the words look different
> (glyphs are missing). You can check the text above.
>
> Why am I losing text when I copy and paste?



Attachment: fop-users_36222.eml (zipped)
Hi,

I am using FOP 1.1 and generating a PDF from XML and XSL. I am trying to stop DCTDecode compression for images. The following snippet is from my PDF; the image object still has the DCTDecode filter, even after I set the filter value to null in the config file.

/Name /Im1
  /Type /XObject
  /Length 13 0 R
  /Filter /DCTDecode
  /Subtype /Image
  /Width 550
  /Height 432
  /BitsPerComponent 8
  /ColorSpace /DeviceRGB

I tried the following settings in my config file.

<renderer mime="application/pdf">
  <filterList>
    <value>null</value>
  </filterList>
  <filterList type="image">
    <value>null</value>
    <value>ascii-85</value>
  </filterList>

and 

<renderer mime="application/pdf">
  <filterList>
    <value>null</value>
  </filterList>

But DCTDecode is still applied, and because it is lossy compression, image quality is lost. Is there a config setting that stops DCTDecode for images?

Thanks
Neeraj


In Config file,

Attachment: fop-users_36224.eml (zipped)

Are you using JPEG? Are you specifying any content-width|height for the image? And source-resolution? When you say the image quality is lost how do you see that?

JPEG images are compressed and that is the filter you see below. Setting the filter to null means no further filter will be applied (and for instance text "streams" will not be compressed). You can check that if you set the filter to flate you will get a further filter applied to the image.
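
To answer the first question (whether the source image really is a JPEG), it is enough to look at the first bytes of the file rather than its extension. The sketch below checks for the JPEG start-of-image marker (FF D8, followed by FF, the first byte of the next marker) using only the Java standard library. The class name `JpegSniffer` is a hypothetical helper and not part of FOP or XML Graphics Commons.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

final class JpegSniffer {

    /** True if the bytes begin with the JPEG SOI marker followed by another marker byte. */
    static boolean looksLikeJpeg(byte[] head) {
        return head.length >= 3
            && (head[0] & 0xFF) == 0xFF
            && (head[1] & 0xFF) == 0xD8
            && (head[2] & 0xFF) == 0xFF;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JpegSniffer <image-file>");
            return;
        }
        byte[] head = new byte[3];
        int filled = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            int n;
            // read() may return fewer bytes than requested, so loop until the buffer is full or EOF.
            while (filled < head.length
                    && (n = in.read(head, filled, head.length - filled)) > 0) {
                filled += n;
            }
        }
        boolean jpeg = filled == head.length && looksLikeJpeg(head);
        System.out.println(args[0] + (jpeg ? " looks like a JPEG" : " does not look like a JPEG"));
    }
}
```

If the file is a JPEG, seeing /Filter /DCTDecode on the image object is consistent with the explanation above: the filter describes the compression the source image already has.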

Now, you can use a different image loader than the default for JPEG, which will result in an uncompressed image, but the quality will not improve. In the conf file, add this before the <renderers> element:

  <image-loading>
    <penalty value="INFINITE" class="org.apache.xmlgraphics.image.loader.impl.ImageLoaderRawJPEG" />
  </image-loading>

I doubt the quality will be any different but the image will be uncompressed. If you are sure you see a loss in quality you can send us your example image so that we can investigate.


On 1/30/13 12:48 PM, Neeraj wrote:
> Hi,
>
> I am using FOP 1.1 and generating a PDF from XML and XSL. I am trying to stop DCTDecode compression for images. The following snippet is from my PDF; the image object still has the DCTDecode filter, even after I set the filter value to null in the config file.
>
> /Name /Im1
>   /Type /XObject
>   /Length 13 0 R
>   /Filter /DCTDecode
>   /Subtype /Image
>   /Width 550
>   /Height 432
>   /BitsPerComponent 8
>   /ColorSpace /DeviceRGB
>
> I tried the following settings in my config file.
>
> <renderer mime="application/pdf">
>   <filterList>
>     <value>null</value>
>   </filterList>
>   <filterList type="image">
>     <value>null</value>
>     <value>ascii-85</value>
>   </filterList>
>
> and
>
> <renderer mime="application/pdf">
>   <filterList>
>     <value>null</value>
>   </filterList>
>
> But DCTDecode is still applied, and because it is lossy compression, image quality is lost. Is there a config setting that stops DCTDecode for images?
>
> Thanks
> Neeraj
>
>
> In Config file,


Attachment: fop-users_36225.eml (zipped)
Hi,

A better approach to merging multiple PDF files is to use the
Intermediate Format (IF). Instead of transforming each separate file
FO->PDF directly, transform FO->IF, and once you have all the
separate IF files you can merge them into one large PDF file. If you try
to do that with PDFBox or similar, you will be constrained by memory, as
those tools have to load the entire PDF data into an object model inside
the process. We can generate tens of thousands of pages using this
technique, and you can resequence the page numbers with a few minor
tweaks to the XML as you load the files in, rather than having to
manipulate PDF text to achieve this.

Take a look at the Java code example
\examples\embedding\java\embedding\intermediate\ExampleConcat.java to
see how to achieve what I'm describing.
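
Whichever merge route is used, resequencing page numbers comes down to a running offset: each part starts where the previous one ended. The sketch below shows only that arithmetic, using the Java standard library. The class name `PageOffsets` and the page counts are made up for illustration, and the code does not touch the FOP API; ExampleConcat.java remains the reference for the actual concatenation.

```java
import java.util.Arrays;

final class PageOffsets {

    /**
     * Given the page count of each part, returns the page number each part
     * should start at when the parts are concatenated in order.
     */
    static int[] initialPageNumbers(int[] pageCounts, int firstPage) {
        int[] starts = new int[pageCounts.length];
        int next = firstPage;
        for (int i = 0; i < pageCounts.length; i++) {
            starts[i] = next;
            next += pageCounts[i];
        }
        return starts;
    }

    public static void main(String[] args) {
        // Hypothetical parts of 12, 7 and 30 pages, numbered from page 1.
        int[] starts = initialPageNumbers(new int[] {12, 7, 30}, 1);
        System.out.println(Arrays.toString(starts)); // prints [1, 13, 20]
    }
}
```

The resulting values are what one would supply as the starting page number of each part, for example through the initial-page-number property mentioned earlier in this thread.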

Thanks,

Chris

On 29/01/2013 15:30, Campbell, Lance wrote:
>
> Thanks for your suggestion on submitting a bug. I put together the
> information to recreate the problem as well as the PDF that was in
> error. I submitted it just now.
>
> I think I will skip the merge and wait to hear back from the bug
> submission.
>
> Thanks,
>
> Lance Campbell
>
> Software Architect
>
> Web Services at Public Affairs
>
> 217-333-0382
>
> *From:*Luis Bernardo [mailto:lmpmbernardo@(protected)]
> *Sent:* Monday, January 28, 2013 5:00 PM
> *To:* fop-users@(protected)
> *Subject:* Re: how to merge PDFs
>
>
> Please provide a test case if you think you found a bug. Most likely
> there is some oddity in your FO input that causes the problem.
>
> If you want to merge documents and have control over how they are
> generated you can use initial-page-number to set the page number for
> the start of a page sequence.
>
> To merge, besides PDFBox you can use pdftk (which itself uses iText).
>
> On 1/28/13 8:42 PM, Campbell, Lance wrote:
>
>   I looked over the PDFBox option. Do you believe there is a way to
>   renumber pages in a merged PDF document?
>
>   Thanks,
>
>   Lance Campbell
>
>   Software Architect
>
>   Web Services at Public Affairs
>
>   217-333-0382
>
>   *From:*Mehdi Houshmand [mailto:med1985@(protected)]
>   *Sent:* Monday, January 28, 2013 2:02 PM
>   *To:* fop-users@(protected)
>   *Subject:* Re: how to merge PDFs
>
>   Look into PDFBox; it's another Apache project that can do just
>   that. However, you definitely shouldn't be seeing XSL-FO in the
>   output PDF.
>
>   Can you post a bug and attach a test sample? Depending on what
>   you're doing, FOP should be able to handle big documents so the
>   merging shouldn't be necessary.
>
>   On Jan 28, 2013 5:08 PM, "Campbell, Lance" <lance@(protected)> wrote:
>
>   FOP 1.1
>
>   We have been using FOP for quite a few years now. We are really
>   happy with it. We use it to generate PDF reports. We seem to be
>   running into an issue where really large reports start to display
>   the XSL-FO code in the output.
>
>   Example:
>
>   d="submission-4"> <fo:static-content flow-name="xsl-region-after">
>   <fo:table> <fs week? If so,
>
>   provide details on the conference/workshop/journal, authors, paper
>
>   I was thinking that maybe I could process each page into its own
>   PDF and then splice the single page PDFs together into a master
>   PDF document.
>
>   Has anyone ever done this?
>
>   This would prevent the issue from occurring.
>
>   Thanks,
>
>   Lance Campbell
>
>   Software Architect
>
>   Web Services at Public Affairs
>
>   217-333-0382
>
