In a Java program, I want to embed a JPEG image (which already exists in the program as a BufferedImage) into a PDF file at the code level.
The PDF viewer (XChange-Editor) does not show the JPEG within the created PDF file, and
Acrobat Reader at least reports a problem with the PDF file (unfortunately without giving any information about the nature of the error), so there must be something wrong with the PDF file itself.
Files concerned:
- The source JPEG file is ATH_Logo_Pythagoras_241225b.jpg.
- The created PDF file is ATH_PDF-JPEG-Test_241227a.pdf. Note: it contains a yellow rectangle, just to be sure that anything is drawn in the PDF at all; the JPEG should be drawn on top of this rectangle. To keep the contents of the page human-readable, I apply a Filter only to the XObject “XO01”, which contains the JPEG part.
- Test extract from the created PDF file: ATH_PDF-JPEG-Test_241227a_jpgextract.jpg
- Re-written JPEG, used to check whether the reading/writing process of the JPEG is valid: ATH_Logo_Pythagoras_241225b.jpg_rewrite.jpg
If I extract the JPEG code from ATH_PDF-JPEG-Test_241227a.pdf, simply by copying the PDF file and stripping everything before the start of the XObject stream (line 93, zero-based position 1317) and everything from the corresponding “endstream” (position 7355) onward, I get the file ATH_PDF-JPEG-Test_241227a_jpgextract.jpg.
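The manual extraction described above could also be done in code. The following is only a minimal sketch: the class and method names (JpegSliceCheck, slice, hasJpegMarkers) are hypothetical, and it assumes that the end offset is exclusive and points at the start of “endstream”. It checks nothing more than the standard JPEG start marker (FF D8) and end marker (FF D9), tolerating a line break between the data and “endstream”.

```java
import java.util.Arrays;

public class JpegSliceCheck {

    /** Copies bytes [start, end) out of the PDF bytes; both offsets are zero-based. */
    static byte[] slice(byte[] pdfBytes, int start, int end) {
        return Arrays.copyOfRange(pdfBytes, start, end);
    }

    /**
     * Returns true if the data starts with the JPEG SOI marker (FF D8) and
     * ends with the EOI marker (FF D9), ignoring trailing CR/LF bytes.
     */
    static boolean hasJpegMarkers(byte[] data) {
        int last = data.length - 1;
        // Skip a trailing end-of-line that may precede "endstream" (assumption).
        while (last >= 0 && (data[last] == '\r' || data[last] == '\n')) {
            last--;
        }
        if (last < 3) {
            return false; // too short to hold both markers
        }
        boolean soi = (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xD8;
        boolean eoi = (data[last - 1] & 0xFF) == 0xFF && (data[last] & 0xFF) == 0xD9;
        return soi && eoi;
    }

    public static void main(String[] args) {
        // Stand-in bytes instead of the real PDF: two leading bytes, a tiny
        // fake "JPEG" (markers only), a line feed, and one trailing byte.
        byte[] fake = { 'x', 'y', (byte) 0xFF, (byte) 0xD8, 0x01, 0x02,
                        (byte) 0xFF, (byte) 0xD9, '\n', 'e' };
        byte[] part = slice(fake, 2, 9);
        System.out.println(hasJpegMarkers(part));
    }
}
```

With the real file, the bytes could be loaded with Files.readAllBytes(Paths.get("ATH_PDF-JPEG-Test_241227a.pdf")) and passed to slice with the offsets 1317 and 7355 from above. A marker check like this says nothing about the rest of the JPEG structure.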
QUESTION(s):
Since this JPEG file is displayed as expected (e.g. by IrfanView), I think that the byte sequence between positions 1317 and 7355 in the target PDF should at least represent a correct JPEG data structure.
Therefore I have the following questions:
- Is the assumption correct that the byte sequence in a PDF XObject stream is identical to the byte sequence within a “pure” JPEG file?
- I declared JPXDecode as the Filter for the XObject stream. Is this wrong? (If so, what should I use instead?)
- Are there any issues with the XObject dictionary (XObject “/XO01”, which is object no. 8, starting at position 1169)?
Thank you very much!
ADDITIONAL NOTES
- Apart from stackoverflow, my source of information is PDFSPEC.
- The question concerns the pure PDF code. It is not the goal to use external applications or libraries (other than the Java 8 standard libraries).
- The PDF is written by my own Java code, part of a larger application, which produces the PDF file from scratch.
- Within this Java code, I read the source JPEG file ATH_Logo_Pythagoras_241225b.jpg (from a local directory) with javax.imageio.ImageIO.read(...) into a BufferedImage. Later in the code this BufferedImage is written with ImageIO.write(imageBuffer, "jpg", outputStream); into a ByteArrayOutputStream outputStream, which is then written with outstream.write(outputStream.toByteArray()) to the PrintStream outstream of the target PDF file.
- For test purposes, in the same code I read the source JPEG ATH_Logo_Pythagoras_241225b.jpg and immediately rewrote it to the JPEG file ATH_Logo_Pythagoras_241225b.jpg_rewrite.jpg. Both files have the same content, which means that reading and writing should not actually be the issue. The code for this operation is:
    String srcflnm = "ATH_Logo_Pythagoras_241225b.jpg";
    BufferedImage image = ImageIO.read(new File(srcflnm));
    String outflnm = srcflnm + "_new.jpg";
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    ImageIO.write(image, "jpg", outputStream);
    PrintStream outstream = new PrintStream(outflnm);
    outstream.write(outputStream.toByteArray());
    outstream.close();
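For reference, the in-memory part of the pipeline described in the notes (BufferedImage, then ImageIO.write, then ByteArrayOutputStream) can be tried in isolation. This is a hedged sketch, not the code of the application: the class and method names (JpegRoundTrip, encodeJpeg) are hypothetical, and a small blank image stands in for the logo. It additionally checks the boolean returned by ImageIO.write, which is false when no writer accepts the image, and it only verifies that the output can be decoded again, not that it is byte-identical to any source file.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

public class JpegRoundTrip {

    /** Encodes the image as JPEG in memory; fails loudly if no writer accepts it. */
    static byte[] encodeJpeg(BufferedImage image) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // ImageIO.write returns false if no suitable writer was found.
        if (!ImageIO.write(image, "jpg", buffer)) {
            throw new IOException("No JPEG writer accepted this image type");
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in image (assumption): a blank 16x8 RGB image instead of the logo.
        BufferedImage image = new BufferedImage(16, 8, BufferedImage.TYPE_INT_RGB);
        byte[] jpeg = encodeJpeg(image);

        // Decode the bytes again to confirm they form a readable JPEG.
        BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(jpeg));
        System.out.println(jpeg.length + " bytes, "
                + decoded.getWidth() + "x" + decoded.getHeight());
    }
}
```

The byte array returned by encodeJpeg corresponds to outputStream.toByteArray() in the snippet above, so its length is the number of bytes that end up in the output.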