I am using Apache PDFBox to create a very simple PDF with one line of text that conforms to PDF/A-2b, and I want to use VeraPDF to check this PDF for conformance. VeraPDF tells me that the PDF is not compliant and reports two failed assertions:
TestAssertion [ruleId=RuleId [specification=ISO 19005-2:2011, clause=6.6.2.1, testNumber=1], status=failed, message=The Catalog dictionary of a conforming file shall contain the Metadata key whose value is a metadata stream as defined in ISO 32000-1:2008, 14.3.2., location=Location [level=CosDocument, context=root/document[0]], locationContext=null, errorMessage=null]
TestAssertion [ruleId=RuleId [specification=ISO 19005-2:2011, clause=6.2.4.3, testNumber=4], status=failed, message=DeviceGray shall only be used if a device independent DefaultGray colour space has been set when the DeviceGray colour space is used, or if a PDF/A OutputIntent is present., location=Location [level=CosDocument, context=root/document[0]/pages[0](4 0 obj PDPage)/contentStream[0](6 0 obj PDContentStream)/operators[3]/fillCS[0]], locationContext=null, errorMessage=null]
My code looks something like this:
try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
     PDDocument document = new PDDocument();
     COSStream cosStream = new COSStream()) {
    PDPage page = new PDPage();
    document.addPage(page);
    // Document information dictionary
    PDDocumentInformation documentInformation = new PDDocumentInformation();
    documentInformation.setTitle("Name");
    documentInformation.setCreator("Creator");
    documentInformation.setSubject("Subject");
    document.setDocumentInformation(documentInformation);
    // XMP metadata declaring PDF/A-2b, mirroring the document information
    try (ByteArrayOutputStream xmpOutputStream = new ByteArrayOutputStream();
         OutputStream cosXMPStream = cosStream.createOutputStream()) {
        XMPMetadata xmp = XMPMetadata.createXMPMetadata();
        PDFAIdentificationSchema pdfaSchema = xmp.createAndAddPFAIdentificationSchema();
        pdfaSchema.setPart(2);
        pdfaSchema.setConformance("B");
        DublinCoreSchema dublinCoreSchema = xmp.createAndAddDublinCoreSchema();
        dublinCoreSchema.setTitle("Name");
        dublinCoreSchema.addCreator("Creator");
        dublinCoreSchema.setDescription("Subject");
        XMPBasicSchema basicSchema = xmp.createAndAddXMPBasicSchema();
        Calendar creationDate = Calendar.getInstance();
        basicSchema.setCreateDate(creationDate);
        basicSchema.setModifyDate(creationDate);
        basicSchema.setMetadataDate(creationDate);
        basicSchema.setCreatorTool("Creator Tool");
        new XmpSerializer().serialize(xmp, xmpOutputStream, true);
        cosXMPStream.write(xmpOutputStream.toByteArray());
        document.getDocumentCatalog().setMetadata(new PDMetadata(cosStream));
    }
    PDViewerPreferences prefs = new PDViewerPreferences(page.getCOSObject());
    prefs.setDisplayDocTitle(true);
    document.getDocumentCatalog().setViewerPreferences(prefs);
    // Embedded font and the single line of text
    File fontFile = new File("C:\\Windows\\Fonts\\arial.ttf");
    PDType0Font font = PDType0Font.load(document, fontFile);
    PDPageContentStream contentStream = new PDPageContentStream(document, page);
    contentStream.beginText();
    contentStream.setFont(font, 12);
    contentStream.newLineAtOffset(100, 700);
    contentStream.showText("Hello PDF/A-2b World!");
    contentStream.endText();
    contentStream.close();
    document.save(baos);
    // Validate the saved bytes against PDF/A-2b with VeraPDF
    try (PDFAParser parser = Foundries.defaultInstance().createParser(new ByteArrayInputStream(baos.toByteArray()), PDFAFlavour.PDFA_2_B)) {
        PDFAValidator validator = Foundries.defaultInstance().createValidator(PDFAFlavour.PDFA_2_B, false);
        ValidationResult result = validator.validate(parser);
        System.out.println(result.isCompliant());
    }
}
When I inspect the generated PDF with debugger-app-2.0.31.jar, I can find the Metadata. When I compare the metadata with a PDF file from the VeraPDF regression tests (e.g. this one), the only difference that seems relevant to me is in the begin attribute of the xpacket header. It is empty in the VeraPDF test file (<?xpacket begin='') and it seems to contain the BOM start sequence in the file created by PDFBox (<?xpacket begin="" with the BOM between the quotes).
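To make the comparison reproducible, here is a minimal sketch that uses only the Java standard library to pull the begin attribute out of an XMP packet header and print its code points, so an invisible BOM (U+FEFF) becomes visible. The class and method names (XPacketBegin, beginValue, codePoints) are made up for illustration, and the two sample headers are hand-written stand-ins, not bytes copied from either file. As far as I understand the XMP specification, both an empty value and U+FEFF are allowed in begin, so this difference may not be what the validator is objecting to.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class XPacketBegin {
    // Matches the begin attribute of an xpacket header, quoted with ' or ".
    private static final Pattern BEGIN =
            Pattern.compile("<\\?xpacket\\s+begin=(['\"])(.*?)\\1");

    /** Returns the raw value of the begin attribute, or null if no xpacket header is found. */
    static String beginValue(byte[] xmpBytes) {
        String text = new String(xmpBytes, StandardCharsets.UTF_8);
        Matcher m = BEGIN.matcher(text);
        return m.find() ? m.group(2) : null;
    }

    /** Renders each character as U+XXXX so that invisible characters show up. */
    static String codePoints(String value) {
        StringBuilder sb = new StringBuilder();
        value.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Hand-written sample headers (assumptions, not taken from the real files).
        byte[] empty = "<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>"
                .getBytes(StandardCharsets.UTF_8);
        byte[] withBom = "<?xpacket begin=\"\uFEFF\" id=\"W5M0MpCehiHzreSzNTczkc9d\"?>"
                .getBytes(StandardCharsets.UTF_8);

        System.out.println("[" + codePoints(beginValue(empty)) + "]");   // prints []
        System.out.println("[" + codePoints(beginValue(withBom)) + "]"); // prints [U+FEFF]
    }
}
```

In practice the byte array would come from the serialized XMP (xmpOutputStream.toByteArray() in the code above) or from the metadata stream extracted from the test file.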
Can someone tell me whether this is an error in VeraPDF or in PDFBox? Is there a solution for this problem?
Can someone explain the second error to me and offer a solution?