I’m using IronPdf to fill a PDF that was prepared with an embedded form using Adobe Acrobat. The form is relatively complicated with a couple of pages, and a few dozen fields per page.
I’m getting a reference to the Form object, then looping over my input data, and for each field, I’m using FindFormField to get a reference to the IFormField associated with the relevant property name.
This works but it’s really slow. It’s taking more than a minute to add data to the form. Note that I’m not necessarily iterating over the fields in order as they appear on the embedded form: I’m following the order of my input data.
Here’s a code snippet that should reproduce the problem with any relatively dense embedded form:
var doc = PdfDocument.FromFile("ComplicatedForm.pdf");
Dictionary<string, string> inputValues = GetInputData();
IFormField formField;
foreach (KeyValuePair<string, string> prop in inputValues)
{
    formField = doc.Form.FindFormField(prop.Key);
    formField.Value = prop.Value;
}
Am I using the wrong approach here with FormFieldCollection.FindFormField, or could there be something specific about how I’m creating the embedded form that’s causing this slowness? The IronPdf libraries are pretty large, but I’m focusing only on a small part related to interacting with these embedded forms. Maybe there’s something else in these libraries that I’m missing?
Here are a couple of links to the documentation: PdfDocument and FormFieldCollection.
I solved this issue by pre-loading all of the IFormFields and caching the IFormField elements in a Dictionary, then loading them from the dictionary when I need to set the value:
var doc = PdfDocument.FromFile("ComplicatedForm.pdf");
// Build the name -> field cache once, in a single pass over the form.
Dictionary<string, IFormField> formFields = new();
foreach (var field in doc.Form)
    formFields.Add(field.Name, field);
// ... later in the code:
Dictionary<string, string> inputValues = GetInputData();
IFormField formField;
foreach (KeyValuePair<string, string> prop in inputValues)
{
    formField = formFields[prop.Key];
    formField.Value = prop.Value;
}
Using this approach, filling my relatively complicated forms with data dropped from over a minute to less than a second.
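One caveat with the cache above: the indexer formFields[prop.Key] throws a KeyNotFoundException when the input contains a name that is not on the form, and Dictionary.Add throws if two fields share a name. Here is a minimal sketch of a more forgiving variant. It relies only on the IronPdf members already used in this thread (doc.Form, Name, Value) and assumes doc.Form can be enumerated with LINQ; the class and method names are made up for illustration, and I have not benchmarked it.

```csharp
using System.Collections.Generic;
using System.Linq;
using IronPdf;

public static class FormFillSketch
{
    // Hypothetical helper (not part of IronPdf): caches fields by name once,
    // then fills them in input order. Returns input keys with no matching field.
    public static List<string> FillFromCache(
        PdfDocument doc, IReadOnlyDictionary<string, string> inputValues)
    {
        // Assumption: doc.Form supports LINQ enumeration.
        // GroupBy keeps the first field when two fields share a name,
        // where Dictionary.Add would throw.
        var cache = doc.Form
            .Where(field => field.Name != null)
            .GroupBy(field => field.Name)
            .ToDictionary(group => group.Key, group => group.First());

        var missing = new List<string>();
        foreach (var pair in inputValues)
        {
            if (cache.TryGetValue(pair.Key, out var field))
                field.Value = pair.Value;   // single hash lookup, no exception
            else
                missing.Add(pair.Key);      // name not present on this form
        }
        return missing;
    }
}
```

The returned list makes it easy to log input keys that did not match any field instead of failing halfway through the form.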
As I understand it, your solution is appropriate and solves the problem, but you are probably looking for something faster still. That is hard to answer in general, because the result depends on the test conditions.
In any case, in my tests the code below ran more than twice as fast as your second method (the one in your own answer):
private static void ChangeFormFields_Method3(PdfDocument doc, Dictionary<string, string> inputValues)
{
    foreach (IFormField formField in doc.Form)
    {
        var fieldName = formField.Name;
        if (inputValues.ContainsKey(fieldName))
            formField.Value = inputValues[fieldName];
    }
}
My test conditions:
“IronPDF” version: 2024.8.3
OS: Windows 11
Development tool: Microsoft Visual Studio 2022
Project type: ASP .NET Core 6 MVC
Test PDF: 8-page PDF with two columns per page and 15 fields per column (240 fields in total)
CPU: Intel® Core™ i7-13620H Processor 2.4 GHz
RAM: 16 GB DDR5
Cause:
Apparently “IronPDF” is not well optimized for navigating between PDF pages to find a “TextFormField”, so each “FindFormField” call is slow. The description of the “FindFormField” method in the package also says: “Alternatively, you may use LINQ in order to locate a form.”
Solution:
I changed the search method. Instead of looking up each form field in the PDF by name, in whatever order the input data arrives, I iterate over the form fields sequentially and look up each field's name in the input dictionary, which is faster.
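The same sequential idea can also be written with TryGetValue, which does one dictionary lookup per field instead of ContainsKey followed by the indexer. This is only a sketch and is not one of the methods I timed; the class and method names are hypothetical, and it uses only the IronPdf members already shown in this answer (doc.Form, Name, Value).

```csharp
using System.Collections.Generic;
using IronPdf;

public static class SequentialFillSketch
{
    // Hypothetical variant of Method3; not included in the measured results.
    // Returns how many form fields received a value.
    public static int FillSequentially(
        PdfDocument doc, IReadOnlyDictionary<string, string> inputValues)
    {
        int filled = 0;
        foreach (var formField in doc.Form)
        {
            var name = formField.Name;
            // Skip unnamed fields: a null key would make TryGetValue throw.
            if (name != null && inputValues.TryGetValue(name, out var value))
            {
                formField.Value = value;
                filled++;
            }
        }
        return filled;
    }
}
```

Comparing the returned count with inputValues.Count is a cheap way to notice input keys that did not match any field. I would expect any timing difference from Method3 to be small, since the dictionary lookup is cheap next to the work IronPDF does, but that is a guess and not a measurement.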
Note 1:
We assume that some of the fields returned by the “GetDummyInputData()” function in the following code are not present in the PDF file, because the input data is intended for several different PDFs (for example, the following code defines 60 additional fields on top of the 240 input fields in the PDF file). If there are no such additional fields, the speed of each method should be re-measured for the intended use.
To compare the execution speed, I used “Stopwatch” as in the following code:
Test Code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using IronPdf;
// IFormField: add the using for its namespace if your IronPdf version needs one.

public class FillFormDataForPdfDocument
{
    public const string WORKING_FOLDER = @"C:[Temp]";
    public const string SOURCE_PDF = "ComplicatedForm.pdf";
    public const string SOURCE_PDF_TITLE = "Sample Form PDF";
    public const string TARGET_PDF_FILE_1 = "ComplicatedForm_Changed_1.pdf";
    public const string TARGET_PDF_FILE_2 = "ComplicatedForm_Changed_2.pdf";
    public const string TARGET_PDF_FILE_3 = "ComplicatedForm_Changed_3.pdf";
    public const string TARGET_PDF_FILE_4 = "ComplicatedForm_Changed_4.pdf";
    public const int WIDTH_FOR_FORMFIELD = 200;
    public const int HEIGHT_FOR_FORMFIELD = 35;
    public const int NUMBER_OF_PAGES_IN_SOURCE_PDF = 8;
    public const int NUMBER_OF_TEXTFORMFIELDS_PER_COLUMN_IN_PAGE = 15;
    public const int NUMBER_OF_INVALID_ADDITIONAL_TEXTFORMFIELDS = 60;

    public static void CreateTestPDF()
    {
        var Source_PDF_Path = Path.Combine(WORKING_FOLDER, SOURCE_PDF);
        SamplePDF.CreateTestMultipagePDF_InTwoColumnsPerPage(Source_PDF_Path, SOURCE_PDF_TITLE,
            NUMBER_OF_PAGES_IN_SOURCE_PDF, WIDTH_FOR_FORMFIELD, HEIGHT_FOR_FORMFIELD, NUMBER_OF_TEXTFORMFIELDS_PER_COLUMN_IN_PAGE);
    }

    public static void FillData()
    {
        Dictionary<string, string> inputValues = GetDummyInputData();
        var Source_PDF_Path = Path.Combine(WORKING_FOLDER, SOURCE_PDF);
        var Target_PDF_1_Path = Path.Combine(WORKING_FOLDER, TARGET_PDF_FILE_1);
        var Target_PDF_2_Path = Path.Combine(WORKING_FOLDER, TARGET_PDF_FILE_2);
        var Target_PDF_3_Path = Path.Combine(WORKING_FOLDER, TARGET_PDF_FILE_3);
        var Target_PDF_4_Path = Path.Combine(WORKING_FOLDER, TARGET_PDF_FILE_4);
        var ElapsedMilliseconds1 = ChangeFormFieldsAndGetElapsedMilliseconds(
            Source_PDF_Path, Target_PDF_1_Path,
            inputValues, ChangeFormFields_Method1);
        var ElapsedMilliseconds2 = ChangeFormFieldsAndGetElapsedMilliseconds(
            Source_PDF_Path, Target_PDF_2_Path,
            inputValues, ChangeFormFields_Method2);
        var ElapsedMilliseconds3 = ChangeFormFieldsAndGetElapsedMilliseconds(
            Source_PDF_Path, Target_PDF_3_Path,
            inputValues, ChangeFormFields_Method3);
        var ElapsedMilliseconds4 = ChangeFormFieldsAndGetElapsedMilliseconds(
            Source_PDF_Path, Target_PDF_4_Path,
            inputValues, ChangeFormFields_Method4);
    }

    public delegate void Delegate_ChangeFormFieldsMethod(PdfDocument doc, Dictionary<string, string> inputValues);

    // Times only the field-filling step; loading and saving the PDF are excluded.
    private static long ChangeFormFieldsAndGetElapsedMilliseconds(string SourcePDF, string TargetPDF, Dictionary<string, string> inputValues,
        Delegate_ChangeFormFieldsMethod ChangeMethod)
    {
        var doc = PdfDocument.FromFile(SourcePDF);
        Stopwatch sw = Stopwatch.StartNew();
        ChangeMethod(doc, inputValues);
        sw.Stop();
        var ElapsedMilliseconds = sw.ElapsedMilliseconds;
        doc.SaveAs(TargetPDF);
        return ElapsedMilliseconds;
    }

    private static void ChangeFormFields_Method1(PdfDocument doc, Dictionary<string, string> inputValues)
    {
        // The method suggested in the question
        IFormField formField;
        foreach (KeyValuePair<string, string> prop in inputValues)
        {
            try
            {
                var fieldName = prop.Key;
                formField = doc.Form.FindFormField(fieldName);
                formField.Value = prop.Value;
            }
            catch (Exception)
            {
            }
        }
    }

    private static void ChangeFormFields_Method2(PdfDocument doc, Dictionary<string, string> inputValues)
    {
        // The method suggested in the answer written by the question owner
        Dictionary<string, IFormField> formFields = new();
        foreach (var field in doc.Form)
            formFields.Add(field.Name, field);
        IFormField formField;
        foreach (KeyValuePair<string, string> prop in inputValues)
        {
            try
            {
                formField = formFields[prop.Key];
                formField.Value = prop.Value;
            }
            catch (Exception)
            {
            }
        }
    }

    private static void ChangeFormFields_Method3(PdfDocument doc, Dictionary<string, string> inputValues)
    {
        // Iterate over the form fields in document order; look up each name in the input
        foreach (IFormField formField in doc.Form)
        {
            var fieldName = formField.Name;
            if (inputValues.ContainsKey(fieldName))
                formField.Value = inputValues[fieldName];
        }
    }

    private static void ChangeFormFields_Method4(PdfDocument doc, Dictionary<string, string> inputValues)
    {
        // Same as Method3, but relies on the exception instead of ContainsKey
        foreach (IFormField formField in doc.Form)
        {
            try
            {
                formField.Value = inputValues[formField.Name];
            }
            catch (Exception)
            {
            }
        }
    }

    public static Dictionary<string, string> GetDummyInputData()
    {
        const int NUMBER_OF_COLUMNS_PER_PAGE = 2;
        return GetDummyInputData(NUMBER_OF_PAGES_IN_SOURCE_PDF, NUMBER_OF_COLUMNS_PER_PAGE,
            NUMBER_OF_TEXTFORMFIELDS_PER_COLUMN_IN_PAGE, NUMBER_OF_INVALID_ADDITIONAL_TEXTFORMFIELDS);
    }

    public static Dictionary<string, string> GetDummyInputData(int NumberOfPages, int NumberOfColumns,
        int NumberOfTextFormFieldsPerColumn, int NumberOfInvalidAdditionalTextFormFields, string BaseText = "New_Value")
    {
        int NumberOfDummyFields = NumberOfPages * NumberOfColumns * NumberOfTextFormFieldsPerColumn + NumberOfInvalidAdditionalTextFormFields;
        var data = new Dictionary<string, string>();
        for (int TextFieldNo = 1; TextFieldNo <= NumberOfDummyFields; TextFieldNo++)
            data.Add(GetDummyTextFieldName(TextFieldNo), GetDummyTextFieldValue(TextFieldNo, BaseText));
        return data;
    }

    private static string GetDummyTextFieldName(int TextFieldNo)
    {
        return $"Text_{TextFieldNo:000}";
    }

    private static string GetDummyTextFieldValue(int TextFieldNo, string BaseText = "Value")
    {
        return $"{BaseText}_{TextFieldNo:000}";
    }
}
These are the results of three tests (Speed increase ratio):
Test 1:
Method1: 28116 milliseconds
Method2: 2025 milliseconds => Speed: 13.88 x Method1
Method3: 827 milliseconds => Speed: 2.45 x Method2
Method4: 782 milliseconds => Speed: 2.59 x Method2
Test 2:
Method1: 25624 milliseconds
Method2: 1968 milliseconds => Speed: 13.02 x Method1
Method3: 905 milliseconds => Speed: 2.17 x Method2
Method4: 860 milliseconds => Speed: 2.29 x Method2
Test 3:
Method1: 24485 milliseconds
Method2: 1757 milliseconds => Speed: 13.94 x Method1
Method3: 804 milliseconds => Speed: 2.19 x Method2
Method4: 807 milliseconds => Speed: 2.18 x Method2
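Each ratio above is the slower time divided by the faster time. A small sketch of that arithmetic, using the Test 1 timings; the class and method names are made up for illustration:

```csharp
using System;

public static class SpeedRatios
{
    // Ratio of a slower timing to a faster one, rounded to two decimals.
    public static double Ratio(long slowerMs, long fasterMs) =>
        Math.Round((double)slowerMs / fasterMs, 2);

    public static void Main()
    {
        // Timings in milliseconds, copied from Test 1.
        long method1 = 28116, method2 = 2025, method3 = 827, method4 = 782;

        Console.WriteLine($"Method2 vs Method1: {Ratio(method1, method2)}");
        Console.WriteLine($"Method3 vs Method2: {Ratio(method2, method3)}");
        Console.WriteLine($"Method4 vs Method2: {Ratio(method2, method4)}");
    }
}
```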
Note 2:
I created a simple PDF and tested with it. If the PDF file is quite complex, as described in the question, the difference between method 2 and method 1 may be much greater, perhaps reaching 60 times.
Note 3:
It is usually not easy to change the basic tools used in a project, but if you are not tied to this tool for processing the PDF, you can test the execution speed of other tools such as “QuestPDF” or reporting tools. They may be faster.
Note 4:
The “SamplePDF” used inside the “Test Code” mentioned above is a simple class for creating a test PDF that I wrote, but I didn’t attach the code to this post to avoid a long answer.
If I come up with a faster method in the coming days, I’ll add more details to my answer.