PDFsharp and MigraDoc Foundation for .NET 6 and .NET Framework

Home Page: https://docs.pdfsharp.net/

License: Other

pdfsharp's Introduction

PDFsharp & MigraDoc 6

Version 6.1.0
Published 2024-05-28

This is a version of the PDFsharp project, the main project of PDFsharp & MigraDoc 6 with updates for C# 12 and .NET 6.

PDFsharp: Copyright (c) 2005-2024 empira Software GmbH, Troisdorf (Cologne Area), Germany
MigraDoc: Copyright (c) 2001-2024 empira Software GmbH, Troisdorf (Cologne Area), Germany
Published Open Source under the MIT License

For more information see docs.pdfsharp.net

Read this FIRST

Project documentation can be found on our DOCS site: https://docs.pdfsharp.net.

Note: PowerShell 7 is required to execute the PowerShell scripts that come with PDFsharp.

Download assets first

Assets like bitmaps, fonts, or PDF files are no longer part of the repository. You must download them before compiling the solution for the first time. Use download-assets.ps1 in the dev folder to create the assets folder, which is required for some unit tests and needed by some projects.

Execute

.\dev\download-assets.ps1

Build the solution

dotnet build should build the solution without any warnings or errors.

You need the latest .NET SDK version installed.
Please note that you need a git repository with at least one commit in order to build the PDFsharp solution.
Without a git repo with at least one commit, you will get an error message from GitVersion.MsBuild while building the solution. You can set a tag to define a valid version, e.g. git tag v6.1.0, to make it build with a specific version number. Without a tag, version 0.1.0 will be used.

Central package management

The solution uses central package management. Version numbers for all referenced packages are stored in file Directory.Packages.props in the src folder. When adding new packages, add the required version here.

Authors

PDFsharp and MigraDoc were mainly written by the following software developers, with the support of many community developers who found issues and fixed bugs.

Original PDFsharp developers

Stefan Lange
Niklas Schneider
David Stephensen

Original MigraDoc developers

Klaus Potzesny
Niklas Schneider
Stefan Lange

Current PDFsharp and MigraDoc developers

Stefan Lange
Thomas Hövel
Martin Ossendorf
Andreas Seifert

Libraries used by PDFsharp

The Core build of PDFsharp uses BigGustave to read PNG images. BigGustave was released into the public domain and does not restrict the MIT license used by PDFsharp.
Link to project repository: https://github.com/EliotJones/BigGustave

pdfsharp's Issues

Font Load/Display Issues

Hello,

I just started using PDFsharp for .NET 6. It seems quite stable so far. I'm having two issues. The first is probably something I'm not doing or am doing wrong. The second seems like it might be a bug, though it is possibly font-related.

Getting a warning when building with the "Arial" font called, possibly all fonts.
info: PdfSharp.Snippets.Font.FailsafeFontResolver[0] 'Arial' bold was substituted by a SegoeWP font.

if (Capabilities.Build.IsCoreBuild)
{
	GlobalFontSettings.FontResolver = new FailsafeFontResolver();
}
var heading = new XFont("Arial", 10, XFontStyleEx.Bold);

Vertical spacing of fonts doesn't seem to be correct. Spacing slightly favors the bottom.

int ypos = 0;
var xf16 = new XFont("Arial", 16, XFontStyleEx.Bold);
Testing(xf16, XBrushes.Teal, ref ypos);
var xf8 = new XFont("Arial", 8, XFontStyleEx.Regular);
Testing(xf8, XBrushes.PaleGreen, ref ypos);
var xf7 = new XFont("Arial", 7, XFontStyleEx.Regular);
Testing(xf7, XBrushes.LightBlue, ref ypos);
_linePosV = 50;

private static void Testing(XFont font, XBrush color, ref int yPos)
{
    _gfx.DrawRectangle(color, 0, yPos, 180, font.Height);
    _gfx.DrawString($"This Text is Size {font.Size} Font", font, XBrushes.Black, new XRect(0, yPos, 180, font.Height), XStringFormats.Center);
    yPos += font.Height;
}

Let me know if you need any more info. Thank you for the help and the software.

PDF generation succeeds, but fails silently on specific page

Expected Behavior

PdfReader.Open to produce PdfDocument object with expected number of pages

Actual Behavior

PdfDocument produced, but only 3/34 pages generated with the 3rd page being an error
Removing the third page allows the pdf to generate all other pages

Steps to Reproduce the Behavior

Run the PDFsharp IssueSubmission console app
Navigate to {temp folder}\pdf-test
Open test-output.pdf

Issue.zip

MAUI Android: "Specified method is not supported." exception in PdfReader.Open()

Hi and thanks for the project.

Reporting an Issue Here

When I try to load a PDF on Android I get an exception (Specified method is not supported.) on PdfReader.Open().
I tried different PDF files in the Android Emulator and on a real device.
Opening, manipulating and saving works flawlessly on a MAUI Windows build.

The exception gets thrown in the following code in Android.Runtime.InputStreamInvoker, because BaseFileChannel is null:

public override long Length {
	get {
		if (BaseFileChannel != null)
			return BaseFileChannel.Size ();
		else
			throw new NotSupportedException ();
	}
}

This seems to be called from PdfSharp.Pdf.IO.Lexer..ctor.

Is this a bug in PDFsharp or MAUI?

Expected Behavior

PDF gets loaded.

Actual Behavior

NotSupportedException

Steps to Reproduce the Behavior

I created a completely new MAUI project (with net6 and net7) and added PDFsharp (6.0.0-preview-2), a PDF file as a raw resource / MauiAsset and the following code right at the end of the example OnCounterClicked() function in MainPage.xaml.cs:

using Stream stream = await FileSystem.OpenAppPackageFileAsync("test.pdf");
PdfDocument doc;
try {
	doc = PdfReader.Open(stream);
} catch(Exception ex) {
	Debug.WriteLine($"ERROR: {ex.Message}");
	return;
}
Debug.Assert(doc != null && doc.Pages.Count > 0);

VS 2022 version 17.6.2
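
A possible workaround, sketched under the assumption that the Android asset stream reports CanSeek as false whenever Length is unsupported: copy the stream into a MemoryStream before handing it to PdfReader.Open. The helper name EnsureSeekable is hypothetical and uses only System.IO; it is not part of PDFsharp.

```csharp
using System.IO;

public static class SeekableStreams
{
    // Returns a stream that supports Length and Seek.
    // A seekable source is returned unchanged; otherwise its remaining
    // content is copied into a MemoryStream positioned at the start.
    // The caller remains responsible for disposing the original stream.
    public static Stream EnsureSeekable(Stream source)
    {
        if (source.CanSeek)
            return source;

        var buffer = new MemoryStream();
        source.CopyTo(buffer);
        buffer.Position = 0;
        return buffer;
    }
}
```

In the reproduction above, the call would become PdfReader.Open(SeekableStreams.EnsureSeekable(stream)). This costs one in-memory copy of the file, which is usually acceptable for packaged assets.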

Incorrectly combined path in NewFontResolver

Reporting an Issue Here

NewFontResolver uses Path.Combine to construct a path pointing to fonts installed in user profile folder.

PDFsharp/src/foundation/src/shared/src/PdfSharp.Snippets/Font/fontresolving/NewFontResolver.cs

Line 191 in b748631

 Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.UserProfile), "/.fonts"), 

However, the paths that should be combined with the user profile folder are specified as absolute (because they start with /), which produces an incorrectly combined path. The final path always consists of the second path only, without the user profile folder, as the documentation for Path.Combine states:

If the one of the subsequent paths is an absolute path, then the combine operation resets starting with that absolute path, discarding all previous combined paths.

Expected Behavior

The combined path consists of user profile folder concatenated with .fonts folder.

Actual Behavior

The path consists only of /.fonts.

Steps to Reproduce the Behavior

Precondition: On a Linux machine, install a font to the $HOME/.fonts folder. The font should not be available anywhere else on the system.
Register the font family to NewFontResolver.Families.
Try to use the font when rendering the text.
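
The behavior can be checked with System.IO alone. The following is a minimal sketch, not the maintainers' fix: CombineUnderBase is a hypothetical helper that strips leading separators so that the second argument is no longer rooted. In NewFontResolver itself, writing ".fonts" without the leading slash would presumably have the same effect.

```csharp
using System.IO;

public static class FontPaths
{
    // Path.Combine discards baseDir when the second argument is rooted,
    // so leading separators are removed before combining.
    public static string CombineUnderBase(string baseDir, string relative)
    {
        string trimmed = relative.TrimStart('/', '\\');
        return Path.Combine(baseDir, trimmed);
    }
}
```

With this helper, CombineUnderBase("/home/user", "/.fonts") yields the same result as Path.Combine("/home/user", ".fonts"), whereas Path.Combine("/home/user", "/.fonts") returns just "/.fonts".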

Initial findings

While running a few tests with the new version, I encountered some shortcomings I'd like to point out.
The method I used for these tests is very simple:

        private bool VerifyPdfCanBeImported(string filePath)
        {
            try
            {
                var document = PdfReader.Open(filePath, PdfDocumentOpenMode.Import);
                var documentCopy = new PdfDocument();
                foreach (var page in document.Pages)
                {
                    documentCopy.AddPage(page);
                }
                documentCopy.Save(Path.Combine(Path.GetTempPath(), "out.pdf"));
                return true;
            }
            catch (Exception ex)
            {
                var message = string.Format("{0}:{1}{2}{1}{1}", filePath, Environment.NewLine, ex);
                Console.WriteLine(message);
            }
            return false;
        }

FlateDecode skips PNG-Decoding
Exception:
PdfSharp.Pdf.IO.PdfReaderException: Unexpected token '' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp, please send us your PDF file.
(the token can be anything, it is not limited to an empty string)
It seems there are only a few lines of code missing in method FlateDecode.Decode.
Between msOutput.Flush() and return msOutput.GetBuffer() these lines are missing:
```
    if (msOutput.Length >= 0)
    {
        msOutput.Capacity = (int)msOutput.Length;
        if (parms?.DecodeParms != null)
            return StreamDecoder.Decode(msOutput.GetBuffer(), parms.DecodeParms);
    }
```
These lines are present at the end of the file (in the #else block), as well as in LzwDecode.Decode.

Improper handling of encrypted object-streams
Exception:
System.IO.InvalidDataException: The archive entry was compressed using an unsupported compression method.
This happens with encrypted documents containing object-streams (which are encrypted).
When the parser encounters an object stream in the PDF, it immediately tries to extract the objects from the stream.
Problem is, the object stream is still encrypted (decryption happens after all objects are read).
When the object stream is flate-encoded, the decoder tries to inflate encrypted data, leading to the mentioned exception.
I was able to fix this by inserting the following lines at the beginning of PdfObjectStream.PdfObjectStream(PdfDictionary dict):
```
	// while objects inside an object-stream are not encrypted, the object-streams themselves ARE!
	// 7.5.7, Page 47:
	// In an encrypted file (i.e., entire object stream is encrypted),
	// strings occurring anywhere in an object stream shall not be separately encrypted.
	var xrefEncrypt = _document._trailer.Elements[PdfTrailer.Keys.Encrypt] as PdfReference;
	if (xrefEncrypt != null && _document.SecurityHandler != null)
		_document.SecurityHandler.DecryptObject(dict);
```
The method PdfStandardSecurityHandler.DecryptObject had to be made internal for this to work.
You also have to make sure, you're not decrypting the streams again when the whole document is decrypted.
I simply used a flag in PdfStandardSecurityHandler and inserted the following code at the beginning of PdfStandardSecurityHandler.DecryptDictionary:
```
        // skip objects read from object streams (already done in PdfObjectStream.ctor())
        if (skipObjectStreams && (dict.Elements.GetName("/Type") == "/ObjStm" || dict.Reference != null && dict.Reference.Position < 0))
            return;    
```
The flag skipObjectStreams is set in PdfStandardSecurityHandler.DecryptDocument.

Improper calculation of decryption key (V5 R6)
Exception:
PdfSharp.Pdf.IO.PdfReaderException: The document seems to be not correctly encrypted. Could not verify P with Perms key.
In class PdfEncryptionV5 the encryption key is not calculated correctly.
In Method GetUserOwnerKeySalt there is this single line of code:
```
    return userOwnerValue.Skip(40).ToArray();
```
The issue could be fixed by replacing it with this line:
```
    return userOwnerValue.Skip(40).Take(8).ToArray();
```
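
Why Take(8) matters can be illustrated without PDFsharp. The sketch below assumes the layout that ISO 32000-2 specifies for the /U and /O strings of revision 6 (32-byte hash, 8-byte validation salt, 8-byte key salt) and assumes that some producers write these strings longer than 48 bytes, which is presumably what triggers the error. The class and method names are hypothetical.

```csharp
using System.Linq;

public static class V5KeyLayout
{
    // Assumed layout of /U and /O for revision 6:
    //   bytes  0..31  hash
    //   bytes 32..39  validation salt
    //   bytes 40..47  key salt
    // Any bytes beyond offset 47 are padding and must be ignored.
    public static byte[] GetValidationSalt(byte[] userOwnerValue) =>
        userOwnerValue.Skip(32).Take(8).ToArray();

    public static byte[] GetKeySalt(byte[] userOwnerValue) =>
        userOwnerValue.Skip(40).Take(8).ToArray();
}
```

For a value padded to 127 bytes, Skip(40).ToArray() returns 87 bytes, while GetKeySalt returns the 8 bytes at offsets 40 to 47.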

Also encryption-related, but merely a suggestion/question.
Some documents I encountered (which are AES-encrypted) seem to be incompletely or incorrectly encrypted.
When importing these documents, a CryptographicException is thrown with the error message Padding is invalid and cannot be removed.
Looking at the supposedly encrypted data as a hex dump, I realized the data was in fact NOT encrypted (i.e. it contained readable text).
I would therefore suggest surrounding the code in method DecryptForEnteredObject(ref byte[] bytes) (in classes PdfEncryptionV5 and PdfEncryptionV1To4) with a try/catch block that ignores the exception.
That would make documents exhibiting this issue readable.
If there is something inherently wrong with the encryption of a document, the library will probably bail out somewhere else, but I'm not 100% sure about that as I've never encountered a document like that.

Attached is a zip containing a document for each point in this list for reproduction.
documents.zip

If it helps, I could create separate issues for each point, as well as pull requests.

Overflowing paragraph text in TextFrame is not fully visible in end of page

Expected Behavior

Last line in page should be fully visible (see attached pdf file).

Actual Behavior

Only half of the last line at the end of the first page is visible.

Steps to Reproduce the Behavior

Run attached solution

Code:

    static Document CreateDocument()
    {
        // Create a new MigraDoc document.
        var document = new Document { };
        // Add a section to the document.
        var section = document.AddSection();
        section.PageSetup.BottomMargin = 0;
        section.PageSetup.TopMargin = 0;
        for (int p = 1; p < 100; p++)
        {
            var textFrame = section.AddTextFrame();
            textFrame.RelativeVertical = RelativeVertical.Line;
            textFrame.WrapFormat.DistanceTop = Unit.FromCentimeter(0.11);
            textFrame.Height = Unit.FromCentimeter(0.47);
            var paragraph = textFrame.AddParagraph();
            paragraph.Format.Font.Name = "Times New Roman";
            paragraph.Format.Font.Size = 10;
            paragraph.AddText("Maksekorraldus 123456");
            paragraph.Format.SpaceBefore = 0;
            paragraph.Format.SpaceAfter = 0;
        }
        return document;
    }

texttruncatedinendofpage.zip
actual behaviour.pdf

Using PDFsharp-MigraDoc 6.0.0-preview-3

MigraDoc: AddImage no longer supports usage of MemoryStream

Reporting an Issue Here

Expected Behavior

Adding an image to a paragraph (e.g. a bitmap in a supported format) from a MemoryStream instead of a file path should be supported again.
(see: http://pdfsharp.net/wiki/MigraDoc_FilelessImages.ashx )

Actual Behavior

(German for "Image not read")

Steps to Reproduce the Behavior

As described in http://pdfsharp.net/wiki/MigraDoc_FilelessImages.ashx
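
For reference, the technique on the linked wiki page encodes the image bytes into a pseudo file name with a "base64:" prefix. A minimal sketch of that encoding step follows; the helper name is hypothetical, and whether AddImage in the 6.x previews accepts such a name is exactly what this issue is about.

```csharp
using System;

public static class FilelessImages
{
    // Builds the "base64:" pseudo file name described on the wiki page
    // from raw image bytes (for example, the result of MemoryStream.ToArray()).
    public static string ToBase64ImageName(byte[] imageBytes)
    {
        return "base64:" + Convert.ToBase64String(imageBytes);
    }
}
```

The returned string is then passed to AddImage in place of a file path.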

Ximage FromStream Exception

Good Day!

I am always getting a "MemoryStream's internal buffer cannot be accessed" exception.

I take a Base64 string representation of an image and convert it into a byte array. Then I put the byte array into a MemoryStream and feed it into XImage.FromStream, which throws the exception.

I've been switching between this and the PDFSharpCore implementation. PDFSharpCore uses a different parameter like XImage.FromStream(() => ms) instead and it seems to be working on their side.
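
The error message matches what .NET raises when GetBuffer() is called on a MemoryStream created with new MemoryStream(byte[]), because that constructor does not expose the underlying buffer. Assuming PDFsharp accesses the buffer this way (an inference from the message, not confirmed by the report), a stream created with publiclyVisible set to true may avoid the exception. The helper name is hypothetical.

```csharp
using System;
using System.IO;

public static class ImageStreams
{
    // Decodes a Base64 string into a read-only MemoryStream whose internal
    // buffer is publicly visible, so GetBuffer()/TryGetBuffer() succeed.
    public static MemoryStream FromBase64(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        return new MemoryStream(bytes, 0, bytes.Length,
                                writable: false, publiclyVisible: true);
    }
}
```

An alternative with the same effect is to create an empty MemoryStream and write the bytes into it, since streams created with the parameterless constructor also expose their buffer.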

PDFsharp fails to open pdfs.

Here are two PDFs that fail to open with PDFsharp:

Fails in PdfTrailer
Detaljer ARGO KOD rev.B.pdf

Gets Unexpected EOF
20-0193 - GR.10 BJÆLKER. ENDELIGE.pdf

The code used to test this is from the sample collection.
It works on other PDFs without issues.

        string filename1 = @"Detaljer ARGO KOD rev.B.pdf";
        string filename2 = @"20-0193 - GR.10 BJÆLKER. ENDELIGE.pdf";

        // Open the input files
        PdfDocument inputDocument1 = PdfReader.Open(filename1, PdfDocumentOpenMode.Import);
        PdfDocument inputDocument2 = PdfReader.Open(filename2, PdfDocumentOpenMode.Import);

        // Create the output document
        PdfDocument outputDocument = new PdfDocument();

        // Show consecutive pages facing. Requires Acrobat 5 or higher.
        outputDocument.PageLayout = PdfPageLayout.TwoColumnLeft;

        XFont font = new XFont("Verdana", 10, XFontStyle.Bold);
        XStringFormat format = new XStringFormat();
        format.Alignment = XStringAlignment.Center;
        format.LineAlignment = XLineAlignment.Far;
        XGraphics gfx;
        XRect box;
        int count = Math.Max(inputDocument1.PageCount, inputDocument2.PageCount);
        for (int idx = 0; idx < count; idx++)
        {
            // Get page from 1st document
            PdfPage page1 = inputDocument1.PageCount > idx ?
              inputDocument1.Pages[idx] : new PdfPage();

            // Get page from 2nd document
            PdfPage page2 = inputDocument2.PageCount > idx ?
              inputDocument2.Pages[idx] : new PdfPage();

            // Add both pages to the output document
            page1 = outputDocument.AddPage(page1);
            page2 = outputDocument.AddPage(page2);

            // Write document file name and page number on each page
            gfx = XGraphics.FromPdfPage(page1);
            box = page1.MediaBox.ToXRect();
            box.Inflate(0, -10);
            gfx.DrawString(String.Format("{0} • {1}", filename1, idx + 1),
              font, XBrushes.Red, box, format);

            gfx = XGraphics.FromPdfPage(page2);
            box = page2.MediaBox.ToXRect();
            box.Inflate(0, -10);
            gfx.DrawString(String.Format("{0} • {1}", filename2, idx + 1),
              font, XBrushes.Red, box, format);
        }

        // Save the document...
        const string filename = "CompareDocument1_tempfile.pdf";
        outputDocument.Save(filename);

Crypt-filters attempt to modify documents opened for import

Reporting an Issue Here

Expected Behavior

I can open a PDF and import pages from it.

Actual Behavior

When the document is encrypted, the crypt-filters attempt to modify the Document's version by calling PdfDocument.SetRequiredVersion.
When the document is opened with PdfDocumentOpenMode.Import, an exception is thrown.
Stack trace:

    System.InvalidOperationException: The document cannot be modified.
   at PdfSharp.Pdf.PdfDocument.set_Version(Int32 value) 
   at PdfSharp.Pdf.PdfDocument.SetRequiredVersion(Int32 requiredVersion) 
   at PdfSharp.Pdf.Security.Encryption.PdfEncryptionV5..ctor(PdfStandardSecurityHandler securityHandler) 
   at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.SetEncryptionFieldToV5() 
   at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.PrepareForReading() 
   at PdfSharp.Pdf.IO.PdfReader.Open(Stream stream, String password, PdfDocumentOpenMode openMode, PdfPasswordProvider passwordProvider) 
   at PdfSharp.Pdf.IO.PdfReader.Open(String path, String password, PdfDocumentOpenMode openMode, PdfPasswordProvider provider) 
   at PdfSharp.Pdf.IO.PdfReader.Open(String path, PdfDocumentOpenMode openMode) 
   at PdfSharp.Tests.ReaderTests.VerifyPdfCanBeImported(String filePath)

Steps to Reproduce the Behavior

Run the attached issue-submission
It contains a PDF that causes the described behavior.

Issue.zip

MigraDoc RTF Document does not print Images (png)

Resources

Issue.zip

Version

I used the latest preview:
<PackageReference Include="PDFsharp" Version="6.0.0-preview-2" />
<PackageReference Include="PDFsharp-MigraDoc" Version="6.0.0-preview-2" />

Reporting an Issue Here

Expected Behavior

The logo, in this case "nehl-it-icon", should be printed in both the PDF and the RTF document, but it only works in the PDF.

Actual Behavior

The image is only visible in the PDF document. In the RTF document, only characters are shown in its place.

Steps to Reproduce the Behavior

Start the Project in the Issue.zip file and check the difference between word and pdf document

Regression: No appropriate font found for family name "Segoe UI".

Expected Behavior

No error like in 1.51.5185-beta.

Actual Behavior

 Message: 
System.InvalidOperationException : No appropriate font found for family name "Segoe UI".

  Stack Trace: 
XGlyphTypeface.GetOrCreateFrom(String familyName, FontResolvingOptions fontResolvingOptions)
XFont.Initialize()
XFont.ctor(String familyName, Double emSize, XFontStyleEx style, XPdfFontOptions pdfOptions)
XFont.ctor(String familyName, Double emSize, XFontStyleEx style)
FontHandler.FontToXFont(Font font)
ParagraphRenderer.get_CurrentFont()
ParagraphRenderer.CalcCurrentVerticalInfo()
ParagraphRenderer.InitFormat(Area area, FormatInfo previousFormatInfo)
ParagraphRenderer.Format(Area area, FormatInfo previousFormatInfo)
TopDownFormatter.FormatOnAreas(XGraphics gfx, Boolean topLevel)
<6 more frames...>
DocumentRenderer.PrepareDocument()
PdfDocumentRenderer.PrepareDocumentRenderer(Boolean prepareCompletely)
PdfDocumentRenderer.PrepareRenderPages()
PdfDocumentRenderer.RenderDocument()

Steps to Reproduce the Behavior

        var style = document.Styles[StyleNames.Normal];
        style.Font.Name = "Segoe UI";
        style.Font.Size = 10;

XImage.PointHeight returning width

Expected Behavior

XImage.PointHeight returning height...

Actual Behavior

XImage.PointHeight returns width...

Steps to Reproduce the Behavior

Get XImage.PointHeight from \src\foundation\src\PDFsharp\src\PdfSharp\Drawing\XImage.cs

PDFsharp/src/foundation/src/PDFsharp/src/PdfSharp/Drawing/XImage.cs

Line 942 in b748631

return _importedImage.Information.Width;

Creation of LoggerFactory in LogHost

When using PDFsharp and MigraDoc in a Blazor WASM app, the app crashes as soon as it starts to interact with the libraries' methods.
This is because ConsoleLogger is not supported in Blazor WASM apps.

The only possible workaround is to set LogHost.LoggerFactory to a custom instance before the default instance is created automatically, which is what causes the crash.

Besides the mentioned workaround it would be great to be able to set up PdfSharp and MigraDoc before the first usage with a builder class or just some static method. It could also be combined with the FontResolver setup.

Removing the ConsoleLogger and leaving just logging abstractions dependency would also reduce the app size which is always great when publishing Blazor WASM app.

I know the library is still in preview and that this is not the top priority issue, I just wanted to mention my use case scenario.
Besides this issue the library works great with Blazor WASM!
Thanks.

Cannot open Adobe Sign doc with unencrypted metadata.

PDFSharp 6.0.0-preview-3

Code to reproduce:
using var pdfReader = PdfReader.Open("c:\\temp\\test.pdf", PdfDocumentOpenMode.ReadOnly);

Exception:
Crypt filter value for PdfDictionary is set but CryptFilterDecodeParms are not initialized correctly.

Stack Trace:
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.GetCryptFilter(PdfDictionary dictionary)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.DecryptStream(Byte[]& bytes, PdfDictionary dictionary)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.DecryptDictionary(PdfDictionary dict, Boolean decryptObjectStream)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.DecryptObject(PdfObject value)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.DecryptDocument()
at PdfSharp.Pdf.IO.PdfReader.Open(Stream stream, String password, PdfDocumentOpenMode openMode, PdfPasswordProvider passwordProvider)
at PdfSharp.Pdf.IO.PdfReader.Open(String path, String password, PdfDocumentOpenMode openMode, PdfPasswordProvider provider)
at PdfSharp.Pdf.IO.PdfReader.Open(String path, PdfDocumentOpenMode openMode)

Unfortunately, I cannot provide a sample PDF, as all the failing documents have digital signatures and contain PII. But here are the (I believe) relevant properties of the documents:

Header:
%PDF-1.7

Encrypt Dictionary:
<</P -1324/R 4/StrF/StdCF/CF<</StdCF<</Type/CryptFilter/CFM/AESV2/Length 16/EncryptMetadata false>>>>/Filter/Standard/Length 128/U({some bytes.})/V 4/StmF/StdCF/EncryptMetadata false/O({some bytes...})>>

Metadata:

<</Length 5087/Type/Metadata/Subtype/XML/Filter/Crypt>>stream
<?xpacket begin="ï»¿" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 7.1.0">
   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about=""
            xmlns:xmp="http://ns.adobe.com/xap/1.0/"
            xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
            xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
            xmlns:dc="http://purl.org/dc/elements/1.1/">
         <xmp:CreatorTool>Acrobat Sign</xmp:CreatorTool>
         <xmp:ModifyDate>2023-07-26T10:05:17-07:00</xmp:ModifyDate>
         <xmp:MetadataDate>2023-07-26T10:05:17-07:00</xmp:MetadataDate>
         <pdf:Producer>Acrobat Sign</pdf:Producer>
         <xmpMM:DocumentID>uuid:82c91cd5-9c7c-29da-5522-ee39f67a036c</xmpMM:DocumentID>
         <xmpMM:InstanceID>uuid:f5b4b1f6-9727-2ee8-56ec-ee39f67a036c</xmpMM:InstanceID>
         <dc:format>application/pdf</dc:format>
      </rdf:Description>
   </rdf:RDF>
</x:xmpmeta>

PdfDictionary is set but CryptFilterDecodeParms are not initialized correctly

I am in the process of integrating PDFsharp for .NET 6 into my Blazor app, but am testing it in a separate C# Windows Forms solution. I want to be able to open PDF forms, fill out the form fields, and save the result.
While attempting to open the PDF form from a government institution with security restrictions, the program reports the following error:

"Crypt filter value for PdfDictionary is set but CryptFilterDecodeParms are not initialized correctly."

Details are:

System.InvalidOperationException
HResult=0x80131509
Message=Crypt filter value for PdfDictionary is set but CryptFilterDecodeParms are not initialized correctly.
Source=PdfSharp
StackTrace:
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.GetCryptFilter(PdfDictionary dictionary)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.DecryptStream(Byte[]& bytes, PdfDictionary dictionary)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.DecryptDictionary(PdfDictionary dict, Boolean decryptObjectStream)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.DecryptObject(PdfObject value)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.DecryptDocument()
at PdfSharp.Pdf.IO.PdfReader.Open(Stream stream, String password, PdfDocumentOpenMode openMode, PdfPasswordProvider passwordProvider)
at PdfSharp.Pdf.IO.PdfReader.Open(String path, String password, PdfDocumentOpenMode openMode, PdfPasswordProvider provider)
at PdfSharp.Pdf.IO.PdfReader.Open(String path, PdfDocumentOpenMode openMode)
at PDF_Testing_Ground.Form1.OpenClick(Object sender, EventArgs e) in C:\source\repos\PDF Testing Ground\PDF Testing Ground\Form1.cs:line 16
at System.Windows.Forms.Button.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
at System.Windows.Forms.Control.WndProc(Message& m)
at System.Windows.Forms.ButtonBase.WndProc(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, WM msg, IntPtr wparam, IntPtr lparam)

The code used is:

using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

namespace PDF_Testing_Ground
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void OpenClick(object sender, EventArgs e)
        {
            string filename = @"c:\Text\i-130.pdf";
            PdfDocument pdfDocument = PdfReader.Open(filename, PdfDocumentOpenMode.ReadOnly);
        }
    }
}

The PDF file can be obtained from: https://www.uscis.gov/sites/default/files/document/forms/i-130.pdf

Getting Outlines.Count throws TargetInvocationException and Inner Exception: Destination Array expected.

If you think there is a bug in PDFsharp then please use the IssueSubmissionTemplate to make the issue replicable.
https://docs.pdfsharp.net/General/IssueReporting.html

Thanks.

Resources

The official project web site:
https://docs.pdfsharp.net/

The official peer-to-peer support forum:
http://forum.pdfsharp.net/

Reporting an Issue Here

Expected Behavior

When trying to get the bookmark count I expect to get some kind of PdfOutlineCollection object (maybe nested?). Instead I get an exception. See the attached picture.

Actual Behavior

System.Reflection.TargetInvocationException

Inner Exception 1:
Exception: Destination Array expected

Call stack:

System.Reflection.TargetInvocationException
  HResult=0x80131604
  Message=Exception has been thrown by the target of an invocation.
  Source=mscorlib
  StackTrace:
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
   at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at PdfSharp.Pdf.PdfDictionary.DictionaryElements.CreateDictionary(Type type, PdfDictionary oldDictionary)
   at PdfSharp.Pdf.PdfDictionary.DictionaryElements.GetValue(String key, VCF options)
   at PdfSharp.Pdf.Advanced.PdfCatalog.get_Outlines()
   at PdfSharp.Pdf.PdfDocument.get_Outlines()
   at PdfTools.Models.PdfSharpHelper.MergePDFsWithBookmarks(String targetPath, String[] pdfs) in E:\GitHub\PdfTools\Models\PdfSharpHelper.cs:line 52
   at PdfTools.ViewModels.MainViewModel.MergePdfs() in E:\GitHub\PdfTools\ViewModels\MainViewModel.cs:line 40
   at PdfTools.ViewModels.MainViewModel.<get_MergePdfsCommand>b__14_0() in E:\GitHub\PdfTools\ViewModels\MainViewModel.cs:line 32
   at Utilities.CommandHandler.Execute(Object parameter) in E:\GitHub\Utilities\Utils\Commands.cs:line 31
   at MS.Internal.Commands.CommandHelpers.CriticalExecuteCommandSource(ICommandSource commandSource, Boolean userInitiated)
   at System.Windows.Controls.Primitives.ButtonBase.OnClick()
   at System.Windows.Controls.Button.OnClick()
   at System.Windows.Controls.Primitives.ButtonBase.OnMouseLeftButtonUp(MouseButtonEventArgs e)
   at System.Windows.UIElement.OnMouseLeftButtonUpThunk(Object sender, MouseButtonEventArgs e)
   at System.Windows.Input.MouseButtonEventArgs.InvokeEventHandler(Delegate genericHandler, Object genericTarget)
   at System.Windows.RoutedEventArgs.InvokeHandler(Delegate handler, Object target)
   at System.Windows.RoutedEventHandlerInfo.InvokeHandler(Object target, RoutedEventArgs routedEventArgs)
   at System.Windows.EventRoute.InvokeHandlersImpl(Object source, RoutedEventArgs args, Boolean reRaised)
   at System.Windows.UIElement.ReRaiseEventAs(DependencyObject sender, RoutedEventArgs args, RoutedEvent newEvent)
   at System.Windows.UIElement.OnMouseUpThunk(Object sender, MouseButtonEventArgs e)
   at System.Windows.Input.MouseButtonEventArgs.InvokeEventHandler(Delegate genericHandler, Object genericTarget)
   at System.Windows.RoutedEventArgs.InvokeHandler(Delegate handler, Object target)
   at System.Windows.RoutedEventHandlerInfo.InvokeHandler(Object target, RoutedEventArgs routedEventArgs)
   at System.Windows.EventRoute.InvokeHandlersImpl(Object source, RoutedEventArgs args, Boolean reRaised)
   at System.Windows.UIElement.RaiseEventImpl(DependencyObject sender, RoutedEventArgs args)
   at System.Windows.UIElement.RaiseTrustedEvent(RoutedEventArgs args)
   at System.Windows.UIElement.RaiseEvent(RoutedEventArgs args, Boolean trusted)
   at System.Windows.Input.InputManager.ProcessStagingArea()
   at System.Windows.Input.InputManager.ProcessInput(InputEventArgs input)
   at System.Windows.Input.InputProviderSite.ReportInput(InputReport inputReport)
   at System.Windows.Interop.HwndMouseInputProvider.ReportInput(IntPtr hwnd, InputMode mode, Int32 timestamp, RawMouseActions actions, Int32 x, Int32 y, Int32 wheel)
   at System.Windows.Interop.HwndMouseInputProvider.FilterMessage(IntPtr hwnd, WindowMessage msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at System.Windows.Interop.HwndSource.InputFilterMessage(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at MS.Win32.HwndWrapper.WndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at MS.Win32.HwndSubclass.DispatcherCallbackOperation(Object o)
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)
   at System.Windows.Threading.Dispatcher.LegacyInvokeImpl(DispatcherPriority priority, TimeSpan timeout, Delegate method, Object args, Int32 numArgs)
   at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam)
   at MS.Win32.UnsafeNativeMethods.DispatchMessage(MSG& msg)
   at System.Windows.Threading.Dispatcher.PushFrameImpl(DispatcherFrame frame)
   at System.Windows.Threading.Dispatcher.PushFrame(DispatcherFrame frame)
   at System.Windows.Application.RunDispatcher(Object ignore)
   at System.Windows.Application.RunInternal(Window window)
   at System.Windows.Application.Run(Window window)
   at System.Windows.Application.Run()
   at PdfTools.App.Main()

  This exception was originally thrown at this call stack:
    PdfSharp.Pdf.PdfOutline.Initialize()
    PdfSharp.Pdf.PdfOutline.InitializeChildren()
    PdfSharp.Pdf.PdfOutline.Initialize()
    PdfSharp.Pdf.PdfOutline.PdfOutline(PdfSharp.Pdf.PdfDictionary)

Inner Exception 1:
Exception: Destination Array expected.

Steps to Reproduce the Behavior

The PDF came from an external party, so I can't share it. It was produced with Python and TeX somehow, maybe LaTeX or some other engine.

We strongly recommend using the IssueSubmissionTemplate to make sure we can replicate the issue.
https://docs.pdfsharp.net/General/IssueReporting.html

Blank space after footer if SpaceBefore is set

Expected Behavior

Footer text should appear at the bottom of the page

Actual Behavior

There is empty space between the footer text and the end of the page

unexpectedblankspaceafterfootertext.pdf

Steps to Reproduce the Behavior

Run

emptyspaceafterfooterifSpaceBeforeIsUsed.zip

Code

        static Document CreateDocument()
        {
            var document = new Document { };
            var Section = document.AddSection();
            Section.PageSetup.PageWidth = "210mm";
            Section.PageSetup.PageHeight = "297mm";
            Section.PageSetup.TopMargin = 0;
            Section.PageSetup.BottomMargin = Unit.FromCentimeter(3);
            Section.PageSetup.FooterDistance = 0;
            Section.PageSetup.HeaderDistance = 0;
            for (int i = 0; i < 77; i++)
                Section.AddParagraph("paragraph " + i);
            var par1 = Section.Footers.Primary.AddParagraph("Unexpected blank space after footer text");
            par1.Format.SpaceBefore = Unit.FromCentimeter(2);
            par1.Format.SpaceAfter = 0;
            return document;
        }

Help to achieve pdf digital signature

Hi,

Is there any simple way to add a digital signature to a PdfDocument ?

I know the question is broad, but we can't find any clue about this in the documentation or the source code.

(For example, the PdfSignatureField constructor is internal, and in any case we don't know where to add it.)

Thanks for any help

'page.AddDocumentLink': the selected area is incorrect

Code

The areas selected by these two calls do not match; they appear to be mirror images of each other.

var pdf = new PdfDocument();
var page = pdf.AddPage();
page.AddDocumentLink(new PdfRectangle(new XRect(0, 0, page.Width, 40)), 2);

var graphics = XGraphics.FromPdfPage(page);
graphics.DrawRectangle(XBrushes.Black, new XRect(0, 0, page.Width, 40));
pdf.Save("D:\\test2.pdf");
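
One possible explanation, offered here as an assumption rather than a confirmed diagnosis, is that the two calls use different coordinate origins: PDF page space has its origin at the bottom-left corner with y growing upwards, while XGraphics by default draws from the top-left corner with y growing downwards. The sketch below shows only the arithmetic of that flip. The class and method names are hypothetical and not part of PDFsharp.

```csharp
public static class PageCoordinates
{
    // Converts a rectangle given with a top-left origin (y grows downwards)
    // into the same rectangle expressed with a bottom-left origin (y grows upwards).
    // pageHeight must be in the same unit as the rectangle values (e.g. points).
    public static (double X, double Y, double Width, double Height) TopLeftToBottomLeft(
        double x, double y, double width, double height, double pageHeight)
    {
        // Only the y coordinate changes: the new y is the distance from the
        // bottom edge of the page to the bottom edge of the rectangle.
        return (x, pageHeight - y - height, width, height);
    }
}
```

For a 40-point strip at the top of an A4 page (842 points high), the converted rectangle starts at y = 802. If the link rectangle is expected in PDF page space, passing the converted values should make both areas coincide.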

Read real font from AcroForm

In the current implementation, the font of a PdfTextField is hardcoded to "Courier New" in the PdfTextField class.
This crashes my automatically filled PDF, because the PDF predefines the font and size for these fields.

See here:
public XFont Font { get; set; } = new XFont("Courier New", 10.0);

Unexpected EOF error while reading PDF using PdfSharp

I am working on a project where I am using PdfSharp to modify PDF documents. My code reads PDFs from a byte array, modifies certain elements of the document, and then writes the modified document back to a byte array.

Here is the method where I am facing an issue:

private byte[] ModifyPDFDocument(ContentModifyDTO content)
{
    byte[] fileData = content.FileData;
    using (MemoryStream outStream = new MemoryStream())
    {
        using (MemoryStream inStream = new MemoryStream())
        {
            inStream.Write(content.FileData, 0, content.FileData.Length);
            PdfDocument document = PdfReader.Open(inStream, PdfReadAccuracy.Moderate);

            string customProperty = "/uuid";
            if (document.Info.Elements.ContainsKey(customProperty))
            {
                document.Info.Elements.Remove(customProperty);
            }
            document.Info.Elements.Add(new KeyValuePair<string, PdfItem>(customProperty, new PdfString(content.DocumentId.ToString())));

            string customPropertyKey = "/ee5ff867-b433-45a8-aa6c-c69d56b7dde7";
            string customPropertyMetadata = @$"{{""DocumentId"":""{content.DocumentId}"",""VersionId"":""{content.VersionId}""}}";
            if (document.Info.Elements.ContainsKey(customPropertyKey))
            {
                document.Info.Elements.Remove(customPropertyKey);
            }
            document.Info.Elements.Add(
                new KeyValuePair<string, PdfItem>(
                    customPropertyKey,
                    new PdfString(customPropertyMetadata)
                )
            );

            document.Save(outStream);
            fileData = outStream.ToArray();
        }
    }

    return fileData;
}

When a user uploads an older PDF document, I get an "Unexpected EOF" error at this line:
PdfDocument document = PdfReader.Open(inStream, PdfReadAccuracy.Moderate);

The error message is as follows:
{
"type": null,
"message": "Unexpected EOF",
"details": [{"target": "Void Fill()", "message": "Unexpected EOF"}]
}

However, most of the documents are read without any problems. I am looking for a solution that can allow PdfSharp to read all PDFs without any errors. Alternatively, if there's a way to validate the PDFs before reading them with PdfSharp to avoid this error, I would appreciate any suggestions.

Thank you in advance for your help.
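
As a rough illustration of the "validate before reading" idea, the sketch below checks that a byte array contains a `%PDF-` header near the start and a `%%EOF` marker near the end. It is only a heuristic under stated assumptions: the class name, method name, and the 1024-byte search windows are choices made for this example, it does not use PDFsharp, and a file that passes may still fail to parse.

```csharp
using System;
using System.Text;

public static class PdfSanityCheck
{
    // Returns the index of needle inside data within [start, end), or -1 if absent.
    private static int IndexOf(byte[] data, byte[] needle, int start, int end)
    {
        for (int i = start; i + needle.Length <= end; i++)
        {
            int j = 0;
            while (j < needle.Length && data[i + j] == needle[j]) j++;
            if (j == needle.Length) return i;
        }
        return -1;
    }

    // Heuristic: a complete PDF normally starts with "%PDF-" and ends with "%%EOF".
    // A truncated upload typically lacks the trailing marker.
    public static bool LooksLikeCompletePdf(byte[] data)
    {
        if (data == null || data.Length < 16) return false;

        byte[] header = Encoding.ASCII.GetBytes("%PDF-");
        byte[] trailer = Encoding.ASCII.GetBytes("%%EOF");

        int headEnd = Math.Min(data.Length, 1024);        // search window at the start
        int tailStart = Math.Max(0, data.Length - 1024);  // search window at the end

        return IndexOf(data, header, 0, headEnd) >= 0
            && IndexOf(data, trailer, tailStart, data.Length) >= 0;
    }
}
```

A caller could run `PdfSanityCheck.LooksLikeCompletePdf(content.FileData)` before opening the document and reject or log files that fail, which at least separates truncated uploads from genuine parser problems.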

Encryption seems to break images that use indexed color

I have a PDF with an indexed color image of the company logo on every page. See attached for a similar document using an indexed version of the PDFsharp logo and which reproduces the issue (indexed-color-encryption.zip). The original logo, without converting to indexed color, does not cause any issues.

Using PDFsharp (Core) 6.0, I set an owner password:

public void ProcessFile(Stream inStream, Stream outStream) {
  var pdfDoc = PdfReader.Open(inStream);
  pdfDoc.SecuritySettings.OwnerPassword = "password";      
  pdfDoc.Save(outStream);
  pdfDoc.Close();
}

When opening the result, Acrobat reports that "An error exists on the page. Acrobat may not display the page correctly". pdfimages says there's something wrong with the palette:

$ pdfimages -list pdfsharplogo-indexed-output.pdf
Syntax Warning: Bad Indexed color space (lookup table string too short)
Syntax Error (101): Bad image parameters
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------

Under some encryption settings other than the default, Acrobat will actually render the image without any errors, but with the wrong colors. I've tested some other PDF readers and they sometimes render the PDF when Acrobat won't, but never with the right colors.

One thing I noticed is that the palette is not encrypted in the file (it's identical in the input and output files):

7 0 obj
[
/Indexed/DeviceRGB 255<000000FFFFFFFBD4DAFBDDE1FAE3E6FDE6E...>
]
endobj

However, I have done the same operation with other libraries (iText, IronPdf), and there the palette does seem to be encrypted, and there is no issue reading the file after. Is it possible that readers expect the palette to be encrypted and, when it isn't, end up trying to decrypt it and then use some nonsensical data as the palette?

Thanks in advance for any insights.

6.0.0-preview-3 font colour always Black

Hey guys, we're very happy with the new version - Thank you for continuing to support this great PDF library!

We have been testing 6.0.0 on Windows and Ubuntu and it works great.

The only trouble we have encountered is that we have not been able to change the font colour in the rendered PDF from the default black.
This is on 6.0.0-preview-3, using the NewFontResolver.

Reporting an Issue Here

Expected Behavior

Font colour changes to yellow or green.

Actual Behavior

Font colour stays black.

Steps to Reproduce the Behavior

private const string VerdanaFontName = "Verdana";

private static readonly Font _greenFont = new() {
    Name = VerdanaFontName,
    Color = _green
};

public class VerdanaFontResolver : NewFontResolver {
    public byte[] GetFont(string faceName) {
        using var data = GeneralExtensions.GetEmbeddedStream(faceName, typeof(PdfMaker).Assembly);
        return data.AsByteArray();
    }
    public FontResolverInfo ResolveTypeface(string familyName, bool isBold, bool isItalic) {
        if (familyName == VerdanaFontName) {
            return new FontResolverInfo(VerdanaFontName);
        }
        throw new InvalidOperationException($"Font '{familyName}' is not supported.");
    }
}

var deetsPara = deetsFrame.AddParagraph();
deetsPara.AddFormattedText("Email Address\t\t", _greenFont);
var banner1 = section.AddA4TextBox(6.7, 1.75, 4);
banner1.AddParagraph().AddFormattedText("Total Due", _greenFont);

// RENDER

Is there a new way to set Font colours?

Issues with content-related classes

While adding new functionality to the library (in my case, form flattening), I discovered several issues with content-related classes.
As some of the problems are in classes internal to the library, the IssueSubmissionTemplate is not well suited for this, but I will do my best to explain the issues in as much detail as possible.

What I'm trying to do:
AcroFields (more precisely, the widget annotations of the fields) in PDF documents have appearance streams that contain the drawing operators used to render the fields on a page.
In PdfSharp, these drawing operators (and their operands) are represented by CObject and its subclasses.
To flatten the AcroFields, I extract these drawing operators, add some additional ones, and render the result.
Somewhat simplified, it looks like this:

protected virtual void RenderContentStream(PdfPage page, PdfDictionary streamDict, PdfRectangle rect)
{
	var stream = streamDict.Stream;
	var content = ContentReader.ReadContent(stream.UnfilteredValue);
	// start drawing at the position specified in rect
	var matrix = new XMatrix();
	matrix.TranslateAppend(rect.X1, rect.Y1);
	var matElements = matrix.GetElements();
	var matrixOp = OpCodes.OperatorFromName("cm");
	foreach (var el in matElements)
		matrixOp.Operands.Add(new CReal { Value = el });
	content.Insert(0, matrixOp);

	// Save and restore Graphics state
	content.Insert(0, OpCodes.OperatorFromName("q"));
	content.Add(OpCodes.OperatorFromName("Q"));
	// create new content
	var appendedContent = page.Contents.AppendContent();
	using (var ms = new System.IO.MemoryStream())
	{
		var cw = new ContentWriter(ms);
		foreach (var obj in content)
			obj.WriteObject(cw);
		appendedContent.CreateStream(ms.ToArray());
	}
}

The problems:

When reading the content (with ContentReader.ReadContent) the CParser does not set the CStringType for CStrings.
This results in an exception when writing the objects back to a stream with WriteObject.
The method CString.ToString does a
switch (CStringType)
And the getter of CStringType throws because _cStringType was never set:

        public CStringType CStringType
        {
            get => _cStringType ?? NRT.ThrowOnNull<CStringType>();
            set => _cStringType = value;
        }
        CStringType? _cStringType;

Fix in CParser.ParseObject:

                    case CSymbol.String:
                    case CSymbol.HexString:
                    case CSymbol.UnicodeString:
                    case CSymbol.UnicodeHexString:
                        s = new CString();
                        s.Value = _lexer.Token;
                        // CString.ToString() only supports CStringType.String   // added
                        s.CStringType = CStringType.String;                      // added
                        _operands.Add(s);
                        break;

While investigating why the flattened fields were not rendered as intended (sometimes not visible at all, sometimes at odd positions), I discovered that the q and Q operators I had added to the content were missing from the output document.
The reason was found in COperator.WriteObject, which looks like this:

        internal override void WriteObject(ContentWriter writer)
        {
            if (_sequence != null)
            {
                int count = _sequence.Count;
                for (int idx = 0; idx < count; idx++)
                {
                    // ReSharper disable once PossibleNullReferenceException because the loop is not entered if _sequence is null
                    _sequence[idx].WriteObject(writer);
                }
                writer.WriteLineRaw(ToString());
            }
        }

This writes out the operator, but only if it has operands.
q and Q don't have operands, so they were left out.
Moving the line writer.WriteLineRaw(ToString()); out of the if-block fixed the issue.

CLexer.ScanHexadecimalString does not handle hex strings with an odd number of digits.
The standard Lexer already handles this, so it should be easy to fix.
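
For reference, the PDF specification states that when a hexadecimal string has an odd number of digits, the final digit is treated as if it were followed by 0, and that whitespace between digits is ignored. The sketch below illustrates that rule in isolation. It is not the CLexer code; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class HexStringDecoder
{
    private static int HexValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        throw new FormatException($"Invalid hex digit '{c}'.");
    }

    // Decodes the text between '<' and '>' of a PDF hex string into bytes.
    public static byte[] Decode(string hexDigits)
    {
        var bytes = new List<byte>();
        int high = -1; // pending high nibble, or -1 if none

        foreach (char c in hexDigits)
        {
            if (char.IsWhiteSpace(c)) continue; // whitespace is ignored

            int value = HexValue(c);
            if (high < 0)
            {
                high = value;
            }
            else
            {
                bytes.Add((byte)((high << 4) | value));
                high = -1;
            }
        }

        // Odd number of digits: the missing final digit is assumed to be 0.
        if (high >= 0) bytes.Add((byte)(high << 4));

        return bytes.ToArray();
    }
}
```

With this rule, the digits `746578742` decode to the same five bytes as `7465787420`, which is the result the odd-length test case in this report expects.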

CParser loses the last token.
Given the content stream q (text) Tj Q, parsing it and re-writing the objects should reconstruct the input, but the last operator (Q in this case) is missing.

I created test-cases for these issues in the PdfSharp.Tests project so you should be able to reproduce them.
I needed to add the following lines to the PdfSharp.csproj file in order to access internal classes (like ContentWriter):

	<ItemGroup>
		<InternalsVisibleTo Include="$(AssemblyName).Tests" />
	</ItemGroup>

Test cases (I added them to BasicTests.cs):

        [Theory]
        [InlineData("q (text) Tj Q ")]  // this works
        [InlineData("q (text) Tj Q")]   // this doesn't
        public void Content_Can_Be_Parsed_And_Reconstructed(string contentString)
        {
            var contentBytes = Encoding.UTF8.GetBytes(contentString);

            var sequence = ContentReader.ReadContent(contentBytes);
            using var ms = new MemoryStream();
            var cw = new ContentWriter(ms);
            foreach (var obj in sequence)
            {
                obj.WriteObject(cw);
            }
            var newContent = new PdfContent(new PdfDictionary());
            newContent.CreateStream(ms.ToArray());

            // ContentWriter adds a newline after each operator
            newContent.Stream.ToString().Should().Be("q\n(text)Tj\nQ\n");
            // is this intended ? ToString() writes only operator-names but not the operands...
            var s = sequence.ToString();    // result: "qTjQ"
        }

        [Fact]
        public void Content_Can_Be_Manually_Constructed()
        {
            var sequence = new CSequence();
            var op = OpCodes.OperatorFromName("q");
            sequence.Add(op);
            op = OpCodes.OperatorFromName("Tj");
            op.Operands.Add(new CString() { CStringType = CStringType.String, Value = "text" });
            sequence.Add(op);
            op = OpCodes.OperatorFromName("Q");
            sequence.Add(op);

            using var ms = new MemoryStream();
            var cw = new ContentWriter(ms);
            foreach (var obj in sequence)
            {
                obj.WriteObject(cw);
            }
            var newContent = new PdfContent(new PdfDictionary());
            newContent.CreateStream(ms.ToArray());

            // ContentWriter adds a newline after each operator
            newContent.Stream.ToString().Should().Be("q\n(text)Tj\nQ\n");
        }

        [Theory]
        [InlineData("<7465787420> Tj")]  // this works
        [InlineData("<746578742> Tj")]   // this doesn't
        public void Can_Parse_Hex_String_With_Odd_Length(string contentString)
        {
            var contentBytes = Encoding.UTF8.GetBytes(contentString);

            var sequence = ContentReader.ReadContent(contentBytes);
            using var ms = new MemoryStream();
            var cw = new ContentWriter(ms);
            foreach (var obj in sequence)
            {
                obj.WriteObject(cw);
            }
            var newContent = new PdfContent(new PdfDictionary());
            newContent.CreateStream(ms.ToArray());

            // ContentWriter adds a newline after each operator
            newContent.Stream.ToString().Should().Be("(text )Tj\n");
        }

One last thing, not content-related:
I am also working on an API to enable the creation of AcroForms from scratch.
In doing so, I encountered a possible font-related issue.
It seems that PdfSharp always creates font subsets when rendering text.
While this is great for saving space for text rendered on a page, it is a problem for AcroFields, where users may change the text.
For example, when I create a PdfTextField, set its value to Bob, and create an appearance that renders the text "Bob", PdfSharp creates a font subset containing only the glyphs for B, o, and b.
When opening the PDF, I'm unable to change the value to, say, Peter, because the required glyphs are missing from the font.
Is there an option in PdfSharp to embed a font in full rather than as a subset?
Am I missing something here?
(My workaround is to render all glyphs from the font to an XForm positioned outside the page, but this is obviously not optimal.)

Anyway, thanks for a great library!

PdfSharp-Migradoc-GDI 6.0.0 with .net 8.0?

Will you release a NuGet package that works with .NET 8.0? The current version 6.0.0 does not work with .NET 8.0. My project uses .NET 8.0 because some of its components require it.

Html renderer

Hello,
I would like to convert an HTML string to PDF using your library in a .NET MAUI app. I believe that you do not support HTML-to-PDF conversion directly. What would be a preferred parser to use for that? Do you have plans to support HTML to PDF in your library?

Footer overlap using negative SpaceBefore and minimum line height

FooterOverlapswithpagecontent.zip

Expected Behavior

Footer text should not overlap with page content

Actual Behavior

Footer text overlaps with page content

footeroverlapswithpagefontents.pdf

Steps to Reproduce the Behavior

Run attached solution.

Code:

        static Document CreateDocument()
        {
            var document = new Document { };
            var Section = document.AddSection();
            Section.PageSetup.PageWidth = "210mm";
            Section.PageSetup.PageHeight = "297mm";
            Section.PageSetup.TopMargin = 0;
            Section.PageSetup.BottomMargin = Unit.FromCentimeter(2);
            for (int i = 0; i < 77; i++)
            {
                var par = Section.AddParagraph("paragraph " + i);
                par.Format.SpaceBefore = Unit.FromCentimeter(-0.5);
                par.Format.LineSpacingRule = LineSpacingRule.Exactly;
                par.Format.LineSpacing = "1cm";
            }
            var par1 = Section.Footers.Primary.AddParagraph("Footer overlaps");
            par1.Format.SpaceBefore = Unit.FromCentimeter(1);
            return document;
        }

Null check throws before generating DataMatrix Image

PDFsharp/src/foundation/src/PDFsharp/src/PdfSharp/Drawing.BarCodes/CodeDataMatrix.cs

Line 156 in e973974

if (MatrixImage == null)

CodeDataMatrix.Render (called internally by XGraphics.DrawMatrixCode) checks for null to decide if it should call DataMatrixImage.GenerateMatrixImage, but the NullabilityHelper throws System.InvalidOperationException as a result of the Null Check.

I don't see a way to generate the image from outside the Render call, as all the related properties and methods are protected.

Image has no valid type error

I am using the pdfsharp-migradoc 6.0.0-preview-2 and facing the following problem

I have the image as a base64 string and I am using the following code to add it to the document.

private static void AddLabels(Document document, IList<string> images)
{
    var section = document.LastSection;

    // Add an image
    for (int i = 0; i < images.Count; ++i)
    {
        if (i > 0)
        {
            section.AddPageBreak();
        }
        var image = section.AddImage("base64:" + images[i]);
        image.LockAspectRatio = true;
        image.Height = "4in";
    }
}

AddLabels(document, images);
var renderer = new PdfDocumentRenderer
{
    // Associate the MigraDoc document with a renderer.
    Document = document,
    PdfDocument = new PdfDocument()
};
renderer.RenderDocument();
using (var stream = new MemoryStream())
{
    string filename = "C:\\HelloMigraDoc.pdf";
    renderer.PdfDocument.Save(filename);
}

And this is the result I get.

This code was working fine and was giving the image in the pdf with pdfsharp-migradoc (1.50)

Is there anything that I am missing here with the new version of pdfsharp-migradoc?

PDFsharp seems to be incompatible with .NET 6

On the page https://docs.pdfsharp.net/General/Overview/Port-to-v6.0.html you can see that PDFSharp supports .NET 6, but there is the following warning:

Project.csproj: [NU1701] Package 'PDFsharp 1.50.5147' was restored using '.NETFramework,Version=v4.6.1, .NETFramework,Version=v4.6.2, .NETFramework,Version=v4.7, .NETFramework,Version=v4.7.1, .NETFramework,Version=v4.7.2, .NETFramework,Version=v4.8, .NETFramework,Version=v4.8.1' instead of the project target framework 'net6.0-windows7.0'. This package may not be fully compatible with your project.

Project.csproj has target framework net6.0-windows.

* Please use the Support forum to ask Support questions *

Please use the GitHub Issues only for issues like

Bugs in PDFsharp and/or MigraDoc
Bug fixes
Improvement suggestions

Please use the support forum to ask "How do I" questions and such things:
https://forum.pdfsharp.net/
Please consult the forum rules and try to include all information that is required to efficiently answer your question.

Reading null Tag on PdfPage throws exception

Trying to read the Tag value on a PdfPage throws an exception if Tag is null. As this object is for user use, and stated as not being used by PDFsharp, its value shouldn't be enforced. It's quite possible to want to optionally assign a value to Tag and later check whether that value is null, but doing so causes an exception.

PDFsharp/src/foundation/src/PDFsharp/src/PdfSharp/Pdf/PdfPage.cs

Line 79 in b748631

get => _tag ?? NRT.ThrowOnNull<object>();

Dependency on PowerShell 7.* prevents building on Windows 10

I just cloned the repository, and following the readme.md I tried to run the .\dev\download-assets.ps1 on my dev-box: VS2022 running on Windows 10.

This was the result:

This dependency on PowerShell prevents developers running Windows 10 from building the master branch of PDFsharp.

6.0.0-preview-4 - unable to set /Metadata at /Catalog level

Hi,

In 6.0.0-preview-4 I'm unable to set XMP metadata.
I add my own /Metadata dictionary and it is written to the output PDF, but in the end PdfSharp links its own indirect object from the catalog:

1 0 obj
<<
/Type/Catalog
/Pages 2 0 R
/Metadata 9 0 R
>>
endobj
...
8 0 obj
<<
/Length 340
/Type/Metadata
/Subtype/XML
>>
stream
<?xpacket begin="Ã¯Â»Â¿" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 6.1.10">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:xmp="http://ns.adobe.com/xap/1.0/"
      xmp:CreatorTool="Test"/>
  </rdf:RDF>
</x:xmpmeta>
endstream
endobj
...
9 0 obj
<<
/Type/Metadata
/Subtype/XML
/Length 1469
>>
stream
<?xpacket begin="ï»¿" id="W5M0MpCehiHzreSzNTczkc9d"?>
  <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="3.1-701">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
        <pdf:Producer>PDFsharp 6.0.0-preview-4 under Microsoft Windows 10.0.19045</pdf:Producer><pdf:Keywords></pdf:Keywords>
      </rdf:Description>
      <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title><rdf:Alt><rdf:li xml:lang="x-default"></rdf:li></rdf:Alt></dc:title>
        <dc:creator><rdf:Seq><rdf:li></rdf:li></rdf:Seq></dc:creator>
        <dc:description><rdf:Alt><rdf:li xml:lang="x-default"></rdf:li></rdf:Alt></dc:description>
      </rdf:Description>
      <rdf:Description rdf:about="" xmlns:xmp="http://ns.adobe.com/xap/1.0/">
        <xmp:CreatorTool>PDFsharp 6.0.0-preview-4 (www.pdfsharp.net)</xmp:CreatorTool>
        <xmp:CreateDate>0001-01-01T00:00:00.0000000</xmp:CreateDate>
        <xmp:ModifyDate>0001-01-01T00:00:00.0000000</xmp:ModifyDate>
      </rdf:Description>
      <rdf:Description rdf:about="" xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/">
        <xmpMM:DocumentID>uuid:cd4cf696-81b0-4f4a-8296-ad24a0482084</xmpMM:DocumentID>
        <xmpMM:InstanceID>uuid:97a97879-905c-4417-86f7-8ac6a426854a</xmpMM:InstanceID>
      </rdf:Description>
    </rdf:RDF>
  </x:xmpmeta>
<?xpacket end="w"?>                
endstream
endobj

In 1.50.5147 it was possible.

Here is the source code allowing to reproduce the issue:
https://github.com/podprad/misc_public/tree/main/playgrounds/csharp/PDFSharpMetaIssue

Pasted code:

<Project Sdk="Microsoft.NET.Sdk">

    <PropertyGroup>
        <OutputType>Exe</OutputType>
        
        <TargetFramework>net6.0</TargetFramework>
<!--        <TargetFramework>net48</TargetFramework>-->
    </PropertyGroup>

    <ItemGroup>
      <None Remove="Pdf14Simplest.pdf" />
      <Content Include="Pdf14Simplest.pdf">
        <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      </Content>
    </ItemGroup>

    <ItemGroup>
      <PackageReference Include="PDFsharp" Version="6.0.0-preview-4" Condition="'$(TargetFramework)' == 'net6.0'"/>
      <PackageReference Include="PDFsharp" Version="1.50.5147" Condition="'$(TargetFramework)' == 'net48'"/>
    </ItemGroup>

</Project>

namespace PDFSharpMetaIssue
{
    using System.Text;
    using PdfSharp.Pdf;
    using PdfSharp.Pdf.Advanced;

    public static class Program
    {
        public static void Main()
        {
            const string XmpToSet = @"<?xpacket begin=""ï»¿"" id=""W5M0MpCehiHzreSzNTczkc9d""?>
<x:xmpmeta xmlns:x=""adobe:ns:meta/"" x:xmptk=""Adobe XMP Core 6.1.10"">
  <rdf:RDF xmlns:rdf=""http://www.w3.org/1999/02/22-rdf-syntax-ns#"">
    <rdf:Description rdf:about=""""
        xmlns:xmp=""http://ns.adobe.com/xap/1.0/""
      xmp:CreatorTool=""Test""/>
  </rdf:RDF>
</x:xmpmeta>";

            var xmpBytes = new UTF8Encoding(false).GetBytes(XmpToSet);

            var filePath = "Pdf14Simplest.pdf";

            using (var document = PdfSharp.Pdf.IO.PdfReader.Open(filePath))
            {
                var catalog = document.Internals.Catalog;

                if (catalog.Elements.TryGetValue("/Metadata", out var oldMetadata))
                {
                    if (oldMetadata is PdfReference oldMetadataReference)
                    {
                        catalog.Elements.Remove("/Metadata");
                        document.Internals.RemoveObject(oldMetadataReference.Value);
                    }
                }

                var newMetadata = new PdfDictionary();
                newMetadata.CreateStream(xmpBytes);
                newMetadata.Elements.Add("/Type", new PdfName("/Metadata"));
                newMetadata.Elements.Add("/Subtype", new PdfName("/XML"));

                document.Internals.AddObject(newMetadata);

                catalog.Elements.Add("/Metadata", newMetadata.Reference);

                document.Save("Output.pdf");
            }
        }
    }
}

Suggestion: PdfDocumentRenderer.RenderDocument() should support cancel/stop

Hi,

When creating a large PDF, PdfDocumentRenderer.RenderDocument() may take several minutes to execute, so it would be better to add an API that allows cancelling or stopping it.

SpaceBefore property value ignored in first paragraph

spacebeforeignored.pdf

Expected Behavior

Text should appear 20 cm below the top of the page

Actual Behavior

Text appears immediately at the start of the page (see attached PDF file)

Steps to Reproduce the Behavior

Run attached solution

Code:

        static Document CreateDocument()
        {
            var document = new Document { };
            var section = document.AddSection();
            section.PageSetup.TopMargin = 0;
            var par = section.AddParagraph("test");
            par.Format.SpaceBefore = "20cm";
            return document;
        }

SpaceBeforePropertyValueIgnoredForFirstParagraph.zip

Efficiently Merging multiple PDFs

I am trying to implement a method to merge multiple PDFs together. Is there a recommended approach for handling this? I currently have something like this:

using PdfDocument outputPdf = new PdfDocument();
foreach (var docId in documentIds)
{
    Stream incomingPdfFileStream = await _docService.DownloadDocument(docId);

    var sourcePdf = PdfReader.Open(incomingPdfFileStream, PdfDocumentOpenMode.Import);
    for (int i = 0; i < sourcePdf.PageCount; i++) {
        outputPdf.AddPage(sourcePdf.Pages[i]);
    }
}

Stream completePdfFileStream = new MemoryStream();
outputPdf.Save(completePdfFileStream);
return completePdfFileStream;

Any thoughts or recommendations are greatly appreciated!

Struggling with installing and use the package

Hello there, I'm using .NET 6 and WPF 6.0.20, and I'm struggling with installing the package through the NuGet manager:

And when I try to use it, I get errors like this one:

System.NotSupportedException : 'No data is available for encoding 1252. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.'

The lines that throw are these:

XGraphics gfx = XGraphics.FromPdfPage(pdfPage);
gfx.DrawString("Test", new XFont("Arial", 20, XFontStyle.Bold), XBrushes.Black, new XPoint(100, 50));

Is the version in the NuGet manager outdated? Should I build the package locally from the "6.0.0-preview-3" source provided in the release tab here? I have never installed a package manually and I'm a bit lost...

Also, the docs say that this version is for testing purposes and that I should use the "core build", but where can I find it?

Regards, Bill.
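
The exception text itself points at `Encoding.RegisterProvider`. On .NET Core and .NET 5+, code page encodings such as Windows-1252 are not available until the code pages provider has been registered. The minimal sketch below shows that registration using standard .NET APIs only. The class and method names are hypothetical, and whether this alone resolves the error with the installed package version is an assumption.

```csharp
using System.Text;

public static class EncodingSetup
{
    // Registers the code pages provider (safe to call more than once)
    // and returns the Windows-1252 encoding that the exception asks for.
    public static Encoding GetWindows1252()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }
}
```

In an application, the `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)` call would typically run once at startup, before any PDF drawing code executes.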

Document stays in memory when using PdfReader.Open

using (PdfDocument pdfDocument = PdfReader.Open("file.pdf", PdfDocumentOpenMode.Import))
{
pdfDocument.Dispose();
}

When I run this code, the pdfDocument is never cleared from memory. The file.pdf is about 800 MB.

PdfSharp.Pdf.IO.PdfReaderException: "Unexpected token 'e' in PDF stream. The file may be corrupted."

file

pdf

code

var pdf1 = PdfReader.Open("files/file3.pdf");

Dot in AcroForm field name (text field) breaks the field

Hi,

If I have an AcroForm field named, for example, "function_A_DoSth(mytext.txt)", PDFsharp splits the text field name at the dot, so in the AcroForm names I can only see "function_A_DoSth(mytext". I tried to debug this, but I could not find out why it happens.
Do you have any idea?
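
For context: the PDF specification (ISO 32000-1, section 12.7.3.2) builds a field's fully qualified name by joining partial names with periods and states that a partial name shall not contain a period. The sketch below only illustrates how treating the dot as a hierarchy separator produces the truncated name from the report. The helper is hypothetical and is not PDFsharp's actual code:

```csharp
using System;

// Hypothetical illustration, not taken from the PDFsharp sources.
public static class FieldNameSketch
{
    // Treats '.' as the hierarchy separator and returns the first partial name.
    public static string FirstPartialName(string fullyQualifiedName)
        => fullyQualifiedName.Split('.')[0];

    public static void Main()
    {
        // The field name from the report contains a dot inside the parentheses.
        Console.WriteLine(FirstPartialName("function_A_DoSth(mytext.txt)"));
        // prints: function_A_DoSth(mytext
    }
}
```

If this is what happens, renaming the field so that it contains no dot would be one way to sidestep the problem.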

PdfSharp and flexbox (CSS level 3)

Greetings! When using this library, I get the following error. I understand that PDFsharp does not support CSS above level 2 or HTML above 4.01. Do I understand correctly that higher versions are not supported by this library, or do I need to upgrade to a later version? I want to use a flexbox-based layout, but I get an error. Please help me.

Opening an existing PDF file for import ignores two of its pages

I want to combine two existing PDFs into a new document. When I try this, two of the pages are missing. I can see that the document returned by PdfReader.Open() has only two pages instead of the expected four.

I have followed all the steps relating to producing Issue.zip which contains a working example of the problem. The only step that I couldn't fully follow was "4. Send us the zip file" as I see no email addresses or form upload options anywhere. Please let me know how I'm meant to send the file and I will do so.

Reporting an Issue Here

Expected Behavior

Opening an existing PDF file with four pages for import should result in a document with all four pages.

Actual Behavior

The document only contains the first two pages. In the old version of PDFsharp, this failed with the error "Invalid predictor in array". The new version doesn't crash, but it doesn't include all pages either.

Steps to Reproduce the Behavior

I have Issue.zip all ready to go. Tell me where you want it.

Project cannot be compiled

If you think there is a bug in PDFsharp then please use the IssueSubmissionTemplate to make the issue replicable.
https://docs.pdfsharp.net/General/IssueReporting.html

Thanks.

Resources

The official project web site:
https://docs.pdfsharp.net/

The official peer-to-peer support forum:
http://forum.pdfsharp.net/

Reporting an Issue Here

Hello, first of all, thanks for providing such a good class library.
However, the PdfSharp 6.0.0 source code fails to compile, apparently because resource files are missing.
Could you help me take a look?

Missing Added text

I am not sure if this is an issue or I am just missing something.

I have people who add text to PDFs using Adobe's Fill & Sign feature and then email them directly from Adobe, without saving the PDF to disk, to the mailbox I pull PDFs from. When I open such a PDF using 6.0-preview3 with PdfReader in open mode Import, the added text is not in the PDF.

If I open the PDF with, say, the Edge PDF viewer, the text is shown.
If the user saves the file to disk before emailing it, it works as it should (but I am told that this slows production too much).

So I was just wondering am I missing something here? Or would it be a bug?

Logging Feature Deployment Issues

In addition to the Console LoggerFactory problem in Issue #26, adding Microsoft.Extensions.Logging.Console and Microsoft.Extensions.Logging as public dependencies causes a significant headache for deployment using an installer application created by WiX. These dependencies form the root of a large dependency tree; over a dozen assemblies need to be added to the installer. Our application does not need PDFSharp to do any logging at all, especially when deployed on a customer machine.

Is it possible to make the logging feature private using the PrivateAssets feature? Then you could use logging for internal development and testing purposes, but not require the end consumer to think about logging. This would remove all dependencies for the PDFSharp NuGet, which would be cleaner.

Thanks for considering this!

Borders serializer crashes

Namespace: MigraDoc.DocumentObjectModel
Class: Borders
Method: internal void Serialize(Serializer serializer, Borders? refBorders)

DiagonalDown is checked for null, but DiagonalUp is then dereferenced, which leads to a NullReferenceException:

internal void Serialize(Serializer serializer, Borders? refBorders)
{
    ...
    if (!Values.DiagonalDown.IsValueNullOrEmpty())
    {
        Values.DiagonalUp!.Serialize(serializer, "DiagonalDown", null);
    }
    if (!Values.DiagonalUp.IsValueNullOrEmpty())
    {
        Values.DiagonalUp!.Serialize(serializer, "DiagonalUp", null);
    }
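
The likely intent is that each guard dereferences the same property it has just checked. A self-contained sketch of that pattern follows; the types and the Serialize helper are simplified, hypothetical stand-ins for the MigraDoc classes, not the library's real code:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the MigraDoc types, for illustration only.
public static class BordersSketch
{
    public sealed record Border(string Name);

    // Serializes only the borders that are set. Each guard uses the same
    // value it checks, so a missing DiagonalUp is never dereferenced.
    public static List<string> Serialize(Border? diagonalDown, Border? diagonalUp)
    {
        var output = new List<string>();
        if (diagonalDown is not null)
            output.Add($"DiagonalDown = {diagonalDown.Name}");
        if (diagonalUp is not null)
            output.Add($"DiagonalUp = {diagonalUp.Name}");
        return output;
    }

    public static void Main()
    {
        // Only DiagonalDown is set, the case in which the reported code throws.
        foreach (string line in Serialize(new Border("down"), null))
            Console.WriteLine(line); // prints: DiagonalDown = down
    }
}
```

Applied to the method above, this would mean calling Values.DiagonalDown!.Serialize(serializer, "DiagonalDown", null) in the first branch.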

XGraphics.DrawMatrixCode ignores XBrush parameter

Expected Behavior

1D codes (derived from BarCode) use provided XBrush when rendering (e.g. via XGraphics.DrawBarCode).
2D codes (derived from MatrixCode) use provided XBrush when rendering (e.g. via XGraphics.DrawMatrixCode).

Actual Behavior

1D codes correctly use provided XBrush when rendering.
2D codes use XBrushes.Black when rendering.

Steps to Reproduce the Behavior

MCVE Issue.zip

CodeOmr not usable because of a project-specific hack

It looks like a specific hack went into the generic production code.
File src\foundation\src\PDFsharp\src\PdfSharp\Drawing.BarCodes\CodeOmr.cs

#if true
            // HACK: Project Wallenwein: set LK
            value |= 1;
            _synchronizeCode = true;
#endif

Best Regards,
Steffen