Comments (3)

TheGS commented on July 2, 2024

To put it another way, there are two lines in PDFParser::ParseXrefFromXrefTable() that construct ULong values for assignment. They both look something like this...

inXrefTable[currentObject].mRivision = ULong((const char*)(entry+11));

ULong is a typedef of BoxingBaseWithRW<unsigned long>, and it can only be constructed from a const char* by implicitly constructing a temporary std::string variable. The typedef looks like this...

typedef BoxingBaseWithRW<unsigned long> ULong;

And the declaration of BoxingBaseWithRW<> looks like this...

template <typename U, class Reader=STDStreamsReader<U>, class Writer=STDStreamsWriter<U> >
class BoxingBaseWithRW : public BoxingBase<U>
{
public:
    BoxingBaseWithRW();
    BoxingBaseWithRW(const U& inValue);
    BoxingBaseWithRW(const BoxingBase<U>& inOther);
    BoxingBaseWithRW(const BoxingBaseWithRW<U,Reader,Writer>& inOther);
    BoxingBaseWithRW(const std::wstring& inReadFrom);
    BoxingBaseWithRW(const std::string& inReadFrom);

    BoxingBaseWithRW<U,Reader,Writer>&  operator =(const U& inValue);

    std::string ToString() const;
    std::wstring ToWString() const;
};
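
As a rough illustration of why the const char* argument compiles at all, here is a minimal sketch using a hypothetical stand-in class, BoxedULong. It is not the library's BoxingBaseWithRW, and the stream-based parsing and the SourceLength() accessor are assumptions made only for the demonstration. Because the only string-like constructor takes a const std::string&, the compiler quietly builds a temporary std::string first.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for BoxingBaseWithRW<unsigned long>; not the library's class.
class BoxedULong
{
public:
    BoxedULong() : mValue(0), mSourceLength(0) {}

    // There is no const char* overload, so a const char* argument is first
    // converted to a temporary std::string, which scans for a terminating NUL.
    BoxedULong(const std::string& inReadFrom)
        : mValue(0), mSourceLength(inReadFrom.size())
    {
        std::istringstream stream(inReadFrom);
        stream >> mValue; // parsing stops at the first non-digit character
    }

    unsigned long Value() const { return mValue; }

    // Number of characters the temporary std::string captured.
    std::size_t SourceLength() const { return mSourceLength; }

private:
    unsigned long mValue;
    std::size_t mSourceLength;
};

// Builds a BoxedULong from a NUL-terminated C string and reports how many
// characters ended up in the temporary std::string.
inline std::size_t CapturedLength(const char* inText)
{
    assert(inText != nullptr);
    return BoxedULong(inText).SourceLength();
}
```

The parsed value only depends on the leading digits, but the temporary string still has to walk every byte up to the NUL, which is where the trouble starts when no NUL is guaranteed.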

Unfortunately, the single-parameter std::string(const char*) constructor assumes that its input is a pointer to a NUL-terminated byte string. This is a problem here, because...

Byte entry[20];

The entry buffer allocated on the stack to hold the xref line is 20 bytes long... exactly big enough to hold just the entire contents of an xref line with no padding on the stack afterwards.

So, the line which constructs a ULong builds a temporary std::string starting at offset 11 of the entry and running until the next NUL byte encountered. That NUL will not be inside the entry buffer: the 5-character revision (offsets 11 to 15) is followed by a whitespace character (offset 16), then the in-use flag character (offset 17), then the two end-of-line bytes (offsets 18 and 19, CRLF probably), and nothing guarantees that any sort of NUL byte follows at all. The construction of the std::string will read past the 5 characters of the revision number, and can potentially read past the end of the entry buffer and into unmapped memory.
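
To make the over-read concrete without actually triggering undefined behaviour, the sketch below simulates the situation inside a single array: 20 bytes of a made-up xref entry followed by stand-in "neighbouring" bytes and a deliberately placed NUL. The array name, its contents, and the two helper functions are all hypothetical; in the real parser nothing controls what follows the 20-byte buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Offsets 0..19 hold a sample xref entry; offsets 20..28 stand in for whatever
// happens to sit after the buffer, and the literal's terminating NUL is at 29.
// Everything after offset 19 is an assumption made for this demonstration.
static const char kSimulatedStack[] = "0000000017 00003 n\r\nNEIGHBOUR";

// What std::string(const char*) captures when started at the revision field:
// it keeps going until the first NUL, well past the 5 digits.
inline std::string ReadUntilNul()
{
    return std::string(kSimulatedStack + 11);
}

// What the (pointer, length) constructor captures: exactly the 5 digits.
inline std::string ReadFiveBytes()
{
    assert(sizeof(kSimulatedStack) >= 11 + 5);
    return std::string(kSimulatedStack + 11, 5);
}
```

Only 9 bytes of the entry remain from offset 11 onward, so any result from ReadUntilNul() longer than that has necessarily pulled in bytes from outside the 20-byte entry.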

In this case, we know from the PDF specification that the revision number is 5 digits wide, padded in front with zeros. So, a simple fix to read only those 5 bytes as the revision number is to explicitly construct a std::string with the 2-parameter constructor that takes a buffer pointer and a length.

inXrefTable[currentObject].mRivision = ULong( std::string((const char*)(entry+11),5) );

I believe this also more accurately captures the intent of parsing the xref line.
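
The same idea could be wrapped in a small helper so that every fixed-width field is read with an explicit length. This is only a sketch: ParseFixedWidthULong, XrefFields, and ParseXrefEntry are hypothetical names, the Byte typedef is assumed to match the parser's, and it assumes the other numeric field being parsed is the 10-digit offset at the start of the entry.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

typedef unsigned char Byte; // assumption: mirrors the parser's Byte type

// Hypothetical helper: parse exactly inWidth bytes starting at inField as a
// decimal unsigned long, never reading beyond the field.
inline unsigned long ParseFixedWidthULong(const Byte* inField, std::size_t inWidth)
{
    assert(inField != nullptr);
    // The (pointer, length) constructor copies inWidth bytes and needs no NUL.
    std::string digits(reinterpret_cast<const char*>(inField), inWidth);
    std::istringstream stream(digits);
    unsigned long value = 0;
    stream >> value;
    return value;
}

// Hypothetical holder for the fields of one 20-byte xref entry.
struct XrefFields
{
    unsigned long offset;
    unsigned long revision;
    bool inUse;
};

// Taking the entry as a reference to a 20-byte array lets the compiler check
// the buffer size at the call site.
inline XrefFields ParseXrefEntry(const Byte (&inEntry)[20])
{
    XrefFields fields;
    fields.offset = ParseFixedWidthULong(inEntry, 10);       // offsets 0..9
    fields.revision = ParseFixedWidthULong(inEntry + 11, 5); // offsets 11..15
    fields.inUse = (inEntry[17] == 'n');                     // 'n' in use, 'f' free
    return fields;
}
```

Keeping the widths next to the offsets makes the fixed layout of the entry visible in the code, which is the intent the two-parameter constructor expresses in the one-line fix.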

from pdf-writer.

galkahana commented on July 2, 2024

Sounds perfect.

How about we take care of this? Wanna open a pull request, or should I update the code?

Gal.

galkahana commented on July 2, 2024

Checked in a commit per your suggested corrections.
