Comments (3)
To put it another way, there are two lines in PDFParser::ParseXrefFromXrefTable() which construct ULong values for assignment. They both look something like this...
inXrefTable[currentObject].mRivision = ULong((const char*)(entry+11));
ULong is a typedef of BoxingBaseWithRW<unsigned long>, and it can only be constructed from a const char* by implicitly constructing a temporary std::string variable. The typedef looks like this...
typedef BoxingBaseWithRW<unsigned long> ULong;
And the declaration of BoxingBaseWithRW<> looks like this...
template <typename U, class Reader=STDStreamsReader<U>, class Writer=STDStreamsWriter<U> >
class BoxingBaseWithRW : public BoxingBase<U>
{
public:
BoxingBaseWithRW();
BoxingBaseWithRW(const U& inValue);
BoxingBaseWithRW(const BoxingBase<U>& inOther);
BoxingBaseWithRW(const BoxingBaseWithRW<U,Reader,Writer>& inOther);
BoxingBaseWithRW(const std::wstring& inReadFrom);
BoxingBaseWithRW(const std::string& inReadFrom);
BoxingBaseWithRW<U,Reader,Writer>& operator =(const U& inValue);
std::string ToString() const;
std::wstring ToWString() const;
};
Unfortunately, the single-parameter const char* version of the std::string constructor, std::string(const char*), assumes that the input is a pointer to a NUL-terminated byte string. This is a problem here, because...
Byte entry[20];
The entry buffer allocated on the stack to hold the xref line is 20 bytes long... exactly big enough to hold the entire contents of an xref line, with no room for a terminating NUL afterwards.
So, the line which constructs a ULong constructs a temporary std::string from offset 11 of the entry buffer until the next NUL byte encountered... which will definitely not be in the entry buffer, as the 5-character revision (offsets 11 to 15) is followed by a whitespace character (offset 16), followed by the in-use flag character (offset 17), followed by who knows what (CRLF, probably), and is not guaranteed to be followed by any sort of NUL byte at all. The construction of the std::string will read past the 5 characters of the revision number, and can potentially read past the end of the entry buffer and into unmapped memory.
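To make the difference between the two std::string constructors concrete, here is a minimal sketch. The names kPaddedEntry, ReadUnbounded and ReadBounded are hypothetical and not part of pdf-writer, and the sample line has an explicit NUL appended so the unbounded read stays inside the array (the real 20-byte buffer has no such guarantee).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sample xref line: 20 bytes of content plus an explicit NUL.
// The NUL is added here ONLY so that the unbounded read below is well
// defined; the parser's real entry buffer is exactly 20 bytes with no NUL.
static const char kPaddedEntry[21] = "0000000017 00003 n\r\n";

// Unbounded: std::string(const char*) copies bytes until the first NUL.
inline std::string ReadUnbounded(const char* inField)
{
    assert(inField != nullptr);
    return std::string(inField);
}

// Bounded: std::string(const char*, size_t) copies exactly inLength bytes.
inline std::string ReadBounded(const char* inField, std::size_t inLength)
{
    assert(inField != nullptr);
    return std::string(inField, inLength);
}
```

With this sample, ReadUnbounded(kPaddedEntry + 11) picks up the revision digits plus the space, the in-use flag and the line ending, while ReadBounded(kPaddedEntry + 11, 5) stops after the five revision digits.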
In this case, we know from the PDF specification that the revision number is 5 digits wide, padded in front with zeros. So, a simple fix to read only those 5 bytes as the revision number is to explicitly construct a std::string with the 2-parameter constructor that takes a buffer pointer and a buffer length.
inXrefTable[currentObject].mRivision = ULong( std::string((const char*)(entry+11),5) );
I believe this also more accurately captures the intent of parsing the xref line.
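The same idea can be sketched without the boxing classes. This is an illustration only: ReadFixedWidthULong is a hypothetical helper, not a pdf-writer function, and it uses std::stoul as a stand-in for whatever conversion ULong performs internally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

typedef unsigned char Byte;  // assumed to match the parser's Byte typedef

// Hypothetical helper: reads exactly inWidth bytes starting at inOffset and
// converts them as a decimal number. No NUL terminator is required, because
// the length is passed explicitly to the std::string constructor.
// std::stoul throws std::invalid_argument if the field holds no digits.
inline unsigned long ReadFixedWidthULong(const Byte* inBuffer,
                                         std::size_t inOffset,
                                         std::size_t inWidth)
{
    assert(inBuffer != nullptr);
    const std::string field(reinterpret_cast<const char*>(inBuffer + inOffset), inWidth);
    return std::stoul(field);
}
```

For a 20-byte entry, a call such as ReadFixedWidthULong(entry, 11, 5) would read only the revision digits, and ReadFixedWidthULong(entry, 0, 10) only the 10-digit offset, so neither read can run past the end of the buffer.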
from pdf-writer.
Sounds perfect.
How about we take care of this? Want to open a pull request, or should I update the code?
Gal.
Checked in a commit per your suggested corrections - commit.