pomax / a-binary-parser-generator
This project aims to create a tool that can turn a spec file into a parser skeleton for binary data files such as OpenType fonts, PNG images, etc.
The test/Opentype/validator.html page blew up on my favorite wonky TTF font file, but I finally tracked it down to missing tables in the .spec file. The console showed:
VM462:101 reading undefined structure at GLOBAL offset 1043482.
VM462:125 readStructure(): f is 'undefined'
VM462:128 Yes, we saw f get returned as 'undefined'
VM462:131 Uncaught TypeError: undefined is not a function
where I had added the "Yes, we saw" logging. It might be nice if readStructure() could better handle bad returns from the "new Function()" call.
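One way readStructure() could fail more helpfully is sketched below. It is only an illustration: requireReader is a hypothetical helper, not part of the project, and it assumes the reader function is looked up or generated just before it is called.

```javascript
// Hypothetical guard (not project code): check the value that came back
// from the reader lookup / "new Function()" step before calling it.
function requireReader(f, structureName, offset) {
  if (typeof f !== "function") {
    throw new TypeError(
      "readStructure(): no reader for structure '" + structureName +
      "' at GLOBAL offset " + offset + " (got " + typeof f +
      "); is it defined in the .spec file?"
    );
  }
  return f;
}

// Simulate the failure from the log above: the generated code returns undefined.
const f = new Function("return undefined;")();
try {
  requireReader(f, "_ClassDefTable", 1043482);
} catch (e) {
  console.log(e.message);
}
```

With a check like this the error names the missing structure, instead of surfacing as "undefined is not a function".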
After laboriously tracking down what data was being worked on, I found that OpenType.spec referenced an undefined collection, viz.
890: RELATIVE USHORT OFFSET ClassDef TO _ClassDefTable FROM START
In lieu of (or until) a PR, here's the fix that got me past that point:
--- OpenType.spec.original 2014-11-27 13:33:39.357960200 -0600
+++ OpenType.spec 2014-11-28 00:08:11.807579700 -0600
@@ -767,6 +767,25 @@
}
}
+ // used ????
+ Collection _ClassDefTable {
+ USHORT ClassFormat
+ if(ClassFormat==1) {
+ USHORT StartGlyph
+ USHORT GlyphCount
+ USHORT[GlyphCount] ClassValueArray
+ }
+ if(ClassFormat==2) {
+ USHORT ClassRangeCount
+ Collection _ClassRangeRecord {
+ USHORT Start
+ USHORT End
+ USHORT Class
+ }
+ _ClassRangeRecord[ClassRangeCount] ClassRangeRecord
+ }
+ }
+
// ==========================================
// LookupType 1: Single Substitution Subtable
// ==========================================
I can well believe the collection definition ought to go somewhere else...
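For reference, the reads that the collection above describes can be sketched by hand as follows. This is not the generator's output: readClassDef is a hypothetical name, and the sketch assumes big-endian USHORT values (as OpenType uses) and takes the array length in format 2 to be ClassRangeCount.

```javascript
// Hand-written sketch of the reads implied by the _ClassDefTable collection.
// view is a DataView over the font data, offset is where the table starts.
function readClassDef(view, offset) {
  const u16 = (rel) => view.getUint16(offset + rel, false); // false = big-endian
  const ClassFormat = u16(0);
  if (ClassFormat === 1) {
    const StartGlyph = u16(2);
    const GlyphCount = u16(4);
    const ClassValueArray = [];
    for (let i = 0; i < GlyphCount; i++) ClassValueArray.push(u16(6 + 2 * i));
    return { ClassFormat, StartGlyph, GlyphCount, ClassValueArray };
  }
  if (ClassFormat === 2) {
    const ClassRangeCount = u16(2);
    const ClassRangeRecord = [];
    for (let i = 0; i < ClassRangeCount; i++) {
      const rec = 4 + 6 * i; // each record is three USHORTs
      ClassRangeRecord.push({ Start: u16(rec), End: u16(rec + 2), Class: u16(rec + 4) });
    }
    return { ClassFormat, ClassRangeCount, ClassRangeRecord };
  }
  throw new Error("unknown ClassFormat " + ClassFormat);
}

// Format 2 table with a single range: glyphs 10..20 belong to class 3.
const classDefBytes = new Uint8Array([0, 2, 0, 1, 0, 10, 0, 20, 0, 3]);
const classDef = readClassDef(new DataView(classDefBytes.buffer), 0);
console.log(JSON.stringify(classDef));
```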
The RegExp for this is currently
search = "([ \\t]*)([\\w_]+)\\[([^\\.\\s]+(\\s*[\\+\\-\\*\\/]\\s*[^)]+)*)\\]\\s*(\\w+)($|\s*[^O])";
but this should be
search = "([ \\t]*)([\\w_]+)\\[([^\\.\\s]+(\\s*[\\+\\-\\*\\/]\\s*[^\\]]+)*)\\]\\s*(\\w+)($|\s*[^O])";
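The difference is that the arithmetic part of the array-length expression now matches up to the closing bracket ([^\]]) rather than up to a closing parenthesis ([^)]).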
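A quick check of the two patterns is sketched below. The pattern strings are copied verbatim from above; the spec line being matched is a made-up example with a parenthesised length expression, not a line from any real spec file.

```javascript
// Pattern strings copied verbatim from the report above.
const currentSearch = "([ \\t]*)([\\w_]+)\\[([^\\.\\s]+(\\s*[\\+\\-\\*\\/]\\s*[^)]+)*)\\]\\s*(\\w+)($|\s*[^O])";
const proposedSearch = "([ \\t]*)([\\w_]+)\\[([^\\.\\s]+(\\s*[\\+\\-\\*\\/]\\s*[^\\]]+)*)\\]\\s*(\\w+)($|\s*[^O])";

// Hypothetical spec line: the array length contains a ")" before the "]".
const specLine = "  USHORT[(GlyphCount - 1) * 2] Values";

const currentMatch = new RegExp(currentSearch).exec(specLine);
const proposedMatch = new RegExp(proposedSearch).exec(specLine);

console.log(currentMatch);     // null: [^)]+ cannot get past the ")"
console.log(proposedMatch[2]); // "USHORT"
console.log(proposedMatch[3]); // "(GlyphCount - 1) * 2"
console.log(proposedMatch[5]); // "Values"
```

Simple lengths such as USHORT[GlyphCount] match under both patterns; only expressions containing a closing parenthesis behave differently.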
Not all data types are created equal: some binary formats have a strict endian policy, whereas others do not. This needs some kind of spec indicator, probably the easiest being:
[BIG|LITTLE] ENDIAN
defaulting to machine byte order if the indicator is left off.
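A rough sketch of how such an indicator could be honoured when reading values follows. The helper names (machineIsLittleEndian, specIsLittleEndian, readUShort) are hypothetical, and the sketch assumes the indicator sits on a line of its own in the spec text.

```javascript
// Detect the byte order of the machine running the parser.
function machineIsLittleEndian() {
  return new Uint8Array(new Uint16Array([1]).buffer)[0] === 1;
}

// Hypothetical helper: look for a "[BIG|LITTLE] ENDIAN" line in the spec
// text and fall back to machine ordering when there is none.
function specIsLittleEndian(specText) {
  const match = /^[ \t]*(BIG|LITTLE)[ \t]+ENDIAN[ \t]*$/m.exec(specText);
  return match ? match[1] === "LITTLE" : machineIsLittleEndian();
}

// Read a USHORT with the requested byte order.
function readUShort(bytes, offset, littleEndian) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return view.getUint16(offset, littleEndian);
}

const twoBytes = new Uint8Array([0x01, 0x02]);
const asBig = readUShort(twoBytes, 0, specIsLittleEndian("BIG ENDIAN\nUSHORT version"));
const asLittle = readUShort(twoBytes, 0, specIsLittleEndian("LITTLE ENDIAN\nUSHORT version"));
console.log(asBig, asLittle); // 258 513
```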
Several OpenType tables are empty collections at the moment, and should be filled in.
Does it make sense to add an include for sub specs, such as the CFF spec inside the OpenType spec?
Loading Ubuntu Mono works fine for GSUB, but Adobe Garamond Pro causes parse errors in GSUB on LangSysTable.
CFF and PNG required additional data unpacking, using more than just byte sequence reading. Find out what a good way is to get this to work without introducing a specific programming language.
Hey,
this tool is awesome, thanks for making this.
One suggestion would be to make the file illustrations self-describing. So allow each field in the file spec to have a comment explaining what it is. That could then optionally show up in the parser output, so a programmer unfamiliar with that file spec could even intuitively understand the file just by looking at the output of the parser.
I guess this is more of a feature request than a bug... sorry if I filed this in the wrong spot.
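To make the suggestion concrete, here is a small sketch of what carrying a field comment through to the output could look like. It is purely illustrative: the trailing // comment syntax per field, parseFieldLine, and describeValue are all assumptions, not existing features of the tool.

```javascript
// Hypothetical: parse "TYPE Name // description" into a field record.
function parseFieldLine(line) {
  const match = /^\s*(\w+(?:\[[^\]]+\])?)\s+(\w+)\s*(?:\/\/\s*(.*\S))?\s*$/.exec(line);
  if (!match) return null;
  return { type: match[1], name: match[2], description: match[3] || "" };
}

// Hypothetical: render a parsed value together with its description.
function describeValue(field, value) {
  const note = field.description ? "  // " + field.description : "";
  return field.name + " = " + value + note;
}

const field = parseFieldLine("USHORT StartGlyph // first glyph ID in the range");
console.log(describeValue(field, 36));
```

A field without a comment would simply print as before, so existing specs would not need to change.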