Giter Club home page Giter Club logo

oxmsg's Introduction

oxmsg

write Microsoft .msg Outlook Item files.

MSG Outlook Items

These are compound files in cfb format, which is a simplified filesystem-in-a-file with directories (called storages) and files (called streams).

msg can contain outlook appointments and emails, they consist of a property stream in the top-level storage and several sub-storages (ie for attachments).

Streams contain either raw data or a list of Properties of 16 bytes each. Properties consist of a 2 byte type (binary, boolean, long etc), 2 byte id (usage), 4 bytes of flags and 8 bytes of value. Property tags (the first 4 bytes of a property) are defined in lib/property_tags.js.

MsgKit

MsgKit is the C# library oxmsg was based on.

Emails

When creating a new Email instance, oxmsg will prepare the data structure to store all attributes, like a list of attachments, a list of recipients, and so on. These can be filled by adding the items to the Email object.

The final .msg will be created by calling .msg() on the Email, at which point the Email object will serialize itself into a cfb compound file provided by cfb, which is then written to an Uint8Array. Example usage in examples/node.js.

Appointments

MsgKit also supports exporting Outlook Appointments to msg, but those parts are not translated yet.

TODO

  • The library currently works in node, but could be made to work in the browser if it's shrunk and the dependencies that rely on node Buffers are replaced (mainly the parsers in lib/mime/header/ that use iconv-lite). BigInt would need to be polyfilled and bigint literals removed
  • it would probably be good to generate/provide an index.d.ts type definition file, which should not be too hard since the actual API surface is tiny.
  • The library has the capability to parse email headers if that should be necessary (lib/mime/header/), but that's not done yet.
  • There's also code to read .msg files into an Email object that's not useful yet

The next steps would include using the library to export mails and figure out if the api makes sense and if the timestamps match expectations.

Docs & Tools

CFB viewer

online cfb viewer

This online tool (made by the developers of the cfb library) shows the contents of a .msg file, consisting of all storages with their names, and streams with their contents (in hex and ASCII).

Comparer

examples/comparer.js is a small utility that takes two cfb files and shows the differences between them. It will show streams/storages that are contained in only one of them, and if the element is contained in both but has different contents, it will show the difference.

Example (run with -a after the files to compare to show identical elements as well):

$ node examples/comparer.js test/testOut_attach.msg test/msgKitOut_attach.msg -a

MISSING: 4 Root Entry/Sh33tJ5 not in /home/nils/Projects/repositories/oxmsg/test/msgKitOut_attach.msg
tag: undefined
0000: 37 32 36 32
7262

SAME: size: 38 Root Entry/__substg1.0_0E1D001F
tag: PR_NORMALIZED_SUBJECT_W

DIFFERENCE: 72:72 Root Entry/__substg1.0_0FFF0102 type 2
tag: PR_ENTRYID
test/testOut_attach.msg                               || test/msgKitOut_attach.msg
no tag                                                || no tag                                               
0000: 65 00 65 00 36 00 62 00 30 00 34 00 35 00 62 00 || 64 00 35 00 66 00 62 00 36 00 38 00 61 00 35 00
no tag                                                || no tag                                               
0010: 2d 00 61 00 31 00 64 00 63 00 2d 00 34 00 64 00 || 2d 00 30 00 37 00 37 00 66 00 2d 00 34 00 63 00
no tag                                                || no tag                                               
0020: 38 00 62 00 2d 00 38 00 34 00 31 00 65 00 2d 00 || 36 00 66 00 2d 00 38 00 66 00 66 00 31 00 2d 00
no tag                                                || no tag                                               
0030: 63 00 34 00 35 00 36 00 38 00 39 00 63 00 37 00 || 38 00 61 00 34 00 38 00 35 00 32 00 39 00 37 00
no tag                                                || no tag                                               
0040: 63 00 36 00 36 00 33 00                         || 30 00 63 00 31 00 61 00
ee6b045b-a1dc-4d8b-841e-c45689c7c663
d5fb68a5-077f-4c6f-8ff1-8a4852970c1a

DIFFERENCE: 644:644 Root Entry/__properties_version1.0 type 2
tag: undefined
test/testOut_attach.msg                               || test/msgKitOut_attach.msg
PR_CLIENT_SUBMIT_TIME                                 || PR_CLIENT_SUBMIT_TIME                                
00c0: 40 00 39 00 06 00 00 00 00 00 00 00 00 00 00 00 || 40 00 39 00 06 00 00 00 00 3f cf 4e 90 4d d2 01
PR_CREATION_TIME                                      || PR_CREATION_TIME                                     
0140: 40 00 07 30 06 00 00 00 00 00 00 00 00 00 00 00 || 40 00 07 30 06 00 00 00 9c 93 95 53 cc ce d6 01
PR_LAST_MODIFICATION_TIME                             || PR_LAST_MODIFICATION_TIME                            
0150: 40 00 08 30 06 00 00 00 00 00 00 00 00 00 00 00 || 40 00 08 30 06 00 00 00 9c 93 95 53 cc ce d6 01
no tag                                                || no tag                                               
0270: 03 00 08 0e 06 00 00 00 00 00 00 00 00 00 00 00 || 03 00 08 0e 06 00 00 00 64 fa 03 00 00 00 00 00
4848

    ????
        @9???7((=p(@00? 

This is part of the result of comparing MsgKit Output and oxmsg output with the same contents.

  • It's reporting that the stream Root Entry/Sh33tJ is in the oxmsg output, but not added by MsgKit (that's a "maker's mark" added by the devs of cfb that doesn't impact functionality). The 4 means that the stream contains 4 bytes, tag: undefined means that there's no meaningful part of a .msg file associated with the stream name and 0000: 37 32 36 32 | 7262 are the contents of the stream in hex and ascii. type 2 just means it's a stream and not a storage.
  • The stream Root Entry/__substg1.0_0E1D001F is contained in both files and identical. Its name is associated with the property tag PR_NORMALIZED_SUBJECT_W (which can be searched in the code to find out about it) and 38 bytes long.
  • Root Entry/__substg1.0_0FFF0102 is contained in both, but different. 72:72 means that both streams are 72 bytes long (if they don't have the same length, it's likely that the content listing below won't line up). The name is associated with PR_ENTRYID which, if looked up in lib/property_tags.js, contains binary data (a GUID in ascii, to be precise, which should be different).
  • Root Entry/__properties_version1.0 is the top-level properties stream, which doesn't have an associated property tag. It contains several different properties that are different. The PR_CLIENT_SUBMIT_TIME property at offset 00c0 is different between the two files (oxmsg set it to 0), as is the creation time and the last modification time. The property at 0270 is a checksum at the end of the stream.

oxmsg's People

Contributors

johnbotris avatar ganthern avatar armhub avatar charlag avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.