mailreader's Introduction

mailreader

This module parses RFC 2822 strings. It works on a simplified version of the MIME tree as commonly used in emails, and it builds on email.js components.

Here's how mailreader is intended to be used:

  • Receive a mail with the imap-client and get the body parts you're interested in
  • Give them to mailreader for parsing
  • Done.

The MIME tree abstraction

To avoid the complexity of handling a full-blown MIME tree, mailreader works with a simplified representation of the MIME nodes most commonly used in emails.

html

A MIME node with Content-Type: text/html, without Content-Disposition header. Example:

{
    type: 'html',
    raw: 'Content-Transfer-Encoding: 7bit\r\nContent-Type: text/html;\r\n    charset=us-ascii\r\n\r\n<html><body>asd<img src="cid:20154202-BB6F-49D7-A1BB-17E9937B42B5"></body></html>\r\n',
    content: '<html><body>asd<img src="cid:20154202-BB6F-49D7-A1BB-17E9937B42B5"></body></html>'
}

text

A MIME node with Content-Type: text/plain, without Content-Disposition header. Example:

{
    type: 'text',
    raw: 'Content-Type: text/plain; charset=ISO-8859-1\r\n\r\nasdasd\r\n',
    content: 'asdasd'
}

attachment

A MIME node with Content-Disposition header. Example:

{
    type: 'attachment',
    raw: 'Content-Type: image/jpeg; name="myfile.jpg"\r\nContent-Disposition: attachment; filename="myfile.jpg"\r\nContent-Id: <my-content-identifier>\r\nContent-Transfer-Encoding: base64\r\n\r\n ... (a lot of base64) ... \r\n',
    content: [...], // Uint8Array with binary content
    filename: 'myfile.jpg', // the attachment filename, 'attachment' as fallback if not parseable
    mimeType: 'image/jpeg', // the attachment MIME type, application/octet-stream as fallback if not parseable
    id: 'my-content-identifier' // if `Content-Id` header is present, the id can be used to cross-reference it with html body parts
}
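The `id` field is what allows an attachment to be matched against `cid:` references in an html node. A minimal sketch of that cross-referencing, assuming Node.js (for `Buffer`); the helper name `inlineCidImages` is hypothetical and not part of mailreader:

```javascript
// Hypothetical helper, not part of mailreader: replaces src="cid:..." URLs in
// an html node with data: URIs built from attachments carrying a matching id.
function inlineCidImages(htmlNode, attachments) {
    var byId = Object.create(null);
    attachments.forEach(function(att) {
        if (att.id) {
            byId[att.id] = att;
        }
    });

    // Capture the identifier that follows "cid:" in each src attribute.
    return htmlNode.content.replace(/src="cid:([^"]+)"/g, function(match, id) {
        var att = byId[id];
        if (!att) {
            return match; // leave unknown references untouched
        }
        var base64 = Buffer.from(att.content).toString('base64');
        return 'src="data:' + att.mimeType + ';base64,' + base64 + '"';
    });
}

// Sample nodes shaped like the examples above (contents shortened).
var htmlNode = {
    content: '<img src="cid:my-content-identifier">'
};
var attachmentNode = {
    type: 'attachment',
    content: new Uint8Array([1, 2, 3]),
    mimeType: 'image/jpeg',
    id: 'my-content-identifier'
};
var inlined = inlineCidImages(htmlNode, [attachmentNode]);
```

In a browser, the base64 step would need a different implementation, since `Buffer` is a Node.js API.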

signed

A MIME subtree with Content-Type: multipart/signed. Example:

{
    type: 'signed',
    raw: 'Content-Type: multipart/signed; boundary="Apple-Mail=_C94D8F86-2AA4-4D9A-A975-F51C8A2937B6"; protocol="application/pgp-signature"; micalg=pgp-sha512\r\n\r\n--Apple-Mail=_C94D8F86-2AA4-4D9A-A975-F51C8A2937B6\r\nContent-Transfer-Encoding: 7bit\r\nContent-Type: text/plain;\r\n    charset=us-ascii\r\n\r\nthis is some signed stuff!\r\n\r\n--Apple-Mail=_C94D8F86-2AA4-4D9A-A975-F51C8A2937B6\r\nContent-Transfer-Encoding: 7bit\r\nContent-Disposition: attachment;\r\n    filename=signature.asc\r\nContent-Type: application/pgp-signature;\r\n    name=signature.asc\r\nContent-Description: Message signed with OpenPGP using GPGMail\r\n\r\n-----BEGIN PGP SIGNATURE-----\r\nComment: GPGTools - https://gpgtools.org\r\n\r\niQEcBAEBCgAGBQJTaJgoAAoJEOHUm+Va/GWKreEIAI9qgTBR1SWciKQXduY2ZyY1\r\n3ymKequbFKyoG6gytrIfeAeMJrTZiySXNvOHMlm852fE0vQFWNXtVf2XW0wp8gHL\r\n9X8rpaKtArQHNXWgWN/23+Ea1A0GsyMaxRQxJgj62BEsQsnGUJDgWhq6T5SDZA+h\r\n1ihy12Xvh4F4P//Nt8az2EmWLCv4KbzGp6LVS5jqVxPncuO5mKYZB3yupXnV2nKA\r\nrijmxCTaTJM2tTcTucxNR7hiYTjY6kCpmaTGg9Aq1iy8+hahZ/ZJndzrIMcg+VEA\r\nclbOS6qREijrtuUDLiK58j4w41vRsOmbMOyGQEYNJ7cXQ793/qDPetY4W2ZtRLk=\r\n=iMlU\r\n-----END PGP SIGNATURE-----\r\n\r\n--Apple-Mail=_C94D8F86-2AA4-4D9A-A975-F51C8A2937B6--\r\n',
    signedMessage: 'Content-Transfer-Encoding: 7bit\r\nContent-Type: text/plain;\r\n    charset=us-ascii\r\n\r\nthis is some signed stuff!\r\n\r\n', // the signed MIME part, headers included
    signature: '-----BEGIN PGP SIGNATURE-----\r\nComment: GPGTools - https://gpgtools.org\r\n\r\niQEcBAEBCgAGBQJTaJgoAAoJEOHUm+Va/GWKreEIAI9qgTBR1SWciKQXduY2ZyY1\r\n3ymKequbFKyoG6gytrIfeAeMJrTZiySXNvOHMlm852fE0vQFWNXtVf2XW0wp8gHL\r\n9X8rpaKtArQHNXWgWN/23+Ea1A0GsyMaxRQxJgj62BEsQsnGUJDgWhq6T5SDZA+h\r\n1ihy12Xvh4F4P//Nt8az2EmWLCv4KbzGp6LVS5jqVxPncuO5mKYZB3yupXnV2nKA\r\nrijmxCTaTJM2tTcTucxNR7hiYTjY6kCpmaTGg9Aq1iy8+hahZ/ZJndzrIMcg+VEA\r\nclbOS6qREijrtuUDLiK58j4w41vRsOmbMOyGQEYNJ7cXQ793/qDPetY4W2ZtRLk=\r\n=iMlU\r\n-----END PGP SIGNATURE-----\r\n\r\n',
    content: [{
        type: 'text',
        content: 'this is some signed stuff!'
    }]
}
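Because a signed node carries its children as an array in `content`, code that only wants the leaf nodes has to descend into it. A minimal sketch of such a traversal; `flattenBodyParts` is a hypothetical helper, not a mailreader function:

```javascript
// Hypothetical helper, not part of mailreader: collects all leaf nodes,
// descending into signed nodes, whose content is an array of child nodes.
function flattenBodyParts(bodyParts) {
    var leaves = [];
    bodyParts.forEach(function(node) {
        if (node.type === 'signed' && Array.isArray(node.content)) {
            leaves = leaves.concat(flattenBodyParts(node.content));
        } else {
            leaves.push(node);
        }
    });
    return leaves;
}

// Sample input shaped like the nodes described in this section.
var parsedParts = [{
    type: 'signed',
    content: [{
        type: 'text',
        content: 'this is some signed stuff!'
    }]
}, {
    type: 'attachment',
    filename: 'myfile.jpg'
}];

var textNodes = flattenBodyParts(parsedParts).filter(function(node) {
    return node.type === 'text';
});
```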

encrypted

A MIME subtree with Content-Type: multipart/encrypted. Example:

{
    type: 'encrypted',
    raw: 'Content-Type: multipart/encrypted;\r\n protocol=\"application/pgp-encrypted\";\r\n boundary=\"MrDkNHd70n0CBWqJqodk50MfrlELiXLgn\"\r\n\r\nThis is an OpenPGP/MIME encrypted message (RFC 4880 and 3156)\r\n--MrDkNHd70n0CBWqJqodk50MfrlELiXLgn\r\nContent-Type: application/pgp-encrypted\r\nContent-Description: PGP/MIME version identification\r\n\r\nVersion: 1\r\n\r\n--MrDkNHd70n0CBWqJqodk50MfrlELiXLgn\r\nContent-Type: application/octet-stream; name=\"encrypted.asc\"\r\nContent-Description: OpenPGP encrypted message\r\nContent-Disposition: inline; filename=\"encrypted.asc\"\r\n\r\n-----BEGIN PGP MESSAGE-----\r\nVersion: GnuPG v1.4.13 (Darwin)\r\nComment: GPGTools - https://gpgtools.org\r\nComment: Using GnuPG with Thunderbird - http://www.enigmail.net/\r\n\r\n ... ciphertext goes here ... \r\n=3OkT\r\n-----END PGP MESSAGE-----\r\n\r\n--MrDkNHd70n0CBWqJqodk50MfrlELiXLgn--',
    content: '-----BEGIN PGP MESSAGE-----\r\nVersion: GnuPG v1.4.13 (Darwin)\r\nComment: GPGTools - https://gpgtools.org\r\nComment: Using GnuPG with Thunderbird - http://www.enigmail.net/\r\n\r\n ... ciphertext goes here ... \r\n=3OkT\r\n-----END PGP MESSAGE-----' // PGP ciphertext
}

Let's parse stuff

var mailreader = require('mailreader');
var email = {
    bodyParts: [{
        type: 'text',
        raw: 'Content-Type: text/plain; charset=ISO-8859-1\r\n\r\nasdasd\r\n'
    }]
};

mailreader.parse(email, function(err, bodyParts) {
    if (err) {
        // parsing failed, bodyParts is not usable
        console.error(err);
        return;
    }
    console.log(bodyParts[0].content); // -> 'asdasd'
});
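The callback style above can be wrapped in a Promise if that fits the calling code better. The sketch below is an assumption-laden illustration: `promisifyParse` is a hypothetical helper, and `fakeParse` is a crude stand-in for `mailreader.parse` so that the example runs without the module installed.

```javascript
// Hypothetical helper: wraps any function with the
// (email, callback(err, bodyParts)) shape in a Promise-returning function.
function promisifyParse(parseFn) {
    return function(email) {
        return new Promise(function(resolve, reject) {
            parseFn(email, function(err, bodyParts) {
                if (err) {
                    reject(err);
                } else {
                    resolve(bodyParts);
                }
            });
        });
    };
}

// Stand-in for mailreader.parse, NOT a real parser: it takes everything
// after the first blank line as the content.
function fakeParse(email, callback) {
    var parts = email.bodyParts.map(function(part) {
        return {
            type: part.type,
            raw: part.raw,
            content: part.raw.split('\r\n\r\n')[1].trim()
        };
    });
    setTimeout(function() {
        callback(null, parts);
    }, 0);
}

var parseAsync = promisifyParse(fakeParse);
var pending = parseAsync({
    bodyParts: [{
        type: 'text',
        raw: 'Content-Type: text/plain; charset=ISO-8859-1\r\n\r\nasdasd\r\n'
    }]
});
pending.then(function(bodyParts) {
    console.log(bodyParts[0].content); // -> 'asdasd' with the stand-in
});
```

With the real module, the wrapper would presumably be created as `promisifyParse(mailreader.parse.bind(mailreader))`.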

Multithreading

To offload the mail parsing to a web worker, call mailreader.startWorker(path). mailreader has to load dependencies, so there are two options:

  • Use startWorker('[PATH]/mailreader-parser-worker.js') as the entry point for the web worker to load the email.js dependencies via AMD, or
  • Build mailreader-parser-worker-browserify.js with browserify and pass the browserified file to startWorker('[PATH]/[BROWSERIFIED-FILE].js')

Get your hands dirty

Run the following commands to get started:

npm install && grunt

mailreader's People

Contributors

andris9, c-f-h, felixhammerl, tanx

mailreader's Issues

Content-Transfer-Encoding:8bit

Does mailreader support Content-Transfer-Encoding:8bit?

I'm getting reports (mailvelope/mailvelope#6 (comment)) about wrong decoding of umlauts if transfer encoding 8bit was used.

Not sure if this is the right test setup, but I created a rawText:

Content-Type:text/plain; charset="UTF-8"
Content-Transfer-Encoding:8bit

äöü

and after calling mailreader.parse([{raw: rawText}], function(parsed) { ... }), the result of textParts[0].content with mailreader v0.4.2 is:

���
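One plausible cause, which this report does not confirm, is that the body bytes were taken from a JavaScript string one code unit per byte and then decoded as UTF-8. The sketch below reproduces three replacement characters that way using the standard `TextEncoder` and `TextDecoder` globals. It is an illustration of the symptom, not a diagnosis of mailreader's code.

```javascript
// 'äöü' written with escapes so the example does not depend on file encoding.
var umlauts = '\u00e4\u00f6\u00fc';

// Correct round trip: encode to UTF-8 bytes (two per umlaut), decode as UTF-8.
var utf8Bytes = new TextEncoder().encode(umlauts);
var roundTrip = new TextDecoder('utf-8').decode(utf8Bytes);

// Lossy path: one byte per UTF-16 code unit (0xE4, 0xF6, 0xFC), which is the
// Latin-1 representation, followed by decoding those bytes as UTF-8.
var latin1Bytes = new Uint8Array(umlauts.length);
for (var i = 0; i < umlauts.length; i++) {
    latin1Bytes[i] = umlauts.charCodeAt(i) & 0xff;
}
var garbled = new TextDecoder('utf-8').decode(latin1Bytes);

console.log(roundTrip === umlauts); // -> true
console.log(garbled); // three U+FFFD replacement characters
```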

error parsing attachment name

[ERROR][2014-10-19T20:55:27.571Z] Error handling web worker: Line 192 in https://mail.whiteout.io/lib/mailreader-parser.js: TypeError: undefined is not an object (evaluating 'node.headers['content-disposition'][0]')
onerror@https://mail.whiteout.io/lib/mailreader.js:63:34
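The message indicates that `node.headers['content-disposition']` was undefined when it was indexed. A defensive lookup could look like the sketch below. The helper name `attachmentFilename` is hypothetical, and the header shape (arrays of objects with a `params` map) is an assumption inferred from the expression in the error. This is not the fix that was applied in mailreader.

```javascript
// Hypothetical helper: reads the attachment filename without assuming that a
// Content-Disposition header exists. The headers shape is an assumption.
function attachmentFilename(node) {
    var headers = (node && node.headers) || {};
    var disposition = (headers['content-disposition'] || [])[0];
    var contentType = (headers['content-type'] || [])[0];

    var fromDisposition = disposition && disposition.params && disposition.params.filename;
    var fromContentType = contentType && contentType.params && contentType.params.name;

    // 'attachment' is the fallback filename described in the README.
    return fromDisposition || fromContentType || 'attachment';
}

var withoutDisposition = attachmentFilename({
    headers: {
        'content-type': [{ value: 'image/jpeg', params: { name: 'myfile.jpg' } }]
    }
});
var withoutHeaders = attachmentFilename({ headers: {} });
```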

Empty text nodes

A bit of an edge case, but parsing a message with an empty body:

Content-Type: multipart/signed;
    boundary="Apple-Mail=_2FA6320C-6453-4AF8-B142-BB0B5A86BF5D";
    protocol="application/pgp-signature";
    micalg=pgp-sha512


--Apple-Mail=_2FA6320C-6453-4AF8-B142-BB0B5A86BF5D
Content-Transfer-Encoding: 7bit
Content-Type: text/plain


--Apple-Mail=_2FA6320C-6453-4AF8-B142-BB0B5A86BF5D
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
    filename=signature.asc
Content-Type: application/pgp-signature;
    name=signature.asc
Content-Description: Message signed with OpenPGP using GPGMail

-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - https://gpgtools.org

iQIcBAEBCgAGBQJVwfvFAAoJEAf+2SAqAy1Ei5cQAKY2d9irduVuZOIN1SSJvxJG
f12BB5J1AMvwrN9UCsJxTydO3DlmLr2e9jMDULNOmfR7H+MHKxAGnKez9lXQ0uyA
n83086+bYNyBHD1DWq/BNdIsrp8kphEbqRjRPLzWmKeKL9KQnls5dC0+UKn4hwwe
5g7V4ko9Ps8P6qZGp2KSEhfJ2hJnJWhj1EPR5q/rAumFv/QQiki3DpKh/kTZx+ZH
ADqWhs/JxRJaKBQ0StVgZW/kpZ0Tp72fob7lw/mNfRttjZhiyjZn7xetaZNAViWa
ZEm96rDnawVO2TWY1M2GH4o+ZkloJeoc14AyXWaOR7eiTJuHYzHu0BsNQRJLS7kM
0UasJDYfnVt4TZyA42Qj2RlYFex4jPFUX8DZIzbEaYr2XpmrezbdQuGJTy9Cpdph
f0U6wUVy2bYlsOoFmJJKq/2G0RAQN1tQzs8rBCcgBZEc/xTBh43iA7g9We5mxC95
3bC9GBSBjfe1eInBJ+95Kv3t7TIgbyrKArhWAasrGrN6a8oKVDk3SY0KUTN4YT3+
r/LqLf4HOekJ3VmPW3W+2aLGInMAKJjXWng7WQ5iEnwe9Bjyf1IOg8pvbwlX9PVu
QClkjbD7rD3I4UC4VMtkPK3Xl18UZ91fisrbTqRU80wyhDWxaMWTL0urSAYWZp1h
kCqpmgzbyeeLeDfT7bAs
=nHSm
-----END PGP SIGNATURE-----

--Apple-Mail=_2FA6320C-6453-4AF8-B142-BB0B5A86BF5D--

leads to the exception: Failed to execute 'decode' on 'TextDecoder': The provided value is not of type '(ArrayBuffer or ArrayBufferView)'
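`TextDecoder.decode` only accepts an `ArrayBuffer` or a typed array view, so the exception suggests that the empty text part produced some other value. A minimal sketch of a guard follows, assuming an empty node should simply yield an empty string. `decodeText` is a hypothetical helper, and what mailreader actually passes for an empty body is not stated in this report.

```javascript
// Hypothetical guard: decode only when there are bytes to decode.
function decodeText(content, charset) {
    var isBinary = content instanceof ArrayBuffer || ArrayBuffer.isView(content);
    if (!isBinary) {
        return ''; // nothing decodable, treat the node as empty text
    }
    return new TextDecoder(charset || 'utf-8').decode(content);
}

var emptyText = decodeText(undefined, 'us-ascii');
var shortText = decodeText(new Uint8Array([104, 105]), 'utf-8'); // bytes for 'hi'
```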

Mailreader issues

Mailreader fails to parse emails received via imap-client when only a text body part is provided and no html part. The resulting error crashes imap-client.
