Comments (12)

Godlance commented on May 31, 2024

I solved this, or at least, found a workaround.

Surround your call with a try and except like this:

message_data = b'\r\n'.join(lines)

try:
    # First attempt: parse the raw bytes directly.
    mail = mailparser.parse_from_bytes(message_data[b"RFC822"])
except Exception as e:
    print('This mail has Cyrillic characters. Trying to parse from string...')
    try:
        # Fallback: decode explicitly, then use the string parser.
        mail = mailparser.parse_from_string(message_data[b"RFC822"].decode('ISO-8859-1'))
    except Exception as e:
        print('This mail is corrupted and cannot be parsed: %s' % str(e))

This way, if the bytes parser fails it will fall back to the string parser and you can change the encoding.

I've been able to parse every single mail thrown at my server this way.
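The fallback described above can be wrapped in a small helper. This is a minimal sketch, not part of mail-parser: `parse_with_fallback` and its parameters are hypothetical names, and the two parsers are passed in as arguments so the same logic works with `mailparser.parse_from_bytes` and `mailparser.parse_from_string` or, as in the demo below, with the standard library's `email` module.

```python
import email


def parse_with_fallback(raw, parse_bytes, parse_string,
                        encodings=("utf-8", "ISO-8859-1")):
    """Try the bytes parser first, then the string parser per encoding.

    Hypothetical helper: raw is the message as bytes; parse_bytes and
    parse_string are the parser callables to try.
    """
    try:
        return parse_bytes(raw)
    except Exception as bytes_error:
        for encoding in encodings:
            try:
                return parse_string(raw.decode(encoding))
            except Exception:
                continue  # wrong encoding or parser failure: try the next one
        # Every fallback failed: report the original error.
        raise bytes_error


def failing_bytes_parser(raw):
    """Stand-in for a bytes parser that rejects the message."""
    raise TypeError("simulated bytes-parser failure")


raw_message = "Subject: Привет\n\nbody\n".encode("utf-8")
message = parse_with_fallback(raw_message, failing_bytes_parser,
                              email.message_from_string)
print(message["Subject"])
```

With mail-parser, the call would presumably be `parse_with_fallback(message_data[b"RFC822"], mailparser.parse_from_bytes, mailparser.parse_from_string)`. ISO-8859-1 is best kept last in the list: decoding with it never fails, so it always produces a string, but non-ASCII text may come out garbled if the message was really in another encoding.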

from mail-parser.

yatakoi commented on May 31, 2024

Hi. Thank you!

Where should I paste this code?

fedelemantuano commented on May 31, 2024

Maybe this snippet works only for Python 3. Can you do a PR here?

yatakoi commented on May 31, 2024

Sorry, but what is a PR?

My script works for Python 3.

fedelemantuano commented on May 31, 2024

It's a Pull Request: https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests

Godlance commented on May 31, 2024

Sorry, it seems I wasn't receiving notifications for this issue correctly.

@yatakoi

In your main.py, line 54, you have the following code:

mail = mailparser.parse_from_bytes(message_data[b"RFC822"])

Replace that line with the snippet I wrote, omitting the message_data = b'\r\n'.join(lines) line.

@fedelemantuano

I did not modify mail-parser, just coded a workaround that goes in my app code. I've never done a PR before so I'm not sure I can help, but if it may solve this issue for everyone I could try.

fedelemantuano commented on May 31, 2024

The develop branch doesn't have this issue.
I will release the new version soon.

$ python3.9 -m mailparser -f ~/Downloads/mail_raw -sa -ap ~/Downloads/test

fechnert commented on May 31, 2024

This issue still seems to occur with mail-parser==3.15.0 and German umlauts like ä, ü, or ö, or wrongly decoded strings like ü.

@fedelemantuano was this issue fixed with version 3.15.0?
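"Wrongly decoded" here presumably refers to mojibake, for example UTF-8 bytes read as Latin-1 (ISO-8859-1). A small illustration using only built-in string methods, independent of mail-parser:

```python
original = "ü"

# UTF-8 stores "ü" as two bytes.
utf8_bytes = original.encode("utf-8")

# Reading those two bytes as Latin-1 yields two unrelated characters.
mojibake = utf8_bytes.decode("latin-1")

# The damage is reversible as long as no bytes were lost on the way.
repaired = mojibake.encode("latin-1").decode("utf-8")

print(utf8_bytes, mojibake, repaired)
```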


How to reproduce

Raw email data:

Subject: foobar
To: foobar@example
From: [email protected]
Content-Type: multipart/mixed; boundary=somecontent

--somecontent
Content-Disposition: attachment; filename="Liste übersprungener 1.txt"
Content-Transfer-Encoding: base64
Content-Type: text/plain; charset=utf-8; name="Liste übersprungener 1.txt"

c3R1ZmY=
--somecontent--

Ready to use snippet:

import mailparser

_header = b'Subject: foobar\nTo: foobar@example\nFrom: [email protected]\nContent-Type: multipart/mixed; boundary=somecontent'
_body = b'--somecontent\nContent-Disposition: attachment; filename="Liste \xc3\xbcbersprungener 1.txt"\nContent-Transfer-Encoding: base64\nContent-Type: text/plain; charset=utf-8; name="Liste \xc3\xbcbersprungener 1.txt"\n\nc3R1ZmY=\n--somecontent--\n'


mailparser.parse_from_bytes(_header + b'\n\n' + _body)

Output:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../mailparser/mailparser.py", line 118, in parse_from_bytes
    return MailParser.from_bytes(bt)
  File ".../mailparser/mailparser.py", line 241, in from_bytes
    return cls(message)
  File ".../mailparser/mailparser.py", line 138, in __init__
    self.parse()
  File ".../mailparser/mailparser.py", line 357, in parse
    content_disposition = ported_string(
  File ".../mailparser/utils.py", line 80, in wrapper
    return normalize('NFC', func(*args, **kwargs))
  File ".../mailparser/utils.py", line 114, in ported_string
    return six.text_type(raw_data, encoding)
TypeError: decoding to str: need a bytes-like object, Header found
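The `Header found` in the last line hints at where the object comes from. The sketch below uses only the standard library's `email` package, on the assumption (not verified against mail-parser's source) that mail-parser reads headers through it with the default `compat32` policy. Under that policy, a header containing raw non-ASCII bytes is returned as an `email.header.Header` object rather than a `str`, and `str(obj, encoding)` rejects anything that is not bytes-like.

```python
import email
from email.header import Header

# Shortened version of the raw message from the report above.
raw = (
    b"Subject: foobar\n"
    b"Content-Type: multipart/mixed; boundary=somecontent\n"
    b"\n"
    b"--somecontent\n"
    b'Content-Disposition: attachment; filename="Liste \xc3\xbcbersprungener 1.txt"\n'
    b"Content-Transfer-Encoding: base64\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"\n"
    b"c3R1ZmY=\n"
    b"--somecontent--\n"
)

# Take the single attachment part and read its Content-Disposition header.
part = email.message_from_bytes(raw).get_payload()[0]
disposition = part.get("content-disposition")
print(type(disposition))

# Decoding a non-bytes object with an explicit encoding raises TypeError.
try:
    str(disposition, "utf-8")
    error = None
except TypeError as exc:
    error = str(exc)
print(error)
```

If that assumption holds, the failure is triggered by the unencoded `ü` in the `filename` parameter rather than by the attachment body.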

fedelemantuano commented on May 31, 2024

Please send me the raw mail, I can't test it from your snippet.

fechnert commented on May 31, 2024

GitHub won't let me upload *.eml files, so I simply renamed it to .txt: mail.txt

import mailparser
with open('mail.txt', 'rb') as infile:
    text = infile.read()
mailparser.parse_from_bytes(text)

Returns the same issue as mentioned above.

fechnert commented on May 31, 2024

Any progress on this?

fedelemantuano commented on May 31, 2024

I'm working on it. I will answer soon.
