Comments (5)

dersam commented on July 23, 2024

You're right, it's a character encoding issue - probably not noticeable for text attachments, but would blow up dramatically for any binary files.

Even though we could add a parameter for converting the encoding, the problem I see is that RT doesn't report the original encoding. Since RT instances can have attachments uploaded by an arbitrary user on an arbitrary system, there's no guarantee that the user of RTPHPLib would know it. It seems like any implementation based on that would be very prone to breaking.

I'm open to discussion on this, but I think the way forward is to keep requiring that complex attachments be requested separately, as that sidesteps any encoding issues. I do think there's a place for a convenience wrapper that could lazy-load the attachment content as needed, so it's not all pulled down at once.
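
A rough sketch of what such a lazy-loading wrapper might look like, written in Python for brevity rather than as RTPHPLib code. `LazyAttachment` and `fetch_content` are hypothetical names; `fetch_content` stands in for whatever call retrieves the content separately (for example something built on getAttachmentContent).

```python
from typing import Callable, Optional


class LazyAttachment:
    """Attachment metadata plus content that is fetched only on first access."""

    def __init__(self, metadata: dict, fetch_content: Callable[[], bytes]):
        self.metadata = metadata
        # Hypothetical callable wrapping the separate content request.
        self._fetch_content = fetch_content
        self._content: Optional[bytes] = None
        self._loaded = False

    @property
    def content(self) -> bytes:
        # The content request is issued once, the first time it is needed.
        if not self._loaded:
            self._content = self._fetch_content()
            self._loaded = True
        return self._content
```

Listing attachments then stays cheap, since only the metadata is held until `content` is read.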

from rtphplib.

dersam commented on July 23, 2024

This is actually by design, due to the way RT handles attachment content in the API. getAttachment fetches the attachment metadata. The response may include a Content key, but in the past that has been unreliable for binary content (as you've discovered); I think RT truncates the data in the response above a certain length.

RT provides an additional endpoint to fetch solely the attachment content, which is implemented in getAttachmentContent. This one will reliably acquire the attachment content.

I actively decided not to wrap these in a single call, as attachments could be quite large. This would mean that getting info about the attachments could be very slow if the content was also being fetched. However, there's no reason that a method to do that couldn't be added alongside the existing items.

Thoughts, @petski @ParisLiakos ?

petski commented on July 23, 2024

I think I found a useful hint here: https://rt-wiki.bestpractical.com/wiki/REST#Ticket_Attachment

NOTE: RT returns the content indented with 9 spaces on each line, so that it lines up with the "Content:" header. Even if you strip this out with a regexp, the content is still UTF-8, which is probably not what you want. To get the original binary data back, strip out the 9 spaces with a regexp, strip off the 3 carriage returns at the end, and then convert the whole thing from UTF-8 to the native character encoding of the attachment, whatever that is. RT doesn't tell you, so you have to know. If the attachments were uploaded by a U.S. Windows system, odds are that Windows-1252 is what you want. If you can't get the binary back intact, see the next method below.
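
The steps in that note can be sketched roughly as follows, in Python rather than PHP. This is only an illustration under assumptions: `recover_attachment_bytes` is a hypothetical helper, the input is the Content field already decoded as UTF-8 with the "Content: " label removed, the three trailing line breaks are taken to be "\n" characters, and Windows-1252 is just the default guess the note suggests.

```python
INDENT = " " * 9  # width of the "Content: " label that RT aligns against


def recover_attachment_bytes(content_field: str, native_encoding: str = "cp1252") -> bytes:
    """Undo RT's indentation and re-encode into the assumed original encoding."""
    # Strip the three line breaks RT appends (assumed here to be "\n").
    if content_field.endswith("\n\n\n"):
        content_field = content_field[:-3]

    first, *rest = content_field.split("\n")
    # Remove exactly 9 leading spaces from continuation lines, nothing more.
    rest = [line[len(INDENT):] if line.startswith(INDENT) else line for line in rest]

    text = "\n".join([first, *rest])
    # Convert from the decoded text back to the attachment's native encoding.
    return text.encode(native_encoding)
```

If the guessed encoding is wrong, the `encode` call either raises or silently yields different bytes, which is the fragility discussed in this thread.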

ParisLiakos commented on July 23, 2024

I am not sure whether it's the API's fault or not, but I also found getAttachmentContent() much more reliable, and frankly I didn't look into it further and just use that all the time.

petski commented on July 23, 2024

In the current version (v1.3.2), the following piece of content would return "Lorum Ipsumdolor sitamet" instead of the desired "Lorum Ipsum dolor sit amet".

Content: Lorum Ipsum
          dolor sit 
         amet

This is because the "greedy" trim() is used in parseResponseBody(). This behavior corrupts binary data whenever a line begins or ends with "extra" whitespace.
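
To make the difference concrete, here is a small Python illustration (not the actual parseResponseBody() code, and not the patch itself). `join_greedy` and `join_preserving` are hypothetical names, and joining the lines without a separator is an assumption taken from the example output above.

```python
INDENT = " " * 9  # width of the "Content: " label


def join_greedy(lines):
    # Mimics a trim() on every line: all leading and trailing whitespace is lost.
    return "".join(line.strip() for line in lines)


def join_preserving(lines):
    # Removes only the fixed 9-space indent and keeps every other character.
    return "".join(
        line[len(INDENT):] if line.startswith(INDENT) else line
        for line in lines
    )


# The Content field from the example, with the "Content: " label already removed.
field_lines = ["Lorum Ipsum", "          dolor sit ", "         amet"]

print(join_greedy(field_lines))      # Lorum Ipsumdolor sitamet
print(join_preserving(field_lines))  # Lorum Ipsum dolor sit amet
```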

A patch for this situation will be posted shortly after this message. With this patch, I managed to:

  • Have the strlen() of $getAttachment["Content"] the same as the strlen() of the original image
  • Have the strlen() of $getAttachmentContent the same as the strlen() of the original image
  • Have the strlen() of $getAttachment["Content"] the same as $getAttachment["Headers"] => Content-Length
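
For anyone repeating these checks, the three comparisons could be expressed roughly like this, again in Python as an illustration. `lengths_agree` is a hypothetical helper; its arguments stand for the original file's bytes, the Content value from getAttachment, the result of getAttachmentContent, and the Content-Length header value.

```python
def lengths_agree(original: bytes,
                  metadata_content: bytes,
                  fetched_content: bytes,
                  content_length_header: str) -> bool:
    """Return True when all three length checks from the list above hold."""
    expected = len(original)
    return (
        len(metadata_content) == expected
        and len(fetched_content) == expected
        and int(content_length_header) == len(metadata_content)
    )
```

Matching lengths are a quick sanity check only; comparing the bytes themselves (or a hash of them) would be a stricter test.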

I'm pretty confident my patch fixes this issue, but I surely hope you guys can do some extra tests as well.

P.S. getAttachment() returns a "ContentEncoding" key.
