multipart-form-data's People

Contributors

darobin, electron-libre, plehegar, reschke

multipart-form-data's Issues

what to do about non-ASCII field names

See https://www.w3.org/Bugs/Public/show_bug.cgi?id=16909

Here is my proposed reply, with some comments.

RFC 2388 was clear:

Field names originally in non-ASCII character sets may be encoded
within the value of the "name" parameter using the standard method
described in RFC 2047.

For reasons I don't understand, browsers did different, incompatible
things.

I think the main advice is:

  • Those creating HTML forms SHOULD use ASCII field names, since deployed HTML processors vary, and field names shouldn't be visible to the user anyway.
  • Those developing server infrastructure to read multipart/form-data uploads SHOULD be aware of the varying behavior of the browsers in translating non-ASCII field names, and look for any of the variants (if they're expecting non-ASCII field names).
  • Those developing browsers should migrate toward a standard encoding, but the server infrastructure will still have to do fuzzy matching for a long while.

What should the browsers migrate to?

http://www.rfc-editor.org/rfc/rfc5987.txt
seems like a more recent proposal and possibly implemented in HTTP anyway.

Sites that use non-ASCII field names and want to work with multiple browsers already have to do fuzzy matching.

The problem is that the fuzzy matchers already deployed might not recognize any NEW encodings.

So I suppose having a name* value would be necessary.
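To make the candidate encodings concrete, here is a minimal Python sketch that produces three forms of the same field name: the raw bytes in the form's charset, an RFC 2047 encoded-word, and an RFC 5987 style extended value of the kind a name* parameter might carry. The function name_variants and the choice of these three variants are assumptions for illustration; this is not a survey of what any browser actually sends.

```python
from email.header import Header
from urllib.parse import quote


def name_variants(name, form_charset="utf-8"):
    """Return a few possible wire forms of a field name (illustrative only)."""
    # Variant 1: the bytes that come out of the form's own encoder.
    raw = name.encode(form_charset)
    # Variant 2: an RFC 2047 encoded-word, as RFC 2388 suggested.
    rfc2047 = Header(name, charset="utf-8").encode().encode("ascii")
    # Variant 3: an RFC 5987 ext-value: charset, empty language tag,
    # then percent-encoded UTF-8 octets.
    rfc5987 = ("UTF-8''" + quote(name, safe="", encoding="utf-8")).encode("ascii")
    return {"raw": raw, "rfc2047": rfc2047, "rfc5987": rfc5987}


if __name__ == "__main__":
    for label, value in name_variants("caf\u00e9").items():
        print(label, value)
```

A sender that adopted name* would presumably still need to send a plain name parameter alongside it, since already-deployed matchers would not look at the new parameter.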

example error needs empty line

  --AaB03x
  content-disposition: form-data; name="field1"
  content-type: text/plain;charset=windows-1250
  content-transfer-encoding: quoted-printable
  Joe owes =80100.
  --AaB03x

You need an empty line before the "Joe owes" (unfortunately, this was turned into a page break in RFC 2388).
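As a sketch of the corrected layout, the following Python helper (build_part is a hypothetical name, not from any library) assembles one part and always emits the empty line that separates the header fields from the body. The header values are the ones from the example above.

```python
CRLF = b"\r\n"


def build_part(boundary, headers, body):
    """Assemble one multipart part; headers is a list of (name, value) pairs."""
    lines = [b"--" + boundary.encode("ascii")]
    lines += [f"{name}: {value}".encode("ascii") for name, value in headers]
    lines.append(b"")    # the empty line that ends the header block
    lines.append(body)   # body bytes, already transfer-encoded if needed
    return CRLF.join(lines) + CRLF


if __name__ == "__main__":
    part = build_part(
        "AaB03x",
        [
            ("content-disposition", 'form-data; name="field1"'),
            ("content-type", "text/plain;charset=windows-1250"),
            ("content-transfer-encoding", "quoted-printable"),
        ],
        b"Joe owes =80100.",
    )
    print(part.decode("ascii"))
```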

Comments from Leigh Klotz

HTML is not the only producer of multipart/form-data. Cleaning up encodings is very important, and citing examples from HTML5 in the RFC is illuminating and can help adoption, but please don't give readers the impression that multipart/form-data is useful only for HTML. Box.net, for example, uses it for their REST-based API, and people struggle with the Google HTTP Client code library because it improperly implements multipart/form-data. A clearer RFC that lays out the responsibility to implement might help adoption.

In XForms we also used multipart/related with an XML body for the first part, containing all instance data (what could be or was bound to input and other controls), and separate parts, referenced by URI from within the first part, for the uploaded-file components. Using JSON for the first part is a trivial change.

html references wrong

There are two HTML4 references, why?

What should the HTML5 reference be? It's non-normative, just to point to where this is used.

What's the holdup?

I ran some simple tests using http://software.hixie.ch/utilities/js/live-dom-viewer/ and http://software.hixie.ch/utilities/cgi/test-tools/echo and I'm wondering why this is taking so long.

This format seems pretty straightforward. As far as I can tell @dthaler is correct. The values of things like name="..." are not specially encoded; they are just the bytes coming out of the encoder (which depends on the encoding of the <form>).

Text entries have no Content-Type header and the others do. multipart/mixed is not to be used.

All we need here is an algorithm that takes a set of form entries and serializes them, and an algorithm that does the reverse. Ideally soon, as we need both of these algorithms in browsers due to service workers (a sort of proxy server).

I'm somewhat tempted to just inline these algorithms and define this format together with its API, just as we already did for application/x-www-form-urlencoded.
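A rough Python sketch of the serializing half, under the observations above: text entries get no Content-Type header, file entries do, names and filenames are written as the bytes from the form's encoder, and multipart/mixed is never used. The names FileEntry and serialize are hypothetical, and the sketch deliberately skips escaping of quotes and line breaks inside names, which a real algorithm would have to define.

```python
import secrets
from typing import NamedTuple


class FileEntry(NamedTuple):
    filename: str
    content_type: str
    data: bytes


def serialize(entries, boundary=None, charset="utf-8"):
    """Serialize (name, value) pairs; value is a str or a FileEntry."""
    boundary = boundary or secrets.token_hex(16)
    delim = b"--" + boundary.encode("ascii")
    out = bytearray()
    for name, value in entries:
        out += delim + b"\r\n"
        out += b'Content-Disposition: form-data; name="' + name.encode(charset) + b'"'
        if isinstance(value, FileEntry):
            # File entries carry a filename and a Content-Type header.
            out += b'; filename="' + value.filename.encode(charset) + b'"\r\n'
            out += b"Content-Type: " + value.content_type.encode("ascii")
            out += b"\r\n\r\n" + value.data
        else:
            # Text entries have no Content-Type header.
            out += b"\r\n\r\n" + value.encode(charset)
        out += b"\r\n"
    out += delim + b"--\r\n"
    return boundary, bytes(out)


if __name__ == "__main__":
    _, body = serialize(
        [("field1", "Joe"), ("pics", FileEntry("a.txt", "text/plain", b"A"))],
        boundary="AaB03x",
    )
    print(body.decode("ascii"))
```

Several files chosen in one control simply become several entries with the same name, so each file ends up in its own part.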

Content-Transfer-Encoding

The text for section 4.7 currently reads:

Previously, it was recommended that senders use a "Content-Transfer-
Encoding" encoding (such ss quoted-printable) for each non-ASCII part
of a multipart/form-data body. This recommendation is "deprecated":
senders MUST NOT send any parts with a content-transfer-encoding
header. No deployed implementations that send such bodies have been
discovered.

But this precludes the use of multipart/form-data on 7-bit only transports, such as default SMTP, wherever non-ASCII form data has to be sent. This requirement, though, makes sense on transports that allow binary data, such as HTTP. Note also that the previous version of this draft was less strict in this respect. Moreover, as senders include forwarding proxies, this requirement unnecessarily applies to proxies. A better word would be "generate", rather than "send".

I would prefer text like the following:

This recommendation is "deprecated": senders MUST NOT generate any parts
with a content-transfer-encoding header field unless the part is being sent via a
7-bit only transport, in which case it may be necessary to use a
transfer encoding such as "base64" or "quoted-printable". No deployed [...]
Note that HTTP is not a 7-bit only transport.
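The proposed rule can be sketched in Python as follows. The function encode_part_body and its seven_bit_transport flag are hypothetical names for illustration; the sketch picks quoted-printable, though base64 would serve equally well, and it ignores line-length limits that a real 7-bit transport may also impose.

```python
import quopri


def encode_part_body(body, seven_bit_transport):
    """Return (content_transfer_encoding or None, bytes to put on the wire)."""
    # On 8-bit clean transports such as HTTP, never generate the header field.
    if not seven_bit_transport:
        return None, body
    # Pure ASCII data needs no transfer encoding even on a 7-bit transport.
    if all(octet < 0x80 for octet in body):
        return None, body
    return "quoted-printable", quopri.encodestring(body)


if __name__ == "__main__":
    data = "Joe owes \u20ac100.".encode("windows-1250")
    print(encode_part_body(data, seven_bit_transport=False))
    print(encode_part_body(data, seven_bit_transport=True))
```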

Note:

Change "such ss quoted-printable" to "such as quoted-printable".

clarify that restricting to ASCII fieldnames

L: * those creating HTML forms SHOULD use ASCII field names,
since deployed HTML processors vary,
and field names shouldn't be visible to the user anyway.

Clarify that this is "to maximize interoperability", i.e., it's not a conformance requirement. Maybe use a "MAY" instead of a "SHOULD".

comments from Alexey

Hi Larry,
Some comments after reviewing your draft:

In Section 2:

As with all multipart MIME types, each part has an optional "Content-
Type", which defaults to "text/plain".  If the contents of a file are
returned via filling out a form, then the file input is identified as
the appropriate media type, if known, or "application/octet-stream".
The inclusion of multiple files returned for a single file input
result in multiple parts, one for each file, with the same name.

I would insert references to where various mentioned media types are
defined.

As with all multipart MIME types, each part has an optional "Content-
Type", which defaults to "text/plain".  If the contents of a file are
returned via filling out a form, then the file input is identified as
the appropriate media type, if known, or "application/octet-stream".
The inclusion of multiple files returned for a single file input
result in multiple parts, one for each file, with the same name.

It took me multiple passes to understand the last sentence. I am not
sure I got it. Can you insert an example or qualify various use of "file"?

3.5. Charset of text in form data

For example, a form with a text field in which a user typed 'Joe owes
<eu>100' where <eu> is the Euro symbol might have form data returned
as:

   --AaB03x
   content-disposition: form-data; name="field1"
   content-type: text/plain;charset=windows-1250
   content-transfer-encoding: quoted-printable
   Joe owes =80100.
   --AaB03x

Are you missing an empty line after "content-transfer-encoding:"? This
doesn't look like a proper MIME fragment.

  1. Media type registration for multipart/form-data
Media Type name:
   multipart
Media subtype name:
   form-data
Required parameters:
   none
Optional parameters:
   none

This doesn't look correct. What about "boundary"?

An example multipart/form-data body would be useful. I had to search the web to
find some.
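On the "boundary" point: a receiver cannot split the body without that parameter, which is why listing no required parameters looks wrong. A small Python sketch using the standard library's email.message.Message to read it from a Content-Type header value (get_boundary is a hypothetical helper name):

```python
from email.message import Message


def get_boundary(content_type):
    """Return the boundary parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_param reads parameters of Content-Type by default and unquotes them.
    return msg.get_param("boundary")


if __name__ == "__main__":
    print(get_boundary("multipart/form-data; boundary=AaB03x"))
    print(get_boundary("multipart/form-data"))
```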

multipart/mixed usage recommend NOT

In particular, this means that multiple files submitted as part of a single element will result in each file having its own field; the "sets of files" feature ("multipart/mixed") of RFC 2388 is not used.

My view is that this use of multipart/mixed now qualifies for a NOT RECOMMENDED.

enumerate how current clients encode non-ascii fields and values

L: "Those developing server infrastructure to read multipart/form-data uploads
SHOULD be aware of the varying behavior of the browsers in translating
non-ASCII field names, and look for any of the variants (if they're
expecting non-ASCII field names)."

H: If the servers have to look for variants, we should define those variants.

Agree, need to find current behavior of deployed browsers.
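Until the deployed variants are actually enumerated, a server-side matcher can only guess. The Python sketch below tries three candidate interpretations of a received name: an RFC 5987 extended value (as a name* parameter might carry), an RFC 2047 encoded-word, and raw bytes in the form's charset. Both the set of variants and the name decode_field_name are assumptions for illustration, not measured browser behavior.

```python
import re
from email.header import decode_header
from urllib.parse import unquote_to_bytes

# charset ' language ' percent-encoded octets
_EXT_VALUE = re.compile(rb"^([^']+)'([^']*)'(.*)$")


def decode_field_name(raw, form_charset="utf-8", extended=False):
    """Best-effort decoding of a field name received as bytes."""
    if extended:
        match = _EXT_VALUE.match(raw)
        if match:
            charset = match.group(1).decode("ascii")
            return unquote_to_bytes(match.group(3)).decode(charset)
    if raw.startswith(b"=?") and raw.endswith(b"?="):
        try:
            chunks = decode_header(raw.decode("ascii"))
        except UnicodeDecodeError:
            chunks = []
        if chunks:
            return "".join(
                c.decode(cs or "ascii") if isinstance(c, bytes) else c
                for c, cs in chunks
            )
    # Fall back to the bytes produced by the form's own encoder.
    return raw.decode(form_charset, errors="replace")


if __name__ == "__main__":
    print(decode_field_name(b"\xc3\xa9"))
    print(decode_field_name(b"UTF-8''%C3%A9", extended=True))
```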

original request

-----Original Message-----
From: Ian Hickson
Sent: Wednesday, February 13, 2013 7:18 AM
To: Larry Masinter
Subject: RFC 2388 (multipart/form-data)

Hey Larry,

Do you know if there is anyone working on fixing RFC2388? People keep 
asking me to update the HTML spec to just define it all inline rather than 
deferring to the RFC since the RFC leaves a lot of stuff underdefined, but 
I don't have the bandwidth to spec all that myself at this point.

e.g.:
   https://www.w3.org/Bugs/Public/show_bug.cgi?id=16909
   https://www.w3.org/Bugs/Public/show_bug.cgi?id=19879

Other feedback:
   http://lists.w3.org/Archives/Public/public-whatwg-archive/2012Oct/0204.html
   http://lists.w3.org/Archives/Public/public-whatwg-archive/2012Jul/0037.html
   http://lists.w3.org/Archives/Public/public-whatwg-archive/2012May/0003.html

Cheers,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

clarify "can" in 3.2

I think some clarification of the "can" in 3.2 is required. Maybe say something like "the handling of file input fields that allow multiple files to be specified varies between browsers. Some send these as sets of files (wrapping all the parts for the files in one multipart/mixed), some just send multiple form-data parts with the same "name" attribute. HTML5 specifies the latter behavior."

(See:) http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#multipart-form-data
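A receiver that wants to tolerate both behaviors could flatten them into the same shape. The Python sketch below parses a body with the standard library's email package and groups payloads by field name, treating a nested multipart/mixed part as a set of files for that name. The function group_parts is a hypothetical helper, and the sketch assumes the boundary has already been taken from the request's Content-Type header.

```python
from collections import defaultdict
from email import message_from_bytes


def group_parts(body, boundary):
    """Map each field name to the list of payloads submitted under it."""
    header = (
        b'Content-Type: multipart/form-data; boundary="'
        + boundary.encode("ascii")
        + b'"\r\n\r\n'
    )
    msg = message_from_bytes(header + body)
    if not msg.is_multipart():
        raise ValueError("body could not be split on the given boundary")
    fields = defaultdict(list)
    for part in msg.get_payload():
        name = part.get_param("name", header="content-disposition")
        if part.get_content_type() == "multipart/mixed":
            # RFC 2388 "sets of files": one wrapper part holding the files.
            for sub in part.get_payload():
                fields[name].append(sub.get_payload(decode=True))
        else:
            # HTML5 behavior: one form-data part per file, same name.
            fields[name].append(part.get_payload(decode=True))
    return dict(fields)
```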
