Multipurpose Internet Mail Extensions (MIME)

Multipurpose Internet Mail Extensions (MIME) is an Internet standard that extends the format of email messages to support the following:

Text in character sets other than ASCII
Non-text attachments: audio, video, images, application programs, etc.
Message bodies with multiple parts
Header information in non-ASCII character sets

MIME is specified in the following six linked RFCs:

RFC 2045: Part One: Format of Internet Message Bodies
RFC 2046: Part Two: Media Types
RFC 2047: Part Three: Message Header Extensions for Non-ASCII Text
RFC 4288 (obsoleted by RFC 6838): Media Type Specifications and Registration Procedures
RFC 4289: Part Four: Registration Procedures
RFC 2049: Part Five: Conformance Criteria and Examples

Although MIME was designed mainly for SMTP, the content types defined by the MIME standards are also important in communication protocols outside of email, such as HTTP for the World Wide Web. Web servers typically send a Content-Type header at the start of each HTTP response. Clients use this content type (or media type) header to select an appropriate viewer application for the kind of data the header indicates. Some of these viewers are built into the Web client or browser (for example, almost all browsers come with GIF and JPEG image viewers as well as the ability to handle HTML files).
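The way a client might map a content type to a viewer can be sketched in Python with the standard library's email package. The VIEWERS table and the function names parse_content_type and choose_viewer are hypothetical, introduced only for this illustration; real browsers use far more elaborate rules.

```python
from email.message import Message


def parse_content_type(value):
    """Split a Content-Type header value into (type, subtype, parameters)."""
    holder = Message()
    holder["Content-Type"] = value
    # get_params() returns the media type first, then the parameters.
    params = dict(holder.get_params()[1:])
    return holder.get_content_maintype(), holder.get_content_subtype(), params


# Hypothetical dispatch table; "*" stands for "any subtype".
VIEWERS = {
    ("text", "html"): "HTML renderer",
    ("image", "*"): "built-in image viewer",
}


def choose_viewer(value):
    """Pick a viewer for a Content-Type value, falling back to saving the file."""
    maintype, subtype, _ = parse_content_type(value)
    exact = VIEWERS.get((maintype, subtype))
    return exact or VIEWERS.get((maintype, "*"), "offer to save the file")


print(parse_content_type("text/html; charset=UTF-8"))
print(choose_viewer("image/gif"))
print(choose_viewer("application/zip"))
```

An exact type/subtype match is tried first, then a wildcard on the top-level type, which mirrors the two-level structure of media types.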

The MIME headers consist of the following:

MIME-Version: The presence of this header indicates the message is MIME-formatted.
Content-Type: Indicates the media type of the message content, consisting of a type and subtype.
Content-Disposition: Contains the presentation style, which is either inline (the part should be displayed automatically) or attachment (the part is not displayed automatically and requires user action to open). It also provides fields for the file's name and its creation and modification dates.
Content-Transfer-Encoding: Indicates whether a binary-to-text encoding scheme has been applied on top of the original encoding and, if so, which one. The defined values are "7bit", "8bit", "binary", "quoted-printable" and "base64".
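As an illustration, the four headers above can be observed on a message built with Python's standard email package. This is a minimal sketch; the addresses, subject, file name and attachment bytes are made-up values.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"   # made-up address
msg["To"] = "bob@example.com"       # made-up address
msg["Subject"] = "Report"
msg.set_content("See the attached file.")

# Adding a binary attachment turns the message into multipart/mixed.
msg.add_attachment(
    b"\x00\x01\x02",
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",            # made-up file name
)

attachment = next(msg.iter_attachments())

print("MIME-Version:", msg["MIME-Version"])
print("Top-level Content-Type:", msg.get_content_type())
print("Attachment Content-Type:", attachment.get_content_type())
print("Attachment Content-Disposition:", attachment.get_content_disposition())
print("Attachment file name:", attachment.get_filename())
print("Attachment Content-Transfer-Encoding:", attachment["Content-Transfer-Encoding"])
```

The library chooses base64 for the binary part on its own; the caller only supplies the raw bytes and the media type.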

Encoded-Word:

The MIME encoded-word syntax is defined in RFC 2047 as mentioned above. It uses a string of ASCII characters indicating both the original encoding (the charset) and the content-transfer-encoding used to map the bytes of the charset into ASCII characters.

The form is: "=?charset?encoding?encoded text?=".

charset may be any character set registered with the IANA. Typically it would be the same charset as the message body.
encoding can be either "Q" (denoting Q-encoding) or "B" (denoting base64 encoding).
encoded text is the Q-encoded or base64-encoded text.
An encoded-word may not be more than 75 characters long, including charset, encoding, encoded text, and delimiters. If more is needed, multiple encoded-words (separated by CRLF SPACE) may be used.
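The encoded-word form can be sketched in Python as follows. The helper names encode_word_b and decode_words are hypothetical; decode_header comes from the standard library's email.header module. The encoder handles only the "B" encoding and simply rejects words over the 75-character limit rather than splitting them.

```python
import base64
from email.header import decode_header


def encode_word_b(text, charset="utf-8"):
    """Build one "B" (base64) encoded-word: =?charset?B?encoded text?="""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    word = f"=?{charset}?B?{payload}?="
    if len(word) > 75:
        # Longer text would have to be split across several encoded-words.
        raise ValueError("encoded-word exceeds 75 characters")
    return word


def decode_words(value):
    """Decode a header value that may contain Q- or B-encoded words."""
    pieces = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            pieces.append(data.decode(charset or "ascii"))
        else:
            pieces.append(data)
    return "".join(pieces)


word = encode_word_b("Grüße")
print(word)
print(decode_words(word))
# The same text, Q-encoded in ISO-8859-1.
print(decode_words("=?iso-8859-1?Q?Gr=FC=DFe?="))
```

Both encodings decode to the same text, which shows that the charset and the transfer encoding are independent choices carried inside the encoded-word itself.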

Multipart Messages:

A MIME multipart message declares a boundary string in the boundary parameter of its Content-Type header. This boundary, which must not occur in any of the parts, is placed between the parts and at the beginning and end of the body of the message. Each occurrence is written on its own line, prefixed with two hyphens.

Each part consists of its own content header (zero or more Content- header fields) and a body. Multipart content can be nested. The content-transfer-encoding of a multipart type must always be "7bit", "8bit" or "binary" to avoid the complications that would be posed by multiple levels of decoding. The multipart block as a whole does not have a charset; non-ASCII characters in the part headers are handled by the Encoded-Word system, and the part bodies can have charsets specified if appropriate for their content-type.

Notes:

The area before the first boundary (the preamble) is ignored by MIME-compliant clients. It is generally used to carry a message for users of old non-MIME clients.
It is up to the sending mail client to choose a boundary string that doesn't clash with the body text. Typically this is done by inserting a long random string.
The last boundary must have two hyphens at the end.
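The layout described in this section can be sketched in Python by assembling a small multipart message by hand and parsing it with the standard email package. The helper choose_boundary, the "=_" prefix and the sample bodies are assumptions made for this illustration, not part of the MIME specification.

```python
import email
import secrets
from email import policy


def choose_boundary(bodies):
    """Pick a long random boundary that does not occur in any part body."""
    while True:
        candidate = "=_" + secrets.token_hex(16)
        if not any(candidate in body for body in bodies):
            return candidate


text_body = "This is the body of the message."
binary_body = "AAEC"  # base64 for the three bytes 0x00 0x01 0x02

boundary = choose_boundary([text_body, binary_body])

raw = "\r\n".join([
    "MIME-Version: 1.0",
    f'Content-Type: multipart/mixed; boundary="{boundary}"',
    "",
    # Preamble: ignored by MIME clients, visible in old non-MIME clients.
    "This is a message with multiple parts in MIME format.",
    f"--{boundary}",
    "Content-Type: text/plain",
    "",
    text_body,
    f"--{boundary}",
    "Content-Type: application/octet-stream",
    "Content-Transfer-Encoding: base64",
    "",
    binary_body,
    f"--{boundary}--",  # the last boundary ends with two extra hyphens
    "",
])

msg = email.message_from_string(raw, policy=policy.default)
parts = list(msg.iter_parts())

print("Multipart:", msg.is_multipart())
print("Preamble:", msg.preamble)
for part in parts:
    print(part.get_content_type(), repr(part.get_content()))
```

Only the leaf parts carry a base64 transfer encoding here; the multipart container itself is left unencoded, in line with the rule that a multipart type uses "7bit", "8bit" or "binary".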