Files
Rodolfo García Peñas (kix) 0185b11255 Fix UnicodeEncodeError on emails with malformed bytes
When fetching emails with defects or malformed bytes (e.g., spam or broken
MIME boundaries), the Python 3 `email.parser` handles un-decodable bytes
by safely substituting them with the Unicode Replacement Character `\ufffd`
(using the `errors='replace'` handler by default).

However, a problem arises during the `_fetch_from_imap` sync process when
OfflineIMAP3 tests if the message can be serialized back to bytes via
`as_bytes()`. If the malformed text part was originally declared as
`us-ascii` (or left unspecified, defaulting to ASCII), Python attempts to
encode the `\ufffd` string back to bytes using the ASCII codec. Since `\ufffd`
is out of the ASCII range, this triggers a `UnicodeEncodeError`, causing
OfflineIMAP3 to raise an `OfflineImapError` and entirely skip syncing the
message.

This commit fixes the issue by catching the `UnicodeEncodeError` when
`as_bytes()` fails. It then walks through the message parts, identifies
the text payloads that cannot be encoded with their current charset, and
dynamically forces their charset to `utf-8`. This allows Python to safely
encode the `\ufffd` character (automatically applying base64/quoted-printable
transfer encoding if needed), successfully serializing the message so it
can be synced without crashing.

Closes #240
Closes #229
Closes #224
Closes #160
2026-04-18 23:10:29 +02:00
..
2020-11-01 13:12:03 +01:00