When utf_8_support is False (the default, standard RFC 3501 mode),
folder names received from the server are in Modified UTF-7 and must
be kept in that encoding internally. When utf_8_support is True,
names are decoded to UTF-8 for internal use.
Previous code decoded unconditionally in IMAPFolder.__init__, which
would corrupt non-ASCII names received as Modified UTF-7 when
utf_8_support is False. The inverse problem existed in
getfullIMAPname() and the three encode_mailbox_name() call sites in
IMAPRepository: they always converted UTF-8 → Modified UTF-7 before
sending to the server, which is wrong when utf_8_support is False
(names are already in Modified UTF-7 and must not be double-encoded).
Fix by applying the same conditional pattern consistently:
    if account.utf_8_support:
        name = imaputil.utf8_IMAP(name)  # UTF-8 → Modified UTF-7
    return imaputil.foldername_to_imapname(name)
This is applied in:
- IMAPFolder.__init__ (decode on receive)
- IMAPFolder.getfullIMAPname (encode before SELECT)
- IMAPRepository.getfolders (folderincludes SELECT)
- IMAPRepository.deletefolder
- IMAPRepository.makefolder_single
encode_mailbox_name() (which always assumed UTF-8 input) is removed
as it is no longer used anywhere.
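A minimal, self-contained sketch of why the conversion has to be conditional.
`encode_mutf7` and `to_wire_name` are hypothetical stand-ins written for this
example; the real code calls imaputil.utf8_IMAP and
imaputil.foldername_to_imapname, which also take care of quoting.

```python
import base64


def encode_mutf7(name):
    """Encode a str as IMAP Modified UTF-7 (RFC 3501, section 5.1.3)."""
    out = []
    pending = []  # characters waiting to be base64-encoded

    def flush():
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            # Modified base64: no padding, ',' instead of '/'.
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)


def to_wire_name(name, utf_8_support):
    # Only convert when names are held as UTF-8 internally. With
    # utf_8_support False the name is already Modified UTF-7.
    if utf_8_support:
        name = encode_mutf7(name)
    return name


encoded_once = to_wire_name("Entwürfe", utf_8_support=True)
kept_as_is = to_wire_name("Entw&APw-rfe", utf_8_support=False)
double_encoded = encode_mutf7("Entw&APw-rfe")  # what the old code did
```

Here `encoded_once` and `kept_as_is` are both "Entw&APw-rfe", while
`double_encoded` becomes "Entw&-APw-rfe", which names a different mailbox.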
Based on patch by Etienne Buira <etienne.buira@free.fr>
This commit refactors the IMAP folder handling in the offlineimap repository to use the `encode_mailbox_name` function for encoding folder names. This change ensures that folder names are properly encoded when interacting with the IMAP server, improving compatibility and reliability when dealing with folder names that may contain special characters or non-ASCII characters. The `encode_mailbox_name` function combines UTF-8 encoding and quoting as needed, providing a consistent way to handle folder names across the codebase.
When fetching emails with defects or malformed bytes (e.g., spam or broken
MIME boundaries), the Python 3 `email.parser` handles un-decodable bytes
by safely substituting them with the Unicode Replacement Character `\ufffd`
(using the `errors='replace'` handler by default).
However, a problem arises during the `_fetch_from_imap` sync process when
OfflineIMAP3 tests if the message can be serialized back to bytes via
`as_bytes()`. If the malformed text part was originally declared as
`us-ascii` (or left unspecified, defaulting to ASCII), Python attempts to
encode the `\ufffd` string back to bytes using the ASCII codec. Since `\ufffd`
is out of the ASCII range, this triggers a `UnicodeEncodeError`, causing
OfflineIMAP3 to raise an `OfflineImapError` and entirely skip syncing the
message.
This commit fixes the issue by catching the `UnicodeEncodeError` when
`as_bytes()` fails. It then walks through the message parts, identifies
the text payloads that cannot be encoded with their current charset, and
dynamically forces their charset to `utf-8`. This allows Python to safely
encode the `\ufffd` character (automatically applying base64/quoted-printable
transfer encoding if needed), successfully serializing the message so it
can be synced without crashing.
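A rough sketch of the fallback, not the exact code of this commit. The
helper names are hypothetical, and the example builds a plain
`email.message.Message` by hand (an assumption made to keep it
self-contained) instead of fetching one from a server.

```python
from email.message import Message


def _encodable(text, charset):
    """Return True if text can be encoded with the given charset."""
    try:
        text.encode(charset)
        return True
    except (UnicodeEncodeError, LookupError):
        return False


def force_utf8_on_unencodable_parts(msg):
    """Switch text parts whose payload does not fit their charset to utf-8."""
    for part in msg.walk():
        if part.is_multipart() or part.get_content_maintype() != "text":
            continue
        payload = part.get_payload()
        if not isinstance(payload, str):
            continue
        charset = part.get_content_charset() or "us-ascii"
        # Skip parts that are fine, and parts utf-8 cannot encode either
        # (for example surrogate-escaped raw bytes).
        if _encodable(payload, charset) or not _encodable(payload, "utf-8"):
            continue
        # Drop any stale transfer encoding so set_payload() chooses one
        # that matches the new charset (base64 for utf-8).
        del part["Content-Transfer-Encoding"]
        part.set_payload(payload, charset="utf-8")


def serialize_with_fallback(msg):
    try:
        return msg.as_bytes()
    except UnicodeEncodeError:
        force_utf8_on_unencodable_parts(msg)
        return msg.as_bytes()


broken = Message()
broken["Content-Type"] = 'text/plain; charset="us-ascii"'
broken.set_payload("bad byte: \ufffd\n")  # declared ASCII, holds U+FFFD
data = serialize_with_fallback(broken)
```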
Closes #240
Closes #229
Closes #224
Closes #160
In various cases we were passing string messages, which is incompatible
with how ui.error handles its argument, leading to bugs like #233.
Fix #233
Signed-off-by: serge-sans-paille <sergesanspaille@free.fr>
When file_use_mail_timestamp or utime_from_header are enabled,
OfflineIMAP tries to parse the Date header in the email. If the header
is present but invalid -- it doesn't contain a valid date -- this will
cause email.message to raise an exception. This is all fine. However
when handling that exception, OfflineIMAP can't try to extract the date
again: it's clearly invalid, and raising the same exception a second
time while handling the first exception just causes the entire sync to
fail.
To avoid that happening, don't try to provide the invalid date string in
the error message. Instead, just give the user the UID of the email
that triggered the exception, and the exception text.
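The idea can be sketched as follows. `message_time`, `extract_date` and
`warn` are hypothetical names used only for this illustration; the point is
that the handler reports the UID and the exception text without evaluating
the failing date expression a second time.

```python
from email.utils import parsedate_to_datetime


def message_time(uid, extract_date, warn=print):
    """Return the message timestamp, or None if the Date is invalid."""
    try:
        return extract_date().timestamp()
    except (TypeError, ValueError) as exc:
        # Do not call extract_date() again here: it would raise the same
        # exception inside the handler and abort the whole sync.
        warn("UID %s has an invalid Date header: %s" % (uid, exc))
        return None


warnings = []
bad = message_time(
    42, lambda: parsedate_to_datetime("not a date"), warnings.append)
good = message_time(
    43,
    lambda: parsedate_to_datetime("Thu, 01 Jan 1970 00:00:10 +0000"),
    warnings.append)
```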
Ideally we'd instead fix the code to actually extract the header value
and provide it in the error message, but Python's email.message module
doesn't provide an easy way to get the raw text of the Date header from
an EmailMessage object; it's possible using private variables like
EmailMessage._headers, or by parsing the email using a custom
email.policy.EmailPolicy object that disables the module's attempts to
coerce the header value to a DateTime. However, a user should be able
to get the problematic Date header from the message directly anyway, so
it's not worth adding all that complexity for something that should be
rare and provides little value.
Fixes #134
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
This patch checks the exception raised by os.rename()
on Windows and provides the same behavior as on Linux.
This patch is related to issue #37, issue 5.
This patch closes issue 37.
Closes #37
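One possible way to get the same overwrite behavior on both platforms is
sketched below. This is an illustration under assumptions, not necessarily
what the patch does; `rename_overwriting` is a hypothetical helper name.

```python
import os
import tempfile


def rename_overwriting(src, dst):
    """Rename src to dst, replacing dst if it already exists."""
    try:
        os.rename(src, dst)
    except FileExistsError:
        # Windows refuses to rename onto an existing file, while POSIX
        # systems silently replace it. Remove the target and retry.
        os.remove(dst)
        os.rename(src, dst)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "new")
    dst = os.path.join(tmp, "old")
    for path, text in ((src, "new data"), (dst, "old data")):
        with open(path, "w") as fh:
            fh.write(text)
    rename_overwriting(src, dst)
    with open(dst) as fh:
        result = fh.read()
```

Python's os.replace() offers the overwrite semantics in a single call and
could be used instead of the remove-and-retry step.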
Moving the quoted boundary fix to the Base class so that it can be used
by any subclass that needs to read an email. Adding another utility to
extract message-id from a raw email.
unreachable due to an optimization in PR#56. Since message-id is more
useful to better pinpoint the correct message, removing dbg_output.
Also fixing https://github.com/OfflineIMAP/offlineimap3/issues/62 by
correcting broken multipart boundaries or raising an error if as_bytes()
fails. Related python bug submitted: https://bugs.python.org/issue43818
although this workaround should be sufficient in the interim.
Signed-off-by: Joseph Ishac <jishac@nasa.gov>
If synclabels is enabled then offlineimap is sending '1:*' to imaplib2,
and imaplib2 while creating the FETCH command is quoting the sequence
and the command becomes:
b"JFFJ10 FETCH '1:*' (FLAGS X-GM-LABELS UID)\r\n"
Remove the single-quotes to prevent that and also consider the response
as bytes.
Closes: #52
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
a message from a string to an email object that is part of the built-in
email library. This allows emails to be processed as bytes and
re-encoded properly if they are not UTF-8 or ASCII encoded. Currently
these changes cover the Base, IMAP, and Maildir classes but not the
specialized GMAIL class yet.
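A small illustration of why an email object helps, using only the standard
library. The sample message is made up for this example and is not taken
from the patch.

```python
import email

# A message whose body is ISO-8859-1, not UTF-8 or ASCII.
raw = (b"Subject: test\n"
       b"Content-Type: text/plain; charset=iso-8859-1\n"
       b"Content-Transfer-Encoding: 8bit\n"
       b"\n"
       b"caf\xe9\n")

# Parsing from bytes keeps the original octets recoverable.
msg = email.message_from_bytes(raw)
body = msg.get_payload(decode=True)              # bytes, as received
text = body.decode(msg.get_content_charset())    # decoded per its charset
serialized = msg.as_bytes()                      # back to bytes for storage
```

Treating the same data as a UTF-8 string (raw.decode('utf-8')) would fail
on the 0xe9 byte.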
This patch converts the search results from bytes to strings.
I added a short comment about it here:
In Py2, with IMAP, imaplib2 returned a list of one element string.
['1 2 3 ...'] -> in Py3 is [b'1 2 3 ...']
In Py2, with Davmail, imaplib2 returned a list of strings.
['1', '2', '3', ...] -> in Py3 should be [b'1', b'2', b'3',...]
In my tests with Py3, I get a list with one element: [b'1 2 3 ...']
Then I convert the values to string and I get ['1 2 3 ...']
With Davmail, it should be [b'1', b'2', b'3',...]
When I convert the values to string, I get ['1', '2', '3',...]
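A sketch of the conversion for both response shapes. The function names are
hypothetical; the split condition mirrors the `' ' in res_data[0] or
res_data[0] == ''` check used in the search code.

```python
def search_result_to_str(res_data):
    """Convert each element of a SEARCH response from bytes to str."""
    return [item.decode("utf-8") if isinstance(item, bytes) else item
            for item in res_data]


def split_uids(res_data):
    """Return a flat list of UID strings for either response shape."""
    res = search_result_to_str(res_data)
    # Standard IMAP servers: one element holding space-separated UIDs
    # (or an empty string when nothing matched).
    if res and (" " in res[0] or res[0] == ""):
        return res[0].split()
    # Davmail: already one element per UID.
    return res


standard = split_uids([b"1 2 3"])
davmail = split_uids([b"1", b"2", b"3"])
empty = split_uids([b""])
```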
imaplib2 runs this code for strings:
    if isinstance(message, str):
        message = bytes(message, 'ASCII')
But our message is already encoded using 'utf-8'.
So we can pass the message as bytes, encoded using 'utf-8',
from offlineimap, and imaplib2 won't change our message.
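The effect can be reproduced without imaplib2. `prepare_like_imaplib2` is a
hypothetical stand-in that only copies the str branch of imaplib2's append();
it is not imaplib2's API.

```python
def prepare_like_imaplib2(message):
    """Mimic the str-to-bytes conversion described for imaplib2's append()."""
    if isinstance(message, str):
        message = bytes(message, 'ASCII')
    return message


text = "Subject: caf\u00e9\n\nbody\n"

# Passing a str with non-ASCII characters fails in the ASCII conversion.
try:
    prepare_like_imaplib2(text)
    str_failed = False
except UnicodeEncodeError:
    str_failed = True

# Passing bytes that we encoded ourselves skips the conversion entirely.
payload = text.encode("utf-8")
untouched = prepare_like_imaplib2(payload)
```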
This patch solves this problem:
WARNING:OfflineImap:
Traceback:
File "/home/kix/src/offlineimap3/offlineimap/folder/Base.py", line 1127, in syncmessagesto
action(dstfolder, statusfolder)
File "/home/kix/src/offlineimap3/offlineimap/folder/Base.py", line 955, in __syncmessagesto_copy
self.copymessageto(uid, dstfolder, statusfolder, register=0)
File "/home/kix/src/offlineimap3/offlineimap/folder/Base.py", line 855, in copymessageto
new_uid = dstfolder.savemessage(uid, message, flags, rtime)
File "/home/kix/src/offlineimap3/offlineimap/folder/IMAP.py", line 668, in savemessage
(typ, dat) = imapobj.append(self.getfullIMAPname(),
File "/usr/lib/python3/dist-packages/imaplib2.py", line 660, in append
message = bytes(message, 'ASCII')
Emails received may not be UTF-8. The following error was observed on a
specific mail:
Traceback (most recent call last):
File "/home/tdescham/repo/offlineimap3/offlineimap/threadutil.py", line 146, in run
Thread.run(self)
File "/usr/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/home/tdescham/repo/offlineimap3/offlineimap/folder/Base.py", line 850, in copymessageto
message = self.getmessage(uid)
File "/home/tdescham/repo/offlineimap3/offlineimap/folder/IMAP.py", line 327, in getmessage
data = self._fetch_from_imap(str(uid), self.retrycount)
File "/home/tdescham/repo/offlineimap3/offlineimap/folder/IMAP.py", line 844, in _fetch_from_imap
ndata1 = data[0][1].decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 10177: invalid start byte
This completely aborted offlineimap3, blocking further mail reception.
Instead, use the 'replace' error strategy in Python:
Replace with a suitable replacement character; Python will use the
official U+FFFD REPLACEMENT CHARACTER for the built-in Unicode codecs on
decoding and ‘?’ on encoding.
https://docs.python.org/2/library/codecs.html#codec-base-classes
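A short illustration of the difference between the two error strategies,
using a made-up byte string with the same offending 0xa0 byte:

```python
data = b"price:\xa010 EUR"  # 0xa0 is not a valid UTF-8 start byte

# Strict decoding (the default) raises and would abort the sync.
try:
    data.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' substitutes U+FFFD for the bad byte and keeps going.
lenient = data.decode("utf-8", errors="replace")
```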
ERROR: ERROR in syncfolder for gmail folder INBOX: Traceback (most recent call last):
File ".../offlineimap3/offlineimap/accounts.py", line 634, in syncfolder
cachemessagelists_upto_date(maxage)
File ".../offlineimap3/offlineimap/accounts.py", line 526, in cachemessagelists_upto_date
min_date=time.gmtime(time.mktime(date) + 24 * 60 * 60))
File ".../offlineimap3/offlineimap/folder/IMAP.py", line 277, in cachemessagelist
imapobj, min_date=min_date, min_uid=min_uid)
File ".../offlineimap3/offlineimap/folder/IMAP.py", line 259, in _msgs_to_fetch
search_result = search(search_cond)
File ".../offlineimap3/offlineimap/folder/IMAP.py", line 222, in search
if ' ' in res_data[0] or res_data[0] == '':
TypeError: a bytes-like object is required, not 'str'