Quantcast
Channel: Active questions tagged email - Stack Overflow
Viewing all articles
Browse latest Browse all 29748

Python - email header decoding UTF-8

$
0
0

is there any Python module which helps to decode the various forms of encoded mail headers, mainly Subject, to simple - say - UTF-8 strings?

Here are example Subject headers from mail files that I have:

Subject: [ 201105311136 ]=?UTF-8?B?IMKnIDE2NSBBYnM=?=. 1 AO;
Subject: [ 201105161048 ] GewSt:=?UTF-8?B?IFdlZ2ZhbGwgZGVyIFZvcmzDpHVmaWdrZWl0?=
Subject: [ 201105191633 ]
  =?UTF-8?B?IERyZWltb25hdHNmcmlzdCBmw7xyIFZlcnBmbGVndW5nc21laHJhdWZ3ZW5kdW4=?=
  =?UTF-8?B?Z2VuIGVpbmVzIFNlZW1hbm5z?=

text - encoded sting - text

text - encoded string

text - encoded string - encoded string

Encodig could also be something else like ISO 8859-15.

Update 1: I forgot to mention, I tried email.header.decode_header

    for item in message.items():
    if item[0] == 'Subject':
            sub = email.header.decode_header(item[1])
            logging.debug( 'Subject is %s' %  sub )

This outputs

DEBUG:root:Subject is [('[ 201101251025 ] ELStAM;=?UTF-8?B?IFZlcmbDvGd1bmcgdm9tIDIx?=. Januar 2011', None)]

which does not really help.

Update 2: Thanks to Ingmar Hupp in the comments.

the first example decodes to a list of two tupels:

print decode_header("""[ 201105161048 ] GewSt:=?UTF-8?B?IFdlZ2ZhbGwgZGVyIFZvcmzDpHVmaWdrZWl0?=""")
[('[ 201105161048 ] GewSt:', None), (' Wegfall der Vorl\xc3\xa4ufigkeit', 'utf-8')]

is this always [(string, encoding),(string, encoding), ...] so I need a loop to concat all the [0] items to one string or how to get it all in one string?

Subject: [ 201101251025 ] ELStAM;=?UTF-8?B?IFZlcmbDvGd1bmcgdm9tIDIx?=. Januar 2011

does not decode well:

print decode_header("""[ 201101251025 ] ELStAM;=?UTF-8?B?IFZlcmbDvGd1bmcgdm9tIDIx?=. Januar 2011""")

[('[ 201101251025 ] ELStAM;=?UTF-8?B?IFZlcmbDvGd1bmcgdm9tIDIx?=. Januar 2011', None)]


Viewing all articles
Browse latest Browse all 29748

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>