I am trying to parse and analyze email headers and bodies, but I cannot parse more than one email. Only the first email is parsed and ready to be analyzed, whereas the rest of the emails are displayed as is. How do I modify the code below to parse and analyze more than one email stored in a single file?
import email

file_path = "/var/mail/admin"
with open(file_path, "r") as file:
    content = file.read()

# The whole file is handed to the parser as if it were a single message
email_to_string = email.message_from_string(content)
headers = email_to_string._headers  # list of (name, value) tuples

header_contents = {}
for header in headers:
    if "From" in header:
        header_contents['From'] = header[-1]
    elif "To" in header:
        header_contents['To'] = header[-1]
    elif "Date" in header:
        header_contents['Date'] = header[-1]
    elif "Subject" in header:
        header_contents['Subject'] = header[-1]

if email_to_string.is_multipart():
    body = []
    # Collect the payload of every non-container part of the message
    for part in email_to_string.walk():
        if not part.is_multipart():
            body.append(part.get_payload())
    body = "".join(body)
else:
    body = email_to_string.get_payload()

print("HEADER CONTENTS\n", header_contents)
print("BODY\n", body)
Below is the output of the code. The first email is parsed and ready to be analyzed, whereas the rest of the emails are just displayed as is.
Output
HEADER CONTENTS
{'Subject': 'Do grab the attached file as you requested!', 'To': '<admin@mail.zigzagdlp.com>', 'Date': 'Sat, 21 Mar 2020 02:49:50 -0700 (PDT)', 'From': 'server2 <server@mail.zigzagdlp.com>'}
BODY
aWQJbG5hbWUJZm5hbWUJY2NfdHlwZQljY19udW1iZXIJCjE3Mi0zMi0xMTc2CVdoaXRlCUpvaG5z
b24JbQk1MjcwIDQyNjcgNjQ1MCA1NTE2CQo1MTQtMTQtODkwNQlCb3JkZW4JQXNobGV5CW0JNTM3
MCA0NjM4IDg4ODEgMzAyMAkKMjEzLTQ2LTg5MTUJR3JlZW4JTWFyam9yaWUJdgk0OTE2IDk3NjYg
NTI0MCA2MTQ3CQo1MjQtMDItNzY1NwlNdW5zY2gJSmVyb21lCW0JNTE4MCAzODA3IDM2NzkgODIy
MQkKNDg5LTM2LTgzNTAJQXJhZ29uCVJvYmVydAl2CTQ5MjkgMzgxMyAzMjY2IDQyOTUJCjY5MC0w
NS01MzE1CUNvbmxleQlUaG9tYXMJdgk0OTE2IDQ4MTEgNTgxNCA4MTExCQo2NDYtNDQtOTA2MQlK
YWNrc29uCUNoYXJsZXMJbQk1MjE4IDAxNDQgMjcwMyA5MjY2CQo0MjEtMzctMTM5NglEYXZpcwlT
dXNhbgl2CTQ5MTYgNDAzNCA5MjY5IDg3ODMJCjQ2MS05Ny01NjYwCVdhdHNvbglHYWlsCXYJNDUz
MiAxNzUzIDYwNzEgMTExMgkKNjYwLTAzLTgzNjAJR2Fycmlzb24JTGlzYQl2CTQ1MzkgNTM4NSA3
NDI1IDU4MjUJCjc1MS0wMS0yMzI3CVJlbmZybwlKdWxpZQltCTUzMjUgMzI1NiA5NTE5IDY2MjQJ
CjU1OS04MS0xMzAxCUhlYXJkCUphbWVzCXYJNDUzMiA0MjIwIDY5MjIgOTkwOQkKCg==
From server@mail.zigzagdlp.com Mon Mar 23 23:29:36 2020
Return-Path: <server@mail.zigzagdlp.com>
X-Original-To: admin@mail.zigzagdlp.com
Delivered-To: admin@mail.zigzagdlp.com
Received: by mail.zigzagdlp.com (Postfix, from userid 1000)
id 41C57100667; Mon, 23 Mar 2020 23:29:36 -0700 (PDT)
To: <admin@mail.zigzagdlp.com>
Subject: This is subject part
X-Mailer: mail (GNU Mailutils 3.4)
Message-Id: <20200324062936.41C57100667@mail.zigzagdlp.com>
Date: Mon, 23 Mar 2020 23:29:36 -0700 (PDT)
From: server2 <server@mail.zigzagdlp.com>
This is where body starts from.
This is the part of the body.
Body ends here.
From dlpmonitor@mail.zigzagdlp.com Mon Mar 23 05:56:31 2020
Return-Path: <dlpmonitor@mail.zigzagdlp.com>
X-Original-To: admin@mail.zigzagdlp.com
Delivered-To: admin@mail.zigzagdlp.com
Received: from mail.zigzagdlp.com (localhost [127.0.0.1])
by mail.zigzagdlp.com (Postfix) with ESMTP id 68A681006D7
for <admin@mail.zigzagdlp.com>; Mon, 23 Mar 2020 05:56:31 -0700 (PDT)
Message-Id: <20200323125631.68A681006D7@mail.zigzagdlp.com>
Date: Mon, 23 Mar 2020 05:56:31 -0700 (PDT)
From: dlpmonitor@mail.zigzagdlp.com
Subject: Alert!!
Sensitive data have been found.
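
One direction that might work, offered as a sketch rather than a confirmed fix: the "From " separator lines in the output above suggest that /var/mail/admin is in mbox format, and the standard-library mailbox module can split such a file into individual messages. The helper name parse_mbox and the keys of the returned dictionaries are assumptions made for illustration. Passing decode=True to get_payload() also undoes a base64 or quoted-printable transfer encoding when the message declares one.

```python
import mailbox


def parse_mbox(path):
    """Return one dict (selected headers plus body text) per message in an mbox file.

    parse_mbox is a hypothetical helper; it assumes `path` points at an
    existing mbox-formatted file.
    """
    parsed = []
    box = mailbox.mbox(path, create=False)  # do not create the file if it is missing
    try:
        # Iterating an mbox yields one message object per "From " separator line
        for message in box:
            record = {name: message[name] for name in ("From", "To", "Date", "Subject")}
            body_parts = []
            for part in message.walk():
                if part.is_multipart():
                    continue  # containers carry no text of their own
                # decode=True returns bytes with the transfer encoding undone
                payload = part.get_payload(decode=True)
                if payload is not None:
                    charset = part.get_content_charset() or "utf-8"
                    body_parts.append(payload.decode(charset, errors="replace"))
            record["Body"] = "".join(body_parts)
            parsed.append(record)
    finally:
        box.close()
    return parsed
```

Calling parse_mbox(file_path) with the path from the code above would then return a list with one dictionary per email, so each message can be analyzed separately. A header that is missing from a message (for example "To" in the last email shown) comes back as None.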