
Need help modifying a Gmail message retrieval script

Posted: Thu Feb 14, 2019 5:55 pm
by sapnho
I am trying to write a Python script that downloads attachments from a Gmail account, but only when the subject line matches a given text. I found the script below from 2015, which works perfectly but does not check the subject line. Can somebody tell me how to add the subject line check (e.g. if the subject line does not include "grandmother", ignore the email)?

Code: Select all

#!/usr/bin/env python3
# read emails and detach attachment in attachments directory

import email
import getpass, imaplib
import os
import sys
import time

detach_dir = '.'
# Create the attachments directory inside detach_dir if it does not exist yet
os.makedirs(os.path.join(detach_dir, 'attachments'), exist_ok=True)

userName = 'testxxxx'
passwd = 'password'

try:
    imapSession = imaplib.IMAP4_SSL('imap.gmail.com',993)
    typ, accountDetails = imapSession.login(userName, passwd)
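    if typ != 'OK':
        print ('Not able to sign in!')
        # A bare "raise" needs an active exception, so raise an explicit one
        raise RuntimeError('Not able to sign in!')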

    imapSession.select('Inbox')
    typ, data = imapSession.search(None, 'ALL')
    if typ != 'OK':
        print ('Error searching Inbox.')
        raise RuntimeError('Error searching Inbox.')

    # Iterating over all emails
    for msgId in data[0].split():
        typ, messageParts = imapSession.fetch(msgId, '(RFC822)')

        if typ != 'OK':
            print ('Error fetching mail.')
            raise RuntimeError('Error fetching mail.')
        
        emailBody = messageParts[0][1]  # raw message as bytes
        mail = email.message_from_bytes(emailBody)

        for part in mail.walk():
            #print (part)
            if part.get_content_maintype() == 'multipart':
                # print part.as_string()
                continue
            if part.get('Content-Disposition') is None:
                # print part.as_string()
                continue

            fileName = part.get_filename()

            if bool(fileName): # TODO: add a second condition so the subject line must also match
                # Keep only the base name so a crafted filename cannot escape the directory
                filePath = os.path.join(detach_dir, 'attachments', os.path.basename(fileName))
                if not os.path.isfile(filePath): # skip files that were already saved
                    print (fileName)
                    print (filePath)
                    with open(filePath, 'wb') as fp:
                        fp.write(part.get_payload(decode=True))

    imapSession.close()
    imapSession.logout()

except Exception as exc:
    # Report the actual error instead of hiding it
    print ('Something went wrong:', exc)
    time.sleep(3)
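
The subject check asked about above can also be done on the client side, after the message has been fetched. This is only a minimal sketch: the helper name `subject_matches` is made up for illustration, and it assumes a case-insensitive "contains" test is what is wanted.

```python
import email
from email.header import decode_header, make_header


def subject_matches(raw_message: bytes, keyword: str) -> bool:
    """Return True if the decoded Subject header contains keyword (case-insensitive)."""
    mail = email.message_from_bytes(raw_message)
    raw_subject = mail.get('Subject', '')
    # Subjects may be MIME-encoded (=?utf-8?...?=), so decode before comparing.
    subject = str(make_header(decode_header(raw_subject)))
    return keyword.lower() in subject.lower()
```

In the script above this could be called right after `emailBody` is assigned, for example `if not subject_matches(emailBody, 'grandmother'): continue`, so that the attachment loop is skipped for other mails.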


Re: Need help modifying a Gmail message retrieval script

Posted: Thu Feb 14, 2019 8:48 pm
by MrYsLab
Take a look at this link in the python docs. Perhaps it is what you are looking for. https://docs.python.org/3.7/library/ima ... AP4.search

Re: Need help modifying a Gmail message retrieval script

Posted: Fri Feb 15, 2019 7:51 am
by sapnho
Thanks, yes I know this one, but I am not proficient enough in Python to figure this out.

Re: Need help modifying a Gmail message retrieval script

Posted: Fri Feb 15, 2019 8:57 am
by PhatFil
sapnho wrote:
Fri Feb 15, 2019 7:51 am
Thanks, yes I know this one, but I am not proficient enough in Python to figure this out.

Code: Select all

IMAP4.search(charset, criterion[, ...])
Search mailbox for matching messages. charset may be None, in which case no CHARSET will be specified in the request to the server. The IMAP protocol requires that at least one criterion be specified; an exception will be raised when the server returns an error. charset must be None if the UTF8=ACCEPT capability was enabled using the enable() command.

Example:

# M is a connected IMAP4 instance...
typ, msgnums = M.search(None, 'FROM', '"LDJ"')

# or:
typ, msgnums = M.search(None, '(FROM "LDJ")')
So perhaps start by replacing this line in your script:

Code: Select all

typ, data = imapSession.search(None, 'ALL')
with something like

Code: Select all

typ, data= imapSession.search(None, 'SUBJECT', '"THIS EMAIL HAS AN ATTACHMENT"')


maybe?
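
To make the suggestion above reusable, the search can be wrapped in a small function. This is a sketch, not tested against Gmail: `find_by_subject` is a hypothetical name, `session` is assumed to be a logged-in `imaplib.IMAP4_SSL` object with a mailbox already selected, and the text is assumed to be plain ASCII because the charset argument is `None`.

```python
def find_by_subject(session, text):
    """Return the message numbers whose Subject contains text (server-side search)."""
    # IMAP quoted strings need backslashes and double quotes escaped.
    quoted = '"' + text.replace('\\', '\\\\').replace('"', '\\"') + '"'
    typ, data = session.search(None, 'SUBJECT', quoted)
    if typ != 'OK':
        raise RuntimeError('IMAP search failed: {!r}'.format(data))
    # data[0] is a space-separated bytes string such as b'1 4 7'.
    return data[0].split()
```

The loop in the original script would then become `for msgId in find_by_subject(imapSession, 'grandmother'):`. The IMAP specification describes SUBJECT as a case-insensitive substring match, but individual servers may behave slightly differently.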

Re: Need help modifying a Gmail message retrieval script

Posted: Fri Feb 15, 2019 9:16 am
by MrYsLab
Another link to look at: https://pymotw.com/3/imaplib/#search-criteria. There is a sample of code searching for "from".

Code: Select all

#  imaplib_search_from.py

import imaplib
import imaplib_connect
from imaplib_list_parse import parse_list_response

with imaplib_connect.open_connection() as c:
    typ, mbox_data = c.list()
    for line in mbox_data:
        flags, delimiter, mbox_name = parse_list_response(line)
        c.select('"{}"'.format(mbox_name), readonly=True)
        typ, msg_ids = c.search(
            None,
            '(FROM "Doug" SUBJECT "Example message 2")',
        )
        print(mbox_name, typ, msg_ids)


Re: Need help modifying a Gmail message retrieval script

Posted: Fri Feb 15, 2019 12:22 pm
by Idahowalker
Gmail accounts are free. I made an account just for my RPi, so all the mail that account receives is for the RPi. After creating the account, send an email from it to your main account and reply to it. Gmail accounts seem to like two-way communication.

Re: Need help modifying a Gmail message retrieval script

Posted: Fri Feb 15, 2019 1:17 pm
by sapnho
PhatFil wrote:
Fri Feb 15, 2019 8:57 am
so perhaps start trying replacing your script line

Code: Select all

typ, data = imapSession.search(None, 'ALL')
with something like

Code: Select all

typ, data= imapSession.search(None, 'SUBJECT', '"THIS EMAIL HAS AN ATTACHMENT"')
You are amazing, thank you, it works great!

Re: Need help modifying a Gmail message retrieval script

Posted: Fri Feb 15, 2019 1:18 pm
by sapnho
Idahowalker wrote:
Fri Feb 15, 2019 12:22 pm
Gmail accounts are free. I made an account just for my RPi, so all the mail that account receives is for the RPi. After creating the account, send an email from it to your main account and reply to it. Gmail accounts seem to like two-way communication.
Yeah, that's what I did as well. Did you experience any issues or why do you recommend the two-way communication?

Re: Need help modifying a Gmail message retrieval script

Posted: Fri Feb 15, 2019 1:37 pm
by Idahowalker
sapnho wrote:
Fri Feb 15, 2019 1:18 pm
Idahowalker wrote:
Fri Feb 15, 2019 12:22 pm
Gmail accounts are free. I made an account just for my RPi, so all the mail that account receives is for the RPi. After creating the account, send an email from it to your main account and reply to it. Gmail accounts seem to like two-way communication.
Yeah, that's what I did as well. Did you experience any issues or why do you recommend the two-way communication?
During testing, after the account had received about 1,000 emails and sent none, it simply stopped receiving mail. I logged into the account, sent an email, and have had no issues since (over a year).

Re: Need help modifying a Gmail message retrieval script

Posted: Fri Feb 15, 2019 1:45 pm
by sapnho
After solving the first big issue, I have two more remaining if I may be so bold. :oops:

- How can I limit the download of any attachments to only jpg files?
- How can I delete the emails after processing? Both those with or without a valid attachment.
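
For the first question, one possible approach is to test the attachment's filename extension before saving it. This is a minimal sketch; `is_jpeg_attachment` and `JPEG_EXTENSIONS` are names invented here, and it assumes the sender's mail client gives the file a `.jpg` or `.jpeg` name.

```python
import os

JPEG_EXTENSIONS = {'.jpg', '.jpeg'}


def is_jpeg_attachment(part) -> bool:
    """Return True if the MIME part has a filename ending in .jpg or .jpeg."""
    file_name = part.get_filename()
    if not file_name:
        return False
    # Compare in lower case so that "Photo.JPG" is accepted as well.
    extension = os.path.splitext(file_name)[1].lower()
    return extension in JPEG_EXTENSIONS
```

In the script above, the `if bool(fileName):` test could be replaced with `if is_jpeg_attachment(part):`. Checking `part.get_content_type() == 'image/jpeg'` is an alternative, although some clients label attachments with a generic content type.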

I read on https://www.timpoulsen.com/2018/reading ... ython.html that deleting with Gmail is a bit trickier. This is his code, but again I need help with where and how to place it in my script above...

Code: Select all

def delete_message(self, msg_id):
        if not msg_id:
            return
        if self.move_to_trash:
            # move to Trash folder
            self.imap.store(msg_id, '+X-GM-LABELS', '\\Trash')
            self.imap.expunge()
        else:
            self.imap.store(msg_id, '+FLAGS', '\\Deleted')
            self.imap.expunge()
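
The method above belongs to a class, so it cannot be pasted into the script unchanged. A standalone variant might look like the sketch below. The name `mark_for_deletion` is hypothetical, `session` is assumed to be the `imapSession` object from the script, and unlike the quoted code it leaves `expunge()` to the caller, on the assumption that expunging inside a loop could renumber the remaining messages.

```python
def mark_for_deletion(session, msg_id, move_to_trash=True):
    """Flag one message for removal; the caller runs session.expunge() afterwards."""
    if not msg_id:
        return None
    if move_to_trash:
        # Gmail-specific label command, taken from the quoted snippet.
        return session.store(msg_id, '+X-GM-LABELS', '\\Trash')
    # Standard IMAP: flag as deleted; expunge() then removes flagged messages.
    return session.store(msg_id, '+FLAGS', '\\Deleted')
```

It would be called at the end of the `for msgId in ...` loop body as `mark_for_deletion(imapSession, msgId)`, followed by a single `imapSession.expunge()` after the loop and before `imapSession.close()`.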

Re: Need help modifying a Gmail message retrieval script

Posted: Fri Feb 15, 2019 1:45 pm
by sapnho
Idahowalker wrote:
Fri Feb 15, 2019 1:37 pm
During testing, after the account had received about 1,000 emails and sent none, it simply stopped receiving mail. I logged into the account, sent an email, and have had no issues since (over a year).
Makes sense, good tip, thanks!

Re: Need help modifying a Gmail message retrieval script

Posted: Sat Feb 16, 2019 4:24 pm
by PhatFil
May I suggest you don't delete the emails? Instead, once they are processed, move them to a 'Processed' folder so they can be archived and then saved or deleted as appropriate.

A system with an automatic delete feature can create an unrecoverable error condition, whereas if you keep an archive of all source data, any errors can be traced back and corrected.

Re: Need help modifying a Gmail message retrieval script

Posted: Sat Feb 16, 2019 4:30 pm
by sapnho
PhatFil wrote:
Sat Feb 16, 2019 4:24 pm
May I suggest you don't delete the emails? Instead, once they are processed, move them to a 'Processed' folder so they can be archived and then saved or deleted as appropriate.

A system with an automatic delete feature can create an unrecoverable error condition, whereas if you keep an archive of all source data, any errors can be traced back and corrected.
Yes, that's a good point. But how do I move them? :?

Re: Need help modifying a Gmail message retrieval script

Posted: Sat Feb 16, 2019 11:58 pm
by PhatFil
Your example code snippet moves email to the Trash folder, so just replace the Trash reference with a reference to the folder you want to move to, and create that folder in advance ;)...

What have you tried? How did it go? It looks like you're on the right track finding examples, but you are not trying to use them. You won't break anything by putting in a badly formatted command, and any errors you raise will probably tell you what you have done wrong. Error messages may seem obscure at first, but the only way to get used to the language and what it means is by getting stuck in.

One hint:
The code snippet you're trying to import is pulled from an IMAP class definition. Unless you intend to mirror that and define your own IMAP class, look at replacing the self references with the IMAP object you are using (imapSession, IIRC), and I think you can probably call the function pretty much as is. I don't know for sure, as I haven't tested it.
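
As an alternative to the Gmail label command, a move can be sketched with standard IMAP calls: copy the message to the target folder, then flag the original as deleted. The function name `move_to_folder` is made up for this example, `session` stands for the `imapSession` object, and the 'Processed' folder is assumed to exist already (it could be created once with `session.create('Processed')`). How Gmail treats the deleted flag depends on the account's IMAP settings, so this is untested guidance rather than a confirmed recipe.

```python
def move_to_folder(session, msg_id, folder='Processed'):
    """Copy one message to folder, then flag the original as deleted."""
    typ, data = session.copy(msg_id, folder)
    if typ != 'OK':
        # Typically the folder does not exist; leave the original untouched.
        raise RuntimeError(
            'Could not copy message {!r} to {!r}: {!r}'.format(msg_id, folder, data))
    # The flagged original disappears from the mailbox on the next expunge().
    session.store(msg_id, '+FLAGS', '\\Deleted')
```

After the loop over all messages, a single `session.expunge()` removes the flagged originals from the Inbox.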

Re: Need help modifying a Gmail message retrieval script

Posted: Sun Feb 17, 2019 9:03 am
by sapnho
My problem is that I have no idea where to put that code snippet. This is how I understand it.

Code: Select all

try:
	imapSession = imaplib.IMAP4_SSL('imap.gmail.com',993)
	typ, accountDetails = imapSession.login(userName, passwd)
	if typ != 'OK':
		print ('Not able to sign in!')
		raise
This just establishes the connection to Gmail.

Code: Select all

	imapSession.select('Inbox')
	typ, data = imapSession.search(None, 'SUBJECT', '"#mypictures"')
	if typ != 'OK':
		print ('Error searching Inbox.')
		raise
This checks if there are any emails with "#mypictures" in the subject line.
Let's assume, there are none but other emails (that I want to move to "Processed"), then I guess I would have to add

Code: Select all

else:
	imap.store(msgId, '+X-GM-LABELS', '\\Processed')
So the section would look like:

Code: Select all

	# Iterating over all emails
	for msgId in data[0].split():
		typ, messageParts = imapSession.fetch(msgId, '(RFC822)')

		if typ != 'OK':
			print ('Error fetching mail.')
			raise 
		
		emailBody = messageParts[0][1] #print(type(emailBody))
		mail = email.message_from_bytes(emailBody) #mail = email.message_from_string(emailBody)

		for part in mail.walk():
			#print (part)
			if part.get_content_maintype() == 'multipart':
				# print part.as_string()
				continue
			if part.get('Content-Disposition') is None:
				# print part.as_string()
				continue

			fileName = part.get_filename()

			if bool(fileName): #add second condition of subject line which must match
				filePath = os.path.join(detach_dir, 'attachments', fileName)
				if not os.path.isfile(filePath) : #check duplication
					print (fileName)
					print (filePath)
					fp = open(filePath, 'wb')
					fp.write(part.get_payload(decode=True))
					fp.close()

	else:
		imap.store(msgId, '+X-GM-LABELS', '\\Processed')

	imapSession.close()
	imapSession.logout()
But this does nothing to emails that do not have "#mypictures" in the subject line. No error message, just nothing happens.. :?
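
A few things in the snippet above may explain this. The SUBJECT search returns only the matching message numbers, so the loop never sees the other emails. The `else:` is attached to the `for` loop, so it runs once after the loop finishes and not once per message. It also calls `imap.store`, while the session object in this script is named `imapSession`. One way around the first point is to search for ALL messages and sort them by subject in Python. The sketch below assumes that approach; `split_by_subject` is a hypothetical helper name.

```python
import email
from email.header import decode_header, make_header


def split_by_subject(session, keyword):
    """Return (matching_ids, other_ids) for the currently selected mailbox."""
    typ, data = session.search(None, 'ALL')
    if typ != 'OK':
        raise RuntimeError('IMAP search failed')
    matching, others = [], []
    for msg_id in data[0].split():
        # BODY.PEEK fetches only the Subject header without marking the mail as read.
        typ, parts = session.fetch(msg_id, '(BODY.PEEK[HEADER.FIELDS (SUBJECT)])')
        if typ != 'OK':
            raise RuntimeError('Could not fetch message {!r}'.format(msg_id))
        header = email.message_from_bytes(parts[0][1])
        subject = str(make_header(decode_header(header.get('Subject', ''))))
        if keyword.lower() in subject.lower():
            matching.append(msg_id)
        else:
            others.append(msg_id)
    return matching, others
```

The attachment loop would then run over `matching`, while both lists could be handed to whatever move or delete step is chosen.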

Re: Need help modifying a Gmail message retrieval script

Posted: Sun Feb 17, 2019 9:10 pm
by PhatFil
As a beginner with Python myself, I suspect there are more Pythonic methods using try/except, but I would start by peppering the script with print statements to check that the collected data matches your expectations.
Also confirm the contents of the inbox you are querying.

Have you selected any messages that should be moved, and do you still have them at the point where you try to move them? If so, look at the move command more closely.

Edit:

When it comes to the move command, insert print statements that show the UID of the message you want to move, and print its subject line too. Then, if the message isn't moved, you at least know that your program logic correctly identified a message to move and that the message is indexed correctly and accessible to your code at that point.
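
Alongside print statements, it may help to show the real exception, because the bare `except:` in the original script replaces every error with 'Something went wrong'. A small sketch of that idea follows; `run_reporting_errors` is a made-up helper, and `task` stands for any function that contains the IMAP steps.

```python
import traceback


def run_reporting_errors(task):
    """Run task() and print the full traceback if it fails."""
    try:
        task()
        return True
    except Exception:
        # Shows the exception type, message and line number on stderr.
        traceback.print_exc()
        return False
```

A mistake such as calling an undefined name would then be reported as a `NameError` together with the line it occurred on.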

Re: Need help modifying a Gmail message retrieval script

Posted: Mon Feb 18, 2019 6:09 pm
by sapnho
With a little help from a gifted Python programmer and everyone here in this thread, my solution is now working.

I have summarized it on my blog: https://www.thedigitalpictureframe.com/ ... via-email/

Thanks @PhatFil and @Idahowalker!