Recap from last week

In last week's article I developed a simple framework for exporting emails from Mail to emlx. The code has the following parts:

  1. Get some emails. I'll start with the selected emails.
  2. Do a loop with the emails I have.
  3. Get some data out of an email.
  4. Prepare the data for writing to a file.
  5. Create a file for writing.
  6. Write the data to a file.

Today I'm going to expand on some of the features:

  1. When exporting lots of emails I usually want to have the date in front of the subject.
  2. From the article on getting mailboxes for Mail I know what the name of a mailbox looks like. The next task is to export one large mailbox.
  3. Emails with a super long subject have the same problem as when dragging them to the Finder: writing fails. I need to find a way to make the filename shorter.

Adding the date to the filename

When I have lots of email files, I usually want the date in front of the subject in the filename for better sorting. In addition to the subject and the source I need the received date of the email. I need to add a line to the part where I get the basic data:


       
--get basic data
set theSubject to subject of theMessage as Unicode text
set theSource to source of theMessage as Unicode text
set theDate to date received of theMessage

The date needs to be formatted, which goes into the part where I do the data munging:

       
--prepare data for writing
set theSubject to my replace_chars(theSubject, ":", "_")
set theSubject to my format_date(theDate) & " " & theSubject

For the date I need the year, the month and the day. There is one trick with the month: the month of a date gives the month name, and coercing the month to an integer gives the number of the month. Finally, the 3 date parts need to be concatenated and coerced to a string:

--format the date of the email: returns an SQL-style date string such as 2022-8-10
on format_date(theDate)
    set theYear to year of theDate
    set theMonth to (month of theDate) as integer
    set theDay to day of theDate
    --without the coercion the & operator would return a list, because theYear is a number
    return (theYear & "-" & theMonth & "-" & theDay) as string
end format_date

That's all I need to do for the date.
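One possible refinement: format_date does not pad the month and the day, so in a plain character-by-character sort 2022-10-1 comes before 2022-8-10. Below is a minimal sketch of a padded variant. The handler name format_date_padded is my own invention for illustration and is not used elsewhere in the script.

```applescript
-- Sketch: like format_date, but pads month and day to two digits (2022-08-10).
-- The name format_date_padded is hypothetical, not part of the original script.
on format_date_padded(theDate)
	set theYear to (year of theDate) as text
	-- prepend "0" and keep the last two characters: 8 -> "08", 10 -> "10"
	set theMonth to text -2 thru -1 of ("0" & ((month of theDate) as integer))
	set theDay to text -2 thru -1 of ("0" & (day of theDate))
	return theYear & "-" & theMonth & "-" & theDay
end format_date_padded
```

To try it, replace the call to format_date in the data munging part with format_date_padded.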

Selecting a specific mailbox for export

From the article on getting mailboxes from Mail I know how to get the mailboxes for all accounts:

tell application id "com.apple.mail"
    set theMailboxes to get mailboxes
    set theAccounts to get accounts
    repeat with theAccount in theAccounts
        set AccountMailboxes to mailboxes of theAccount
        set end of theMailboxes to AccountMailboxes
    end repeat
    return theMailboxes
end tell

This is the short version of the script because I just want to make sure I get the correct mailbox. Part of the result looks like this:

mailbox "INBOX/Ablage/Ablage 2019" of account id "8DAF3D5A-E77F-4806-BD51-929BAE1D1943" of application "Mail", mailbox "INBOX/Ablage/Ablage 2020" of account id "8DAF3D5A-E77F-4806-BD51-929BAE1D1943" of application "Mail", mailbox "INBOX/Ablage/Acronis" of account id "8DAF3D5A-E77F-4806-BD51-929BAE1D1943" of application "Mail", 

8DAF3D5A etc is my Beatrixwillius account. So the correct syntax is:

tell application "Mail"
    set theMailbox to mailbox "INBOX/Ablage" of account "Beatrixwillius"
    set SelectedMessages to messages of theMailbox
end tell

The INBOX part is needed because the account has an IMAP path prefix. Then I need to get all emails of the mailbox.

Of course, I could select multiple mailboxes. But then I would have to add another outer repeat for the mailboxes.
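A rough sketch of what that outer repeat could look like is below. It reuses two of the mailbox names from the result above and my account name. I haven't run it against a large account, so treat it as a starting point only.

```applescript
-- Sketch: collect the messages of several mailboxes with an outer repeat.
-- The list of mailbox names is an example; adjust it to your own account.
tell application "Mail"
	set theMailboxNames to {"INBOX/Ablage/Ablage 2019", "INBOX/Ablage/Ablage 2020"}
	set SelectedMessages to {}
	repeat with theMailboxName in theMailboxNames
		-- "contents of" turns the loop reference into a plain string
		set theMailbox to mailbox (contents of theMailboxName) of account "Beatrixwillius"
		set SelectedMessages to SelectedMessages & (messages of theMailbox)
	end repeat
end tell
```

Collecting everything into one list keeps the rest of the export loop unchanged, at the cost of holding all message references in memory at once.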

Now to the hardest part.

Solving the file length problem part 1

When dragging emails out of Mail to export them, I noticed that emails with a super long subject of 255 characters weren't written to the hard disk. AppleScript has the same problem. But with AppleScript it's possible to solve the problem.

The subject of an email comes with special characters like é as one character. The Finder uses a form that uses 2 characters like ´e. If a subject has x characters I don't know if the Finder representation of the characters has the same length or is longer. So I make the file name shorter until writing the file works.

I'll present the solution only as pseudo code because after I had the idea for this I already had a better idea:

set FileWritten to false
repeat while not FileWritten
    try
        write file with subject as filename
        set FileWritten to true
    on error
        cut off one character on the right side from the subject
    end try
end repeat
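For readers who want to try the idea anyway, here is a minimal runnable sketch of the pseudo code. The handler write_shortened and its parameters are hypothetical names and not part of the exporter script. I'm assuming that a failed write shows up as an error from open for access or write.

```applescript
-- Sketch of the brute force idea. write_shortened is a hypothetical handler.
-- theFolder is an HFS path ending in ":", e.g. (path to desktop folder) as string.
-- Returns the name that finally worked, or "" if even a single character failed.
on write_shortened(theSource, theFolder, theName, theEnding)
	repeat
		set thePath to theFolder & theName & theEnding
		try
			set theFile to open for access file thePath with write permission
			set eof of theFile to 0
			write theSource to theFile as «class utf8»
			close access theFile
			return theName
		on error
			-- close the file in case it was opened before the error occurred
			try
				close access file thePath
			end try
			-- give up when nothing is left to cut off
			if (count of theName) ≤ 1 then return ""
			-- cut off one character on the right side
			set theName to text 1 thru -2 of theName
		end try
	end repeat
end write_shortened
```

The guard on the name length keeps the loop from running forever when the error has nothing to do with the length of the file name, for example a missing folder.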

The code is a brute force method. Passwords, for instance, can be hacked in this manner: a hacker can use a word list and simply try all words on the list. Here I just cut off a character from the problematic file name until the Finder can write the file. The brute force method is simple but not really elegant. But there is a better way! For the second method I need to know waaayyy more about macOS, though.

Solving the file length problem part 2

AppleScript consists of multiple parts:

  • The language itself like the "try" and "repeat".
  • Scripting additions.
  • The scripting dictionary of the apps.

However, there is another - less well known - part of AppleScript. It can use all features of macOS by talking to the Foundation framework. There is a function to convert the é form of a string to the ´e form. If the function exists in Foundation then it's possible to use the function in AppleScript! Personally I find the syntax really confusing. I need to use a function with the lovely name of decomposedStringWithCanonicalMapping. The code is a direct translation of the Apple developer docs (with the help of MacScripter.net):

use framework "Foundation"
use scripting additions

--convert the string to its decomposed form and get the length of the result
on get_precomposed_length(theText)
    set theApp to a reference to current application
    set theNSString to theApp's NSString's stringWithString:theText
    set theMutableNSString to theApp's NSMutableString's stringWithString:(theNSString's decomposedStringWithCanonicalMapping())
    return theMutableNSString's |length|() as text
end get_precomposed_length
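To see the difference between the two forms, here is a small sketch that measures a single é. The handler get_decomposed_length is a hypothetical, more compact variant of the handler above that returns an integer instead of text. The variable names are mine as well.

```applescript
use framework "Foundation"
use scripting additions

-- Sketch: length of theText after canonical decomposition.
-- get_decomposed_length is a hypothetical name for a compact variant of the handler above.
on get_decomposed_length(theText)
	set theNSString to current application's NSString's stringWithString:theText
	return ((theNSString's decomposedStringWithCanonicalMapping())'s |length|()) as integer
end get_decomposed_length

-- character id 233 is the precomposed é (U+00E9)
set composedChar to character id 233
set composedCount to count of composedChar -- 1: AppleScript sees one character
set decomposedCount to my get_decomposed_length(composedChar) -- 2: e plus the combining accent
```

I build the character with character id so that the result doesn't depend on how the é was typed or saved in the script file.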

With this function I can make my subject shorter if necessary.

The subject string needs to be cut into an array of the individual characters. This is done with a split function that I got off the internet.

For each character I'm going to add an entry to a second array with the length of that character after it went through decomposedStringWithCanonicalMapping.

Let's say my string is "bäume" (German plural of Baum for tree).

The result is something like: 

b: 1
ä: 2
u: 1
m: 1
e: 1

Now I just need to concatenate the array of the characters until I have the desired length of the characters:

property maxLength : 5

use framework "Foundation"
use scripting additions

set theText to "bäume"
set theText to my get_string_with_maxlength(theText)
return theText

on get_string_with_maxlength(theText)

    --get the full length of the string first
    set theLength to my get_precomposed_length(theText) as number
    if theLength ≤ maxLength then return theText

    --split the string into characters
    set TextAsArray to my split_string(theText, "")

    --for each character get the length
    set CharacterLengths to {}
    repeat with theChar in TextAsArray
        set CharLength to my get_precomposed_length(theChar)
        set end of CharacterLengths to CharLength
    end repeat

    --concatenate characters and count the length of the characters
    set currentLength to 0
    set Counter to 1
    set theText to ""
    repeat while currentLength < maxLength
        set currentLength to (item Counter of CharacterLengths) + currentLength
        set theText to theText & item Counter of TextAsArray
        set Counter to Counter + 1
    end repeat
    return theText
end get_string_with_maxlength

--convert the string to its decomposed form and get the length of the result
on get_precomposed_length(theText)
    set theApp to a reference to current application
    set theNSString to theApp's NSString's stringWithString:theText
    set theMutableNSString to theApp's NSMutableString's stringWithString:(theNSString's decomposedStringWithCanonicalMapping())
    return theMutableNSString's |length|() as text
end get_precomposed_length

on split_string(theString, theDelimiter)
    -- save delimiters to restore old settings
    set oldDelimiters to AppleScript's text item delimiters
    -- set delimiters to delimiter to be used
    set AppleScript's text item delimiters to theDelimiter
    set theArray to every text item of theString
    -- restore the old setting
    set AppleScript's text item delimiters to oldDelimiters
    return theArray
end split_string

The second repeat looks a bit odd so let me have a more detailed look at it. Below are the values of the variables before the counter value is increased:

Counter  currentLength  theText
0        0              ""
1        1              b
2        3              bä
3        4              bäu
4        5              bäum

The result for a maxLength of 5 is "bäum" because the ä counts as 2 characters in the third row.
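One thing to watch: the loop adds a character first and checks the length afterwards. If a 2-unit character like ä lands exactly on the boundary, the result is one unit longer than maxLength. With a maxLength of 2, for example, "bäume" would come out as "bä" with a decomposed length of 3. Below is a sketch of a variant that checks before appending. The handler names truncate_to_maxlength and get_decomposed_length are hypothetical, and the maximum length is passed as a parameter here instead of using the property. It also uses characters of theText for the split, so it doesn't depend on how text item delimiters treat an empty delimiter.

```applescript
use framework "Foundation"
use scripting additions

-- Sketch: length of theText after canonical decomposition (hypothetical helper).
on get_decomposed_length(theText)
	set theNSString to current application's NSString's stringWithString:theText
	return ((theNSString's decomposedStringWithCanonicalMapping())'s |length|()) as integer
end get_decomposed_length

-- Sketch: keep adding characters only while the decomposed length still fits.
-- truncate_to_maxlength and theMaxLength are hypothetical names.
on truncate_to_maxlength(theText, theMaxLength)
	set theResult to ""
	set currentLength to 0
	repeat with theChar in (characters of theText)
		set CharLength to my get_decomposed_length(contents of theChar)
		-- stop before the limit would be exceeded
		if currentLength + CharLength > theMaxLength then exit repeat
		set currentLength to currentLength + CharLength
		set theResult to theResult & (contents of theChar)
	end repeat
	return theResult
end truncate_to_maxlength
```

Because the check comes first, no separate comparison of the full length is needed: a string that already fits simply runs through the loop unchanged.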

Because of the structure of the script I just need to add the function for cutting off the subject to the main part of the script, which gives me the following script:

use framework "Foundation"
use scripting additions

property maxLength : 250
property FileEnding : ".eml"

tell application "Mail"

    --check if there is an email selected
    set SelectedMessages to (get selection)
    if SelectedMessages = {} then
        display dialog "Please select an email first!"
        return
    end if

    repeat with theMessage in SelectedMessages

        --get basic data
        set theSubject to subject of theMessage as Unicode text
        set theSource to source of theMessage as Unicode text
        set theDate to date received of theMessage

        --prepare data for writing
        set theSubject to my replace_chars(theSubject, ":", "_")
        set theSubject to my format_date(theDate) & " " & theSubject
        set theSubject to my get_string_with_maxlength(theSubject)

        --use file on desktop
        set this_file to (((path to desktop folder) as string) & theSubject & FileEnding)

        --write data to file
        my write_to_file(theSource, this_file, true)
    end repeat

end tell

--format the date of the email: returns an SQL-style date string such as 2022-8-10
on format_date(theDate)
    set theYear to year of theDate
    set theMonth to (month of theDate) as integer
    set theDay to day of theDate
    return (theYear & "-" & theMonth & "-" & theDay) as string
end format_date

--write data to file
on write_to_file(this_data, target_file, append_data)
    tell application "Finder"
        try
            set the target_file to the target_file as string
            set the open_target_file to open for access file target_file with write permission
            if append_data is false then set eof of the open_target_file to 0
            write this_data to the open_target_file as «class utf8» starting at eof
            close access the open_target_file
            return true
        on error errMsg number errNr
            try
                display dialog errMsg & " Nr.: " & errNr
                close access file target_file
            end try
            return false
        end try
    end tell
end write_to_file

--replace characters in text
on replace_chars(this_text, search_string, replacement_string)
    set AppleScript's text item delimiters to the search_string
    set the item_list to every text item of this_text
    set AppleScript's text item delimiters to the replacement_string
    set this_text to the item_list as string
    set AppleScript's text item delimiters to ""
    return this_text
end replace_chars

on get_string_with_maxlength(theText)

    --get the full length of the string first
    set theLength to my get_precomposed_length(theText) as number
    if theLength ≤ maxLength then return theText

    --split the string into characters
    set TextAsArray to my split_string(theText, "")

    --for each character get the length
    set CharacterLengths to {}
    repeat with theChar in TextAsArray
        set CharLength to my get_precomposed_length(theChar)
        set end of CharacterLengths to CharLength
    end repeat

    --concatenate characters and count the length of the characters
    set currentLength to 0
    set Counter to 1
    set theText to ""
    repeat while currentLength < maxLength
        set currentLength to (item Counter of CharacterLengths) + currentLength
        set theText to theText & item Counter of TextAsArray
        set Counter to Counter + 1
    end repeat
    return theText
end get_string_with_maxlength

--convert the string to its decomposed form and get the length of the result
on get_precomposed_length(theText)
    set theApp to a reference to current application
    set theNSString to theApp's NSString's stringWithString:theText
    set theMutableNSString to theApp's NSMutableString's stringWithString:(theNSString's decomposedStringWithCanonicalMapping())
    return theMutableNSString's |length|() as text
end get_precomposed_length

on split_string(theString, theDelimiter)
    -- save delimiters to restore old settings
    set oldDelimiters to AppleScript's text item delimiters
    -- set delimiters to delimiter to be used
    set AppleScript's text item delimiters to theDelimiter
    set theArray to every text item of theString
    -- restore the old setting
    set AppleScript's text item delimiters to oldDelimiters
    return theArray
end split_string

Final Thoughts

I think that this is the most complicated script that I have ever written. However, the script has a clear structure and can be extended easily.

What the script does not do is handle emails with duplicate subjects. But it would only take a minute to add the message id or the time to the file name. When dragging emails out of Mail a number is added to the file name, which could also be implemented in AppleScript.
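As a rough idea of how the numbering could work, here is a sketch. The handlers unique_file_path and file_exists are hypothetical and not part of the script above. I'm assuming that coercing a path to an alias fails when nothing exists at that path.

```applescript
-- Sketch: find a file path that is not taken yet by appending a counter.
-- unique_file_path and file_exists are hypothetical handlers.
-- theFolder is an HFS path ending in ":", e.g. (path to desktop folder) as string.
on unique_file_path(theFolder, theName, theEnding)
	set thePath to theFolder & theName & theEnding
	set Counter to 1
	repeat while my file_exists(thePath)
		set Counter to Counter + 1
		set thePath to theFolder & theName & " " & Counter & theEnding
	end repeat
	return thePath
end unique_file_path

on file_exists(thePath)
	try
		-- coercing to alias fails if nothing exists at the path
		set theAlias to thePath as alias
		return true
	on error
		return false
	end try
end file_exists
```

The counter makes the file name a few characters longer, so maxLength would need to leave some room for it.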