Package org.apache.poi.hsmf
Class MAPIMessage
java.lang.Object
org.apache.poi.POIDocument
org.apache.poi.POIReadOnlyDocument
org.apache.poi.hsmf.MAPIMessage
- All Implemented Interfaces:
Closeable, AutoCloseable
Reads an Outlook MSG File in and provides hooks into its data structure.
If you want to develop with HSMF, you might find it worth getting
some of the Microsoft public documentation, such as:
[MS-OXCMSG]: Message and Attachment Object Protocol Specification
-
Nested Class Summary
Nested Classes
static enum: A MAPI file can be an email (NOTE) or a number of other types.
-
Constructor Summary
Constructors
MAPIMessage(): Constructor for creating new files.
MAPIMessage(File file): Constructor for reading MSG Files from the file system.
MAPIMessage(InputStream in): Constructor for reading MSG Files from an input stream.
MAPIMessage(String filename): Constructor for reading MSG Files from the file system.
MAPIMessage(DirectoryNode poifsDir): Constructor for reading MSG Files from a certain point within a POIFS filesystem.
MAPIMessage(POIFSFileSystem fs): Constructor for reading MSG Files from a POIFS filesystem.
-
Method Summary
Methods
getAttachmentFiles(): Gets the message attachments.
getConversationTopic(): Gets the conversation topic of the parsed Outlook Message.
getDisplayBCC(): Gets the display value of the "BCC" line of the Outlook message.
getDisplayCC(): Gets the display value of the "CC" line of the Outlook message.
getDisplayFrom(): Gets the display value of the "FROM" line of the Outlook message. This is not the actual address that was sent from, but the formatted display of the user name.
getDisplayTo(): Gets the display value of the "TO" line of the Outlook message.
String[] getHeaders(): Returns all the headers, one entry per line.
getHtmlBody(): Gets the HTML body of this Outlook Message, if this email contains an HTML version.
getMainChunks(): Gets the main, core details chunks.
getMessageClassEnum(): Gets the message class of the parsed Outlook Message.
getMessageDate(): Gets the date on which the message was accepted by the server.
getNameIdChunks(): Gets the Name ID chunks, or null if there aren't any.
getRecipientDetailsChunks(): Gets all the recipient details chunks.
getRecipientEmailAddress(): Returns all the recipients' email addresses, separated by semicolons.
String[] getRecipientEmailAddressList(): Returns an array of all the recipients' email addresses, normally in TO then CC then BCC order.
getRecipientNames(): Returns all the recipients' names, separated by semicolons.
String[] getRecipientNamesList(): Returns an array of all the recipients' names, normally in TO then CC then BCC order.
getRtfBody(): Gets the RTF Rich Message body of this Outlook Message, if this email contains an RTF (rich) version.
getStringFromChunk(StringChunk chunk): Gets a string value based on the passed chunk.
getSubject(): Gets the subject line of the Outlook Message.
getTextBody(): Gets the plain text body of this Outlook Message.
void guess7BitEncoding(): Tries to identify the correct encoding for 7-bit (non-unicode) strings in the file.
boolean has7BitEncodingStrings(): Does this file contain any strings that are stored as 7 bit rather than unicode?
boolean isReturnNullOnMissingChunk(): Will you get a null on a missing chunk, or a ChunkNotFoundException (default is the exception).
void set7BitEncoding(String charset): Many messages store their strings as unicode, which is nice and easy.
void setReturnNullOnMissingChunk(boolean returnNullOnMissingChunk): Sets whether on asking for a missing chunk, you get back null or a ChunkNotFoundException (default is the exception).
Methods inherited from class org.apache.poi.POIReadOnlyDocument
write, write, write
Methods inherited from class org.apache.poi.POIDocument
clearDirectory, close, createInformationProperties, getDirectory, getDocumentSummaryInformation, getEncryptedPropertyStreamName, getEncryptionInfo, getPropertySet, getPropertySet, getSummaryInformation, initDirectory, readProperties, replaceDirectory, validateInPlaceWritePossible, writeProperties, writeProperties, writeProperties
-
Constructor Details
-
MAPIMessage
public MAPIMessage()
Constructor for creating new files.
-
MAPIMessage
Constructor for reading MSG Files from the file system.
- Parameters:
filename - Name of the file to read
- Throws:
IOException - on errors reading, or invalid data
-
MAPIMessage
Constructor for reading MSG Files from the file system.
- Parameters:
file - The file to read from
- Throws:
IOException - on errors reading, or invalid data
-
MAPIMessage
Constructor for reading MSG Files from an input stream.
Note: this will buffer the whole message into memory in order to process it. For lower memory use, use MAPIMessage(File).
- Parameters:
in - The InputStream to buffer then read from
- Throws:
IOException - on errors reading, or invalid data
-
MAPIMessage
Constructor for reading MSG Files from a POIFS filesystem.
- Parameters:
fs - Open POIFS FileSystem containing the message
- Throws:
IOException - on errors reading, or invalid data
-
MAPIMessage
Constructor for reading MSG Files from a certain point within a POIFS filesystem.
- Parameters:
poifsDir - Directory containing the message
- Throws:
IOException - on errors reading, or invalid data
-
-
Method Details
-
getStringFromChunk
Gets a string value based on the passed chunk.
- Throws:
ChunkNotFoundException - if the chunk isn't there
-
getTextBody
Gets the plain text body of this Outlook Message.
- Returns:
The string representation of the 'text' version of the body, if available.
- Throws:
ChunkNotFoundException - If the text-body chunk does not exist and returnNullOnMissingChunk is not set
-
getHtmlBody
Gets the HTML body of this Outlook Message, if this email contains an HTML version.
- Returns:
The string representation of the 'html' version of the body, if available.
- Throws:
ChunkNotFoundException - If the html-body chunk does not exist and returnNullOnMissingChunk is not set
-
getRtfBody
Gets the RTF Rich Message body of this Outlook Message, if this email contains an RTF (rich) version.
- Returns:
The string representation of the 'RTF' version of the body, if available.
- Throws:
ChunkNotFoundException - If the rtf-body chunk does not exist and returnNullOnMissingChunk is not set
-
getSubject
Gets the subject line of the Outlook Message.
- Throws:
ChunkNotFoundException - If the subject-chunk does not exist and returnNullOnMissingChunk is not set
-
getDisplayFrom
Gets the display value of the "FROM" line of the Outlook message. This is not the actual address that was sent from, but the formatted display of the user name.
- Throws:
ChunkNotFoundException - If the from-chunk does not exist and returnNullOnMissingChunk is not set
-
getDisplayTo
Gets the display value of the "TO" line of the Outlook message. If there are multiple recipients, they will be separated by semicolons. This is not the actual list of addresses/values that will be sent to if you click Reply in the email; those are stored in RecipientChunks.
- Throws:
ChunkNotFoundException - If the to-chunk does not exist and returnNullOnMissingChunk is not set
-
getDisplayCC
Gets the display value of the "CC" line of the Outlook message. If there are multiple recipients, they will be separated by semicolons. This is not the actual list of addresses/values that will be sent to if you click Reply in the email; those are stored in RecipientChunks.
- Throws:
ChunkNotFoundException - If the cc-chunk does not exist and returnNullOnMissingChunk is not set
-
getDisplayBCC
Gets the display value of the "BCC" line of the Outlook message. If there are multiple recipients, they will be separated by semicolons. This is not the actual list of addresses/values that will be sent to if you click Reply in the email; those are stored in RecipientChunks. This will only be present in sent emails, not received ones!
- Throws:
ChunkNotFoundException - If the bcc-chunk does not exist and returnNullOnMissingChunk is not set
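The display lines above are single strings in which multiple recipients are separated by semicolons. As a minimal sketch in plain Java, such a value could be split into individual entries as follows. The class `DisplayLines` and its `split` method are hypothetical names introduced for illustration and are not part of HSMF; a null input stands in for a missing chunk when null is returned instead of the exception.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of HSMF. */
final class DisplayLines {
    private DisplayLines() {}

    /**
     * Splits a semicolon-separated display line into trimmed entries.
     * A null input yields an empty list.
     */
    static List<String> split(String displayLine) {
        List<String> entries = new ArrayList<>();
        if (displayLine == null) {
            return entries;
        }
        for (String part : displayLine.split(";")) {
            String entry = part.trim();
            if (!entry.isEmpty()) { // skip blanks left by stray separators
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Sample input is made up for illustration.
        System.out.println(split("Alice Example; Bob Example"));
    }
}
```

With a parsed message `msg`, this would be called as `DisplayLines.split(msg.getDisplayTo())`. A display name that itself contains a semicolon would be split incorrectly, so the recipient chunks remain the more reliable source.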
-
getRecipientEmailAddress
Returns all the recipients' email addresses, separated by semicolons. Checks all the likely chunks in search of the addresses.
- Throws:
ChunkNotFoundException
-
getRecipientEmailAddressList
Returns an array of all the recipients' email addresses, normally in TO then CC then BCC order. Checks all the likely chunks in search of the addresses.
- Throws:
ChunkNotFoundException
-
getRecipientNames
Returns all the recipients' names, separated by semicolons. Checks all the likely chunks in search of the names. See also getDisplayTo(), getDisplayCC() and getDisplayBCC().
- Throws:
ChunkNotFoundException
-
getRecipientNamesList
Returns an array of all the recipients' names, normally in TO then CC then BCC order. Checks all the likely chunks in search of the names. See also getDisplayTo(), getDisplayCC() and getDisplayBCC().
- Throws:
ChunkNotFoundException
-
guess7BitEncoding
public void guess7BitEncoding()
Tries to identify the correct encoding for 7-bit (non-unicode) strings in the file.
Many messages store their strings as unicode, which is nice and easy. Some use one-byte encodings for their strings, but don't always store the encoding anywhere helpful in the file.
This method checks for codepage properties, and failing that looks at the headers for the message, and uses these to guess the correct encoding for your file.
Bug #49441 has more on why this is needed.
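To illustrate the header-based fallback described above, here is a minimal sketch in plain Java that looks for a charset declaration on a Content-Type header line. It is not the HSMF implementation; the class `HeaderCharset`, its `fromHeaders` method and the regular expression are assumptions made for this example, and folded (multi-line) headers are not handled.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not the HSMF implementation. */
final class HeaderCharset {
    private static final String CONTENT_TYPE = "Content-Type:";
    // Matches charset=value, with or without quotes around the value.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([^\\s;\"']+)", Pattern.CASE_INSENSITIVE);

    private HeaderCharset() {}

    /** Returns the charset named on the first Content-Type line that declares one. */
    static Optional<String> fromHeaders(String[] headers) {
        if (headers == null) {
            return Optional.empty();
        }
        for (String line : headers) {
            // Only look at lines that start with "Content-Type:" (case-insensitive).
            if (line == null
                    || !line.regionMatches(true, 0, CONTENT_TYPE, 0, CONTENT_TYPE.length())) {
                continue;
            }
            Matcher m = CHARSET.matcher(line);
            if (m.find()) {
                return Optional.of(m.group(1));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Sample headers are made up for illustration.
        String[] headers = {
            "Subject: Hello",
            "Content-Type: text/plain; charset=\"iso-8859-1\""
        };
        System.out.println(fromHeaders(headers).orElse("(none found)"));
    }
}
```

With a parsed message `msg`, the input would come from `msg.getHeaders()`, which returns one entry per line, and a value found this way could be passed to `set7BitEncoding(String)`.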
-
set7BitEncoding
Many messages store their strings as unicode, which is nice and easy. Some use one-byte encodings for their strings, but don't easily store the encoding anywhere in the file! If you know what the encoding of your file is, you can use this method to set the 7 bit encoding for all the non-unicode strings in the file.
- See Also:
-
has7BitEncodingStrings
public boolean has7BitEncodingStrings()
Does this file contain any strings that are stored as 7 bit rather than unicode?
-
getHeaders
Returns all the headers, one entry per line.
- Throws:
ChunkNotFoundException
-
getConversationTopic
Gets the conversation topic of the parsed Outlook Message. This is the part of the subject line that comes after any RE: and FWD: prefixes.
- Throws:
ChunkNotFoundException - If the conversation-topic chunk does not exist and returnNullOnMissingChunk is not set
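As an illustration of what "the part of the subject line after the RE: and FWD:" means, the following plain-Java sketch strips such prefixes from a subject string. It only approximates the conversation topic and does not read the conversation-topic chunk; the class `SubjectTopic`, its `stripPrefixes` method and the list of recognised prefixes (RE, FW, FWD) are assumptions made for this example.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of HSMF. */
final class SubjectTopic {
    // One or more leading "RE:", "FW:" or "FWD:" prefixes, in any letter case.
    private static final Pattern REPLY_PREFIXES =
            Pattern.compile("^(?:\\s*(?:RE|FWD|FW)\\s*:\\s*)+", Pattern.CASE_INSENSITIVE);

    private SubjectTopic() {}

    /** Removes leading reply/forward prefixes; returns null for a null subject. */
    static String stripPrefixes(String subject) {
        if (subject == null) {
            return null;
        }
        return REPLY_PREFIXES.matcher(subject).replaceFirst("").trim();
    }

    public static void main(String[] args) {
        // Sample subject is made up for illustration.
        System.out.println(stripPrefixes("RE: FWD: Quarterly report"));
    }
}
```

Such a helper might serve as a fallback when the conversation-topic chunk is missing, for example `SubjectTopic.stripPrefixes(msg.getSubject())` for a parsed message `msg`.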
-
getMessageClassEnum
Gets the message class of the parsed Outlook Message. (Yes, you can use this to determine whether a message is a calendar item, a note, or an actual Outlook message.) For emails the class will be IPM.Note.
- Throws:
ChunkNotFoundException - If the message-class chunk does not exist and returnNullOnMissingChunk is not set
-
getMessageDate
Gets the date on which the message was accepted by the server.
- Throws:
ChunkNotFoundException
-
getMainChunks
Gets the main, core details chunks.
-
getRecipientDetailsChunks
Gets all the recipient details chunks. These will normally be in the order of:
* TO recipients, in the order returned by getDisplayTo()
* CC recipients, in the order returned by getDisplayCC()
* BCC recipients, in the order returned by getDisplayBCC()
-
getNameIdChunks
Gets the Name ID chunks, or null if there aren't any.
-
getAttachmentFiles
Gets the message attachments.
-
isReturnNullOnMissingChunk
public boolean isReturnNullOnMissingChunk()
Will you get a null on a missing chunk, or a ChunkNotFoundException (default is the exception).
-
setReturnNullOnMissingChunk
public void setReturnNullOnMissingChunk(boolean returnNullOnMissingChunk)
Sets whether on asking for a missing chunk, you get back null or a ChunkNotFoundException (default is the exception).
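Because a missing chunk can surface either as null or as an exception depending on this setting, calling code may want one place that handles both. The plain-Java sketch below shows one possible shape for that; the class `MissingChunks` and its `orDefault` method are hypothetical and not part of HSMF, and for brevity the sketch catches any `Exception`, where real code would more likely catch ChunkNotFoundException specifically.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of HSMF. */
final class MissingChunks {
    private MissingChunks() {}

    /**
     * Calls the getter and returns its value, or the fallback if the getter
     * returns null or throws.
     */
    static String orDefault(Callable<String> getter, String fallback) {
        try {
            String value = getter.call();
            return value != null ? value : fallback;
        } catch (Exception e) {
            // Sketch only: this is deliberately broad so the example needs no
            // library types; narrow it in real code.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in getters that model the two modes for a missing chunk.
        System.out.println(orDefault(() -> null, "(no subject)"));
        System.out.println(orDefault(() -> { throw new Exception("missing chunk"); }, "(no subject)"));
    }
}
```

With a parsed message `msg`, a call might look like `MissingChunks.orDefault(msg::getSubject, "(no subject)")`, which gives the same result whichever mode is active.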
-