Add save_attachments action and provide enhancements to search_items action. #17

Merged
merged 136 commits into from
Dec 2, 2021
85302b6
Add "attachment_directory" configuration parameter for server directo…
Jun 24, 2021
6728259
Add handler for pack attachment configuration parameters.
Jun 25, 2021
d605c33
Disable attachment configuration for troubleshooting.
Jun 25, 2021
071121d
Remove st2common.content import for troubleshooting.
Jun 25, 2021
4bfaf37
Move st2common.content import into _attachment_configuration to avoid…
Jun 25, 2021
d8540d3
Add email_sender and email_recipient_list attributes to item_to_dict …
Jun 25, 2021
9b251e6
Cast "item.item_class" to str() for comparison.
Jun 25, 2021
8a9e3d1
Add debugging output to show fields on search results.
Jun 25, 2021
a9e843a
Change debugging to show class of items returned by search.
Jun 25, 2021
5a1aeb9
Remove check for item class.
Jun 25, 2021
0614905
Update item_to_dict() method to extract email address from exchangeli…
Jun 25, 2021
0d6a4da
Remove Item class from list of exchangelib imports.
Jun 25, 2021
e340f17
Debugging email addresses.
Jun 25, 2021
1d99738
Fix acquisition of sender email address.
Jun 25, 2021
a476dce
Change name of attribute for list of email recipients to "email_recip…
Jun 25, 2021
c93d182
Add support for searching for items by date using free-format date vi…
Jun 25, 2021
5792a38
Update _get_date_from_string to return EWSDateTime object instance.
Jun 25, 2021
7a816b1
Fix missing key in logging format statement.
Jun 25, 2021
f7d79e9
Add pytz library to make search start date timezone-aware using pack …
Jun 25, 2021
e2f05fa
Remove temporary debug logging.
Jun 25, 2021
7e69d58
Add support for searching for items by date using free-format date vi…
Jun 25, 2021
5725a6a
Merge branch 'save_attachments' of github.com:TimothyDJones/stackstor…
Jun 25, 2021
94c6455
Remove temporary debug logging.
Jun 25, 2021
8ca8f4d
Add check to ensure that attachment save directory is writeable.
Jun 25, 2021
bffc9b5
Cast attachment directory maximum size and days to keep attachments c…
Jun 25, 2021
25be039
Add check to ensure that attachment save directory is writeable.
Jun 25, 2021
03eadfe
Make class attributes associated with attachment configuration parame…
Jun 25, 2021
703db98
Re-enable attachment attribute configuration.
Jun 25, 2021
b0c93ed
Comment out custom attachment directory configuration code.
Jun 25, 2021
4211e25
Comment out custom attachment directory configuration code.
Jun 25, 2021
2cceab6
Change data structure access to retrieve attachment directory configu…
Jun 25, 2021
7bcf8a7
Add logging output for troubleshooting.
Jun 25, 2021
f4255b0
Add missing format() key.
Jun 25, 2021
f49e83f
Change data structure access to retrieve attachment directory configu…
Jun 25, 2021
dfbaf17
Update attachment configuration method to use "self.config" object.
Jun 25, 2021
fc8aa74
Re-enable all functionality for _attachment_configuration.
Jun 25, 2021
8b8b25a
Add implementation for saving email file attachments.
Jun 25, 2021
fb06a97
Correct data structure for ATTACHMENT_FORMAT lookup dictionary.
Jun 25, 2021
ef9ca24
Add folder_name attribute to results of item search to allow use of f…
Jun 25, 2021
49b9069
Update attachment save to include parent folder containing message an…
Jun 25, 2021
b316de7
Use "id" key to get message by ID.
Jun 25, 2021
15fd6dc
Use "item_id" key to get message by ID.
Jun 25, 2021
98515f6
Change criteria to retrieve email to save attachments to use combinat…
Jun 25, 2021
1eb71f2
Add missing changekey_id parameter to run() method.
Jun 25, 2021
a892630
Add logging output for troubleshooting.
Jun 25, 2021
b4a0ee2
Update logic to handle iterator returned by "account.fetch()" method.
Jun 25, 2021
46f5813
Add logging output for troubleshooting.
Jun 25, 2021
a61d317
Add logging output for troubleshooting.
Jun 25, 2021
cf32b33
Add logging output for troubleshooting.
Jun 25, 2021
bbf9e11
Add pytz to pack requirements.
Jun 25, 2021
8a8697e
Move _get_date_from_string() utility method to base/action.py.
Jun 25, 2021
1059b61
Re-factor attachment save functionality to obtain results via search.
Jun 25, 2021
a888e94
Update to use _get_date_from_string() utility method in base/action.py.
Jun 25, 2021
ad13017
Correct attribute name.
Jun 25, 2021
47d0679
Update output data structure of results from saving attachments.
Jun 25, 2021
668650a
Correct attribute name.
Jun 25, 2021
3c6401e
Correct fp name.
Jun 25, 2021
83b40ce
Correct fp name.
Jun 25, 2021
cf3b207
Add logging output for troubleshooting.
Jun 25, 2021
fa8adab
Changed to simple file writer for saving attachment.
Jun 25, 2021
f1adf36
Add attachment_folder_maximum_size and attachment_days_to_keep config…
Jun 28, 2021
69614bb
Updates for version 1.1.0.
Jun 28, 2021
ab7e53d
Change pack version to 1.1.0.
Jun 28, 2021
c196d22
Move common search code to utility method in base/actions.py to share…
Jun 28, 2021
417f55f
Implement attachment directory maintenance functionality.
Jun 28, 2021
25b75c6
Remove directory checks from attachment directory maintenance sensor,…
Jun 28, 2021
2fd4e22
Rename attachment directory maintenance sensor to better reflect purp…
Jun 28, 2021
8b0862b
Correct argument to join() to list.
Jun 28, 2021
3dd7320
Remove quotes from attachment_folder_maximum_size description.
Jun 28, 2021
5cfd97d
Fix attachment_directory_maximum_size parameter name.
Jun 28, 2021
30737fb
Add optional attachment_directory_maximum_size and attachment_days_to…
Jun 28, 2021
5792795
Additional updates for changes to pack.
Jun 28, 2021
3b777af
Update README for save_attachments and related actions, sensors, etc.
Jun 28, 2021
cde564b
Change contributor email address to personal email address.
Jun 28, 2021
9775971
Fix multiple assignment error.
Jun 28, 2021
bd285e0
Correct date calculations in _remove_old_files() method.
Jun 28, 2021
a315362
Add check for attachment directory and ensure it is writeable.
Jun 28, 2021
32d7d30
Cast date value converted to Unix epoch string to integer for compari…
Jun 28, 2021
26d0989
Use "path" attribute from scandir() result to get full pathname to fi…
Jun 28, 2021
03b09b4
Add missing key to format() statement.
Jun 28, 2021
f7679a2
Fix multiple assignment error.
Jun 28, 2021
c0e36b2
Add logging for troubleshooting.
Jun 28, 2021
56d2ccf
Change to using Python sorted() function to get sorted_file_list.
Jun 28, 2021
e5c2024
Additional notes on updates.
Jun 28, 2021
6f7504f
Correct "subject" object reference.
Jun 28, 2021
28f114a
Update to append to result list only if one or more attachments are s…
Jun 28, 2021
e0c6f74
Change log level for message when attachment is *not* file attachment…
Jun 28, 2021
b5e17e1
Remove temporary logging for debugging.
Jun 28, 2021
e1b2aef
Fix parameter for folder_name.
Jun 28, 2021
e6bc9dd
Flake8 fixes.
Jun 28, 2021
4475700
Flake8 fixes.
Jun 28, 2021
fd6c0db
Flake8 fixes.
Jun 28, 2021
718f4ee
Flake8 fixes.
Jun 28, 2021
13e7d46
Flake8 fixes.
Jun 29, 2021
92fc15b
Flake8 fixes.
Jun 29, 2021
76e4936
Flake8 fixes.
Jun 29, 2021
84e4cc4
Flake8 fixes.
Jun 29, 2021
428cc82
Flake8 fixes.
Jun 29, 2021
bdc5cf8
Update to call _search_items() method in base/actions.py instead of l…
Jun 29, 2021
cf701c0
Update to call _search_items() method in base/actions.py instead of l…
Jun 29, 2021
4971692
Fix comprehension for messages_as_dict to include outer list.
Jun 29, 2021
ca4c508
Flake8 fixes.
Jun 29, 2021
b788a46
Comment out code for troubleshooting.
Jun 29, 2021
b0c30cc
Refactor creation of messages_as_dict data structure for debug logging.
Jun 29, 2021
ada1491
Fix argument to dict() to make it list() of tuples.
Jun 29, 2021
51695f5
Fix argument to dict() to make it list() of tuples.
Jun 29, 2021
d6866d9
Add 'attachments' directory.
Jun 29, 2021
7a2d6ed
Correct initialization of att_filename_list to *BEFORE* looping throu…
Jul 2, 2021
125d7d8
Add method for creating unique filename by using *CURRENT* date/time …
Jul 2, 2021
eaf1361
Create utility method _construct_filename() and refactor _get_unique_…
Jul 2, 2021
7b763ae
Use positional arguments to _construct_filename() calls.
Jul 2, 2021
f412792
Create utility method _construct_filename() and refactor _get_unique_…
Jul 2, 2021
9f3f93b
Refactor _construct_filename() to use implicit default of self.attach…
Jul 2, 2021
b9af93a
Add enumerated value replace_spaces_in_filename parameter to allow re…
Jul 5, 2021
4e044af
Add dictionary and corresponding lookup for replace_spaces_in_filenam…
Jul 5, 2021
7fd0752
Change default for replace_spaces_in_filename parameter on save_attac…
Jul 5, 2021
a48b121
Add debug logging for troubleshooting.
Jul 5, 2021
9963c8b
Add debug logging for troubleshooting.
Jul 5, 2021
fa983d5
Add debug logging for troubleshooting.
Jul 5, 2021
c1a70a2
Fix target of dictionary get() function.
Jul 5, 2021
b2dbaa9
Remove temporary log debug output.
Jul 5, 2021
8832f66
Fix class hierarchy for datetime "now()" method.
Jul 6, 2021
2c9d9e1
Add "change_key" attribute to trigger payload to allow finding items …
Jul 14, 2021
3d1d46f
Add _get_item_by_id() utility method to get MS Exchange item (email m…
Jul 14, 2021
fa7e1dd
Update "save attachments" action to accept alternate input of combina…
Jul 14, 2021
53a73af
Update maximum revision of exchangelib to 1.12.5.
Jul 14, 2021
e8f201e
Update maximum revision of exchangelib to 2.2.0.
Jul 14, 2021
0e194b7
Update maximum revision of exchangelib to 1.12.5.
Jul 14, 2021
f477156
Revert to maximum version of 1.10.0 for exchangelib.
Jul 14, 2021
44a7688
Update call to exchangelib fetch() method to use a *list* of tuple fo…
Jul 14, 2021
0ac2f6d
Change to use dictionary get() method for "folder_name" attribute, be…
Jul 14, 2021
d569e04
Add "change_key" attribute to ItemSensor payload schema YAML definition.
Sep 2, 2021
90a4270
Change to require use on only version 2020.1 of pytz library.
Sep 8, 2021
a498ee4
Add dependency on version 2.1 of tzlocal library, because of API vers…
Sep 8, 2021
89694ff
Add ".vscode" settings directory to list of directories to ignore.
Oct 5, 2021
df00745
Remove VS Code settings from repository.
Oct 5, 2021
3 changes: 3 additions & 0 deletions .gitignore
@@ -87,3 +87,6 @@ ENV/

# Rope project settings
.ropeproject

# VS Code project settings
.vscode
42 changes: 41 additions & 1 deletion CHANGES.md
@@ -1,5 +1,45 @@
# Change Log

## 1.1.0

* **Add** `save_attachments` action.
  * Includes several _pack_ configuration options:
    - `attachment_directory`: Fully-qualified server path name used to store attachments. Must be readable and writeable by StackStorm. Defaults to "/opt/stackstorm/packs/msexchange/attachments".
    - `attachment_folder_maximum_size`: Maximum storage space in MB (default is 50MB) allotted to `attachment_directory`. Pack maintenance process (see below) manages this.
    - `attachment_days_to_keep`: Maximum number of days to keep saved attachments (default is 7 days). Also managed by the pack maintenance process.
  * Uses the same model as `search_items` for finding email messages for which to save attachments. Only _email_ messages and _file_ attachments are supported.
  * Action returns a _list/array_ of dictionaries with the following attributes:
    - `email_subject` - Full subject of the email.
    - `email_sent` - Date that email was sent.
    - `sender_email_address` - Email address of the sender.
    - `attachment_files` - _List/array_ of fully-qualified filenames from server of attachments saved. Example:
      ```JSON
      [
        {
          "email_subject": "ACCOUNT LIST - resend for testing",
          "email_sent": "2021-06-25 13:54:34+00:00",
          "sender_email_address": "someone@example.com",
          "attachment_files": [
            "/opt/stackstorm/packs/msexchange/attachments/Accounts_06_23_2021.xlsx"
          ]
        }
      ]
      ```
  * If attachment filename is **not** unique in target folder, attempts to generate a unique filename for each attachment via several methods, including date sent, date _and_ time sent, and "random" 8-character string.
  * Attachments can be saved either as BINARY (default) or TEXT format.
  * An `attachment_directory_maintenance` sensor/trigger/rule combination has been implemented and, by default, runs once daily (polling interval of 86400 seconds) to enforce these rules through the `do_attachment_directory_maintenance` action. This action can also be run manually if you need to override the pack configuration values; running it manually with no input values uses the pack configuration.
  * Maintenance process removes files by age first and then, if necessary, deletes remaining files starting with _largest_ files until threshold is reached.
  * Save attachment action has `replace_spaces_in_filename` enumerated value parameter (NONE [default], UNDERSCORE, OCTOTHORPE/HASH, and PIPE) to allow user to replace spaces in attachment file names, if desired. Default (NONE) is to preserve spaces.

* **Enhancements** to `search_items` action.
  * Moved search logic from `run` method in `search_items` action into `_search_items` utility method in `base/actions.py` to allow functionality to be shared by `search_items` and `save_attachments`.
  * Added `search_start_date` parameter specifying the start date for items to search. (End date is always "today".) Date can be entered as free-form text. Almost any date format is supported, as the [`dateutil`](https://dateutil.readthedocs.io/) library is used to parse input to a valid `datetime` value.
  * Update `search_items` action to return additional item attributes, specific to _email_ messages, from `item_to_dict` helper method. Such attributes can be useful in filtering emails based on sender and/or originating domain.
    - `sender_email_address` - Email address of sender.
    - `email_recipient_addresses` - List/array of email recipients (from [`exchangelib`](https://ecederstrand.github.io/exchangelib/) `to_recipients` list **only**).
  * Added optional `folder_name` parameter to `item_to_dict` helper method to include the name of the folder used in the search as attribute of returned dictionary.
  * Update `requirements.txt` to include `python-dateutil` (see above) and `pytz`, which is needed for creating timezone-aware `exchangelib` [`EWSDateTime`](https://ecederstrand.github.io/exchangelib/exchangelib/ewsdatetime.html#exchangelib.ewsdatetime.EWSDateTime) objects for date searches.
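
The unique-filename and space-replacement behavior listed in the 1.1.0 notes could look roughly like the sketch below. It is only an illustration under stated assumptions: `unique_filename`, `SPACE_REPLACEMENTS`, the key spellings, and the suffix formats are hypothetical and are not taken from the pack's code.

```python
import os
import secrets
from datetime import datetime

# Assumed mapping for the enumerated parameter; the pack's actual lookup
# dictionary and key names may differ.
SPACE_REPLACEMENTS = {"NONE": " ", "UNDERSCORE": "_", "HASH": "#", "PIPE": "|"}


def unique_filename(directory, filename, sent, replace_spaces="NONE"):
    """Return a path inside `directory` that does not exist yet."""
    filename = filename.replace(" ", SPACE_REPLACEMENTS[replace_spaces])
    stem, ext = os.path.splitext(filename)
    candidates = [
        filename,                                         # original name
        "{}_{:%Y%m%d}{}".format(stem, sent, ext),         # date sent
        "{}_{:%Y%m%d_%H%M%S}{}".format(stem, sent, ext),  # date and time sent
    ]
    for name in candidates:
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            return path
    # Last resort: append a random 8-character hexadecimal string.
    while True:
        name = "{}_{}{}".format(stem, secrets.token_hex(4), ext)
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            return path
```

Trying the cheaper, human-readable suffixes first keeps saved files easy to trace back to the message they came from; the random suffix is only a fallback.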

## 1.0.0

* Drop Python 2.7 support
@@ -15,7 +55,7 @@
## 0.1.3

* Set default folder to `Inbox` for `search_items`
* Fixed sensor bug with config object handling for non-autodiscovery systems
* Fixed sensor bug with config object handling for non-autodiscovery systems

## 0.1.2

14 changes: 12 additions & 2 deletions README.md
@@ -6,14 +6,19 @@ This pack provides Microsoft Exchange integration to perform simple searches on
Exchange Server 2010, 2013 and 2016 as well as Office 365 hosted Exchange accounts.

## Actions
* `do_attachment_directory_maintenance` - Perform maintenance on server directory in which email file attachments are saved.
* `get_calendar_items` - Get a list of calendar items within a date range
* `get_folder` - Get information about a folder (mail, contact, meta)
* `list_folders` - List all folders or subfolders within a folder
* `search_items` - Search for items by subject within a folder (default folder Inbox)
* `save_attachments` - Save _file_ attachments on _email_ messages to server directory.
* `search_items` - Search for items by subject and/or date within a folder (default folder Inbox)
* `send_email` - Send an email

## Rules
* `attachment_directory_maintenance` - Runs maintenance (storage usage) via `do_attachment_directory_maintenance` action on server directory in which file attachments are saved when triggered by associated sensor.

## Sensors
* `attachment_directory_maintenance_sensor` - Runs maintenance periodically (default daily).
* `item_sensor` - Monitors the configured folder (Inbox by default) for new items and sends an `exchange_new_item` trigger when one is received

## Configuration
@@ -48,4 +53,9 @@ username: "bob@company.com"
password: "B0bsPassword!"
timezone: "Europe/London"
sensor_folder: "My folder to monitor"
```
```

### Email Attachment Configuration
- `attachment_directory`: Fully-qualified server path name used to store attachments. Must be readable and writeable by StackStorm. Defaults to "/opt/stackstorm/packs/msexchange/attachments".
- `attachment_folder_maximum_size`: Maximum storage space in MB (default is 50MB) allotted to `attachment_directory`. Pack maintenance process (see below) manages this.
- `attachment_days_to_keep`: Maximum number of days to keep saved attachments (default is 7 days). Also managed by the pack maintenance process.
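
The size and age limits above are enforced by the pack maintenance process, which the change log describes as removing files by age first and then deleting the largest remaining files until the size threshold is met. A minimal sketch of that policy follows. `prune_directory` and its parameters are hypothetical names; the sketch assumes a flat directory and uses file modification time as the age, which may not match what the `do_attachment_directory_maintenance` action actually does.

```python
import os
import time


def prune_directory(directory, max_size_mb=50, days_to_keep=7, now=None):
    """Delete old files, then the largest files, until under the size cap.

    Returns the list of paths that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - days_to_keep * 86400
    removed = []
    remaining = []

    # Snapshot the directory first so deletions do not disturb iteration.
    with os.scandir(directory) as entries:
        files = [entry for entry in entries if entry.is_file()]

    # Pass 1: remove files older than the retention window.
    for entry in files:
        info = entry.stat()
        if info.st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.path)
        else:
            remaining.append((info.st_size, entry.path))

    # Pass 2: if still over the cap, delete the largest files first.
    limit = max_size_mb * 1024 * 1024
    total = sum(size for size, _ in remaining)
    for size, path in sorted(remaining, reverse=True):
        if total <= limit:
            break
        os.remove(path)
        removed.append(path)
        total -= size
    return removed
```

Passing `now` explicitly is only there to make the function easy to test; in normal use it defaults to the current time.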
17 changes: 16 additions & 1 deletion actions/base/__init__.py
@@ -1,3 +1,6 @@
from exchangelib import Mailbox, Message


def folder_to_dict(folder):
    return {
        'id': folder.folder_id,
@@ -9,7 +12,7 @@ def folder_to_dict(folder):
    }


def item_to_dict(item, include_body=False):
def item_to_dict(item, include_body=False, folder_name=None):
    result = {
        'id': item.item_id,
        'changekeyid': item.changekey,
@@ -32,4 +35,16 @@ def item_to_dict(item, include_body=False):
    if not include_body:
        del result['body']
        del result['text_body']
    if folder_name:
        result["folder_name"] = folder_name
    # If this is an email message, add sender and recipients.
    if isinstance(item, Message):
        result["sender_email_address"] = None
        if isinstance(item.sender, Mailbox):
            result["sender_email_address"] = str(item.sender.email_address)
        result["email_recipient_addresses"] = list()
        for recpt in item.to_recipients:
            if isinstance(recpt, Mailbox):
                result["email_recipient_addresses"].append(
                    str(recpt.email_address))
    return result
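
The change log notes that the new sender attributes can help filter emails by sender or originating domain. A small sketch of such a filter over the dictionaries returned by `item_to_dict()` is shown here; `filter_by_sender_domain` is a hypothetical helper, not part of this pull request, and it assumes non-message items simply lack the `sender_email_address` key.

```python
def filter_by_sender_domain(items, domain):
    """Keep result dicts whose sender address belongs to `domain`.

    `items` is assumed to be a list of dictionaries shaped like the
    output of item_to_dict().
    """
    # Anchor on "@" so "example.com" does not match "notexample.com".
    suffix = "@" + domain.lower().lstrip("@")
    matches = []
    for item in items:
        # The key is missing for non-email items and may be None.
        sender = item.get("sender_email_address") or ""
        if sender.lower().endswith(suffix):
            matches.append(item)
    return matches
```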
129 changes: 128 additions & 1 deletion actions/base/action.py
@@ -4,7 +4,8 @@
from st2common.runners.base_action import Action
from st2client.client import Client
from st2client.models import KeyValuePair
from exchangelib import Account, ServiceAccount, Configuration, DELEGATE, EWSTimeZone
from exchangelib import (Account, ServiceAccount, Configuration, DELEGATE,
                         EWSTimeZone, EWSDateTime)

CacheEntry = namedtuple('CacheEntry', 'ews_url ews_auth_type primary_smtp_address')

@@ -54,6 +55,9 @@ def __init__(self, config):
access_type=DELEGATE)
self._store_cache_configuration()

# Configure attachment parameters
self._attachment_configuration()

def _store_cache_configuration(self):
ews_url = self.account.protocol.service_endpoint
ews_auth_type = self.account.protocol.auth_type
@@ -79,3 +83,126 @@ def _get_cache(self):
primary_smtp_address=primary_smtp_address.value)
else:
return None

    def _attachment_configuration(self):
        attach_dir = self.config.get("attachment_directory", None)
        if not attach_dir:
            try:
                from st2common.content import utils as content_utils
                pack_name = getattr(self.action_service._action_wrapper,
                                    "_pack", "unknown")
                pack_path = content_utils.get_pack_base_path(pack_name)
                attach_dir = os.path.join(pack_path, "attachments")
            except ImportError:
                err_msg = (
                    "Unable to import 'st2common.content.utils' "
                    "library. Using pack default attachment directory of "
                    "'/opt/stackstorm/packs/msexchange/attachments'.")
                self.logger.error(err_msg)
                attach_dir = "/opt/stackstorm/packs/msexchange/attachments"
        else:
            attach_dir = os.path.abspath(attach_dir)

        # Create the folder/directory, if it doesn't exist,
        # and make it writeable.
        if not os.path.exists(attach_dir):
            os.makedirs(attach_dir, exist_ok=True)
            os.chmod(attach_dir, 0o755)
            self.logger.info("Created directory '{dir}' and made writeable."
                             .format(dir=attach_dir))

        if not os.access(attach_dir, os.W_OK):
            raise OSError("Unable to write to attachment directory '{dir}'."
                          .format(dir=attach_dir))

        self.attachment_directory = attach_dir
        self.attachment_directory_maximum_size = int(self.config.get(
            "attachment_directory_maximum_size", 50))
        self.attachment_days_to_keep = int(self.config.get(
            "attachment_days_to_keep", 7))

    def _get_date_from_string(self, date_str=None):
        """
        Use dateutil library (https://dateutil.readthedocs.io/) to parse
        unstructured date string to standard format.
        :param date_str str: Date as string in unknown/unstructured format
        :returns EWSDateTime object or None
        """
        # If date_str is not provided, we assume that this is for the *end*
        # of the filter range, which we set to "now", using timezone from
        # pack configuration.
        if not date_str:
            return EWSDateTime.now(tz=self.timezone)

        try:
            from dateutil import parser
            import pytz
            parsed_date = parser.parse(date_str)
            utc_date = pytz.utc.localize(parsed_date)
            local_date = utc_date
            try:
                local_date = utc_date.astimezone(self.timezone)
            except Exception:
                self.logger.error("Unable to convert search date to pack "
                                  "timezone. Using UTC...")
            start_date = EWSDateTime.from_datetime(local_date)
            self.logger.debug("Search start date: {dt}".format(dt=start_date))
        except ImportError:
            self.logger.error("Unable to find/load 'dateutil' library.")
            start_date = None
        except ValueError:
            self.logger.error("Invalid format for date input: {dt}"
                              .format(dt=date_str))
            start_date = None

        return start_date

    def _search_items(self, folder, subject=None, search_start_date=None):
        """
        Common method for searching for MS Exchange items (email messages,
        calendar items, etc.). Used by _search_items_ and _save_attachments_.
        """
        folder = self.account.root.get_folder_by_name(folder)
Review comment: Maybe to prep for the upgrade of exchangelib in #16, remove the `get_folder_by_name` call outlined in #20, as it is now deprecated.


        start_date = None
        if search_start_date:
            start_date = self._get_date_from_string(search_start_date)
            # For email messages, MS Exchange does not support using only a
            # start date for searches. Instead, we must use a *range* of dates
            # for search, so we set the *end* of range to "now".
            # See https://stackoverflow.com/a/48742644 for details.
            end_date = self._get_date_from_string()

        if subject:
            if start_date:
                # First, try searching for messages...
                try:
                    items = folder.filter(
                        subject__contains=subject,
                        datetime_received__range=(start_date, end_date))
                # Search on other items, which have regular "start" attribute.
                except Exception:
                    items = folder.filter(
                        subject__contains=subject, start__gte=start_date)
            else:
                items = folder.filter(subject__contains=subject)
        else:
            if start_date:
                try:
                    items = folder.filter(
                        datetime_received__range=(start_date, end_date))
                except Exception:
                    items = folder.filter(start__gte=start_date)
            else:
                items = folder.all()

        return (items)

    def _get_item_by_id(self, item_id, change_key):
        """
        Utility method to get MS Exchange item (email message, calendar item,
        etc.) by combination of item ID and change key directly.
        """

        item_iter = self.account.fetch(ids=[(item_id, change_key)])
        return [item for item in item_iter]
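
One point that may be worth checking in `_get_date_from_string()`: the parsed date is always localized as UTC before conversion to the pack timezone. As far as I know, `pytz.utc.localize()` raises `ValueError` for a datetime that already carries timezone information, so input such as `2021-06-25 13:54:34+00:00` could end up reported as an invalid date format. A possible way to handle both cases is sketched below using only the standard library; `to_timezone` is a hypothetical helper, not the pack's code, and it keeps the method's assumption that naive input is UTC.

```python
from datetime import datetime, timedelta, timezone


def to_timezone(parsed, target_tz):
    """Return `parsed` as an aware datetime expressed in `target_tz`.

    Naive values are assumed to be UTC; aware values keep their instant
    and are only converted to the target timezone.
    """
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(target_tz)


# Usage: a fixed +01:00 offset stands in for the pack's configured timezone.
pack_tz = timezone(timedelta(hours=1))
naive_start = to_timezone(datetime(2021, 6, 25, 12, 0), pack_tz)
aware_start = to_timezone(
    datetime(2021, 6, 25, 12, 0, tzinfo=timezone(timedelta(hours=2))), pack_tz)
```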