Merge pull request #93 from AutoIDM/archived_refactor
Draft: Archived refactor
visch authored Nov 17, 2021
2 parents bc69f68 + 177c05d commit b8097da
Showing 9 changed files with 231 additions and 244 deletions.
24 changes: 5 additions & 19 deletions README.md
@@ -1,7 +1,4 @@
# `tap-clickup`
![Build and Tests](https://github.com/AutoIDM/tap-clickup/actions/workflows/ci.yml/badge.svg?branch=main)
[![PyPI download month](https://img.shields.io/pypi/dm/tap-clickup.svg)](https://pypi.python.org/pypi/tap-clickup/)

# `tap-clickup` ![Build and Tests](https://github.com/AutoIDM/tap-clickup/actions/workflows/ci.yml/badge.svg?branch=main) [![PyPI download month](https://img.shields.io/pypi/dm/tap-clickup.svg)](https://pypi.python.org/pypi/tap-clickup/)
`tap-clickup` is a Singer tap for ClickUp.

## Capabilities
@@ -16,7 +13,6 @@
| Setting | Required | Default | Description |
|:----------|:--------:|:-------:|:------------|
| api_token | True | None | Example: 'pk_12345' |
| start_date| False | None | Example: '2010-01-01T00:00:00Z' |

A full list of supported settings and capabilities is available by running: `tap-clickup --about`

@@ -30,8 +26,6 @@ A full list of supported settings and capabilities is available by running: `tap
* This is a personal token, it's fine to use a personal token as this tap is only for the business that's using the data.

## Clickup Replication
Incremental Replication keys are available for Tasks. The Task uses the updated at field as [documented in the tasks section](https://clickup.com/apiv1) of the api.

Start Date is used for the initial updated at value for the updated at field with tasks.

Let's say that you only want tasks that have been updated in the last year. To accomplish this you would pass in a start date of the first of this year!
@@ -128,21 +122,13 @@ Note that the most up to date information is located in tap_clickup/streams.py.
- Bookmark column(s): N/A
- Link to API endpoint documentation: [Custom Field](https://jsapi.apiary.io/apis/clickup20/reference/0/custom-fields/get-accessible-custom-fields.html)

### Folderless Tasks
- Table name: folderless_task
### Tasks
- Table name: tasks
- Description: Some tasks do not sit under folders. This comes from the folderless_list endpoint
- Primary key column(s): id
- Replicated fully or incrementally: Yes
- Bookmark column(s): date_updated. Note that the api endpoint date_updated_gt is greater than or equal to, not just greater than.
- Link to API endpoint documentation: [Get Tasks](https://jsapi.apiary.io/apis/clickup20/reference/0/tasks/get-tasks.html)

### Folder Tasks
- Table name: folder_task
- Description: Some tasks do not sit under folders. This comes from the folderless_list endpoint
- Primary key column(s): id
- Replicated fully or incrementally: Yes
- Replicated fully or incrementally: No
- Bookmark column(s): date_updated. Note that the api endpoint date_updated_gt is greater than or equal to, not just greater than.
- Link to API endpoint documentation: [Get Tasks](https://jsapi.apiary.io/apis/clickup20/reference/0/tasks/get-tasks.html)
- Link to API endpoint documentation: [Get Tasks](https://jsapi.apiary.io/apis/clickup20/reference/0/tasks/get-filtered-team-tasks.html)
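
The bookmark note above says `date_updated_gt` behaves as greater than or equal to. The sketch below simulates that behaviour locally to show the consequence: a second run that starts from the saved bookmark returns the bookmarked record again. It does not call the ClickUp API. The helper names `filter_updated_gte` and `drop_already_seen` are hypothetical, and the sample records assume `date_updated` is a millisecond timestamp sent as a string.

```python
from typing import Dict, Iterable, List, Set


def filter_updated_gte(tasks: Iterable[Dict], bookmark: int) -> List[Dict]:
    """Simulate the inclusive filter the README describes for date_updated_gt."""
    return [t for t in tasks if int(t["date_updated"]) >= bookmark]


def drop_already_seen(tasks: Iterable[Dict], seen_ids: Set[str]) -> List[Dict]:
    """Hypothetical client-side dedupe keyed on the primary key `id`."""
    return [t for t in tasks if t["id"] not in seen_ids]


# Made-up sample data for illustration only.
tasks = [
    {"id": "a", "date_updated": "1000"},
    {"id": "b", "date_updated": "2000"},
    {"id": "c", "date_updated": "3000"},
]

first_run = filter_updated_gte(tasks, 0)
bookmark = max(int(t["date_updated"]) for t in first_run)

# Because the filter is inclusive, the bookmarked task comes back again.
second_run = filter_updated_gte(tasks, bookmark)
new_only = drop_already_seen(second_run, {t["id"] for t in first_run})
```

A target that upserts on `id` absorbs the repeated row without extra work; otherwise something like the dedupe step is needed downstream.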

## Other Info

3 changes: 0 additions & 3 deletions meltano.yml
@@ -11,13 +11,10 @@ plugins:
- catalog
- discover
settings:
- name: username
- name: api_token
kind: password
select:
- '*.*'
config:
start_date: '2010-01-01T00:00:00Z'
loaders:
- name: target-jsonl
variant: andyh1203
255 changes: 140 additions & 115 deletions poetry.lock

Large diffs are not rendered by default.

5 changes: 4 additions & 1 deletion pyproject.toml
@@ -28,7 +28,7 @@ keywords = [
[tool.poetry.dependencies]
python = "<3.10,>=3.6.2"
requests = "^2.25.1"
singer-sdk = "0.3.11"
singer-sdk = "0.3.13"

[tool.poetry.dev-dependencies]
pytest = "^6.1.2"
@@ -38,6 +38,9 @@ tox = "^3.23.1"
codecov = "^2.1.11"
pylint = "2.10.2"

[tool.pytest.ini_options]
log_cli = 1

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
46 changes: 30 additions & 16 deletions tap_clickup/client.py
@@ -1,6 +1,6 @@
"""REST client handling, including ClickUpStream base class."""

from typing import Any, Optional, Iterable, cast
from typing import Any, Optional, Iterable, cast, Dict
from pathlib import Path
from datetime import datetime
import time
@@ -19,6 +19,17 @@ class ClickUpStream(RESTStream):
url_base = "https://api.clickup.com/api/v2"
records_jsonpath = "$[*]" # Or override `parse_response`.
next_page_token_jsonpath = "$.next_page" # Or override `get_next_page_token`.

    def get_url_params(
        self, context: Optional[dict], next_page_token: Optional[Any]
    ) -> Dict[str, Any]:
        """Return a dictionary of values to be used in URL parameterization."""
        params: dict = {}
        if next_page_token:
            params["page"] = next_page_token
        if context:
            params["archived"] = context.get("archived")
        return params

@property
def http_headers(self) -> dict:
@@ -30,21 +41,6 @@ def http_headers(self) -> dict:
headers["Authorization"] = self.config.get("api_token")
return headers

def get_next_page_token(
self, response: requests.Response, previous_token: Optional[Any]
) -> Optional[Any]:
"""Return a token for identifying next page or None if no more pages."""
if self.next_page_token_jsonpath:
all_matches = extract_jsonpath(
self.next_page_token_jsonpath, response.json()
)
first_match = next(iter(all_matches), None)
next_page_token = first_match
else:
next_page_token = response.headers.get("X-Next-Page", None)

return next_page_token

@backoff.on_exception(
backoff.expo,
(requests.exceptions.RequestException),
@@ -114,3 +110,21 @@ def _request_with_backoff(
def parse_response(self, response: requests.Response) -> Iterable[dict]:
"""Parse the response and return an iterator of result rows."""
yield from extract_jsonpath(self.records_jsonpath, input=response.json())

    def from_parent_context(self, context: dict):
        """Default is to return the dict passed in."""
        if self.partitions is None:
            return context
        # Was going to copy the partitions, but the _sync call forces us
        # to use partitions instead of being able to provide a list of contexts.
        # Ideally we wouldn't mutate partitions here, and we'd just provide
        # a copy of partitions with context merged so we don't have side effects.
        for partition in self.partitions:
            partition.update(context.copy())  # Add copy of context to partition
        return None  # Context is now handled at the partition level

    def _sync_children(self, child_context: dict) -> None:
        for child_stream in self.child_streams:
            if child_stream.selected or child_stream.has_selected_descendents:
                child_stream.sync(
                    child_stream.from_parent_context(context=child_context)
                )
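
The comments in `from_parent_context` mention that the preferred approach would be a copy of the partitions with the parent context merged in, rather than mutating `self.partitions`. A minimal sketch of that merge follows, as a standalone function. `merge_partitions_with_context` is a hypothetical helper: it is not part of singer-sdk or of this commit, and it does not address the `_sync` constraint the comments describe.

```python
from typing import Dict, List, Optional


def merge_partitions_with_context(
    partitions: Optional[List[Dict[str, str]]], context: Dict[str, str]
) -> List[Dict[str, str]]:
    """Return new partition dicts with the parent context merged in.

    Hypothetical helper for illustration; neither input is modified.
    """
    if partitions is None:
        # No partitions declared: the parent context is the only context.
        return [dict(context)]
    # Context keys win on conflict, matching partition.update(context) in the diff.
    return [{**partition, **context} for partition in partitions]


partitions = [{"archived": "true"}, {"archived": "false"}]
merged = merge_partitions_with_context(partitions, {"space_id": "42"})
```

Each merged dict carries both the `archived` flag and the parent's `space_id`, while the original `partitions` list keeps only the `archived` keys.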

18 changes: 16 additions & 2 deletions tap_clickup/schemas/list.json
@@ -56,8 +56,22 @@
"$ref": "#definitions/status"
},
"priority": {
"type": ["null", "string"]
},
"type": ["string", "object", "null"],
"properties":{
"color":{
"type": ["string"]
},
"id":{
"type": ["string"]
},
"orderindex":{
"type": ["string"]
},
"priority":{
"type": ["string"]
}
}
},
"assignee": {
"type": ["object", "null"],
"properties": {
112 changes: 34 additions & 78 deletions tap_clickup/streams.py
@@ -1,6 +1,6 @@
"""Stream type classes for tap-clickup."""
from pathlib import Path
from typing import Optional, Any, Dict, cast
from typing import Optional, Any, Dict, cast, Iterable
import datetime
import pendulum
import requests
@@ -39,14 +39,15 @@ class SpacesStream(ClickUpStream):
schema_filepath = SCHEMAS_DIR / "space.json"
records_jsonpath = "$.spaces[*]"
parent_stream_type = TeamsStream
partitions = [{"archived":"true"},{"archived":"false"}]


def get_child_context(self, record: dict, context: Optional[dict]) -> dict:
"""Return a context dictionary for child streams."""
return {
"space_id": record["id"],
}


class FoldersStream(ClickUpStream):
"""Folders"""

@@ -57,6 +58,7 @@ class FoldersStream(ClickUpStream):
schema_filepath = SCHEMAS_DIR / "folder.json"
records_jsonpath = "$.folders[*]"
parent_stream_type = SpacesStream
partitions = [{"archived":"true"},{"archived":"false"}]

def get_child_context(self, record: dict, context: Optional[dict]) -> dict:
"""Return a context dictionary for child streams."""
@@ -75,6 +77,7 @@ class FolderListsStream(ClickUpStream):
schema_filepath = SCHEMAS_DIR / "list.json"
records_jsonpath = "$.lists[*]"
parent_stream_type = FoldersStream
partitions = [{"archived":"true"},{"archived":"false"}]

def get_child_context(self, record: dict, context: Optional[dict]) -> dict:
"""Return a context dictionary for child streams."""
@@ -93,6 +96,7 @@ class FolderlessListsStream(ClickUpStream):
schema_filepath = SCHEMAS_DIR / "list.json"
records_jsonpath = "$.lists[*]"
parent_stream_type = SpacesStream
partitions = [{"archived":"true"},{"archived":"false"}]

def get_child_context(self, record: dict, context: Optional[dict]) -> dict:
"""Return a context dictionary for child streams."""
@@ -172,20 +176,44 @@ class FolderCustomFieldsStream(ClickUpStream):
records_jsonpath = "$.fields[*]"
parent_stream_type = FolderListsStream

class TasksStream(ClickUpStream):
"""Tasks Stream"""

class ClickUpTasksStream(ClickUpStream):
"""Parent Class for Task Streams"""

name = "task"
#Date_updated_gt is greater than or equal to not just greater than
path = "/team/{team_id}/task?include_closed=true&subtasks=true"
primary_keys = ["id"]
#replication_key = "date_updated"
#is_sorted = True
#ignore_parent_replication_key = True
schema_filepath = SCHEMAS_DIR / "task.json"
records_jsonpath = "$.tasks[*]"
parent_stream_type = TeamsStream
partitions = [{"archived":"true"},{"archived":"false"}]

initial_replication_key_dict = {}

def initial_replication_key(self, context) -> int:
path = self.get_url(context)
path = self.get_url(context) + context.get("archived")
key_cache: Optional[int] = self.initial_replication_key_dict.get(path, None)
if key_cache is None:
key_cache = self.get_starting_replication_key_value(context)
self.initial_replication_key_dict[path] = key_cache
assert key_cache is not None
return key_cache

def get_url_params(
self, context: Optional[dict], next_page_token: Optional[Any]
) -> Dict[str, Any]:
"""Return a dictionary of values to be used in URL parameterization."""
params: dict = {}
if next_page_token:
params["page"] = next_page_token
params["archived"] = context.get("archived")
params["order_by"] = "updated"
params["reverse"] = "true"
params["date_updated_gt"] = 0
return params

def get_starting_replication_key_value(
self, context: Optional[dict]
@@ -237,75 +265,3 @@ def get_next_page_token(
newtoken = None

return newtoken

def get_url_params(
self, context: Optional[dict], next_page_token: Optional[Any]
) -> Dict[str, Any]:
"""Return a dictionary of values to be used in URL parameterization."""
params: dict = {}
if next_page_token:
params["page"] = next_page_token

# Replication key specific to tasks
if self.replication_key:
params["order_by"] = "updated"
params["reverse"] = "true"
params["date_updated_gt"] = self.initial_replication_key(
context
) # Actually greater than or equal to
return params


class FolderlessTasksStream(ClickUpTasksStream):
"""Tasks can come from lists not under folders"""

name = "folderless_task"
path = "/list/{list_id}/task?include_closed=true&subtasks=true"
primary_keys = ["id"]
replication_key = "date_updated"
is_sorted = True
ignore_parent_replication_key = True
schema_filepath = SCHEMAS_DIR / "task.json"
records_jsonpath = "$.tasks[*]"
parent_stream_type = FolderlessListsStream


class FolderlessTasksArchivedStream(ClickUpTasksStream):
"""
Tasks can come from lists not under folders,
archived only pulls archived tasks
"""

name = "folderless_task_archived"
path = "/list/{list_id}/task?include_closed=true&subtasks=true&archived=true"
primary_keys = ["id"]
replication_key = "date_updated"
is_sorted = True
ignore_parent_replication_key = True
schema_filepath = SCHEMAS_DIR / "task.json"
records_jsonpath = "$.tasks[*]"
parent_stream_type = FolderlessListsStream


class FolderTasksStream(ClickUpTasksStream):
"""Tasks can come from under Folders"""

name = "folder_task"
path = "/list/{list_id}/task?include_closed=true&subtasks=true"
primary_keys = ["id"]
replication_key = "date_updated"
schema_filepath = SCHEMAS_DIR / "task.json"
records_jsonpath = "$.tasks[*]"
parent_stream_type = FolderListsStream


class FolderTasksArchivedStream(ClickUpTasksStream):
"""Tasks can come from under Folders, archived only pulls archived tasks"""

name = "folder_task_archived"
path = "/list/{list_id}/task?include_closed=true&subtasks=true&archived=true"
primary_keys = ["id"]
replication_key = "date_updated"
schema_filepath = SCHEMAS_DIR / "task.json"
records_jsonpath = "$.tasks[*]"
parent_stream_type = FolderListsStream
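
The `get_url_params` override added to the tasks stream in `streams.py` builds one query string per `archived` partition. The sketch below reproduces that logic as a standalone function to show the resulting query. `build_task_params` is a hypothetical name, the context values are made up, and the `if context:` guard is an addition of this sketch (the diff calls `context.get` directly).

```python
from typing import Any, Dict, Optional
from urllib.parse import urlencode


def build_task_params(
    context: Optional[dict], next_page_token: Optional[Any]
) -> Dict[str, Any]:
    """Mirror the task stream's URL parameters from the diff."""
    params: Dict[str, Any] = {}
    if next_page_token:
        params["page"] = next_page_token
    # Guard added for this sketch: skip `archived` when no context is given.
    if context:
        params["archived"] = context.get("archived")
    params["order_by"] = "updated"
    params["reverse"] = "true"
    params["date_updated_gt"] = 0
    return params


# One query string per partition declared on the stream.
queries = [
    urlencode(build_task_params({"team_id": "123", "archived": flag}, 2))
    for flag in ("true", "false")
]
```

With these inputs the first query is `page=2&archived=true&order_by=updated&reverse=true&date_updated_gt=0`; the second differs only in `archived=false`.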
11 changes: 2 additions & 9 deletions tap_clickup/tap.py
@@ -15,12 +15,9 @@
GoalsStream,
TagsStream,
SharedHierarchyStream,
FolderTasksStream,
FolderlessTasksStream,
TasksStream,
FolderCustomFieldsStream,
FolderlessCustomFieldsStream,
FolderlessTasksArchivedStream,
FolderTasksArchivedStream,
)

STREAM_TYPES = [
@@ -33,12 +30,9 @@
GoalsStream,
TagsStream,
SharedHierarchyStream,
FolderTasksStream,
FolderlessTasksStream,
TasksStream,
FolderCustomFieldsStream,
FolderlessCustomFieldsStream,
FolderlessTasksArchivedStream,
FolderTasksArchivedStream,
]


@@ -49,7 +43,6 @@ class TapClickUp(Tap):

config_jsonschema = th.PropertiesList(
th.Property("api_token", th.StringType, required=True),
th.Property("start_date", th.DateTimeType),
).to_dict()

def discover_streams(self) -> List[Stream]:
1 change: 0 additions & 1 deletion tap_clickup/tests/test_core.py
@@ -10,7 +10,6 @@
SAMPLE_CONFIG = {
"start_date": datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%d"),
"api_token": os.environ["TAP_CLICKUP_API_TOKEN"],
"team_ids": os.environ["TAP_CLICKUP_TEAM_IDS"],
}

