Upload book workflow #40

Merged · 4 commits · May 23, 2024
42 changes: 42 additions & 0 deletions .github/workflows/build_pdf_book.yml
@@ -0,0 +1,42 @@
name: Build latest version of PDF Book

on:
  workflow_dispatch:
  repository_dispatch:
    types: [rebuild-book]

permissions:
  contents: write

jobs:
  build_pdf_book:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: install requirements
        run: |
          cd new-website
          cd utils
          pip install -r requirements.txt
          sudo apt-get install -y poppler-utils
          sudo apt-get install -y wkhtmltopdf

      - name: fetch latest version of tutorials
        run: |
          sudo apt-get install jq
          cd new-website
          cd utils/tutorials
          python3 fetch_tutorials.py

      - name: build pdf book
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          cd new-website
          cd utils/tutorials
          python3 build_pdf_book.py


8 changes: 8 additions & 0 deletions .github/workflows/test.yml
@@ -24,6 +24,8 @@ jobs:
          cd new-website
          cd utils
          pip install -r requirements.txt
          sudo apt-get install -y poppler-utils
          sudo apt-get install -y wkhtmltopdf

      - name: Test tutorial fetching and export
        run: |
@@ -37,4 +39,10 @@ jobs:
          cd utils/tutorials
          python3 test_utils.py

      - name: Test tutorials build pdf book functions
        run: |
          cd new-website
          cd utils/tutorials
          python3 test_build_pdf_book.py


23 changes: 17 additions & 6 deletions new-website/README.md
@@ -93,6 +93,7 @@ A detailed description of the working of the scripts is given below.
- The script then merges these PDFs and creates the file `merged.pdf`.
- The `merged.pdf` file is then uploaded to the S3 bucket.
- Please note that `pdfunite` (part of the `poppler-utils` package) must be installed for the merge step: `apt install poppler-utils`. A sketch of such a merge call is shown below.
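For reference, a minimal sketch of such a merge call, assuming `pdfunite` is available on `PATH` (the function name and file names here are illustrative, not the script's actual inputs):

```python
import subprocess

def merge_pdfs(pdf_files, output_path="merged.pdf"):
    """Merge the given PDFs into one file using pdfunite from poppler-utils."""
    # pdfunite expects the input PDFs in order, followed by the output file name.
    subprocess.run(["pdfunite", *pdf_files, output_path], check=True)

# Illustrative usage:
# merge_pdfs(["tutorial_01.pdf", "tutorial_02.pdf"])
```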



## Deployment
@@ -107,12 +108,22 @@ A detailed description of the working of the scripts is given below.

## Workflow script

- The `deploy_gh_pages.yml` workflow script in `.github/workflows` is triggered on updates to the main branch.
- The workflow runs a single job comprising of 3 steps
- Fetch version data: This step fetches the latest deepchem release version from the github [api endpoint](https://api.github.com/repos/deepchem/deepchem/releases) and updates the terminal commands in `deepchem/data/home/terminal-commands.json`
- Install and build: This step checks out the repository, installs the required dependencies using npm i, runs the linting process with npm run lint, and generates the static website with npm run export.
- Deploy: This step deploys the website to the gh-pages branch using the [JamesIves/github-pages-deploy-action](https://github.com/JamesIves/github-pages-deploy-action). The website files are copied from the deepchem/out directory, and any files listed in the clean-exclude parameter are excluded from the cleaning process.
- ### `deploy_gh_pages.yml`

- The `deploy_gh_pages.yml` workflow script in `.github/workflows` is triggered on updates to the main branch.
- The workflow runs a single job comprising three steps (a sketch of the version fetch is shown after this list):
    - Fetch version data: This step fetches the latest DeepChem release version from the GitHub [API endpoint](https://api.github.com/repos/deepchem/deepchem/releases) and updates the terminal commands in `deepchem/data/home/terminal-commands.json`.
    - Install and build: This step checks out the repository, installs the required dependencies with `npm i`, runs the linting process with `npm run lint`, and generates the static website with `npm run export`.
    - Deploy: This step deploys the website to the `gh-pages` branch using the [JamesIves/github-pages-deploy-action](https://github.com/JamesIves/github-pages-deploy-action). The website files are copied from the `deepchem/out` directory, and any files listed in the `clean-exclude` parameter are excluded from the cleaning process.
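A minimal sketch of what the version fetch amounts to, using the endpoint linked above (the function name and return handling are illustrative, not the workflow's exact code):

```python
import json
import urllib.request

def latest_deepchem_release() -> str:
    """Return the tag name of the most recent DeepChem release."""
    url = "https://api.github.com/repos/deepchem/deepchem/releases"
    with urllib.request.urlopen(url) as resp:
        releases = json.load(resp)
    # The releases API lists the newest release first; tag_name holds the version string.
    return releases[0]["tag_name"]
```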

- ### `build_pdf_book.yml`

- The `build_pdf_book.yml` workflow script in `.github/workflows` runs on a manual `workflow_dispatch` trigger or on a `repository_dispatch` event of type `rebuild-book`, which is sent when the tutorials in the `deepchem/examples/tutorials` directory of the `deepchem` repository are updated (a sketch of sending such an event is shown after this list).
- The workflow runs a single job comprising three steps:
    - Install requirements: This step installs the dependencies specified in the `requirements.txt` file in `new-website/utils`. It also installs the `poppler-utils` and `wkhtmltopdf` packages.
    - Fetch latest version of tutorials: This step installs the `jq` package and then runs the `fetch_tutorials.py` script.
    - Build PDF book: This step runs the `build_pdf_book.py` script.
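For reference, a `repository_dispatch` event of type `rebuild-book` can be sent to the repository hosting this workflow with a short script like the sketch below (the `requests` dependency, function name, and token handling are assumptions, not part of this PR):

```python
import requests

def trigger_book_rebuild(owner: str, repo: str, token: str) -> None:
    """Send a repository_dispatch event that the build_pdf_book workflow listens for."""
    resp = requests.post(
        f"https://api.github.com/repos/{owner}/{repo}/dispatches",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        # event_type must match the type listed under repository_dispatch in the workflow.
        json={"event_type": "rebuild-book"},
    )
    resp.raise_for_status()  # GitHub returns 204 No Content on success
```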

## Workflow overview

![](./public/assets/workflow.png)
32 changes: 25 additions & 7 deletions new-website/utils/tutorials/build_pdf_book.py
@@ -106,20 +106,38 @@ def upload_file(file_name, bucket, object_name=None):
        True if file was uploaded, else False

    """
    # Before uploading, archive any existing copy of the object under a
    # timestamped key so the previous version of the book is preserved.
    s3_client = boto3.client('s3')
    try:
        response = s3_client.head_object(Bucket=bucket, Key=object_name)
    except ClientError as e:
        logging.error(e)
    else:
        last_modified_datetime = response['LastModified']
        format = '%Y-%m-%d_%H:%M:%S'
        formatted_time = last_modified_datetime.strftime(format)
        old_key = object_name
        new_key = f'{object_name[:-4]}_{formatted_time}.pdf'

        # Copy the old object to the timestamped key, then remove the original.
        s3_client.copy_object(
            Bucket=bucket,
            CopySource={'Bucket': bucket, 'Key': old_key},
            Key=new_key
        )

        s3_client.delete_object(
            Bucket=bucket,
            Key=old_key
        )

    # If S3 object_name was not specified, use file_name
    if object_name is None:
        object_name = os.path.basename(file_name)

    # Upload the file
    s3_client = boto3.client('s3')
    try:
        response = s3_client.upload_file(file_name, bucket, object_name)
    except ClientError as e:
        logging.error(e)
        return False
    return True


def merge_pdf(info_path=INFO_PATH, pdf_path=PDF_PATH):
"""
@@ -182,6 +200,6 @@ def compile_information_pages():
html_to_pdf()
merge_pdf()
compile_information_pages()
merge_pdf_pages(['storage/title.pdf', 'storage/acknowledgement.pdf', 'storage/contents.pdf', 'storage/merged.pdf'])
upload_file('storage/full_pdf.pdf', 'deepchemtutorials', 'TutorialsBook.pdf')
merge_pdf_pages(['cover.pdf', 'storage/title.pdf', 'storage/acknowledgement.pdf', 'storage/contents.pdf', 'storage/merged.pdf'])
upload_file('storage/full_pdf.pdf', 'deepchemdata', 'book/TutorialsBook.pdf')

Binary file added new-website/utils/tutorials/cover.pdf
Binary file not shown.