How to Fix Your Notebook's Images #21

Open
calnfynn opened this issue Jul 11, 2024 · 0 comments
How to Fix Your Notebook’s Images

(Combined fix for issues #20 and #13.)

All changes need to be made in the file section.ipynb (or whatever you renamed your notebook to).

Update Quarto

Before starting, update Quarto to the latest version (as of July 2024), Quarto 1.5.54. In a Codespace, you do this by creating a new Codespace. Before doing that, make sure you have pushed any changes back to GitHub.

You can create a new Codespace at https://github.com/codespaces. You only need to select the repository you want to use; all other settings can stay the same.

To update a local installation, download the release from https://quarto.org/docs/download/release.html
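To confirm that the update worked, you can compare the installed version against 1.5.54. The following is only a minimal sketch: the helper names (parse_version, is_new_enough, installed_quarto_version) are made up for this example, and it assumes the quarto command is on your PATH.

```python
import re
import shutil
import subprocess

REQUIRED = "1.5.54"  # the version named in this guide


def parse_version(text):
    """Turn a string such as '1.5.54' into (1, 5, 54) for numeric comparison."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_new_enough(installed, required=REQUIRED):
    """True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


def installed_quarto_version():
    """Return the output of `quarto --version`, or None if quarto is unavailable."""
    if shutil.which("quarto") is None:
        return None
    result = subprocess.run(
        ["quarto", "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = installed_quarto_version()
    if version is None:
        print("quarto was not found on PATH")
    elif is_new_enough(version):
        print(f"Quarto {version} is recent enough")
    else:
        print(f"Quarto {version} is older than {REQUIRED}, please update")
```

Comparing tuples of integers rather than raw strings matters here, because as text "1.10.0" would sort before "1.5.54".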


1. Duplicated Images

The duplicated images are caused by the SPARQL query: it returns every result twice, once for each date format. To fix that, change the query.

Replace the content of your query_img with this:

query_img = """PREFIX cps: <https://computational-publishing-service.wikibase.cloud/entity/>
PREFIX cpss: <https://computational-publishing-service.wikibase.cloud/entity/statement/>
PREFIX cpsv: <https://computational-publishing-service.wikibase.cloud/value/>
PREFIX cpspt: <https://computational-publishing-service.wikibase.cloud/prop/direct/>
PREFIX cpsp: <https://computational-publishing-service.wikibase.cloud/prop/>
PREFIX cpsps: <https://computational-publishing-service.wikibase.cloud/prop/statement/>
PREFIX cpspq: <https://computational-publishing-service.wikibase.cloud/prop/qualifier/>

SELECT DISTINCT ?itemLabel ?itemDescr ?imgItem ?imgUrl ?publishDate
WHERE
{
  ?imgItem cpsp:P107 ?urlStatement. 
  ?urlStatement cpsps:P107 ?imgUrl. 
  ?imgItem cpsp:P60 ?dateStatement. 
  ?dateStatement cpsps:P60 ?publishDate. 
  ?imgItem cpsp:P6 ?partOfStatement.
  ?partOfStatement cpsps:P6 ?partOfItem.
  <placeholder> 

  FILTER (datatype(?publishDate) = xsd:edtf)
  
  SERVICE wikibase:label {
      bd:serviceParam wikibase:language "en,de".
      ?imgItem rdfs:label ?itemLabel.
      ?imgItem schema:description ?itemDescr.
    }
}"""

The new query adds a FILTER that only lets through dates in one of the two formats (the xsd:edtf datatype), so each image is returned once.
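To see what the filter does, here is a small sketch that applies the same condition in Python to made-up results in the SPARQL JSON format. The sample rows are invented for illustration, and the second datatype (xsd:dateTime) is an assumption; the real endpoint may use a different one for the duplicate rows.

```python
# Full IRI that the prefixed name xsd:edtf expands to.
EDTF = "http://www.w3.org/2001/XMLSchema#edtf"
# Assumed datatype of the duplicate rows (not confirmed by the guide).
DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"

# Invented sample data: each image appears once per date format.
sample_bindings = [
    {"imgItem": {"value": "item-A"},
     "publishDate": {"value": "1700", "datatype": EDTF}},
    {"imgItem": {"value": "item-A"},
     "publishDate": {"value": "1700-01-01T00:00:00Z", "datatype": DATETIME}},
    {"imgItem": {"value": "item-B"},
     "publishDate": {"value": "1725", "datatype": EDTF}},
    {"imgItem": {"value": "item-B"},
     "publishDate": {"value": "1725-01-01T00:00:00Z", "datatype": DATETIME}},
]


def keep_edtf_only(bindings):
    """Mimic FILTER (datatype(?publishDate) = xsd:edtf) on result rows."""
    return [
        row for row in bindings
        if row["publishDate"].get("datatype") == EDTF
    ]


filtered = keep_edtf_only(sample_bindings)
print(len(sample_bindings), "->", len(filtered))  # prints: 4 -> 2
```

Doing the filtering inside the query, as the fix above does, is preferable to filtering afterwards in Python, because the duplicates never leave the endpoint.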


2. Images Rendering at the End of Your Book

The images end up in the wrong place because Quarto doesn't recognise them as proper image files. The Python library used for image rendering (PIL) lets them show up in the Jupyter notebook but causes errors during conversion to HTML, PDF and EPUB.

2.1. Replace the old get_img function with this snippet, which outputs the images as Markdown instead:

def get_img(partOfItem_id):
    # Note: html.unescape below needs `import html` at the start of the file.

    # Restrict the query to one "part of" item if an ID is given,
    # otherwise drop the placeholder and query all images.
    q = ""
    if partOfItem_id:
        q = query_img.replace("<placeholder>", "?partOfStatement cpsps:P6 cps:"+partOfItem_id+".")
    else:
        q = query_img.replace("<placeholder>","")

    results_img = run_query(endpoint_url, q)

    # Print one block of Markdown per image in the result
    for item in results_img["results"]["bindings"]:

        title = item['itemLabel']['value']
        # Decode HTML entities (e.g. &amp;) in the description
        description = html.unescape(item['itemDescr']['value'])

        print('\nWikibase link: ' + '[' + item['imgItem']['value'] + ']' + '(' + item['imgItem']['value'] + ')' + '\n')
        print('Title: ' + title + '\n')
        print('Year: ' + item['publishDate']['value'] + '\n')
        print('Description: ' + description + '\n')

        # get image from image URL
        image_url=item['imgUrl']['value']

        # display image with title + alt text (in markdown)
        print('!['+ title +']('+image_url+'){fig-alt="'+description+'"}\n\n')

Instead of an image thumbnail, your notebook's output will now show a Markdown expression similar to this: ![Die barocken Schloss- und Gartenveduten bild](https://previous.bildindex.de/bilder/fmd10005861a.jpg){fig-alt="Bild für Die barocken Schloss- und Gartenveduten"}. This is fine: it will still show up as an image in the other formats.
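One thing to watch for: a description that contains a double quote or a line break would end the fig-alt attribute early. The sketch below shows one possible way to guard against that. The markdown_image helper is hypothetical and not part of the original notebook, and swapping double quotes for single quotes is just one simple choice.

```python
def markdown_image(title, image_url, description):
    """Build the Markdown image expression that get_img prints.

    Hypothetical helper for illustration only.
    """
    # Collapse line breaks and repeated spaces into single spaces.
    alt_text = " ".join(description.split())
    # Double quotes would close the fig-alt="..." attribute, so swap them.
    alt_text = alt_text.replace('"', "'")
    return f'![{title}]({image_url}){{fig-alt="{alt_text}"}}'


print(markdown_image(
    "Die barocken Schloss- und Gartenveduten bild",
    "https://previous.bildindex.de/bilder/fmd10005861a.jpg",
    "Bild für Die barocken Schloss- und Gartenveduten",
))
```

With the title, URL and description from the example above, this prints the same Markdown expression as the original print statement.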

2.2. Remove the now unused imports and functions:

Simply delete these lines at the very start of the file:

from datetime import datetime
import time
from PIL import Image

Also delete these two functions, which are no longer needed now that PIL isn't used:

def get_delay(date):
    try:
        date = datetime.datetime.strptime(date, '%a, %d %b %Y %H:%M:%S GMT')
        timeout = int((date - datetime.datetime.now()).total_seconds())
    except ValueError:
        timeout = int(date)
    return timeout

def fetch_image_by_url(url, headers):
    r = requests.get(url, headers=headers, stream=True)
    if r.status_code == 200:
        im = Image.open(r.raw)
        return im
    if r.status_code == 500:
        return None
    if r.status_code == 403:
        return None
    if r.status_code == 429:
        timeout = get_delay(r.headers['retry-after'])
        print('Timeout {} m {} s'.format(timeout // 60, timeout % 60))
        time.sleep(timeout)
        fetch_image_by_url(url, headers)
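After deleting the imports and functions, it can help to check that nothing in the notebook still refers to them. The following is a rough sketch under a few assumptions: find_leftovers and load_notebook are hypothetical helpers, the list of names is just the ones removed above, and the word-based search is a heuristic that can also match comments or strings.

```python
import json
import re

# Names removed in this step; adjust the list to your notebook.
LEFTOVER_NAMES = ["PIL", "Image", "fetch_image_by_url", "get_delay", "time.sleep"]


def load_notebook(path):
    """Read an .ipynb file (which is plain JSON) into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_leftovers(notebook, names=LEFTOVER_NAMES):
    """Return (cell index, name) pairs for code cells that still use a name."""
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a list of lines
            source = "".join(source)
        for name in names:
            if re.search(r"\b" + re.escape(name) + r"\b", source):
                hits.append((index, name))
    return hits


# Small in-memory notebook for demonstration.
sample = {"cells": [
    {"cell_type": "code", "source": ["import requests\n", "from PIL import Image\n"]},
    {"cell_type": "markdown", "source": ["Uses PIL in prose only\n"]},
    {"cell_type": "code", "source": "print('ok')"},
]}
print(find_leftovers(sample))  # prints: [(0, 'PIL'), (0, 'Image')]
```

For the real file you would call find_leftovers(load_notebook("section.ipynb")); an empty list means none of the listed names were found in any code cell.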

3. Update Your Repository

  • Run all your code cells in section.ipynb.
  • In the terminal, run quarto render and then quarto preview.
  • Push the changes to your repository.
  • Your images should now be fixed in all outputs!
@calnfynn calnfynn self-assigned this Jul 11, 2024
@calnfynn calnfynn added the demo label Jul 11, 2024