Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding more examples to the gallery #80

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

Adding more examples to the gallery #80

wants to merge 11 commits into from

Conversation

kolibril13
Copy link
Collaborator

@kolibril13 kolibril13 commented Aug 7, 2024

Hi there,

In this pull request, I’ve added some new examples and made a few improvements.
Fistly, I added a new notebook, 01_download_all_data.ipynb, which handles downloading all the assets.
Previously, each notebook contained commands like:

!mkdir -p ../images/
!wget https://github.com/niivue/niivue/raw/main/demos/images/narps-4735_50GV-hypo1_unthresh.nii.gz -P ../images/
!wget https://github.com/niivue/niivue/raw/main/demos/images/narps-4965_9U7M-hypo1_unthresh.nii.gz -P ../images/
!wget https://github.com/niivue/niivue/raw/main/demos/images/mni152.nii.gz -P ../images/

Having these commands at the top of every notebook felt cluttered, and running “Run All” would repeatedly re-download the assets, which is not efficient.

For better development convenience, I suggest we remove these headers from all notebooks and centralize the asset download in 01_download_all_data.ipynb.

Additionally, I’ve reorganized the example image folder, so the paths no longer need the ../ prefix. For instance, ../images/mni152.nii.gz is now simply images/mni152.nii.gz.

The notebook meshes_(GIfTI, FreeSurfer, MZ3, OBJ, STL, legacy VTK).ipynb does not work yet because nv.setMeshShader is not yet implmented

And the notebook trajectory.ipynb does not work yet because the fiber feature is not yet implemented yet (probably).

Therefore, I'm making this a draft pull request.

closes #103

@bcalford
Copy link
Collaborator

bcalford commented Nov 2, 2024

I went through and moved the example files that are not currently working due to some missing features into a prototypes directory within the examples directory. Additionally, I fixed the current lint issues in the project.

@christian-oreilly I think this new file to download all needed images is a good solution to the issue of downloading every image you need each time. That aspect of the PR is ready to merge, but the prototype examples are not currently functional. How should this be handled? Can it be merged with the notes (which are there) indicating that those notebook examples are not functional, also opening up an issue for these notebooks, or should we keep this PR as a draft until the necessary functionality is implemented for these examples?

@christian-oreilly
Copy link
Collaborator

I think this approach could work. Looking at the commit, I would suggest moving

# Import necessary libraries
import os

import requests

# GitHub API URL for the base folder
BASE_API_URL = "https://api.github.com/repos/niivue/niivue/contents"
FOLDER_PATH = "demos/images"
REF = "main"

# Folder where files will be saved
DEST_FOLDER = "images"

def fetch_and_download(api_url, dest_folder):
    """Fetch and download files recursively."""
    print(f"Fetching contents from {api_url}...")
    os.makedirs(dest_folder, exist_ok=True)
    response = requests.get(api_url)
    if response.status_code != 200:
        print(f"Failed to fetch {api_url}: {response.status_code}")
        return

    file_list = response.json()
    for item in file_list:
        item_type = item['type']
        download_url = item.get('download_url', '') if item_type == 'file' else ''
        name = item['name']
        path = item['path']

        if item_type == 'file':
            print(f"Downloading {name}...")
            file_response = requests.get(download_url)
            if file_response.status_code == 200:
                with open(os.path.join(dest_folder, name), 'wb') as f:
                    f.write(file_response.content)
            else:
                print(f"Failed to download {name}: {file_response.status_code}")
        elif item_type == 'dir':
            print(f"Entering directory {name}...")
            subfolder = os.path.join(dest_folder, name)
            sub_api_url = f"{BASE_API_URL}/{path}?ref={REF}"
            fetch_and_download(sub_api_url, subfolder)

to a io.py file and then import it in the 01 example notebook. This function will be required to run any other notebook in Colab, in sanboxes, or any similar environment. The examples notebook need to be self-contained (i.e., you should not need to run notebook 01 to be able to run 02), so having this function in a .py will allow easily reusing across notebooks.

@christian-oreilly
Copy link
Collaborator

Can we also discuss a bit data storage? Right now it defaults to a relative directory "demos/images". That will result in multiple versions of these "demo/images" subfolders depending on where the user is in their file system when they call this function. I would rather add it to a fixed place relative to the root of the ipyniivue repo. This can done by using something like this:

import ipyniivue
from pathlib import Path
FOLDER_PATH = Path(np.__file__).parent / "images"

@bcalford
Copy link
Collaborator

bcalford commented Nov 8, 2024

@christian-oreilly I've created a new py file in the src that handles the image downloads. Additionally, I've made some changes to the data storage and folder pathing. The "FOLDER_PATH" was actually used by the function as a way to access the niivue images from the NiiVue github. "DEST_FOLDER" is used to indicate where to save the files.

This code snippet you attached (reworked for ipniivue):

import ipyniivue
from pathlib import Path
FOLDER_PATH = Path(ipyniivue.__file__).parent / "images"

saves the images to the package source code in Python site-packages. This makes it hard to access them from the examples. Please let me know if I was using it wrong or if there is a different way to do this.

My other concern is with your statement that the notebooks should be self-contained. I believe the original idea that @kolibril13 had was to have one notebook that had to be run first that would download every image needed and thus, you would not need to download any images in any of the other notebooks. This, however, would mean they would not be self contained and this new notebook would be a dependency for every other notebook. Using the function in each file would only worsen the initial problem (of having to download the images each time you run the notebook) even more as you would download every single image for every example each time you ran any notebook. If the notebooks must be self-contained, this current approach will not work.

@christian-oreilly
Copy link
Collaborator

christian-oreilly commented Nov 9, 2024

This makes it hard to access them from the examples. Please let me know if I was using it wrong or if there is a different way to do this.

What about creating a function like this?

def get_dataset_path():
    return Path(ipyniivue.__file__).parent / "images"

Such a function could be added to the download_images.py module, but I would probably rename this module something less specific to downloading images... something like "dataset"?

Using the function in each file would only worsen the initial problem (of having to download the images each time you run the notebook) even more as you would download every single image for every example each time you ran any notebook. If the notebooks must be self-contained, this current approach will not work.

Normally, such functions check if the files needed already exist and if they exist it just returns without doing anything. Optionally, such a function can have an argument overwrite=False that allow users to explicitly ask for a redownload...

Note that DownloadImages is not PEP8-compliant for a python function name. It would need to be renamed download_images. DownloadImages would be a class.

the original idea that @kolibril13 had was to have one notebook that had to be run first that would download every image needed and thus, you would not need to download any images in any of the other notebooks.

I don't like this approach. "tests" data should not be downloadable through a notebook but through a function(s) because there are multiple other use cases that may require people to download such data (e.g., for their testing and playing around, for tests ran through pytests... which we will have to add at some point, etc.). Further, it means that these notebooks cannot run on platforms like Colab out-of-the-box. Finally, an example that crashes by default, unless you have done some other preparatory work is not very good... When you are looking into adopting a new software, if the first example you try crashes, it is not enticing to continue exploring the package. It may be self-evident for the developer that coded the examples that this other notebook needs to be run first... it is generally not for the naive user.

@christian-oreilly
Copy link
Collaborator

@bcalford Is there something blocking this PR? This one has been pending for a long time. It would be good if we can get it moving. Please let me know if there are still some aspects of it that require discussion.

@bcalford bcalford marked this pull request as ready for review December 2, 2024 18:25
@bcalford
Copy link
Collaborator

bcalford commented Dec 2, 2024

I've made a few small changes to the download_images function, now called download_dataset. These allow it to be implemented into each example instead of having a seperate file in the examples directory that downloaded the images.

@christian-oreilly With review, this one is ready to merge.

@bcalford bcalford added the enhancement New feature or request label Dec 3, 2024
Provides default values for download_dataset

Co-authored-by: Christian O'Reilly <christian.oreilly@gmail.com>
@bcalford
Copy link
Collaborator

bcalford commented Dec 5, 2024

This looks good. I'll go through and edit the corresponding code in each example to match this change.

@christian-oreilly
Copy link
Collaborator

@bcalford Before merging this PR, can you make sure it fixes #103. I'll add the command so that the merge automatically closes that issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Demo enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Example path for images needs updating
3 participants