Bug: does not write crates with arcp IDs correctly #175

Closed
rosanna-smith opened this issue Jan 31, 2024 · 8 comments

rosanna-smith commented Jan 31, 2024

If you have a crate with an arcp identifier on the root dataset, the write method creates an unwanted arcp directory in the export and changes the root dataset's @id to ./

Here's the code to reproduce this error:

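import os
import json
# ROCrate import added; it was missing from the script as first posted
from rocrate.rocrate import ROCrate
from rocrate.model.person import Person

# Input data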
input_data = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "arcp://name,farms-to-freeways-example-dataset",
            "@type": "Dataset",
            "datePublished": "2024-01-31T04:46:07+00:00"
            
        },
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "about": {
                "@id": "arcp://name,farms-to-freeways-example-dataset"
            },
            "conformsTo": {
                "@id": "https://w3id.org/ro/crate/1.1"
            }
        },
        
        {
            "@id": "https://orcid.org/0000-0000-0000-0000",
            "@type": "Person",
            "affiliation": "University of Flatland",
            "name": "Alice Doe"
        },
        {
            "@id": "https://orcid.org/0000-0000-0000-0001",
            "@type": "Person",
            "affiliation": "University of Flatland",
            "name": "Bob Doe"
        }
    ]
}

os.mkdir("input_crate")

with open('input_crate/ro-crate-metadata.json', 'w') as f:
    json.dump(input_data, f)

crate = ROCrate("input_crate")

alice_id = "https://orcid.org/0000-0000-0000-0000"
bob_id = "https://orcid.org/0000-0000-0000-0001"
alice = crate.add(Person(crate, alice_id, properties={
    "name": "Alice Doe",
    "affiliation": "University of Flatland"
}))
bob = crate.add(Person(crate, bob_id, properties={
    "name": "Bob Doe",
    "affiliation": "University of Flatland"
}))

crate.write("exp_crate")

It produces this error:

Traceback (most recent call last):
  File "/Users/rosannasmith/Documents/LDaCA/Repos/oni-downloader/test.py", line 60, in <module>
    crate.write("exp_crate")
  File "/Users/rosannasmith/Documents/LDaCA/Repos/oni-downloader/venv/lib/python3.12/site-packages/rocrate/rocrate.py", line 452, in write
    writable_entity.write(base_path)
  File "/Users/rosannasmith/Documents/LDaCA/Repos/oni-downloader/venv/lib/python3.12/site-packages/rocrate/model/dataset.py", line 57, in write
    raise FileNotFoundError(
FileNotFoundError: [Errno 2] No such file or directory: 'arcp://name,farms-to-freeways-example-dataset'

simleo commented Mar 5, 2024

I could not reproduce the error. I ran the above code, after adding the missing from rocrate.rocrate import ROCrate, and it ran with no errors. No directory was created in exp_crate and no @id was changed to ./.
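
For anyone who wants to repeat that check, here is a minimal sketch using only the standard library. The helper names (`root_dataset_id`, `root_id_preserved`, `stray_dirs`) are hypothetical and not part of ro-crate-py; the sketch assumes both directories contain a `ro-crate-metadata.json` laid out like the one in the report.

```python
import json
from pathlib import Path

METADATA_FILE = "ro-crate-metadata.json"


def root_dataset_id(metadata: dict) -> str:
    """Return the @id that the metadata descriptor points to via 'about'."""
    for entity in metadata["@graph"]:
        if entity.get("@id") == METADATA_FILE:
            return entity["about"]["@id"]
    raise ValueError(f"no {METADATA_FILE} descriptor in @graph")


def root_id_preserved(input_dir, output_dir) -> bool:
    """True if the root dataset @id is the same before and after writing."""
    ids = []
    for directory in (input_dir, output_dir):
        with open(Path(directory) / METADATA_FILE) as f:
            ids.append(root_dataset_id(json.load(f)))
    return ids[0] == ids[1]


def stray_dirs(output_dir) -> list:
    """List the subdirectories of the written crate (expected: none here)."""
    return sorted(p.name for p in Path(output_dir).iterdir() if p.is_dir())
```

With the script from the report, `root_id_preserved("input_crate", "exp_crate")` should be true and `stray_dirs("exp_crate")` should be empty when the crate is written correctly.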

This is with ro-crate-py from the current master branch. I was able to reproduce the error with ro-crate-py 0.9.0, so this problem must have been fixed as a side effect of something that got merged after 0.9.0.

@jmfernandez

I guess the issue might be related to some default of the urllib.parse library that depends on the Linux distribution or the Python installer, because I have been able to reproduce the issue found by @rosanna-smith with Python versions from 3.7 to 3.11 (btw, I'm using Gentoo Linux). I could not reproduce the issue with Python 3.12 because a different issue related to pkg_resources arose.

I experienced something similar in an unrelated development when I was testing several interactions between JSON-LD processing libraries, relative URI resolution and the scheme used for the permanent identifiers.
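
As an illustration of the kind of interaction described above, here is a small standard-library sketch of how `urllib.parse` treats the arcp ID from the report. It is not confirmed anywhere in this thread as the cause of the bug; `has_scheme` is a hypothetical helper, not ro-crate-py code.

```python
from urllib.parse import urljoin, urlsplit

ARCP_ID = "arcp://name,farms-to-freeways-example-dataset"

# urlsplit accepts the unknown scheme: the part after "//" lands in netloc
# and the path is empty.
parts = urlsplit(ARCP_ID)
print(parts.scheme)   # arcp
print(parts.netloc)   # name,farms-to-freeways-example-dataset
print(repr(parts.path))  # ''


def has_scheme(identifier: str) -> bool:
    """Hypothetical check: does the identifier look like an absolute URI?"""
    return bool(urlsplit(identifier).scheme)


print(has_scheme(ARCP_ID))  # True
print(has_scheme("./"))     # False

# urljoin only resolves relative references for schemes listed in
# urllib.parse.uses_relative. "arcp" is not in that list, so the relative
# reference comes back unchanged instead of being resolved against the base.
print(urljoin(ARCP_ID + "/", "data.csv"))                 # data.csv
print(urljoin("https://example.org/crate/", "data.csv"))  # https://example.org/crate/data.csv
```

Code that relies on `urljoin` to resolve entity IDs against an arcp base would therefore silently get back the relative path, which is one way a URI can end up treated as a local file name.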

@jmfernandez

All the tests were done in freshly created Python venvs: first updating pip and wheel, then installing the rocrate package, and finally running the script (with from rocrate.rocrate import ROCrate added near its beginning).

stain commented Mar 11, 2024

We should be supporting ARCP URIs as in https://www.researchobject.org/ro-crate/1.1/appendix/relative-uris.html#establishing-a-base-uri-inside-a-zip-file and currently claim Python 3.7 is supported.

@elichad will investigate

elichad commented Mar 12, 2024

This seems to be the same issue as #167, just with a slightly different manifestation. That issue was fixed in PR #168 but the fix hasn't been released yet - @stain @simleo is there anything blocking us from making a release?

simleo commented Mar 15, 2024

I think we can make a release after merging #173.

simleo commented Mar 18, 2024

@rosanna-smith can you check that the problem is solved in ro-crate 0.10.0?
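
A quick way to see whether an environment already has that release is sketched below. It assumes the distribution is installed under the name `rocrate` (as mentioned earlier in the thread) and that the fix first shipped in 0.10.0; `release_tuple` and `includes_fix` are hypothetical helpers. Versions are compared as integer tuples because a plain string comparison would rank "0.9.0" above "0.10.0".

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(ver: str) -> tuple:
    """Turn '0.10.0' into (0, 10, 0), ignoring any non-numeric suffix."""
    numbers = []
    for piece in ver.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    return tuple(numbers)


def includes_fix(installed: str, fixed_in: str = "0.10.0") -> bool:
    """True if the installed version is at least the assumed fix release."""
    return release_tuple(installed) >= release_tuple(fixed_in)


if __name__ == "__main__":
    try:
        installed = version("rocrate")  # assumed distribution name
    except PackageNotFoundError:
        print("rocrate is not installed in this environment")
    else:
        status = "includes" if includes_fix(installed) else "predates"
        print(f"rocrate {installed} {status} the 0.10.0 release")
```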

@rosanna-smith

Thanks! Can confirm this solved the problem on my end as well.
