Fix broken links to .md files #938

Open
MarkLodato opened this issue Aug 3, 2023 · 8 comments
Labels: good first issue, shovel-ready, website

@MarkLodato
Member

There are quite a few broken links on our site, most of which point to non-existent .md files. The problem is that we have been using the default jekyll-relative-links plugin, which automatically converts a link to file.md into a link to file if and only if file.md exists. If the author uses an absolute path rather than a relative one (e.g. /spec/v1.0/levels.md rather than spec/v1.0/levels.md), or if the file gets moved and the link does not get updated, then Jekyll will silently link to the non-existent .md file.

I suggest a two-part fix:

  1. Stop using jekyll-relative-links because it leads to too much confusion. Fix up all existing cases and disable the plugin.
  2. Set up netlify-plugin-checklinks or similar to check links for us, so that we don't have so many broken links going forward.
MarkLodato added the website and shovel-ready labels on Aug 3, 2023
@MarkLodato
Member Author

A quick way to find broken .md links is to search through the built HTML files.

bundle exec jekyll build
git grep --no-index 'href="[^h][^"]*\.md' _site
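
As a rough sketch, the same check could be wrapped in a function that fails when any match is found, which would make it usable as a CI step. The function name `check_md_hrefs` and the `_site` default are assumptions for illustration; it uses `find` and plain `grep` rather than `git grep` so that it does not depend on git. Like the pattern above, it skips any href that starts with `h`, so a relative link such as `how-to.md` would be missed.

```shell
# Sketch only: check_md_hrefs is a hypothetical helper, not part of the site's tooling.
# Prints every href in the built HTML that still contains ".md" and returns 1 if any exist.
check_md_hrefs() {
  site_dir=${1:-_site}
  # /dev/null is passed as an extra file so grep always prints file names.
  matches=$(find "$site_dir" -name '*.html' \
    -exec grep -n 'href="[^h][^"]*\.md' /dev/null {} +) || true
  if [ -n "$matches" ]; then
    printf '%s\n' "$matches"
    return 1
  fi
  return 0
}
```

Usage would be `bundle exec jekyll build && check_md_hrefs _site`.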

joshuagl added a commit to joshuagl/slsa that referenced this issue Aug 17, 2023
Fixes: slsa-framework#938

Signed-off-by: Joshua Lock <joshuagloe@gmail.com>
@joshuagl
Member

I've sent a PR to fix .md links in #946 which addresses the first part of the suggested fix.

Initial experimentation with netlify-plugin-checklinks has been error-prone, but I will dig into it more as time allows.

@MarkLodato
Member Author

I was wrong about using an absolute link to a .md file - it actually works. Example: https://slsa.dev/how-to-orgs

Source:

For all [SLSA levels](/spec/v1.0/levels.md), you follow the same steps:

Result:

<p>For all <a href="/spec/v1.0/levels">SLSA levels</a>, you follow the same steps:</p>

That said, there are still many remaining broken links (posted on #946), and changing the source to use the URL instead of the .md file seems like a good idea regardless.

@MarkLodato
Member Author

MarkLodato commented Aug 17, 2023

> I've sent a PR to fix .md links in #946 which addresses the first part of the suggested fix.

Note that that PR only changes a few links to .md files, whereas we have many:

$ git grep -e '^\[.*\]: [^:#]*\.md$' --or -e '\]([^:#)]*\.md[#)]' docs | wc -l
172
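
To split up the cleanup, a per-file breakdown may be more useful than a single total. The sketch below reuses the same two patterns (reference-style definitions ending in `.md`, and inline links ending in `.md)` or `.md#`) with plain `grep`; the function name `md_links_per_file` is made up for illustration. As with `wc -l` above, it counts matching lines rather than individual links.

```shell
# Sketch only: md_links_per_file is a hypothetical helper.
# Lists Markdown files by number of lines containing a .md link target, highest first.
md_links_per_file() {
  docs_dir=${1:-docs}
  # grep -c prints "file:count" for every file; /dev/null forces that format
  # even when only one file is found. Files with a count of 0 are dropped.
  find "$docs_dir" -name '*.md' -exec grep -c \
      -e '^\[.*\]: [^:#]*\.md$' \
      -e '\]([^:#)]*\.md[#)]' /dev/null {} + |
    grep -v ':0$' |
    sort -t: -k2,2nr
}
```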

Also we should disable the plugin once we remove all links, so that we don't start to accidentally rely on it in the future.

> Initial experimentation with netlify-plugin-checklinks has been error-prone, but I will dig into it more as time allows.

An alternative (non-netlify-specific) tool might also work, which we could run in GitHub Actions.
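
For a sense of what such a check involves, here is a minimal sketch that could run as a plain shell step after the build. It is not a substitute for a real link checker: it only looks at root-relative hrefs (those starting with a single `/`), ignores fragments and query strings, and does not decode URL escapes. The function name `check_internal_links` is hypothetical, and it assumes that `/page` is served from `page.html` or `page/index.html`.

```shell
# Sketch only: check_internal_links is a hypothetical helper.
# Reports root-relative hrefs in the built site that do not resolve to a file.
check_internal_links() {
  site_dir=${1:-_site}
  rc=0
  # Collect every href="/..." value, strip the attribute syntax plus any
  # fragment or query string, and de-duplicate.
  links=$(find "$site_dir" -name '*.html' \
    -exec grep -oh 'href="/[^/"][^"]*"' {} + |
    sed -e 's/^href="//' -e 's/"$//' -e 's/[#?].*$//' | sort -u) || true
  while IFS= read -r link; do
    [ -n "$link" ] || continue
    target=$site_dir$link
    # Assumption: /page may be served from page.html or page/index.html.
    if [ -e "$target" ] || [ -e "$target.html" ] || [ -e "${target%/}/index.html" ]; then
      continue
    fi
    echo "broken: $link"
    rc=1
  done <<EOF
$links
EOF
  return "$rc"
}
```

A non-zero return value would fail the CI step, and the `broken:` lines identify which targets to fix.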

@MarkLodato
Member Author

Looks like the links that are broken due to this specific issue are limited:

$ git grep --color --no-index 'href="[^h][^"]*\.md' _site
_site/blog.html:the valuable feedback we received on the <a href="2023-02-24-slsa-v1-rc.md">first release candidate</a>. This is
_site/blog.html:      <p>Interested in getting involved? Now’s the chance to <a href="2023-02-24-slsa-v1-rc.md">provide your feedback on the foundational v1 release of the SLSA framework.</a></p>
_site/example.html:<p>SLSA 4 <a href="requirements.md">requires</a> two-party source control and hermetic builds.
_site/how-to-orgs.html:<a href="/provenance/v1.md">https://slsa.dev/provenance/</a>.</p>
_site/spec/v1.0-rc2/onepage.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="#future-directions.md">future
_site/spec/v1.0-rc2/onepage.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="#future-directions.md">future
_site/spec/v1.0-rc2/threats.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="future-directions.md">future
_site/spec/v1.0-rc2/threats.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="future-directions.md">future
_site/spec/v1.0/onepage.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="#future-directions.md">future
_site/spec/v1.0/onepage.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="#future-directions.md">future
_site/spec/v1.0/threats.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="future-directions.md">future
_site/spec/v1.0/threats.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="future-directions.md">future

Specifically:

@joshuagl
Member

> * autogenerated pages (blog.html and onepage.html) don't work with the plugin, so links to .md files are not converted there

these are resolved by the changes in #946

% git grep --color --no-index 'href="[^h][^"]*\.md' _site
_site/how-to-orgs.html:<a href="/provenance/v1.md">https://slsa.dev/provenance/</a>.</p>

That how-to-orgs page doesn't appear to be resolved by #939, but it's unclear why the link to an .md file 7 lines earlier is converted?!

@MarkLodato
Member Author

> > * autogenerated pages (blog.html and onepage.html) don't work with the plugin, so links to .md files are not converted there
>
> these are resolved by the changes in #946

Ah, I'm sorry. I didn't understand that from the PR. Could you explain in the PR description?

> That how-to-orgs page doesn't appear to be resolved by #939, but it's unclear why the link to an .md file 7 lines earlier is converted?!

Yeah, it's a mystery why sometimes the conversion works and sometimes it doesn't!

@joshuagl
Member

I've updated #946 to only address the broken links in the generated site. For this issue I plan to

  • remove all uses of .md file extensions in relative links
  • remove jekyll-relative-links from our site's configuration
  • implement a broken link checker for our generated site, to be run in CI
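
The first item could be partly mechanized. The sketch below is a `sed` filter that drops the `.md` extension from inline links and reference-style definitions while leaving targets that contain a `:` (such as full URLs) untouched. The function name `strip_md_extension` is hypothetical, and the output would still need review: it assumes each page's URL is simply its path without `.md`, which may not hold for every file (for example an `index.md`).

```shell
# Sketch only: strip_md_extension is a hypothetical helper.
# Reads Markdown on stdin and writes it to stdout with ".md" removed from
# relative link targets.
strip_md_extension() {
  # First expression: inline links such as ](levels.md) or ](threats.md#intro).
  # Second expression: reference-style definitions such as "[ref]: levels.md".
  sed -e 's/\(\]([^:#)]*\)\.md\([#)]\)/\1\2/g' \
      -e 's/^\(\[.*\]: [^:#]*\)\.md$/\1/'
}
```

For example, `strip_md_extension < how-to-orgs.md` would print the rewritten page for inspection before anything is changed on disk.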

renovate-bot pushed a commit to renovate-bot/slsa that referenced this issue Aug 22, 2023
There are several places in the generated site where links are broken
because:

* the generated HTML still includes the .md file extension. It's unclear
  why jekyll-relative-links is broken _only_ in these instances.
* the link is to a moved file in the in-toto/attestation git repository
* the link was to a file in our project which has moved
* the link was always bad (i.e. a link to nist.gov with no proto)

This surgical change resolves broken links on the site until the wider
issues (disable jekyll-relative-links and broken link detection) can be
addressed in issue slsa-framework#938.

On broken jekyll-relative-links, specifically, before this patch:
```
% git grep --color --no-index 'href="[^h][^"]*\.md' _site
_site/blog.html:the valuable feedback we received on the <a href="2023-02-24-slsa-v1-rc.md">first release candidate</a>. This is
_site/blog.html:      <p>Interested in getting involved? Now’s the chance to <a href="2023-02-24-slsa-v1-rc.md">provide your feedback on the foundational v1 release of the SLSA framework.</a></p>
_site/example.html:<p>SLSA 4 <a href="requirements.md">requires</a> two-party source control and hermetic builds.
_site/how-to-orgs.html:<a href="/provenance/v1.md">https://slsa.dev/provenance/</a>.</p>
_site/spec/v1.0-rc2/onepage.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="#future-directions.md">future
_site/spec/v1.0-rc2/onepage.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="#future-directions.md">future
_site/spec/v1.0-rc2/threats.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="future-directions.md">future
_site/spec/v1.0-rc2/threats.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="future-directions.md">future
_site/spec/v1.0/onepage.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="#future-directions.md">future
_site/spec/v1.0/onepage.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="#future-directions.md">future
_site/spec/v1.0/threats.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="future-directions.md">future
_site/spec/v1.0/threats.html:<p>SLSA v1.0 does not address this threat, but it may be addressed in a <a href="future-directions.md">future
```

- the two blog.html instances are resolved by editing the individual blog
  posts to remove the .md file extension
- the example.html instance will be resolved by slsa-framework#945
- how-to-orgs.html is fixed by removing the .md extension on line 20 of
  how-to-orgs.md
- the v1.0-rc2 threats and onepage are fixed by removing the .md extension
  in lines 40 and 49 of spec/v1.0-rc2/threats.md
- the v1.0 threats and onepage are fixed by removing the .md extension in
  lines 40 and 49 of spec/v1.0/threats.md

Following these changes:

```
% git grep --color --no-index 'href="[^h][^"]*\.md' _site
_site/example.html:<p>SLSA 4 <a href="requirements.md">requires</a> two-party source control and hermetic builds.
```

which is resolved by the removal of example.md in PR slsa-framework#945

Signed-off-by: Joshua Lock <joshuagloe@gmail.com>
marcelamelara added the good first issue label on Oct 11, 2024
Projects
Status: 🏗 In progress