PEP 639: Implement License-Expression and License-File #828

Merged
merged 36 commits into from
Oct 24, 2024
Changes from 26 commits
Commits
de8b239
PEP 639: Implement License-Expression and License-File
ewdurbin Sep 3, 2024
44beda7
wire in vendoring library to manage vendored dependencies
ewdurbin Sep 3, 2024
2b1a9f7
vendor license_expression and boolean.py
ewdurbin Sep 3, 2024
9bfa714
harmonize with documentation
ewdurbin Sep 3, 2024
1aea212
exclude vendored spdx data from sdist/whl. build/bring our own
ewdurbin Sep 4, 2024
cec33f1
migrate to parser based on hatchling's
ewdurbin Sep 4, 2024
cfb3af1
string -> re
ewdurbin Sep 4, 2024
d6b47d5
License-File: disallow unresolved globs and non-relative paths
ewdurbin Sep 5, 2024
a55f422
Merge branch 'main' into pep_639
brettcannon Sep 11, 2024
afa5d4c
Apply suggestions from code review
ewdurbin Sep 13, 2024
396e4ef
update typing for licenses.spdx
ewdurbin Sep 13, 2024
21a2821
Extend typing improvements in licenses.spdx to include Exception
ewdurbin Sep 13, 2024
4ac18f0
fixup names, Exception is not a good one.
ewdurbin Sep 13, 2024
46a7491
better enforcement of license-file paths
ewdurbin Sep 13, 2024
cd7105f
subclass ValueError for invalid license expressions
ewdurbin Sep 13, 2024
e469b7e
and empty license expression is invalid
ewdurbin Sep 13, 2024
e699391
create a "NormalizedLicenseExpression" type
ewdurbin Sep 13, 2024
22fa9cd
rename normalize -> canonicalize
ewdurbin Sep 13, 2024
f952ab9
add tests to ensure license and exception ids conform
ewdurbin Sep 13, 2024
30e34f1
update name of var
ewdurbin Sep 13, 2024
8906b16
match formatting standards
ewdurbin Sep 13, 2024
a361294
reorganize the licenses module a bit
ewdurbin Sep 15, 2024
9cee38e
apply code-review suggestions for update_licenses task
ewdurbin Sep 15, 2024
81efbda
fix tests after spdx module was made private
ewdurbin Sep 15, 2024
701217b
add docs
ewdurbin Sep 15, 2024
4539543
Merge branch 'main' into pep_639
brettcannon Sep 16, 2024
64d3647
add additional test cases and handle LicenseRef- identifiers not alre…
ewdurbin Sep 19, 2024
6e6b304
Apply suggestions from code review
ewdurbin Oct 3, 2024
a65ca89
nit
ewdurbin Oct 3, 2024
1cac177
fixups from code-review suggestions
ewdurbin Oct 3, 2024
42d6452
pass globals/locals to eval
ewdurbin Oct 3, 2024
461d183
Merge branch 'main' into pep_639
brettcannon Oct 7, 2024
9546938
fix: unncessary string concatenation from reformatting
ewdurbin Oct 8, 2024
da3f04b
licenses: add some testcases for whitespace normalization
ewdurbin Oct 8, 2024
23df2ac
lint
ewdurbin Oct 8, 2024
042a335
Merge branch 'main' into pep_639
ewdurbin Oct 24, 2024
1 change: 1 addition & 0 deletions docs/index.rst
@@ -26,6 +26,7 @@ The ``packaging`` library uses calendar-based versioning (``YY.N``).
version
specifiers
markers
licenses
requirements
metadata
tags
53 changes: 53 additions & 0 deletions docs/licenses.rst
@@ -0,0 +1,53 @@
Licenses
=========

.. currentmodule:: packaging.licenses


Helper for canonicalizing SPDX
`License-Expression metadata <https://peps.python.org/pep-0639/#term-license-expression>`__
as `defined in PEP 639 <https://peps.python.org/pep-0639/#spdx>`__.


Reference
---------

.. class:: NormalizedLicenseExpression

    A :class:`typing.NewType` of :class:`str`, representing a normalized
    License-Expression.


.. exception:: InvalidLicenseExpression

    Raised when a License-Expression is invalid.


.. function:: canonicalize_license_expression(raw_license_expression)

    This function takes a License-Expression string and returns its
    canonical form.

    The return type is typed as :class:`NormalizedLicenseExpression`. This allows type
    checkers to help require that a string has passed through this function
    before use.

    :param str raw_license_expression: The License-Expression to canonicalize.
    :raises InvalidLicenseExpression: If the License-Expression is invalid due to an
        invalid/unknown license identifier or invalid syntax.

    .. doctest::

        >>> from packaging.licenses import canonicalize_license_expression
        >>> canonicalize_license_expression("mit")
        'MIT'
        >>> canonicalize_license_expression("mit and (apache-2.0 or bsd-2-clause)")
        'MIT AND (Apache-2.0 OR BSD-2-Clause)'
        >>> canonicalize_license_expression("(mit")
        Traceback (most recent call last):
            ...
        InvalidLicenseExpression: Invalid license expression: '(mit'
        >>> canonicalize_license_expression("Use-it-after-midnight")
        Traceback (most recent call last):
            ...
        InvalidLicenseExpression: Unknown license: 'use-it-after-midnight'
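
The documentation above says the return type lets type checkers require that a string has passed through the canonicalization function. A minimal sketch of that `typing.NewType` pattern follows. The names `NormalizedExpr`, `canonicalize`, and `write_metadata` are hypothetical stand-ins for illustration, and the toy canonicalizer only uppercases operators; it is not the algorithm from this PR.

```python
from typing import NewType

# Hypothetical stand-in for packaging.licenses.NormalizedLicenseExpression.
NormalizedExpr = NewType("NormalizedExpr", str)


def canonicalize(raw: str) -> NormalizedExpr:
    """Toy canonicalizer (assumption): uppercases boolean operators only."""
    words = [w.upper() if w.lower() in {"and", "or", "with"} else w for w in raw.split()]
    return NormalizedExpr(" ".join(words))


def write_metadata(expression: NormalizedExpr) -> str:
    """Hypothetical consumer that only accepts canonicalized expressions."""
    return f"License-Expression: {expression}"


line = write_metadata(canonicalize("MIT or Apache-2.0"))

# A static type checker such as mypy would be expected to flag the call below,
# because a plain str is not a NormalizedExpr. At runtime it would still work,
# since NewType adds no runtime checks.
# write_metadata("MIT or Apache-2.0")
```

At runtime a `NewType` value is an ordinary `str`, so the distinction costs nothing and exists only for static analysis.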
6 changes: 6 additions & 0 deletions noxfile.py
@@ -181,6 +181,12 @@ def release(session):
    webbrowser.open("https://github.com/pypa/packaging/releases")


@nox.session
def update_licenses(session: nox.Session) -> None:
    session.install("httpx")
    session.run("python", "tasks/licenses.py")


# -----------------------------------------------------------------------------
# Helpers
# -----------------------------------------------------------------------------
4 changes: 3 additions & 1 deletion pyproject.toml
@@ -57,9 +57,11 @@ warn_unused_ignores = true
module = ["_manylinux"]
ignore_missing_imports = true


[tool.ruff]
src = ["src"]
extend-exclude = [
    "src/packaging/licenses/_spdx.py"
]

[tool.ruff.lint]
extend-select = [
142 changes: 142 additions & 0 deletions src/packaging/licenses/__init__.py
@@ -0,0 +1,142 @@
#######################################################################################
#
# Adapted from:
# https://github.com/pypa/hatch/blob/5352e44/backend/src/hatchling/licenses/parse.py
#
# MIT License
#
# Copyright (c) 2017-present Ofek Lev <oss@ofek.dev>
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this
# software and associated documentation files (the "Software"), to deal in the Software
# without restriction, including without limitation the rights to use, copy, modify,
# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to the following
# conditions:
#
# The above copyright notice and this permission notice shall be included in all copies
# or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
# CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
# OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#
#
# With additional allowance of arbitrary `LicenseRef-` identifiers, not just
# `LicenseRef-Public-Domain` and `LicenseRef-Proprietary`.
#
#######################################################################################
from __future__ import annotations

import re
from typing import NewType

from packaging.licenses._spdx import EXCEPTIONS, LICENSES

__all__ = [
    "NormalizedLicenseExpression",
    "InvalidLicenseExpression",
    "canonicalize_license_expression",
]

license_ref_allowed = re.compile("^[A-Za-z0-9.-]*$")

NormalizedLicenseExpression = NewType("NormalizedLicenseExpression", str)


class InvalidLicenseExpression(ValueError):
    """Raised when a license-expression string is invalid.

    >>> canonicalize_license_expression("invalid")
    Traceback (most recent call last):
        ...
    packaging.licenses.InvalidLicenseExpression: Unknown license: 'invalid'
    """


def canonicalize_license_expression(
    raw_license_expression: str,
) -> str | NormalizedLicenseExpression:
    if raw_license_expression == "":
        message = f"Invalid license expression: {raw_license_expression!r}"
        raise InvalidLicenseExpression(message)

    license_refs = {
        ref.lower(): "LicenseRef-" + ref[11:]
        for ref in raw_license_expression.split()
        if ref.lower().startswith("licenseref-")
    }

    # First, normalize to lower case so we can look up licenses/exceptions
    # and so boolean operators are Python-compatible.
    license_expression = raw_license_expression.lower()

    # Pad any parentheses so tokenization can be achieved by merely splitting on
    # white space.
    license_expression = license_expression.replace("(", " ( ").replace(")", " ) ")

    tokens = license_expression.split()

    # Rather than implementing boolean logic, we create an expression that Python can
    # parse. Everything that is not involved with the grammar itself is treated as
    # `False` and the expression should evaluate as such.
    python_tokens = []
    for token in tokens:
        if token not in {"or", "and", "with", "(", ")"}:
            python_tokens.append("False")
        elif token == "with":
            python_tokens.append("or")
        elif token == "(" and python_tokens and python_tokens[-1] not in {"or", "and"}:
            message = f"Invalid license expression: {raw_license_expression!r}"
            raise InvalidLicenseExpression(message)
        else:
            python_tokens.append(token)

    python_expression = " ".join(python_tokens)
    try:
        result = eval(python_expression)
    except Exception:
        result = True

    if result is not False:
        message = f"Invalid license expression: {raw_license_expression!r}"
        raise InvalidLicenseExpression(message) from None

    # Take a final pass to check for unknown licenses/exceptions.
    normalized_tokens = []
    for token in tokens:
        if token in {"or", "and", "with", "(", ")"}:
            normalized_tokens.append(token.upper())
            continue

        if normalized_tokens and normalized_tokens[-1] == "WITH":
            if token not in EXCEPTIONS:
                message = f"Unknown license exception: {token!r}"
                raise InvalidLicenseExpression(message)

            normalized_tokens.append(EXCEPTIONS[token]["id"])
        else:
            if token.endswith("+"):
                final_token = token[:-1]
                suffix = "+"
            else:
                final_token = token
                suffix = ""

            if final_token.startswith("licenseref-"):
                if not license_ref_allowed.match(final_token):
                    message = f"Invalid licenseref: {final_token!r}"
                    raise InvalidLicenseExpression(message)
                normalized_tokens.append(license_refs[final_token] + suffix)
            else:
                if final_token not in LICENSES:
                    message = f"Unknown license: {final_token!r}"
                    raise InvalidLicenseExpression(message)
                normalized_tokens.append(LICENSES[final_token]["id"] + suffix)

    normalized_expression = " ".join(normalized_tokens)

    return normalized_expression.replace("( ", "(").replace(" )", ")")
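
The grammar check in the function above can be isolated as a small standalone sketch: every non-operator token becomes `False`, `with` becomes `or`, and the resulting string is handed to Python's own parser. The helper name `is_well_formed` is hypothetical. The sketch makes two assumptions that differ from the diff: it passes empty globals and locals to `eval`, and it also allows a `(` directly after another `(` so that nested parentheses are accepted. It checks structure only and does not look up license identifiers.

```python
from __future__ import annotations

OPERATORS = {"or", "and", "with", "(", ")"}


def is_well_formed(expression: str) -> bool:
    """Hypothetical helper: check only the boolean structure of an expression."""
    if not expression.strip():
        return False

    # Lowercase and pad parentheses so splitting on whitespace tokenizes the input.
    padded = expression.lower().replace("(", " ( ").replace(")", " ) ")

    python_tokens: list[str] = []
    for token in padded.split():
        if token not in OPERATORS:
            # Any identifier stands in as the constant False.
            python_tokens.append("False")
        elif token == "with":
            # WITH is a binary operator, so "or" models it for parsing purposes.
            python_tokens.append("or")
        elif token == "(" and python_tokens and python_tokens[-1] not in {"or", "and", "("}:
            # "mit (apache-2.0)" would otherwise parse as a function call.
            # Assumption: "(" after "(" is allowed here, unlike the diff above.
            return False
        else:
            python_tokens.append(token)

    try:
        # The string contains only False, or, and, and parentheses at this point.
        result = eval(" ".join(python_tokens), {}, {})
    except Exception:
        return False

    # A well-formed expression built from False operands evaluates to exactly False.
    return result is False
```

Because every operand is `False`, any syntactically valid combination of `and` and `or` evaluates to `False`; an empty pair of parentheses evaluates to an empty tuple and a dangling operator raises `SyntaxError`, so both are rejected.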