Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace pillow with filetype #1867

Closed
wants to merge 4 commits into from

Conversation

christianarty
Copy link

When this PR was merged, #1680, we replaced imghdr with Pillow.

Albeit Pillow is a great library, it has many conflicts with different platform architectures, and too much overhead for the use we have for it (We just use it to determine a MIME type for an image)

Looking at PEP 594, filetype is a reccommended replacement for imghdr and it is cross platform compatible.

This PR is replacing Pillow with filetype

@christianarty christianarty marked this pull request as ready for review June 11, 2024 13:52
Copy link
Member

@ssbarnea ssbarnea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I am not sure if filetype is much better and here is why:

Apparently pillow had releases https://pypi.org/project/pillow/#files

Still, a pure python library would have lots of advantages. I would support migration to a pure python alternative.

@christianarty
Copy link
Author

To be honest, I am not sure if filetype is much better and here is why:

Apparently pillow had releases pypi.org/project/pillow/#files

Still, a pure python library would have lots of advantages. I would support migration to a pure python alternative.

Hi @ssbarnea,

Thank you for reviewing my PR and providing feedback. I appreciate your feedback and remarks, but I believe they should not deter us from considering filetype over Pillow.

As noted in #1801, this issue is not only for me, but for other consumers as well. For our pycontrib/jira's use case, Pillow is unnecessarily large for simply detecting the mimetype based on buffer bytes. Its complex dependencies, especially on MacOS, often require additional tinkering, which can be cumbersome for the consumers. Generally they have to go and deal with linking binaries across their system so that the C library underneath the hood in Pillow can properly execute.

Regarding the points raised:

hosted under personal github account instead of using an umbrella organization, h2non/filetype.py

The hosting under a personal GitHub account shouldn't be a deterrent. Popular packages like fastapi and six are similarly hosted and widely used.

not much recent activity h2non/filetype.py/graphs/contributors

filetype is a focused utility, unlike Pillow, which has broader image processing functionalities. Its lack of frequent updates doesn't imply instability, but rather can indicate it is stable for its specific purpose.

CI currently broken and had only one PR merged in 2024: h2non/filetype.py/pulls (sort:updated-desc is:closed)

The CI breakage seems recent due to an update in their test infrastructure. This appears to be a temporary issue and doesn't reflect the library's overall stability.

not using pep-517 for packaging

While PEP 517 provides a way toward simpler packaging, the absence of it in filetype shouldn't be a major concern. Ideally, it should migrate to use PEP-517 soon, but this isn't a blocker.

Additionally, This situation is similar to an existing dependency pycontribs/jira uses: defusedxml. It is hosted under a personal account, had its last release in 2021, has low activity, and also doesn't use PEP 517 for packaging. Despite this, defusedxml serves its one purpose well within our project.

@ssbarnea
Copy link
Member

I added my comments, a slight -1 but I will not block them this change. Maybe that is a good opportunity to review library dependencies and plan to remove or replace some of them (other ticket).

I want to make jira library slim and pure python too!

We can also raise our concerns upstream and see how current maintainers are willing to address them (could help us make a more informed decision).

@christianarty
Copy link
Author

I added my comments, a slight -1 but I will not block them this change. Maybe that is a good opportunity to review library dependencies and plan to remove or replace some of them (other ticket).

I want to make jira library slim and pure python too!

We can also raise our concerns upstream and see how current maintainers are willing to address them (could help us make a more informed decision).

Yes, that would be ideal! This package should be slim and ideally pure python. I don't think this needs anything computationally expensive (at least right now).

So, @ssbarnea , what would be the necessary action items to get this PR through the door and merged? We can raise some issues upstream and coordinate with their maintainers, but ideally want this PR merged sooner rather than later, since it affects my workflow and a couple others.

@ob
Copy link

ob commented Jun 21, 2024

+1 to removing dependency on Pillow... I've had to wrap this library and do the following:

import sys
class Image:
    """Dummy PIL.Image module."""

    @staticmethod
    def open(file, mode="r"):
        """
        Dummy implementation of PIL.Image.open() that always raises a
        NotImplementedError due to a PIL import problem on macOS
        """
        raise NotImplementedError("This is a no-op implementation of PIL.Image.open()")


# On macOS, patch PIL.Image with the dummy module
if sys.platform == "darwin":
    sys.modules["PIL.Image"] = sys.modules[__name__]

which is gross and I'l like to avoid.

@dimitarOnGithub
Copy link

Is removing the auto-detection functionality entirely an option here?

I'm aware that probably a lot of projects out there rely on it to automatically populate the content-type header, so maybe a version or two deprecation warning in place would be the first move, but considering that both create_temp_project_avatar and create_temp_project_avatar offer the user to provide a str value for contentType in their signature, it would be a matter of shifting the responsibility on the end user and their preferred library (that is if they need it in the first place).

The benefit would be reducing the dependencies required by the package and allowing the users to pick their own flavour of a library, the downside is that everyone who wants to upgrade will have to handle this on their own.

@ob
Copy link

ob commented Jun 27, 2024

Removing the functionality would work and simplify the dependencies. As for a transition how requiring content-type (or MIME type or whatever) to be provided and give people a snippet of code that is equivalent to the only usage in the code. That way you can use whatever you want for guessing image types when you don't know...

@cilki
Copy link

cilki commented Jul 10, 2024

Previous attempt, btw.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants