Error in unicode entry scripts #252

julienmalard · 2024-12-08T09:21:05Z

In utils.py, the _ENTRYPOINT_REGEX regex does not recognise valid Unicode Python identifiers (https://peps.python.org/pep-3131/), which prevents non-ASCII file paths from being used as entry points. Using the regex module as a drop-in replacement for re would solve this issue; please let me know if you would like me to submit a pull request.

eli-schwartz · 2024-12-08T14:26:26Z

> Using the regex module as a drop-in replacement for re would solve this issue

This would be quite problematic, since a major feature of this package is that it has no dependencies and can run purely on the stdlib, making it suitable as the very bottom package in an ecosystem bootstrap.

It would also prevent being vendored into pip, though that is less about having dependencies and more about having dependencies that have a binary component.

uranusjr · 2024-12-09T09:44:54Z

Do we even need a regex engine though? The spec basically just requires splitting on the first equal sign, and checking some leading and trailing special characters. It can be hand-rolled.
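As a rough illustration of the hand-rolled approach described above, here is a minimal sketch that relies only on str.partition and str.strip. The function name parse_entry_point_line and the returned tuple layout are hypothetical, not part of installer; it assumes lines of the form "name = module:attr [extras]" and does not validate the identifiers themselves.

```python
def parse_entry_point_line(line):
    """Split 'name = module:attr [extra1, extra2]' without a regex.

    Illustrative only: the name and return shape are assumptions,
    not installer's actual API.
    """
    # Split on the first equal sign only.
    name, sep, value = line.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in entry point: {line!r}")
    name = name.strip()
    value = value.strip()

    # Optional trailing "[extra1, extra2]" block.
    extras = []
    if value.endswith("]"):
        value, bracket, extras_text = value[:-1].partition("[")
        if not bracket:
            raise ValueError(f"unbalanced ']' in entry point: {line!r}")
        extras = [e.strip() for e in extras_text.split(",") if e.strip()]
        value = value.strip()

    # "module:attr" where the ":attr" part is optional.
    module, _, attrs = value.partition(":")
    return name, module.strip(), attrs.strip(), extras
```

Because this uses only plain string operations, non-ASCII names pass through unchanged; checking that each dotted component is a valid identifier would be a separate step.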

julienmalard · 2024-12-09T16:13:21Z

Interesting...would something using the following functions to check for valid identifiers be useful?
If so, I could make a pull request.

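import unicodedata


def isalnum(x):
    # Strings that are not a single character are checked character by character.
    if len(x) != 1:
        return all(isalnum(y) for y in x)
    # Letters, letter numbers, marks, decimal digits and connector punctuation.
    return unicodedata.category(x) in ['Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Nl', 'Mn', 'Mc', 'Nd', 'Pc']


def isalpha(x):
    if len(x) != 1:
        return all(isalpha(y) for y in x)
    # Same categories as isalnum, minus 'Nl' and 'Nd'.
    return unicodedata.category(x) in ['Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Mn', 'Mc', 'Pc']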

eli-schwartz · 2024-12-09T16:17:07Z

Valid identifiers can be determined via x.isidentifier(), presumably.
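A minimal sketch of that idea, assuming the value to check is a dotted path such as a module or attribute path. The helper name is_dotted_identifier is hypothetical; str.isidentifier() is the standard-library method and follows the PEP 3131 identifier rules.

```python
def is_dotted_identifier(path):
    """Return True if every dot-separated part is a valid Python identifier.

    Hypothetical helper for illustration; not installer's actual code.
    """
    # "".isidentifier() is False, so empty paths and paths with
    # empty components (e.g. "pkg..mod") are rejected as well.
    return all(part.isidentifier() for part in path.split("."))
```

Note that str.isidentifier() also returns True for reserved words such as "class"; keyword.iskeyword() could be combined with it if those should be rejected.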

julienmalard · 2024-12-14T04:08:38Z

Reproduced here (pull request with failing tests): https://github.com/pypa/installer/actions/runs/12326732550?pr=254

julienmalard · 2024-12-14T04:28:37Z

...and now fixed with PR: https://github.com/pypa/installer/actions/runs/12326868467/job/34408216136?pr=254

julienmalard linked a pull request Dec 14, 2024 that will close this issue

Fix unicode entrypoint issue #254

Open
