Description
I'm finding that "grouped" PyPI repositories are ignoring case (and possibly special characters "_" and "-").
This is a follow-on to https://issues.sonatype.org/browse/NEXUS-12075.
To give a simple example: repository "ex1" has package "prj_a" and repository "ex2" has package "PrJ_A". A group repository called ``ex1_ex2`` is created with the members in that order, and another called ``ex2_ex1`` with the inverse order. Then:
- pip install prj_a from ex1 resolves *prj_a*
- pip install prj_a from ex2 resolves *PrJ_A* (notice that the resolution is case-insensitive, as it should be)
- pip install prj_a from ex2_ex1 resolves *PrJ_A*
- pip install prj_a from ex1_ex2 resolves *PrJ_A* - this is unexpected, since ex1 is the first member and I expected *prj_a*
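For reference, the case-insensitive matching noted above is consistent with the name normalisation described in PEP 503, which also treats "_", "-" and "." as equivalent. A minimal sketch of that rule follows; the function name `normalize` is just an illustrative choice, and whether Nexus applies exactly this rule internally is an assumption on my part.

```python
import re


def normalize(name: str) -> str:
    """Normalise a project name following the rule given in PEP 503:
    runs of '-', '_' and '.' collapse to a single '-', then lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Every spelling used in this report maps to the same normalised name.
for candidate in ("prj_a", "PrJ_A", "prj-a", "PrJ-A"):
    print(candidate, "->", normalize(candidate))
```

Under this rule both uploaded projects share one normalised name, which is why a group repository has to decide which member should supply it.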
To reproduce the problem:
1. docker run -d -p 8081:8081 --name nexus sonatype/nexus3:3.15.1
2. The python projects:
$ cat prj_a/setup.py
from setuptools import setup, find_packages
setup(
    name='prj-a',
    version="1.2",
    author='test',
    author_email='test@example.com',
)
$ cat PrJ_A/setup.py
from setuptools import setup, find_packages
setup(
    name='PrJ-A',
    version="1.2",
    author='test',
    author_email='test@example.com',
)
3. Create the hosted repositories (I'm using nexus3-cli, but you can do this by hand instead):
pip install nexus3-cli twine
nexus3 repository create hosted pypi ex1
nexus3 repository create hosted pypi ex2
4. Build and upload prj_a:
cd prj_a
[ -d dist ] && rm -rf dist
python setup.py sdist
twine upload --repository-url http://localhost:8081/repository/ex1/ dist/* -u admin -p admin123
cd ../
5. Build and upload PrJ_A:
cd PrJ_A
[ -d dist ] && rm -rf dist
python setup.py sdist
twine upload --repository-url http://localhost:8081/repository/ex2/ dist/* -u admin -p admin123
cd ../
6. Create the group repositories. Call one ex1_ex2 (members ex1, then ex2) and the other ex2_ex1 (members ex2, then ex1).
7. Run the 4 commands to check everything:
pip install --force --no-cache --index-url http://localhost:8081/repository/ex1/simple prj_a
pip install --force --no-cache --index-url http://localhost:8081/repository/ex2/simple prj_a
pip install --force --no-cache --index-url http://localhost:8081/repository/ex1_ex2/simple prj_a
pip install --force --no-cache --index-url http://localhost:8081/repository/ex2_ex1/simple prj_a
8. Notice that the 3rd command identifies the package PrJ_A from ex2, not prj_a from ex1.
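To see what each repository actually serves, rather than inferring it from what pip installs, the simple index page for the project can be fetched and its links listed. The sketch below makes several assumptions: the helper names `links_in` and `fetch_links` are made up for illustration, the page is assumed to live at `/simple/prj-a/` (the PEP 503 layout, using the normalised name), and anonymous read access is assumed to be enabled on the Nexus instance.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def links_in(html: str) -> list:
    """Return the link targets found in a simple-index HTML page."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.hrefs


def fetch_links(repo: str, project: str = "prj-a",
                base: str = "http://localhost:8081/repository") -> list:
    """Fetch one repository's simple index page and return its links.
    The URL layout is an assumption based on PEP 503."""
    url = f"{base}/{repo}/simple/{project}/"
    with urlopen(url) as response:
        return links_in(response.read().decode("utf-8"))
```

Calling `fetch_links(repo)` for each of ex1, ex2, ex1_ex2 and ex2_ex1 against the running container should show which tarballs each index advertises.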
----------
I repeated the same steps against sonatype/nexus3:3.19.1 and can confirm that the problem remains.
Fundamentally I believe the problem is with the merge algorithm of group repositories - the indexes of both ex1_ex2 and ex2_ex1 show that the package "prj_a" has both "prj_a" and "PrJ_A" tarballs, where really each should only have one of them.
(image attached to show this)
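To make the expected behaviour concrete, here is a rough sketch of a merge in which the first group member that provides a project (by normalised name) supplies all of its files. This is only my reading of what the merge should do, not Nexus's actual implementation; the function names and the tarball filenames are illustrative.

```python
import re


def normalize(name):
    """PEP 503 style normalisation: collapse '-', '_', '.' runs and lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def merge_first_member_wins(members):
    """Merge member indexes in order.

    members is an ordered list of (repo_name, {project_name: [filenames]}).
    For each normalised project name, only the earliest member that has
    the project contributes its files; later members are ignored.
    """
    merged = {}
    for repo_name, projects in members:
        for project, files in projects.items():
            # setdefault leaves an entry from an earlier member untouched.
            merged.setdefault(normalize(project), (repo_name, list(files)))
    return merged


# Illustrative contents of the two hosted repositories.
ex1 = ("ex1", {"prj-a": ["prj-a-1.2.tar.gz"]})
ex2 = ("ex2", {"PrJ-A": ["PrJ-A-1.2.tar.gz"]})

print(merge_first_member_wins([ex1, ex2]))  # group ex1_ex2
print(merge_first_member_wins([ex2, ex1]))  # group ex2_ex1
```

With this rule each group index lists exactly one tarball for the project, and member order alone decides which one.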
From a security perspective, this has some interesting implications for anyone who groups an internal repository with a full proxy of PyPI. Even if all of the internal packages are namespaced to avoid collisions, somebody can upload a package to public PyPI whose name matches an internal one apart from its capitalisation and thereby affect package resolution (i.e. they are able to override a package that was once internal-only).
Issue Links
- is related to: NEXUS-12075 PyPi packages are case sensitive and not correctly redirected (Closed)
- relates: NEXUS-23406 PyPI: Reconcile component database from blob store breaks packages with normalised names (Open)