simpler directory structure for python/resources #402

Open
ms-lolo opened this issue Dec 12, 2024 · 5 comments

Comments

@ms-lolo
Collaborator

ms-lolo commented Dec 12, 2024

trying to find a structure with fewer directories and the ability to package resources into released wheels. this is what i'm currently finding pleasant along with some naming conventions for certain types of python packages.

  • src/rats/… for standard rats python import packages belonging to the rats.* namespace package. this is what we have now but without the additional subfolder src/python.
  • src/rats_resources/ as a new namespace package that is meant to only hold non-python files. i follow the same naming conventions as src/rats where there is a sub-package per domain; but files inside each domain follow the naming conventions of standard resources which uses dashes instead of underscores.
  • src/rats_e2e also follows the structure of src/rats and contains the end-to-end or integration tests of the component. these are shipped with the wheel, unlike the things in test/, specifically because they are meant to be runnable without a testing framework. they should run on final artifacts before publishing instead of in a development environment.
  • test/* matches the new directory structure above without the _e2e package. unit tests and lightweight integration tests meant to be run on a development environment install that includes tools like pytest.

this doesn't stop us from using any other rats_* prefix for packages, but it's starting to show a convention with rats_e2e (new), rats_resources (new), and rats_test (existing) where we group things at the top level before giving people short and descriptive import names below the top level.

Example:

  • src/rats/apps has libraries for creating applications
  • src/rats_e2e/apps has some example apps and other runnable code to ensure rats.apps is well integrated
  • src/rats_resources/apps/some-asset.txt has some information used by the rats.apps libraries
  • test/rats_test/apps has the unit tests for the public facing api of rats.apps
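
a minimal sketch of how code in rats.apps might read src/rats_resources/apps/some-asset.txt, assuming the standard library's importlib.resources is used for the lookup. the temporary directory, the file contents, and the read_resource helper are all hypothetical and only exist to make the example runnable on its own.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def build_demo_layout(root: Path) -> Path:
    """Create a throwaway src/ tree that mimics the proposed layout."""
    src = root / "src"
    (src / "rats_resources" / "apps").mkdir(parents=True)
    # assumption: a single __init__.py makes rats_resources an importable package
    (src / "rats_resources" / "__init__.py").write_text("")
    (src / "rats_resources" / "apps" / "some-asset.txt").write_text(
        "hello from rats.apps\n", encoding="utf-8"
    )
    return src


def read_resource(domain: str, name: str) -> str:
    """Hypothetical helper: read one resource file belonging to a rats domain."""
    root = importlib.resources.files("rats_resources")
    return (root / domain / name).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    src = build_demo_layout(Path(tmp))
    sys.path.insert(0, str(src))
    importlib.invalidate_caches()
    try:
        asset_text = read_resource("apps", "some-asset.txt")
    finally:
        # undo the path and import changes so the demo leaves no trace
        sys.path.remove(str(src))
        sys.modules.pop("rats_resources", None)

print(asset_text, end="")
```

the same lookup should work whether the package comes from a development checkout or an installed wheel, since it goes through the import system rather than a hard-coded path.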
@jzazo
Collaborator

jzazo commented Dec 12, 2024

How about moving resources and e2e folders to the same level as src and test?

src/rats/apps
resources/rats
e2e/rats
test/rats

Just thinking e2e and resources are not exactly source code...

@ms-lolo
Collaborator Author

ms-lolo commented Dec 12, 2024

a few notes after our chat this morning:

  • rats_e2e is part of src/ because we want to ship it and make it visible to users of the libraries.
  • resources/rats should be possible without custom build-system configs, but both test/ and src/ have their own resources, so i'm not sure thinking about resources at the component level is possible. i need to be able to define resource files used by tests without shipping them with my wheel. i was thinking of trying resources/rats/… next to resources/rats_test/… at the root, but it would take custom build system configs as well (to ship resources not under a rats_test resource folder) and leave us without being able to support all of them.

i think the main difficulty here today is the lack of standards around configuring the layout of your project. PEP 621 helped make the [project] section consistent, but it does not specify any standards around the build system. so my intuition is to use completely standard structures, which would only allow src/ and test/ to exist as the two folders: one for the wheel (src/) and one for local development (test/).
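
a rough sketch of how one could check which top-level packages a built wheel actually ships, relying only on the fact that a wheel is a zip archive. the wheel file name and its contents below are made up for illustration; a real check would point top_level_packages at the artifact produced by the build.

```python
import tempfile
import zipfile
from pathlib import Path


def top_level_packages(wheel_path: Path) -> set[str]:
    """Return the top-level directory names in a wheel, ignoring metadata dirs."""
    with zipfile.ZipFile(wheel_path) as wheel:
        names = wheel.namelist()
    tops = {name.split("/", 1)[0] for name in names if "/" in name}
    return {top for top in tops if not top.endswith((".dist-info", ".data"))}


with tempfile.TemporaryDirectory() as tmp:
    # hypothetical wheel: everything under src/ is shipped, nothing from test/
    fake_wheel = Path(tmp) / "demo-0.0.0-py3-none-any.whl"
    with zipfile.ZipFile(fake_wheel, "w") as wheel:
        wheel.writestr("rats/apps/__init__.py", "")
        wheel.writestr("rats_e2e/apps/__init__.py", "")
        wheel.writestr("rats_resources/__init__.py", "")
        wheel.writestr("rats_resources/apps/some-asset.txt", "asset")
        wheel.writestr("demo-0.0.0.dist-info/METADATA", "Name: demo")
    shipped = top_level_packages(fake_wheel)

print(sorted(shipped))
```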

@jzazo
Collaborator

jzazo commented Dec 12, 2024

ok, not strongly opposed, but I would still separate resources folder in my own projects at the root level.

It is a bit inconsistent, but if I had test resources, I would put them in the test folder. I don't need such test resources, so not a compromise in practice for me.

I don't like src/rats_resources because I don't know if that is a python package or actual resources and you have to look inside to know.

@ms-lolo
Collaborator Author

ms-lolo commented Dec 17, 2024

> I don't like src/rats_resources because I don't know if that is a python package or actual resources and you have to look inside to know.

i think this has been my struggle the entire time. i think of resources as "not a python package" but the python ecosystem very much does not. so after this restructure, the answer to "is this a package?" is always yes; and some packages contain resources. this issue sets the convention that all packages only contain either python files or resources, and all packages that are for resources will end in _resources. so we always know the answer to your question without looking inside.

  • all files that are part of the business logic of the wheel are included in one or more import packages
  • no import package can have both resource files and python modules
  • all import packages that contain resources have a _resources suffix
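
the last two rules above could be checked mechanically. this is only a sketch under some assumptions: "resource" means any file without a .py suffix, an __init__.py is tolerated inside a *_resources package, and the find_violations helper and demo tree are hypothetical.

```python
import tempfile
from pathlib import Path


def find_violations(src: Path) -> list[str]:
    """List files that break the 'python or resources, never both' convention."""
    violations = []
    for package in sorted(p for p in src.iterdir() if p.is_dir()):
        is_resource_package = package.name.endswith("_resources")
        for file in sorted(f for f in package.rglob("*") if f.is_file()):
            rel = file.relative_to(src).as_posix()
            is_python = file.suffix == ".py"
            if is_resource_package and is_python and file.name != "__init__.py":
                violations.append(f"python module in resource package: {rel}")
            elif not is_resource_package and not is_python:
                violations.append(f"resource file outside a *_resources package: {rel}")
    return violations


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    demo_files = [
        "rats/apps/__init__.py",
        "rats/apps/notes.txt",  # breaks the rules: resource next to python
        "rats_resources/__init__.py",
        "rats_resources/apps/some-asset.txt",
        "rats_resources/apps/helper.py",  # breaks the rules: python in resources
    ]
    for name in demo_files:
        path = src / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("")
    violations = find_violations(src)

for line in violations:
    print(line)
```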

since rats is an empty namespace package, and the first meaningful names are the sub-packages (like apps, pipelines, processors, etc.), we can add a couple more assumptions specifically for rats:

  • there is one empty namespace package called rats_resources
  • all sub-packages of rats_resources belong to exactly one sub-package of rats
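
the second assumption above can also be checked with a few lines. again just a sketch: it assumes both packages live side by side under src/ and that each domain is a direct child directory; the orphaned_resource_domains helper and the demo domains are made up.

```python
import tempfile
from pathlib import Path


def domains(package_dir: Path) -> set[str]:
    """Direct child directories of a package, skipping dunder dirs like __pycache__."""
    return {
        p.name
        for p in package_dir.iterdir()
        if p.is_dir() and not p.name.startswith("__")
    }


def orphaned_resource_domains(src: Path) -> list[str]:
    """Resource domains in rats_resources with no matching sub-package in rats."""
    return sorted(domains(src / "rats_resources") - domains(src / "rats"))


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    for name in ["rats/apps", "rats/pipelines", "rats_resources/apps", "rats_resources/widgets"]:
        (src / name).mkdir(parents=True)
    orphans = orphaned_resource_domains(src)

print(orphans)
```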

the python community puts .py files next to any related resource. the proposed structure is still fighting against that convention, but accepting the other one: that all resources are part of python import packages. what i'm doing here is saying that any package ending in _resources is the sibling of the non-suffixed one and contains its resources. so we'll have a single __init__.py file in rats_resources/ in order to make it a valid python package, and everything else can be assumed to be the resource for the matching package in rats/. this is the most separation i've been able to achieve so far without relying on undocumented behavior in any given packaging tool. everything else i've tried either works in the development environment, or in the wheel installation, but never both.

@jzazo
Collaborator

jzazo commented Dec 17, 2024

Ok, I don't usually require resources, but if they are part of packages and the community puts them alongside python modules, I am fine with that. We can follow the convention you propose in rats, but I am not too keen on extending this convention to other projects, if there is no consensus in the community.

But thanks for explaining all of this, and documenting the inconsistencies between setting up the dev environment and packaging...
