The biggest chunk of this 2 GB is the docs (around 1.6 GB), mostly because we keep several versions at around 100 MB each (and this keeps growing with every release, because we expand our docs).
We already don't keep the Java reference for older versions because it is too big. We could do something similar for the Python and C++ reference docs.
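To check where the space actually goes before trimming anything, a small script can total the bytes under each top-level directory of the docs tree. This is only a sketch: it assumes each version lives in its own subdirectory of a local checkout, and the function names are made up for illustration.

```python
import os
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def sizes_by_subdir(docs_root: Path) -> dict[str, int]:
    """Map each immediate subdirectory of docs_root to its total size in bytes."""
    return {
        child.name: dir_size_bytes(child)
        for child in sorted(docs_root.iterdir())
        if child.is_dir()
    }
```

Calling `sizes_by_subdir(Path("docs"))` on a checkout of the site branch and sorting the result by value would show which versions dominate. Note that this measures the working tree only, not the git history.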
In general we could also trim down the number of versions we keep for older docs.
It's probably also a good idea to do a one-time full cleanup of the /docs/dev/ docs (e.g. I see that the R docs have accumulated multiple versions of bootstrap, probably because we always overwrite what exists rather than replace it), although that's probably not that much space.
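The difference between overwriting and replacing could look roughly like this. It is a sketch, not what the current CI job does: `replace_tree` is a hypothetical helper, and the paths are whatever the deploy step uses.

```python
import shutil
from pathlib import Path


def replace_tree(src: Path, dst: Path) -> None:
    """Make dst an exact copy of src.

    Unlike copying on top of an existing directory, this removes dst first,
    so files that exist only in the old output (e.g. an outdated bundled
    library) do not linger.
    """
    if dst.exists():
        shutil.rmtree(dst)
    shutil.copytree(src, dst)
```

Deleting the target first means a failed copy leaves the directory missing, so in a deploy job this would normally run in the working tree before the commit is created.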
When generating the asf-site branch, keep only the latest snapshot: remove previous commits during the CI site generation job.
That's certainly responsible for quite some of the size as well. However, we do sometimes commit manually to the asf-site branch, so ideally we would keep those commits.
I am wondering if it would be possible to only do that for a subdirectory, like the docs/dev/ that gets updated from nightly CI. If we could remove the history for just that subdirectory (I don't know if git easily allows this), I think that would already give a large chunk of the benefit.
I can look at this once #449 is done. Moving the older versions to Git LFS seems reasonable, perhaps keeping only the last 4 versions in the repository. It is nice to have the older documentation, especially since version updates are relatively frequent, but the older docs are unlikely to change much.
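One common way to keep only the latest snapshot is to recreate the branch from an orphan commit and force-push it. The sketch below just builds the git command sequence; the temporary branch name, the default remote name `origin`, and the helper functions are assumptions for illustration, not the existing CI job.

```python
import subprocess


def squash_branch_commands(
    branch: str, message: str, remote: str = "origin"
) -> list[list[str]]:
    """Return git commands that replace `branch` with a single snapshot commit.

    Assumes the working tree is checked out at `branch` and already holds the
    freshly generated site.
    """
    tmp = f"{branch}-snapshot"  # hypothetical temporary branch name
    return [
        ["git", "checkout", "--orphan", tmp],   # new branch with no parents
        ["git", "add", "--all"],                # stage the whole working tree
        ["git", "commit", "-m", message],       # the single snapshot commit
        ["git", "branch", "-D", branch],        # drop the old history locally
        ["git", "branch", "-m", branch],        # rename the snapshot branch
        ["git", "push", "--force", remote, branch],
    ]


def run_all(commands: list[list[str]], cwd: str) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)
```

This discards every earlier commit on the branch, including any made by hand, and the old objects only stop counting toward repository size once they are garbage-collected on the hosting side.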
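Git itself has no direct "forget the history of this folder" command, but the separate git-filter-repo tool can strip a path from every commit on a branch. The sketch below only builds the command line; it assumes git-filter-repo is installed and run in a fresh clone, and the helper name is made up.

```python
def drop_path_history_command(path: str, branch: str) -> list[str]:
    """Return a git-filter-repo command that removes `path` from all commits
    reachable from `branch`, leaving other paths in those commits untouched.
    """
    return [
        "git", "filter-repo",
        "--path", path,       # select this path...
        "--invert-paths",     # ...and keep everything except it
        "--refs", branch,     # only rewrite this branch
    ]
```

For example, `drop_path_history_command("docs/dev", "asf-site")` would remove docs/dev from the tip as well, so the current snapshot would have to be committed again afterwards. Manual commits survive in content, but every commit on the branch gets a new hash and the result still needs a force-push.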
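Picking which versions stay and which get archived could be done along these lines. It is a sketch that assumes version directories are named with dotted numbers (the names in the example are made up), and it ignores anything else such as a dev directory.

```python
def split_versions(names: list[str], keep: int = 4) -> tuple[list[str], list[str]]:
    """Split directory names into (kept, archived), newest version first.

    Only names made entirely of dot-separated integers count as versions;
    other entries are left out of both lists.
    """
    def is_version(name: str) -> bool:
        return all(part.isdigit() for part in name.split("."))

    def sort_key(name: str) -> tuple[int, ...]:
        # Compare numerically so that "10.0" sorts after "9.0".
        return tuple(int(part) for part in name.split("."))

    ordered = sorted(filter(is_version, names), key=sort_key, reverse=True)
    return ordered[:keep], ordered[keep:]
```

Sorting numerically rather than as strings matters once version numbers reach two digits, otherwise "10.0" would be treated as older than "9.0".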
The repository is quite large: 2 GB for the contents. Possible things to reduce the size: