[enhance] Auto rename to have other rename functionality like pdfgrep #330

Open
Frooodle opened this issue Aug 29, 2023 · 11 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@Frooodle
Member

Use the line that contains x as the name
Use the text between x and y as the name
Use line number x as the name
Use the text before/after x as the name (e.g. for "name: Anthony S", the name could be the text after 'name:')
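The four options above could be sketched as small functions over the lines of text already extracted from a PDF. This is only an illustration, assuming the text is available as a list of strings; the class `RenameRules` and its method names are hypothetical and do not exist in the codebase.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper illustrating the four proposed rules; not existing project code. */
final class RenameRules {

    /** Rule 1: the first line that contains the marker. */
    static Optional<String> lineContaining(List<String> lines, String marker) {
        return lines.stream().filter(l -> l.contains(marker)).map(String::trim).findFirst();
    }

    /** Rule 2: the text between two markers on the same line. */
    static Optional<String> textBetween(List<String> lines, String start, String end) {
        for (String line : lines) {
            int from = line.indexOf(start);
            if (from < 0) continue;
            from += start.length();
            int to = line.indexOf(end, from);
            if (to >= 0) return Optional.of(line.substring(from, to).trim());
        }
        return Optional.empty();
    }

    /** Rule 3: the line at a 1-based line number. */
    static Optional<String> lineNumber(List<String> lines, int n) {
        return (n >= 1 && n <= lines.size())
                ? Optional.of(lines.get(n - 1).trim())
                : Optional.empty();
    }

    /** Rule 4: the text after the marker, up to the end of that line. */
    static Optional<String> textAfter(List<String> lines, String marker) {
        for (String line : lines) {
            int at = line.indexOf(marker);
            if (at >= 0) return Optional.of(line.substring(at + marker.length()).trim());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Invoice 2023", "name: Anthony S", "Ref [A-17] paid");
        System.out.println(textAfter(lines, "name:").orElse("(no match)"));
    }
}
```

A "text before x" rule would mirror `textAfter` by taking `line.substring(0, at)` instead. Returning `Optional` leaves the caller free to fall back to another rule when nothing matches.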

@Frooodle added the enhancement and good first issue labels on Jan 4, 2024
@souviksenapati

Can you elaborate on what you are trying to achieve?

@Frooodle
Member Author

Frooodle commented Jan 6, 2024

The existing auto rename functionality renames based on the top x lines, by looking at which text has the largest font.

This could be enhanced by letting the user say how many lines to check,

or by renaming based on a regex y,
etc.

So if I wanted a doc to be renamed based on the company name in the PDF,

I could use a regex such as Company: ([a-z]+) or something.
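As a rough sketch of the regex idea, the first capture group of a user-supplied pattern could become the new name. The class `RegexRename` and the method `firstGroup` are hypothetical names used only for this example. Note that `[a-z]+` on its own matches lowercase letters only, so the sketch assumes the pattern is compiled case-insensitively.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; illustrates the regex idea only. */
final class RegexRename {

    /** Returns capture group 1 of the first match, if the regex has a group and matches. */
    static Optional<String> firstGroup(String text, String regex) {
        // CASE_INSENSITIVE is an assumption so that [a-z]+ also matches "Acme".
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text);
        if (m.find() && m.groupCount() >= 1) {
            return Optional.ofNullable(m.group(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String text = "Invoice\nCompany: Acme Ltd\nTotal: 10";
        System.out.println(firstGroup(text, "Company: ([a-z]+)").orElse("(no match)"));
    }
}
```

With this pattern the match stops at the first non-letter, so "Acme Ltd" yields only "Acme"; a pattern such as `Company: (.+)` would take the rest of the line.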

@Frooodle
Member Author

Frooodle commented Jan 6, 2024

Or the other options I listed above as well, since regex can be complex for some people.

@TomTinking

I have a use case that illustrates how to get a first stab at this.
Take a PDF of a payslip from work.
If you are lucky, it will have a sensible filename when you download it.
However, I recently downloaded a bunch of slips and they just had numbers in the filename that didn't seem to relate to any sort of time or date, e.g. "Payslip_55683545.pdf".
Today, if I use rename PDF, the filename ends up very long and unusable, as the PDF file has very little metadata.
Ideally I would like to pdfgrep for the phrase "Date", then match the date that follows (in my case it's "Date: 31/03/2022") and use that value in my renaming script.
So the desired outcome might be "Payslip_31032022.pdf", for example.
Looking through my payslips for a specific month's slip is then easier by filename.

I'm sure this applies to many documents you can download (bank statements often end up with weird filenames).
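The payslip case could look roughly like the sketch below, assuming the PDF text has already been extracted into a string and that the date always appears as `Date: dd/MM/yyyy`. The class `PayslipRename` and the method `newName` are hypothetical names for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; builds a filename from a "Date: dd/MM/yyyy" line. */
final class PayslipRename {

    // Assumed format: "Date:" followed by optional whitespace and dd/MM/yyyy.
    private static final Pattern DATE =
            Pattern.compile("Date:\\s*(\\d{2})/(\\d{2})/(\\d{4})");

    /** Returns e.g. "Payslip_31032022.pdf", or empty if no date is found. */
    static Optional<String> newName(String prefix, String text) {
        Matcher m = DATE.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        // Concatenate day, month, and year without the slashes.
        return Optional.of(prefix + "_" + m.group(1) + m.group(2) + m.group(3) + ".pdf");
    }

    public static void main(String[] args) {
        String text = "Employee: A. Example\nDate: 31/03/2022\nNet pay: 1000.00";
        System.out.println(newName("Payslip", text).orElse("(no date found)"));
    }
}
```

Slashes are dropped because they are not valid in filenames on most systems. Emitting the groups in year, month, day order instead would make the files sort chronologically, if that matters more than matching the printed date.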

@tanseer123

Hi @Frooodle,

I'd like to work on this enhancement. The proposed features sound great, and I'd be excited to contribute to improving the auto rename functionality. I'll start by exploring the existing codebase and planning out the implementation for the new renaming options.

Please let me know if there are any specific guidelines or additional information I should consider before getting started.

Thanks!

@Frooodle
Member Author

Please have a go! If you need help, Discord would be the best place to reach out.
Sadly, we do not have any detailed guides for developers other than our general contributing.md.

@tanseer123

Hi @Frooodle,

Thank you for the approval and the guidance. I'll start working on the implementation as planned.

However, before I dive in, could you provide some pointers on where in the codebase the current auto rename functionality is implemented? Any specific files or functions I should focus on initially would be very helpful.

I'll join the Discord channel for further questions and discussions as well.

Thanks again for the opportunity to contribute and for your assistance!

@Frooodle
Member Author

Honestly, it should only touch those files. Anything else you should be able to work out by referencing that Java class; for example, for extra params you would need to edit ExtractHeaderRequest.java, etc.

@tanseer123

Hi @Frooodle,

Thank you for providing the links and the guidance. I’ll start by reviewing auto-rename.html and AutoRenameController.java to understand the current implementation and identify where to make the changes.

I’ll also check out ExtractHeaderRequest.java and other relevant classes to ensure all necessary modifications are covered.

If I have any specific questions or need further clarification as I work through this, I'll reach out on Discord.

Thanks again for your support!

@tanseer123

Hi @Frooodle,

I have completed the changes and raised a pull request for the enhancement.

The updated version does the following:

  1. It first attempts to find a filename using the keyword-based method.
  2. If keywords are specified, it looks for entire lines containing the keyword or text after the keyword.
  3. If no suitable filename is found using the keyword, it falls back to the largest font method.
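The three steps above can be sketched as follows. This is an illustration of the described behaviour, not the code from the pull request: the class `KeywordFirstRename`, the `TextLine` type, and the `useWholeLine` flag are all hypothetical, and the sketch assumes each extracted line carries its font size.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of keyword-first naming with a largest-font fallback. */
final class KeywordFirstRename {

    /** A line of extracted text with its font size (assumed model). */
    static final class TextLine {
        final String text;
        final float fontSize;

        TextLine(String text, float fontSize) {
            this.text = text;
            this.fontSize = fontSize;
        }
    }

    /** Steps 1 and 2: the whole matching line, or only the text after the keyword. */
    static Optional<String> byKeyword(List<TextLine> lines, String keyword, boolean useWholeLine) {
        if (keyword == null || keyword.isBlank()) {
            return Optional.empty();
        }
        for (TextLine line : lines) {
            int at = line.text.indexOf(keyword);
            if (at < 0) continue;
            String candidate = useWholeLine
                    ? line.text
                    : line.text.substring(at + keyword.length());
            candidate = candidate.trim();
            if (!candidate.isEmpty()) return Optional.of(candidate);
        }
        return Optional.empty();
    }

    /** Step 3: the non-blank line set in the largest font. */
    static Optional<String> byLargestFont(List<TextLine> lines) {
        return lines.stream()
                .filter(l -> !l.text.isBlank())
                .max(Comparator.comparingDouble((TextLine l) -> l.fontSize))
                .map(l -> l.text.trim());
    }

    /** Tries the keyword first and falls back to the largest font. */
    static Optional<String> suggestName(List<TextLine> lines, String keyword, boolean useWholeLine) {
        return byKeyword(lines, keyword, useWholeLine).or(() -> byLargestFont(lines));
    }

    public static void main(String[] args) {
        List<TextLine> lines = List.of(
                new TextLine("ACME PAYROLL", 18f),
                new TextLine("Company: Acme Ltd", 10f));
        System.out.println(suggestName(lines, "Company:", false).orElse("(no name)"));
    }
}
```

Keeping the two strategies as separate methods means the fallback order lives in one place, `suggestName`, and further rules could be chained there with additional `or` calls.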

You can review the pull request #1604.

Please let me know if there are any further changes or improvements needed. I am happy to make adjustments as required.

Thanks again for your guidance and support!

Best regards,
@tanseer123
