A Python package that extracts individual applications from a combined PDF file, such as an Oxford HR application pack.
If you have uv installed, you can run the tool directly without installing it:
uvx corehr-pdf-split --input-pdf applicationspack.pdf --output-dir output
Install the package globally or in a virtual environment:
pip install corehr-pdf-split
Then run:
corehr-pdf-split --input-pdf applicationspack.pdf --output-dir output
Alternatively, install it as a uv tool:
uv tool install corehr-pdf-split
Then run:
corehr-pdf-split --input-pdf applicationspack.pdf --output-dir output
The tool processes the input PDF file and saves individual applications in the specified output directory. The output folder will be created if it does not exist yet. Each applicant's PDF is saved with a filename format: LastName,FirstName [ApplicantID].pdf.
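The naming scheme described above can be sketched in Python. This is a minimal illustration only, not the package's actual code: the helpers `applicant_filename` and `output_path` and the sample applicant details are hypothetical, and creating missing parent folders is an assumption.

```python
import tempfile
from pathlib import Path


def applicant_filename(last_name: str, first_name: str, applicant_id: str) -> str:
    """Build a name in the documented 'LastName,FirstName [ApplicantID].pdf' format.

    Hypothetical helper: the real tool may sanitise or order names differently.
    """
    return f"{last_name},{first_name} [{applicant_id}].pdf"


def output_path(output_dir: Path, last_name: str, first_name: str, applicant_id: str) -> Path:
    """Return the destination path, creating the output folder if it is missing."""
    # Assumption: missing parent folders are created as well.
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir / applicant_filename(last_name, first_name, applicant_id)


if __name__ == "__main__":
    # Demonstrate with a throwaway directory and made-up applicant details.
    with tempfile.TemporaryDirectory() as tmp:
        print(output_path(Path(tmp) / "output", "Lovelace", "Ada", "12345").name)
```

Because the name contains a comma, a space, and square brackets, quote it when passing it to other shell commands.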
uvx corehr-pdf-split --input-pdf applicationspack.pdf --output-dir output
This will process the applicationspack.pdf file and save individual applications in the output directory.
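If you have several application packs, one possible approach is to call the command from a short Python script. The sketch below assumes `corehr-pdf-split` is already installed and on your PATH; the functions `build_command` and `split_packs` and the `packs/` folder layout are hypothetical, and only the `--input-pdf` and `--output-dir` options shown above are used.

```python
import subprocess
from pathlib import Path


def build_command(input_pdf: Path, output_dir: Path) -> list[str]:
    """Assemble the documented command line as an argument list."""
    return [
        "corehr-pdf-split",
        "--input-pdf", str(input_pdf),
        "--output-dir", str(output_dir),
    ]


def split_packs(packs_dir: Path, output_root: Path) -> None:
    """Run the tool once per PDF in packs_dir (hypothetical layout).

    Each pack gets its own output folder named after the pack file.
    """
    for pack in sorted(packs_dir.glob("*.pdf")):
        # check=True raises CalledProcessError if the tool exits with an error.
        subprocess.run(build_command(pack, output_root / pack.stem), check=True)


if __name__ == "__main__":
    split_packs(Path("packs"), Path("output"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.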
If you want to contribute to or modify this project:
- uv for dependency management
- Clone this repository:
  git clone https://github.com/synthetic-society/corehr-pdf-split.git
  cd corehr-pdf-split
- Install dependencies:
  uv sync
- Set up pre-commit hooks:
  uvx pre-commit install
- Run the tool in development mode:
  uv run corehr-pdf-split --input-pdf <path_to_input_pdf> --output-dir <path_to_output_directory>
We use pre-commit hooks to ensure code quality. Run checks manually with:
uvx pre-commit run --all-files
To build the package:
uv build
To publish to PyPI (maintainers only):
uv publish
This project is available under the MIT License.
Contributions, issues, and feature requests are welcome.