
Academic PDF Submission: Compress, Add Page Numbers, and Convert to PDF/A

A complete guide to preparing your thesis, dissertation, or journal manuscript for academic submission — covering the exact requirements that differ between institutions and journals, and the steps to meet them reliably.

11 min read  ·  Updated May 17, 2026


Why academic PDF submission has unique requirements

Submitting a research paper or thesis is not the same as emailing any other PDF. Universities, journals, and institutional repositories have specific technical requirements that exist for substantive reasons: the submission portal may have a hard file size limit enforced by software; the repository needs the PDF to be readable in 30 years when today's fonts may no longer be installed on any computer; the graduate school needs page numbers in a specific position and format for their internal review process; the journal needs the PDF to pass an automated compliance check before it reaches a reviewer.

Failing one of these requirements typically results in the submission being rejected outright and returned for resubmission — which adds days to the process and, in the case of a thesis submission with a defence date approaching, can cause significant problems. This guide covers the four most common technical requirements: file size, page numbering, font embedding, and PDF/A conversion.

Step 0: Read the submission requirements first

This step is not optional and it comes before opening any PDF tool. Every institution and journal has its own requirements page, and they differ in material ways. What to look for before starting:

Requirement | Common values | Where to find it
File size limit | 10 MB, 25 MB, 100 MB | Submission portal help page
PDF version / standard | PDF/A-1b, PDF/A-2b, PDF 1.4+ | Graduate school format guide
Page numbering | Arabic from body page 1; Roman for front matter | Thesis formatting guidelines
Fonts | All fonts must be embedded | Repository technical requirements
Colour space | sRGB or CMYK (for print) | Journal author guidelines
Supplementary files | Merged or separate; size limits apply per file | Journal submission system

Many institutions publish a specific thesis format guide or author preparation checklist. Find it, save it, and check each item off as you work through this workflow.
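One way to keep that checklist honest is to record the requirements as data and test the file against them. The sketch below is illustrative only: the `SubmissionRequirements` class, its field names, and the example limits are assumptions rather than values from any particular portal, and it assumes the portal counts 1 MB as 1,000,000 bytes (some count 1,048,576).

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SubmissionRequirements:
    """Hypothetical record of what one portal asks for."""
    max_size_mb: float          # file size limit from the portal help page
    pdf_standard: str           # e.g. "PDF/A-2b", from the format guide
    fonts_embedded: bool = True


def size_in_mb(path: Path) -> float:
    """File size in megabytes, assuming 1 MB = 1,000,000 bytes."""
    return path.stat().st_size / 1_000_000


def within_size_limit(path: Path, req: SubmissionRequirements) -> bool:
    """True when the file is no larger than the portal's limit."""
    return size_in_mb(path) <= req.max_size_mb
```

For example, `within_size_limit(Path("thesis.pdf"), SubmissionRequirements(max_size_mb=25, pdf_standard="PDF/A-2b"))` reports whether a local file fits a 25 MB limit before you try the upload.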

Step 1: Compress to meet file size limits

A thesis with many figures, photographs, or scanned documents can easily reach 50–200 MB. Most submission portals impose limits of 10–100 MB. Compression is usually the first step because the file size affects whether anything else is possible within the portal.

What compression does to your PDF

PDF compression primarily affects embedded images — figures, photographs, diagrams — rather than text or vector graphics. The text in your document (paragraphs, headings, references) is stored as text characters and takes negligible space. Images are stored as bitmaps, and it is these that compression algorithms reduce.

Upload to way2pdf Compress PDF and choose a compression level based on your content:

  • Low compression: Minimal quality reduction. Best for manuscripts with high-resolution scientific figures, microscopy images, or maps where detail matters. Typically reduces file size by 10–30%.
  • Medium compression: Appropriate for most academic manuscripts. Reduces image quality slightly — not visible at normal reading zoom (100%), but may be visible when zooming in to pixel level. Typically reduces file size by 35–60%.
  • High compression: Use only when medium compression is insufficient to meet the file size limit. Visible quality reduction on photographs at 100% zoom. Avoid for manuscripts where figure quality will be evaluated by reviewers.
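The typical reductions quoted above make it possible to estimate, before uploading, whether a level has a realistic chance of meeting the limit. This is a rough sketch: the function names are hypothetical, the ranges are simply the percentages from the list, and real results depend on how image-heavy the file is.

```python
from typing import Optional, Tuple

# Typical size reductions quoted above, as (minimum, maximum) fractions.
# High compression has no quoted range, so it is left out of the estimate.
TYPICAL_REDUCTION = {
    "low": (0.10, 0.30),
    "medium": (0.35, 0.60),
}


def estimated_size_range(size_mb: float, level: str) -> Tuple[float, float]:
    """Return (best case, worst case) output size in MB for a level."""
    min_cut, max_cut = TYPICAL_REDUCTION[level]
    return (size_mb * (1 - max_cut), size_mb * (1 - min_cut))


def lowest_level_likely_to_fit(size_mb: float, limit_mb: float) -> Optional[str]:
    """Pick the gentlest level whose worst case still fits the limit."""
    for level in ("low", "medium"):
        _, worst_case = estimated_size_range(size_mb, level)
        if worst_case <= limit_mb:
            return level
    return None  # the typical ranges do not guarantee a fit; compress and measure
```

By this estimate an 80 MB thesis at medium compression would land somewhere between roughly 32 MB and 52 MB, so it should clear a 100 MB limit comfortably but cannot be counted on to clear a 25 MB one.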

Checking output quality

After compression, open the output at 100% zoom in your PDF reader and compare a figure-heavy page against your source. In particular:

  • Check that graphs with fine lines and small labels are still legible.
  • Check photographs for visible artefacts — blocky regions or colour banding are signs of excessive compression.
  • Check that any embedded code listings or tables with small text are still sharp.
  • Verify that the text itself is unaffected — it should look identical to the source at any zoom level.

If the quality is unacceptable at medium compression, consider removing large figures from the main manuscript and submitting them as separate supplementary files if the journal allows this. Many journals explicitly permit high-resolution figures as separate uploads while requiring the manuscript text PDF to meet the file size limit.

Step 2: Add or verify page numbers

Academic documents have some of the most specific page-numbering requirements of any document type. The typical convention is:

  • Front matter (title page, abstract, acknowledgements, table of contents, list of figures): Roman numerals (i, ii, iii …). The title page usually receives the number "i" but it is typically not printed visibly on the page itself — the abstract becomes page "ii" with a visible number.
  • Body text (introduction through conclusion, references, appendices): Arabic numerals (1, 2, 3 …) starting from 1 on the first page of Chapter 1.
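The convention above can be sketched as a function that produces the label each page should carry. The names `to_roman` and `page_labels` are illustrative, and the sketch does not model the common rule that the title page's number is counted but not printed.

```python
from typing import List


def to_roman(n: int) -> str:
    """Convert a positive integer to a lowercase Roman numeral."""
    if n <= 0:
        raise ValueError("Roman numerals start at 1")
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    parts = []
    for value, symbol in numerals:
        count, n = divmod(n, value)  # how many times this symbol fits
        parts.append(symbol * count)
    return "".join(parts)


def page_labels(front_matter_pages: int, body_pages: int) -> List[str]:
    """Roman labels for the front matter, then Arabic labels restarting at 1."""
    roman = [to_roman(i) for i in range(1, front_matter_pages + 1)]
    arabic = [str(i) for i in range(1, body_pages + 1)]
    return roman + arabic
```

Comparing this list against the numbers actually printed in the exported PDF is a quick way to spot a section that restarts or skips.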

When to use way2pdf for page numbering

If your manuscript was written in Word or LaTeX and exported with correct page numbers already embedded, check that they are positioned correctly and that no numbers are cut off at the page margin. If they look correct in the PDF, you do not need to add numbers again — proceed to Step 3.

Use way2pdf's Add Page Numbers tool when:

  • Your PDF has no page numbers at all (common when printing to PDF from a presentation or from a web browser).
  • The existing numbers are in the wrong position (e.g., the institution requires footer-centre but the document has them top-right).
  • You have assembled the document from multiple sources and the numbering is inconsistent or restarting from 1 mid-document.
  • The institution requires that all pages — including front matter — receive a visible number, starting from 1 with no Roman numeral section.

Practical limitation

way2pdf's Add Page Numbers tool applies a single sequential numbering scheme starting from a number you choose, in a single format (Arabic or Roman). It does not currently support dual numbering schemes (Roman for front matter, Arabic for body) in a single pass. If your institution requires both, one approach:

  1. Upload only the front-matter pages and add Roman numerals starting from i.
  2. Upload only the body pages and add Arabic numerals starting from 1.
  3. Merge the two numbered sections together.
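The page ranges for the two separate uploads follow directly from the front-matter page count. A minimal sketch, with a hypothetical `split_ranges` helper and 1-based inclusive ranges:

```python
from typing import Tuple

PageRange = Tuple[int, int]  # (first page, last page), 1-based and inclusive


def split_ranges(total_pages: int, front_matter_pages: int) -> Tuple[PageRange, PageRange]:
    """Return the front-matter range and the body range of one PDF."""
    if not 0 < front_matter_pages < total_pages:
        raise ValueError("front matter must have at least 1 page and leave room for a body")
    front = (1, front_matter_pages)
    body = (front_matter_pages + 1, total_pages)
    return front, body
```

For a 212-page thesis with 12 front-matter pages this gives pages 1 to 12 for the Roman pass and pages 13 to 212 for the Arabic pass.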

Step 3: Merge appendices and supplementary material (if required)

Some institutions require all supplementary content — appendices, survey instruments, code listings, data tables — to be included within the single thesis PDF. Others allow or require them as separate files. If your institution requires a single merged PDF:

  1. Ensure all supplementary PDFs are formatted consistently (same page size, same margin standard if specified).
  2. Merge them after the main body using way2pdf Merge PDF.
  3. Verify that the merged file's page count equals the main document's page count plus the page counts of every appended file.
  4. Check that the table of contents in the front matter correctly references the appendix page numbers (these will be the merged-document page numbers, not the appendix's internal pagination).
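The page-count check and the table-of-contents check both come down to simple addition. The sketch below uses hypothetical helper names and computes physical page positions in the merged file; the number printed on each page may differ if the front matter uses Roman numerals.

```python
from typing import Dict, List, Tuple


def merged_page_count(main_pages: int, appendices: List[Tuple[str, int]]) -> int:
    """Expected page count after merging: main body plus every appendix."""
    return main_pages + sum(count for _, count in appendices)


def appendix_start_pages(main_pages: int, appendices: List[Tuple[str, int]]) -> Dict[str, int]:
    """Map each appendix title to its first physical page in the merged PDF."""
    starts = {}
    next_page = main_pages + 1  # the first appendix begins right after the main body
    for title, page_count in appendices:
        starts[title] = next_page
        next_page += page_count
    return starts
```

With a 180-page main document followed by a 6-page Appendix A and a 14-page Appendix B, the merged file should have 200 pages, with Appendix A starting on page 181 and Appendix B on page 187.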

Step 4: Convert to PDF/A for archival repositories

PDF/A is a subset of the PDF format standardised (as ISO 19005) for long-term archiving. It is required or recommended by many institutional repositories and thesis services (for example EThOS in the UK, ProQuest in the US, and DART-Europe; check each one's current guidelines) and by some journals for final accepted manuscripts. The "A" stands for Archive.

What PDF/A actually requires

A PDF/A-compliant file differs from a standard PDF in specific ways:

  • All fonts must be embedded. If a font is referenced but not embedded, any computer that doesn't have that font installed will substitute a different font or fail to render the text. PDF/A requires every font used for visible text to be embedded (a subset containing only the glyphs used is sufficient), so the document renders in 2046 as it does in 2026.
  • No JavaScript. JavaScript can alter document content or behaviour — this is incompatible with the requirement that the archived document be static and unalterable.
  • No encryption or password protection. An encrypted or password-protected PDF cannot be opened without the password, and an archive must be able to read the file without restriction. Remove any protection before converting.
  • Colour spaces must be defined. Every colour in the document must either be device-independent or be covered by an output intent, which is an embedded ICC colour profile (typically sRGB for screen, a CMYK profile for print), so colours render consistently across devices.
  • No external content references. Links to external files or streams that are not self-contained within the PDF are not permitted — the archived PDF must be fully self-contained.
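Before converting, a crude scan of the raw file bytes can flag some of these features early. This is only a heuristic sketch with hypothetical names: it looks for PDF name tokens in uncompressed parts of the file, so tokens hidden inside compressed object streams go unnoticed and a short token such as `/JS` can match by accident. A clean result proves nothing about compliance.

```python
from typing import Dict

# Name tokens that hint at features PDF/A restricts.
# Heuristic only: compressed object streams are not decoded here.
SUSPECT_TOKENS = {
    "javascript": (b"/JavaScript", b"/JS"),
    "encryption": (b"/Encrypt",),
    "embedded_files": (b"/EmbeddedFile",),  # not allowed in PDF/A-1
}


def scan_pdf_bytes(data: bytes) -> Dict[str, bool]:
    """Report which suspect tokens appear anywhere in the raw bytes."""
    return {
        feature: any(token in data for token in tokens)
        for feature, tokens in SUSPECT_TOKENS.items()
    }


def scan_pdf_file(path: str) -> Dict[str, bool]:
    """Convenience wrapper that reads the file and scans it."""
    with open(path, "rb") as handle:
        return scan_pdf_bytes(handle.read())
```

Any `True` in the result is a prompt to look closer, for example to remove password protection before converting.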

PDF/A-1b vs PDF/A-2b

There are several PDF/A versions. The two you will encounter in academic submission:

  • PDF/A-1b: Based on PDF 1.4. The strictest and most widely supported by older repositories. Does not permit transparency at all, so any transparent objects in your figures are flattened during conversion, which may rasterise them. Does not support JPEG2000 compression or embedded files. Required by some legacy institutional systems.
  • PDF/A-2b: Based on PDF 1.7. Supports transparency, JPEG2000 compression (smaller file sizes), and embedded files, provided the embedded files are themselves PDF/A. Appropriate for modern submissions. Use this unless the repository explicitly requires PDF/A-1b.

Upload to way2pdf PDF/A Converter, select the appropriate standard, and download the result. The tool embeds all fonts, sets the colour profile to sRGB, removes JavaScript, and strips external references. Processing time is typically under one minute for a 200-page thesis.

Verifying PDF/A compliance

After conversion, verify that the output is genuinely compliant before submitting. Not all PDF/A converters produce fully compliant output for every input. The recommended free tool for verification is veraPDF (verapdf.org), an open-source PDF/A validator developed by a consortium led by the Open Preservation Foundation and the PDF Association. It is primarily a desktop and command-line application: validate your converted PDF against the PDF/A version you selected, and it will report any remaining compliance issues with references to the specific clause of the PDF/A standard being violated.

Common issues that VeraPDF catches:

  • Fonts that the converter couldn't embed because they are not embedded in the source PDF (rare, but occurs with some older Word-to-PDF exports)
  • Colour spaces on images that were specified as "DeviceRGB" without an output intent — the converter should set one, but some edge cases slip through
  • Metadata inconsistencies where XMP metadata contradicts the document info dictionary
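Before running a full validation, it can be useful to confirm which PDF/A level the converted file claims in its XMP metadata, since a mismatch with the level you selected is an early warning. The sketch below is an assumption-laden shortcut, not a validator: it searches the raw bytes for the `pdfaid:part` and `pdfaid:conformance` properties (written either as XML elements or as attributes) and reports only the claim, not whether the file honours it.

```python
import re
from typing import Optional

# XMP can carry the PDF/A identifier as elements (<pdfaid:part>2</pdfaid:part>)
# or as attributes (pdfaid:part="2"); \x22 and \x27 are the two quote characters.
_PART = re.compile(rb"pdfaid:part(?:>|=[\x22\x27])\s*(\d)")
_CONFORMANCE = re.compile(rb"pdfaid:conformance(?:>|=[\x22\x27])\s*([A-Za-z])")


def claimed_pdfa_level(data: bytes) -> Optional[str]:
    """Return e.g. 'PDF/A-2b' if the metadata claims a level, else None."""
    part = _PART.search(data)
    conformance = _CONFORMANCE.search(data)
    if part is None or conformance is None:
        return None
    level = conformance.group(1).decode("ascii").lower()
    return "PDF/A-" + part.group(1).decode("ascii") + level
```

A result of `None` means no claim was found in readable form; in either case the veraPDF report remains the authority on compliance.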

Common problems and fixes

The compressed PDF fails the repository's file size check

If medium compression is insufficient:

  1. Check whether supplementary figures can be submitted as separate files. This is often allowed, and the separate files may not be subject to the same size limit.
  2. Try high compression and compare figure quality at 100% zoom. If the figures are still acceptable, use it.
  3. Check whether the repository offers a higher file size tier for theses with extensive visual content. Some do.

Page numbers appear in the wrong position relative to the existing content

This happens when the source PDF has page content that extends into the margin area where the number is placed. Solution: add numbers with a slight inset from the page edge (8–10 mm from the bottom for footer placement) so they don't overlap content. Alternatively, if the source document has a consistent blank footer space, use a position that lands within it.
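PDF coordinates are measured in points (1/72 inch) from the bottom-left corner of the page by default, so an inset given in millimetres has to be converted before it can be compared with where the content ends. A small sketch of that arithmetic, with hypothetical function names:

```python
from typing import Tuple

POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def mm_to_points(mm: float) -> float:
    """Convert millimetres to PDF points (1 point = 1/72 inch)."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


def footer_centre(page_width_pt: float, inset_mm: float) -> Tuple[float, float]:
    """(x, y) in points for a centred footer number, from the bottom-left corner."""
    return (page_width_pt / 2, mm_to_points(inset_mm))
```

An 8–10 mm inset works out to roughly 22.7–28.3 points, so on an A4 page (about 595 points wide) a centred footer number sits near x = 298, y = 23 to 28.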

PDF/A conversion changes the appearance of figures

Transparency flattening in PDF/A-1b conversion can change how layered vector figures look: gradients may become stepped, and transparent shadows may disappear. Switch to PDF/A-2b if your institution accepts it. If only PDF/A-1b is accepted, regenerate affected figures from their source (Python matplotlib, R ggplot2, Adobe Illustrator) without transparency effects.

The portal rejects the PDF despite PDF/A compliance

Some portals check additional constraints beyond the PDF/A standard — maximum page dimensions, specific metadata fields, or a maximum embedded file count. Check the portal's own help documentation for technical requirements beyond "PDF/A." Contact the institution's graduate school or repository manager directly if the portal acceptance criteria are unclear.
