DocHorizon Ideas Portal
Status Planned
Created by Steffen Wilhelm
Created on Jul 29, 2025

Prompt builder - Render PDF forms

Description:

Currently, the PromptBuilder component checks whether a document already has an OCR layer and skips OCR generation if one is found. This causes problems when the existing OCR layer is incomplete or does not reflect recent changes to the document.


Problem:

A user filled out a form and saved it as a PDF. The form fields were visibly filled in, but the resulting PDF's text layer did not contain the entered values. PromptBuilder nevertheless detected the existing OCR layer and did not start a new OCR process. As a result, the filled-in values could not be extracted via PromptBuilder.


Expected Behavior:

If an existing OCR layer is detected but fails to accurately represent all visible text (e.g., filled form fields), PromptBuilder should either:

  • Perform a validation pass to ensure OCR completeness and text layer fidelity, or

  • Provide an option to force a re-OCR of the document when anomalies are detected.
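
To make the two options above concrete, here is a minimal sketch of what such a check could look like. It is not PromptBuilder's actual implementation or API: the function names (`missing_field_values`, `should_run_ocr`) and the `force` flag are hypothetical, and the sketch assumes the text layer and the filled form field values have already been extracted from the PDF by some other step.

```python
from __future__ import annotations

import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def missing_field_values(text_layer: str, field_values: dict[str, str]) -> list[str]:
    """Return the names of filled form fields whose values are absent from the text layer.

    Hypothetical helper: both inputs are assumed to come from an earlier
    PDF-parsing step that is not shown here.
    """
    haystack = normalize(text_layer)
    missing = []
    for name, value in field_values.items():
        needle = normalize(value)
        # Empty fields cannot be missing from the text layer, so skip them.
        if needle and needle not in haystack:
            missing.append(name)
    return missing


def should_run_ocr(text_layer: str, field_values: dict[str, str], force: bool = False) -> bool:
    """Decide whether OCR should run even though a text layer already exists.

    `force` models the requested "force re-OCR" option; otherwise OCR is
    triggered only when the validation pass finds missing field values.
    """
    return force or bool(missing_field_values(text_layer, field_values))
```

With a text layer of `"Name: ____ Date: ____"` and fields `{"name": "Jane Doe", "date": "2025-07-29"}`, the validation pass reports both fields as missing, so OCR would be re-run. A simple substring check like this only covers form fields; detecting other kinds of incomplete text layers would need a different comparison.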

