In the financial model, a hash is created for each document based on merchant name, document date and total amount.
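
As a rough illustration of a field-based hash like this, here is a minimal sketch. The function name `field_hash`, the normalization steps, the separator, and the choice of SHA-256 are all assumptions for the example, not details of the actual model.

```python
import hashlib


def field_hash(merchant_name: str, document_date: str, total_amount: str) -> str:
    """Hypothetical helper: hash the three extracted fields of a document."""
    # Assumed normalization: trim whitespace and lowercase the merchant name
    # so that trivial formatting differences do not change the hash.
    parts = [
        merchant_name.strip().lower(),
        document_date.strip(),
        total_amount.strip(),
    ]
    # Join with the ASCII unit separator so field boundaries stay unambiguous.
    payload = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Two documents with the same merchant, date and total map to the same hash under this scheme, even if the underlying files differ.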
Extend duplicate detection with a file-based hash, so that even the slightest change in the document produces a different hash.
In addition to returning the hash, return a field indicating whether the hash has been seen before.
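
One way this could look is sketched below. The names `FileHashResult`, `FileHashRegistry`, `file_hash` and `seen_before` are hypothetical, SHA-256 is an assumed choice, and the in-memory set stands in for whatever persistent store the real system would use.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FileHashResult:
    file_hash: str     # hex digest of the raw file bytes
    seen_before: bool  # True if this digest was already registered


class FileHashRegistry:
    """Hypothetical in-memory registry of file hashes seen so far."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def check(self, data: bytes) -> FileHashResult:
        # Hash the raw bytes, so any change to the file changes the digest.
        digest = hashlib.sha256(data).hexdigest()
        seen = digest in self._seen
        # Record the digest so later identical uploads are flagged.
        self._seen.add(digest)
        return FileHashResult(file_hash=digest, seen_before=seen)
```

Calling `check` twice with identical bytes returns `seen_before=False` the first time and `seen_before=True` the second, while a file that differs by a single byte is treated as new.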