Alongside the document or image, the request could carry additional information to validate against. For example, it could include a list of allowed phone brands that the image must have been taken with, or a list of countries to check against the GPS data in the EXIF.
Explainable AI: We could provide natural-language descriptions of why a document was flagged (e.g., "The font on the date field does not match the rest of the document").
Template Matching: We could maintain a "blocklist" database of known templates from online receipt and invoice generators.
Screenshot Detection: We could analyze EXIF data, aspect ratios (matching screen dimensions), and resolution (72 DPI screen output vs. print DPI) to flag screenshots.
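
The screenshot and allowed-brand checks above can be sketched in a few lines of Python. This is only an illustration under several assumptions: metadata has already been extracted into a plain structure by some EXIF reader, and the names `ImageMeta`, `screenshot_signals`, `make_is_allowed`, and the `COMMON_SCREEN_SIZES` list are hypothetical, not part of any existing system. The thresholds are placeholders to be tuned.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical, non-exhaustive set of screen sizes in pixels (width, height).
COMMON_SCREEN_SIZES = {
    (1170, 2532), (1080, 1920), (1080, 2400), (1920, 1080), (2560, 1440),
}

@dataclass
class ImageMeta:
    """Metadata assumed to be extracted upstream (e.g. by an EXIF reader)."""
    width: int
    height: int
    dpi: Optional[float] = None
    exif: dict = field(default_factory=dict)  # e.g. {"Make": "Apple"}

def screenshot_signals(meta: ImageMeta) -> list:
    """Return human-readable reasons why the image may be a screenshot."""
    signals = []
    # Screenshots usually carry no camera make/model tags.
    if not meta.exif.get("Make") and not meta.exif.get("Model"):
        signals.append("no camera make/model in EXIF")
    # Pixel dimensions that exactly match a screen, in either orientation.
    dims = (meta.width, meta.height)
    if dims in COMMON_SCREEN_SIZES or dims[::-1] in COMMON_SCREEN_SIZES:
        signals.append("dimensions match a common screen size")
    # Screen-range DPI; a weak signal, since many cameras also record 72 DPI.
    if meta.dpi is not None and meta.dpi <= 96:
        signals.append("screen-range DPI")
    return signals

def make_is_allowed(meta: ImageMeta, allowed_makes: list) -> bool:
    """Check the EXIF camera make against a caller-supplied allowlist."""
    make = (meta.exif.get("Make") or "").strip().lower()
    return make in {m.strip().lower() for m in allowed_makes}
```

Returning a list of reasons rather than a single boolean keeps each signal separately visible, which also fits the explainability idea: the reasons can be surfaced directly to a reviewer. No single signal is conclusive, so in practice they would likely be combined into a score rather than used as hard rules.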