Hi all, I am fine-tuning an OCR model for document analysis. I have a bunch of images and need to remove some of them and keep only a few, based on criteria like:
-> Having tables
-> Having more than 60% text on the page
-> Can have special images like signs, stamps, etc.
Any suggestions on how to do this?
Thanks in advance!!!