Training a model to recognize framed art in a gallery

So far I have only trained models that classify images into category classes.
This time, I would like to recognize artworks in images of a gallery space.

  • First, I would like to detect each square (frame) in the gallery space.
  • Then, from the set of cropped images, I would like to group the crops
    by artwork.

Is this approach correct? If so, which base model should I use, and how should I train it?
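
To make the second step concrete, here is a rough sketch of the grouping I have in mind. It assumes each cropped image has already been turned into an embedding vector by some pretrained model (that part is not shown, and which model to use is exactly what I am unsure about). The names `cosine` and `group_crops`, the toy vectors, and the 0.9 threshold are all made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_crops(embeddings, threshold=0.9):
    """Greedy grouping of crops by embedding similarity.

    Each crop joins the first group whose first member is at least
    `threshold` similar to it; otherwise it starts a new group.
    Returns a list of groups, each a list of crop indices.
    """
    groups = []
    for i, emb in enumerate(embeddings):
        for group in groups:
            if cosine(embeddings[group[0]], emb) >= threshold:
                group.append(i)
                break
        else:
            # No existing group was similar enough.
            groups.append([i])
    return groups


# Toy embeddings (assumed): crops 0 and 1 show one artwork, 2 and 3 another.
toy = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.02, 0.98]]
print(group_crops(toy))  # [[0, 1], [2, 3]]
```

This is only a greedy pass and the result depends on the order of the crops and on the threshold, so a proper clustering method may be a better fit once real embeddings are available.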

Thank you!