Hi everyone,
I’m building a construction plan takeoff pipeline for architectural drawings. I’m using Python and OpenCV to analyze architectural elevation sheets from PDFs, which may already be rasterized (scans) or may be converted to images first.
The specific problem:
I can detect many lines in an exterior elevation, but the detected geometry mixes several kinds of lines together:
- exterior wall outline / facade boundary lines
- hatch or siding pattern lines
- dimension extension lines
- opening lines around windows and doors
- view crop / title / annotation lines
- roof/gable/context lines
I need a generic way to distinguish the main exterior wall/facade boundary from these other line types. I do not want to hardcode this for one specific PDF, sheet name, page number, address, or drawing style. The goal is to work across many architectural/construction drawing sets.
My current idea is to score line candidates using features such as:
- line length
- stroke width or apparent line weight
- proximity to text, view titles, or dimension strings
- whether the line is part of a connected contour
- whether it forms a closed or nearly closed polygon
- relationship to openings/windows/doors
- orientation: horizontal/vertical/roof slope
- repeated short parallel lines that may indicate hatch or siding
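To make the scoring idea concrete, here is a minimal sketch in plain Python. It only covers three of the features above (length, apparent line weight, and the "repeated parallel lines" hatch cue). The `Segment` class, the `parallel_neighbors` and `boundary_score` helpers, the tolerances, and the weights are all hypothetical placeholders I made up for illustration, not tuned values, and the stroke width is assumed to be measured elsewhere.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    x1: float
    y1: float
    x2: float
    y2: float
    width: float = 1.0  # apparent stroke width in pixels (assumed measured elsewhere)

    @property
    def length(self) -> float:
        return math.hypot(self.x2 - self.x1, self.y2 - self.y1)

    @property
    def angle(self) -> float:
        # Undirected orientation in degrees (0 = horizontal, 90 = vertical).
        return math.degrees(math.atan2(self.y2 - self.y1, self.x2 - self.x1)) % 180.0

    @property
    def midpoint(self):
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)


def angle_diff(a: float, b: float) -> float:
    # Smallest difference between two undirected orientations.
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def parallel_neighbors(seg, segments, angle_tol=3.0, length_tol=0.2, max_gap=30.0):
    # Count other segments with similar orientation and length whose
    # midpoints lie within max_gap pixels. Many such neighbors suggests
    # hatch or siding. All three tolerances are illustrative guesses.
    mx, my = seg.midpoint
    count = 0
    for other in segments:
        if other is seg:
            continue
        if angle_diff(seg.angle, other.angle) > angle_tol:
            continue
        if abs(seg.length - other.length) > length_tol * max(seg.length, other.length):
            continue
        ox, oy = other.midpoint
        if math.hypot(ox - mx, oy - my) <= max_gap:
            count += 1
    return count


def boundary_score(seg, segments, max_len, max_width):
    # Higher means "more likely facade boundary". Weights are placeholders.
    length_term = seg.length / max_len
    width_term = seg.width / max_width
    hatch_penalty = min(parallel_neighbors(seg, segments), 5) / 5.0
    return 0.5 * length_term + 0.5 * width_term - 0.5 * hatch_penalty


# Usage: six short, thin siding lines plus one long, heavy wall line.
siding = [Segment(0, y, 100, y, 1.0) for y in range(10, 70, 10)]
wall = Segment(0, 100, 400, 100, 3.0)
segments = siding + [wall]
max_len = max(s.length for s in segments)
max_width = max(s.width for s in segments)
ranked = sorted(segments, key=lambda s: boundary_score(s, segments, max_len, max_width), reverse=True)
print(ranked[0])
```

In this toy case the long, heavy line ranks first because it has no similar parallel neighbors, while the siding lines are penalized for having several. Real drawings would need the remaining features (text proximity, closure, openings) added as further terms.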
Question:
For architectural elevation images, what OpenCV features or pipeline would you recommend to separate true exterior facade boundary lines from hatch lines, dimension lines, annotation lines, and opening lines?
Would you approach this with:
- morphological operations,
- connected components,
- contour hierarchy,
- Hough line detection,
- skeletonization,
- graph-based scoring,
- stroke width / line weight estimation,
- or a combination of these?
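As an example of what I mean by graph-based scoring, here is a rough sketch in plain Python, not a tested pipeline. It snaps segment endpoints to a coarse grid, builds a graph, and repeatedly prunes dangling ends, so that only segments lying on or between closed loops remain. The function names `snap` and `loop_segments` and the snapping tolerance are hypothetical. Grid snapping is a simplification: two endpoints that are close but fall on opposite sides of a grid cell boundary will not merge.

```python
from collections import defaultdict


def snap(point, tol=4.0):
    # Quantize a point to a grid so nearly coincident endpoints share a node.
    return (round(point[0] / tol), round(point[1] / tol))


def loop_segments(segments, tol=4.0):
    # segments: list of ((x1, y1), (x2, y2)) in pixel coordinates.
    # Returns the sorted indices of segments that survive pruning of
    # dangling ends (the 2-core of the endpoint graph).
    edges = {}
    for i, (p, q) in enumerate(segments):
        a, b = snap(p, tol), snap(q, tol)
        if a != b:  # ignore segments that collapse to a single node
            edges[i] = (a, b)

    incident = defaultdict(set)
    for i, (a, b) in edges.items():
        incident[a].add(i)
        incident[b].add(i)

    # Start from every node touched by exactly one segment.
    stack = [node for node, ids in incident.items() if len(ids) == 1]
    while stack:
        node = stack.pop()
        if len(incident[node]) != 1:
            continue  # already emptied by an earlier removal
        (i,) = incident[node]
        a, b = edges.pop(i)
        incident[a].discard(i)
        incident[b].discard(i)
        other = b if node == a else a
        if len(incident[other]) == 1:
            stack.append(other)  # removal exposed a new dangling end
    return sorted(edges)


# Usage: a nearly closed rectangle, one dangling extension line,
# and one isolated annotation line.
example = [
    ((0, 0), (200, 0)),
    ((200, 0), (200, 100)),
    ((200, 100), (0, 100)),
    ((0, 100), (0, 1)),        # 1 px gap at the corner, closed by snapping
    ((200, 100), (200, 140)),  # dangling extension line
    ((300, 50), (350, 50)),    # isolated annotation line
]
print(loop_segments(example))
```

This on its own would not separate the facade outline from window and door rectangles, since those are closed loops too, so it would have to be combined with size, nesting, or line weight cues.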
I’m especially interested in a robust, general approach rather than a one-off solution for a single drawing.
I can share a redacted cropped image and an overlay showing the detected lines if that helps.
Thank you!