This dissertation investigates the potential of pre‑trained deep neural networks to interpret architectural drawings — specifically, to distinguish between plans and sections and to recognise spatial patterns. While neural networks are commonly trained on large, well‑labelled image datasets (e.g., identifying dog breeds or flower types), applying these methods to domains with sparse, irregular, or conceptually complex data — such as architectural spatial representation — presents unique challenges.
Drawing on the “projection problem” identified by the cognitive scientist Jeffrey Elman (UCSD), this research examines whether limited examples can adequately train networks to generalise and grasp the underlying logic of architectural drawings. It evaluates the extent to which pre‑trained networks can reduce the volume of task‑specific data needed, and explores domain adaptation: identifying where, and to what degree, a network adjusts as it moves from a source to a target domain.
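To make the transfer‑learning setting concrete, the sketch below shows one common way a pre‑trained network can be re‑purposed with little task‑specific data: an ImageNet‑trained backbone is frozen and only a small classification head is fitted to a plans‑versus‑sections task. The folder layout, ResNet‑18 backbone, and hyperparameters are illustrative assumptions, not the dissertation's actual pipeline.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Hypothetical folder layout: drawings/{plan,section}/*.png -- a placeholder,
# not the dissertation's dataset.
preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),   # line drawings -> 3-channel input
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
dataset = datasets.ImageFolder("drawings", transform=preprocess)
loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)

# Start from ImageNet weights; freeze the convolutional backbone so only a small
# amount of task-specific data is needed to fit the new two-class head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)  # plan vs. section

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Freezing the backbone is the most conservative option; unfreezing later blocks trades data efficiency for adaptability, which is the trade‑off the domain‑adaptation question above probes.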
A novel fine‑tuning strategy is proposed, using feature spaces from four auxiliary datasets: flowers, art drawings, animal sketches, and pet images. This cross‑domain approach reveals unexpected insights into how networks learn and adapt to architectural spatial logic, including which auxiliary datasets are most effective for structural and organisational understanding, and which network layers undergo the most significant adaptation.
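One illustrative way to ask which layers adapt the most is to snapshot the pre‑trained weights, fine‑tune a copy on an auxiliary dataset, and rank layers by how far their parameters move. The drift measure below (relative L2 change per named parameter) is a simple stand‑in for this kind of analysis, not the dissertation's actual method.

```python
import copy
import torch
from torchvision import models

def layer_drift(before, after):
    """Relative L2 change of each named parameter between two model snapshots."""
    drift = {}
    after_params = dict(after.named_parameters())
    with torch.no_grad():
        for name, p_before in before.named_parameters():
            p_after = after_params[name]
            drift[name] = ((p_after - p_before).norm() / (p_before.norm() + 1e-12)).item()
    return drift

# Snapshot the source-domain weights, fine-tune a copy on an auxiliary dataset
# (e.g. animal sketches), then rank layers by how far they moved.
source = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
adapted = copy.deepcopy(source)
# ... fine-tune `adapted` on the auxiliary dataset here ...

for name, d in sorted(layer_drift(source, adapted).items(), key=lambda kv: -kv[1])[:5]:
    print(f"{name}: {d:.4f}")
```

More refined comparisons (for instance, representational similarity between activations) are possible, but parameter drift is enough to show whether adaptation concentrates in early, middle, or late layers.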
The dissertation also introduces a new application of generative adversarial networks (GANs) to produce architectural drawings informed by pre‑trained models, demonstrating how generative processes can support dataset augmentation and design exploration in architecture.
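For orientation, the sketch below outlines a compact DCGAN‑style generator and discriminator for 64×64 single‑channel drawings. The architecture, image size, and training step are placeholder assumptions intended only to illustrate the adversarial setup, not the dissertation's generative model.

```python
import torch
import torch.nn as nn

latent_dim = 100

# Generator: latent vector (N, 100, 1, 1) -> 64x64 single-channel "drawing"
generator = nn.Sequential(
    nn.ConvTranspose2d(latent_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(True),
    nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Tanh(),
)

# Discriminator: 64x64 image -> real/fake probability
discriminator = nn.Sequential(
    nn.Conv2d(1, 64, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(256, 1, 8, 1, 0), nn.Sigmoid(),
)

criterion = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))

def train_step(real_drawings):
    """One adversarial update given a batch of real drawing tensors (N, 1, 64, 64)."""
    n = real_drawings.size(0)
    real_labels = torch.ones(n, 1)
    fake_labels = torch.zeros(n, 1)

    # Discriminator: distinguish real drawings from generated ones
    noise = torch.randn(n, latent_dim, 1, 1)
    fake = generator(noise)
    d_loss = (criterion(discriminator(real_drawings).view(n, 1), real_labels) +
              criterion(discriminator(fake.detach()).view(n, 1), fake_labels))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: try to fool the discriminator
    g_loss = criterion(discriminator(fake).view(n, 1), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```

In practice the real drawings would come from the architectural dataset described above; here `train_step` simply expects any batch of (N, 1, 64, 64) tensors.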
This ongoing research contributes both technical methods and theoretical insights to the intersection of deep learning and architecture, advancing how AI can be adapted to read, generate, and ultimately comprehend spatial knowledge.