We have recently been experimenting a bit with machine learning. No practical use yet, just figuring out the basics – how it really works. I have to say, it is a lot of fun, but it also teaches patience – much like visualization rendering used to.
By no means are we ML experts now, but I would like to show one of the results that I think might be somewhat interesting. Because architecture is such a visual discipline, we very quickly turned to researching Generative Adversarial Networks (GANs). This type of neural network can be used to generate or process images, and it gained a lot of attention thanks to deep-fake human portraits.
There are many ‘subtypes’ of GAN out there. The most interesting for us were the ones that take an image as input and produce a different image as output. Think of colourizing photos – you input a black&white picture and the model, based on previously learned patterns, produces a colour version.
One of the biggest challenges for successful neural net training is gathering data of high quality and in high quantity. Ideally you should have at least a few thousand images (or image pairs) to learn from.
Fortunately we have services like OpenStreetMap – a great resource for urban-related spatial data with open API access. That influenced our research direction, and we decided to work with city plans. In the example shown here we tried to predict building footprints from a specified road layout.
As a first step, we had to gather all the data needed and preprocess it so the neural net would be able to read it. To keep the input more consistent, we limited ourselves to European city centres, which we expected to share some common layout characteristics.
Using the Overpass API (through the overpy library in Python) we downloaded snapshots of XML data from OSM. We then extracted two kinds of features from the map – roads and building outlines. This allowed us to create training data in the form of image pairs as below:
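A minimal sketch of that download-and-extract step might look like this. The bounding box, the tag filters and the helper names here are our own illustrative assumptions, not the exact code we used:

```python
def classify_way(tags):
    """Sort an OSM way into one of our two feature classes by its tags."""
    if "building" in tags:
        return "building"
    if "highway" in tags:
        return "road"
    return None

def fetch_osm_ways(south, west, north, east):
    """Download all road and building ways inside a bounding box via Overpass."""
    import overpy  # third-party: pip install overpy
    api = overpy.Overpass()
    result = api.query(f"""
        [out:xml][timeout:60];
        (
          way["highway"]({south},{west},{north},{east});
          way["building"]({south},{west},{north},{east});
        );
        (._;>;);
        out body;
    """)
    features = {"road": [], "building": []}
    for way in result.ways:
        kind = classify_way(way.tags)
        if kind:
            # store each way as a list of (lon, lat) vertices
            features[kind].append([(float(n.lon), float(n.lat)) for n in way.nodes])
    return features

# e.g. fetch_osm_ways(51.21, 4.39, 51.23, 4.42) for a patch of Antwerp
```

The `(._;>;);` line recurses down from the matched ways to their member nodes, so the vertex coordinates are included in the response.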
So, on the right we have a street network of some part of Antwerp; on the left, filled building outlines. We emphasized the divisions between them with white lines – it seemed easier for the neural net to learn this way.
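The footprint side of such a pair can be rasterised with Pillow, for example. This is only a sketch of the idea – the image size, the black-fill/white-edge styling and the function name are placeholder assumptions, chosen to mirror the white division lines described above:

```python
from PIL import Image, ImageDraw  # third-party: pip install Pillow

def render_footprints(polygons, size=256):
    """Rasterise building footprints as dark fills with a white edge, so that
    touching buildings stay visually separated for the network.
    `polygons` is a list of [(x, y), ...] outlines in pixel coordinates."""
    img = Image.new("L", (size, size), 255)      # white background
    draw = ImageDraw.Draw(img)
    for poly in polygons:
        draw.polygon(poly, fill=0, outline=255)  # black building, white divider
    return img
```

A matching renderer for the street lines would produce the other half of each training pair.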
With the above data we were able to train our GAN. We used a slightly modified conditional GAN (CGAN) algorithm called Pix2Pix, implemented in TensorFlow/Keras. The learning process is actually quite fun. There are two networks involved, often called ‘the artist’ and ‘the critic’ in GAN nomenclature (formally, the generator and the discriminator).
The role of the artist is to generate ‘fake’ images that slowly become more and more similar to the real-world data. The critic, on the other hand, learns to distinguish those creations from the real data – to tell what is real and what is fake. They work against each other (hence ‘adversarial’), but together they both gradually improve. I am not going to explain the process in detail here, because I do not think I am competent enough, and there are surely better places online to read about it.
In our example the artist does not generate a random image from scratch. It receives the road part of the training pair (the lines) and, based on that, tries to guess a plausible building layout. During training we use street data from OSM, but the goal is to train the network to work with any user-provided data.
After a lot of tweaking of the algorithm and the input data, we managed to generate results that resemble proper Schwarzplans. They are, of course, far from perfect and also pretty low-res – the learning process takes a lot of computational power, and it is really not easy to build GANs that work with large images.
One clear issue is that the resulting buildings are not as orthogonal as they should be. Some very strange shapes also appear from time to time. Maybe longer training or a better network architecture could solve it? Who knows…
Will this kind of workflow be useful in Revit one day? I really hope so, because it would be extremely exciting.
Here are some sample results: user-drawn streets on the left, ML-generated building outlines on the right.