3/25/24. Running a GAN Model

- March 25, 2024 9:11 pm

It’s been a while since my last post, and since then I’ve spent a lot of time studying how to code machine learning programs and learning how some well-known AI models work. For example, Stable Diffusion (an incredibly powerful AI art generator) and large language models like the ones behind ChatGPT both start by tokenizing their text input. When given a phrase or sentence, the model splits it into small pieces called tokens (whole words or fragments of words) and converts each token into a number it can work with. It’s all really interesting stuff.
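Here's a tiny Python sketch of the tokenizing idea. It's only a toy: real models typically use learned subword tokenizers, so this word-level version is an assumption made for illustration, and the function names (`tokenize`, `build_vocab`, `encode`) and the example prompt are made up for this post.

```python
import re

def tokenize(text):
    """Toy word-level tokenizer: lowercase, then split into words and punctuation."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

def build_vocab(tokens):
    """Give each distinct token an integer ID, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    """Turn text into the list of integer IDs a model would actually consume."""
    return [vocab[tok] for tok in tokenize(text)]

prompt = "A photo of a weld, a clean weld."  # hypothetical example prompt
tokens = tokenize(prompt)
vocab = build_vocab(tokens)
ids = encode(prompt, vocab)

print(tokens)  # ['a', 'photo', 'of', 'a', 'weld', ',', 'a', 'clean', 'weld', '.']
print(ids)     # [0, 1, 2, 0, 3, 4, 0, 5, 3, 6]
```

Notice that repeated words get the same ID every time. That's the part that carries over to real tokenizers, even though their pieces are usually smaller than whole words.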

As for my research assistant work, I’ve started helping Hasnaa run her new Generative Adversarial Network (GAN) models. This GAN model scans through hundreds of real resistance spot weld nugget images and then trains itself to produce similar images. I’m not really doing any of the coding myself though. I just press a few buttons to run the code and change a couple of numbers to adjust the parameters in the GAN model. Hasnaa is the one actually tuning the neural network architecture and coding things. I’d love to help out with coding, but right now, it’s all way too complex for me to figure out. The only reason I understand what’s going on inside the GAN model is that Hasnaa does such a great job of adding comments that explain what every block of code does.

Running the code also takes a lot of time and processing power. Sometimes a run takes an hour or more, even on a high-end GPU. The GAN model that Hasnaa built goes through 150 rounds of training. In each round of training, the generator part of the model produces a new batch of images for the discriminator part to analyze. It’s pretty cool watching the model create images that get more and more accurate over time. I’ve attached some images below for reference.
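To show the shape of that generator-versus-discriminator loop, here's a minimal sketch in plain Python. It is not Hasnaa's code. Instead of images, this toy "generator" produces single numbers and tries to imitate numbers drawn around 4.0, and both parts are one-line formulas instead of neural networks. Only the 150 rounds come from the real setup; the batch size, learning rate, target distribution, and every name here are assumptions, and a real GAN usually runs many batches per round.

```python
import math
import random

ROUNDS = 150  # the 150 training rounds mentioned above
BATCH = 64    # hypothetical batch size
LR = 0.05     # hypothetical learning rate

def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def train(seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: fake = a * noise + b
    w, c = 0.0, 0.0  # discriminator: P(real) = sigmoid(w * x + c)
    history = []     # mean discriminator score on fakes, one entry per round

    for _ in range(ROUNDS):
        # "Real" data stands in for the real images (assumed: numbers near 4.0).
        real = [rng.gauss(4.0, 0.5) for _ in range(BATCH)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(BATCH)]
        fake = [a * z + b for z in noise]  # generator makes a new batch

        # Discriminator step: push scores for real toward 1 and fake toward 0.
        d_real = [sigmoid(w * x + c) for x in real]
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_w = sum(-(1 - dr) * xr + df * xf
                     for dr, xr, df, xf in zip(d_real, real, d_fake, fake)) / BATCH
        grad_c = sum(-(1 - dr) + df for dr, df in zip(d_real, d_fake)) / BATCH
        w -= LR * grad_w
        c -= LR * grad_c

        # Generator step: adjust a and b so the updated discriminator
        # scores the fakes closer to 1 (loss = -log D(fake)).
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_a = sum(-(1 - df) * w * z for df, z in zip(d_fake, noise)) / BATCH
        grad_b = sum(-(1 - df) * w for df in d_fake) / BATCH
        a -= LR * grad_a
        b -= LR * grad_b

        history.append(sum(d_fake) / BATCH)

    return (a, b), (w, c), history

(gen_a, gen_b), (disc_w, disc_c), history = train()
print(f"rounds run: {len(history)}, generator offset b: {gen_b:.3f}")
```

The two steps inside the loop are the whole idea: the discriminator gets better at telling real from fake, then the generator adjusts to fool the improved discriminator, and the cycle repeats every round.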

This is what the images are being trained to look like: