Adam El Kholy
University of Bath
Last Updated: 30/04/2024
Free to use under the Apache 2.0 license
For use in Google Colab using the AI-Artwork dataset available on Kaggle
The following notebook allows you to train and evaluate a binary classification model using the AI-Artwork dataset, and provides a basic toolkit for inspecting model attention. A sample of images for examining model attention can be downloaded from GitHub, and the final trained model (a CNN with 6 convolutional and 2 pooling layers, achieving 96% accuracy and a 0.97 F1 score) is available to download here
The final trained model can also be used online, with image upload functionality, via adamelkholy.co.uk/artworkai
import time
import cv2
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
Downloading the dataset from Kaggle requires a free API key. The following tutorial explains how to acquire one. Once acquired, uploading the key via the kaggle.json file will allow Kaggle datasets to be downloaded
# upload kaggle.json
from google.colab import files
files.upload()
{'kaggle.json': b'{"username":"john_appleseed","key":"API_key"}'}
We now install the Kaggle library on Colab
! pip install -q kaggle
!rm -r ~/.kaggle
!mkdir ~/.kaggle
!mv ./kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
For this notebook we will be using the AI-Artwork dataset. Alternatively, if you would like to run the experiments in this notebook using the AI-ArtBench dataset by Ravidu Silva, please see Appendix A. We first download and unzip the Combined Dataset
!kaggle datasets download -d adamelkholy/human-ai-artwork/
Downloading human-ai-artwork.zip to /content
100% 62.4G/62.4G [49:31<00:00, 29.6MB/s]
100% 62.4G/62.4G [49:31<00:00, 22.6MB/s]
! mkdir data
! unzip human-ai-artwork.zip -d data;
Streaming output truncated to the last 10 lines.
inflating: data/data/Human_Ukiyo_e/utagawa-toyokuni_nakamura-nosio-the-second-performs-the-dance-dodzedzi-1796.jpg
inflating: data/data/Human_Ukiyo_e/utagawa-toyokuni_nakamura-utaemon-1.jpg
inflating: data/data/Human_Ukiyo_e/utagawa-toyokuni_nakamura-utaemon.jpg
inflating: data/data/Human_Ukiyo_e/utagawa-toyokuni_segawa-kikunojo-iii-and-bando-mitsugoro-ii-1798.jpg
inflating: data/data/Human_Ukiyo_e/utagawa-toyokuni_seki-sanjuro.jpg
inflating: data/data/Human_Ukiyo_e/utagawa-toyokuni_the-actor-otani-monzo-in-the-role-of-igarashi-tenzen.jpg
inflating: data/data/Human_Ukiyo_e/utagawa-toyokuni_the-heian-courtier.jpg
inflating: data/data/Human_Ukiyo_e/utagawa-toyokuni_the-promenade.jpg
inflating: data/data/Human_Ukiyo_e/utagawa-toyokuni_three-beauties-playing-battledore-and-shuttlecock.jpg
inflating: data/data/Human_Ukiyo_e/utagawa-toyokuni_three-beauties-snow.jpg
! rm /content/human-ai-artwork.zip
The dataset is now fully downloaded and unpacked in the data/ directory
In order to train our model we must load the data from its directory into a dataset object using tf.keras.utils.image_dataset_from_directory. We then use a shuffled split of 60% training, 20% validation and 20% test data, along with a batch size of 32 across 3 epochs
class_weights = {0: 1.0, 1: 1.0}
num_classes = 1
batch_size = 32
img_height = 256
img_width = 256
data_dir = "/content/data/data/"
train_ds = tf.keras.utils.image_dataset_from_directory(
data_dir,
validation_split=0.4,
subset="training",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size
)
Found 271993 files belonging to 52 classes.
Using 163196 files for training.
val_test_ds = tf.keras.utils.image_dataset_from_directory(
data_dir,
validation_split=0.4,
subset="validation",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
Found 271993 files belonging to 52 classes.
Using 108797 files for validation.
We split the original validation set in half in order to generate our test set
val_batches = int(0.5 * len(val_test_ds))
val_ds = val_test_ds.take(val_batches)
test_ds = val_test_ds.skip(val_batches)
We initialise the model's bias in order to adjust for the class imbalance
pos = 190549 # number of AI-Generated artworks
neg = 81457 # number of Human-Made artworks
initial_bias = np.log([pos/neg])
output_bias = tf.keras.initializers.Constant(initial_bias)
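As an optional sanity check (not part of the original pipeline), passing this bias through a sigmoid should recover the positive-class prior pos/(pos+neg), which is roughly what the untrained model will predict on its first forward pass:
# optional check: sigmoid(log(pos/neg)) should equal the positive class prior pos / (pos + neg) (~0.70)
prior = pos / (pos + neg)
print("positive class prior:", round(prior, 4))
print("sigmoid of initial bias:", tf.sigmoid(initial_bias).numpy())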
The original Combined Dataset is organised into 52 classes. As such we must map the images to binary labels (1 for AI-generated artworks and 0 for human-made). The first 25 original classes are AI-generated
""" maps images to binary classes """
def classes_to_binary(image, label):
# in the Combined Dataset our first 25 images are AI
new_label = tf.where(label < 25, 1, 0)
return image, new_label
train_ds = train_ds.map(classes_to_binary)
val_ds = val_ds.map(classes_to_binary)
test_ds = test_ds.map(classes_to_binary)
# sanity check
for image_batch, labels_batch in train_ds:
print(image_batch.shape)
print(labels_batch.shape)
break
print(labels_batch)
(32, 256, 256, 3)
(32,)
tf.Tensor([1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 0 0 1 1 1 1 1 0], shape=(32,), dtype=int32)
plt.imshow(tf.cast(image_batch[0], tf.int32))
plt.axis("off")
plt.show()
The first image in our batch is clearly AI-generated, and its label of 1 indicates our binary mapping was successful
In order to control for image size during model inference we augment our dataset. We randomly resize every image to introduce a level of visual noise across all samples, so that the model does not overfit by erroneously classifying images based on their size
""" augments the given image by randomly resizing the image and then rescaling it back down to its original size (256x256)"""
def augment(image, label):
# generate a random rescale factor
min_scale = 0.5
max_scale = 2.0
scale_factor = tf.random.uniform(shape=[], minval=min_scale, maxval=max_scale)
# resize the image using bilinear interpolation then rescale back to original size
resized_image = tf.image.resize(image, tf.cast(tf.cast(tf.shape(image)[1:3], tf.float32) * scale_factor, tf.int32))
rescaled_image = tf.image.resize(resized_image, tf.shape(image)[1:3])
return rescaled_image, label
(Optional) The cell below applies the augmentation operation to our dataset
train_ds = train_ds.map(augment)
val_ds = val_ds.map(augment)
test_ds = test_ds.map(augment)
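If the augmentation noticeably slows down training, tf.data can prefetch batches in the background. This is an optional addition and not part of the original pipeline:
# optional: overlap preprocessing with training; AUTOTUNE lets tf.data choose the buffer size
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.prefetch(AUTOTUNE)
val_ds = val_ds.prefetch(AUTOTUNE)
test_ds = test_ds.prefetch(AUTOTUNE)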
# sanity check
for image_batch, labels_batch in train_ds:
print(image_batch.shape)
print(labels_batch.shape)
break
(32, 256, 256, 3)
(32,)
plt.imshow(tf.cast(image_batch[0], tf.int32))
plt.axis("off")
plt.show()
We can clearly see evidence of compression in the image from the noise surrounding the figure, indicating our augmentation was successful
Having preprocessed our data we now set about training our model. Below we define a suitable model for the task: a CNN with 6 convolutional and 2 pooling layers
example_model = tf.keras.Sequential([
tf.keras.layers.Rescaling(1./255),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(num_classes, bias_initializer=output_bias, activation="sigmoid")
])
example_model._name = "example_model"
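To double-check the architecture before training, we can optionally build the model on the expected input shape and print its summary. This step is purely informational and not required by the pipeline:
# optional: build on the expected 256x256 RGB input and inspect layer output shapes
example_model.build(input_shape=(None, img_height, img_width, 3))
example_model.summary()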
We now define helper functions for model training in order to set up our model pipeline. We use Adam optimisation and a binary cross-entropy loss function
# specify path to save the model and its evaluation
path = "/content/"
# we define the performance metrics we want to track for our model
METRICS = [
tf.keras.metrics.BinaryAccuracy(name='accuracy'),
tf.keras.metrics.BinaryCrossentropy(name='cross entropy'), # equiv. to model's loss
tf.keras.metrics.MeanSquaredError(name='MSE'),
tf.keras.metrics.TruePositives(name='tp'),
tf.keras.metrics.FalsePositives(name='fp'),
tf.keras.metrics.TrueNegatives(name='tn'),
tf.keras.metrics.FalseNegatives(name='fn'),
tf.keras.metrics.Precision(name='precision'),
tf.keras.metrics.Recall(name='recall'),
tf.keras.metrics.AUC(name='roc', curve='ROC'), # receiver operating characteristic curve
tf.keras.metrics.AUC(name='prc', curve='PR'), # precision-recall curve
]
We set up our own custom TensorFlow callback in order to track our performance metrics across batches during training and testing
""" custom tensorflow history object used to record performance metrics during training
and plot rolling average graphs """
class CustomHistory(tf.keras.callbacks.Callback):
def __init__(self):
super(CustomHistory, self).__init__()
self.losses = []
self.prcs = []
self.recalls = []
self.precisions = []
self.accuracies = []
self.mses = []
""" called upon completion of each batch during training, records all performance metrics """
def on_train_batch_end(self, batch, logs=None):
self.losses.append(logs['loss'])
self.mses.append(logs['MSE'])
self.accuracies.append(logs['accuracy'])
self.prcs.append(logs['prc'])
self.recalls.append(logs['recall'])
self.precisions.append(logs['precision'])
""" called upon completion of each batch during testing, records all performance metrics """
def on_test_batch_end(self, batch, logs=None):
self.losses.append(logs['loss'])
self.mses.append(logs['MSE'])
self.accuracies.append(logs['accuracy'])
self.prcs.append(logs['prc'])
self.recalls.append(logs['recall'])
self.precisions.append(logs['precision'])
""" return all performance metrics """
def get_metrics(self):
return self.losses, self.mses, self.accuracies, self.prcs, self.recalls, self.precisions
""" compile the model with ADAM optimisation and cross-entropy loss """
def compile_model(model):
model.compile(
optimizer='adam',
loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
metrics=METRICS
)
return model
"""
fits the model to training data.
returns: the model and our custom history callback object
"""
def fit_model(model):
history_callback = CustomHistory()
model.fit(
train_ds,
validation_data=val_ds,
epochs=3,
class_weight=class_weights,
callbacks=[history_callback]
)
return model, history_callback
""" evaluate model on test set and return performance metrics """
def evaluate_model_on_test(model):
eval_metrics = model.evaluate(test_ds)
return eval_metrics
""" save model to path in the .keras format """
def save_model(model):
model_name = model._name
print("\nSaving " + model_name + ".keras")
try:
model.save(path + model_name + ".keras")
except Exception:
print("Error saving " + model_name + ".keras...")
return
print(model_name + ".keras saved successfully.\n")
""" save data (evaluation metrics or performance history) to .txt file """
def save_data(data, filename):
print("Saving data for " + filename)
try:
with open(path + filename+".txt", 'w') as writefile:
writefile.write(str(data))
except Exception:
print("Error saving data for " + filename)
return
print("Data saved.\n")
We implement a function to execute the entire training and evaluation pipeline for a given model as follows
"""
execute full training and evaluation pipeline, returning model, evaluation on test set
and performance metrics during training (history)
takes ADAM learning rate (lr) as an optional argument (default 0.001)
returns model.keras, performance metrics evaluation on test set and model history (CustomHistory object)
"""
def train_model_pipeline(model, lr=0.001):
start = time.time()
model_name = model._name
print("Now training " + model_name)
# compile model and fit to training data
compiled_model = compile_model(model)
compiled_model.optimizer.learning_rate = lr
trained_model, history = fit_model(compiled_model)
# save model.keras and history data
save_data(history.get_metrics(), model_name+"_history")
save_model(trained_model)
# evaluate on test set and save evaluation
evals = evaluate_model_on_test(trained_model)
save_data(evals, model_name + "_evals")
time_taken = time.time() - start
print("Training complete in", round((time_taken)/60, 2), "minutes")
return trained_model, evals, history
Using Google Colab's T4 GPU the example model should take approximately 2 hours to train (alternatively see the next section to download the pre-trained model)
model, evals, history = train_model_pipeline(example_model)
Now training example_model
Epoch 1/3
5100/5100 [==============================] - 2041s 397ms/step
loss: 0.4293 - accuracy: 0.8015 - cross entropy: 0.4293 - MSE: 0.1383 - tp: 102780.0000 - fp: 20651.0000 - tn: 28014.0000 - fn: 11751.0000 - precision: 0.8327 - recall: 0.8974 - roc: 0.8561 - prc: 0.9319 - val_loss: 0.3469 - val_accuracy: 0.8475 - val_cross entropy: 0.3469 - val_MSE: 0.1087 - val_tp: 34149.0000 - val_fp: 4367.0000 - val_tn: 11954.0000 - val_fn: 3930.0000 - val_precision: 0.8866 - val_recall: 0.8968 - val_roc: 0.9125 - val_prc: 0.9597
Epoch 2/3
5100/5100 [==============================] - 1993s 391ms/step
loss: 0.2729 - accuracy: 0.8864 - cross entropy: 0.2729 - MSE: 0.0825 - tp: 107605.0000 - fp: 11613.0000 - tn: 37052.0000 - fn: 6926.0000 - precision: 0.9026 - recall: 0.9395 - roc: 0.9447 - prc: 0.9739 - val_loss: 0.2132 - val_accuracy: 0.9190 - val_cross entropy: 0.2132 - val_MSE: 0.0613 - val_tp: 36259.0000 - val_fp: 2592.0000 - val_tn: 13736.0000 - val_fn: 1813.0000 - val_precision: 0.9333 - val_recall: 0.9524 - val_roc: 0.9654 - val_prc: 0.9826
Epoch 3/3
5100/5100 [==============================] - 2034s 399ms/step
loss: 0.1880 - accuracy: 0.9273 - cross entropy: 0.1880 - MSE: 0.0544 - tp: 109407.0000 - fp: 6740.0000 - tn: 41925.0000 - fn: 5124.0000 - precision: 0.9420 - recall: 0.9553 - roc: 0.9735 - prc: 0.9877 - val_loss: 0.1731 - val_accuracy: 0.9331 - val_cross entropy: 0.1731 - val_MSE: 0.0496 - val_tp: 37105.0000 - val_fp: 2668.0000 - val_tn: 13658.0000 - val_fn: 969.0000 - val_precision: 0.9329 - val_recall: 0.9745 - val_roc: 0.9800 - val_prc: 0.9900
Saving data for example_model_history
Data saved.
Saving example_model.keras
example_model.keras saved successfully.
Evaluation 1/1
1700/1700 [==============================] - 850s 273ms/step
loss: 0.1693 - accuracy: 0.9351 - cross entropy: 0.1693 - MSE: 0.0485 - tp: 37022.0000 - fp: 2613.0000 - tn: 13844.0000 - fn: 918.0000 - precision: 0.9341 - recall: 0.9758 - roc: 0.9811 - prc: 0.9907
Saving data for example_model_evals
Data saved.
Training complete in 115.34 minutes
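The F1 score is not tracked directly by the Keras metrics above, but it can be derived from the confusion-matrix counts returned by evaluate. A minimal sketch, assuming evals follows the compiled metric order (loss first, so tp, fp, tn and fn sit at indices 4 to 7):
# derive F1 from the test-set confusion-matrix counts: F1 = 2*TP / (2*TP + FP + FN)
tp, fp, fn = evals[4], evals[5], evals[7]
f1 = 2 * tp / (2 * tp + fp + fn)
print("Test F1 score:", round(f1, 4))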
Alternatively you can load a pre-trained model (in the .h5 file format). Our final trained model (a CNN with 6 convolutional and 2 pooling layers, trained on the Augmented dataset) is available to download here
# specify path to model
model_path = "/content/example_model.h5"
model = tf.keras.models.load_model(model_path)
If you are loading a previously saved history callback from a txt file (such as the example_history.txt available on GitHub), the following cell will read the history data. If you trained the model within this notebook then this step is unnecessary, as the history callback is already loaded in the history variable
""" loads the history data at the specified path and returns an array of metric histories
returns arr: a 2D array containing: [[losses], [mses], [accuracies], [prcs], [recalls], [precisions]]
where [metric] is a 1D array of metric values recorded at the end of each batch during training and testing
"""
def get_history(path):
file = open(path, "r")
for line in file:
arr = eval(line)
file.close()
return arr
# specify path from which to load custom history data
history_path = "/content/example_history.txt"
history = get_history(history_path)
# unpack performance metrics
losses, mses, accuracies, prcs, recalls, precisions = history
We now plot the rolling average graphs of loss, accuracy, recall and precision across batches during training and testing. We take the rolling average in order to avoid the skewing effects of anomalous data on our visualisation
""" plot the rolling average graph for the given metric during training and testing for the given metric data"""
def plot_metric_rolling_average_graph(x, y, title, xlabel, ylabel, cutoff=0):
plt.plot(x[cutoff:], (np.cumsum(y) / x)[cutoff:])
plt.xlabel(xlabel)
plt.ylabel(ylabel)
plt.title(title)
plt.ylim(0,1)
plt.show()
x = list(range(len(losses)))
plot_metric_rolling_average_graph(x, losses, "Rolling average loss over time", "Batch No.#", "Loss")
plot_metric_rolling_average_graph(x, accuracies, "Rolling average accuracy over time", "Batch No.#", "Accuracy")
plot_metric_rolling_average_graph(x, recalls, "Rolling average recall over time", "Batch No.#", "Recall")
plot_metric_rolling_average_graph(x, precisions, "Rolling average precision over time", "Batch No.#", "Precision")
We now provide a simple toolkit for model attention analysis. Example attention images can be found in the attention_dataset folder on GitHub
# specify path for loading the attention dataset
path = "/content/attention_dataset"
We create a new model that simply extracts the activation maps of the model for which we would like to inspect attention. Credit: Rodrigo Silva 2023
# https://towardsdatascience.com/exploring-feature-extraction-with-cnns-345125cefc9a
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv2D, MaxPool2D, Dense, Dropout, Flatten
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
# get model layers and inputs
benchmark_layers = model.layers
benchmark_input = model.input
# create a new model to output the feature maps of the original model
layer_outputs_benchmark = [layer.output for layer in benchmark_layers]
features_benchmark = Model(inputs=benchmark_input, outputs=layer_outputs_benchmark)
We now create our attention dataset object
batch_size = 32
img_height = 256
img_width = 256
attention_ds = tf.keras.utils.image_dataset_from_directory(
path,
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
Found 4 files belonging to 1 classes.
# sanity check
for image_batch, labels_batch in attention_ds:
print(image_batch.shape)
print(labels_batch.shape)
print(labels_batch[0])
break
(4, 256, 256, 3)
(4,)
tf.Tensor(0, shape=(), dtype=int32)
input_image_batch = np.expand_dims(image_batch[0], axis=0)
plt.axis("off")
plt.imshow(tf.cast(image_batch[0], tf.int32))
In our analysis we found our model to be replicating the well-known Sobel filtering operations used in image preprocessing. We now inspect the vertical and horizontal gradient Sobel filters applied to our image, as well as the Laplacian filter operation
from google.colab.patches import cv2_imshow
image = image_batch[0].numpy()
grey = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
# apply filters: Sobel x, y and Laplacian
gX = cv2.Sobel(grey, ddepth=cv2.CV_32F, dx=1, dy=0, ksize=3) #x gradient Sobel filter with 3x3 kernel
gY = cv2.Sobel(grey, ddepth=cv2.CV_32F, dx=0, dy=1, ksize=3) #y gradient Sobel filter with 3x3 kernel
laplacian = cv2.Laplacian(grey,cv2.CV_32F)
# resize image for display purposes
resize_img = lambda x: cv2.resize(x, (256, 256))
fig, ax = plt.subplots(1, 3, figsize=(15, 5))
ax[0].imshow(resize_img(gX), cmap='viridis')
ax[0].set_title("Sobel $G_{x}$")
ax[0].axis('off')
ax[1].imshow(resize_img(gY), cmap='viridis')
ax[1].set_title("Sobel $G_{y}$")
ax[1].axis('off')
ax[2].imshow(laplacian, cmap='viridis')
ax[2].set_title('Laplacian')
ax[2].axis('off')
plt.tight_layout()
plt.show()
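As an optional companion visualisation (not part of the original analysis), the two Sobel responses can be combined into a single gradient-magnitude map:
# optional: combine the x and y Sobel responses into one gradient-magnitude image
magnitude = cv2.magnitude(gX, gY)
plt.imshow(resize_img(magnitude), cmap='viridis')
plt.title("Sobel gradient magnitude")
plt.axis("off")
plt.show()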
We now provide our simple toolkit for inspecting model attention, featuring a number of functions for visualising and filtering the model's activation maps for a given image
Across all of our attention inspection functions the following two parameters are frequently referenced
activations: an array of all the activation maps within the model across all channels and layers
layer_names: an array containing the name of each layer in the model
To begin we show the activation maps across all 32 channels in the first layer
"""
displays all 32 channel activation maps for each of the given layers
"""
def show_all_channels(activations):
for i, activation in enumerate(activations):
# skip the rescaling layer
if i==0:
continue
# we cannot visualise the final dense layers
if len(activation.shape) <= 2:
print(activation)
continue
num_channels = activation.shape[-1]
num_cols = 5
num_rows = 7
plt.figure(figsize=(25, 25))
for j in range(num_channels):
plt.subplot(num_rows, num_cols, j + 1)
plt.imshow(activation[0, :, :, j], cmap='viridis')
plt.axis('off')
plt.subplots_adjust(wspace=-0.67, hspace=0.01)
plt.show()
# get the activations of all layers for the input image
activations = features_benchmark.predict(input_image_batch)
layer_names = [layer._name for layer in model.layers]
show_all_channels(activations[:2])
We now display the highest overall magnitude activation map in each layer
"""
returns the highest-magnitude activation map and its channel index for the given layer activation
"""
def get_highest_magnitude_map(activation):
channel_sums = np.sum(activation, axis=(0, 1, 2))
highest_index = np.argmax(channel_sums)
highest_activation = activation[0, :, :, highest_index]
return highest_activation, highest_index
"""
displays the highest overall magnitude activation map and its channel for each layer of the model
"""
def show_highest_weighted_maps(activations, layer_names):
plt.figure(figsize=(20, 20))
num_cols =3
num_rows = 4
for i, activation in enumerate(activations):
# skip the rescaling layer
if i==0:
continue
# we cannot visualise the final dense layers
if len(activation.shape) <= 2:
break
highest_activation, highest_index = get_highest_magnitude_map(activation)
plt.subplot(num_rows, num_cols, i)
# assuming the activation has shape (1, height, width, channels)
plt.imshow(highest_activation, cmap='viridis')
plt.title(f'{layer_names[i].capitalize()} (layer {i}): channel {highest_index}')
plt.axis('off')
plt.suptitle(f"Highest weighted activation map for each layer", fontsize=16, y=0.91)
plt.subplots_adjust(wspace=-0.5, hspace= 0.10)
plt.show()
show_highest_weighted_maps(activations, layer_names)
We now visualise the maps of a single channel across all layers of the model
"""
displays the maps from a single specified channel across all layers of the model
"""
def visualise_single_channel(channel, activations, layer_names):
plt.figure(figsize=(20, 20))
num_cols = 3
num_rows = 4
for i, activation in enumerate(activations):
# skip the rescaling layer
if i==0:
continue
# we cannot visualise the final dense layers
if len(activation.shape) <= 2:
break
plt.subplot(num_rows, num_cols, i)
plt.imshow(activation[0, :, :, channel], cmap='viridis')
plt.title(f'Layer {i}: {layer_names[i].capitalize()}')
plt.axis('off')
plt.suptitle(f"Channel {channel} across all layers", fontsize=16, y=0.91)
plt.subplots_adjust(wspace=-0.5, hspace= 0.10)
plt.show()
visualise_single_channel(20, activations, layer_names)
Finally we visualise a single activation map for the specified layer and channel
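"""
displays the activation map for the specified layer and channel
"""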
def visualise_single_map(layer_num, channel, activations, layer_names):
activation = activations[layer_num]
plt.imshow(activation[0, :, :, channel], cmap='viridis')
plt.title(f'{layer_names[layer_num].capitalize()} (layer {layer_num}): channel {channel}')
plt.axis('off')
plt.subplots_adjust(wspace=-0.5, hspace= 0.10)
plt.show()
visualise_single_map(1, 31, activations, layer_names)
All code in this notebook can be run using data from the AI-ArtBench dataset, with a couple of minor adjustments necessary due to the difference in structure between the AI-ArtBench dataset and the Combined Dataset. First we simply download the dataset as before
!kaggle datasets download -d ravidussilva/real-ai-art
! mkdir data
! unzip real-ai-art.zip -d data;
Next, the TensorFlow dataset objects for AI-ArtBench must be loaded using separate train and test directories as follows
class_weights = {0: 1.0, 1: 1.0}
num_classes = 1
batch_size = 32
img_height = 256
img_width = 256
train_dir = "/content/data/Real_AI_SD_LD_Dataset/train/"
test_dir = "/content/data/Real_AI_SD_LD_Dataset/test/"
train_ds = tf.keras.utils.image_dataset_from_directory(
train_dir,
validation_split=0.2,
subset="training",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size
)
val_ds = tf.keras.utils.image_dataset_from_directory(
train_dir,
validation_split=0.2,
subset="validation",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
test_ds = tf.keras.utils.image_dataset_from_directory(
test_dir,
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
The AI-ArtBench dataset has a different class imbalance, so we adjust the initial output bias accordingly
pos = 125015
neg = 60000
initial_bias = np.log([pos/neg])
output_bias = tf.keras.initializers.Constant(initial_bias)
Finally, since the AI-ArtBench dataset has fewer original classes, we must adjust our mapping of the classes to binary labels
def classes_to_binary(image, label):
new_label = tf.where(label < 20, 1, 0) # first 20 classes are AI images
return image, new_label # we assign AI images the label 1
train_ds = train_ds.map(classes_to_binary)
val_ds = val_ds.map(classes_to_binary)
test_ds = test_ds.map(classes_to_binary)
Following these minor adjustments all the code in the notebook should run as normal using the AI-ArtBench dataset as opposed to the default Combined Dataset
# sanity check
for image_batch, labels_batch in train_ds:
print(image_batch.shape)
print(labels_batch.shape)
break
print(labels_batch)
(32, 256, 256, 3)
(32,)
tf.Tensor([1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 0 0 1 1 1 1 1 0], shape=(32,), dtype=int32)
We tested a number of different CNN architectures with varying numbers of convolutional, dropout and pooling layers. Below we show a sample of the architectures tested
arch_4_conv_2_pooling_combined = tf.keras.Sequential([
tf.keras.layers.Rescaling(1./255),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(num_classes, bias_initializer=output_bias, activation="sigmoid")
])
arch_4_conv_2_pooling_combined._name = "arch_4_conv_2_pooling_combined"
pool_6_conv_6_pooling_combined = tf.keras.Sequential([
tf.keras.layers.Rescaling(1./255),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(num_classes, bias_initializer=output_bias, activation="sigmoid")
])
pool_6_conv_6_pooling_combined._name = "pool_6_conv_6_pooling_combined"
aug_dropout_50_6_conv_2_pooling_combined = tf.keras.Sequential([
tf.keras.layers.Rescaling(1./255),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Dropout(.5),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.Conv2D(32, 3, activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Dropout(.5),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(num_classes, bias_initializer=output_bias, activation="sigmoid")
])
aug_dropout_50_6_conv_2_pooling_combined._name = "aug_dropout_50_6_conv_2_pooling_combined"