11/19/2024

Let me explain some of the code from yesterday:

0. a. I forgot to add: import numpy as np
   b. I thought I needed the following line to solve an error, but with the code I provided no error happens, so it seems unnecessary:
      # tf.config.run_functions_eagerly(True)
   c. How did I know I needed to do this?
      x_train = x_train.reshape((x_train.shape[0], 28, 28, 1))
      I got an error on the fit call because x_train was expected to have 4 dimensions but had only 3. I did the same to the x_test set.

1. Purpose of this line:
   y_train = keras.utils.to_categorical(y_train, num_classes)
   Before that line, y_train contained integer labels like 1, 8, 7, 3; after it, each label is a one-hot vector of length num_classes. (There is a short sketch of this, and of the loss choices from item 2, at the end of these notes.)

2. Choosing a loss function:
   categorical_crossentropy (use for classification, one-hot label representation)
   sparse_categorical_crossentropy (use for classification, integer value labels)
   mean squared error (use for regression)
   See here: https://keras.io/api/losses/probabilistic_losses/

3. Besides model.evaluate, we can do model() or model.predict(). (See the sketch near the end of these notes.)

4. Try this simpler model for mnist to show underfitting:

   model = keras.models.Sequential()
   model.add(keras.layers.Conv2D(2, kernel_size=(3, 3), activation="relu", padding='same', input_shape=[28,28,1]))
   model.add(keras.layers.MaxPooling2D(pool_size=(8, 8)))
   model.add(keras.layers.Flatten())
   model.add(keras.layers.Dense(num_classes, activation='softmax'))
   model.summary()

5. For Xavier weight initialization, pass
   kernel_initializer=keras.initializers.GlorotNormal()
   or
   kernel_initializer=keras.initializers.GlorotUniform()
   to the layer when it is added.
   See here: https://keras.io/api/layers/initializers/
   e.g.

   model = keras.models.Sequential()
   model.add(keras.layers.Conv2D(32, kernel_size=(3, 3), activation="relu", padding='same', kernel_initializer=keras.initializers.GlorotNormal(), input_shape=[28,28,1]))
   model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
   model.add(keras.layers.Conv2D(64, kernel_size=(3, 3), activation="relu", kernel_initializer=keras.initializers.GlorotNormal()))
   model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
   model.add(keras.layers.Flatten())
   model.add(keras.layers.Dense(num_classes, activation='softmax', kernel_initializer=keras.initializers.GlorotNormal()))
   model.summary()

6. In the fit call, there is a parameter called shuffle that defaults to True.
   See here: https://keras.io/api/models/model_training_apis/

7. Let me show the code I have for fashion mnist (which is a more difficult dataset than mnist). I'll show it first with a model that is too simple, which causes underfitting, and then with a more complex model. (A short loading sketch appears at the end of these notes.)

8. Inputs can be specified in at least 2 ways:

   model.add(keras.layers.InputLayer(input_shape=[28,28,1]))
   model.add(keras.layers.Conv2D(8, 3, activation='relu', padding='same'))

   OR

   model.add(keras.layers.Conv2D(8, 3, activation='relu', padding='same', input_shape=[28,28,1]))
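
Sketch for items 1 and 2 (one-hot labels and loss choices). This is a minimal sketch, assuming num_classes = 10 as in yesterday's MNIST code; the tiny Dense model is only a placeholder so the compile calls have something to attach to, not the model from class.

   import numpy as np
   from tensorflow import keras

   num_classes = 10                                   # assumption: same as yesterday's MNIST code
   y_train = np.array([1, 8, 7, 3])                   # integer labels, as in item 1
   y_onehot = keras.utils.to_categorical(y_train, num_classes)
   print(y_onehot[0])                                 # [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]  (one-hot vector for label 1)

   # placeholder model so the compile calls below have something to attach to
   model = keras.models.Sequential([keras.layers.Dense(num_classes, activation='softmax', input_shape=[784])])

   # item 2: pick the loss that matches the label representation / task
   model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])         # one-hot labels
   model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])  # integer labels
   model.compile(loss='mean_squared_error', optimizer='adam')                                     # regression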
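
Sketch for item 3 (three ways to get outputs from a trained model). This assumes the model, x_test, and y_test variables from yesterday's MNIST code, with x_test already reshaped/converted to float and the model compiled with metrics=['accuracy'].

   import numpy as np

   loss, acc = model.evaluate(x_test, y_test)   # loss + metrics on labeled data
   probs = model.predict(x_test[:5])            # NumPy array of class probabilities, shape (5, num_classes)
   probs_tensor = model(x_test[:5])             # calling the model directly returns a tensor (handy for small batches)
   pred_labels = np.argmax(probs, axis=1)       # turn probabilities into predicted class labels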
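
Loading sketch for item 7. The fashion mnist code referred to in item 7 is not included in these notes; as a placeholder, here is a minimal loading-and-preprocessing sketch (my own, not that code) that mirrors the MNIST preprocessing above. The simple model from item 4 can be reused on this data to see underfitting, and the larger model from item 5 to compare.

   from tensorflow import keras

   (x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
   x_train = x_train.reshape((x_train.shape[0], 28, 28, 1)).astype('float32') / 255.0
   x_test = x_test.reshape((x_test.shape[0], 28, 28, 1)).astype('float32') / 255.0
   num_classes = 10
   y_train = keras.utils.to_categorical(y_train, num_classes)
   y_test = keras.utils.to_categorical(y_test, num_classes)
   # build the model from item 4 (underfits) or item 5 (larger), then compile and fit as before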