Fastai Transforms for DonkeyCar
Fastai transforms don’t seem to help DonkeyCar training
The way rewards were originally being logged doesn’t make much sense. It looks like Gym considers each point to an episode, by returning done from the step() method. Breakout itself considers 5 points to be a game, so in my later testing I sum scores for each 5 ‘episodes’ and report that.
I partially reproduced OpenAI’s model in Keras so I could output a summary even though it didn’t include filter size and stride.
Layer (type) | Output Shape | Param # | Size | Stride |
---|---|---|---|---|
conv2d_1 (Conv2D) | (None, 20, 20, 32) | 8224 | 8x8 | 4 |
conv2d_2 (Conv2D) | (None, 9, 9, 64) | 32832 | 4x4 | 2 |
conv2d_3 (Conv2D) | (None, 7, 7, 32) | 18464 | 3x3 | 1 |
flatten_1 (Flatten) | (None, 1568) | 0 | ||
dense_1 (Dense) | (None, 512) | 803328 | ||
dense_2 (Dense) | (None, 4) | 2052 |
Total params: 864,900
Trainable params: 864,900
Layer (type) | Output Shape | Param # | Size | Stride |
---|---|---|---|---|
conv2d_1 (Conv2D) | (None, 21, 21, 32) | 1184 | 3x3 | 4 |
conv2d_2 (Conv2D) | (None, 10, 10, 64) | 18496 | 3x3 | 3 |
conv2d_3 (Conv2D) | (None, 8, 8, 32) | 18464 | 3x3 | 1 |
flatten_1 (Flatten) | (None, 2048) | 0 | ||
dense_1 (Dense) | (None, 512) | 1049088 | ||
dense_2 (Dense) | (None, 4) | 2052 |
Total params: 1,089,284
Trainable params: 1,089,284
I ran checkpointed versions of each model on 10 games and plotted the average reward along with error bars at 1 standard deviation. These plots use the per-game method of figuring reward.