| ▲ | porridgeraisin 3 hours ago | |
Memory is the main constraint. You have what, 8mb of psram. Compute wise you can manage. You can do quantisation and run a small 10-15 layer CNN perhaps. Image classification is possible. Keep in mind the channel count and input resolution cannot be high since memory will be a problem. You can maybe do face _detection_, "is my cat on my keyboard" classification as well maybe. Audio, you can do a lot more. Wake word detection happens on _much_ smaller accelerators inside iphones. In this one you can do slightly heavier classifications. Maybe speaker identification "which member of family" or maybe "which dog is barking" | ||