Ah, functional modularity addresses the macroscopic architecture of the brain: each region responds to a particular type of stimulus. However, I was writing about microscopic structure, that is, how the connections between particular neurons are formed. Our brains wire neurons locally based on affinity (“what fires together, wires together”), and I argued that this local rule is superior to back-propagation, so I am advocating that we follow the neurophysiological model.
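To make "wiring locally by affinity" concrete, here is a minimal sketch of the textbook Hebbian update, not a model of any particular brain circuit. The function name `hebbian_update`, the learning rate `lr`, and the toy activity vectors are my own assumptions for illustration. The point is that each weight changes using only the activity of the two neurons it connects, with no error signal passed back through the network.

```python
def hebbian_update(weights, pre, post, lr=0.1):
    """One Hebbian step: weights[i][j] grows by lr * post[i] * pre[j].

    weights[i][j] is the connection from input neuron j to output neuron i.
    The update is local: it needs only the two activities it connects.
    """
    return [
        [w_ij + lr * post_i * pre_j for w_ij, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]


# Hypothetical toy activity: inputs 0 and 2 fire, output 0 fires.
weights = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0]]
pre = [1.0, 0.0, 1.0]
post = [1.0, 0.0]

weights = hebbian_update(weights, pre, post)
print(weights)  # only connections between co-active neurons are strengthened
```

Only the connections from the two active inputs to the active output change; every other weight is left untouched, which is the locality I am contrasting with back-propagation.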
Also, functional modularity does not address catastrophic forgetting in artificial neural networks. Our brains can learn various tasks that are all located in the same functional module without overwriting the other tasks in that module. Artificial neural networks that approximate a particular functional module (e.g. language or image recognition) still suffer from catastrophic forgetting when they are asked to learn something new within that same module. So, partitioning tasks among different functional modules does not alleviate catastrophic forgetting.
Fundamentally, catastrophic forgetting is a byproduct of the method used to train artificial neural networks (i.e. gradient descent, with the gradients computed by back-propagation). That’s why I advocate abandoning back-propagation in favor of wiring by the affinity of salient features, after suppression of input weights.
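As a rough illustration of the overwriting I mean, here is a deliberately extreme toy: a model with a single shared weight, trained by plain gradient descent on one task and then on a conflicting one. The tasks, the names `train` and `mse`, and the hyperparameters are all assumptions I made up for the sketch; real networks have many weights and forget less abruptly, but the mechanism (later gradients overwrite weights the earlier task relied on) is the same.

```python
def train(w, data, lr=0.1, epochs=200):
    """Fit y = w * x by per-sample gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w


def mse(w, data):
    """Mean squared error of the one-weight model on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


xs = (-1.0, -0.5, 0.5, 1.0)
task_a = [(x, 2.0 * x) for x in xs]   # hypothetical task A: y = 2x
task_b = [(x, -3.0 * x) for x in xs]  # hypothetical task B: y = -3x

w = train(0.0, task_a)
loss_a_before = mse(w, task_a)  # near zero: task A has been learned

w = train(w, task_b)            # sequential training, no replay of task A
loss_a_after = mse(w, task_a)   # large: task A has been overwritten

print(loss_a_before, loss_a_after)
```

Nothing in the update rule protects what was learned for task A, so training on task B simply drags the shared weight to wherever task B wants it.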
In the link I provided, researchers showed that many areas of the brain are initially activated by a stimulus, yet they ‘come to an agreement’ by iteratively suppressing inputs until only the salient features remain. This is how affinity between features can be selected despite noisy initial activity. I hope that helps to answer your question!