What are your thoughts on training specialized vs more general neural net architectures?


Google, Cambridge, and the Alan Turing Institute just published PolyViT, "a single transformer model capable of processing multiple modalities and datasets, whilst sharing almost all of its learnable parameters." The authors call their training method "co-training": a single model is trained jointly across different modalities and multiple classification tasks. They experimented with as many as nine classification tasks spanning image, video, and audio. Results are strong and in some cases state-of-the-art. An interesting finding is that, according to the authors, "co-training has a regularizing effect, that improves performance on smaller datasets that large transformer models would otherwise overfit on." #deeplearning #neuralnetworks #ai #transformers #machinelearning
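The core idea, a shared backbone with small task-specific heads and a task-sampling training schedule, can be sketched in a few lines. The toy setup below is a minimal illustration, not PolyViT itself: the "encoder" is a single tanh layer instead of a transformer, the three "modalities" are random vectors with linearly generated labels, and the sampling-by-dataset-size schedule is just one of the schedules the paper compares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions; PolyViT uses a full ViT encoder instead.
D_IN, D_HID = 8, 16
tasks = {
    "image": {"classes": 3, "n": 300},
    "video": {"classes": 4, "n": 200},
    "audio": {"classes": 2, "n": 100},
}

# One shared "encoder" for all tasks, one classification head per task.
W_shared = rng.normal(0.0, 0.1, (D_IN, D_HID))
heads = {t: rng.normal(0.0, 0.1, (D_HID, c["classes"])) for t, c in tasks.items()}

# Toy learnable data: labels come from a random linear map of the inputs.
data = {}
for t, c in tasks.items():
    X = rng.normal(size=(c["n"], D_IN))
    true_map = rng.normal(size=(D_IN, c["classes"]))
    data[t] = (X, np.argmax(X @ true_map, axis=1))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_step(task, lr=0.1, batch=32):
    """One SGD step on a minibatch from `task`, updating the shared
    encoder and that task's head."""
    X, y = data[task]
    idx = rng.choice(len(X), batch)
    xb, yb = X[idx], y[idx]
    h = np.tanh(xb @ W_shared)                # shared representation
    p = softmax(h @ heads[task])              # task-specific logits
    onehot = np.eye(heads[task].shape[1])[yb]
    loss = -np.mean(np.sum(onehot * np.log(p + 1e-9), axis=1))
    # Manual backprop through the head and the shared encoder.
    dlogits = (p - onehot) / batch
    dh = dlogits @ heads[task].T * (1.0 - h**2)   # tanh derivative
    heads[task] -= lr * (h.T @ dlogits)
    W_shared[:] -= lr * (xb.T @ dh)               # every task updates the shared weights
    return loss

# "Weighted" schedule: sample tasks proportionally to dataset size.
names = list(tasks)
weights = np.array([tasks[t]["n"] for t in names], dtype=float)
weights /= weights.sum()

losses = {t: [] for t in names}
for step in range(400):
    t = str(rng.choice(names, p=weights))
    losses[t].append(train_step(t))
```

Because every step, whichever task it draws, pushes gradients through `W_shared`, the smaller "audio" task benefits from updates driven by the larger ones; that shared-gradient traffic is the mechanism behind the regularizing effect the authors describe.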
