Google, Cambridge, and the Alan Turing Institute just published PolyViT, "a single transformer model capable of processing multiple modalities and datasets, whilst sharing almost all of its learnable parameters." The authors call their training method "co-training": a single model is trained across different modalities on multiple classification tasks simultaneously. They experimented with as many as 9 classification tasks spanning image, video, and audio. Results are strong and in some cases state-of-the-art. An interesting finding is that, according to the authors, "co-training has a regularizing effect that improves performance on smaller datasets that large transformer models would otherwise overfit on." #deeplearning #neuralnetworks #ai #transformers #machinelearning
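Conceptually, co-training boils down to a shared transformer backbone with lightweight per-modality input projections and per-task classification heads, updated by sampling one task per training step. Here is a minimal PyTorch sketch of that idea; all module names, dimensions, class counts, and the uniform task-sampling schedule are illustrative assumptions, not the authors' actual code or the paper's exact schedule.

```python
# Minimal sketch of PolyViT-style co-training (illustrative, not the
# authors' implementation): one shared transformer encoder, plus a
# per-task tokenizer (input projection) and classification head.
import random
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Transformer encoder whose parameters are shared across all tasks."""
    def __init__(self, dim=256, depth=4, heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))  # shared CLS token

    def forward(self, tokens):                      # tokens: (B, N, dim)
        cls = self.cls.expand(tokens.size(0), -1, -1)
        out = self.encoder(torch.cat([cls, tokens], dim=1))
        return out[:, 0]                            # CLS embedding, (B, dim)

# Hypothetical task registry: feature dims and class counts are made up.
dim = 256
encoder = SharedEncoder(dim=dim)
tasks = {
    "image": {"tokenizer": nn.Linear(768, dim),  "head": nn.Linear(dim, 1000)},
    "video": {"tokenizer": nn.Linear(1536, dim), "head": nn.Linear(dim, 400)},
    "audio": {"tokenizer": nn.Linear(128, dim),  "head": nn.Linear(dim, 527)},
}
params = list(encoder.parameters())
for t in tasks.values():
    params += list(t["tokenizer"].parameters()) + list(t["head"].parameters())
opt = torch.optim.Adam(params, lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

def co_training_step(batches):
    """One co-training step: sample a task, run its batch through the
    shared encoder, and update. `batches` maps task name -> (x, y)."""
    name = random.choice(list(tasks))               # uniform task sampling
    x, y = batches[name]
    tokens = tasks[name]["tokenizer"](x)            # (B, N, dim)
    logits = tasks[name]["head"](encoder(tokens))
    loss = loss_fn(logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return name, loss.item()

# Dummy batches of pre-extracted token features, for illustration only.
batches = {
    "image": (torch.randn(2, 196, 768),  torch.randint(0, 1000, (2,))),
    "video": (torch.randn(2, 392, 1536), torch.randint(0, 400, (2,))),
    "audio": (torch.randn(2, 100, 128),  torch.randint(0, 527, (2,))),
}
for _ in range(3):
    print(co_training_step(batches))
```

Because every task's gradients flow through the same encoder weights, the smaller datasets effectively get supervision from the larger ones, which is one plausible reading of the regularizing effect the authors report.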