Spark in me - Internet, data science, math, deep learning, philosophy

2440 @snakers4

A channel about topics that interest me: the internet, statistics, data science. No ads and no bullshit.

4 years ago
CoCa: Contrastive Captioners are Image-Text Foundation Models
Looks like Google is dead set on developing a production-grade dual image-text encoder / captioning model:
we unify single-encoder, dual-encoder and encoder-decoder paradigms, and train one image-text foundation model that subsumes the capabilities of all three approaches
Using all of the available noisy data and approaches while creatively sharing the compute is a good pattern, until you read this line:
Pretraining CoCa takes about 5 days on 2,048 CloudTPUv4 chips
Research and compute siloing, of course, but the pattern itself is nice. #deep_learing
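For a sense of scale, the quoted figure can be turned into chip-hours. This is only a back-of-the-envelope sketch: it assumes "about 5 days" means exactly 5 full days and that all 2,048 chips are busy the whole time, neither of which the quote states precisely.

```python
# Figures taken from the quoted line; treating "about 5 days" as exactly 5
# and assuming full utilization of every chip (both are assumptions).
DAYS = 5
CHIPS = 2048
HOURS_PER_DAY = 24

chip_days = DAYS * CHIPS
chip_hours = chip_days * HOURS_PER_DAY

print(f"{chip_days:,} chip-days, {chip_hours:,} chip-hours")
```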
Image-Text Pre-training with Contrastive Captioners

Posted by Zirui Wang and Jiahui Yu, Research Scientists, Google Research, Brain Team Oftentimes, machine learning (ML) model developers b...

Google AI Blog