Attention is all you need

“Attention is all you need” is a research paper that introduces the Transformer, a neural network architecture built entirely on attention mechanisms, without recurrence or convolutions.

An attention function maps a query and a set of key-value pairs to an output, where the query, the keys, the values, and the output are all vectors. The output is calculated as a weighted sum of the values, with each value’s weight determined by a compatibility function of the query with the corresponding key.
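To make the weighted sum concrete, here is a minimal sketch for a single query in plain Python. It assumes the compatibility function is the scaled dot product followed by a softmax, which is the choice the paper makes; the helper names (softmax, dot, attention) and the toy vectors are made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attention(query, keys, values):
    """Return the weighted sum of values and the weights used.

    Assumption: compatibility = dot(query, key) / sqrt(d_k), then softmax.
    """
    scale = math.sqrt(len(query))
    scores = [dot(query, k) / scale for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights

# Toy data: the query matches the first key better than the second.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
output, weights = attention(query, keys, values)
```

Because the query lines up with the first key, the first value gets the larger weight, so the output leans towards the first value without ignoring the second one.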

There are two types of attention:

Self-attention

Cross-attention

Cross-attention is a topic for another time.

Self-attention is a mechanism that relates different positions of a single sequence in order to compute a representation of that sequence. It has been applied successfully to tasks such as reading comprehension, abstractive summarization, textual entailment, and learning task-independent sentence representations.
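As a rough sketch of what "relating positions of a single sequence" means, the Python snippet below derives the queries, keys, and values from the same input sequence, so every position attends to every position, including itself. The projection matrices w_q, w_k, w_v stand in for learned weights; setting them to the identity matrix is an assumption made only to keep the example small, and all function names are hypothetical.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(sequence, w_q, w_k, w_v):
    """Queries, keys, and values all come from the same sequence."""
    queries = [matvec(w_q, x) for x in sequence]
    keys = [matvec(w_k, x) for x in sequence]
    values = [matvec(w_v, x) for x in sequence]
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Score this position against every position in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
        )
    return outputs

# Assumption: identity projections instead of learned weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs = self_attention(sequence, identity, identity, identity)
```

The result has one output vector per input position, and each of them is a mix of all the value vectors in the sequence.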

You can read more about this in “Attention is all you need”.
