“Attention Is All You Need” is a research paper that introduces the Transformer, a neural network architecture built entirely on attention.
Attention maps a query and a set of key-value pairs to an output, where the query, keys, values, and output are all vectors. The output is computed as a weighted sum of the values, with each value’s weight determined by a compatibility function of the query with the corresponding key.
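Here is a minimal sketch of this idea in NumPy, using scaled dot-product attention (the compatibility function the paper uses: dot products of queries with keys, scaled by the square root of the key dimension). The function name and the array sizes are illustrative, not from the paper.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V — weighted sum of values."""
    d_k = K.shape[-1]
    # Compatibility score of each query with each key.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys turns scores into weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output: one weighted sum of the values per query.
    return weights @ V

# 3 queries against 4 key-value pairs, all vectors of dimension 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)  # shape (3, 8)
```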
There are two types of attention:
Self-attention
Cross-attention
Cross-attention will get its own write-up another time.
Self-attention is a mechanism that relates different positions of a single sequence to compute a representation of that sequence. It has been applied successfully to a variety of tasks, including reading comprehension, abstractive summarization, textual entailment, and learning task-independent sentence representations.
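As a sketch of what “a single sequence” means here: in self-attention, the queries, keys, and values are all derived from the same input sequence, normally through learned projection matrices. The random matrices below are stand-ins for those learned projections; the sizes are illustrative.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Self-attention: Q, K, and V all come from the same sequence X."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# One sequence of 5 positions, each a vector of dimension 8.
# The projections would normally be learned; random here.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)  # each position attends to all 5
```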
You can read more about this in “Attention Is All You Need”.