A Visual Intro to NumPy and Data Representation

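When two matrices are the same size, we can add and multiply them with the arithmetic operators (+, -, *, /). NumPy handles those as position-wise operations: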
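A minimal sketch of position-wise arithmetic on two same-sized matrices. The names data and ones and the values in them are chosen here purely for illustration:

```python
import numpy as np

# Two 2x2 matrices of the same size (illustrative values).
data = np.array([[1, 2],
                 [3, 4]])
ones = np.ones((2, 2))

# Each operator is applied to the elements that share a position.
added = data + ones        # [[2., 3.], [4., 5.]]
multiplied = data * data   # [[1, 4], [9, 16]], element by element

print(added)
print(multiplied)
```

Note that * here multiplies matching elements; it is not matrix multiplication.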

We can get away with doing these arithmetic operations on matrices of different sizes only if the dimension that differs has size one (e.g. the matrix has only one column or one row), in which case NumPy uses its broadcasting rules for that operation:
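A small sketch of broadcasting, assuming a 3x2 matrix combined with a single row and with a single column. All names and values are illustrative, and the last case is included only to show what happens when the mismatched dimension is not one:

```python
import numpy as np

data = np.array([[1, 2],
                 [3, 4],
                 [5, 6]])                 # shape (3, 2)
ones_row = np.array([[1, 1]])             # shape (1, 2): one row
column = np.array([[10], [20], [30]])     # shape (3, 1): one column

# The single row is reused for every row of data.
row_sum = data + ones_row
# The single column is reused for every column of data.
col_sum = data + column

print(row_sum)
print(col_sum)

# Shapes (3, 2) and (1, 3) differ in a dimension that is not one,
# so NumPy refuses to broadcast them.
mismatch_failed = False
try:
    data + np.array([[1, 1, 1]])
except ValueError:
    mismatch_failed = True
print(mismatch_failed)
```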

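A key distinction to make with arithmetic is the case of matrix multiplication using the dot product, which, unlike the operations above, is not position-wise.

Arithmetic on whole arrays also makes mathematical formulas easy to express. For example, consider the mean squared error formula that is central to supervised machine learning models tackling regression problems: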
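In its usual form, the formula can be written as below. The symbols are notation assumed for this sketch: n is the number of examples, \hat{y}_i is the model's prediction for example i, and y_i is the true value.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2
```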

Implementing this is a breeze in NumPy:
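A minimal sketch of the mean squared error in NumPy. The names predictions and labels and their values are assumptions made for illustration. The second computation is an equivalent form using the dot product, included only to tie the two ideas together:

```python
import numpy as np

# Illustrative values: three model outputs and three true values.
predictions = np.array([1.0, 1.0, 1.0])
labels = np.array([1.0, 2.0, 3.0])

n = predictions.size

# Subtract position-wise, square position-wise, sum, then divide by n.
error = (1 / n) * np.sum(np.square(predictions - labels))

# Same quantity via the dot product of the difference with itself.
diff = predictions - labels
error_dot = (diff @ diff) / n

print(error)      # 5 / 3, about 1.667
print(error_dot)
```

The differences are [0, -1, -2], their squares sum to 5, and dividing by 3 gives roughly 1.667.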

The beauty of this is that NumPy does not care whether the predictions and the labels contain one or a thousand values (as long as they’re both the same size).

Before feeding a sequence of words to a model, the tokens/words need to be replaced with their embeddings (a 50-dimensional word2vec embedding in this case):

You can see that this NumPy array has the dimensions [embedding_dimension x sequence_length].
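A rough sketch of how such an array could be built. The lookup table here is hypothetical: random vectors stand in for trained word2vec embeddings, and the three-word sentence is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
embedding_dimension = 50

# Hypothetical embedding table: random vectors, NOT real word2vec values.
vocabulary = ["numpy", "is", "fun"]
embedding_table = {
    word: rng.standard_normal(embedding_dimension) for word in vocabulary
}

sentence = ["numpy", "is", "fun"]

# Place each word's vector in its own column, so the result has shape
# [embedding_dimension x sequence_length].
embedded = np.stack([embedding_table[word] for word in sentence], axis=1)

print(embedded.shape)  # (50, 3)
```

Stacking along axis=0 instead would give the transposed layout, [sequence_length x embedding_dimension].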

Source: jalammar.github.io