Multiple vectors can be arranged as the rows or columns of a matrix.

$$\vec{a} = \begin{bmatrix} 5 \\ 6 \\ 7 \end{bmatrix}\ \ \vec{b} = \begin{bmatrix} 7 \\ 9 \\ 10 \end{bmatrix}$$

$$\textbf{A} = \begin{bmatrix} \vec{a} & \vec{b} \end{bmatrix} = \begin{bmatrix} 5 & 7 \\ 6 & 9 \\ 7 & 10 \end{bmatrix}$$

$$\textbf{B} = \begin{bmatrix} \vec{a} \\ \vec{b} \end{bmatrix} = \begin{bmatrix} 5 & 6 & 7 \\ 7 & 9 & 10 \end{bmatrix}$$

The term *matrix* encompasses vectors as well, so $\vec{a}$ and $\vec{b}$ could be called matrices or vectors interchangeably.

The number of $rows \times columns$ of a matrix is called its **order**. For example, the following matrix has order $3 \times 4$.

$$\begin{bmatrix} 1 & 8 & 99 & 0 \\ 56 & 43 & 91 & 2 \\ 9 & 5 & 33 & 1 \end{bmatrix}$$

Vectors are matrices with an order of either $n \times 1$ or $1 \times n$.

The **diagonal** of the matrix refers to the elements from the top left corner to the bottom right corner of the matrix (highlighted below).

$$\begin{bmatrix} {\color {red} 4} & 9 & 5 \\ 9 & {\color{red} 3} & 1 \\ 90 & 2 & {\color{red} 8}\end{bmatrix} \ \ \ \begin{bmatrix} {\color {red} 5} & 9 & 5 & 98 \\ 9 & {\color{red} 43} & 1 & 1 \\ 90 & 2 & {\color{red} 9} & 52\end{bmatrix}$$

A matrix with an equal number of rows and columns is called a **square matrix**. For example:

$$\begin{bmatrix} 1 & 8 & 9 \\ 2 & 0 & 11 \\ 9 & 7 & 1\end{bmatrix}$$

A **triangular matrix** is a type of square matrix where either the elements above or below the diagonal are all $0$s.

Here is an example of an **upper triangular matrix**:

$$\begin{bmatrix} \color{red} 6 & \color{red} 21 & \color{red} 6 & \color{red} 88 \\ 0 & \color{red} 71 & \color{red} 90 & \color{red} 8 \\ 0 & 0 & \color{red} 1 & \color{red} 71 \\ 0 & 0 & 0 & \color{red} 4 \end{bmatrix}$$

The following is an example of a **lower triangular matrix**:

$$\begin{bmatrix} \color{red} 6 & 0 & 0 & 0 \\ \color{red} 7 & \color{red} 71 & 0 & 0 \\ \color{red} 12 & \color{red} 61 & \color{red} 1 & 0 \\ \color{red} 4 & \color{red} 6 & \color{red} 5 & \color{red} 4 \end{bmatrix}$$

Two matrices are equal if

- They have the same order.
- The values of the corresponding elements are the same.

Given a matrix $\textbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and a matrix $\textbf{B} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ we can say that $\textbf{A} = \textbf{B}$ because the order of both matrices is $2 \times 2$ and the elements corresponding to same position are also equal.

To add two matrices, we simply sum the elements at corresponding positions. Both matrices must have the same order for addition (and subtraction).

$$\begin{bmatrix} 4 & 5 & 9 \\ 89 & 5 & 91 \end{bmatrix} + \begin{bmatrix} 9 & 4 & 8 \\ 9 & 1 & 85 \end{bmatrix} = \begin{bmatrix} 4+9 & 5+4 & 9+8 \\ 89+9 & 5+1 & 91+85 \end{bmatrix} = \begin{bmatrix} 13 & 9 & 17 \\ 98 & 6 & 176 \end{bmatrix}$$

The subtraction operation is similar to the addition.

$$\begin{bmatrix} 9 & 4 & 8 \\ 9 & 1 & 85 \end{bmatrix} - \begin{bmatrix} 4 & 5 & 9 \\ 89 & 5 & 91 \end{bmatrix} = \begin{bmatrix} 9-4 & 4-5 & 8-9 \\ 9-89 & 1-5 & 85-91 \end{bmatrix} = \begin{bmatrix} 5 & -1 & -1 \\ -80 & -4 & -6 \end{bmatrix}$$
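As a sketch in plain Python (list-of-lists matrices; `mat_add` and `mat_sub` are illustrative names, not standard functions), elementwise addition and subtraction look like this:

```python
def mat_add(A, B):
    # Elementwise sum; both matrices must have the same order.
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "orders must match"
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    # Elementwise difference, under the same order constraint.
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "orders must match"
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

print(mat_add([[4, 5, 9], [89, 5, 91]], [[9, 4, 8], [9, 1, 85]]))
# [[13, 9, 17], [98, 6, 176]]
```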

If we want to multiply a matrix with a scalar value, we just have to multiply the scalar with each element individually.

$$5 \times \begin{bmatrix} 7 & 6 \\ 8 & 4 \end{bmatrix} = \begin{bmatrix} 7 \times 5 & 6 \times 5 \\ 8 \times 5 & 4 \times 5 \end{bmatrix} = \begin{bmatrix} 35 & 30 \\ 40 & 20 \end{bmatrix}$$

Division operation with a scalar value is also applied similarly on the matrix.

$$\begin{bmatrix} 9 & 6 \\ 12 & 36 \end{bmatrix} \div 3 = \begin{bmatrix} 9 & 6 \\ 12 & 36 \end{bmatrix} \times {1 \over 3} = \begin{bmatrix} {9 \over 3} & {6 \over 3} \\ {12 \over 3} & {36 \over 3} \end{bmatrix} = \begin{bmatrix} 3 & 2 \\ 4 & 12 \end{bmatrix}$$
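Both scalar operations can be sketched the same way (illustrative function names; dividing by $k$ is just multiplying by $1/k$):

```python
def scalar_mul(k, A):
    # Multiply every element by the scalar k.
    return [[k * a for a in row] for row in A]

def scalar_div(A, k):
    # Divide every element by the scalar k (same as multiplying by 1/k).
    return [[a / k for a in row] for row in A]

print(scalar_mul(5, [[7, 6], [8, 4]]))    # [[35, 30], [40, 20]]
print(scalar_div([[9, 6], [12, 36]], 3))  # [[3.0, 2.0], [4.0, 12.0]]
```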

The **transpose** operation on a matrix swaps the elements in its rows and columns. For example, if we apply the transpose operation on the matrix $\textbf{C} = \begin{bmatrix} 4 & 5 & 8 \\ 6 & 7 & 9 \end{bmatrix}$ the result would be $\textbf{C}^T = \begin{bmatrix} 4 & 6 \\ 5 & 7 \\ 8 & 9 \end{bmatrix}$.

Notice that the order of the matrix is also changed from $2 \times 3$ to $3 \times 2$.
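A minimal sketch of the transpose in Python (the built-in `zip(*A)` regroups rows into columns):

```python
def transpose(A):
    # Rows become columns: element (i, j) moves to position (j, i).
    return [list(col) for col in zip(*A)]

C = [[4, 5, 8], [6, 7, 9]]   # order 2x3
print(transpose(C))          # [[4, 6], [5, 7], [8, 9]] -- order 3x2
```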

A matrix is **symmetric** if it remains unchanged after the transpose operation is applied i.e. $\textbf{C}^T = \textbf{C}$. The following matrix is an example of a symmetric matrix:

$$\left(\begin{bmatrix} 9 & 5 & 89 \\ 5 & 8 & 67 \\ 89 & 67 & 34\end{bmatrix}\right)^T = \begin{bmatrix} 9 & 5 & 89 \\ 5 & 8 & 67 \\ 89 & 67 & 34\end{bmatrix}$$

If we perform the transpose operation on a **skew-symmetric** matrix the result will be the original matrix multiplied by the scalar value $(-1)$ i.e. $\textbf{C}^T = (-1) \times \textbf{C}$. Note that this forces every diagonal element to be $0$, since each diagonal element must equal its own negative.

$$\left(\begin{bmatrix} 0 & 5 & 89 \\ -5 & 0 & 67 \\ -89 & -67 & 0\end{bmatrix}\right)^T = \begin{bmatrix} 0 & -5 & -89 \\ 5 & 0 & -67 \\ 89 & 67 & 0\end{bmatrix} = (-1) \times \begin{bmatrix} 0 & 5 & 89 \\ -5 & 0 & 67 \\ -89 & -67 & 0\end{bmatrix}$$
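Both properties are easy to check in code — a small sketch in plain Python (illustrative function names):

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def is_symmetric(A):
    # A symmetric matrix equals its own transpose.
    return transpose(A) == A

def is_skew_symmetric(A):
    # C^T == -C; note this forces a zero diagonal.
    return transpose(A) == [[-a for a in row] for row in A]

print(is_symmetric([[9, 5, 89], [5, 8, 67], [89, 67, 34]]))         # True
print(is_skew_symmetric([[0, 5, 89], [-5, 0, 67], [-89, -67, 0]]))  # True
```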

To calculate the product of two matrices we have to take the dot product of each row from the first matrix with every column of the second matrix.

$$\textbf{A} \times \textbf{B} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \times \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} +a_{12}b_{21} & a_{11}b_{12} +a_{12}b_{22} \\ a_{21}b_{11} +a_{22}b_{21} & a_{21}b_{12} +a_{22}b_{22} \\ a_{31}b_{11} +a_{32}b_{21} & a_{31}b_{12} +a_{32}b_{22}\end{bmatrix}$$

Because the dot product can only be calculated between vectors with the same number of elements, the number of columns in the first matrix must equal the number of rows in the second matrix for multiplication to be defined.
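As a sketch in plain Python (illustrative `mat_mul` name), each entry of the product is the dot product of a row of the first matrix with a column of the second:

```python
def mat_mul(A, B):
    # Columns of A must match rows of B.
    assert len(A[0]) == len(B), "incompatible orders"
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [3, 4], [5, 6]]  # order 3x2
B = [[7, 8], [9, 10]]         # order 2x2
print(mat_mul(A, B))  # [[25, 28], [57, 64], [89, 100]] -- order 3x2
```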

If we switch the matrices $\textbf{A}$ and $\textbf{B}$ while performing multiplication, the product may not even be defined, and when it is, it will in general have different dimensions or elements. Thus, *matrix multiplication is not commutative*.
$$\textbf{A} \times \textbf{B} \neq \textbf{B} \times \textbf{A}$$

A square matrix with $1$s on its diagonal (and $0$s as non-diagonal elements) is called an **identity matrix**. Multiplying any matrix with an identity matrix (of valid order) is analogous to multiplying a number with $1$.

In equations, the identity matrix is represented with $\textbf{I}$

$$\textbf{A} \times \textbf{I} = \textbf{A}$$ $$\begin{bmatrix} 7 & 1 & 8 \\ 4 & 5 & 3 \\ 1 & 2 & 6 \end{bmatrix} \times \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 7 & 1 & 8 \\ 4 & 5 & 3 \\ 1 & 2 & 6 \end{bmatrix}$$

If multiplication with identity matrix is analogous to multiplication with $1$ then multiplication with null matrix will be analogous to multiplication of a number with $0$. A null matrix has only $0$s as elements and doesn’t have to be a square matrix. It is represented as $\textbf{0}$ in equations.

$$\textbf{A} \times \textbf{0} = \textbf{0}$$ $$\begin{bmatrix} 7 & 1 & 8 \\ 4 & 5 & 3 \\ 1 & 2 & 6 \end{bmatrix} \times \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

A matrix that remains unchanged when multiplied by itself is called an **idempotent matrix**.
$$\textbf{A} \times \textbf{A} = \textbf{A}^{2} = \textbf{A}$$
Since the order of the product matrix has to be the same as the original matrix, an idempotent matrix must always be a square matrix.
All identity matrices are idempotent.

There is no direct division operation for matrices. If we have to "divide" a matrix $\textbf{A}$ by another matrix $\textbf{B}$

$$\textbf{A} \div \textbf{B} = {\textbf{A} \over \textbf{B}}$$

we rephrase it as multiplication by the inverse of $\textbf{B}$

$$\textbf{A} \times {\textbf{I} \over \textbf{B}} = \textbf{A} \times \textbf{B}^{-1}$$

The matrix $\textbf{B}^{-1}$ will be called the **inverse** of the matrix $\textbf{B}$.
$$\textbf{B} \times \textbf{B}^{-1} = \textbf{I}$$
The inverse matrix has the same order as the original matrix.
We can find the inverse of an invertible (non-singular) square matrix using the process of Gauss-Jordan elimination. Not every square matrix has an inverse; a matrix whose determinant is $0$ is singular and cannot be inverted.
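A sketch of how Gauss-Jordan elimination can compute an inverse (plain Python; `inverse` is an illustrative name): augment the matrix with the identity to form $[\textbf{B} \mid \textbf{I}]$, then row-reduce until the left half becomes $\textbf{I}$, at which point the right half is $\textbf{B}^{-1}$.

```python
def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
    n = len(A)
    # Build the augmented matrix [A | I] with float entries.
    M = [[float(A[i][j]) for j in range(n)] +
         [float(i == j) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular and has no inverse")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot element becomes 1.
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    # The right half of M is now the inverse of A.
    return [row[n:] for row in M]

print(inverse([[4, 7], [2, 6]]))  # approximately [[0.6, -0.7], [-0.2, 0.4]]
```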


On a 2D plane $8$ and $9$ could be called the horizontal and vertical components of the vector $\vec{v}$ respectively. But the words *horizontal* and *vertical* on a 2D or 3D plane are relative to the viewer (try rotating the figure above anti-clockwise).

By using **unit vectors** $\hat{i}$, $\hat{j}$, and $\hat{k}$ (vectors with magnitude 1 along the x, y, and z axis respectively) we can define the orientation of each component of the vector. Thus, vector $\vec{v}$ could be redefined as $8\hat{i} + 9\hat{j} + 0\hat{k}$ in a 3D plane or $8\hat{i} + 9\hat{j}$ on a 2D plane.

*Unit vectors along the x, y, and z axis*

We can break the vector into its components if we know the angle between the vector and any of the axes.

Assuming we have a vector $\vec{v}$ from the origin, we can draw a circle on the plane with radius $|\vec{v}|$ (magnitude of the vector). To break it into two components along the x and y axis we can draw a *projection* line starting from the tip of the vector to both axes.

From the figure above, $\cos(\theta) = {v_x \over |\vec{v}|}$ and $\sin(\theta) = {v_y \over |\vec{v}|}$, where $v_x$ and $v_y$ are the lengths of the two projections. Hence

$$ v_{x} = |\vec{v}| \times {v_x \over |\vec{v}|} = |\vec{v}|\cos(\theta) $$

$$ v_{y} = |\vec{v}| \times {v_y \over |\vec{v}|} = |\vec{v}|\sin(\theta) $$

Thus, the vector components of $\vec{v}$ along the x and y axis are $|\vec{v}|\cos(\theta)$ and $|\vec{v}|\sin(\theta)$ respectively, where $\theta$ is the angle between the vector and the x-axis.
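This decomposition can be sketched in Python using the standard `math` module (the `components` function name is illustrative):

```python
import math

def components(magnitude, theta):
    # Resolve a 2D vector of the given magnitude into its x and y
    # components; theta is the angle with the x-axis, in radians.
    return magnitude * math.cos(theta), magnitude * math.sin(theta)

vx, vy = components(10, math.radians(30))
print(round(vx, 2), round(vy, 2))  # 8.66 5.0
```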

Similarly, we can find the projection of a vector on another vector. For example, we have two vectors $\vec{v}$ and $\vec{u}$ subtended by the angle $\phi$.

*Projection of $\vec{v}$ on $\vec{u}$*

The component of $\vec{v}$ on vector $\vec{u}$ will be $|\vec{v}| \cos(\phi)$, but this is just a scalar value. To add direction we have to include $\hat{u}$, the unit vector along the direction of $\vec{u}$. Thus, the **projection vector** of $\vec{v}$ on $\vec{u}$ is $|\vec{v}| \cos(\phi) \hat{u}$.

The **dot product** of two vectors is calculated by multiplying the (scalar) projection of the first vector onto the second with the magnitude of the second vector. It quantifies the similarity in the direction of both vectors.

$$ \vec{v} \cdot \vec{u} = |\vec{v}||\vec{u}|\cos(\phi)$$
The dot product operation is *commutative*, so it doesn’t matter if we multiply the projection of $\vec{v}$ with the magnitude of $\vec{u}$ or vice-versa.
$$\vec{v} \cdot \vec{u} = |\vec{v}||\vec{u}|\cos(\phi) = |\vec{u}||\vec{v}|\cos(\phi) = \vec{u} \cdot \vec{v}$$

*Projection of $\vec{v}$ on $\vec{u}$ and projection of $\vec{u}$ on $\vec{v}$*

We can also calculate the dot product by multiplying the corresponding elements of both vectors and adding them up.

$$\vec{v} \cdot \vec{u} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 3 \\ -2 \end{bmatrix} = 1 \times 3 + 1 \times (-2) = 1$$
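Both formulations agree, as a quick Python sketch shows (illustrative `dot` function):

```python
import math

def dot(v, u):
    # Multiply corresponding elements and add them up.
    return sum(a * b for a, b in zip(v, u))

v, u = [1, 1], [3, -2]
print(dot(v, u))  # 1
print(dot(u, v))  # 1 -- the dot product is commutative

# The same value via |v||u|cos(phi):
magnitude = lambda w: math.sqrt(dot(w, w))
cos_phi = dot(v, u) / (magnitude(v) * magnitude(u))
print(round(magnitude(v) * magnitude(u) * cos_phi, 10))  # 1.0
```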

The **cross product** of two vectors returns a vector that is perpendicular to the direction of both vectors.

The magnitude of the resulting vector of the cross product is equal to the area of the parallelogram created by the two vectors.

The length of the sides of the parallelogram will be $|\vec{v}|$ and $|\vec{u}|$.

Hence, from the area of the parallelogram we can calculate the magnitude of the cross product of vectors $\vec{u}$ and $\vec{v}$ $$| \vec{u} \times \vec{v}| = |\vec{u}||\vec{v}|\sin(\phi)$$

Unlike the dot product, the cross product is not commutative. $$\vec{u} \times \vec{v} \neq \vec{v} \times \vec{u}$$

If we have to find the cross product of two vectors using their elements, we create a matrix from both vectors and calculate its **determinant**. (Strictly speaking, the cross product is defined for 3D vectors; for 2D vectors the determinant below gives the signed magnitude of the perpendicular resulting vector.)

$$\vec{u} \times \vec{v} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \times \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{vmatrix} u_1 & v_1 \\ u_2 & v_2 \end{vmatrix} = u_1 v_2 - v_1 \ u_2$$
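A small sketch of this 2D determinant in Python (illustrative `cross_2d` name), which also makes the non-commutativity visible as a sign flip:

```python
def cross_2d(u, v):
    # Determinant of the matrix [u v]: the signed area of the
    # parallelogram spanned by the two vectors.
    return u[0] * v[1] - v[0] * u[1]

print(cross_2d([3, 0], [0, 2]))  # 6   (a 3-by-2 rectangle)
print(cross_2d([0, 2], [3, 0]))  # -6  (swapping the operands flips the sign)
```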

Named after Jacques Hadamard, the **Hadamard product** is the matrix obtained by multiplying the corresponding elements of two vectors or matrices of the same order.
$$\vec{v} \odot \vec{u} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \odot \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} v_1 \times u_1 \\ v_2 \times u_2 \end{bmatrix}$$
The result of the Hadamard product will be the same irrespective of the order of multiplication. Thus, it is commutative.
$$\vec{v} \odot \vec{u} = \vec{u} \odot \vec{v}$$
Some use cases of the Hadamard product operation are JPEG image compression and LSTM (Long Short-Term Memory) cells of RNNs (Recurrent Neural Networks). It is also known as Schur Product (named after Issai Schur).
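For vectors, the Hadamard product is a one-liner in Python (illustrative `hadamard` name):

```python
def hadamard(v, u):
    # Elementwise product; both operands must have the same shape.
    return [a * b for a, b in zip(v, u)]

print(hadamard([2, 3], [4, 5]))  # [8, 15]
print(hadamard([4, 5], [2, 3]))  # [8, 15] -- commutative
```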


Vectors and matrices could be generalized with the term **tensor**. The number of independent dimensions of a tensor is called its **rank**. The following Venn diagram visualizes the connection between them.

A rank 0 tensor does not expand in any dimension; it is used to represent quantities that can be expressed by just one component, i.e. their magnitude or scale. For example, the distance of $8\ cm$ between two points could be represented by the tensor $[ 8 ]$.

Since the value of a rank 0 tensor signifies only its *scale* or *magnitude* it could be called a **scalar** value. A scalar value could represent the mass of an object, the temperature of a room, the speed of a car, etc.

A rank 1 tensor (or a **vector**) expands in one dimension i.e. it represents values with more than one component.

Vectors are used to represent the magnitude and direction of different components of quantities such as displacement of an object, velocity of a car, electric field generated by a particle, etc. For example, displacement of $5\ m$ in the east direction and $10\ m$ in the north direction for an object could be represented with the vector $[5\ 10]$ where each component represents displacement in the east and north direction respectively.

In equations, vectors are denoted as bold letters ($\textbf{E}$) or letters with an arrow on top ($\vec{E}$).

To obtain the magnitude/scalar value of a vector we have to square all of its components and take the square root of their sum. For example, the magnitude of the displacement vector in the example above, i.e. the straight-line distance between the start and the end, will be $$|\ [ 5 \ \ 10 ]\ | = \sqrt{5^2 + 10^2} = \sqrt{25+100} = \sqrt{125} \approx 11.18\ m$$ The magnitude of a vector is denoted by enclosing it within $|\ \ |$, for example, $|\vec{E}|$.
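The magnitude computation translates directly into Python (illustrative `magnitude` function using the standard `math` module):

```python
import math

def magnitude(v):
    # Square root of the sum of the squared components.
    return math.sqrt(sum(x * x for x in v))

print(round(magnitude([5, 10]), 2))  # 11.18
```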

A rank 2 tensor (or a **matrix**) expands in two independent dimensions.

A *system of linear equations*
$$x - 2y = 6$$
$$ x - y = 4 $$
$$ x + y = 0 $$
could be represented using a matrix as

$$ \begin{bmatrix} 1 & -2 & 6 \\ 1 & -1 & 4 \\ 1 & 1 & 0 \end{bmatrix}$$

This matrix has three rows and three columns. Thus, its **order** will be $3 \times 3$.
A matrix with one row or one column could be called a matrix or vector interchangeably.

The following diagram visualizes a rank 3 tensor in 3-dimensional space.

Tensors with rank greater than 3 are difficult to visualize, but they can be represented as nested arrays in any programming language.

```
# The following array has four levels of nesting (a rank 4 tensor of shape 2x2x2x3)
rank_4_tensor = [
    [[[6, 7, 8], [8, 58, 26]],
     [[5, 28, 19], [10, 11, 12]]],
    [[[13, 14, 15], [16, 17, 18]],
     [[19, 20, 21], [22, 23, 24]]],
]
```
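Under this nested-list representation, a tensor's rank can be read off as its nesting depth. A small sketch in Python (assuming rectangular nesting; `rank` is an illustrative name):

```python
def rank(tensor):
    # Nesting depth = number of independent dimensions (the rank).
    # Assumes every level of the nesting is rectangular.
    if not isinstance(tensor, list):
        return 0
    return 1 + rank(tensor[0])

rank_4_tensor = [[[[6, 7, 8], [8, 58, 26]], [[5, 28, 19], [10, 11, 12]]],
                 [[[13, 14, 15], [16, 17, 18]], [[19, 20, 21], [22, 23, 24]]]]
print(rank(8), rank([5, 10]), rank(rank_4_tensor))  # 0 1 4
```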
