Most of you are probably familiar with the norm of a vector, but perhaps not as comfortable with the norm of a matrix. What does the norm of a matrix measure? How do we define a meaningful norm for a matrix? When would we need one? This post will help you answer these questions.
Recall that we can define a norm on any vector space $V$ over a field $K$ (usually $\mathbb{R}$ or $\mathbb{C}$) as a function $\|\cdot\|: V \to \mathbb{R}$ satisfying the following properties:

1. Positivity: $\|x\| \ge 0$, and $\|x\| = 0$ if and only if $x = 0$.
2. Homogeneity: $\|\alpha x\| = |\alpha|\,\|x\|$ for all $\alpha \in K$.
3. Triangle inequality: $\|x + y\| \le \|x\| + \|y\|$.
Since the set of all M-by-N matrices forms a vector space of dimension $MN$, it is not surprising that we can also define norms for matrices, as long as the norm satisfies these three properties.
NOTE 1. From homogeneity, we have $\|0\| = \|0 \cdot x\| = 0\|x\| = 0$. From the triangle inequality, $0 = \|0\| = \|x-x\| \le \|x\| + \|-x\| = 2\|x\|$, so $\|x\| \ge 0$. Therefore, for positivity we only need to require $\|x\| = 0 \implies x = 0$ for $x \in V$; the rest can be derived from homogeneity and the triangle inequality.
NOTE 2. Some norms (not all) have an additional property, submultiplicativity: $\|xy\| \le \|x\|\,\|y\|$ whenever the product $xy$ is defined.
Since matrices represent linear maps, we can define the norm of a matrix based on the norm of vectors.
Def. The norm of a matrix can be "induced" from the norm of vectors in the following way: $$\|A\| = \max_{x \ne 0} \frac{\|Ax\|}{\|x\|} = \max_{\|x\|=1} \|Ax\|$$ NOTE: Here $\|x\|$, $\|A\|$, and $\|Ax\|$ are three different norms (defined on different vector spaces), although we use the same notation without distinguishing them.
We can prove that an induced norm is indeed a norm (it satisfies the three properties in section 1.1). In addition, every induced norm is submultiplicative: $\|AB\| \le \|A\|\,\|B\|$.
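To make this concrete, here is a minimal NumPy sketch (the matrices and sample counts are illustrative choices, not from the text). NumPy's `np.linalg.norm(A, 2)` returns the norm induced by the vector 2-norm, and sampling random unit vectors numerically illustrates that no $\|Ax\|/\|x\|$ exceeds it:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))

# Induced (operator) norms available in NumPy:
#   ord=1      -> max absolute column sum (induced by the vector 1-norm)
#   ord=2      -> largest singular value  (induced by the vector 2-norm)
#   ord=np.inf -> max absolute row sum    (induced by the vector inf-norm)
norm2 = np.linalg.norm(A, 2)

# Approximate max_{||x||=1} ||Ax|| by sampling random unit vectors;
# the sampled maximum can never exceed the true induced norm.
xs = rng.standard_normal((4, 10000))
xs /= np.linalg.norm(xs, axis=0)
sampled_max = np.max(np.linalg.norm(A @ xs, axis=0))
assert sampled_max <= norm2 + 1e-12

# Submultiplicativity of the induced norm: ||AB|| <= ||A|| ||B||
assert np.linalg.norm(A @ B, 2) <= norm2 * np.linalg.norm(B, 2) + 1e-12
```

With enough samples, `sampled_max` approaches (from below) the induced 2-norm of $A$.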
There are matrix norms, satisfying the three defining conditions in sec 1.1, that are not induced norms. One important example is the Frobenius norm: $$\|A\|_F=\Big(\sum_{j=1}^n \sum_{i=1}^m |a_{ij}|^2 \Big)^{1/2}$$ The Frobenius norm is still submultiplicative even though it is not an induced norm.
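A quick way to see that the Frobenius norm cannot be induced: every induced norm of the identity matrix equals 1 (since $\|Ix\|/\|x\| = 1$ for every $x$), while $\|I\|_F = \sqrt{n}$. The sketch below (with arbitrary illustrative sizes) checks this and the submultiplicativity claim numerically:

```python
import numpy as np

n = 4
I = np.eye(n)

# Any induced norm gives ||I|| = 1, but the Frobenius norm of the
# n-by-n identity is sqrt(n) -- so for n >= 2 it cannot be induced
# by any vector norm.
fro = np.linalg.norm(I, 'fro')
print(fro)  # sqrt(4) = 2.0

# The Frobenius norm is nevertheless submultiplicative:
# ||AB||_F <= ||A||_F ||B||_F
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
assert np.linalg.norm(A @ B, 'fro') <= np.linalg.norm(A, 'fro') * np.linalg.norm(B, 'fro')
```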
We all know that the norm of a vector represents the magnitude of that vector, but what does the norm of a matrix represent?
An M-by-N matrix can be viewed as a linear map from an N-dimensional space to an M-dimensional space, so we can understand the matrix norm as a property of the mapping, or "operator", itself (regardless of which vector the operator is applied to). Recall the definition of an induced norm: $\frac{\|Ax\|}{\|x\|}$ is the magnification of a vector $x$, and $\max_{x}\frac{\|Ax\|}{\|x\|}$ is the maximum magnification gain of the operator defined by $A$.
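For the vector 2-norm, this maximum magnification is exactly the largest singular value of $A$, which is how NumPy computes the induced 2-norm. A small sketch (matrix shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))

# The induced 2-norm equals the largest singular value of A: the
# maximum factor by which A can stretch any vector in the 2-norm.
sigma_max = np.linalg.svd(A, compute_uv=False)[0]
assert np.isclose(sigma_max, np.linalg.norm(A, 2))

# The gain ||Ax|| / ||x|| of any particular x never exceeds sigma_max.
x = rng.standard_normal(4)
gain = np.linalg.norm(A @ x) / np.linalg.norm(x)
assert gain <= sigma_max + 1e-12
```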
The matrix norm is not just an abstract concept; it is an important metric of a linear map. One example of its applications is using matrix norms to study the sensitivity of the solution of a linear system, which we will touch upon in future posts.
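As a small preview of that application (a sketch of the standard bound, not part of this post's derivations): for $Ax = b$, a relative perturbation in $b$ is amplified in $x$ by at most $\|A\|\,\|A^{-1}\|$, the condition number of $A$:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)
db = 1e-8 * rng.standard_normal(4)   # small perturbation of b

x = np.linalg.solve(A, b)
x_pert = np.linalg.solve(A, b + db)

rel_err_x = np.linalg.norm(x_pert - x) / np.linalg.norm(x)
rel_err_b = np.linalg.norm(db) / np.linalg.norm(b)

# Standard sensitivity bound: rel_err_x <= cond(A) * rel_err_b,
# where cond(A) = ||A|| ||A^{-1}|| (here in the induced 2-norm).
cond = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)
assert rel_err_x <= cond * rel_err_b * (1 + 1e-9)
```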