If Pv + m_{44} = 0, then either
Av + T \neq 0 and this corresponds to the
point at infinity in the direction of the vector Av + T, or
Av + T = 0 and this does not define any
point in the projective space.
Note that if M is multiplied by any nonzero scalar
\lambda then so are
A, T, P, m_{44}, and \lambda M
defines exactly the same transform as M
per the description above.
Normalization of m_{44}
Except for matrix3d all
the basic CSS transforms correspond to a matrix with
m_{44} = 1 per the
table above.
The inverse of such a matrix is also of this form and so is the product
of two such matrices:
M = \begin{pmatrix}
A & T \\
P & m_{44}
\end{pmatrix}
If m_{44} = 0 then M
cannot be decomposed as a product of
a \lambda \neq 0 and other simpler
matrices defined by the CSS syntax, since the coefficient at position
(4, 4) of such a product would be \lambda \neq 0, contradicting m_{44} = 0.
If additionally, P = 0, this corresponds to
the transform sending
v = {(x, y, z, 1)}^{\mathrm T}
to the
point at infinity in the direction of the vector
Av + T
(or is undefined if Av + T = 0).
Right-multiplying M by
\Sigma circularly shifts (from left to right)
the columns of
M.
Similarly, \Sigma^{n} is a
matrix with only 0's and 1's that circularly shifts
the columns of M |n| times (in
a direction depending on the sign of the integer n).
If P \neq 0 and m_44 \neq 0, then
let's pick the smallest n \in {\{0, 1, 2, 3\}}
such that the bottom right coefficient
{\left(M \Sigma^n\right)}_{44} is nonzero.
Then M can be decomposed as follows:
M = {{\left(M \Sigma^n\right)}_{44} M'
{\Sigma^{-n}}}
where M' = \frac{1}{{\left(M \Sigma^n\right)}_{44}} M \Sigma^n is of the form
{M'} = {\begin{pmatrix}
A' & T' \\
P' & 1
\end{pmatrix}}
and the first n coefficients of P'
are zeros.
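For illustration, here is a small Python sketch of this normalization step (the row-major matrix layout, the shift direction and the helper name normalize_m44 are assumptions, not part of the derivation above):

```python
def normalize_m44(M):
    # Find the smallest n in {0, 1, 2, 3} such that circularly shifting
    # the columns of M n times (from left to right) puts a nonzero entry
    # in the bottom-right corner, then divide by that entry.
    # Returns (n, lambda, M') with M = lambda * M' * Sigma^{-n}.
    for n in range(4):
        shifted = [[row[(j - n) % 4] for j in range(4)] for row in M]
        lam = shifted[3][3]
        if lam != 0:
            return n, lam, [[x / lam for x in row] for row in shifted]
    raise ValueError("the last row of M is zero")

# Example with last row (0, 0, 2, 0), i.e. P != 0 and m44 = 0.
n, lam, Mp = normalize_m44([[1, 0, 0, 0],
                            [0, 1, 0, 0],
                            [0, 0, 1, 0],
                            [0, 0, 2, 0]])
```

A single shift suffices here, and the first coefficient of the resulting P' is zero as announced.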
Factoring T, P out
Let's now consider matrices of the form
M = \begin{pmatrix}
A & T \\
P & 1
\end{pmatrix}
We note that scales, rotations, perspectives and skews
all belong to the subset of matrices for which
T = 0 (block lower triangular)
and this subset is stable under composition,
inversion and multiplication by a nonzero scalar.
However, it is easy to see that:
M ={
\begin{pmatrix}
I_3 & T \\
0 & 1
\end{pmatrix}
\begin{pmatrix}
A - TP & 0 \\
P & 1
\end{pmatrix}}
where TP is the 3 \times 3 matrix obtained by multiplying the column vector T by the row vector P.
Then scales, rotations and skews all belong to the set of matrices
for which T = P = 0 (block diagonal)
and this subset is stable under composition,
inversion and multiplication by a nonzero scalar.
Again, one can easily check that
M ={
\underset{\text{translation}}{\underbrace{
\begin{pmatrix}
I_3 & T \\
0 & 1
\end{pmatrix}}}
\begin{pmatrix}
A - TP & 0 \\
0 & 1
\end{pmatrix}
\begin{pmatrix}
I_3 & 0 \\
P & 1
\end{pmatrix}}
which gives a straightforward extraction of the translation part.
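Let's check this block factorization numerically. The sketch below uses arbitrary example values and writes the middle block as A - TP (the product of the column vector T by the row vector P), which is what makes the identity hold:

```python
def matmul(X, Y):
    # Product of two 4x4 matrices stored as lists of rows.
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Arbitrary example values for A, T and P.
A = [[1, 2, 0], [0, 1, 3], [0, 0, 1]]
T = [4, 5, 6]
P = [7, 8, 9]
M = [A[0] + [T[0]], A[1] + [T[1]], A[2] + [T[2]], P + [1]]

# Middle factor uses A' = A - T P (outer product of column T and row P).
Ap = [[A[i][j] - T[i] * P[j] for j in range(3)] for i in range(3)]
translation = [[1, 0, 0, T[0]], [0, 1, 0, T[1]], [0, 0, 1, T[2]], [0, 0, 0, 1]]
middle = [Ap[0] + [0], Ap[1] + [0], Ap[2] + [0], [0, 0, 0, 1]]
perspective = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], P + [1]]
```

With integer inputs the product of the three factors reproduces M exactly.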
If P = {(p_x, p_y, p_z)} \neq 0 then at least one of the following
matrices is well-defined:
These matrices have determinant 1 since they can be obtained by
permuting the rows (or columns) of a triangular matrix with only 1's
on the diagonal. Moreover, one can check that
Using the QR factorization, one can find an
orthogonal matrix Q and
an upper triangular matrix R such that A = QR.
There are several ways to calculate this decomposition e.g.
using the Gram–Schmidt process or
Givens rotations. Our algorithm will use the
Householder procedure which is simple, numerically stable
and works for a non-invertible matrix A.
Since our matrix is small, the decomposition only requires at
most two reflections and one scale.
As a quick reminder, one can
define for any unit vector v
(i.e. v^\mathrm T v = 1) the matrix
P_v = I_3 - 2v v^{\mathrm T}. One can verify that
The first two equalities show that P_v is the reflection
about the plane orthogonal to v and that its eigenvalues
are 1 (with multiplicity 2) and -1 (with multiplicity 1),
so \det P_v = -1. The last two equalities show
that P_v is an orthogonal matrix.
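These properties are easy to check numerically; here is a Python sketch with an arbitrarily chosen unit vector v:

```python
# Numerical check of the properties of P_v = I_3 - 2 v v^T.
v = [1 / 3, 2 / 3, 2 / 3]                      # v^T v = (1 + 4 + 4) / 9 = 1
I3 = [[float(i == j) for j in range(3)] for i in range(3)]
P = [[I3[i][j] - 2 * v[i] * v[j] for j in range(3)] for i in range(3)]

# P_v v = -v: v is reflected through the plane orthogonal to it.
Pv = [sum(P[i][j] * v[j] for j in range(3)) for i in range(3)]
assert all(abs(Pv[i] + v[i]) < 1e-12 for i in range(3))

# P_v^2 = I_3: P_v is an involution, and being symmetric it is orthogonal.
P2 = [[sum(P[i][k] * P[k][j] for k in range(3)) for j in range(3)]
      for i in range(3)]
assert all(abs(P2[i][j] - I3[i][j]) < 1e-12 for i in range(3) for j in range(3))

# det P_v = -1: eigenvalues 1, 1, -1.
det = (P[0][0] * (P[1][1] * P[2][2] - P[1][2] * P[2][1])
     - P[0][1] * (P[1][0] * P[2][2] - P[1][2] * P[2][0])
     + P[0][2] * (P[1][0] * P[2][1] - P[1][1] * P[2][0]))
assert abs(det + 1) < 1e-12
```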
Let's start with the first step
of Householder's algorithm.
Let x_1 =
\begin{pmatrix}
a_{11} \\
a_{12} \\
a_{13} \\
\end{pmatrix}. Suppose one of
a_{12} or a_{13} is nonzero. Then
let
\alpha_1 = \pm \sqrt{x_1^{\mathrm T} x_1} \neq 0 with
the sign chosen to maximize the absolute value of
\alpha_1 - a_{11} (and in particular make it nonzero). Then
u_1 = x_1 - \begin{pmatrix}
\alpha_1 \\
0 \\
0 \\
\end{pmatrix} satisfies
u_1^{\mathrm T} u_1 = 2 \alpha_1 {(\alpha_1 - a_{11})} \neq 0
and the unitary vector
v_1 = \frac{u_1}{\sqrt{u_1^{\mathrm T} u_1}} satisfies:
with Q_1 = P_{v_1}.
If a_{12} = a_{13} = 0 then one can instead choose
Q_1 = I_3 and \alpha_1 = a_{11} to
obtain the same form.
Similarly, if b_{23} \neq 0 then we consider
x_2 =
\begin{pmatrix}
0 \\
b_{22} \\
b_{23} \\
\end{pmatrix},
\alpha_2 = \pm \sqrt{x_2^{\mathrm T} x_2} \neq 0 with
the sign chosen to maximize \left|\alpha_2 - b_{22}\right|,
u_2 = x_2 - \begin{pmatrix}
0 \\
\alpha_2 \\
0 \\
\end{pmatrix} and finally
v_2 = \frac{u_2}{\sqrt{u_2^{\mathrm T} u_2}} and
Q_2 = P_{v_2}. If
b_{23} = 0 then we consider instead
Q_2 = I_3 and \alpha_2 = b_{22}.
In any case, one can write:
Finally, one can define the orthogonal matrix
Q_3 = \begin{pmatrix}
\det{(Q_2 Q_1)} \epsilon_2 \epsilon_3 & 0 & 0 \\
0 & \epsilon_2 & 0 \\
0 & 0 & \epsilon_3 \\
\end{pmatrix}
with \epsilon_2, \epsilon_3 \in {\{-1, 1\}} such that
\epsilon_3 = 1 if and only if
c_{33} \geq 0 and
\epsilon_2 = 1 if and only if
\alpha_2 \geq 0. Then we obtain
where r_{22}, r_{33} are nonnegative.
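The whole procedure can be sketched in Python as follows (qr3 and reflection are hypothetical helper names; matrices are stored as lists of rows, and the degenerate cases fall back to the identity as described above):

```python
import math

def reflection(x, i):
    # Householder matrix P_v annihilating the entries of x below index i
    # (returns I_3 when those entries are already zero).
    alpha = math.sqrt(sum(c * c for c in x[i:]))
    if x[i] >= 0:                 # sign chosen to maximize |alpha - x_i|
        alpha = -alpha
    u = [0.0] * i + [x[i] - alpha] + x[i + 1:]
    norm2 = sum(c * c for c in u)
    if norm2 == 0:
        return [[float(r == c) for c in range(3)] for r in range(3)]
    return [[float(r == c) - 2 * u[r] * u[c] / norm2 for c in range(3)]
            for r in range(3)]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def qr3(A):
    Q1 = reflection([A[r][0] for r in range(3)], 0)   # zero first column below diagonal
    B = mul(Q1, A)
    Q2 = reflection([B[r][1] for r in range(3)], 1)   # zero second column below diagonal
    C = mul(Q2, B)
    # Q3 makes the last two diagonal entries of R nonnegative while
    # keeping det(Q3 Q2 Q1) = 1.
    e2 = 1.0 if C[1][1] >= 0 else -1.0
    e3 = 1.0 if C[2][2] >= 0 else -1.0
    Q3 = [[det3(mul(Q2, Q1)) * e2 * e3, 0.0, 0.0],
          [0.0, e2, 0.0],
          [0.0, 0.0, e3]]
    # Reflections are symmetric, so Q = (Q3 Q2 Q1)^{-1} = Q1 Q2 Q3.
    return mul(Q1, mul(Q2, Q3)), mul(Q3, C)

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
Q, R = qr3(A)
```

The returned Q has determinant 1 by construction and R is upper triangular with nonnegative last two diagonal entries.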
By construction,
Q = Q_1 Q_2 Q_3 is an orthogonal matrix of
determinant 1 and so is a
3D rotation,
which gives the following factorization:
with \det Q = 1.
Such a matrix corresponds to a 3D rotation and can thus be described by a nonzero
vector (x, y, z)
and an angle \alpha. As described in
the table above, it will then
be written
Q = R(x, y, z, \alpha) from the normalized rotation axis
{(X, Y, Z)} = {\frac{1}{\sqrt{x^2+y^2+z^2}} {(x, y, z)}}
as well as
C = \cos \left(\frac{\alpha}{2}\right) and
S = \sin \left(\frac{\alpha}{2}\right).
The trace of the rotation matrix makes it easy to obtain the angle from
the formula
\alpha = \arccos\left(\frac{{{\operatorname{tr}} {Q}} - 1}{2}\right)
\in {[0, \pi]}
.
Indeed, we have:
If additionally \alpha < \pi, then
C \neq 0 and equalities (4-6) provide the signs of
X, Y, Z. Otherwise, one can arbitrarily choose a sign for
one of them and the signs of the others are implied by equalities (7-8).
In computers, 3D rotations are often just encoded as a
quaternion
with 4 coordinates (C, XS, YS, ZS)
and it is easy to rebuild the matrix
Q = R(x, y, z, \alpha) from these values
using the formula of the table above.
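As a sketch, here is how the angle, axis and quaternion can be recovered in Python in the generic case 0 < \alpha < \pi (the helper name axis_angle is an assumption, and the degenerate cases \alpha = 0 and \alpha = \pi need the special-casing described above):

```python
import math

def axis_angle(Q):
    # Recover (axis, angle) from a 3x3 rotation matrix, assuming the
    # generic case 0 < alpha < pi so that sin(alpha) != 0.
    t = (Q[0][0] + Q[1][1] + Q[2][2] - 1) / 2    # (tr Q - 1) / 2
    alpha = math.acos(max(-1.0, min(1.0, t)))
    s = 2 * math.sin(alpha)
    # The antisymmetric part of Q gives the (normalized) rotation axis.
    return ((Q[2][1] - Q[1][2]) / s,
            (Q[0][2] - Q[2][0]) / s,
            (Q[1][0] - Q[0][1]) / s), alpha

# Example: rotation of 90 degrees about the z-axis.
(X, Y, Z), alpha = axis_angle([[0.0, -1.0, 0.0],
                               [1.0, 0.0, 0.0],
                               [0.0, 0.0, 1.0]])
C, S = math.cos(alpha / 2), math.sin(alpha / 2)
quaternion = (C, X * S, Y * S, Z * S)
```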
Wrapping up: Factorization of M
Summarizing what we have done so far, we are able to describe the
transformation matrix in two ways. If the last row of the matrix is
zero then it can be written:
v \mapsto \begin{pmatrix}
QRv + T \\
0
\end{pmatrix}
The image is well defined if
Rv \neq -Q^{\mathrm T} T
and is the point at infinity in the direction
QRv + T. If R (or equivalently
A) is invertible then clearly there is exactly one
vector with undefined image. The general case is more complicated to
express concisely, but is equivalent to solving the following triangular
system:
with \lambda \in {\mathbb R}^*
and \Sigma^{-n} is the matrix shifting
n \in {\{0,1,2,3\}} times the columns
of its left operand.
In particular
\det M = {{(-1)}^n \lambda s_{11} s_{22} s_{33} u_{11} u_{22} u_{33}}
so the
matrix is invertible if and only if all the diagonal coefficients
of the scale and unscaled factors are nonzero.
Decomposition of non-general 3D matrices
Except for matrix3d all
the basic CSS transforms correspond to a matrix
M = \begin{pmatrix}
A & T \\
P & 1
\end{pmatrix}
whose last row
is (0, 0, p_z, 1).
So our decomposition algorithm will always
choose the second way and yield trivial factors
\lambda = 1 and n = 0.
The translation and "perspective" factors are just given by the original
T, P.
The rest of the factors are obtained by factorization of
A.
Decomposing A is less obvious. For example,
rotateZ(180°) scale(1, 1, 2) rotates
the x-axis and y-axis by 180° (either clockwise or anticlockwise)
and scales the z-axis by 2, and so is equivalent to
scale(-1, -1, 2). The same example shows
that the QR decomposition is not unique, so in theory
we cannot just say that our algorithm will choose a factorization once
we find one that works...
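Let's verify this example numerically, multiplying the 3×3 linear parts of the two transforms:

```python
# rotateZ(180°) followed by scale(1, 1, 2) equals scale(-1, -1, 2)
# (comparing the 3x3 linear parts only).
rot_z_180 = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]
scale_112 = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
product = [[sum(rot_z_180[i][k] * scale_112[k][j] for k in range(3))
            for j in range(3)] for i in range(3)]
assert product == [[-1, 0, 0], [0, -1, 0], [0, 0, 2]]  # i.e. scale(-1, -1, 2)
```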
For perspective() and
translate3d() (and their aliases),
A = I_3 is upper triangular so our algorithm will
yield Q_1 = Q_2 = I_3,
{\det {(Q_1 Q_2)}} = 1 and
Q_2 Q_1 I_3 = I_3.
Then Q_3 = I_3 and so
Q = Q_1 Q_2 Q_3 = I_3 and
R = Q^{\mathrm T} A = I_3. The scale is then
S = I_3 and the unscaled upper triangular factor is also
U = I_3. So these two are indeed just decomposed into
a single perspective or translation factor.
For rotate3D() (and its aliases)
the algorithm will yield an upper triangular matrix
Q_2 Q_1 A,
which is additionally orthogonal in that case
(because A is a rotation). This means that its
eigenvalues are both its diagonal elements (real numbers) and complex
numbers of norm 1,
that is they belong to \{-1, 1\}. Additionally,
the inverse is both the transpose and an upper triangular matrix
so the matrix is actually diagonal.
Finally its
determinant is 1, so
\alpha_2, c_{33} \in {\{-1, 1\}},
c_{21} = c_{31} = c_{32} = 0
and \alpha_1 = \alpha_2 c_{33}.
The algorithm will choose
\epsilon_3, \epsilon_2 such that
Q_3 is this diagonal matrix.
From Q_3 = Q_2 Q_1 A we get
Q = A, R = I_3 and so
S=U=I_3.
Again, this is decomposed into a single rotation factor.
For scale3d() (and its aliases), the matrix
is diagonal and in particular upper triangular so our algorithm will
yield
Q_1 = Q_2 = I_3 and
\det{(Q_1 Q_2)} = 1. Then Q_3 is defined
according to the signs of s_y, s_z
(choosing the diagonal element to be 1 if the corresponding scale
element is zero) to provide the
following QR decomposition:
It is decomposed into a single scale factor if and only if
s_x \neq 0 and s_y, s_z > 0.
Decomposition of 2D matrices
For a 2D transform matrix(a, b, c, d, e, f)
(including skews, 2D translations, 2D scales, 2D rotations)
as explained in the previous section,
e, f are extracted into the translation component, which
is actually a 2D translation. The remaining part is then
A=
\begin{pmatrix}
a & c & 0 \\
b & d & 0 \\
0 & 0 & 1 \\
\end{pmatrix}
If b = 0 (e.g. for translations, scales and
skewX) then the matrix
is upper triangular and the algorithm calculates
Q_1 = Q_2 = I_3,
\epsilon_3 = 1 and
\epsilon_2 chosen to match the sign of
d. This gives the following QR
decomposition:
If b \neq 0 (e.g. rotations of angle other than a
multiple of 180°) then
\alpha_1 = \pm \sqrt{a^2 + b^2} \neq 0
with the sign chosen to maximize the absolute value of
\alpha_1 - a. In any case, one can
verify that:
This is an upper triangular matrix and so Q_2 = I_3.
Then \epsilon_3 = 1 is chosen
and \epsilon_2 is chosen with the same sign as
\mp {\det(A)}
(1 if the determinant is zero). The decomposition becomes:
where \epsilon is chosen with the same sign as
\det{(A)} (1 if the determinant is zero).
Note that this is slightly different from
the decomposition provided in this
blog post which uses Gram–Schmidt to perform the QR
factorization.
To summarize,
matrix(a, b, c, d, e, f)
is decomposed as a product of a 2D-translation,
a 2D-rotation, a 2D-scale and an unscaled factor with
u_{33} = 1, u_{31} = u_{32} = 0 (i.e. a 2D transform).
If the matrix is invertible,
then the unscaled factor can be written as a skewX.
Per previous section, a 2D-translation or 2D-rotation is decomposed
into a single 2D-translation or 2D-rotation respectively.
A 2D-scale is decomposed
as a product of a 2D-rotation and a 2D-scale (with the rotation omitted
if the y-scale is nonnegative).
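To conclude, here is a Python sketch of this 2D decomposition (the function name and the returned conventions are assumptions; it requires an invertible matrix, and its sign choices may differ from the Householder-based ones above):

```python
import math

def decompose_2d(a, b, c, d, e, f):
    # Decompose matrix(a, b, c, d, e, f) into a 2D translation, a 2D
    # rotation, a 2D scale and a skewX angle, assuming ad - bc != 0.
    det = a * d - b * c
    r = math.hypot(a, b)                           # length of the first column
    translation = (e, f)
    rotation = math.atan2(b, a)                    # 2D rotation angle (radians)
    scale = (r, det / r)                           # 2D scale factors
    skew_x = math.atan((a * c + b * d) / (r * r))  # skewX angle (radians)
    return translation, rotation, scale, skew_x

# Example: a rotation of 90 degrees composed with a translation,
# i.e. matrix(0, 1, -1, 0, 5, 6).
t, rot, sc, sk = decompose_2d(0.0, 1.0, -1.0, 0.0, 5.0, 6.0)
```

One can check that recomposing translate(e, f) rotate(rotation) scale(sx, sy) skewX(skew_x) reproduces the original six coefficients.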