In mathematics, an eigenvalue perturbation problem is that of finding the eigenvectors and eigenvalues of a system that is perturbed from one with known eigenvectors and eigenvalues. This is useful for studying how sensitive the original system's eigenvectors and eigenvalues are to changes in the system. This type of analysis was popularized by Lord Rayleigh, in his investigation of harmonic vibrations of a string perturbed by small inhomogeneities.[1]
The derivations in this article are essentially self-contained and can be found in many texts on numerical linear algebra[2] or numerical functional analysis.
We assume that the matrices are symmetric and positive definite, and assume we have scaled the eigenvectors such that
$$\mathbf{x}_{0j}^\top \mathbf{M}_0\mathbf{x}_{0i} = \delta_{ij} \qquad(2)$$

where $\delta_{ij}$ is the Kronecker delta. Now we want to solve the equation
$$\mathbf{K}\mathbf{x}_i = \lambda_i \mathbf{M} \mathbf{x}_i.$$
Substituting, we get
$$(\mathbf{K}_0+\delta \mathbf{K})(\mathbf{x}_{0i} + \delta \mathbf{x}_{i}) = \left (\lambda_{0i}+\delta\lambda_{i} \right ) \left (\mathbf{M}_0+ \delta \mathbf{M} \right ) \left (\mathbf{x}_{0i}+\delta\mathbf{x}_{i} \right ),$$
which expands to
$$\begin{align}
\mathbf{K}_0\mathbf{x}_{0i} + \delta \mathbf{K}\mathbf{x}_{0i} + \mathbf{K}_0\delta \mathbf{x}_i + \delta \mathbf{K}\delta \mathbf{x}_i ={}& \lambda_{0i}\mathbf{M}_0\mathbf{x}_{0i}+\lambda_{0i}\mathbf{M}_0\delta\mathbf{x}_i + \lambda_{0i} \delta \mathbf{M} \mathbf{x}_{0i} +\delta\lambda_i\mathbf{M}_0\mathbf{x}_{0i} \\
&+ \lambda_{0i} \delta \mathbf{M} \delta\mathbf{x}_i + \delta\lambda_i \delta \mathbf{M}\mathbf{x}_{0i} + \delta\lambda_i\mathbf{M}_0\delta\mathbf{x}_i + \delta\lambda_i \delta \mathbf{M} \delta\mathbf{x}_i.
\end{align}$$
Canceling the unperturbed equation (1), $\mathbf{K}_0\mathbf{x}_{0i} = \lambda_{0i}\mathbf{M}_0\mathbf{x}_{0i}$, from both sides leaves
$$\begin{align}
\delta \mathbf{K} \mathbf{x}_{0i} + \mathbf{K}_0\delta \mathbf{x}_i + \delta \mathbf{K}\delta \mathbf{x}_i ={}& \lambda_{0i}\mathbf{M}_0\delta\mathbf{x}_i + \lambda_{0i} \delta \mathbf{M} \mathbf{x}_{0i} + \delta\lambda_i\mathbf{M}_0\mathbf{x}_{0i} \\
&+ \lambda_{0i} \delta \mathbf{M} \delta\mathbf{x}_i + \delta\lambda_i \delta \mathbf{M} \mathbf{x}_{0i} + \delta\lambda_i\mathbf{M}_0\delta\mathbf{x}_i + \delta\lambda_i \delta \mathbf{M} \delta\mathbf{x}_i.
\end{align}$$
Removing the higher-order terms, this simplifies to
$$\mathbf{K}_0 \delta\mathbf{x}_i+ \delta \mathbf{K} \mathbf{x}_{0i} = \lambda_{0i}\mathbf{M}_0 \delta \mathbf{x}_i + \lambda_{0i}\delta \mathbf{M} \mathbf{x}_{0i} + \delta \lambda_i \mathbf{M}_0\mathbf{x}_{0i}. \qquad(3)$$
Because the matrices are symmetric, the unperturbed eigenvectors are $\mathbf{M}_0$-orthogonal, so we can use them as a basis for the perturbed eigenvectors. That is, we want to construct
$$\delta \mathbf{x}_i = \sum_{j=1}^N \varepsilon_{ij} \mathbf{x}_{0j} \qquad (4)$$
where the $\varepsilon_{ij}$ are small constants that are to be determined. Substituting (4) into (3) and rearranging gives
$$\begin{align}
\mathbf{K}_0 \sum_{j=1}^N \varepsilon_{ij} \mathbf{x}_{0j} + \delta \mathbf{K} \mathbf{x}_{0i} &= \lambda_{0i} \mathbf{M}_0 \sum_{j=1}^N \varepsilon_{ij} \mathbf{x}_{0j} + \lambda_{0i} \delta \mathbf{M} \mathbf{x}_{0i} + \delta\lambda_i \mathbf{M}_0\mathbf{x}_{0i} && (5) \\
\sum_{j=1}^N \varepsilon_{ij} \mathbf{K}_0 \mathbf{x}_{0j} + \delta \mathbf{K} \mathbf{x}_{0i} &= \lambda_{0i} \mathbf{M}_0 \sum_{j=1}^N \varepsilon_{ij} \mathbf{x}_{0j} + \lambda_{0i} \delta \mathbf{M} \mathbf{x}_{0i} + \delta\lambda_i \mathbf{M}_0 \mathbf{x}_{0i} && \text{applying } \mathbf{K}_0 \text{ to the sum} \\
\sum_{j=1}^N \varepsilon_{ij} \lambda_{0j} \mathbf{M}_0 \mathbf{x}_{0j} + \delta \mathbf{K} \mathbf{x}_{0i} &= \lambda_{0i} \mathbf{M}_0 \sum_{j=1}^N \varepsilon_{ij} \mathbf{x}_{0j} + \lambda_{0i} \delta \mathbf{M} \mathbf{x}_{0i} + \delta\lambda_i \mathbf{M}_0 \mathbf{x}_{0i} && \text{using Eq. } (1)
\end{align}$$
Because the eigenvectors are $\mathbf{M}_0$-orthogonal when $\mathbf{M}_0$ is positive definite, we can remove the summations by left-multiplying by $\mathbf{x}_{0i}^\top$:

$$\mathbf{x}_{0i}^\top \varepsilon_{ii} \lambda_{0i} \mathbf{M}_0 \mathbf{x}_{0i} + \mathbf{x}_{0i}^\top \delta \mathbf{K} \mathbf{x}_{0i} = \lambda_{0i} \mathbf{x}_{0i}^\top \mathbf{M}_0 \varepsilon_{ii} \mathbf{x}_{0i} + \lambda_{0i}\mathbf{x}_{0i}^\top \delta \mathbf{M} \mathbf{x}_{0i} + \delta\lambda_i\mathbf{x}_{0i}^\top \mathbf{M}_0 \mathbf{x}_{0i}.$$
By use of equation (1) again:
$$\mathbf{x}_{0i}^\top \mathbf{K}_0 \varepsilon_{ii} \mathbf{x}_{0i} + \mathbf{x}_{0i}^\top \delta \mathbf{K} \mathbf{x}_{0i} = \lambda_{0i} \mathbf{x}_{0i}^\top \mathbf{M}_0\varepsilon_{ii} \mathbf{x}_{0i} + \lambda_{0i}\mathbf{x}_{0i}^\top \delta \mathbf{M}\mathbf{x}_{0i} + \delta\lambda_i\mathbf{x}_{0i}^\top \mathbf{M}_0 \mathbf{x}_{0i}. \qquad (6)$$
The two terms containing $\varepsilon_{ii}$ are equal because left-multiplying (1) by $\mathbf{x}_{0i}^\top$ gives

$$\mathbf{x}_{0i}^\top\mathbf{K}_0\mathbf{x}_{0i} = \lambda_{0i}\mathbf{x}_{0i}^\top \mathbf{M}_0 \mathbf{x}_{0i}.$$
Canceling those terms in (6) leaves
$$\mathbf{x}_{0i}^\top \delta \mathbf{K} \mathbf{x}_{0i} = \lambda_{0i} \mathbf{x}_{0i}^\top \delta \mathbf{M} \mathbf{x}_{0i} + \delta\lambda_i \mathbf{x}_{0i}^\top \mathbf{M}_0\mathbf{x}_{0i}.$$
Rearranging gives
$$\delta\lambda_i = \frac{\mathbf{x}^\top_{0i} \left (\delta \mathbf{K}- \lambda_{0i} \delta \mathbf{M} \right )\mathbf{x}_{0i}}{\mathbf{x}_{0i}^\top\mathbf{M}_0 \mathbf{x}_{0i}}.$$
But by (2), this denominator is equal to 1. Thus
$$\delta\lambda_i = \mathbf{x}^\top_{0i} \left (\delta \mathbf{K} - \lambda_{0i} \delta \mathbf{M} \right )\mathbf{x}_{0i}.$$
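This first-order formula can be sanity-checked numerically against an exact solve of the perturbed generalized eigenproblem. The sketch below assumes NumPy; the matrices are arbitrary small symmetric positive-definite examples with well-separated eigenvalues, not taken from the text:

```python
import numpy as np

def gen_eigh(K, M):
    """Solve K x = lam M x for symmetric K and SPD M, returning ascending
    eigenvalues and M-orthonormal eigenvectors (x^T M x = 1, as in (2))."""
    L = np.linalg.cholesky(M)
    Linv = np.linalg.inv(L)
    lam, Y = np.linalg.eigh(Linv @ K @ Linv.T)
    return lam, Linv.T @ Y

# Arbitrary small SPD pencil plus tiny symmetric perturbations dK, dM
K0 = np.array([[2.0, 0.3, 0.0], [0.3, 3.0, 0.4], [0.0, 0.4, 5.0]])
M0 = np.array([[1.0, 0.1, 0.0], [0.1, 1.0, 0.1], [0.0, 0.1, 1.0]])
dK = 1e-6 * np.array([[1.0, 2.0, 0.5], [2.0, -1.0, 1.0], [0.5, 1.0, 3.0]])
dM = 1e-6 * np.array([[0.5, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, -0.5]])

lam0, X0 = gen_eigh(K0, M0)
lam, _ = gen_eigh(K0 + dK, M0 + dM)

# First-order prediction: dlam_i = x_0i^T (dK - lam_0i dM) x_0i
dlam_pred = np.array([X0[:, i] @ (dK - lam0[i] * dM) @ X0[:, i]
                      for i in range(3)])
err = np.max(np.abs((lam - lam0) - dlam_pred))
print(err)  # second order in the perturbation, far below 1e-6
```

The helper reduces the generalized problem to a standard one via the Cholesky factor of $\mathbf{M}_0$, which also delivers the $\mathbf{M}_0$-orthonormal scaling assumed in (2).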
Then, left-multiplying equation (5) by $\mathbf{x}_{0k}^\top$ with $k \neq i$ and solving for $\varepsilon_{ik}$ gives:

$$\varepsilon_{ik} = \frac{\mathbf{x}^\top_{0k} \left (\delta \mathbf{K} - \lambda_{0i}\delta \mathbf{M} \right )\mathbf{x}_{0i}}{\lambda_{0i}-\lambda_{0k}}, \qquad i\neq k.$$
Or by changing the name of the indices:
$$\varepsilon_{ij} = \frac{\mathbf{x}^\top_{0j} \left (\delta \mathbf{K} - \lambda_{0i} \delta \mathbf{M} \right )\mathbf{x}_{0i}}{\lambda_{0i}-\lambda_{0j}}, \qquad i\neq j.$$
To find $\varepsilon_{ii}$, use the normalization of the perturbed eigenvector,

$$\mathbf{x}^\top_i \mathbf{M} \mathbf{x}_i = 1,$$

which, expanded to first order in the perturbations, implies:

$$\varepsilon_{ii}=-\tfrac{1}{2}\mathbf{x}^\top_{0i} \delta \mathbf{M} \mathbf{x}_{0i}.$$
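With both $\varepsilon_{ij}$ ($j \neq i$) and $\varepsilon_{ii}$ in hand, the eigenvector correction (4) can also be checked numerically. As before, the sketch assumes NumPy and uses arbitrary small SPD example matrices:

```python
import numpy as np

def gen_eigh(K, M):
    """Solve K x = lam M x, returning ascending eigenvalues and
    M-orthonormal eigenvectors (x^T M x = 1)."""
    L = np.linalg.cholesky(M)
    Linv = np.linalg.inv(L)
    lam, Y = np.linalg.eigh(Linv @ K @ Linv.T)
    return lam, Linv.T @ Y

K0 = np.array([[2.0, 0.3, 0.0], [0.3, 3.0, 0.4], [0.0, 0.4, 5.0]])
M0 = np.array([[1.0, 0.1, 0.0], [0.1, 1.0, 0.1], [0.0, 0.1, 1.0]])
dK = 1e-6 * np.array([[1.0, 2.0, 0.5], [2.0, -1.0, 1.0], [0.5, 1.0, 3.0]])
dM = 1e-6 * np.array([[0.5, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, -0.5]])

lam0, X0 = gen_eigh(K0, M0)
lam, X = gen_eigh(K0 + dK, M0 + dM)

i = 1  # any index works here; the spectrum is well separated
# eps_ii from the normalization; eps_ij from the formula above
eps = np.array([-0.5 * X0[:, i] @ dM @ X0[:, i] if j == i
                else X0[:, j] @ (dK - lam0[i] * dM) @ X0[:, i]
                     / (lam0[i] - lam0[j])
                for j in range(3)])
dx_pred = X0 @ eps  # delta x_i = sum_j eps_ij x_0j, i.e. equation (4)

# Eigenvectors carry an arbitrary sign; align the perturbed one with
# x_0i before comparing the actual change against the prediction.
xi = X[:, i] * np.sign(X0[:, i] @ M0 @ X[:, i])
err = np.max(np.abs((xi - X0[:, i]) - dx_pred))
print(err)  # second order in the perturbation
```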
This means it is possible to efficiently do a sensitivity analysis on $\lambda_i$ as a function of changes in the entries of the matrices. (Recall that the matrices are symmetric, so changing $\mathbf{K}_{(k\ell)}$ will also change $\mathbf{K}_{(\ell k)}$; hence the $(2 - \delta_{k\ell})$ term.)
$$\begin{align}
\frac{\partial \lambda_i}{\partial \mathbf{K}_{(k\ell)}} &= \frac{\partial}{\partial \mathbf{K}_{(k\ell)}}\left(\lambda_{0i} + \mathbf{x}^\top_{0i} \left (\delta \mathbf{K} - \lambda_{0i} \delta \mathbf{M} \right ) \mathbf{x}_{0i} \right) = x_{0i(k)} x_{0i(\ell)} \left (2 - \delta_{k\ell} \right ) \\
\frac{\partial \lambda_i}{\partial \mathbf{M}_{(k\ell)}} &= \frac{\partial}{\partial \mathbf{M}_{(k\ell)}}\left(\lambda_{0i} + \mathbf{x}^\top_{0i} \left (\delta \mathbf{K} - \lambda_{0i} \delta \mathbf{M} \right ) \mathbf{x}_{0i}\right) = -\lambda_{0i} x_{0i(k)} x_{0i(\ell)} \left (2- \delta_{k\ell} \right ).
\end{align}$$
Similarly,

$$\begin{align}
\frac{\partial\mathbf{x}_i}{\partial \mathbf{K}_{(k\ell)}} &= \sum_{j=1\atop j\neq i}^N \frac{x_{0j(k)} x_{0i(\ell)} \left (2-\delta_{k\ell} \right )}{\lambda_{0i}-\lambda_{0j}}\mathbf{x}_{0j} \\
\frac{\partial \mathbf{x}_i}{\partial \mathbf{M}_{(k\ell)}} &= -\mathbf{x}_{0i}\frac{x_{0i(k)}x_{0i(\ell)}}{2}(2-\delta_{k\ell}) - \sum_{j=1\atop j\neq i}^N \frac{\lambda_{0i}x_{0j(k)} x_{0i(\ell)}}{\lambda_{0i}-\lambda_{0j}}\mathbf{x}_{0j} \left (2-\delta_{k\ell} \right ).
\end{align}$$
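The eigenvalue sensitivities above can be checked against finite differences: perturb the symmetric entry pair $(k, \ell)$ of $\mathbf{K}_0$ or $\mathbf{M}_0$ by a small step and compare the eigenvalue change with $x_{0i(k)} x_{0i(\ell)}(2-\delta_{k\ell})$ and $-\lambda_{0i} x_{0i(k)} x_{0i(\ell)}(2-\delta_{k\ell})$ respectively (the minus sign follows from $\delta\lambda_i = \mathbf{x}^\top_{0i}(\delta\mathbf{K} - \lambda_{0i}\delta\mathbf{M})\mathbf{x}_{0i}$). A sketch assuming NumPy, with arbitrary SPD example matrices:

```python
import numpy as np

def gen_eigh(K, M):
    """Solve K x = lam M x, returning ascending eigenvalues and
    M-orthonormal eigenvectors."""
    L = np.linalg.cholesky(M)
    Linv = np.linalg.inv(L)
    lam, Y = np.linalg.eigh(Linv @ K @ Linv.T)
    return lam, Linv.T @ Y

K0 = np.array([[2.0, 0.3, 0.0], [0.3, 3.0, 0.4], [0.0, 0.4, 5.0]])
M0 = np.array([[1.0, 0.1, 0.0], [0.1, 1.0, 0.1], [0.0, 0.1, 1.0]])

lam0, X0 = gen_eigh(K0, M0)

h, k, l = 1e-6, 0, 2
E = np.zeros((3, 3))
E[k, l] = E[l, k] = h  # symmetry: both entries change together

# Finite-difference eigenvalue sensitivities, all i at once
fd_K = (gen_eigh(K0 + E, M0)[0] - lam0) / h
fd_M = (gen_eigh(K0, M0 + E)[0] - lam0) / h

# Closed-form predictions from the perturbation formulas
pred_K = X0[k, :] * X0[l, :] * (2 - (k == l))
pred_M = -lam0 * pred_K
err = max(np.max(np.abs(fd_K - pred_K)), np.max(np.abs(fd_M - pred_M)))
print(err)  # O(h): dominated by finite-difference truncation
```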