
Noether’s theorem in classical mechanics

2022-09-13T00:00:00+00:00

Introduction

Noether’s (first) theorem is one of the most important theorems in physics. It relates well-known conserved quantities such as energy, momentum, and charge to fundamental symmetries of our equations of motion. It plays a role in classical mechanics as it does elsewhere in physics, but many textbooks treat time/energy conservation separately from momentum conservation and gloss over the most general proof, which means they miss cases where time and space must transform together to reveal a symmetry.

This post is meant to bridge that gap and provide a foundation to use Noether’s theorem in more areas of physics.

General statement

The statement is built on the idea of the action principle, and we will use the specific form

\[\begin{equation} S[q] = \int dt \, L(q_1, \ldots, q_s, \dot q_1, \ldots, \dot q_s; t) \end{equation}\]

For simplicity, we will suppress the labels 1 through $s$ and write $\begin{equation} S[q] = \int dt \, L(q, \dot q ; t). \end{equation}$ The general statement of a symmetry is that $S[q] = S[\sigma(q, \alpha)]$ for any $\alpha$, up to boundary terms (terms which depend exclusively on $q$ or $\dot q$ at the initial or final times). The function $\sigma(q,\alpha)$ represents our symmetry transformation (it could be rotations with $\alpha = \theta$, time translations, spatial translations, or something even more complicated). Importantly, we define $\sigma(q, 0) = q$, so that we can expand $\sigma(q, \alpha) = q + \frac{\partial \sigma}{\partial \alpha}\big\rvert_{\alpha = 0} \alpha + \cdots$, allowing us to make infinitesimal symmetry transformations. Writing out the action explicitly,

\[\begin{equation} \int dt \, L(q, \dot q; t) = \int dt \, L\bigg(\sigma(q, \alpha), \frac{d}{dt} \sigma(q, \alpha); t\bigg). \end{equation}\]

For convenience, we define $L_\alpha \equiv L(\sigma(q, \alpha), \frac{d}{dt} \sigma(q, \alpha); t)$. Expanding the right-hand side in powers of $\alpha$ gives $\begin{equation} \int dt \, L(q, \dot q; t) = \int dt \, \bigg[L(q, \dot q; t) + \alpha \frac{\partial L_\alpha}{\partial \alpha}\Big\rvert_{\alpha = 0} + \cdots \bigg]. \end{equation}$

For $\sigma$ to be a symmetry, the two sides can differ only by a boundary term, which implies there is a function $\Lambda(q, \dot q; t)$ such that $\begin{equation} \frac{\partial L_\alpha}{\partial \alpha}\Big\rvert_{\alpha = 0} = \bigg[ \frac{\partial L}{\partial q} \frac{\partial \sigma}{\partial \alpha} + \frac{\partial L}{\partial \dot q} \frac{d}{dt}\frac{\partial \sigma}{\partial \alpha} \bigg]_{\alpha=0}= \frac{d \Lambda(q, \dot q; t)}{dt}. \label{eq:dLambda} \end{equation}$ This is what it means, in general, for a Lagrangian to have a (continuous) symmetry. For the action to have a symmetry and produce the same equations of motion, the Lagrangian can change by at most a total time derivative. This is a powerful statement: it is true regardless of path, since it is a property of the action itself. We will use it on the extremal paths of the action in order to derive Noether’s theorem.

To prove Noether’s theorem we consider a (small) time-dependent $\alpha = \alpha(t)$. The logic here is rather simple: we are trying to find equations of motion, but instead of just blindly letting $q \mapsto q + \delta q$, we choose $\delta q =\alpha(t) \frac{\partial \sigma}{\partial \alpha}\Big\rvert_{\alpha = 0}$ and let $\alpha(t)$ vary freely (not just be constant). In this way, we will obtain a term proportional to $\alpha(t)$ whose coefficient must vanish on the paths that extremize the action.

The main difference from what we showed above comes from the derivative term $\begin{equation} \begin{aligned} \frac{d}{dt} \sigma(q, \alpha(t)) & = \frac{d}{dt} \bigg( q + \alpha(t) \frac{\partial \sigma}{\partial \alpha}\Big\rvert_{\alpha = 0} + \cdots \bigg) \\ & = \dot q + \alpha(t) \frac{d}{dt } \bigg(\frac{\partial \sigma}{\partial \alpha}\Big\rvert_{\alpha = 0}\bigg) + \dot \alpha(t)\frac{\partial \sigma}{\partial \alpha}\Big\rvert_{\alpha = 0}, \end{aligned} \end{equation}$ where we now have an additional term proportional to $\dot \alpha(t)$.

Explicitly, we can make the expansion \(\begin{equation} \begin{aligned} S[\sigma(q, \alpha(t))] & = \int dt \, L\bigg(\sigma(q, \alpha(t)), \frac{d}{dt} \sigma(q, \alpha(t)); t\bigg) \\ & = \int dt \, L\bigg(q + \alpha(t)\frac{\partial \sigma}{\partial \alpha}\Big\rvert_{\alpha = 0} + \cdots , \\ & \quad\quad\quad \quad \quad \dot q + \alpha(t) \frac{d}{dt } \bigg(\frac{\partial \sigma}{\partial \alpha}\Big\rvert_{\alpha = 0}\bigg) + \dot \alpha(t)\frac{\partial \sigma}{\partial \alpha}\Big\rvert_{\alpha = 0} ; t\bigg) \\ & = \int dt \, \bigg(L(q, \dot q) + \alpha(t)\bigg[ \frac{\partial L}{\partial q} \frac{\partial \sigma}{\partial \alpha} + \frac{\partial L}{\partial \dot q} \frac{d}{dt}\frac{\partial \sigma}{\partial \alpha} \bigg]_{\alpha=0} \\ & \hspace{140pt}+ \frac{\partial L}{\partial \dot q} \frac{\partial \sigma}{\partial \alpha} \bigg\rvert_{\alpha=0} \dot \alpha(t) + \cdots \bigg) \\ & = \int dt \, \bigg(L(q, \dot q) + \bigg[ \frac{d \Lambda}{dt} - \frac{d}{dt}\bigg(\frac{\partial L}{\partial\dot q} \frac{\partial \sigma}{\partial \alpha} \bigg\rvert_{\alpha=0}\bigg) \bigg] \alpha(t) + \cdots \bigg), \end{aligned} \end{equation}\) where in the last line we have used \eqref{eq:dLambda} and integration by parts on the last term (dropping boundary terms). To extremize the action, we cannot have a term proportional to $\alpha(t)$ which means the term multiplying it is zero. In other words, we have formally found the conserved quantity $\begin{equation} Q = \frac{\partial L }{\partial \dot q} \frac{\partial \sigma}{\partial \alpha} \bigg\rvert_{\alpha=0} - \Lambda, \quad \frac{d Q}{dt} = 0. \label{eq:Qconserve} \end{equation}$

Most of the work will be finding what $\Lambda$ is.

A symmetry of space and time

Now, we want to make a slightly more specific observation using just one dimension. The symmetry we care about is going to be one of both space and time and we can write it out with infinitesimals $\begin{equation} \begin{aligned} x & \mapsto x + \alpha \delta x(x, \dot x, t) + \cdots, \\ t & \mapsto t + \alpha \delta t(x, \dot x, t) + \cdots. \end{aligned} \end{equation}$ Caution: $\delta x$ and $\delta t$ are functions and are never assumed to be small in this calculation; only $\alpha$ is taken to be small.

For generality, the objects $\delta x$ and $\delta t$ are functions which can depend on the trajectory in potentially complicated ways. For a trajectory, both the position and its time argument are transformed, and we obtain $\begin{equation} x(t) \mapsto x(t + \alpha \delta t(x, \dot x, t) + \cdots) + \alpha \delta x(x, \dot x, t) + \cdots. \end{equation}$ Expanding the argument of $x$ to first order in $\alpha$, we obtain $\begin{equation} x(t) \mapsto x(t) + \alpha ( \delta x + \dot x(t) \delta t) + \cdots, \end{equation}$ where for simplicity we have dropped the arguments of $\delta x$ and $\delta t$. One might wonder why $t$ is not transformed within the arguments of these functions; the reason is simply that doing so produces a higher-order correction, which is buried in the “$+\cdots$” at the end of the expression.

We have thus found

$\begin{equation} \frac{\partial \sigma}{\partial \alpha}\Big\rvert_{\alpha = 0} = \delta x + \dot x(t) \delta t, \label{eq:xtsigma} \end{equation}$ our first ingredient in computing our conserved quantity $Q$.

Next, we need to consider what happens to the action itself. In this case, it is helpful to (1) consider the Lagrangian along with the infinitesimal $dt$ that it comes with as part of the action and (2) define the time transformation $\tau_\alpha =t + \alpha \delta t(x, \dot x, t) + \cdots$. With these, our statement of symmetry under this spacetime transformation implies that the Lagrangian transforms as follows

\[\begin{equation} dt\, L\bigg(\sigma(x, \alpha), \frac{d}{dt} \sigma(x,\alpha)\bigg) = d\tau_\alpha L( x(\tau_\alpha), \dot x(\tau_\alpha)) \end{equation}\]

In other words, we reorganize terms to obtain $\begin{equation} L\bigg(\sigma(x, \alpha), \frac{d}{dt} \sigma(x,\alpha)\bigg) = \frac{d\tau_\alpha}{dt} L( x(\tau_\alpha), \dot x(\tau_\alpha)). \end{equation}$ This makes sense: if you integrate the right-hand side over $t$ and change variables from $t$ to $\tau_\alpha$, you get the normal action that we already know and love.

If we expand the right-hand side (calling it $L_\alpha$), we obtain $\begin{equation} \begin{aligned} L_\alpha & = (1 + \alpha \dot{\delta t} + \cdots)\Big(L + \alpha \frac{dL}{dt} \delta t + \cdots \Big) \\ & = L + \alpha \frac{d}{dt} ( L \delta t) + \cdots. \end{aligned}\label{eq:xtLambda} \end{equation}$ We have thus found $\Lambda = L \delta t$. Combining \eqref{eq:xtLambda} and \eqref{eq:xtsigma} into \eqref{eq:Qconserve}, we have $\begin{equation} Q = \frac{\partial L}{\partial \dot x} ( \delta x + \dot x \delta t) - L \delta t, \end{equation}$ or in other words, $\begin{equation} Q = \bigg(\dot x \frac{\partial L}{\partial \dot x} - L\bigg) \delta t + \frac{\partial L}{\partial \dot x} \delta x. \label{eq:xtQ} \end{equation}$ This effectively mixes something that looks energy-like (the term proportional to $\delta t$) with something more momentum-like (the term proportional to $\delta x$). In fact, one can use this expression to derive both energy and momentum separately (in cases where those are in fact conserved). Furthermore, the generalization to higher dimensions and more particles is rather straightforward.
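As a quick check of \eqref{eq:xtQ}, two special cases can be read off directly. This is only a sketch: it assumes the Lagrangian is separately invariant under each transformation, and the example Lagrangian $L = \tfrac12 m \dot x^2 - V(x)$ is chosen here purely for illustration. A pure time translation has $\delta t = 1$, $\delta x = 0$, while a pure spatial translation has $\delta t = 0$, $\delta x = 1$:

\[\begin{equation} \begin{aligned} \delta t = 1,\ \delta x = 0: & \quad Q = \dot x \frac{\partial L}{\partial \dot x} - L = \tfrac12 m \dot x^2 + V(x), \\ \delta t = 0,\ \delta x = 1: & \quad Q = \frac{\partial L}{\partial \dot x} = m \dot x. \end{aligned} \end{equation}\]

The first is the energy, conserved when $L$ has no explicit time dependence; the second is the momentum, conserved only when $V$ does not depend on $x$.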

Rotating system

We end with a simple application of the above: A Lagrangian that experiences an external rotation (at frequency $\omega$). In this case the Lagrangian has a functional form in cylindrical coordinates

\[L(r, \theta - \omega t, \dot r, \dot \theta).\]

In this case there is no energy or angular momentum conservation, but we do have a transformation law that will leave our Lagrangian invariant $\begin{equation} \begin{aligned} t & \mapsto t + \alpha, \\ \theta & \mapsto \theta - \omega \alpha. \end{aligned} \end{equation}$ (The minus sign comes from the following algebra: $\theta(t) - \omega t \mapsto \theta(t + \alpha) - \omega (t + \alpha)$. Notice that $t$ does not explicitly change, but $\theta(t)\mapsto \theta(t + \alpha) - \omega \alpha$.)

Using what we have shown in \eqref{eq:xtQ} (generalized to higher dimensions), we have a conserved quantity $\begin{equation} Q = E - \omega \frac{\partial L}{\partial \dot \theta} = E - \omega L_z, \end{equation}$ a quantity that is a combination of both energy and angular momentum about the $z$-axis (which are separately not conserved but this combination is).
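The conservation of $E - \omega L_z$ can be checked numerically. The sketch below is an illustration rather than part of the derivation: it assumes a unit-mass particle in an anisotropic harmonic potential $V = \tfrac12(k_1 X^2 + k_2 Y^2)$, where $(X, Y)$ are coordinates co-rotating at frequency $\omega$ (so $V$ depends on $\theta - \omega t$ only), and integrates the lab-frame equations of motion with a hand-written RK4 step. The potential, the parameter values, and the function names are all assumptions made for this example.

```python
import numpy as np

def rotating_force(pos, t, omega, k1, k2):
    """Lab-frame force from V = (k1 X^2 + k2 Y^2)/2, with (X, Y) co-rotating coordinates."""
    c, s = np.cos(omega * t), np.sin(omega * t)
    X = c * pos[0] + s * pos[1]
    Y = -s * pos[0] + c * pos[1]
    FX, FY = -k1 * X, -k2 * Y                  # force components in the rotating frame
    return np.array([c * FX - s * FY, s * FX + c * FY])  # rotate back to the lab frame

def energy_and_Lz(state, t, omega, k1, k2):
    """Energy E and angular momentum L_z for unit mass."""
    x, y, vx, vy = state
    c, s = np.cos(omega * t), np.sin(omega * t)
    X, Y = c * x + s * y, -s * x + c * y
    E = 0.5 * (vx**2 + vy**2) + 0.5 * (k1 * X**2 + k2 * Y**2)
    Lz = x * vy - y * vx
    return E, Lz

def simulate(omega=0.7, k1=1.0, k2=2.0, dt=2e-3, steps=5000):
    """Integrate with RK4 and record E and Q = E - omega * L_z along the trajectory."""
    def rhs(state, t):
        return np.concatenate([state[2:], rotating_force(state[:2], t, omega, k1, k2)])

    state = np.array([1.0, 0.0, 0.3, 0.8])     # (x, y, vx, vy), arbitrary initial data
    E_hist, Q_hist = [], []
    for n in range(steps + 1):
        t = n * dt
        E, Lz = energy_and_Lz(state, t, omega, k1, k2)
        E_hist.append(E)
        Q_hist.append(E - omega * Lz)
        a = rhs(state, t)
        b = rhs(state + 0.5 * dt * a, t + 0.5 * dt)
        c = rhs(state + 0.5 * dt * b, t + 0.5 * dt)
        d = rhs(state + dt * c, t + dt)
        state = state + dt * (a + 2 * b + 2 * c + d) / 6
    return np.array(E_hist), np.array(Q_hist)

if __name__ == "__main__":
    E, Q = simulate()
    print("spread of E:", np.ptp(E))
    print("spread of Q:", np.ptp(Q))
```

With these assumptions, $E$ should wander visibly along the trajectory while $Q$ stays constant up to integration error.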

A topological analysis of the Su-Schrieffer-Heeger model

2022-02-20T00:00:00+00:00

Introduction

While this represents nothing particularly new about this well-studied model, I have found that most references on this subject gloss over or miss certain points that I think are important for getting the full picture of this model and what it can teach us regarding topology in condensed matter systems. The Su-Schrieffer-Heeger model, or SSH model for short, is in a sense one of the simplest models of topology and the resulting edge states (the bulk-boundary correspondence), so a full analysis of it helps to shed light on other models in higher dimension with topologies that might be a little harder to imagine. The model is a simple tight-binding model representing polyacetylene (C₂H₂)ₙ. It also has an artificial particle-hole symmetry, which becomes an exact symmetry when the model is interpreted as describing Bogoliubov-de Gennes quasiparticles (in a $p+ip$ superconductor).

Model definition

The molecular chain is a combination of weak bonds and strong bonds and thus has two (low energy) configurations that are degenerate:

Figure 1. Two configurations for polyacetylene leading to different hoppings between carbon atoms (t₁ vs. t₂). We can label the chain with two sublattices A and B, and we have a unit cell that includes two atoms (as indicated by the gray square).

The tight-binding model for the SSH model takes the form

\[\begin{equation} H = - \sum_{n} \left( t_1 \lvert n, A \rangle\langle n, B \rvert + t_2 \lvert n+1, A \rangle\langle n, B \rvert + \mathrm{h.c.} \right), \end{equation}\]

where $n$ labels the unit cell, and $A$ and $B$ are the different sublattices with hoppings $t_1$ and $t_2$ between them. We can introduce Pauli matrices in the sublattice space via $ \lvert A \rangle \langle B \rvert = \sigma^+ = \tfrac12(\sigma_x + i \sigma_y)$ (and $\sigma^- = (\sigma^+)^\dagger$), giving the Hamiltonian the form

\[\begin{equation} H = - \sum_{n} \left[ t_1 \lvert n \rangle\langle n \rvert \otimes \sigma_x + \left( t_2 \lvert n+1 \rangle\langle n \rvert \otimes \sigma^+ + \mathrm{h.c.} \right) \right]. \end{equation}\]

To diagonalize the model we note that $\lvert{n+1}\rangle = e^{-i \hat k} \lvert{n} \rangle$ where $e^{-i\hat k}$ is the translation operator by a unit cell. Using completeness $\sum_n \lvert n \rangle \langle n \rvert = 1$ we can write down our Hamiltonian as

\[\begin{equation} H = - (t_1 \sigma_x + t_2 e^{-i \hat k} \sigma^+ + t_2 e^{+i \hat k} \sigma^-). \end{equation}\]

This gives us a Brillouin zone as well with $k\in[-\pi,\pi)$. Here we write the $k$-space Hamiltonian out in three ways for pedagogical reasons

\[\begin{equation} \begin{aligned} H_k & = - (t_1 + t_2 \cos k) \sigma_x - t_2 \sin k \sigma_y, \\ H_k & = \begin{pmatrix} 0 & -t_1 - t_2 e^{-i k} \\ -t_1 - t_2 e^{ik} & 0 \end{pmatrix}, \end{aligned} \label{eq:ssh-kspace} \end{equation}\]

and finally the general form

\[\begin{equation} H_k = \epsilon_0(k) - \mathbf d(k) \cdot \sigma, \label{eq:general-k-space-two-band} \end{equation}\]

where for our model $\epsilon_0(k) = 0$, $d_x(k) = t_1 + t_2 \cos k$, $d_y(k) = t_2 \sin k$, and $d_z(k) = 0$. If we write $\mathbf d(k) = d(k) ( \cos \phi_k \sin \theta_k, \sin \phi_k \sin \theta_k, \cos \theta_k)$, then the general solution to this two-level problem is

\[\begin{align} \epsilon_\pm(k) & = \epsilon_0(k) \pm d(k), \end{align}\]

with solution for the lowest band ($\epsilon_-(k)$)

\[\begin{equation} \lvert{u_k}\rangle = \begin{pmatrix} \cos(\theta_k/2) \\ \sin(\theta_k/2) e^{i\phi_k} \end{pmatrix}. \label{eq:uk_north} \end{equation}\]

The nice thing about these $\lvert u_k \rangle$ is they are easily represented by points on the Bloch sphere. For our model \eqref{eq:ssh-kspace}, we have $\theta_k = \pi/2$, and $e^{i\phi_k} = (d_x + i d_y)/d = (t_1 + t_2 e^{i k})/|t_1 + t_2 e^{ik}|$. Note that only the unit vector

\[\hat{\mathbf d}(k) = \frac{\mathbf d(k)}{d(k)}\]

controls the wave function (and as we’ll see, the topology). Lastly, the energy bands look like

Figure 2. The energy bands of the SSH model with t₂ = 0.75t₁.
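The band formula $\epsilon_\pm(k) = \pm\lvert t_1 + t_2 e^{ik}\rvert$ can be cross-checked against a direct diagonalization of the real-space Hamiltonian. The following is a minimal sketch, assuming a ring of `n_cells` unit cells with periodic boundary conditions (so that $k = 2\pi j / N$); the function names and the site ordering are choices made for this example.

```python
import numpy as np

def ssh_ring(t1, t2, n_cells):
    """Real-space SSH Hamiltonian on a ring; site (n, A) -> 2n and (n, B) -> 2n + 1."""
    H = np.zeros((2 * n_cells, 2 * n_cells))
    for n in range(n_cells):
        a, b = 2 * n, 2 * n + 1
        a_next = 2 * ((n + 1) % n_cells)
        H[a, b] += -t1         # -t1 |n, A><n, B|
        H[a_next, b] += -t2    # -t2 |n+1, A><n, B|
    return H + H.T             # add the Hermitian conjugate

def ssh_bands(t1, t2, k):
    """Analytic bands -d(k) and +d(k) with d(k) = |t1 + t2 exp(ik)|."""
    d = np.abs(t1 + t2 * np.exp(1j * k))
    return -d, d

if __name__ == "__main__":
    t1, t2, n_cells = 1.0, 0.75, 40
    ks = 2 * np.pi * np.arange(n_cells) / n_cells
    lower, upper = ssh_bands(t1, t2, ks)
    numeric = np.sort(np.linalg.eigvalsh(ssh_ring(t1, t2, n_cells)))
    analytic = np.sort(np.concatenate([lower, upper]))
    print("spectra agree:", np.allclose(numeric, analytic))
    print("half gap at k = pi:", upper.min())
```

For an even number of cells the point $k = \pi$ is sampled, so the smallest positive eigenvalue is $\lvert t_1 - t_2 \rvert$.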

Berry connection, Berry phase, and polarization

For the necessary mathematical machinery, we refer to Ch. 3 of . The Berry connection is then given by

\[\begin{equation} A(k) = \langle u_k | i \partial_k u_k \rangle = \frac{d\phi_k}{dk} \langle u_k | i \partial_{\phi_k} u_k\rangle = - \frac{1}{2} \frac{d\phi_k}{dk}. \end{equation}\]

We can compute the Berry phase for this band which is exactly related to the polarization (see Ch. 4 of )

\[\begin{equation} P = \frac{e}{2\pi} \oint_{BZ} dk \, A(k) = -\frac{e}{2\pi} \frac12 \oint_{BZ} dk \, \frac{d\phi_k}{dk}. \end{equation}\]

It is tempting to change variables in the integral to $\oint d \phi_k$, but we need to understand what the closed curve $\oint$ represents. Since $d_z=0$ we can think purely in terms of the equator of the Bloch sphere and either $\phi_k$ makes it around the sphere or it does not as indicated in Fig. 3 below.

Figure 3. The two situations for this model: either $\phi_k$ wraps around the origin or not. (red) $t_1 < t_2$ winds once around the origin and is labeled "topological"; $t_1 > t_2$ has no winding number and is labeled "trivial".

From this information, we can read off the result of the Berry phase calculation for the polarization

\[\begin{equation} P = \begin{cases} -e/2 & t_1 < t_2, \\ 0 & t_1 > t_2. \end{cases} \quad \mathrm{mod}\, e \end{equation}\]

The first case $t_1 < t_2$ is represented in red in Fig. 3 and we call it “topological” while the case $t_1 > t_2$ we call “trivial”.
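The two values of the polarization can also be obtained numerically without choosing a smooth gauge. The sketch below uses a discretized Wilson loop, $\gamma = -\arg \prod_j \langle u_{k_j} | u_{k_{j+1}} \rangle$, which is a standard gauge-invariant discretization of $\oint dk\, A(k)$ modulo $2\pi$; it is swapped in here for the analytic argument above and is not the method used in the text. The function names and grid size are assumptions of this example.

```python
import numpy as np

def h_k(k, t1, t2):
    """SSH Bloch Hamiltonian in the (A, B) sublattice basis."""
    off = -t1 - t2 * np.exp(-1j * k)
    return np.array([[0, off], [np.conj(off), 0]])

def berry_phase(t1, t2, n_k=400):
    """Berry phase of the lower band from a closed product of overlaps (defined mod 2*pi)."""
    ks = np.linspace(-np.pi, np.pi, n_k, endpoint=False)
    # eigh sorts eigenvalues in ascending order, so column 0 is the lower band
    us = [np.linalg.eigh(h_k(k, t1, t2))[1][:, 0] for k in ks]
    prod = 1.0 + 0j
    for j in range(n_k):
        prod *= np.vdot(us[j], us[(j + 1) % n_k])  # <u_j | u_{j+1}>, closing the loop
    return -np.angle(prod)

if __name__ == "__main__":
    print("t1 < t2:", berry_phase(0.5, 1.0))   # magnitude pi, i.e. P = e/2 mod e
    print("t1 > t2:", berry_phase(1.0, 0.5))   # zero, i.e. P = 0 mod e
```

Since $\gamma$ is only defined modulo $2\pi$, the values $+\pi$ and $-\pi$ are equivalent, matching $P = -e/2 \equiv e/2 \mod e$.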

One might wonder at this point why these two phases are distinguishable at all since Fig. 1 shows two configurations that when rotated about an $A$ site appear to be exchanged. The answer to this comes from two sources (1) domain walls between configurations (to be explored in a future post) and (2) edges. Fig. 1 shows the situation with edges: either the lattice ends with a strong bond or a weak bond. Both of these situations come into full focus when we consider edge states resulting from the topology.

Symmetries

Aside from translation symmetry, which we have already used to find $H_k$, what other symmetries are in the problem? In particular, what discrete symmetries can we identify? The model as written down in \eqref{eq:ssh-kspace} has the following symmetries:

Time reversal symmetry
Chiral symmetry
Particle-hole symmetry
Inversion symmetry

Each of these is important for the topology of the problem.

Time-reversal symmetry

The first symmetry we identify is the anti-unitary symmetry of time-reversal. In this model it takes the form $T = K$ where $K$ is complex conjugation (in the real space basis). In terms of our $k$-space Hamiltonian we have

\[T H_k T^{-1} = H_{-k}.\]

For the general two-band Hamiltonian \eqref{eq:general-k-space-two-band}, this requires

\[\begin{equation} \begin{aligned} \epsilon_0(k) & = \epsilon_0(-k), & d_x(k) & = d_x(-k), \\ d_y(k) & = -d_y(-k), & d_z(k) & = d_z(-k). \end{aligned} \end{equation}\]

In general, this does not constrain our problem enough to give us topology. Imagine mapping $k$ onto a closed path on the sphere subject to the above constraints: the only restriction is that at $k=0$ and $k=\pi$ the vector $\mathbf d$ must lie in the $xz$-plane. Under these constraints, any path can be continuously deformed onto a single point $\mathbf d(k) \equiv \mathbf d_0$, indicating a trivial topology.

We clearly need something to limit our available paths.

Chiral Symmetry

Chiral symmetry is an odd symmetry. It is unitary, however the symmetry relates the positive and negative energies. In particular, the chiral operator $\sigma_z$ anti-commutes with the Hamiltonian $\{H, \sigma_z\} = 0$ or in terms of its action on $H_k$

\[\sigma_z H_k \sigma_z = - H_k\]

We can directly discern from this that for every positive-energy state we can construct a negative-energy state $\sigma_z \lvert E_k \rangle = \lvert - E_k \rangle$. For the two-level problem we are working with, the resulting constraint is

\[\begin{align} \epsilon_0(k) & = 0 & d_z(k) & = 0. \end{align}\]

This constrains us to be on the equator (see Fig. 4)

Figure 4. The constrained space from chiral symmetry is just the equator of the Bloch sphere, and the resulting topological number is just the winding number about the origin (how many times you go around the equator).

The winding number can be defined safely with the gauge chosen in \eqref{eq:uk_north}. It is the polarization (in units of $e/2$), but now without the $\mathrm{mod}\, e$, so to differentiate it we call it

\[\begin{equation} \nu_{\mathbb Z} = \frac1{\pi} \oint_{BZ} dk \, A(k) = -\frac{1}{2\pi} \oint_{BZ} dk \, \frac{d\phi_k}{dk}. \end{equation}\]

However, we would like a gauge-independent quantity and if we let $\lvert u_k \rangle \rightarrow e^{i\alpha_k} \lvert u_k \rangle$ then we can change the above expression by an even integer. To remedy this, we first note that general solutions now take the form

\[\begin{equation} \lvert u_k \rangle =e^{i\alpha_k} \frac1{\sqrt2} \begin{pmatrix}1 \\ e^{i\phi_k} \end{pmatrix} \end{equation}\]

And we can in fact obtain the phase difference $\phi_k$ by a slight modification of the Berry connection to include the chiral operator

\[\begin{equation} \nu_{\mathbb Z} = -\frac1{\pi} \oint_{BZ} dk \langle u_k | \sigma_z | i\partial_k u_k \rangle = -\frac1{2\pi} \oint_{BZ} dk\, \frac{d\phi_k}{dk}. \end{equation}\]

Thus chiral symmetry gives us a strong constraint which leads to an integer topological number $\mathbb Z$.

This winding number can also be computed directly from the Hamiltonian. Two natural ways of getting it include use of the $\mathbf d(k)$ vector

\[\begin{equation} \nu_{\mathbb Z} = -\frac{1}{2\pi} \oint_{BZ} \left(\hat{\mathbf d}(k) \times \frac{d}{dk}\hat{\mathbf d}(k) \right) \cdot \hat{\mathbf z} \, dk, \end{equation}\]

and the second, which generalizes to multiple bands, requires us to put our Hamiltonian in the form $\begin{equation} H_k = \begin{pmatrix} 0 & h(k) \\ h^\dagger(k) & 0 \end{pmatrix}, \end{equation}$ and for the matrix $h(k)$ we have

\[\begin{equation} \nu_{\mathbb Z} = \frac1{2\pi i} \oint_{BZ} dk \frac{d}{dk} \log \det h(k). \end{equation}\]
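For the SSH model $h(k) = -t_1 - t_2 e^{-ik}$ is a single complex number, so the integral of $\frac{d}{dk}\log h(k)$ divided by $2\pi i$ is simply the number of times $h(k)$ winds around the origin. A minimal numerical sketch, assuming a uniform $k$ grid fine enough that the phase changes by less than $\pi$ between neighboring points (the function name is hypothetical):

```python
import numpy as np

def winding_number(t1, t2, n_k=2001):
    """Winding of h(k) = -t1 - t2 exp(-ik) around the origin across the Brillouin zone."""
    ks = np.linspace(-np.pi, np.pi, n_k)    # both endpoints included to close the loop
    h = -t1 - t2 * np.exp(-1j * ks)
    phase = np.unwrap(np.angle(h))          # continuous branch of arg h(k)
    return int(np.rint((phase[-1] - phase[0]) / (2 * np.pi)))

if __name__ == "__main__":
    print("t1 < t2:", winding_number(0.5, 1.0))
    print("t1 > t2:", winding_number(1.0, 0.5))
```

With this orientation the topological case gives $-1$ and the trivial case gives $0$, in line with $P = -e/2$ and $P = 0$ found earlier.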

Physically, this symmetry relies on there not being a term proportional to $\mathbb{I}$ or $\sigma_z$ in the Hamiltonian, which in terms of sublattices means that no on-site potential is allowed and we are only allowed hoppings from the $A$ sublattice to the $B$ sublattice. In polyacetylene this is not an actual symmetry since no physical process prevents hopping within a sublattice. We therefore need to argue for some other symmetry in the problem that leads to topology.

Inversion symmetry

The inversion symmetry in the SSH model (equivalently, reflection symmetry or 180-degree rotation symmetry) is best understood by taking Fig. 1 and flattening it out as in Fig. 5 below

Figure 5. The inversion center (gray vertical line) takes the $A$ sublattice to the $B$ sublattice as well as exchanges the unit cells $n$ and $-n$.

This is a traditional unitary symmetry of the problem and its action on the $k$-space Hamiltonian is to relate $k$ to $-k$ and to exchange the $A$ sublattice with the $B$ (this is accomplished with $\sigma_x$)

\[\sigma_x H_k \sigma_x = H_{-k}.\]

In terms of the two-band system

\[\begin{equation} \begin{aligned} \epsilon_0(k) & = \epsilon_0(-k), & d_x(k) & = d_x(-k), \\ d_y(k) & = - d_y(-k), & d_z(k) & = - d_z(-k). \end{aligned} \label{eq:inversion_twobands} \end{equation}\]

Our first hint of two topologically distinct classes comes from there being two possible inversion-symmetric polarizations, $P = 0 \mod e$ or $P = e/2 \mod e$, since in both cases $P = -P \mod e$ (polarization changes sign under inversion).

Now the unit vector $\hat{\mathbf d}: S^1 \rightarrow S^2$ maps a closed path onto the (Bloch) sphere. If we permit any continuous change in $\hat{\mathbf d}$, every closed path can be reduced to a single point (this is why without symmetries in 1D the topology is trivial). With the added constraint of inversion though, we have two special points $\mathbf d(0) = d_x(0) \hat{\mathbf x}$ and $\mathbf d(\pi) = d_x(\pi) \hat{\mathbf x}$ and the path $k\in[0,\pi]$ completely determines the path $k\in[-\pi,0)$. This leads to two classes of paths, one in which $\mathrm{sgn}(d_x(0)d_x(\pi)) > 0$ and the other in which $\mathrm{sgn}(d_x(0)d_x(\pi)) < 0$, and continuously changing $\hat{\mathbf d}$ while respecting inversion symmetry cannot take one class of paths to the other. These two types of paths are represented below in Fig. 6.

Figure 6. (left) The trivial path where $\mathbf d(0)$ and $\mathbf d(\pi)$ are both in the positive $x$-direction. The solid line represents the path for $k\in[0,\pi)$ while the dashed line represents $k \in [-\pi,0)$. The green area (enclosed by the dashed lines) is exactly the same as the red area but oppositely oriented and thereby they cancel each other by Stokes' theorem when one calculates $P = \frac{e}{2\pi} \oint dk A(k) = 0 \mod e$. (right) The topological path where again the path $k\in[0,\pi)$ is represented by a solid line while the dashed line represents $k \in [-\pi,0)$. In this case, the way $k$ relates to $-k$ guarantees that upon going over the whole Brillouin zone, Stokes' theorem will map out half of the area of the sphere modulo an integer times the full area or in other words $P = \frac{e}{2\pi}\oint dk A(k) = e/2 \mod e$.

This suggests a simple way to differentiate topological and trivial phases known as symmetry indicators. If we look at the inversion symmetric points, the Hamiltonian has the form

\[\begin{equation} \begin{aligned} H_{k=0} & = \epsilon_0(k=0) - \sigma_x d_x(k = 0) & H_{k=\pi} & = \epsilon_0(k=\pi) - \sigma_x d_x(k=\pi). \end{aligned} \end{equation}\]

The inversion operator at these points is simply $\sigma_x$ with $\lvert{u_{k=0}}\rangle$ and $\lvert{u_{k=\pi}}\rangle$ both eigenvectors of that operator with eigenvalues $\xi_{0} = \mathrm{sgn}(d_x(k=0))$ and $\xi_\pi = \mathrm{sgn}(d_x(k=\pi))$, respectively. There is then a $\mathbb Z_2$ index which we can define simply by

\[\begin{equation} \tilde \nu_{\mathbb Z_2} = \xi_0 \xi_\pi = \begin{cases} 1, & \text{trivial,} \\ -1, & \text{topological.} \end{cases} \end{equation}\]

(We have defined $\tilde \nu_{\mathbb Z_2} = (-1)^{\nu_{\mathbb Z_2}}$.) The $\mathbb Z_2$ invariant is also computed, as we have suggested, by the polarization $\begin{equation} \nu_{\mathbb Z_2} = \frac1{\pi} \oint_{BZ} dk \, \langle u_k | i \partial_k u_k \rangle \mod 2, \end{equation}$ such that $P = \frac{e}{2}\nu_{\mathbb Z_2} \mod e$.

By just looking at two points in the Brillouin zone, we can discern if the phase is topologically nontrivial. In summary, inversion symmetry alone gives us $\mathbb Z_2$ topology.
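The symmetry-indicator recipe is short enough to sketch directly: diagonalize $H_k$ at $k = 0$ and $k = \pi$, take the lower-band eigenvector, and evaluate $\langle u_k | \sigma_x | u_k \rangle$. The function names below are hypothetical, and the example assumes the SSH form of $H_k$ from \eqref{eq:ssh-kspace}.

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)

def h_k(k, t1, t2):
    """SSH Bloch Hamiltonian in the (A, B) sublattice basis."""
    off = -t1 - t2 * np.exp(-1j * k)
    return np.array([[0, off], [np.conj(off), 0]])

def inversion_eigenvalue(k, t1, t2):
    """Expectation of sigma_x in the lower band; equals +1 or -1 at k = 0 and k = pi."""
    _, vecs = np.linalg.eigh(h_k(k, t1, t2))
    u = vecs[:, 0]                      # lower band (eigh sorts eigenvalues ascending)
    return float(np.real(np.vdot(u, SIGMA_X @ u)))

def z2_indicator(t1, t2):
    """Product xi_0 * xi_pi: +1 for the trivial phase, -1 for the topological phase."""
    xi_0 = inversion_eigenvalue(0.0, t1, t2)
    xi_pi = inversion_eigenvalue(np.pi, t1, t2)
    return int(np.rint(xi_0 * xi_pi))

if __name__ == "__main__":
    print("t1 < t2:", z2_indicator(0.5, 1.0))
    print("t1 > t2:", z2_indicator(1.0, 0.5))
```

For positive hoppings this reduces to $\mathrm{sgn}(t_1 + t_2)\,\mathrm{sgn}(t_1 - t_2)$, so only the sign of $t_1 - t_2$ matters.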

Particle-hole symmetry

The last symmetry we consider is actually a combination of previous symmetries: particle-hole symmetry for this problem is the anti-unitary symmetry made by combining time-reversal symmetry and chiral symmetry. In fact, for this problem

\[C = \sigma_z T = \sigma_z K.\]

It acts on our $k$-space Hamiltonian such that

\[C H_k C^{-1} = - H_{-k}.\]

In terms of the two-band model we have the constraints

\[\begin{equation} \begin{aligned} \epsilon_0(k) & = - \epsilon_0(-k) & d_x(k) & = d_x(-k), \\ d_y(k) & = -d_y(-k) & d_z(k) & = - d_z(-k). \end{aligned} \end{equation}\]

These constraints are surprisingly close to the constraints placed upon us by inversion symmetry \eqref{eq:inversion_twobands}, except for $\epsilon_0(k)$, which here must be odd in $k$ instead of even in $k$. However, that detail does not affect the wavefunctions, which only follow $\mathbf d(k)$, and so by very similar logic to inversion symmetry, if we have this particle-hole symmetry, we have a $\mathbb Z_2$ topology.
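All four symmetry relations quoted in this section can be verified numerically for the SSH Bloch Hamiltonian. This is a minimal sketch with hypothetical function names; complex conjugation $K$ is implemented as `H.conj()`, which assumes the real-space (sublattice) basis used throughout.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def h_k(k, t1=1.0, t2=0.75):
    """SSH Bloch Hamiltonian in the (A, B) sublattice basis."""
    off = -t1 - t2 * np.exp(-1j * k)
    return np.array([[0, off], [np.conj(off), 0]])

def check_symmetries(k, t1=1.0, t2=0.75):
    """Evaluate each symmetry relation at a single k and report whether it holds."""
    H, H_minus = h_k(k, t1, t2), h_k(-k, t1, t2)
    return {
        "time_reversal": np.allclose(H.conj(), H_minus),              # K H_k K = H_{-k}
        "chiral": np.allclose(SZ @ H @ SZ, -H),                       # sigma_z H_k sigma_z = -H_k
        "inversion": np.allclose(SX @ H @ SX, H_minus),               # sigma_x H_k sigma_x = H_{-k}
        "particle_hole": np.allclose(SZ @ H.conj() @ SZ, -H_minus),   # C H_k C^{-1} = -H_{-k}
    }

if __name__ == "__main__":
    for k in (0.3, 1.7, -2.5):
        print(k, check_symmetries(k))
```

Adding a term proportional to $\sigma_z$ to `h_k` (a staggered on-site potential) is an easy way to see the chiral and inversion checks fail while time reversal survives.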

Cartan Classification

Each individual Hamiltonian such as \eqref{eq:ssh-kspace} has its own particular symmetries. However, to speak of topology one must think closely about what terms could be added to the Hamiltonian. This is seen very clearly in the SSH model where we have enumerated many symmetries, but have argued that some of them (such as the chiral symmetry) do not make sense for the physical system. Nonetheless, it is a useful exercise to take each symmetry seriously and build out what topology it implies. For the two-band systems suggested by the SSH model \eqref{eq:general-k-space-two-band} we have done just that, and the result matches what is known from topological classification .

T	C	S	Topology
0	0	0	0
1	0	0	0
0	1	0	$\mathbb Z_2$
0	0	1	$\mathbb Z$
1	1	1	$\mathbb Z$

T represents time reversal, C represents particle-hole symmetry, and S represents chiral symmetry (0 is no symmetry, 1 is that it is present).

Additionally, we can include inversion symmetry. The only relevant combination is time-reversal and inversion, which leads directly to $d_z(k) = 0$ and allows us to use the arguments for chiral symmetry to obtain a $\mathbb Z$ invariant (though with a finite $\epsilon_0(k)$ which needs to be considered only for purposes of having a direct gap between bands as well as when taking into account Fermi energies)

T	Inversion	Topology
0	1	$\mathbb Z_2$
1	1	$\mathbb Z$

All of this applies to the entire lower band of this system, when we are in the insulating state. Partially filled bands don’t allow us to perform many of the full integrals over the whole Brillouin zone and therefore cannot be classified as we have mentioned.

Conclusions

This post has been mainly concerned with the bulk classification of the topology inherent in the SSH model, carefully going through each symmetry and its relation to the topological classification (as well as the corresponding topological numbers). In this system, the main observables are the edge states, which we have not discussed here.

The effect of a bound state on the continuum

2021-12-30T00:00:00+00:00

In models that naturally have a continuum, it is sometimes possible to find bound states with the application of a potential well. These states don’t come out of nowhere though, and since they are combinations of continuum states, the continuum itself is altered. To begin to understand how the continuum is altered, we look at the simplest example here: a $\delta$-function potential in one dimension with quadratic dispersion.

\[\begin{equation} -\frac1{2m} \psi''(x) - \lambda \delta(x) \psi(x) = E \psi(x). \label{eq:schro} \end{equation}\]

The general strategy here is to convert this potential into a matching condition between left and right parts of space. In particular, if we integrate $x$ from $0^-$ to $0^+$, we get

\[\begin{equation} -\frac1{2m}[\psi'(0^+) - \psi'(0^-)] = \lambda \psi(0). \label{eq:matching} \end{equation}\]

The bound state solution to this problem is a simple exponential

\[\begin{equation} \psi_0(x) = \sqrt{m\lambda}\, e^{- m\lambda |x|}, \quad E= -\tfrac12 m\lambda^2 \end{equation}\]
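The bound-state energy can be sanity-checked numerically. The sketch below is one possible check and not the derivation itself: it discretizes \eqref{eq:schro} on a uniform grid with a three-point finite-difference Laplacian, replaces $\delta(x)$ by $1/\Delta x$ on the central grid point, and uses hard-wall boundaries far from the well. The grid size, box width, and function name are assumptions of this example, and units with $\hbar = 1$ are used as in the text.

```python
import numpy as np

def delta_well_ground_energy(m=1.0, lam=1.0, half_width=20.0, n=801):
    """Lowest eigenvalue of -psi''/(2m) - lam * delta(x) * psi on a finite grid."""
    x, dx = np.linspace(-half_width, half_width, n, retstep=True)
    hop = 1.0 / (2.0 * m * dx**2)
    # three-point finite-difference kinetic term with hard walls at the box edges
    H = 2.0 * hop * np.eye(n) - hop * (np.eye(n, k=1) + np.eye(n, k=-1))
    # delta(x) -> 1/dx on the central grid point (n is odd, so x = 0 lies on the grid)
    H[n // 2, n // 2] -= lam / dx
    return np.linalg.eigvalsh(H)[0]

if __name__ == "__main__":
    m, lam = 1.0, 1.0
    print("numerical:", delta_well_ground_energy(m, lam))
    print("analytic :", -0.5 * m * lam**2)
```

The numerical value should approach $-\tfrac12 m \lambda^2$ as the grid spacing is reduced, with the box chosen much wider than the decay length $1/(m\lambda)$.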


Quasicrystalline Art

2019-09-05T00:00:00+00:00

An image quilt of quasicrystals

Quasicrystals, a beautiful manifestation of order without strict crystalline symmetry, have won a Nobel prize and have recently become relevant to my own work, with a dodecagonal graphene quasicrystal making its way into Science¹.

This led to this beautiful cover in Science²

Cover image from a Science magazine cover story.

This phenomenon is a perfect example of the kind of research I’ve been doing a lot of these days, and so it inspired the new logo for this website

Building a Penrose tiling from two sheets of graphene twisted at 30-degrees with respect to each other.

One can tell how this is done: You find the points where two hexagons are on top of each other, put down a point, and connect. There are three shapes: a rhombus, an equilateral triangle, and a square. This can be done along the entire sheet to create an amazing looking pattern. For completeness, we can fill in the rest of the pictured grid to obtain:

A fully Penrose tiled sheet.

The pattern starts to look even more intriguing the further out in the tiling you go. There is much to learn about such physical systems and their quasiperiodic cousins.

Footnotes

S. J. Ahn et al., Science 361, 782 (2018).
We have been studying how quasiperiodicity interplays with materials that have Dirac nodes, including twisted bilayer graphene. While we have not studied graphene at 30-degrees like the work in Science, that is an extreme where all crystalline periodicity is lost.

Pulsing a two-band model to discover topology

2017-02-20T00:00:00+00:00

In systems with an anomalous quantum Hall effect, the quantized Hall conductivity comes from a Chern number, the integral of the Berry curvature over some manifold. Usually, this integral is derived via the Kubo formula. However, there is geometry involved in how the state evolves, and in fact, we can use the dynamics of the current following a weak pulse in order to find the DC conductivity. The route is easy enough: say we have a conductivity which, when written with respect to time, is $\sigma_{yx}(t-t')$, and without loss of generality we apply a pulse $E_x(t) = A_x\delta(t)$. Then we can find the current response in the perpendicular direction

\[\begin{align} j_y(t) = \int dt' \, \sigma_{yx}(t-t') E_x(t') = \sigma_{yx}(t) A_x. \end{align}\]

This allows us to derive an expression for the DC-conductivity

\[\begin{align} \sigma_{yx} = \int dt \, \sigma_{yx}(t) = \frac1{A_x} \int dt \, j_y(t). \end{align}\]
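To see the pulse recipe at work in the simplest possible setting, here is a minimal sketch. It is not the Hall response of this post: it assumes a single damped classical charge (a Drude-like toy with a hypothetical damping rate `gamma`), for which a kick $E(t) = A_x \delta(t)$ gives $j(t) = \frac{e^2}{m} A_x e^{-\gamma t}$. All parameter values are made up for illustration. Integrating the current and dividing by $A_x$ should then reproduce the DC value $e^2/(m\gamma)$.

```python
import math

# Illustrative parameters (assumed, not from the post): unit charge and mass,
# a damping rate gamma, and a weak pulse of strength A_x.
e, m, gamma, A_x = 1.0, 1.0, 0.5, 1e-3

def j_after_pulse(t):
    """Current of a damped classical charge after the kick E(t) = A_x * delta(t)."""
    return (e**2 / m) * A_x * math.exp(-gamma * t)

# Trapezoidal estimate of the time integral of the current.
dt, t_max = 1e-3, 80.0
n_steps = int(t_max / dt)
integral = 0.0
for i in range(n_steps):
    t0, t1 = i * dt, (i + 1) * dt
    integral += 0.5 * (j_after_pulse(t0) + j_after_pulse(t1)) * dt

# DC conductivity from the pulse response: sigma = (1/A_x) * integral of j(t) dt.
sigma_dc = integral / A_x
print(sigma_dc, e**2 / (m * gamma))
```

The same bookkeeping (kick, record the transverse current, integrate, divide by the pulse strength) is what the rest of this post carries out analytically for the two-band model.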

Geometrically, there is a lot going on with $j_y(t)$ when we have a system with spin-orbit coupling. In particular, take the two band model

\[\begin{align} h(\mathbf p) = \mathbf d(\mathbf p) \cdot \sigma, \end{align}\]

where $\mathbf d$ is 3D, $\mathbf p$ is 2D, and $\sigma = (\sigma_x, \sigma_y, \sigma_z)$, the vector of Pauli matrices. The initial states of the system can be represented by where they are on the Bloch sphere $-\hat{\mathbf d}(\mathbf p)$. But once a pulse is supplied, this state will begin to rotate about a different vector $\mathbf d(\mathbf p - e \mathbf A)$. Thus, if we add time dependence to $\hat{\mathbf d}(\mathbf p, t)\equiv -\langle \sigma(t) \rangle$ to represent the state’s location, we can use Heisenberg’s equations of motion to obtain

\[\begin{align} \hbar \frac{\partial \hat{\mathbf d}(\mathbf p, t)}{\partial t} = 2\mathbf{d}(\mathbf p - e\mathbf A) \times \hat{\mathbf{d}}(\mathbf p, t). \end{align}\]

We can rewrite this equation as

\[\begin{align} \hat{\mathbf d}(\mathbf p,t) = \hat{\mathbf d}(\mathbf p - e \mathbf A) [\hat{\mathbf d}(\mathbf p - e \mathbf A) \cdot \hat{\mathbf d}(\mathbf p)] - \hbar \frac{\hat{\mathbf d}(\mathbf p-e \mathbf A) \times \frac{\partial \hat{\mathbf d}(\mathbf p, t)}{\partial t}}{2d(\mathbf p - e \mathbf A)}. \end{align}\]

However, this state has an associated current, which can be represented by the operator $j_\mu = -e \partial_\mu \mathbf d(\mathbf p- e \mathbf A) \cdot \sigma$. Since $\langle\sigma\rangle = -\hat{\mathbf{d}}(\mathbf p,t)$, we have

\[\begin{align} \langle j_\mu \rangle = e \, \partial_\mu \mathbf d(\mathbf p - e\mathbf A ) \cdot \hat{\mathbf d}(\mathbf p, t). \end{align}\]

Combining these expressions, we have $\begin{multline} \langle j_\mu \rangle = e \, \partial_\mu \mathbf d(\mathbf p - e\mathbf A ) \cdot \Bigg[ \hat{\mathbf d}(\mathbf p - e \mathbf A) [\hat{\mathbf d}(\mathbf p - e \mathbf A) \cdot \hat{\mathbf d}(\mathbf p)] \\ - \hbar \frac{\hat{\mathbf d}(\mathbf p-e \mathbf A) \times \frac{\partial \hat{\mathbf d}(\mathbf p, t)}{\partial t}}{2d(\mathbf p - e \mathbf A)} \Bigg]. \end{multline}$

Or simplified, using $\partial_\mu \mathbf d = (\partial_\mu d)\, \hat{\mathbf d} + d \, \partial_\mu \hat{\mathbf d}$, $\begin{multline} \langle j_\mu \rangle = e \, \partial_\mu d(\mathbf p - e\mathbf A )[\hat{\mathbf d}(\mathbf p - e \mathbf A) \cdot \hat{\mathbf d}(\mathbf p)] \\ - \frac{e \hbar}{2} \partial_\mu \hat{\mathbf{d}}(\mathbf p - e \mathbf A) \cdot \left[\hat{\mathbf d}(\mathbf p-e \mathbf A) \times \frac{\partial \hat{\mathbf d}(\mathbf p, t)}{\partial t} \right]. \end{multline}$

This is exact. At this point, we make a couple of approximations. First of all, the first term is independent of $t$, so it cannot contribute to the total current if we have a finite DC conductivity. This leaves the second term. We can do the integral over time. There is an order-of-limits problem, but we can get around it by noting that we do not expect the infinite-time state to contribute (it averages to something proportional to $\mathbf d(\mathbf p - e \mathbf A)$ anyway, and so the cross product vanishes). We therefore discard it, so that $\int_0^\infty dt' \, \partial_{t'} \hat{\mathbf d}(\mathbf p, t') = - \hat{\mathbf d}(\mathbf p)$.

Hence, we get the Hall conductivity

\[\begin{align} \sigma_{yx} = \frac{e\hbar}{2 A_x} \int \frac{d^2 p}{h^2} \partial_y \hat{\mathbf{d}}(\mathbf p - e \mathbf A) \cdot \left[\hat{\mathbf d}(\mathbf p-e \mathbf A) \times \hat{\mathbf d}(\mathbf p) \right]. \end{align}\]

At this point, we actually have not expanded in terms of $A_x$ yet. The first term of the expansion would produce a term that is symmetric in $x$ and $y$; however, it drops out because the cross product vanishes, $\hat{\mathbf d}(\mathbf p) \times \hat{\mathbf d}(\mathbf p) = 0$. Thus, only the second term persists, and we see directly that $x$ and $y$ must be different. In fact, we get the well-known formula

\[\begin{align} \sigma_{yx}^{\mathrm{Hall}} =\frac{e^2}{h} \int \frac{d^2 p}{4\pi} \hat{\mathbf{d}}(\mathbf p) \cdot \left[\partial_y \hat{\mathbf d}(\mathbf p) \times \partial_x \hat{\mathbf d}(\mathbf p) \right]. \end{align}\]

This describes the Chern number of some manifold parametrized by $\mathbf p$ (usually the Brillouin zone). It is the number of times the vector $\hat{\mathbf d}$ wraps the sphere.

To understand why it’s a topological invariant, note that the quantity in the integral looks very much like a Jacobian. In fact, it is; it describes a coordinate transformation from the $\hat {\mathbf d}$ to $(p_x,p_y)$. In this way, the integrand represents an area element on the sphere, and in general $\mathbf p$ is a closed manifold. So $\mathbf d(\mathbf p)$ maps that closed manifold to the sphere, and without any edges or boundaries the area it maps out must be $4 \pi$ times an integer.
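A quick numerical check makes the "area on the sphere divided by $4\pi$" statement concrete. The sketch below assumes a specific lattice model that does not appear in this post, $\mathbf d = (\sin k_x, \sin k_y, M + \cos k_x + \cos k_y)$, with a hypothetical mass parameter `mass`, and evaluates the integral with central finite differences on a periodic grid. The overall sign depends on the orientation convention, so only the magnitude is compared with an integer.

```python
import math

def d_hat(kx, ky, mass):
    """Unit vector for the assumed model d = (sin kx, sin ky, mass + cos kx + cos ky)."""
    d = (math.sin(kx), math.sin(ky), mass + math.cos(kx) + math.cos(ky))
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)

def winding_number(mass, n=120):
    """(1/4pi) * integral of d_hat . (d_x d_hat x d_y d_hat) over the periodic zone."""
    h = 2.0 * math.pi / n
    total = 0.0
    for ix in range(n):
        for iy in range(n):
            kx, ky = -math.pi + ix * h, -math.pi + iy * h
            d0 = d_hat(kx, ky, mass)
            # Central finite differences for the two partial derivatives.
            px, mx = d_hat(kx + h, ky, mass), d_hat(kx - h, ky, mass)
            py, my = d_hat(kx, ky + h, mass), d_hat(kx, ky - h, mass)
            dx = [(p - q) / (2 * h) for p, q in zip(px, mx)]
            dy = [(p - q) / (2 * h) for p, q in zip(py, my)]
            cross = (dx[1] * dy[2] - dx[2] * dy[1],
                     dx[2] * dy[0] - dx[0] * dy[2],
                     dx[0] * dy[1] - dx[1] * dy[0])
            total += sum(a * b for a, b in zip(d0, cross)) * h * h
    return total / (4.0 * math.pi)

c_wrapped = winding_number(1.0)   # d_hat covers the sphere once in this regime
c_trivial = winding_number(3.0)   # d_hat stays near the north pole
print(c_wrapped, c_trivial)
```

For `mass = 1.0` the vector $\hat{\mathbf d}$ points up at the zone center and down at the zone corner, so it has to cover the sphere; for `mass = 3.0` it never leaves the upper hemisphere and the integral comes out close to zero.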

This formula is well-known, but this dynamical way of obtaining it is slightly less well-known. We have extended this idea in a paper published last year to handle the out-of-equilibrium case of quenches. In that situation, new phenomena appear that are quite different from the equilibrium case: terms that we discarded in this calculation become quite relevant.

Subtleties in linear response theory

2014-12-22T00:00:00+00:00

In linear response theory, we consider some small perturbation to a Hamiltonian and look at the response of some observable to that perturbation. In the case considered here, the perturbation is an electric field, and the response is current. The linear response function that relates these quantities is called the conductivity.

There’s a problem though: an electric field accelerates a charge. Consider a classical electron for the time being; then

\[\begin{align} m \ddot x(t) = - e \mathbf{E}. \label{eq:Newton} \end{align}\]

Or in terms of current $j(t) = -e \dot x(t)$, $\frac{d j}{dt} = \frac{e^2}{m} \mathbf{E}$. From here we can quickly and naively go to frequency space to find $j(\omega) = i \frac{e^2/m}{\omega} \mathbf{E}(\omega)$. Then one might remember that another way to define $\mathbf{E}$ is in terms of a vector potential that is purely time-dependent, so $\mathbf{E}(\omega) = i \omega \mathbf A(\omega)$. Now, if we just plug this into our linear response for the current, we get

\[j(\omega) = - \frac{ e^2}{m} \mathbf A(\omega). \label{eq:jA}\]

All is well and good, right? Well, not quite. In electromagnetism, the constant part of $\mathbf A(t)$ corresponds to the $\omega = 0$ term of $A(\omega)$. This represents what is known as a “pure gauge”. These gauges are physically equivalent to the null field $\mathbf A(\omega=0) = 0$. Thus, whatever linear response is represented above at $\omega = 0$ must be unphysical, right?

Wrong.

Before explaining why this is wrong, let’s give some further context to this linear response theory. The term $- e^2/m$ is actually the single particle term of what is known as the “diamagnetic” response to the conductivity when you add in more electrons (usually distributed in a Fermi distribution). This term persists in quantum mechanics, and no other terms appear to cancel it in the simplest case of $H = \frac{p^2}{2m}$. In fact, while the math becomes more cumbersome, the solution we shall illustrate below holds perfectly well for even the non-interacting multi-electron system.

Now, at this point you may have guessed that there’s something strange going on at $\omega = 0$ due to the fact that the electric field accelerates the particle and doesn’t just have a velocity response. At the $\omega = 0$ point, the physical field $E(\omega)$ seems to necessarily be equal to zero in the gauge we have prescribed unless $A(\omega) \sim 1/\omega$ for small $\omega$. This would lead to a divergent $j(\omega)$, restoring our faith that the system is accelerating out of control.

But what about when $\mathbf{A}(\omega=0) = \mathbf{A}_0$? It seems like then we have a true velocity response to an unphysical object. The solution is subtle: At some point in the quick derivation we made an assumption that implied $\mathbf A(t) \rightarrow 0$ at $t \rightarrow -\infty$. This implies that if $\mathbf A(t) = \mathbf A_0$ at any finite time, there had to be some time in between where $d\mathbf A/ dt \neq 0$. Thus, during that “ramp up” time, an electric field was on and it accelerated the charge to a specific velocity resulting in the current $j(\omega) = - \frac{ e^2}{m} \mathbf A(\omega)$.

The assumption is subtle, but the result is rather simple. For now, just assume that $\mathbf A(-\infty) = 0$ and at some $t_0$, $\mathbf A(t_0) = \mathbf A_0$, then we can integrate Eq. \eqref{eq:Newton} to obtain the velocity:

\[m \dot x (t_0) = e \int_{-\infty}^{t_0} \frac{d \mathbf A}{d t} d t = e \mathbf A_0.\]

Or, in other words, $\mathbf j(t_0) = -\frac{e^2}{m} \mathbf A_0$, the same as before! This is how a constant $\mathbf A_0$ can be physical: When it represents the change from a different constant vector potential.
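The ramp-up argument is easy to check numerically. The sketch below assumes a particular smooth ramp, $A(t) = \tfrac12 A_0 (1 + \tanh t)$, which is my choice for illustration and not something fixed by the argument, along with made-up unit values for $e$, $m$, and $A_0$. It integrates Newton's equation with $E = -dA/dt$ and compares the final current with $-\frac{e^2}{m} A_0$.

```python
import math

e, m, A0 = 1.0, 1.0, 0.7   # illustrative values (assumed)

def E(t):
    """Electric field E = -dA/dt for the assumed ramp A(t) = 0.5 * A0 * (1 + tanh t)."""
    return -0.5 * A0 / math.cosh(t) ** 2

# Integrate m dv/dt = -e E(t) with the midpoint rule, starting at rest
# long before the ramp, when A is (numerically) zero.
dt = 1e-3
t, v = -30.0, 0.0
while t < 30.0:
    v += dt * (-e / m) * E(t + 0.5 * dt)
    t += dt

j_final = -e * v
print(j_final, -(e**2 / m) * A0)
```

Any other ramp that starts at zero and ends at $A_0$ should give the same final current, since only the total change in $\mathbf A$ enters the integral.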

Now, to isolate the assumption, let us run through the steps we took:

1. $\frac{d j}{d t} = \frac{e^2}{m} E(t)$.
2. Take the Fourier transform: $j(\omega) = i \frac{e^2/m}{\omega} E(\omega)$.
3. Insert vector potential with $E = -\frac{d}{dt} A$ and assume the Fourier transform exists for $A$: $j(\omega) = - \frac{e^2}m A(\omega)$.
4. Undo the Fourier transform: $j(t) = - \frac{e^2}{m} A(t)$.

Now, let us get the last equation (in #4 above) by a simpler route.

1. $\frac{d j}{dt} = \frac{e^2}{m} E(t)$.
2. Use $E = - \frac{d}{dt} A$ and integrate the above expression from $-\infty$ to $t$: $j(t) - j(-\infty) = -\frac{e^2}{m} ( A(t) - A(-\infty))$.

Two perfectly legitimate calculations give different results. Firstly, this highlights that the first procedure does actually assume $A(-\infty) = 0$. Secondly, the only assumptions that could have given $j(-\infty) = 0$ (an assumption we probably wanted anyway) and $A(-\infty) = 0$ are that they could be given in terms of Fourier transforms. In order for a function to have a Fourier transform it needs to be absolutely integrable, i.e. $\int_{-\infty}^\infty \lvert A(t) \rvert dt \lt \infty$. Given $A(t)$ as a continuous, piece-wise differentiable function, we need $A(t) \rightarrow 0$ for $t \rightarrow \pm \infty$. This imposes our gauge, and since we are not interested in future times let alone $t \rightarrow +\infty$, we can artificially modify the function as we see fit to accommodate that. But how the function began at $-\infty$ is important, and we must impose that. Hence, we have chosen, at least partially, a gauge.

We are left with a dilemma then about pure plane waves $A(t) = A(\omega) e^{-i \omega t}$. How do those work?

Technically, they are outside of the bounds of the Fourier analysis, and we can see that simply by the fact that if we tried to do the above procedure, we couldn’t have a well defined answer as $t \rightarrow -\infty$ (too oscillatory). However, we can approximate the plane wave in terms of an absolutely integrable function $A_\delta(t) = A(\omega) e^{-i (\omega + i \delta) t}$ for any $\delta \gt 0$, and everything works. This shows us explicitly that $t = - \infty$ does have $A_\delta \rightarrow 0$ for all $\delta \gt 0$. And this is the origin of the well known substitution $\omega \rightarrow \omega + i\delta$.

The natural question to ask now is how this works for a real system (with dissipation). Why does such a term not exist at zero frequency?

Unless your system is a superconductor, there is some dissipation in the system. The simplest way to include this is classically: When an electron is going at velocity $\dot x(t)$ it experiences a “drag” that tends to slow it down. Thus, Newton’s equation becomes

\[m \ddot x(t) = - m \gamma \dot x(t) - e \mathbf E\]

where $\gamma$ describes how much drag the electron experiences. For more disorder, this would be a larger number. Playing the same Fourier transform game, we can obtain rather quickly that

\[j(\omega) = - \frac{e^2}{m} \frac{\omega}{\omega +i \gamma} A(\omega).\]

This is just one step away from the well-known Drude model. We see that if $A(t) = A_0$, then $j(\omega) = 0$. Our gauge choice described before is still in place; the only difference is that our “kick” at $t=-\infty$ has an infinite time to dissipate back to rest (the inclusion of $\gamma$ above is critical for $j(\omega=0) =0$). This also suggests a steady state current when $\mathbf E$ is constant: $m\ddot x=0$ implies $j = \frac{e^2}{m \gamma} \mathbf E$. Our current relaxes to zero when there’s nothing around ($\mathbf E = 0$), as we would expect.
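Both statements about the damped case can be checked with a few lines of time stepping. This is a minimal sketch under assumed parameter values and an assumed $\tanh$ ramp for $A(t)$, neither of which comes from the argument itself. It integrates $\frac{dj}{dt} = -\gamma j + \frac{e^2}{m} E(t)$ with forward Euler, once for a field that is only on during the ramp and once for a constant field.

```python
import math

e, m, gamma, A0, E0 = 1.0, 1.0, 0.5, 0.7, 0.2   # illustrative values (assumed)

def evolve(field, t_start, t_end, dt=1e-3):
    """Forward-Euler integration of dj/dt = -gamma*j + (e^2/m)*E(t), starting from j = 0."""
    j, t = 0.0, t_start
    while t < t_end:
        j += dt * (-gamma * j + (e**2 / m) * field(t))
        t += dt
    return j

def ramp_field(t):
    """E = -dA/dt for the assumed ramp A(t) = 0.5 * A0 * (1 + tanh t)."""
    return -0.5 * A0 / math.cosh(t) ** 2

# Long after the ramp, the kick has dissipated even though A stays at A0.
j_after_ramp = evolve(ramp_field, -30.0, 60.0)

# A constant field instead drives the current to e^2 E0 / (m gamma).
j_steady = evolve(lambda t: E0, 0.0, 60.0)
print(j_after_ramp, j_steady, e**2 * E0 / (m * gamma))
```

Setting `gamma = 0.0` in the same routine recovers the undamped result, where the current after the ramp stays at $-\frac{e^2}{m} A_0$ forever.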

When a quantum mechanical description is done—by taking a random disorder potential and averaging over disorder configurations—one obtains similar results. The diamagnetic term for a clean system is real and has a physically well defined explanation.

One may not be surprised that this curious “diamagnetic term” occurs for superconductors, however it is sometimes explained that “gauge symmetry is broken” and that is why such a term exists. This is a misleading statement, but one I will explore in a future post.

Current in Single Particle Quantum Mechanics

2014-01-14T00:00:00+00:00

For simplicity, I will only use one dimension in this post, but this can be generalized to higher dimensions rather easily.

Many textbooks on quantum mechanics mention that the current density can be derived from the continuity equation and probability. The usual method for figuring this out is to assume you have some Hamiltonian $H = p^2/2m + V(x)$ where $p$ is the momentum and $x$ is the position. In this way the current density is written in terms of the wave function $\psi(x,t)$ as

\begin{equation} j(x,t) = \frac1{2m i}\left[ \psi^* \overrightarrow{\partial_x} \psi - \psi^* \overleftarrow{\partial_x} \psi \right]. \end{equation}

This then satisfies the continuity equation

\begin{equation} \partial_t \rho(x,t) + \partial_x j(x,t) = 0, \end{equation}

with density $\rho(x,t)=\psi^*(x,t)\psi(x,t)$. It should be noted that if you write the wave function as $\psi(x,t) = \sqrt{\rho(x,t)} e^{i \theta(x,t)}$, then the current is just proportional to the gradient of the phase $j(x,t)= \rho(x,t) \partial_x \theta(x,t)/m$, giving the spatial change in phase a physical significance.

However, there are two lingering questions:

Is this current density related to the Heisenberg operator $\dot x(t)$ which tracks the velocity of the system?
If so, does it generalize to more arbitrary Hamiltonians?

To answer these questions, we consider the more arbitrary Hamiltonian

\begin{equation} H = T(p) + V(x), \end{equation}

where $V(x)$ is some potential and the kinetic energy is some polynomial

\begin{equation}T(p) = \sum_{n=1} a_n \frac{p^n}{n!}.\end{equation}

We are unworried about bounding the energy, so odd-order kinetic energy terms are allowed (in the higher dimensional case, the Dirac-like Hamiltonians have linear terms in $p$). At this point, we can take our Heisenberg operator $\dot x(t)$ and find

\[\begin{align} \dot x(t) & = i [H, x(t)] \\ & = T'(p(t)). \end{align}\]

where $T'$ is the derivative of $T$ with respect to its argument. Now, we would like to obtain a current density from this quantity. We can certainly define the total current at a specific time as

\[\begin{align} I & = \langle\psi_0\lvert \dot x(t) \rvert \psi_0\rangle = \langle\psi_0\lvert T'(p(t)) \rvert \psi_0\rangle \\ & = \langle\psi(t)\lvert T'(p) \rvert \psi(t)\rangle, \end{align}\]

where in the last line we go from the Heisenberg to Schroedinger picture. Now to get a density, we need to use a complete set of position states, so that

\[\begin{align} \label{eq:total-current} I & = \int dx \, dy \, \langle\psi(t)\lvert x\rangle \langle x \lvert T'(p) \rvert y \rangle \langle y \lvert \psi(t)\rangle. \end{align}\]

Now, $p$ acts as a derivative on position kets, so that one can verify that

\[\begin{equation} \langle x \lvert T'(p) \rvert y \rangle = T'(-i \partial_x ) \delta(x-y). \end{equation}\]

However, there is an ambiguity here since we can write

\[\begin{align} T'(-i \partial_x)\delta(x-y) & = \sum_{n=0} a_{n+1} \frac{(-i\partial_x)^n}{n!} \delta(x-y) \nonumber \\ & = \sum_{n=0} a_{n+1} \frac{(-i\partial_x)^{n-m} (i \partial_y)^m}{n!} \delta(x-y). \label{eq:T-delta} \end{align}\]

This ambiguity in how to choose the derivatives (any $0 \le m \le n$ works in each term) leaves us with many ways to define the current density. Fortunately, only one of these combinations satisfies the continuity equation. To figure out which one that is, let us reverse engineer the continuity equation to obtain a solution. The density is $\rho(x,t) = \psi^*(x,t) \psi(x,t)$, and so using Schroedinger’s equation, we have

\[\begin{align} i \partial_t \rho(x,t) & = i(\psi^*(x,t) \overleftarrow {\partial_t} \psi(x,t) + \psi^*(x,t) \overrightarrow {\partial_t} \psi(x,t) ) \\ & = - [\psi^*(x,t)( T(i \overleftarrow{\partial_x} ) - T(-i \overrightarrow{\partial_x} ) )\psi(x,t) ]. \end{align}\]

Thus, the continuity equation must become

\begin{align} \partial_t \rho(x,t) - i [\psi^*(x,t)( T(i \overleftarrow{\partial_x} ) - T(-i \overrightarrow{\partial_x} ) )\psi(x,t) ] = 0. \end{align}

If we now assume that we have a current density that takes the form

\begin{equation} j(x,t) = \psi^*(x,t) \vartheta(\overleftarrow{\partial_x},\overrightarrow{\partial_x}) \psi(x,t), \end{equation}

and satisfies the continuity equation, $\partial_t \rho + \partial_x j = 0$, then we can equate operators to obtain

\begin{align} \label{eq:diff-ops-cty} \overleftarrow{\partial_x} \vartheta + \vartheta \overrightarrow{\partial_x} = -i [T(i \overleftarrow{\partial_x} ) - T(-i \overrightarrow{\partial_x} ) ].
\end{align}

Anticipating the answer, we write the general form of $\vartheta$ as

\begin{equation} \vartheta = \sum_{n=0} \sum_{m=0}^n (-1)^{m} i^n b_{n,m} \overleftarrow{\partial}{}_x^{n-m} \overrightarrow{\partial}{}_x^m. \end{equation}

Then we can take the left-hand side of Eq. \eqref{eq:diff-ops-cty} and write

\[\begin{multline} \overleftarrow{\partial_x} \vartheta + \vartheta \overrightarrow{\partial_x} = -i \sum_{n=1} i^n b_{n-1,0} \overleftarrow{\partial}{}_x^{n} - \sum_{n=0} \sum_{m=0}^{n-1} (-1)^m i^n \left[ b_{n,m+1} - b_{n,m} \right] \\ \times \overleftarrow{\partial}{}_x^{n-m} \overrightarrow{\partial}{}_x^{m+1} + i \sum_{n=1} (-i)^{n} b_{n-1,n-1} \overrightarrow{\partial}{}_x^{n} . \end{multline}\]

On the other hand, we can calculate the right hand side of Eq. \eqref{eq:diff-ops-cty} to be

\[\begin{equation} -i [T(i \overleftarrow{\partial_x} ) - T(-i \overrightarrow{\partial_x} ) ] = -i \sum_{n=1} a_n i^n \frac{\overleftarrow{\partial}{}_x^n}{n!} + i \sum_{n=1} a_n (-i)^n \frac{\overrightarrow{\partial}{}_x^n}{n!}. \end{equation}\]

Equating the left and right sides, we can just read off that $b_{n-1,0} = a_n/n!$, $b_{n-1,n-1}= a_n/n!$ and $b_{n,m+1} = b_{n,m}$, so that $b_{n,m} = a_{n+1}/(n+1)!$.

Thus, we have

\begin{equation} \vartheta = \sum_{n=0} \sum_{m=0}^n (-1)^{m} i^n \frac{a_{n+1}}{(n+1)!} \overleftarrow{\partial}{}_x^{n-m} \overrightarrow{\partial}{}_x^m. \end{equation}

Returning all the way to when we were considering $\dot x(t)$ as an integral over position, this suggests that in Eq. \eqref{eq:T-delta}, we want to consider

\begin{equation} T'(-i \partial_x)\delta(x-y) = \sum_{n=0} \frac{a_{n+1}}{n!} \frac1{n+1}\sum_{m=0}^n (-i\partial_x)^{n-m} (i \partial_y)^m \delta(x-y). \end{equation}

Given the expression for total current Eq. \eqref{eq:total-current} and integrating the delta function by parts numerous times, we can replace $\partial_x$ with $-\overleftarrow \partial_x$ and $\partial_y$ with $- \overrightarrow\partial_x$, and then the total current is just

\begin{equation} I = \int dx \, \psi^*(x,t) \sum_{n=0} \frac{a_{n+1}}{(n+1)!} \sum_{m=0}^n (i \overleftarrow \partial_x)^{n-m} (-i \overrightarrow \partial_x)^m \psi(x,t), \end{equation}

which is exactly the integral of the current density! Thus, we have shown that

\begin{equation} j(x,t) = \psi^*(x,t) \sum_{n=0} \frac{a_{n+1}}{(n+1)!} \sum_{m=0}^n (i \overleftarrow \partial_x)^{n-m} (-i \overrightarrow \partial_x)^m \psi(x,t), \end{equation}

and that

\begin{equation}\langle \dot x(t) \rangle = \int dx \, j(x,t). \end{equation}

Indeed, $\dot x(t)$ does track the current of the problem and can even be written as the integral of a current density, even for the more arbitrary Hamiltonian $H = T(p) + V(x)$.
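As a sanity check on the final formula, here is a minimal sketch that evaluates $j(x)$ for a Gaussian wave packet and compares $\int dx \, j$ with $\langle T'(p) \rangle$. Everything specific is an assumption for illustration: the cubic kinetic energy with coefficients `a`, the packet width `s`, and the mean momentum `k0`. For this packet $\langle p \rangle = k_0$ and $\langle p^2 \rangle = k_0^2 + 1/(4s^2)$, and the derivatives of $\psi$ are known in closed form, so no numerical differentiation is needed.

```python
import cmath
import math
from math import factorial

# Assumed kinetic energy T(p) = a1*p + a2*p^2/2! + a3*p^3/3!  (a[0] is unused).
a = [0.0, 0.3, 1.0, 0.4]
s, k0 = 1.0, 1.5   # packet width and mean momentum (assumed)

def psi_derivs(x):
    """Return [psi, psi', psi''] for psi ~ exp(-x^2/(4 s^2) + i k0 x), normalized."""
    psi = (2 * math.pi * s**2) ** -0.25 * cmath.exp(-x**2 / (4 * s**2) + 1j * k0 * x)
    g = -x / (2 * s**2) + 1j * k0              # psi' = g * psi
    return [psi, g * psi, (g * g - 1 / (2 * s**2)) * psi]

def current_density(x):
    """j(x) = psi^* sum_n a_{n+1}/(n+1)! sum_m (i d_left)^(n-m) (-i d_right)^m psi."""
    d = psi_derivs(x)
    dc = [z.conjugate() for z in d]            # derivatives of psi^* (x is real)
    j = 0j
    for n in range(len(a) - 1):
        for mm in range(n + 1):
            j += (a[n + 1] / factorial(n + 1)) \
                 * (1j) ** (n - mm) * dc[n - mm] * (-1j) ** mm * d[mm]
    return j.real                              # the m and n-m terms are conjugates

# Integrate j(x) on a grid wide enough that the packet has fully decayed.
h = 0.01
total_current = sum(current_density(-12.0 + i * h) for i in range(2401)) * h

# <T'(p)> = a1 + a2 <p> + a3 <p^2> / 2 for the Gaussian packet.
expected = a[1] + a[2] * k0 + 0.5 * a[3] * (k0**2 + 1 / (4 * s**2))
print(total_current, expected)
```

Setting `a = [0.0, 0.0, 1.0 / mass]` for some mass reduces `current_density` to the textbook expression from the start of this post.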

The delta-function Potential Lattice

2013-06-14T00:00:00+00:00

I was messing around with some simple problems and found this simple but illustrative one. It starts you off in basic quantum mechanics and introduces concepts in a very straightforward way, leading toward a more condensed matter perspective while showing some interesting effects that have experimental consequences (energy bands and band gaps opening).

We look at the relatively simple problem¹ of finding the energy spectrum for a particle in the lattice potential

\begin{equation} U(x) = \alpha\sum_{n=-\infty}^\infty \delta(x - n a). \end{equation}

A visual representation of U(x).

The time-independent Schrödinger equation takes the form

\begin{equation} \left[ - \frac{\partial_x^2}{ 2 m} + U(x) \right] \psi(x) = E \psi(x),\end{equation}

where $E$ is the energy.

Since we can solve the problem between the delta-functions quite simply ($U(x) =0$ there), let us restrict our focus to $na \lt x \lt (n+1)a$. Here, the wave function takes on the form

\begin{equation} \psi(x) = A_n e^{ik(x-na)} + B_n e^{-ik(x-na)},\end{equation}

where we have $k = \sqrt{2m E}$. Now, we can find an operator that commutes with the Hamiltonian so that we can diagonalize it to help solve the problem: this will be the operator that translates us by $a$.

To make this clear, let us abstract things to operators so that we have a momentum operator $p$ and a position operator $x$, then we have the commutator $[ x, p] = i$. The operator $p$ commutes with functions of $x$ as though it were a derivative $[ p, f( x)] = -i f'( x)$, so considering the translation operator $e^{i a p }$, we can write

\begin{equation} e^{i a p} U( x) e^{ - i a p} = \sum_{n=0}^\infty \frac1{n!} U^{(n)}( x) a^n = U( x + a),\end{equation}

and since $U(x)$ is periodic in $a$, we have that $e^{i a p} U( x) e^{ - i a p} = U( x)$. Thus, the operator $T_a = e^{i a p}$ commutes with the Hamiltonian and we can simultaneously diagonalize both it and the Hamiltonian. We say a state with $T_a \lvert \psi\rangle = e^{i a q} \lvert \psi \rangle$ has quasi-momentum $q$. It is important to note that this is not the same as real momentum, which is not a well-defined quantum number in this problem (that needs continuous translation symmetry).

In other words, we can write our eigenfunctions such that $\psi(x+a) = e^{i q a} \psi(x)$, and this naturally leads us to relate the coefficients for our eigenfunctions above as

\begin{equation} A_{n-1} = e^{-i q a} A_n, \quad B_{n-1} = e^{-i q a} B_n.\end{equation}

Now, we can apply matching conditions at $x = na$ remembering that our wavefunction should be continuous, but that the delta-function will cause the first derivatives to be discontinuous. The equations for the coefficients are

\begin{align} A_n + B_n & = e^{i k a} A_{n-1} + e^{-i k a} B_{n-1}, \\
\left( 1 + \frac{2 i m \alpha}{k} \right) A_n - \left( 1 - \frac{2 i m \alpha}k \right) B_n & = e^{i k a} A_{n-1} - e^{-ik a} B_{n-1}, \end{align}

and if we insert the relation between the coefficients at $n-1$ and $n$, we get a matrix equation

\begin{align} \begin{pmatrix} e^{i q a} - e^{i k a} & e^{i q a} - e^{- i k a} \\
e^{i q a} \left(1 + \tfrac{2 i m \alpha}k \right) - e^{i k a} & - e^{i q a } \left( 1 - \tfrac{2 i m \alpha}k \right) + e^{-ik a} \end{pmatrix} \begin{pmatrix} A_n \\ B_n \end{pmatrix} = 0. \end{align}

This equation has a non-zero solution only if the determinant of the matrix is zero which can be written as

\begin{equation} \cos q a - f(E) = 0 , \quad f(E) = \cos ka + \frac{m \alpha}{k} \sin ka. \end{equation}

This is an equation which relates the energy to the quasi-momentum. Since $\cos q a$ can only be between -1 and 1, this equation only has a solution when $f(E)$ is between -1 and 1. The oscillatory nature of $f(E)$ means that it should pass through this range multiple (in fact, countably infinitely many) times.

Armed with this equation, we can use $q$ and an integer to label our energies and we obtain the following energy bands by just solving for $E$ (setting $m = 10$, $a = 1$, and $\alpha = 0.3$)

Energy bands in the delta-function lattice

The dotted lines in this plot represent the spectrum if there were no delta-function potentials (displaced in energy by $\alpha / a$ for clarity). Notice how the introduction of the delta-functions opens up gaps in this energy spectrum, so that there are some energies that are inaccessible. The gaps actually open up when $\lvert f(E)\rvert \geq 1$ since there is no solution to our equation there. This energy gap for small $\alpha$ just goes like $\alpha/a$, vanishing as we’d expect when $\alpha = 0$.
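The band structure can be reproduced with a simple scan over energy. The sketch below uses the parameter values quoted above ($m = 10$, $a = 1$, $\alpha = 0.3$); the scan range `E_max` and step `dE` are my own choices. It marks the windows where $\lvert f(E) \rvert \le 1$ and inverts $\cos qa = f(E)$ inside them.

```python
import math

m, a, alpha = 10.0, 1.0, 0.3   # values quoted in the post

def f(E):
    """Right-hand side of cos(q a) = f(E) for E > 0."""
    k = math.sqrt(2.0 * m * E)
    return math.cos(k * a) + (m * alpha / k) * math.sin(k * a)

def find_bands(E_max=3.0, dE=1e-4):
    """Scan E > 0 and return (bottom, top) for each complete window with |f(E)| <= 1."""
    bands, start, prev = [], None, None
    n = int(E_max / dE)
    for i in range(1, n + 1):
        E = i * dE
        if abs(f(E)) <= 1.0:
            if start is None:
                start = E
            prev = E
        elif start is not None:
            bands.append((start, prev))
            start = None
    return bands

def quasi_momentum(E):
    """Quasi-momentum 0 <= q <= pi/a for an energy inside a band."""
    return math.acos(f(E)) / a

bands = find_bands()
for bottom, top in bands:
    print(f"band from E = {bottom:.4f} to E = {top:.4f}")
```

For $\alpha \gt 0$ the top of the $n$-th band sits where $\sin ka = 0$, i.e. at $ka = n\pi$, which the scan should reproduce to within the step size.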

Notice that in this energy spectrum, we see that if $q \rightarrow -q$ we get the same energy.

Additionally, if the energy ever goes negative the solutions turn from plane waves $e^{\pm i k (x-na)}$ into functions localized around the delta functions $e^{\pm\kappa(x-na)}$. In fact, the wave functions look like this for a $q=0$ state:

Negative energy localized states in delta-function lattice with alpha less than 0.

These only appear when $\alpha \lt 0$, and are related to the fact that the delta-function potential has a bound state. Additionally, only one band can ever have this state. This is due to the fact that the oscillatory sine and cosine change to their non-oscillatory hyperbolic counterparts. However, these states in the delta-function lattice are spread out throughout the crystal and cannot be said to be “localized” to a specific site; they still all have a definite quasi-momentum.

As with other single particle problems, upon considering the many particle picture, these energy bands get filled up to a set energy level (if we are considering fermions).

This is problem 2.53 in Exploring Quantum Mechanics by Galitski, Karnakov, Kogan, and Galitski. ↩

Use GmailTeX to compose and view emails with LaTeX

2013-05-16T00:00:00+00:00

I’ve received numerous emails with pseudo-LaTeX in them, and I’ve composed many emails as well. My normal solution is to convert LaTeX to unicode with the helpful Mac application Unicodeit. However, this method is incomplete and misses some of the more complicated LaTeX.

To address this, there is an extension to Gmail called GmailTeX available on Chrome, Firefox, Safari, and Opera (it also has a bookmarklet for any other browser).

If someone sends you an email in pseudo-LaTeX, it can try to parse the math with its simple math function. The resulting output is roughly what you’d get by using Unicodeit. Or, if someone sends you LaTeX inside dollar signs (i.e., $ […] $), then it can render that LaTeX for you with its rich math function.

But best of all, when you compose emails and use the rich math function, it creates an image of your math hosted on a remote server. Recipients don’t need to have GmailTeX installed unless you decide to send them the code itself.

The best collaborative apps require little to no commitment for collaborators to use, and this is one of those cases. Collaborators will just receive email with LaTeX images without needing to install anything.

The only issue that I’ve found from playing around with this extension is that the receiver may have to flag your email as “trustworthy” and/or allow images from remote servers to be viewed in their email client. Otherwise, they may not see any mathematics. Additionally, as you might have guessed, the emails won’t be viewable in an offline mode. Unless you’re going through these emails on an airplane though, I don’t think that should be too big of an issue.

(h/t Brian Danielak)

Spin-orbit coupled Hamiltonian

2013-04-28T00:00:00+00:00

For applications in many parts of condensed matter physics and cold atoms physics, we use what is known as the Rashba spin-orbit coupled Hamiltonian. This Hamiltonian is so-named because it couples momentum $\mathbf{p}$ to the spin $\mathbf{S}=\frac12\sigma$ where $\sigma = (\sigma_x,\sigma_y,\sigma_z)$ are the Pauli matrices and $\mathbf{p}=(p_x,p_y,p_z)$ is a vector of momentum operators:

\begin{equation} H = \frac{p^2}{2m} + \alpha (\boldsymbol{\sigma} \times \mathbf{p})\cdot \hat{\mathbf{z}} + \Delta \sigma_z. \end{equation}

$m$ is the mass, $\alpha$ is the spin-orbit coupling strength, and $\Delta$ is some Zeeman field (it acts as magnetic field on the spin).

In this post, we go through the calculation of the energy spectrum and eigenvectors – a straightforward exercise in undergraduate linear algebra.

First of all, instead of the normal method of finding eigenvectors, we note that we can rewrite this Hamiltonian in the form

\begin{equation} H = \frac{p^2}{2m} + \mathbf{b}(p) \cdot \boldsymbol{\sigma} \end{equation}

where $\mathbf{b}(p) = (\alpha p_y, -\alpha p_x, \Delta)$. Now, the direction of $\mathbf{b}(p)$ represents a point on the Bloch sphere, and so we expect the eigenvectors to be parallel and anti-parallel to this vector. The energies in this case are very straightforward and amount to the positive and negative of $\lvert\mathbf{b}(p)\rvert$:

\begin{equation} \epsilon_\pm(p) = \frac{p^2}{2m} \pm \sqrt{ \alpha^2 p^2 + \Delta^2}. \end{equation}
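A quick way to check these energies is to write $H$ as an explicit $2\times2$ matrix at one momentum and confirm that $\epsilon_\pm$ are roots of its characteristic polynomial. In the sketch below, the values of $m$, $\alpha$, and $\Delta$ are the ones used for the band plot in this post, while the momentum `(px, py)` is an arbitrary choice of mine.

```python
import math

m, alpha, Delta = 1.0, 3.0, 2.0   # values used for the band plot in this post
px, py = 0.4, -0.7                # arbitrary in-plane momentum (assumed)

# b(p) . sigma written out as a 2x2 matrix in the sigma_z basis.
b = (alpha * py, -alpha * px, Delta)
kinetic = (px**2 + py**2) / (2 * m)
H = [[kinetic + b[2], b[0] - 1j * b[1]],
     [b[0] + 1j * b[1], kinetic - b[2]]]

def char_poly(eps):
    """det(H - eps * identity); it vanishes exactly at the eigenvalues of H."""
    return (H[0][0] - eps) * (H[1][1] - eps) - H[0][1] * H[1][0]

b_norm = math.sqrt(alpha**2 * (px**2 + py**2) + Delta**2)
eps_plus, eps_minus = kinetic + b_norm, kinetic - b_norm
print(abs(char_poly(eps_plus)), abs(char_poly(eps_minus)))
```

Both printed numbers should vanish up to rounding, and the sum of the two energies equals the trace of `H`, i.e. twice the kinetic term.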

With these eigenvalues, it is a straightforward exercise in linear algebra to find the eigenvectors. After a bit of algebra, the eigenvectors of $H$ in terms of the eigenvectors of $\sigma_z$ ( $\sigma_z\left\lvert\uparrow\right\rangle = \left\lvert\uparrow\right\rangle$ and $\sigma_z\left\lvert\downarrow\right\rangle = -\left\lvert\downarrow\right\rangle$ ) are

\begin{equation}\left\lvert\pm\right\rangle = \frac1{\sqrt2}\left[\sqrt{1 \pm \frac{\Delta}{\sqrt{\Delta^2+\alpha^2 p^2}}}\left\lvert\uparrow\right\rangle \pm e^{-i\phi} \sqrt{1 \mp \frac{\Delta}{\sqrt{\Delta^2+\alpha^2 p^2}}}\left\lvert\downarrow\right\rangle \right]\end{equation}

where we have defined $\phi$ by $p_y+ip_x = p e^{i\phi}$. Note that when $p_{x,y} \rightarrow -p_{x,y}$, the occupations stay the same. However, if we just look at one energy, $\epsilon_-(p)$ the ground state energy, we see that the state we get when $p_{x,y} \rightarrow -p_{x,y}$ is almost orthogonal to the original state.

The energy bands themselves look like the figure on the right where the vertical axis is energy (and for this particular example, $m=1$, $\alpha = 3$, and $\Delta=2$). Interestingly, the introduction of $\Delta$ actually causes the gap to open up – the dotted lines are for when $\Delta=0$.

Now, if we have a bunch of fermions filling up these energies and we set the chemical potential to be in the gap, we would find that the only excitations would be states that are spin-locked to the momentum.

Many things can be done with this Hamiltonian to interesting effect. It finds its way into cold atom physics as well as condensed matter.