The Heisenberg Relation - Mathematical Formulations

We study some of the possibilities for formulating the Heisenberg relation of quantum mechanics in mathematical terms. In particular, we examine the framework discussed by Murray and von Neumann, the family (algebra) of operators affiliated with a finite factor (of infinite linear dimension).


Introduction
The celebrated Heisenberg relation, QP − PQ = iℏI, where ℏ = h/2π and h is Planck's experimentally determined quantum of action (≈ 6.625 × 10⁻²⁷ erg sec), is one of the basic relations (perhaps, the most basic relation) of quantum mechanics. Its form announces, even before a tentative mathematical framework for quantum mechanics has been specified, that the mathematics of quantum mechanics must be non-commutative. By contrast, the mathematics suitable for classical mechanics is the commutative mathematics of algebras of real-valued functions on spaces (manifolds).
Our program in this article is to study specific mathematical formulations that have the attributes necessary to accommodate the calculations of quantum mechanics, more particularly, to accommodate the Heisenberg relation. We begin that study by noting the inadequacy of two natural candidates. We turn, after that, to a description of the "classic" representation and conclude with a model, suggested by von Neumann, especially suited to calculations with unbounded operators. Von Neumann had hoped that this model might resolve the mathematical problems that the founders of quantum mechanics were having with those calculations. In connection with this hope and the Heisenberg relation, we answer a question that had puzzled a number of us.

Planck's radiation formula, 8πhcλ⁻⁵(e^{hc/kλT} − 1)⁻¹ dλ (here, k is Boltzmann's constant), expresses the energy per unit volume inside a cavity with reflecting walls associated with the wave lengths lying between λ and λ + dλ of a full (black body) radiator at (absolute) temperature T. Try as he might to explain his formula purely in terms of Maxwell's classical electromagnetic theory, Planck could not rid himself of the need for the assumption that energy emitted and absorbed by one of the basic units of the radiating system, a linear (harmonic) oscillator of frequency ν, occurred as integral multiples of hν. If Planck assumed that there is no smallest non-zero amount of energy that each unit (oscillator) could emit, then he was led to the classical Rayleigh-Jeans radiation formula 8πkTλ⁻⁴ dλ, which is a good approximation to Planck's formula for larger wave lengths. However, as λ → 0, the energy associated with the oscillator of this wave length tends to ∞; and the total energy per unit volume in the cavity due to this part of the spectrum, according to the Rayleigh-Jeans formula, also tends to ∞ as λ → 0. This breakdown of the formula, being associated with short wave lengths, those corresponding to the ultraviolet end of the spectrum, was termed the "ultraviolet catastrophe".
Planck was forced to assume that there was a "quantum of action," h erg sec, for his formula to agree with experimental measurements at the high-frequency as well as the low-frequency end of the radiation spectrum. It is no small irony that, for some years, Planck was deeply disappointed by the shocking break with the principles of classical mechanics embodied in his revolutionary discovery. It was not yet clear to him that this discovery was to become the fundamental physical feature of "small" systems. Not many years later, others began to make the assumption of "quanta" in different physical processes involving small systems.
Any discussion of the inadequacy of classical mechanics for explaining the phenomena noted in experiments on systems at the subatomic scale must include some examination of the startling evidence of the dual, corpuscular (material "particle") and wave, nature of light. Maxwell (1865) had shown that the speed of electromagnetic-wave propagation in a vacuum is a constant (representing the ratio of the electromagnetic unit of charge to the electrostatic unit, abcoulomb/esu), approximately 3 × 10¹⁰ cm/sec. He noted that this is (close to) the speed of light and concluded that light is a form of electromagnetic wave. Measurements indicate that the wave length of the visible spectrum lies between 0.000040 cm (= 4000 × 10⁻⁸ cm = 4000 Ångström) at the violet end and 0.000076 cm (= 7600 Å) at the red end. Above this, up to 0.03 cm, is the infrared spectrum. Below the violet is the ultraviolet spectrum extending to 130 Å; below this, the X-rays from 100 Å to 0.1 Å, and the γ-rays from 0.1 Å to 0.005 Å.
Another type of wave behavior exhibited by light is the phenomenon of polarization. A pair of thin plates cut from a crystal of tourmaline allows no light to pass through it if one is held behind the other with their optical axes perpendicular. As one of the plates is rotated through a 90° angle, more and more of the light passes through, the maximum occurring when the axes are parallel - an indication of light behaving as a transverse wave.
The phenomenon of (wave) interference provides additional evidence of the transverse-wave character of light. Two waves of the same frequency and amplitude are superimposed. If they are in phase, they "reinforce" one another. If they are in phase opposition, they cancel.
Further evidence of the wave nature of light is found in the phenomenon of diffraction - the modification waves undergo in passing the edges of opaque bodies or through narrow slits, in which the wave direction appears to bend, producing fringes of reinforcement and cancellation (light and dark). A diffraction grating, which consists, in essence, of a transparent plate on which parallel, evenly spaced, opaque lines are scribed - several thousand to the centimeter - uses interference and diffraction to measure wave length. A brief, simple, geometric examination of what is happening to the light, considered as transverse waves, during interference and diffraction shows how this measurement can be made.
In 1912, von Laue proposed that crystals might be used as natural diffraction gratings for diffracting high-frequency X-rays. The nature of X-rays was far from clear at that point. Von Laue was convinced by experiments a year earlier of C. G. Barkla that X-rays are electromagnetic waves, but waves of very short wave lengths. In order for interference effects to produce fringe patterns with a grating, the distance between the "slits" cannot be much larger than the wave lengths involved. Presumably the atoms in the crystals von Laue envisioned are closely and symmetrically spaced; the spaces between the atomic planes would act as the slits of the grating. As it turns out, the distance between neighboring atoms in a crystal is about 1 Å. Von Laue suggested to W. Friedrich and P. Knipping that they examine his ideas and calculations experimentally. They did, and confirmed his conjectures.
Einstein's 1905 description of the "photoelectric effect" is one of the basic known instances of these early "ad hoc" quantum assumptions. J. J. Thomson and Lenard noted that ultraviolet light falling on metals causes an emission of electrons. Varying the intensity of the light does not change the velocity, but does change the number, of the electrons emitted. Einstein used Planck's assumption that energy in radiation is emitted and absorbed in quanta of size hν, where ν is the frequency. Einstein pictures the light as waves in which energy is distributed discretely over the wave front in quanta (called photons) with energy hν and momenta h/λ. His photoelectric equation, ½mv² = hν − a, expresses the maximum kinetic energy of an emitted electron when the frequency of the incident radiation is ν and a is the energy required to remove one of the lightly bound electrons (a varies with the metal). The photoelectric effect is an indication of the corpuscular (material-particle) nature of light. Perhaps the most dramatic instance of the early appearances of the ad hoc quantum assumptions was Niels Bohr's 1913 explanation, in theoretical terms, of the lines in the visible portion of the spectrum of hydrogen, the "Balmer series". Their wave lengths are: 6563 Å (red), 4861 Å (blue), 4341 Å, 4102 Å, and 3970 Å (at the ultraviolet end). Bohr uses Rutherford's "planetary" model of the atom as negatively charged electrons moving with uniform angular velocities in circular orbits around a central nucleus containing positively charged protons under an attractive Coulomb force. In the case of hydrogen, there is one electron and one proton with charges −e and +e; so, the attractive force between the electron and the proton is e²/r². If ω is the uniform angular velocity of the electron, its linear velocity is rω (tangential to the orbit), where r is the radius of its orbit, and its linear acceleration is rω², directed "inward" along its radius.
The moment of inertia I of the electron about the nucleus is mr², where m is the (rest) mass of the electron (9.11 × 10⁻²⁸ gm); I is the measure of its tendency to resist change in its rotational motion, as mass is the measure of its tendency to resist change in its linear motion. The "angular momentum" of the electron is Iω. Bohr's single quantum assumption is that the angular momentum in its stable orbits should be an integral multiple of ℏ (= h/2π). That is, mr²ω = kℏ, with k an integer, for those r at which the electron occupies a possible orbit.
At this point, it is worth moving ahead ten years in the chronological development, to note de Broglie's 1923 synthesis of the increasing evidence of the dual nature of waves and particles; he introduces "matter waves". De Broglie hypothesized that particles of small mass m moving with (linear) speed v would exhibit a wave-like character with wave length h/mv. Compare this with Einstein's assumption of momentum h/λ (= mv). So, for perspective, an electron moving at c/3 would have wave length:

h/mv = (6.625 × 10⁻²⁷ erg sec)/(9.11 × 10⁻²⁸ gm × 10¹⁰ cm/sec) = (66.25 × 10⁻¹⁰/9.11) cm ≈ 0.0727 Å.
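This arithmetic is easy to check; a minimal sketch in Python, using the constants quoted in the text (CGS units):

```python
# De Broglie wave length h/mv for an electron moving at roughly c/3.
h = 6.625e-27    # erg sec, Planck's quantum of action
m_e = 9.11e-28   # gm, electron rest mass
v = 1e10         # cm/sec, roughly c/3

lam_cm = h / (m_e * v)        # wave length in cm
lam_angstrom = lam_cm * 1e8   # 1 Angstrom = 1e-8 cm
```

The result agrees with the value 0.0727 Å computed above.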
Returning to the Bohr atom, Bohr's quantum assumption, mr²ω = kℏ, can be rewritten as 2πr = k(h/mrω). Combining this with de Broglie's principle, and noting that rω is the linear speed of the electron (directed tangentially to its orbit), h/mrω is its wave length when it is viewed as a wave. Of course, 2πr is the length of its (stable) orbit. It is intuitively satisfying that the stable orbits are those with radii such that they accommodate an integral number of "complete" wave cycles, a "standing wave-train". Considering, again, the hydrogen atom, and choosing units so that no constant of proportionality is needed (the charge e is in esu - electrostatic units), we have that mrω² = e²/r². From Bohr's quantum assumption, mr²ω = kh/2π. Thus

r = k²h²/4π²me².

The values 1, 2, 3, ... of k give the possible values of r for the stable states.
The kinetic energy of the electron in the orbit corresponding to r is ½mv² (= ½mr²ω²). The potential energy of the electron in this Coulomb field can be taken as the work done in bringing it from ∞ to its orbit of radius r. That is, its potential energy is −e²/r. Since mrω² = e²/r², the kinetic energy ½mr²ω² is e²/2r, and the total energy is e²/2r − e²/r = −e²/2r = −2π²me⁴/k²h². The differences in energy levels will be

E_l − E_k = (2π²me⁴/h²)(1/k² − 1/l²)   (k < l).

The wave number (that is, the number of waves per centimeter) is given by (E_l − E_k)/hc, that is, by

(2π² × 9.11 × 10⁻²⁸ gm × (4.8025 × 10⁻¹⁰ esu)⁴)/((6.625 × 10⁻²⁷ erg sec)³ × 2.99776 × 10¹⁰ cm/sec) (= 109,739.53/cm) × (1/k² − 1/l²).

If we substitute 2 for k, then: 3 for l gives λ = 6561 Å; 4 for l gives 4860 Å; 5 for l gives 4339 Å; 6 for l gives 4101 Å; 7 for l gives 3969 Å. Comparing these wave lengths with those noted before (from spectroscopy), we see startling agreement, especially when we note that the physical constants that we use are approximations derived from experiments. Of course, poor choices for the rest mass and charge of the electron would produce unacceptable values for wave lengths in the hydrogen spectrum. It was a happy circumstance that reasonably accurate values of the mass and charge of the electron were available after the 1912 Millikan "oil drop" experiment.
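These wave lengths can be recomputed directly from the quoted constants. The sketch below (Python, CGS units) rebuilds the Rydberg constant from m, e, h, and c, and then the five Balmer lines; all numerical inputs are the values quoted in the text:

```python
import math

# Constants as quoted in the text (CGS; e is the electron charge in esu).
m = 9.11e-28    # gm, electron rest mass
e = 4.8025e-10  # esu, electron charge
h = 6.625e-27   # erg sec
c = 2.99776e10  # cm/sec

# Rydberg constant 2 pi^2 m e^4 / (h^3 c), in waves per cm.
R = 2 * math.pi**2 * m * e**4 / (h**3 * c)

def balmer_angstrom(l):
    """Wave length of the k = 2 -> l transition, in Angstrom."""
    wave_number = R * (1/4 - 1/l**2)   # waves per cm
    return 1e8 / wave_number           # 1 cm = 1e8 Angstrom

lines = [round(balmer_angstrom(l)) for l in (3, 4, 5, 6, 7)]
```

Running this reproduces the Rydberg constant to within rounding of the input constants and yields the five wave lengths 6561, 4860, 4339, 4101, and 3969 Å quoted above.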
This striking evidence of the efficacy of uniting the principles of Newtonian (Hamiltonian) mechanics and "ad hoc" quantum assumptions makes clear the importance of finding a mathematical model capable of holding, comfortably, within its structure both classical mechanical principles and a mathematics that permits the formulation of those "ad hoc" quantum assumptions. We study the proposal (largely Dirac's) for such a mathematical structure in the section that follows.

Quantum mechanics - a mathematical model
In Dirac's treatment of physical systems [2], there are two basic constituents: the family of observables and the family of states in which the system can be found. In classical (Newtonian-Hamiltonian) mechanics, the observables are algebraic combinations of the (canonical) coordinates and (conjugate) momenta. Each state is described by an assignment of numbers to these observables -the values certain to be found by measuring the observables in the given state. The totality of numbers associated with a given observable is its spectrum. In this view of classical statics, the observables are represented as functions on the space of states -they form an algebra, necessarily commutative, relative to pointwise operations. The experiments involving atomic and sub-atomic phenomena made it clear that this Newtonian view of mechanics would not suffice for their basic theory. Speculation on the meaning of these experimental results eventually led to the conclusion that the only physically meaningful description of a state was in terms of an assignment of probability measures to the spectra of the observables (a measurement of the observable with the system in a given state will produce a value in a given portion of the spectrum with a specific probability). Moreover, it was necessary to assume that a state that assigns a definite value to one observable assigns a dispersed measure to the spectrum of some other observable -the amount of dispersion involving the experimentally reappearing Planck's constant. So, in quantum mechanics, it is not possible to describe states in which a particle has both a definite position and a definite momentum. The more precise the position, the less precise the momentum. This is the celebrated Heisenberg uncertainty principle [5]. It entails the non-commutativity of the algebra of observables.
The search for a mathematical model that could mirror the structural features of this system and in which computations in accord with experimental results could be made produced the self-adjoint operators (possibly unbounded) on a Hilbert space as the observables and the unit vectors (up to a complex multiple of modulus 1) as corresponding to the states [7]. If A is an observable and x corresponds to a state of interest, ⟨Ax, x⟩, the inner product of the two vectors Ax and x, is the real number we get by taking the average of many measurements of A with the system in the state corresponding to x. Each such measurement yields a real number in the spectrum of A. The probability that that measurement will lie in a given subset of the spectrum is the measure of that set, using the probability measure that the state assigns to A. The "expectation" of the observable A in the state corresponding to x is ⟨Ax, x⟩.
With this part of the model in place, Dirac assigns a self-adjoint operator H as the energy observable and, by analogy with classical mechanics, assumes that it will "generate" the dynamics, the time-evolution of the system. This time-evolution can be described in two ways, either as the states evolving in time, the "Schrödinger picture" of quantum mechanics, or the observables evolving in time, the "Heisenberg picture" of quantum mechanics. The prescription for each of these pictures is given in terms of the one-parameter unitary group t → U_t, where t ∈ R, the additive group of real numbers, and U_t is the unitary operator exp(itH), formed by applying the spectral-theoretic function calculus to the self-adjoint operator H, the Hamiltonian of our system. If the initial state of our system corresponds to the unit vector x, then at time t, the system will have evolved to the state corresponding to the unit vector U_t x. If the observable corresponds to the self-adjoint operator A at time 0, at time t, it will have evolved to U_t* A U_t (= α_t(A)), where, as can be seen easily, t → α_t is a one-parameter group of automorphisms of the "algebra" (perhaps, "Jordan algebra") of observables. In any event, the numbers we hope to measure are ⟨AU_t x, U_t x⟩, the expectation of the observable A in the state (corresponding to) U_t x, as t varies, and/or ⟨(U_t* A U_t)x, x⟩, the expectation of the observable α_t(A) in the state x, as t varies. Of course, the two varying expectations are the same, which explains why Heisenberg's "matrix mechanics" and Schrödinger's "wave mechanics" gave the same results. (In Schrödinger's picture, x is a vector in the Hilbert space viewed as L²(R³), so that x is a function, the "wave function" of the state, evolving in time as U_t x, while in Heisenberg's picture, the "matrix" coordinates of the operator A evolve in time as α_t(A).)
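The equality of the two varying expectations can be illustrated in finite dimensions. The following sketch uses a hypothetical random 4 × 4 Hamiltonian and observable (assumptions for illustration only), forming U_t = exp(itH) through the eigendecomposition of H, the finite-dimensional version of the function calculus:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_self_adjoint(n):
    """A hypothetical self-adjoint n x n matrix."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

n = 4
H = random_self_adjoint(n)                # the Hamiltonian
A = random_self_adjoint(n)                # an observable
x = rng.normal(size=n) + 1j * rng.normal(size=n)
x = x / np.linalg.norm(x)                 # a unit vector: a state

t = 1.7
# U_t = exp(itH), via the function calculus applied to H's eigendecomposition.
eigvals, eigvecs = np.linalg.eigh(H)
U_t = eigvecs @ np.diag(np.exp(1j * t * eigvals)) @ eigvecs.conj().T

# Schrodinger picture: the state evolves; Heisenberg picture: the observable does.
schrodinger = np.vdot(U_t @ x, A @ (U_t @ x)).real   # <A U_t x, U_t x>
heisenberg = np.vdot(x, (U_t.conj().T @ A @ U_t) @ x).real
```

The two expectations agree to machine precision, for any t.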
The development of modern quantum mechanics in the mid-1920s was an important motivation for the great interest in the study of operator algebras in general and von Neumann algebras in particular. In [21] von Neumann defines a class of algebras of bounded operators on a Hilbert space that have acquired the name "von Neumann algebras" [3] (Von Neumann refers to them as "rings of operators"). Such algebras are self-adjoint, strong-operator closed, and contain the identity operator. Von Neumann's article [21] opens up the subject of "operator algebras" (see also [16,17,18,23]).
We use [9,10,11,12,13] as our basic references for results in the theory of operator algebras as well as for much of our notation and terminology. Let H be a Hilbert space over the complex numbers C and let ⟨· , ·⟩ denote the (positive definite) inner product on H. By definition, H is complete relative to the norm defined by the equation ‖x‖ = ⟨x, x⟩^{1/2} (x ∈ H). If K is another Hilbert space and T is a linear operator (or linear transformation) from H into K, T is continuous if and only if sup{‖Tx‖ : x ∈ H, ‖x‖ ≤ 1} < ∞. This supremum is referred to as the norm or (operator) bound of T. Since continuity is equivalent to the existence of a finite bound, continuous linear operators are often described as bounded linear operators. The family B(H, K) of all bounded linear operators from H into K is a Banach space relative to the operator norm. When K = H, we write B(H) in place of B(H, H). In this case, B(H) is a Banach algebra with the operator I, the identity mapping on H, as a unit element.
If T is in B(H, K), there is a unique element T* of B(K, H) such that ⟨Tx, y⟩ = ⟨x, T*y⟩ (x ∈ H, y ∈ K). We refer to T* as the adjoint of T. Moreover, (aT + bS)* = āT* + b̄S*, (T*)* = T, ‖T*T‖ = ‖T‖², and ‖T‖ = ‖T*‖, whenever S, T ∈ B(H, K) and a, b ∈ C. When H = K, we have that (TS)* = S*T*. In this same case, we say that T is self-adjoint when T = T*. A subset of B(H) is said to be self-adjoint if it contains T* when it contains T.
The metric on B(H) that assigns ‖T − S‖ as the distance between T and S gives rise to the norm or uniform topology on B(H). There are topologies on B(H) that are weaker than the norm topology. The strong-operator topology is the weakest topology on B(H) such that the mapping T → Tx is continuous for each vector x in H. The weak-operator topology on B(H) is the weakest topology on B(H) such that the mapping T → ⟨Tx, y⟩ is continuous for each pair of vectors x and y in H.
The self-adjoint subalgebras of B(H) containing I that are closed in the norm topology are known as C*-algebras. Each abelian C*-algebra is isomorphic to the algebra C(X) (under pointwise addition and multiplication) of all complex-valued continuous functions on a compact Hausdorff space X. Each C(X) is isomorphic to some abelian C*-algebra. The identification of the family of abelian C*-algebras with the family of function algebras C(X) underlies the interpretation of the general study of C*-algebras as noncommutative (real) analysis. This "noncommutative" view guides the research and provides a large template for the motivation of the subject. When noncommutative analysis is the appropriate analysis, as in quantum theory [15,24], operator algebras provide the mathematical framework.
Those self-adjoint operator algebras that are closed in the strong-operator topology are called von Neumann algebras (each von Neumann algebra is a C*-algebra). We describe some examples of commutative von Neumann algebras. Suppose (S, µ) is a σ-finite measure space. Let H be L²(S, µ). With f an essentially bounded measurable function on S, we define M_f(g) to be the product f·g for each g in H. The family A = {M_f} of these multiplication operators is an abelian von Neumann algebra, and it is referred to as the multiplication algebra of the measure space (S, µ). Moreover, A is contained in no larger abelian subalgebra of B(H); we say A is a maximal abelian (self-adjoint) subalgebra, a masa. Here are some specific examples arising from choosing explicit measure spaces. Choose for S a finite or countable number of points, say n, each of which has positive measure (each is an atom). We write "A = A_n" in this case. Another example is given by choosing, for S, [0, 1] with Lebesgue measure. In this case, we write "A = A_c" ("c" stands for "continuous"). Finally, choose, for S, [0, 1] with Lebesgue measure plus a finite or countably infinite number n of atoms. We write "A = A_c ⊕ A_n" in this case.
Theorem 3.1. Each abelian von Neumann algebra on a separable Hilbert space is isomorphic to one of A_n, A_c, or A_c ⊕ A_n. Each maximal abelian von Neumann algebra on a separable Hilbert space is unitarily equivalent to one of these.
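The atomic example A_n can be seen in miniature: for S a set of n atoms, L²(S) is Cⁿ and the multiplication operators M_f are the diagonal matrices. The sketch below (with hypothetical random entries, chosen only for illustration) shows why this algebra is maximal abelian: entrywise, (DB − BD)_{ij} = (d_i − d_j)B_{ij}, so a matrix commuting with a diagonal matrix whose diagonal entries are distinct must itself be diagonal.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
# M_f for an f taking distinct values on the n atoms: a diagonal matrix.
D = np.diag(rng.normal(size=n))
B = rng.normal(size=(n, n))      # an arbitrary operator on C^n
B_diag = np.diag(np.diag(B))     # its diagonal part

# Diagonal matrices commute with D; a generic non-diagonal B does not,
# since (DB - BD)_ij = (d_i - d_j) B_ij.
diag_commutes = np.allclose(D @ B_diag, B_diag @ D)
full_commutes = np.allclose(D @ B, B @ D)
```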
In the early chapters of [2], Dirac points out that Hilbert spaces and their orthonormal bases, if chosen carefully, can be used to simplify calculations and to determine probabilities, for example, finding the frequencies of the spectral lines in the visible range of the hydrogen atom (the Balmer series), that is, the spectrum of the operator corresponding to the energy "observable" of the system, the Hamiltonian. In mathematical terms, Dirac is noting that bases, carefully chosen, will simultaneously "diagonalize" the self-adjoint operators in an abelian (or "commuting") family, notably, the masas we have just been describing.
The early experimental work that led to quantum mechanics made it clear that, when dealing with systems at the atomic scale, where the measurement process interferes with what is being measured, we are forced to model the physics of such systems, at a single instant of time, as an algebraic mathematical structure that is not commutative. Dirac thinks of his small, physical system as an algebraically structured family of "observables" - elements of the system to be observed when studying the system; for example, the position of a particle in the system would be an observable Q (a "canonical coordinate") and the (conjugate) momentum of that particle another observable P - and they are independent of time. As the particle moves under the "dynamics" of the system, the position Q and momentum P become time dependent. By analogy with classical mechanics, Dirac refers to them, in this case, as "dynamical variables". He recalls the Hamilton equation of motion for a general dynamical variable u that is a function of the canonical coordinates {q_r} and their conjugate momenta {p_r}: du/dt = [u, H], where [u, H] is the Poisson bracket Σ_r(∂u/∂q_r ∂H/∂p_r − ∂u/∂p_r ∂H/∂q_r) and H is the energy expressed as a function of the q_r and p_r and, possibly, of t. This H is the Hamiltonian of the system. Hence, with v a dynamical variable that is a function of the q_r and p_r, but not explicitly of t, dv/dt = [v, H]. For the canonical variables themselves, [q_r, q_s] = 0 = [p_r, p_s] and [q_r, p_s] = δ_{r,s}, where δ_{r,s} is the Kronecker delta, 1 when r = s and 0 otherwise. So, Dirac assumes that the quantum Poisson brackets of the position observables Q_r and the momentum observables P_s satisfy these same relations. In the case of one degree of freedom, that is, one Q (and its conjugate momentum P), QP − PQ = iℏI, the basic Heisenberg relation. This relation encodes the non-commutativity needed to produce the so-called "ad hoc quantum assumptions" made by the early workers in quantum physics. At the same time, this relation gives us a "numerical grip" on "uncertainty" and "indeterminacy" in quantum mechanics.
In addition, the Heisenberg relation makes it clear (regrettably) that quantum mechanics cannot be modeled using finite matrices alone. The trace of QP − PQ is 0 when Q and P are such matrices, while the trace of iℏI is not 0 (no matter how we normalize the trace). It can be shown that the Heisenberg relation cannot be satisfied even with bounded operators on an infinite-dimensional Hilbert space. Unbounded operators are needed, indeed unavoidable, for "representing" (that is, "modeling") the Heisenberg relation mathematically. This topic is studied in the following sections.
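The trace obstruction is easy to see numerically; a sketch with hypothetical random 6 × 6 matrices (any finite matrices would do, since tr(QP) = tr(PQ)):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
Q = rng.normal(size=(n, n))
P = rng.normal(size=(n, n))

# The trace of any commutator of finite matrices vanishes...
commutator_trace = np.trace(Q @ P - P @ Q)

# ...while the trace of i*hbar*I on C^n is i*hbar*n, never 0.
hbar = 1.0545e-27   # erg sec, h/(2*pi)
rhs_trace = 1j * hbar * n
```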

Basics of unbounded operators on a Hilbert space
What follows is a compendium of material drawn from Sections 2.7, 5.2, 5.6, and 6.1 of [10,11]: material that we need in the succeeding sections gathered together here for the convenience of the reader.

Definitions and facts
Let T be a linear mapping, with domain D(T) a linear submanifold (not necessarily closed) of the Hilbert space H, into the Hilbert space K. We associate a graph G(T) with T, where G(T) = {(x, Tx) : x ∈ D(T)}, a linear submanifold of H ⊕ K. We say that T is closed when G(T) is closed. The closed graph theorem tells us that if T is defined on all of H, then G(T) is closed if and only if T is bounded. The unbounded operators T we consider will usually be densely defined, that is, D(T) is dense in H. We say that T₀ extends (or is an extension of) T, and write T ⊆ T₀, when D(T) ⊆ D(T₀) and T₀x = Tx for each x in D(T). If G(T)⁻, the closure of the graph of T (a linear subspace of H ⊕ K), is the graph of a linear transformation T̄ (clearly, T̄ is the "smallest" closed extension of T), we say that T is preclosed (or closable) and refer to T̄ as the closure of T. From the point of view of calculations with an unbounded operator T, it is often much easier to study its restriction T|D₀ to a dense linear manifold D₀ in its domain D(T) than to study T itself. If T is closed and G(T|D₀)⁻ = G(T), we say that D₀ is a core for T. Each dense linear manifold in G(T) corresponds to a core for T.
If T is a linear transformation with D(T) dense in the Hilbert space H and range contained in the Hilbert space K, we define a mapping T*, the adjoint of T, as follows. Its domain consists of those vectors y in K such that, for some vector z in H, ⟨x, z⟩ = ⟨Tx, y⟩ for all x in D(T). For such y, T*y is z. If T = T*, we say that T is self-adjoint. (Note that the formal relation ⟨Tx, y⟩ = ⟨x, T*y⟩, familiar from the case of bounded operators, remains valid in the present context only when x ∈ D(T) and y ∈ D(T*).) We note, too, that if T is closed, T*T + I is one-to-one with range H and has a positive inverse of bound not exceeding 1.
Definition 4.4. We say that T is symmetric when D(T) is dense in H and ⟨Tx, y⟩ = ⟨x, Ty⟩ for all x and y in D(T). Equivalently, T is symmetric when T ⊆ T*. (Since T* is closed and G(T) ⊆ G(T*) in this case, T is preclosed if it is symmetric. If T is self-adjoint, T is both symmetric and closed.) It follows that a self-adjoint operator A has no proper symmetric extension. That is, a self-adjoint operator is maximal symmetric.
Proposition 4.6. If T is a closed symmetric operator on the Hilbert space H, the following assertions are equivalent: (i) T is self-adjoint; (ii) T* ± iI have (0) as null space; (iii) T ± iI have H as range; (iv) T ± iI have ranges dense in H.
Proposition 4.7. If T is a closed linear operator with domain dense in a Hilbert space H and with range in H, then N(T) = I − R(T*), where N(T) and R(T) denote the projections whose ranges are, respectively, the null space of T and the closure of the range of T.

Spectral theory
If A is a bounded self-adjoint operator acting on a Hilbert space H and A is an abelian von Neumann algebra containing A, there is a family {E_λ} of projections in A (indexed by R), called the spectral resolution of A, such that A = ∫ from −‖A‖ to ‖A‖ of λ dE_λ, in the sense of norm convergence of approximating Riemann sums; A is the norm limit of finite linear combinations of differences of the E_λ, with coefficients in sp(A), the spectrum of A. With the abelian von Neumann algebra A isomorphic to C(X) and X an extremely disconnected compact Hausdorff space, if f and e_λ in C(X) correspond to A and E_λ in A, then e_λ is the characteristic function of the largest clopen subset X_λ on which f takes values not exceeding λ.
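In finite dimensions the spectral resolution is completely transparent, and the Riemann-Stieltjes integral collapses to a finite sum over the spectrum. A sketch (with a hypothetical random self-adjoint matrix): E_λ is the projection onto the span of the eigenvectors with eigenvalue at most λ, and A is recovered as the sum of λ(E_λ − E_{λ⁻}) over sp(A).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
M = rng.normal(size=(n, n))
A = (M + M.T) / 2                       # a bounded self-adjoint operator
eigvals, eigvecs = np.linalg.eigh(A)    # eigenvalues in increasing order

def E(lam):
    """Spectral projection onto eigenvectors with eigenvalue <= lam."""
    cols = eigvecs[:, eigvals <= lam]
    return cols @ cols.T

# Sum lam * (E_lam - E_{lam^-}) over the spectrum: the "Riemann sum" is exact.
reconstructed = np.zeros((n, n))
prev = np.zeros((n, n))
for lam in eigvals:
    reconstructed += lam * (E(lam) - prev)
    prev = E(lam)
```

The reconstruction reproduces A exactly, and E_λ = I once λ reaches the top of the spectrum.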
The spectral theory described above can be extended to unbounded self-adjoint operators. We associate an unbounded spectral resolution with each of them. We begin with a discussion that details the relation between unbounded self-adjoint operators and the multiplication algebra of a measure space.
Suppose g is a complex measurable function (finite almost everywhere) on a measure space (S, m), without the restriction that it be essentially bounded. Multiplication by g will not, in general, yield an everywhere-defined operator on L²(S), for many of the products will not lie in L²(S). Enough functions f will have product gf in L²(S), however, to form a dense linear submanifold D of L²(S) and constitute a (dense) domain for an (unbounded) multiplication operator M_g. To see this, let E_n be the (bounded) multiplication operator corresponding to the characteristic function of the (measurable) set on which |g| ≤ n. Since g is finite almost everywhere, {E_n} is an increasing sequence of projections with union I. The union D₀ of the ranges of the E_n is a dense linear manifold of L²(S) contained in D. A measure-theoretic argument shows that M_g is closed with D₀ as a core. In fact, if {f_n} is a sequence in D converging in L²(S) to f and {gf_n} converges in L²(S) to h, then, passing to subsequences, we may assume that {f_n} and {gf_n} converge almost everywhere to f and h, respectively. But, then, {gf_n} converges almost everywhere to gf, so that gf and h are equal almost everywhere.
Note that M_gE_n is bounded with norm not exceeding n. One can show that M_g is an (unbounded) self-adjoint operator when g is real-valued. If M_g is unbounded, we cannot expect it to belong to the multiplication algebra A of the measure space (S, m). Nonetheless, there are various ways in which M_g behaves as if it were in A; for example, M_g is unchanged when it is "transformed" by a unitary operator U commuting with A. In this case, U ∈ A, so that U = M_u, where u is a bounded measurable function on S with modulus 1 almost everywhere. With f in D(M_g), guf ∈ L²(S); while, if guh ∈ L²(S), then gh ∈ L²(S) and h ∈ D(M_g). Thus U transforms D(M_g) onto itself. Moreover, M_gUf = guf = UM_gf for each f in D(M_g). Thus U*M_gU = M_g. The fact that M_g "commutes" with all unitary operators commuting with A, in conjunction with the fact that each element of a C*-algebra is a finite linear combination of unitary elements in the algebra and the double commutant theorem (from which it follows that a bounded operator that commutes with all unitary operators commuting with A lies in A), provides us with an indication of the extent to which M_g "belongs" to A. We formalize this property in the definition that follows.
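The truncations E_n can be sketched in a discrete toy model, a finite stand-in for the measure space (this finite S is an assumption for illustration); with g(j) = j, the bound ‖M_gE_n‖ ≤ n is visible directly, and it is attained here:

```python
import numpy as np

# g(j) = j on {1,...,N}: M_g is the diagonal matrix with entries 1,...,N.
N = 50
g = np.arange(1, N + 1, dtype=float)
M_g = np.diag(g)

def E(n):
    """Multiplication by the characteristic function of {j : |g(j)| <= n}."""
    return np.diag((np.abs(g) <= n).astype(float))

n = 7
truncated = M_g @ E(n)                        # M_g E_n: g cut off above n
operator_norm = np.linalg.norm(truncated, 2)  # largest singular value
```

Here the E_n are projections increasing to I (once n reaches N), and ‖M_gE_n‖ = n.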
Definition 4.8. We say that a closed densely defined operator T is affiliated with a von Neumann algebra R, and write T η R, when U*TU = T for each unitary operator U commuting with R. (Note that the equality U*TU = T is to be understood in the strict sense that U*TU and T have the same domain and formal equality holds for the transforms of vectors in that domain. As far as the domains are concerned, the effect is that U transforms D(T) onto itself.)

Remark 4.9. If T is a closed densely defined operator with core D₀ and U*TUx = Tx for each x in D₀ and each unitary operator U commuting with a von Neumann algebra R, then T η R.
Theorem 4.10. If A is a self-adjoint operator acting on a Hilbert space H, then A is affiliated with some abelian von Neumann algebra A. There is a resolution of the identity {E_λ} in A such that, with F_n = E_n − E_{−n}, Ax = ∫_{−n}^{n} λ dE_λ x for each x in F_n(H) and all n, in the sense of norm convergence of approximating Riemann sums.
Proof. Since A is self-adjoint, from Proposition 4.6, A + iI and A − iI have range H and null space (0); in addition, they have inverses, say T_+ and T_−, that are everywhere defined with bound not exceeding 1. Let A be an abelian von Neumann algebra containing I, T_+ and T_−.
In particular, A is affiliated with the abelian von Neumann algebra generated by I, T_+ and T_−. Since A is abelian, A is isomorphic to C(X) with X an extremely disconnected compact Hausdorff space. Let g_+ and g_− be the functions in C(X) corresponding to T_+ and T_−. Let f_+ and f_− be the functions defined as the reciprocals of g_+ and g_−, respectively, at those points where g_+ and g_− do not vanish. Then f_+ and f_− are continuous where they are defined on X, as is the function f defined by f = (f_+ + f_−)/2. In a formal sense, f is the function that corresponds to A. Let X_λ be the largest clopen set on which f takes values not exceeding λ. Let e_λ be the characteristic function of X_λ and E_λ the projection in A corresponding to e_λ. In this case, E_λ ≤ E_λ′ when λ ≤ λ′, and E_λ tends to 0 as λ → −∞ and to I as λ → ∞; that is, we have constructed a resolution of the identity {E_λ}. This resolution is unbounded if f ∉ C(X). Let F_n = E_n − E_{−n}, the spectral projection corresponding to the interval [−n, n], for each positive integer n. AF_n is bounded and self-adjoint. Moreover, ∪_{n=1}^∞ F_n(H) is a core for A. From the spectral theory of bounded self-adjoint operators, Ax = ∫_{−n}^{n} λ dE_λ x, for each x in F_n(H) and all n.
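The finite-dimensional shadow of this construction is the spectral theorem for Hermitian matrices: the spectral projections E_λ form an increasing family, and A is recovered as a finite Riemann-Stieltjes sum against them. A sketch under that analogy (the 6 × 6 random symmetric matrix and the gap tolerance are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 6))
A = (X + X.T) / 2                       # self-adjoint matrix
vals, vecs = np.linalg.eigh(A)          # eigenvalues in ascending order

def E(lam):
    """Spectral projection onto the eigenvalues not exceeding lam."""
    cols = vecs[:, vals <= lam]
    return cols @ cols.T

# The "jump" of the resolution at each eigenvalue is the eigenprojection,
# and A = sum of lam * (jump at lam): a finite version of the integral above.
jumps = [E(v) - E(v - 1e-9) for v in vals]   # assumes eigenvalue gaps > 1e-9
A_rebuilt = sum(v * J for v, J in zip(vals, jumps))
assert np.allclose(A_rebuilt, A)

# E(lambda) increases to the identity as lambda passes the top of the spectrum.
assert np.allclose(E(vals[-1]), np.eye(6))
```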

Polar decomposition
Each T in B(H) has a unique decomposition as V H, the polar decomposition of T, where H = (T*T)^{1/2} and V maps the closure of the range of H, denoted by r(H), isometrically onto r(T) and maps the orthogonal complement of r(H) to 0. We say that V is a partial isometry with initial space r(H) and final space r(T). If R(H) is the projection with range r(H) (the range projection of H), then V*V = R(H) and VV* = R(T). We note that the components V and H of this polar decomposition lie in the von Neumann algebra R when T does. There is an extension of the polar decomposition to the case of a closed densely defined linear operator from one Hilbert space to another.
Theorem 4.11. If T is a closed densely defined linear transformation from one Hilbert space to another, there is a partial isometry V with initial space the closure of the range of (T*T)^{1/2} and final space the closure of the range of T such that T = V(T*T)^{1/2} = (TT*)^{1/2}V. Restricted to the closures of the ranges of T* and T, respectively, T*T and TT* are unitarily equivalent (and V implements this equivalence). If T = W H, where H is a positive operator and W is a partial isometry with initial space the closure of the range of H, then H = (T*T)^{1/2} and W = V. If R is a von Neumann algebra, TηR if and only if V ∈ R and (T*T)^{1/2} ηR.
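For matrices, the polar decomposition of Theorem 4.11 can be computed from the spectral decomposition of T*T; a sketch for a generic (invertible) complex T, in which case V is unitary (the size, seed, and tolerances are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
T = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))  # generically invertible

# H = (T*T)^{1/2} via the spectral theorem for the positive matrix T*T
w, Q = np.linalg.eigh(T.conj().T @ T)
H = Q @ np.diag(np.sqrt(w)) @ Q.conj().T

V = T @ np.linalg.inv(H)        # invertible case: V = T H^{-1} is unitary
assert np.allclose(V @ H, T)
assert np.allclose(V.conj().T @ V, np.eye(5), atol=1e-8)

# T = V (T*T)^{1/2} = (T T*)^{1/2} V : check the second factorization as well
w2, Q2 = np.linalg.eigh(T @ T.conj().T)
K = Q2 @ np.diag(np.sqrt(w2)) @ Q2.conj().T
assert np.allclose(K @ V, T, atol=1e-8)
```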

Representations of the Heisenberg relation
In this section, we study the Heisenberg relation: classes of elements with which it cannot be realized, a classic example in which it can be realized with a bounded and an unbounded operator (the argument drawn from [12]), and special information about extendability to self-adjoint operators. The standard representation, involving a multiplication operator ("position") and differentiation ("momentum"), viewed as the infinitesimal generator of the one-parameter group of translations of the additive group of the reals, appears in Section 5.3. The account is precise and complete with regard to domains and other unbounded-operator considerations.

Bounded operators
Heisenberg's encoding of the ad hoc quantum rules in his commutation relation, QP − PQ = iℏI, where Q and P are the observables corresponding to the position and momentum (say, of a particle in the system), respectively, I is the identity operator and ℏ = h/2π with h Planck's constant, embodies the characteristic indeterminacy and uncertainty of quantum theory. The very essence of the relation is its introduction of non-commutativity between the particle's position Q and its corresponding conjugate momentum P. This is the basis for the view of quantum physics as employing noncommutative mathematics, while classical (Newtonian-Hamiltonian) physics involves just commutative mathematics. If we look for mathematical structures that can accommodate this non-commutativity and permit the necessary computations, families of matrices come quickly to mind. Of course, we, and the early quantum physicists, can hope that the finite matrices will suffice for our computational work in quantum physics. Unhappily, this is not the case, as the trace (functional) on the algebra of complex n × n matrices makes clear to us. The trace of the left side of the Heisenberg relation is 0 for n × n matrices P and Q, while the trace of the right side is iℏn (≠ 0). That is to say, the Heisenberg relation cannot be satisfied by finite matrices. Of course, the natural extension of this attempt is to wonder if infinite-dimensional Hilbert spaces might not "support" such a representation with bounded operators. Even this is not possible, as we shall show. Let us argue informally for the moment. The following argument leads us to the correct formula for the inverse of I − BA, and gives us a proof that holds in any ring with a unit.
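The trace obstruction is easy to confirm numerically (the dimension, the random matrices, and the displayed value of ℏ are illustrative assumptions; only the identity tr(QP) = tr(PQ) matters):

```python
import numpy as np

hbar = 1.0545718e-34   # h/(2*pi); the exact constant is not essential here
n = 4
rng = np.random.default_rng(3)
Q = rng.normal(size=(n, n))
P = rng.normal(size=(n, n))

# tr(QP - PQ) = 0 for ALL n x n matrices, since tr(QP) = tr(PQ) ...
lhs_trace = np.trace(Q @ P - P @ Q)
# ... while tr(i*hbar*I) = i*hbar*n is never 0.
rhs_trace = np.trace(1j * hbar * np.eye(n))

assert abs(lhs_trace) < 1e-12
assert rhs_trace != 0
```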
Thus if I − AB has an inverse, we may hope that B(I − AB)^{−1}A + I is an inverse to I − BA. Multiplying, we have
(I − BA)(B(I − AB)^{−1}A + I) = (B − BAB)(I − AB)^{−1}A + I − BA = B(I − AB)(I − AB)^{−1}A + I − BA = BA + I − BA = I,
and similarly for right multiplication by I − BA. (A. Wintner [26] proved the quantum result for bounded self-adjoint operators on a Hilbert space. H. Wielandt [25] proved it for elements of a Banach algebra by a method different from the one just used.)
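The inverse formula can be checked numerically (random matrices with small entries keep I − AB invertible; all names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = 0.1 * rng.normal(size=(n, n))   # small entries keep I - AB invertible
B = 0.1 * rng.normal(size=(n, n))
I = np.eye(n)

# Candidate inverse of I - BA, built from the inverse of I - AB
C = B @ np.linalg.inv(I - A @ B) @ A + I

assert np.allclose(C @ (I - B @ A), I)
assert np.allclose((I - B @ A) @ C, I)
```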

With unbounded operators
In Section 5.1, we showed that the Heisenberg relation is not representable in terms of elements of complex Banach algebras with a unit element. Therefore, in our search for ways to represent the Heisenberg relation in some (algebraic) mathematical structure, we can eliminate finite matrices, bounded operators on an infinite-dimensional Hilbert space, and even elements of more general complex Banach algebras. Is there anything left? It becomes clear that unbounded operators are essential for dealing with the non-commutativity that the Heisenberg relation carries. The following example gives a specific representation of the relation with one of the representing operators bounded and the other unbounded. Proof. Each element f of H can be approximated (in L² norm) by a continuous function f_1. In turn, f_1 can be approximated (in the uniform norm, hence in the L² norm) by a polynomial f_2. Finally, f_2 can be approximated (in L² norm) by an element f_3 of D_0; indeed, it suffices to take f_3 = gf_2, where g : [0, 1] → [0, 1] is continuously differentiable, vanishes at the endpoints 0 and 1, and takes the value 1 except at points very close to 0 and 1.
The preceding argument shows that D_0 is dense in H, so the differentiation operator D_0 (with domain D_0) is densely defined. When f, g ∈ D_0, the function ḡ has a continuous derivative ḡ′, and integration by parts gives
⟨iD_0 f, g⟩ = i∫_0^1 f′(t)ḡ(t) dt = −i∫_0^1 f(t)ḡ′(t) dt = ⟨f, iD_0 g⟩,
the integrated term vanishing because f and g vanish at 0 and 1. Thus ⟨iD_0 f, g⟩ = ⟨f, iD_0 g⟩ for all f and g in D_0, and iD_0 is symmetric. One can press this example further to show that iD_0 has a self-adjoint extension. We shall show the following: (ii) The equation D_1 Kf_1 = f_1 (f_1 ∈ H_1) defines a closed linear operator D_1 with domain D_1 = K(H_1), and D_1 is the closure of D_0.
Since K is one-to-one, the equation D_1 Kf_1 = f_1 (f_1 ∈ H_1) defines a linear operator D_1 with dense domain D_1 = K(H_1).
If {g_n} is a sequence in D_1 such that g_n → g and D_1 g_n → f, then g_n = Kf_n and D_1 g_n = f_n for some sequence {f_n} in H_1. Since f_n → f, H_1 is closed, and K is bounded, we have f ∈ H_1 and Kf = lim Kf_n = lim g_n = g. Thus g ∈ K(H_1) = D_1, and D_1 g = f; so D_1 is closed.
If g ∈ D_0 (⊆ K(H_1)), then g = Kg′ and g′ ∈ H_1. Thus g ∈ D_1 and D_1 g = g′ = D_0 g; so D_0 ⊆ D_1. To prove that D_1 is contained in the closure of D_0, suppose that g ∈ D_1 and D_1 g = f. Then f ∈ H_1, and Kf = g. There is a sequence {h_n} of continuous functions on [0, 1] such that ‖f − h_n‖ → 0; and ⟨h_n, u⟩ → ⟨f, u⟩ = 0. With f_n defined as h_n − ⟨h_n, u⟩u, f_n is continuous, ⟨f_n, u⟩ = 0, and ‖f − f_n‖ → 0. Let g_n = Kf_n, so that g_n → Kf = g. Since f_n is continuous, g_n has a continuous derivative f_n, and satisfies g_n(0) = 0 and g_n(1) = ⟨f_n, u⟩ = 0. Thus g_n ∈ D_0, g_n → g, and D_0 g_n = f_n → f = D_1 g. This shows that each point (g, D_1 g) in the graph of D_1 is the limit of a sequence {(g_n, D_0 g_n)} in the graph of D_0; so D_1 is contained in the closure of D_0.
(iii) If f ∈ H, a ∈ C and Kf + au = 0, then Kf is the constant function −a; since (Kf)(0) = 0, a = 0, and it follows that f is a null function. So the equation D_2(Kf + au) = f defines a linear operator D_2 with domain D_2 = {Kf + au : f ∈ H, a ∈ C}. In addition, D_1 ⊆ D_2. In particular, D_2 is densely defined. If {g_n} is a sequence in D_2 such that g_n → g and D_2 g_n → f, then g_n = Kf_n + a_n u, where f_n ∈ H and a_n ∈ C, and D_2 g_n = f_n. Thus f_n → f, Kf_n → Kf, a_n u = g_n − Kf_n → g − Kf, and therefore g − Kf = au for some scalar a. Thus g = Kf + au ∈ D_2 and D_2 g = f; and D_2 is closed.
Let D_3 be the restriction of D_2 to the domain D_3 = {Kf_1 + au : f_1 ∈ H_1, a ∈ C}, so that D_3(Kf_1 + au) = f_1. It follows, much as for D_2, that D_3 is closed; and iD_3 is self-adjoint.
First, we note that if f_1 ∈ H_1, f ∈ H and a ∈ C, then (5.1) holds. Suppose that g_1, g_2 ∈ D_3, and let g_j = Kf_j + a_j u, where f_1, f_2 ∈ H_1 and a_1, a_2 ∈ C. Since ⟨f_j, u⟩ = 0, from (5.1) we have that iD_3 is symmetric. Suppose, now, that g ∈ D(D_3*) and h = D_3*g. For any f_1 ∈ H_1 and a ∈ C, Kf_1 + au ∈ D_3, and D_3(Kf_1 + au) = f_1. Thus ⟨f_1, g⟩ = ⟨Kf_1 + au, h⟩. By varying a, it follows that ⟨h, u⟩ = 0; so h ∈ H_1, and ⟨f_1, g⟩ = ⟨Kf_1, h⟩. From (5.1), we now have that g = −Kh + au for some scalar a. Thus g ∈ D_3, and D(D_3*) ⊆ D_3.

The classic representation
Given the discussion and results to this point, what are we to understand by a "representation of the Heisenberg relation", QP − PQ = iℏI? Having proved that this representation cannot be achieved with finite matrices in place of Q, P and I, nor with bounded operators on a Hilbert space, nor with elements Q, P, I of a complex Banach algebra, we begin to examine the possibility that this representation can be effected with unbounded operators for Q and P. It is "rumored", loosely, that Q, which is associated with the physical observable "position" on R, and P, which is associated with the (conjugate) "momentum" observable, will provide such a representation. The observable Q is modeled, nicely, by the self-adjoint operator, multiplication by x on L²(R), with domain those f in L²(R) such that xf is in L²(R). The observable P is modeled by iℏ d/dt, differentiation on some appropriate domain of differentiable functions with derivatives in L²(R). But QP − PQ certainly can't equal iℏI, since its domain is contained in D(Q) ∩ D(P), which is not H. The domain of P must be chosen so that P is self-adjoint, D(QP − PQ) is dense in H, and QP − PQ agrees with iℏI on this dense domain. In particular, QP − PQ ⊆ iℏI. Since iℏI is bounded, it is closed, and QP − PQ is closable with closure iℏI. We cannot insist that, with the chosen domains for Q and P, QP − PQ be skew-adjoint, for then it would be closed, bounded, and densely defined, hence, everywhere defined. In the end, we shall mean by "a representation of the Heisenberg relation QP − PQ = iℏI on the Hilbert space H" a choice of self-adjoint operators Q and P on H such that QP − PQ has closure iℏI.
As mentioned above, the classic way [22] to represent the Heisenberg relation QP − PQ = iℏI with unbounded self-adjoint operators Q and P on a Hilbert space H is to realize H as L²(R), the space of square-integrable, complex-valued functions on R, and Q and P as, respectively, the operator corresponding to multiplication by x, the identity transform on R, and the operator corresponding to iℏ d/dt, where d/dt denotes differentiation, each of Q and P with a suitable domain in L²(R). The domain of Q consists of those f in L²(R) such that xf is in L²(R). The operator d/dt is intended to be differentiation on L²(R), where that differentiation makes sense; certainly, on every differentiable function with derivative in L²(R). However, specifying a dense domain, precisely, including such functions, on which "differentiation" is a self-adjoint operator is not so simple. A step function, a function on R that is constant on each connected component of an open dense subset of R (those components being open intervals), has a derivative almost everywhere (at all but the set of endpoints of the intervals, a countable set), and that derivative is 0. The set of such step functions in L²(R) is dense in L²(R), as is their linear span. To include that linear span in a proposed domain for our differentiation operator condemns any closed operator extending our differentiation operator to be the everywhere-defined operator 0. Of course, that is not what we are aiming for. Another problem that we face in this discussion is that of "mixing" measure theory with differentiation. We speak, loosely, of elements of our Hilbert space L²(R) as "functions". We have learned to work quickly and accurately with the mathematical convenience that this looseness provides us, avoiding such pitfalls as taking the union of "too many" sets of measure 0 in the process. The elements of L²(R) are, in fact, equivalence classes of functions differing from one another on sets of measure 0.
On the other hand, differentiation is a process that focuses on points, each point being a set of Lebesgue measure zero. When we speak of the L 2 -norm of a function in L 2 (R) it doesn't matter which function in the class in question we work with; they all have the same norm. It is not the same with differentiability. Not each function in the class of an everywhere differentiable function is everywhere differentiable. There are functions in such classes that are nowhere differentiable, indeed, nowhere continuous (at each point of differentiability a function is continuous). The measure class of each function on R contains a function that is nowhere continuous. To see this, choose two disjoint, countable, everywhere-dense subsets, for example, the rationals Q in R and Q + √ 2. With f a given function on R, the function g that agrees with f , except on Q where it takes the value 0 and on Q + √ 2 where it takes the value 1 is in the measure class of f and is continuous nowhere (since each non-null open set in R contains a point at which g takes the value 0 and a point at which it takes the value 1). These are some of the problems that arise in dealing with an appropriate domain for d dt . There is an elegant way to approach the problem of finding precisely the self-adjoint operator and its domain that we are seeking. That approach is through the use of "Stone's theorem" [20] (from the very beginning of the theory of unitary representations of infinite groups). We start with a clear statement of the theorem. Particular attention should be paid to the description of the domain of the generator iH in this statement.

Theorem 5.4 (Stone's theorem).
If H is a (possibly unbounded) self-adjoint operator on the Hilbert space H, then t → exp itH is a one-parameter unitary group on H. Conversely, if t → U t is a one-parameter unitary group on H, there is a (possibly unbounded) self-adjoint operator H on H such that U t = exp itH for each real t. The domain of H consists of precisely those vectors x in H for which t −1 (U t x − x) tends to a limit as t tends to 0, in which case this limit is iHx.
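In finite dimensions both directions of Stone's theorem can be checked directly, with exp(itH) computed through the spectral decomposition of H (the matrix, seed, and tolerances are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(4, 4))
H = (X + X.T) / 2                     # self-adjoint generator
vals, vecs = np.linalg.eigh(H)

def U(t):
    """exp(itH), computed through the spectral decomposition of H."""
    return (vecs * np.exp(1j * t * vals)) @ vecs.conj().T

# One-parameter unitary group: U(s+t) = U(s)U(t) and U(t)*U(t) = I.
assert np.allclose(U(0.3 + 0.4), U(0.3) @ U(0.4))
assert np.allclose(U(0.7).conj().T @ U(0.7), np.eye(4))

# The generator is recovered from the difference quotient t^{-1}(U_t x - x) -> iHx.
x = rng.normal(size=4)
t = 1e-6
quotient = (U(t) @ x - x) / t
assert np.allclose(quotient, 1j * H @ x, atol=1e-4)
```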
The relevance of Stone's theorem emerges from the basic case of the one-parameter unitary group t → U_t on L²(R), where (U_t f)(s) = f(s + t). That is, U_t is "translation by t". In this case, U_t = exp itH, with H a self-adjoint operator acting on L²(R). The domain of H consists of those f in L²(R) such that t^{−1}(U_t f − f) tends to a limit g in L²(R) as t tends to 0, in which case iHf = g. We treat d/dt as the infinitesimal generator of this one-parameter unitary group. An easy measure-theoretic argument shows that this one-parameter unitary group is strong-operator continuous on H. That is, U_t f → U_{t′} f, in the norm topology of H, as t → t′, for each f in H; or, what amounts to the same thing, since t → U_t is a one-parameter group, U_t f → f as t → 0, for each f in L²(R). From Stone's theorem, there is a skew-adjoint (unbounded) operator iH, which we denote by d/dt, on H such that U_t = exp(t d/dt) for each real t. The domain of d/dt consists of those f in L²(R) such that t^{−1}(U_t f − f) tends to some g in L²(R) as t tends to 0, in which case g = (d/dt)f. Now, let us make some observations to see how Stone's theorem works in our situation. Our aim, at this point, is to study just which functions are and are not in the domain of d/dt. (This study will make clear how apt the notation d/dt is for the infinitesimal generator of the group of real translations of R.) To begin with, Stone's theorem requires us to study the convergence behavior of t^{−1}(U_t f − f) as t tends to 0. This requirement is to study the convergence behavior in the Hilbert space metric (in the "mean of order 2", in the terminology of classical analysis), but there is no harm in examining how t^{−1}(U_t f − f) varies pointwise with t at points s in R.
For this, note that (t^{−1}(U_t f − f))(s) = t^{−1}(f(s + t) − f(s)), which suggests f′ as the limit of t^{−1}(U_t f − f) when f is differentiable with f′ in L²(R) (and motivates the use of the notation "d/dt" for the infinitesimal generator of t → U_t). However, the "instructions" of Stone's theorem tell us to find g in L²(R) such that ‖t^{−1}(U_t f − f) − g‖₂ → 0 as t → 0. Our first observation is that if f fails to have a derivative at some point s_0 in R in an essential way, then f is not in the domain of d/dt. This may be surprising, at first, for the behavior of a function at a point rarely has (Lebesgue) measure-theoretic consequences. In the present circumstances, we shall see that the "local" nature of differentiation can result in exclusion from the domain of an unbounded differentiation operator because of non-differentiability at a single point.
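The blow-up responsible for this exclusion is visible in a discretization: for a function with a unit jump at 0, ‖t^{−1}(U_t f − f)‖₂ grows like (2/t)^{1/2} as t → 0+. A sketch (the grid spacing and the particular step function are illustrative assumptions):

```python
import numpy as np

h = 1e-4                                   # grid spacing on [-2, 2]
s = np.arange(-2.0, 2.0, h)
f = ((s > 0) & (s <= 1)).astype(float)     # unit jump at 0 (and at 1)

def quotient_norm(k):
    """L2 norm of t^{-1}(U_t f - f) with t = k*h (translate by k grid points)."""
    t = k * h
    Ut_f = np.roll(f, -k)                  # (U_t f)(s) = f(s + t) on the grid
    Ut_f[-k:] = 0.0                        # np.roll wraps around; zero the tail
    return np.sqrt(h * np.sum((Ut_f - f) ** 2)) / t

# The difference quotient has no L2 limit: its norm blows up as t -> 0.
assert quotient_norm(100) < quotient_norm(10) < quotient_norm(1)
```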
We begin with a definition of "jump in a function" that is suitable for our measure-theoretic situation.
Definition 5.5. We say that f has jump a (≥0) for width δ (>0) at s_0 in R when the infimum of {f(s)}, with s in one of the intervals [s_0 − δ, s_0) or (s_0, s_0 + δ], is a plus the supremum of {f(s)}, with s in the other of those intervals.
Typically, one speaks of a "jump discontinuity" when the one-sided limits lim_{s→s_0−} f(s) and lim_{s→s_0+} f(s) exist and are distinct. In the strictly measure-theoretic situation with which we are concerned, the concept of "jump", as just defined, seems more appropriate.
Remark 5.6. If f has a jump a for width δ at some point s_0 in R, then U_{s_0} f has a jump a for width δ at 0, and bU_{s_0} f has jump ba for width δ at 0 when 0 < b. Letting f_r be the function whose value at s is f(rs), one has that f_r has a jump a at r^{−1}s_0 for width r^{−1}δ. Thus a^{−1}(U_{s_0} f)_δ has jump 1 at 0 for width 1.
Proof. We shall show that ‖t^{−1}(U_t f − f)‖₂ is unbounded as t → 0+ when f has a jump at some point. Noting that ‖g_r‖ = r^{−1/2}‖g‖ for g in L²(R), that (g + h)_r = g_r + h_r, and that (U_t f)_r = U_{t/r}(f_r), one sees that t^{−1}(U_t f − f) is unbounded in norm for t near 0 precisely when the corresponding expression for f_r is. This holds for each positive r, in particular, when r is δ, where f has jump 1 at 0 for width δ. Since f_δ has jump 1 at 0 for width 1 (= δ^{−1}δ), from Remark 5.6, it will suffice to show that t^{−1}(U_t f − f) is unbounded for t near 0 when f has jump 1 at 0 for width 1. We shall do this by finding a sequence t_2, t_3, . . . of positive numbers t_j tending to 0 such that ‖t_n^{−1}(U_{t_n} f − f)‖ → ∞. We assume that f has jump 1 at 0 for width 1. In this case, |f(s′) − f(s″)| ≥ 1 when s′ ∈ [−1, 0) and s″ ∈ (0, 1]. Thus, when t_n = 1/(n − 1), |f(s + t_n) − f(s)| ≥ 1 for each s in [−t_n, 0), so that ‖t_n^{−1}(U_{t_n} f − f)‖² ≥ t_n^{−2}·t_n = n − 1, which tends to ∞ with n.
Theorem 5.8 (cf. [14, Theorem 4.7]). If f_1 is a continuously differentiable function on R such that f_1 and f_1′ are in L²(R), then f_1 ∈ D(d/dt), and (d/dt)(f_1) = f_1′.
Proof. We prove, first, that if f, in L²(R), vanishes outside some interval [−n, n], with n a positive integer, and f is continuously differentiable on R with derivative f′ in L²(R), then t^{−1}(U_t f − f) → f′ in L²(R) as t → 0. The desired convergence of t^{−1}(U_t f − f) to f′ in L²(R) follows from this. With f_1 as in the statement of this theorem, suppose that we can find f as in the preceding discussion (that is, vanishing outside a finite interval) such that ‖f_1 − f‖₂ and ‖f_1′ − f′‖₂ are less than a preassigned positive ε. Then (f_1, f_1′) is in the closure of the graph of d/dt, since each (f, f′) is in that closure from what we have proved. But d/dt is skew-adjoint (from Stone's theorem); hence, d/dt is closed. Thus, if we can effect the described approximation of f_1 and f_1′ by f and f′, it will follow that f_1 ∈ D(d/dt) and (d/dt)(f_1) = f_1′. Since f_1 and f_1′ are continuous and in L²(R), the same is true for |f_1| + |f_1^−| + |f_1′| + |(f_1′)^−|, where g^−(s) = g(−s) for each s in R and each complex-valued function g on R. (Note, for this, that s → −s is a Lebesgue-measure-preserving homeomorphism of R onto R.) It follows that, for each positive integer n, there is a real s_n such that n < s_n and (|f_1| + |f_1^−| + |f_1′| + |(f_1′)^−|)(s_n) < 1/n. We can choose s_n such that s_{n−1} < s_n. Since n < s_n, we have that s_n → ∞ as n → ∞. For a function h on R, h^{(n)} is the function that agrees with h on [−s_n, s_n] and is 0 outside this interval. With ε (<1) positive, there is an n_0 such that, if n > n_0, then each of ‖f_1 − f_1^{(n)}‖₂² and ‖f_1′ − (f_1′)^{(n)}‖₂² is less than ε². At the same time, we may choose n_0 large enough so that 1/n < ε/4 when n > n_0. For such an n, a "suitably modified" f_1^{(n)} will serve as the desired f for our approximation. In the paragraphs that follow, we describe that modification.
We complete the definition of h by adjoining to the graph of h over [0, x_0] the graph of (1/2)y_0[cos((1/2 − x_0)^{−1}π(x − x_0)) + 1] over [x_0, 1/2]. Note that this graph passes through (x_0, y_0) and (1/2, 0). Finally, we define h(x) to be 0 when x ∈ [1/2, 1]. As constructed, h is continuously differentiable on [0, 1]. Since |h(x)| ≤ 2|h(0)| < ε/2 for x in [0, 1/2] and h vanishes on [1/2, 1], we have the required estimates. We may ask whether the converse statement to the preceding theorem holds as well. Does a function class in D(d/dt) necessarily contain a continuously differentiable function with derivative in L²(R)? As it turns out, there are more functions, not as well behaved as continuously differentiable functions, in the domain of d/dt. We shall give a complete description of that domain in Theorem 5.11.
Our notation and terminology have a somewhat "schizophrenic" character to them, much in the style of the way mathematics treats certain topics. In the present instance, we use the notation 'L²(R)' to denote both the collection (linear space) of measurable functions f such that |f|² is Lebesgue integrable on R and the Hilbert space of (measure-theoretic) equivalence classes of such functions equipped with the usual Hilbert space structure associated with L² spaces. In most circumstances, there is no danger of serious confusion or misinterpretation. In our present discussion of the domain of d/dt, these dangers loom large. We noted, earlier in this section, that each measure-theoretic equivalence class of functions contains a function that is continuous at no point of R. It can make no sense to attempt to characterize special elements x of L²(R) by the "smoothness" properties of all the functions in the equivalence class denoted by 'x' (their continuity, differentiability, and so forth). Despite this, our next theorem describes the domain given to us by the generator, which we are denoting by 'd/dt', of the one-parameter unitary group t → U_t of translations of the equivalence classes of functions in L²(R) (to other such classes) in terms of smoothness properties. However, these smoothness properties will be those of a single element in the class, as we shall see. We note, first, that if an equivalence class contains a continuous function f on R, then f is the unique such function in the class. This is immediate from the fact that f − g vanishes nowhere on some non-null open interval when f and g are distinct continuous functions, whence f and g differ on a set of positive Lebesgue measure and lie in different measure classes.
The unique continuous function in each measure class of some family of measure classes allows us to distinguish subsets of this family by smoothness properties of that continuous function in the class. In the case of the one-parameter unitary group induced by translations on R, corresponding to an element x in the domain of the Stone generator d/dt, the measure class x contains a continuous function (hence, as noted, a unique such function), and this function must be absolutely continuous, in L²(R), of course, with derivative almost everywhere on R in L²(R). Moreover, an absolutely continuous function in L²(R) with derivative almost everywhere in L²(R) has measure class an element of the Hilbert space on which the unitary group (corresponding to the translations of R) acts that lies in the domain of d/dt. So, this absolute-continuity smoothness, together with the noted L² restrictions, characterizes the domain of d/dt. It is dangerously misleading to speak of the domain of d/dt as "consisting of absolutely continuous functions in L² with almost-everywhere derivatives in L²"; it consists of the measure classes of such functions, and each such class contains, as noted, a function which is nowhere continuous.
We undertake, now, the proof of the theorem that describes the domain of d/dt, the generator of t → U_t, the one-parameter unitary group corresponding to translations of L²(R) (= H). (Compare [14, Theorem 4.8], where a sketch of the proof is given. See, also, [6].) The following results in real analysis will be useful to us [4, 19]. Lemma 5.9. Suppose that f ∈ L¹(R). Let F(x) = ∫₀ˣ f(s) ds. Then F is differentiable almost everywhere, and its derivative is equal to f almost everywhere. Theorem 5.11. The domain of d/dt is the linear subspace of measure classes in H (= L²(R)) corresponding to absolutely continuous functions on R whose almost-everywhere derivatives lie in L²(R).
Proof. Suppose x ∈ D(d/dt). Then, from Stone's theorem, there is a vector y in H such that, with f in the measure class x and g in the class y, t^{−1}(U_t f − f) → g in L² norm as t → 0. Suppose, now, that x in H (= L²(R)) contains an absolutely continuous function f with almost-everywhere derivative g in L²(R). Let y be the measure class of g. With this notation, (t^{−1}(U_t f − f))(s) = t^{−1}∫ₛ^{s+t} g(r) dr (= g_t(s)), which tends to g in L² norm as |t| → 0+ (Lemma 5.10).
We now describe a core for d/dt that is particularly useful for computations. Theorem 5.12 (cf. [14, Theorem 4.9]). The family D_0 of functions in L²(R) that vanish outside a finite interval and are continuously differentiable with derivatives in L²(R) determines a core for the generator d/dt of the one-parameter translation unitary group on L²(R).
Proof. Suppose f is the (unique) continuous function in a measure class {f} in D(d/dt). Suppose, moreover, that f is continuously differentiable with derivative f′ in L²(R). For any ε > 0, there is a positive integer N (N ≥ 1) such that the restriction g_N of f to [−N, N] provides the needed approximation. Using the technique in the proof of Theorem 5.8, we extend g_N to R from [−N, N] so that the extension g remains continuously differentiable, with g and g′ vanishing outside some finite interval and ({g}, {g′}) close to ({f}, {f′}). Finally, since ({f}, {f′}) can be approximated as closely as we wish by ({g}, {g′}) with g ∈ D_0, it follows that D_0 is a core for d/dt.

R.V. Kadison and Z. Liu
In the classic representation of the Heisenberg relation, QP − PQ = iℏI, the operator Q corresponds to multiplication by x, the identity transform on R. The domain of Q consists of measure classes of functions f in L²(R) such that xf is in L²(R). Elementary measure-theoretic considerations establish that D_0 is also a core for Q. Moreover, D_0 ⊆ D(QP) ∩ D(PQ); that is, D_0 is contained in the domain of QP − PQ. A calculation, similar to the one at the end of Example 5.2, shows that QP − PQ agrees with iℏI on D_0. Moreover, for any {f} ∈ D (= D(QP − PQ), the domain of QP − PQ), with f the unique continuous function in the measure class {f}, ((QP − PQ)f)(t) = iℏf(t) for all t at which f is differentiable. Thus QP − PQ has closure iℏI. As noted, the family of continuously differentiable functions on R vanishing outside finite intervals constitutes a very useful core for d/dt for computing purposes. It may be made even more useful, for these purposes, by introducing a class of polynomials associated with an f in this core, the Bernstein polynomials of f: with f defined on [0, 1], B_n(f)(x) = Σ_{k=0}^n (n choose k) f(k/n) xᵏ(1 − x)^{n−k} is the nth Bernstein polynomial for f.
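The nth Bernstein polynomial is easy to compute, and both the moment identity B_n(x²) = x² + x(1 − x)/n and the uniform convergence B_n(f) → f for continuous f can be checked directly (the test points and the choice f = cos are illustrative assumptions):

```python
import numpy as np
from math import comb

def bernstein(f, n, x):
    """B_n(f)(x) = sum_k C(n,k) f(k/n) x^k (1-x)^(n-k)."""
    return sum(comb(n, k) * f(k / n) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

# Moment identity B_n(x^2) = x^2 + x(1-x)/n, exact for every n and x.
x, n = 0.3, 10
assert abs(bernstein(lambda t: t * t, n, x) - (x * x + x * (1 - x) / n)) < 1e-12

# Uniform convergence for a smooth f: the sup-norm error shrinks as n grows.
f = np.cos
grid = np.linspace(0.0, 1.0, 101)
err = lambda m: max(abs(bernstein(f, m, t) - f(t)) for t in grid)
assert err(80) < err(10)
```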
The following identities will be useful to us in the proof of Theorem 5.14.

B_n(1)(x) = Σ_{k=0}^n (n choose k) xᵏ(1 − x)^{n−k} = 1, B_n(x)(x) = Σ_{k=0}^n (k/n)(n choose k) xᵏ(1 − x)^{n−k} = x, and B_n(x²)(x) = Σ_{k=0}^n (k/n)²(n choose k) xᵏ(1 − x)^{n−k} = x² + x(1 − x)/n; together, these give Σ_{k=0}^n (k/n − x)²(n choose k) xᵏ(1 − x)^{n−k} = x(1 − x)/n.
Proof. From B_n(1)(x) = 1, it follows that, for each x in [0, 1], |B_n(f)(x) − f(x)| ≤ Σ_{k=0}^n |f(k/n) − f(x)|(n choose k) xᵏ(1 − x)^{n−k}. To estimate this last sum, we separate the terms into two sums Σ′ and Σ″, those where |k/n − x| is less than a given positive δ and the remaining terms, those for which δ ≤ |k/n − x|. Suppose that x is a point of continuity of f. Then for any ε > 0, there is a positive δ such that |f(x′) − f(x)| < ε/2 when |x′ − x| < δ. For the first sum, Σ′ < ε/2. For the remaining terms, we have δ² ≤ |k/n − x|², so that, with M the bound sup{|f(x)| : x ∈ [0, 1]}, Σ″ ≤ (2M/δ²)Σ_{k=0}^n (k/n − x)²(n choose k) xᵏ(1 − x)^{n−k} = 2Mx(1 − x)/(δ²n) ≤ 2M/(δ²n). For this δ, we can choose n_0 large enough so that, when n ≥ n_0, 2M/(δ²n) < ε/2. For such an n and the given x, |B_n(f)(x) − f(x)| < ε. Hence B_n(f)(x) → f(x) as n → ∞ for each point x of continuity of the function f. If f is continuous at each point of [0, 1], then it is uniformly continuous on [0, 1], and for this given ε, we can choose δ so that |f(x′) − f(x)| < ε/2 for each pair of points x′ and x in [0, 1] such that |x′ − x| < δ. From the preceding argument, with n_0 chosen for this δ and n ≥ n_0, |B_n(f)(x) − f(x)| < ε for every x in [0, 1]. Differentiating, B_n(f)′(x) = n Σ_{k=0}^n (k/n − x)(n choose k) xᵏ⁻¹(1 − x)^{n−k−1} f(k/n). (Note that (k/n − x)xᵏ⁻¹ is to be read as −1 when k = 0 and (k/n − x)(1 − x)^{n−k−1} as 1 when k = n.) Also, n Σ_{k=0}^n (k/n − x)(n choose k) xᵏ⁻¹(1 − x)^{n−k−1} f(x) = 0. Thus B_n(f)′(x) = n Σ_{k=0}^n (k/n − x)[f(k/n) − f(x)](n choose k) xᵏ⁻¹(1 − x)^{n−k−1}. Suppose that x is a point of differentiability of f. Let a positive ε be given. We write f(k/n) − f(x) = (k/n − x)(f′(x) + ξ_k). From the assumption of differentiability of f at x, there is a positive δ such that, when 0 < |x′ − x| < δ, |(f(x′) − f(x))/(x′ − x) − f′(x)| < ε/2. Thus, when 0 < |k/n − x| < δ, |ξ_k| < ε/2. If k/n happens to be x for some k, we define ξ_k to be 0 for that k and note that the inequality just stated, when |k/n − x| > 0, remains valid when k/n = x. It follows that B_n(f)′(x) − f′(x) = n Σ_{k=0}^n ξ_k (k/n − x)²(n choose k) xᵏ⁻¹(1 − x)^{n−k−1}. For the last equality we made use of (5.6), the identity n Σ_{k=0}^n (k/n − x)²(n choose k) xᵏ⁻¹(1 − x)^{n−k−1} = 1. We estimate this last sum by separating it, again, into the two sums Σ′ and Σ″, those with the k for which |k/n − x| < δ and those for which δ ≤ |k/n − x|, respectively. For the first sum, we have Σ′ < ε/2, from (5.6) and the choice of δ (that is, the differentiability of f at x).
For the second sum, we have that δ ≤ |k/n − x|, and the factor (k/n − x)², together with (5.6), again controls the sum: Σ″ tends to 0 as n → ∞. For this δ, we can choose n_0 large enough so that, when n ≥ n_0, Σ″ < ε/2. For such n and the given x, |B_n(f)′(x) − f′(x)| < ε. Hence B_n(f)′(x) → f′(x) as n → ∞ for each point x of differentiability of the function f. We show, now, that if f is continuously differentiable on [0, 1], then the sequence {B_n(f)′} tends to f′ uniformly. We intercept the proof for pointwise convergence at each point of differentiability of f at the formula B_n(f)′(x) = n Σ_{k=0}^n (k/n − x)[f(k/n) − f(x)](n choose k) xᵏ⁻¹(1 − x)^{n−k−1}. Assuming that f is everywhere differentiable on [0, 1] and f′ is continuous on [0, 1], let M′ be sup{|f′(x)| : x ∈ [0, 1]}. Choose δ positive and such that |f′(x′) − f′(x)| < ε/2 when |x′ − x| < δ. Now, for any given x in [0, 1], recall that we had defined ξ_k by f(k/n) − f(x) = (k/n − x)(f′(x) + ξ_k). From the differentiability of f on [0, 1], the mean value theorem applies, and f(k/n) − f(x) = (k/n − x)f′(x_k), where x_k is in the open interval with endpoints k/n and x, when k/n ≠ x. In case k/n = x, we may choose f′(x_k) as we wish, and we choose x as x_k. With these choices, ξ_k = f′(x_k) − f′(x). In this case, we estimate the sum in the right-hand side of this equality by separating it into the two parts Σ′ and Σ″ exactly as we did before (for approximation of the derivative at the single point x of differentiability), except that in this case |ξ_k| is replaced by |f′(x_k) − f′(x)| and δ has been chosen by means of the uniform continuity of f′ on [0, 1] such that |f′(x_k) − f′(x)| < ε/2 when |x_k − x| < δ, as is the case when |k/n − x| < δ. For the first sum, Σ′, the sum over those k such that |k/n − x| < δ, we have Σ′ < ε/2. For the second sum, Σ″, the sum over those k such that δ ≤ |k/n − x|, again, we have δ² ≤ |k/n − x|². This time, |f′(x_k) − f′(x)| ≤ 2M′ (and we really don't care that x_k may be very close to x as long as |k/n − x| ≥ δ in this part of the estimate). Again, for this δ, we can choose n_0 large enough so that, when n > n_0, Σ″ < ε/2 for each x in [0, 1]. Thus ‖B_n(f)′ − f′‖ ≤ ε, and {B_n(f)′} tends to f′ uniformly.

Murray-von Neumann algebras

Finite von Neumann algebras
Let H be a Hilbert space. Two projections E and F are said to be orthogonal if EF = 0. If the range of F is contained in the range of E (equivalently, EF = F), we say that F is a subprojection of E and write F ≤ E. Let R be a von Neumann algebra acting on H. We say that a nonzero projection E in R is a minimal projection in R if every nonzero subprojection F of E in R satisfies F = E. Murray and von Neumann conceived the idea of comparing the "sizes" of projections in a von Neumann algebra in the following way: E and F are said to be equivalent (modulo, or relative to, R), written E ∼ F, when V*V = E and VV* = F for some V in R. (Such an operator V is called a partial isometry with initial projection E and final projection F.) We write E ⪯ F when E ∼ F₀ for some subprojection F₀ ≤ F, and E ≺ F when E is, in addition, not equivalent to F. It is apparent that ∼ is an equivalence relation on the projections in R. In addition, ⪯ induces a partial ordering of the equivalence classes of projections in R, and it is a non-trivial and crucially important fact that this partial ordering is a total ordering when R is a factor. (Factors are von Neumann algebras whose centers consist of scalar multiples of the identity operator.) Murray and von Neumann also define infinite and finite projections in this framework, modeled on the set-theoretic approach: the projection E in R is infinite (relative to R) when E ∼ F ≤ E for some projection F in R with F ≠ E, and finite otherwise. We say that the von Neumann algebra R is finite when the identity operator I is finite.
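The equivalence E ∼ F can be made concrete in a matrix toy model. The sketch below (my own illustration, not from the text, with numpy as an assumed dependency) exhibits an explicit partial isometry V with V*V = E and VV* = F for two rank-2 projections in M₄(C).

```python
# Illustrative sketch (matrices standing in for a von Neumann algebra):
# a partial isometry V witnessing the Murray-von Neumann equivalence
# E ~ F, i.e. V*V = E and VV* = F.
import numpy as np

E = np.diag([1.0, 1.0, 0.0, 0.0])  # projection onto the first two coordinates
F = np.diag([0.0, 0.0, 1.0, 1.0])  # projection onto the last two coordinates

V = np.zeros((4, 4))
V[2, 0] = V[3, 1] = 1.0  # maps range(E) isometrically onto range(F)

print(np.allclose(V.conj().T @ V, E))      # True: initial projection is E
print(np.allclose(V @ V.conj().T, F))      # True: final projection is F
print(np.allclose(V @ V.conj().T @ V, V))  # True: V is a partial isometry
```

The same recipe works for any two projections of equal rank: send an orthonormal basis of one range to an orthonormal basis of the other and annihilate the orthogonal complement.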
Proposition 6.1. Suppose that E and F are projections in a finite von Neumann algebra R. If E ∼ F, then I − E ∼ I − F.
Proof. Suppose I − E and I − F are not equivalent. Then there is a central projection P such that either P(I − E) ≺ P(I − F) or P(I − F) ≺ P(I − E). Suppose P(I − E) ∼ G < P(I − F). Then, since PE ∼ PF, P = P(I − E) + PE ∼ G + PF < P(I − F) + PF = P, so P is equivalent to a proper subprojection of itself, contrary to the assumption that R is finite. The symmetric argument applies if P(I − F) ≺ P(I − E). Thus I − E ∼ I − F.

Proposition 6.2. For any projections E and F in a finite von Neumann algebra R,

    ∆(E ∨ F) + ∆(E ∧ F) = ∆(E) + ∆(F),

where ∆ is the center-valued dimension function on R.
Proposition 6.3. Let {E_a} and {F_a} be increasing nets of projections in a finite von Neumann algebra R with strong-operator limits E and F, respectively, and let G be a projection in R. Then, in the strong-operator topology, (i) {E_a ∨ G} converges to E ∨ G, (ii) {E_a ∧ G} converges to E ∧ G, and (iii) {E_a ∧ F_a} converges to E ∧ F.

Proof. (i) Since the net {E_a ∨ G} is increasing and bounded above by E ∨ G, it converges to a projection P in R, and P ≤ E ∨ G. At the same time, E_a ≤ P for each a, so that E ≤ P; also G ≤ P. Thus E ∨ G ≤ P, and P = E ∨ G.

(ii) Since the net {E_a ∧ G} is increasing and bounded above by E ∧ G, it converges to a projection P in R, and P ≤ E ∧ G. Recall that the center-valued dimension function ∆ on R is weak-operator continuous on the set of all projections in R; together with Proposition 6.2 and (i), this gives ∆(E_a ∧ G) = ∆(E_a) + ∆(G) − ∆(E_a ∨ G) → ∆(E) + ∆(G) − ∆(E ∨ G) = ∆(E ∧ G), while at the same time ∆(E_a ∧ G) → ∆(P). Since E ∧ G − P is a projection in R and ∆(E ∧ G − P) = 0, it follows that P = E ∧ G.
(iii) The net {E_a ∧ F_a} is increasing and therefore has a projection P as its strong-operator limit and least upper bound. Since E_a ∧ F_a ≤ E ∧ F for each a, P ≤ E ∧ F. With a′ fixed, the net {E_a ∧ F_{a′}}_a has strong-operator limit E ∧ F_{a′} from (ii). Since E_a ∧ F_{a′} ≤ E_a ∧ F_a when a′ ≤ a, E ∧ F_{a′} ≤ P for each a′. Again, from (ii), {E ∧ F_{a′}} has E ∧ F as its strong-operator limit. Thus E ∧ F ≤ P. Hence P = E ∧ F.

Proposition 6.4. If E and F are projections in a finite von Neumann algebra R and I − F ⪯ I − E, then E ⪯ F.

Proof. If E ⪯ F fails, then there is a central projection P in R such that PF ≺ PE. At the same time, P(I − F) ⪯ P(I − E), so that P(I − F) ∼ E₀ ≤ P(I − E) for some projection E₀. Thus P = PF + P(I − F) ≺ PE + E₀ ≤ PE + P(I − E) = P. This is contrary to the assumption that R is finite. It follows that E ⪯ F.
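In M_n(C), with rank playing the role of the dimension function ∆, the identity of Proposition 6.2 becomes rank(E ∨ F) + rank(E ∧ F) = rank(E) + rank(F). The sketch below (my own illustration; numpy is an assumed dependency) checks this for random projections, using the fact that the range of E ∨ F corresponds to the nonzero eigenvalues of E + F and the range of E ∧ F to its eigenvalue-2 eigenspace.

```python
# Sketch: verify rank(E v F) + rank(E ^ F) = rank(E) + rank(F) for random
# orthogonal projections on R^6, locating join and meet via eigenvalues
# of E + F (nonzero eigenvalues span range E + range F; eigenvalue 2
# occurs exactly on range E intersect range F).
import numpy as np

rng = np.random.default_rng(0)

def random_projection(n, k):
    """Orthogonal projection onto a random k-dimensional subspace of R^n."""
    q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    return q @ q.T

E = random_projection(6, 3)
F = random_projection(6, 2)

eigs = np.linalg.eigvalsh(E + F)
rank_join = int(np.sum(eigs > 1e-10))              # rank(E v F)
rank_meet = int(np.sum(np.abs(eigs - 2) < 1e-10))  # rank(E ^ F)
rank_sum = int(np.rint(np.trace(E) + np.trace(F))) # rank(E) + rank(F)

print(rank_join + rank_meet == rank_sum)  # True
```

For two generic subspaces of dimensions 3 and 2 in R^6, the meet is trivial and the join has dimension 5, so both sides equal 5.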

The algebra of affiliated operators
Recall (Definition 4.8) that a closed densely defined operator T is affiliated with a von Neumann algebra R acting on a Hilbert space H when TU = UT for each unitary operator U in R′ (the commutant of R).

Proposition 6.5. If T is affiliated with a von Neumann algebra R, then R(T*) = R((T*T)^{1/2}), and the range projections R(T) and R(T*) are equivalent in R.

Proof. From Theorem 4.11, T = V(T*T)^{1/2}, where V is a partial isometry in R with initial projection R((T*T)^{1/2}) and final projection R(T). Since R(T*) = R((T*T)^{1/2}), the projections R(T) and R(T*) are equivalent in R.
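A quick finite-dimensional check (my own illustration, not from the text; numpy assumed): for a matrix T, the ranges of T and T* have the same dimension, and the partial isometry of the polar decomposition T = V(T*T)^{1/2} carries the range projection of T* onto that of T.

```python
# Sketch: rank T = rank T*, so the range projections of T and T* are
# equivalent in M_n(C); the polar decomposition supplies the partial
# isometry V (built here from the SVD).
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((4, 4))
T[:, 3] = T[:, 0] + T[:, 1]  # force a nontrivial kernel, so rank T = 3

u, s, vt = np.linalg.svd(T)
r = int(np.sum(s > 1e-10))  # common rank of T and T*
V = u[:, :r] @ vt[:r, :]    # partial isometry of the polar decomposition

P_range_T = V @ V.T         # final projection: onto range(T)
P_range_T_adj = V.T @ V     # initial projection: onto range(T*)

print(r, np.linalg.matrix_rank(T.T))      # equal ranks
print(np.allclose(P_range_T @ T, T))      # True: P projects onto range(T)
print(np.allclose(T @ P_range_T_adj, T))  # True: T vanishes on ker(T) = range(T*)^perp
```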
Throughout the rest of this section, R denotes a finite von Neumann algebra acting on a Hilbert space H, and A_f(R) denotes the family of operators affiliated with R. We shall show that A_f(R) is a * algebra (cf. [8, 16]). The hypothesis that R is finite is crucial for the results that follow.

Proposition 6.6. If S is a symmetric operator affiliated with R, then S is self-adjoint.
Proof. Since S ∈ A_f(R), S + iI ∈ A_f(R). As ‖(S + iI)x‖² = ‖Sx‖² + ‖x‖² for each x in D(S), the operator S + iI has null space {0} and closed range, so that, from Proposition 6.5, R(S + iI) ∼ R((S + iI)*) = I. Since R is finite, it follows that R(S + iI) = I; similarly, R(S − iI) = I. A closed symmetric operator for which the ranges of S + iI and S − iI are all of H is self-adjoint.

Thus V*A is symmetric. In fact, V*A is affiliated with R. To see this, note first that V*A is densely defined, since D(V*A) = D(A). Now, suppose {x_n} is a sequence of vectors in D(V*A) such that x_n → x and V*Ax_n → y. As V* is isometric on the range of A, ‖Ax_n − Ax_m‖ = ‖V*Ax_n − V*Ax_m‖, so that {Ax_n} converges as well.

Moreover, BCx = BCJ_n x = B_n x. Thus x ∈ D(A(B·C)). It follows that A·(B·C) (= (A·B)·C) is densely defined. Now, we show that the closure A·(B·C) is affiliated with R, which completes the proof. If U is a unitary operator in R′ and x ∈ D (= D(A(B·C))), then, since A, B, and C are affiliated with R, we have A(B·C)Ux = AU(B·C)x = UA(B·C)x. From Remark 4.9, A·(B·C) is affiliated with R, since D is a core for A·(B·C).

Hence, (A + B)C and CA + CB are preclosed. We shall show that (A + B)C and CA + CB are densely defined and that their closures are affiliated with R. Then, again, using Proposition 6.7, we obtain (A + B)·C = (A·C) + (B·C) and C·(A + B) = (C·A) + (C·B). We define V₁H₁, V₂H₂, V₃H₃ and E_n, F_n, G_n as in the proof of Proposition 6.10. By the choice of G_n, the operator C_n = CG_n = V₃H₃G_n is bounded and everywhere defined. Let J_n be the projection whose range is G_n(H) ∩ {x : C_n x ∈ (E_n ∧ F_n)(H)}. Then ⋃_{n=1}^∞ J_n(H) is dense in H, since {J_n} is an increasing sequence with strong-operator limit I. If x ∈ J_n(H), then C_n x ∈ (E_n ∧ F_n)(H), so that C_n x ∈ D(A + B). At the same time, x ∈ G_n(H), so that x ∈ D(H₃) = D(C) and Cx = CG_n x = C_n x. Thus x ∈ D((A + B)C). It follows that (A + B)C is densely defined.
Let A_n = AE_n and B_n = BF_n. Then A_n and B_n are bounded, everywhere-defined operators in R. Let K_n be the projection whose range is E_n(H) ∩ {x : A_n x ∈ G_n(H)} ∩ F_n(H) ∩ {x : B_n x ∈ G_n(H)}.
Again, {K_n} is an increasing sequence with strong-operator limit I, so that ⋃_{n=1}^∞ K_n(H) is dense in H. If x ∈ K_n(H), then A_n x ∈ G_n(H) and B_n x ∈ G_n(H), so that A_n x ∈ D(C) and B_n x ∈ D(C). At the same time, x ∈ E_n(H) and x ∈ F_n(H), so that x ∈ D(A), x ∈ D(B), and Ax = AE_n x = A_n x, Bx = BF_n x = B_n x. Thus x ∈ D(CA + CB). It follows that CA + CB is densely defined.
Theorem 6.13. The family A_f(R) is a * algebra (with unit I) when provided with the operations + (addition) and · (multiplication).
We call A_f(R), the * algebra of operators affiliated with a finite von Neumann algebra R, the Murray-von Neumann algebra associated with R.

The Heisenberg-von Neumann puzzle
The Heisenberg-von Neumann puzzle asks whether there is a representation of the Heisenberg commutation relation in terms of unbounded operators affiliated with a factor of Type II₁.
Recall that factors are von Neumann algebras whose centers consist of scalar multiples of the identity operator I. A von Neumann algebra is said to be finite when the identity operator I is finite. Factors without minimal projections in which I is finite are said to be of "Type II₁". So, factors of Type II₁ are finite von Neumann algebras. As noted in Section 6, the operators affiliated with a finite von Neumann algebra R have special properties, and they form an algebra A_f(R) (the Murray-von Neumann algebra associated with R). Von Neumann had great respect for his physicist colleagues and the uncanny accuracy of their results in experiments at the subatomic level. In effect, the physicists worked with unbounded operators, but in a loose way; taken at face value, many of their mathematical assertions were demonstrably incorrect.
When the algebra A_f(M), with M a factor of Type II₁, appeared, von Neumann hoped that it would provide a framework for the formal computations the physicists made with the unbounded operators. As it turned out, in more advanced areas of modern physics, factors of Type II₁ do not suffice, by themselves, for the mathematical framework needed. It remains a tantalizing question, nonetheless, whether the most fundamental relation of quantum mechanics, the Heisenberg relation, can be realized with self-adjoint operators in some A_f(M).
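The obstruction for bounded operators is the classical trace argument, which a small numerical sketch (my own illustration; numpy is an assumed dependency) makes concrete: a commutator of matrices always has trace zero, whereas QP − PQ = iℏI would force the trace to be iℏn ≠ 0. In a Type II₁ factor the normalized trace raises the same difficulty for bounded operators, which is part of what makes the question about unbounded affiliated operators delicate.

```python
# Well-known obstruction (not specific to this article): in M_n(C) every
# commutator QP - PQ has trace zero, so the Heisenberg relation
# QP - PQ = i*hbar*I cannot hold for matrices, since trace(i*hbar*I) != 0.
import numpy as np

rng = np.random.default_rng(2)
n = 5
Q = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
P = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

commutator = Q @ P - P @ Q
print(abs(np.trace(commutator)) < 1e-9)  # True: trace of a commutator is 0
print(np.trace(1j * np.eye(n)))          # i*n, which is never zero for n >= 1
```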