Biomolecular NMR, UCSB Chem and Biochem NMR Facility

Biomolecular NMR

The NMR Facility at the Department of Chemistry and Biochemistry also supports macromolecular NMR research led by Prof. Rick Dahlquist and involving several other groups from Chemistry and MCDB, including the groups of Norbert Reich, John Lew, David Low, and Herbert Waite. The following illustrates some of the techniques in protein NMR research using examples mostly from collaborative work with researchers at the department.

Isotope Labeling, Editing, and Heteronuclear NMR Spectroscopy
Chemical Shift Indexing and Secondary Shifts
Intermolecular Interaction: Chemical Shift Perturbation Mapping
Structure Determination and NOESY
Protein Dynamics
Structure Validation with Residual Dipolar Coupling (RDC)
TROSY: Transverse Relaxation Optimized Spectroscopy

Isotope Labeling, Editing, and Heteronuclear NMR Spectroscopy

Multi-channel, multi-nuclear pulsing capability is required in most biomolecular NMR research, where the macromolecule is often isotopically labeled and heteronuclear NMR techniques are applied. One key purpose of isotope labeling is to enhance the dispersion of the ¹H signals by encoding them with the through-bond correlated ¹⁵N or ¹³C chemical shifts. Without isotope editing, thousands of ¹H signals from a macromolecule would overlap in a narrow ¹H frequency range, making detailed study impossible. Likewise, crowding of the ¹⁵N and ¹³C shifts is reduced through correlation to the adjacent ¹H.

Crystal structure of HhaI, a 37 kDa DNA-binding methyltransferase. Positions of backbone amide ¹⁵N atoms are shown as spheres. The protein polypeptide chain is color coded to change from blue (N-terminus) to red (C-terminus). ¹⁵N-edited NMR spectroscopy, such as 2D ¹H-¹⁵N HSQC shown in the right panel, allows the detection of all ¹⁵N-attached protons while all other proton signals are suppressed. Each sphere in the structure above represents a ¹H-¹⁵N pair that gives a peak in the spectrum.

1D ¹H spectrum of the amide NH region (top) and 2D ¹H-¹⁵N correlation spectrum (HSQC). The overlapping ¹H signals from over 300 NH protons in the 1D spectrum are mostly resolved along the ¹⁵N dimension in the 2D spectrum. Red peaks are NH resonances that are folded/aliased from outside the spectral window along the ¹⁵N dimension.
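Where an aliased peak appears can be worked out from the spectral window. The following is a minimal sketch, assuming simple wrap-around aliasing (a peak outside the window reappears shifted by a whole number of spectral widths). The function name and the window limits are hypothetical and are not taken from the HhaI spectrum.

```python
def aliased_shift(true_ppm, window_low_ppm, window_high_ppm):
    """Apparent shift of a peak under wrap-around aliasing (assumed model)."""
    width = window_high_ppm - window_low_ppm
    # Shift the peak by whole spectral widths until it lands inside the window
    return (true_ppm - window_low_ppm) % width + window_low_ppm

# Hypothetical 15N window of 105-135 ppm
print(aliased_shift(138.0, 105.0, 135.0))  # 108.0 (aliased)
print(aliased_shift(120.0, 105.0, 135.0))  # 120.0 (inside the window, unchanged)
```

The true shift of an aliased peak is recovered by adding or subtracting the spectral width from its apparent position.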

Isotope labeling of a protein is achieved when the protein is overexpressed in cells grown in isotope enriched minimal media. Typical labeling involves replacing naturally abundant ¹⁴N with ¹⁵N, ¹²C with ¹³C, and sometimes ¹H with ²H (or D). In one type of 3D or 4D experiment conducted with an isotopically ¹⁵N/¹³C/²H triply labeled sample, the protein backbone amide ¹H-¹⁵N nuclei are through-bond correlated with sidechain ¹³C nuclei while sidechain ²H nuclei are decoupled. Both 3D and 4D NMR data, once processed into the frequency domain, are simply 2D planes similar to any other 2D NMR spectrum, except that there are many of these 2D planes and each is indexed by a ¹⁵N or ¹³C chemical shift in a 3D spectrum, or double indexed by both ¹⁵N and ¹³C shifts in a 4D spectrum.

Isotope labeling has another advantage that is fundamental to structural biology. ¹H signals alone are not sufficient to address the wide range of questions related to the structure and biological function of the molecules. A number of important NMR techniques involve studies using the isotopes directly. These include, among others: (1) secondary structure prediction based on ¹³C chemical shifts in a folded protein, (2) protein dynamics through measurement of ¹⁵N or ¹³C relaxation properties, and (3) structure confirmation and modeling with residual dipolar couplings of ¹H-¹³C and ¹H-¹⁵N.

Chemical Shift Indexing and Secondary Shifts

It is well known that when a protein folds into a specific 3D structure from a random coil polypeptide, chemical shift dispersion of both ¹H and ¹³C nuclei increases, reflecting more diverse local chemical environments of the nuclei and specific atomic contacts. ¹H and ¹³C chemical shifts are sensitive to local secondary structure formation such as helix, strand or random coil. Chemical shifts of the backbone ¹³CO and ¹³Cα and the sidechain ¹³Cβ nuclei in random coil peptides have been measured and compared with chemical shifts from residues in well-formed secondary structures in proteins. It has been well established that these chemical shifts correlate well with the type of secondary structure. These findings led to the popular secondary structure analysis technique named chemical shift indexing or secondary shift analysis.

The secondary chemical shifts are first calculated as the differences between the measured values from the protein and the random coil values. These differences are plotted as a function of residue number. Runs of consecutive positive or negative deviations are classified as either helix or strand structure, with the degree of deviation corresponding to a level of confidence in the structural assignment. Such an approach has remarkable accuracy, particularly when coupled with a database search of short peptide sequences against a set of PDB structures, as implemented in the program TALOS. Proton secondary shifts of CαH groups also correlate with protein secondary structure. However, ¹³C chemical shifts are more reliable in this analysis due to their wider chemical shift ranges and the greater number of ¹³C nuclei, including Cα, Cβ and CO, available for this analysis.
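The calculation described above can be sketched in a few lines. This is an illustration only: the 0.7 ppm threshold, the minimum run length of three residues, and all numbers in the example are assumptions, and the random coil values would have to come from a published table. For Cα, positive deviations are read as helix and negative deviations as strand.

```python
def secondary_shifts(measured, random_coil):
    """Secondary shift per residue: measured minus random coil value (ppm)."""
    return [m - rc for m, rc in zip(measured, random_coil)]

def classify_ca(deltas, threshold=0.7, min_run=3):
    """Label runs of consecutive C-alpha deviations.

    'H' = helix (positive run), 'E' = strand (negative run), '-' = unassigned.
    threshold and min_run are assumed values for this sketch.
    """
    signs = [1 if d > threshold else -1 if d < -threshold else 0 for d in deltas]
    labels = ["-"] * len(deltas)
    start = 0
    while start < len(signs):
        end = start
        # Extend the run while the sign of the deviation stays the same
        while end < len(signs) and signs[end] == signs[start]:
            end += 1
        if signs[start] != 0 and end - start >= min_run:
            for i in range(start, end):
                labels[i] = "H" if signs[start] > 0 else "E"
        start = end
    return "".join(labels)

# Hypothetical C-alpha secondary shifts (ppm) for ten residues
example = [0.1, 2.1, 2.8, 3.0, 1.9, 0.2, -1.2, -1.5, -0.9, 0.3]
print(classify_ca(example))  # -HHHH-EEE-
```

Programs such as TALOS go well beyond this by combining several nuclei with a database search, so the sketch only conveys the basic idea of the index.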

Structure of a flagellar motor protein FliM

Deviation of Cα chemical shifts from average values for random coil peptides. The secondary structure of FliM is schematically drawn above the figure with helices denoted by cylinders and strands by rectangles.

Dyer et al (2009) A molecular mechanism of bacterial flagellar motor switching. J. Mol. Biol. 388, 71-84.

Intermolecular Interaction: Chemical Shift Perturbation Mapping

When a protein interacts with a small molecule ligand or another macromolecule, perturbations to the NMR crosspeaks of the interface residues are often observed. These changes are used to identify the binding interface. This method, called "chemical shift perturbation mapping", is widely used under the assumption that the shift perturbations, due to either structural changes or close contact, are localized in the macromolecule. While this assumption holds in most cases, large-scale chemical shift perturbations can obscure the actual binding site, rendering the data analysis less reliable. This situation can arise from global structural changes or from multiple binding sites, among other causes.

The binding assay is often done using heteronuclear NMR with one partner isotope labeled to observe its signals and the other unlabeled or deuterated.

As in all binding studies by NMR, the magnitude of the exchange rate (K_ex) between the free and bound forms relative to their chemical shift difference (ΔCS) has a strong influence on how the titration data appear and should be interpreted.

Slow exchange: typically with a K_D in the μM range or smaller, where the exchange is slow (K_ex << ΔCS, in Hz), distinct resonances from the free and bound forms are observed. There are no "intermediate" resonances that appear in between the free and bound peak positions.
Intermediate exchange: typically with a K_D in the μM range or larger, where the exchange rate is comparable to the shift difference of the free and bound forms (K_ex ~ ΔCS), line-broadening occurs. The exchange broadening may cause the peaks to disappear altogether.
Fast exchange: typically with a K_D of 10 μM or higher, where the exchange between the free and bound forms is rapid (K_ex >> ΔCS). An average peak between the free and bound positions is observed. Peak linewidth and chemical shift should correspond to the population-weighted average values of the free and bound forms.
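The three regimes can be expressed as a simple comparison, together with the population-weighted peak position expected in fast exchange. This is a sketch under stated assumptions: the factor of 10 used to stand in for "<<" and ">>" is arbitrary, the comparison is made in Hz as in the text above, and the function names are hypothetical.

```python
def exchange_regime(k_ex, delta_hz, factor=10.0):
    """Classify the exchange regime from the exchange rate (s^-1) and the
    free/bound shift difference (Hz). factor is an assumed cutoff."""
    delta = abs(delta_hz)
    if k_ex * factor <= delta:
        return "slow"
    if k_ex >= factor * delta:
        return "fast"
    return "intermediate"

def fast_exchange_shift(shift_free, shift_bound, fraction_bound):
    """Population-weighted average peak position observed in fast exchange."""
    return (1.0 - fraction_bound) * shift_free + fraction_bound * shift_bound

print(exchange_regime(5.0, 100.0))     # slow
print(exchange_regime(100.0, 100.0))   # intermediate
print(exchange_regime(5000.0, 100.0))  # fast
```

Because the shift difference in Hz grows with field strength and differs from residue to residue, the same complex can show different regimes for different peaks.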

Specific (strong, slow-exchange) and nonspecific (weak, fast-exchange) interactions between HhaI and DNA. When a piece of non-cognate DNA resembling the cognate DNA in sequence is added to HhaI, the protein resonances display both specific and nonspecific binding characteristics. Shown on the right is a superposition of a set of titration spectra of HhaI showing Phe79 going through a transition from the free form (binary) to the DNA-bound specific complex and nonspecific complex, ending with a mixed population of specific and nonspecific complexes. The continuous movement of the nonspecific complex peaks along a straight line is typical of weak binding with a fast exchange rate. In the slow exchange case, the intermediate peaks are not observed and the bound peak simply gets stronger as the free protein peak gets weaker until binding is saturated with DNA.

Keep in mind that the exchange rate and binding affinity may fall anywhere in between these three cases. In the fast-exchange case, the K_D of the binding can often be obtained by fitting the chemical shift as a function of substrate concentration. In the borderline cases, however, detailed analysis may not always be possible.
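For the fast-exchange case, such a fit can be sketched as follows, assuming a simple 1:1 binding model in which the observed shift change is the maximum shift change multiplied by the fraction of protein bound. The grid search, the function names, and the synthetic titration data are all illustrative assumptions and not the procedure used for the data shown here.

```python
import math

def fraction_bound(p_total, l_total, kd):
    """Fraction of protein bound for 1:1 binding (all concentrations in one unit)."""
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / (2.0 * p_total)

def fit_kd(p_total, ligand_concs, observed_shifts, kd_grid):
    """Grid search over K_D. For each trial K_D the best maximum shift change
    follows from linear least squares. Returns (K_D, maximum shift change)."""
    best = None
    for kd in kd_grid:
        f = [fraction_bound(p_total, l, kd) for l in ligand_concs]
        denom = sum(x * x for x in f)
        if denom == 0.0:
            continue
        d_max = sum(x * y for x, y in zip(f, observed_shifts)) / denom
        sse = sum((d_max * x - y) ** 2 for x, y in zip(f, observed_shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, d_max)
    return best[1], best[2]

# Synthetic, noise-free titration (hypothetical: 100 uM protein, K_D 300 uM, 40 Hz)
ligand = [0.0, 50.0, 100.0, 200.0, 400.0, 800.0, 1600.0]
shifts = [40.0 * fraction_bound(100.0, l, 300.0) for l in ligand]
kd_grid = [10.0 * k for k in range(1, 101)]  # trial K_D values, 10 to 1000 uM
kd_fit, d_max_fit = fit_kd(100.0, ligand, shifts, kd_grid)
print(kd_fit, round(d_max_fit, 3))
```

With real data, a nonlinear least-squares routine and an error estimate would normally replace the grid search, and several residues are often fitted together.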

Interaction of heparin with N-terminal domain of hepatocyte growth factor (HGF-N). This is a slow exchange case with apparent K_D of μM or smaller. When heparin is added to ¹⁵N-labeled protein, the amide NH peaks are perturbed, shown as movements from the black peaks to the red peaks. A number of positively charged residues form a binding surface in the protein and interact with negatively charged heparin. The combined chemical shift perturbations (in Hz) of amide ¹H and ¹⁵N reveal residue-specific changes upon heparin binding. These changes, when color coded onto the structure, form a unique binding surface.
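One way to combine ¹H and ¹⁵N shift changes into a single value in Hz is sketched below. The conversion from ppm to Hz is standard (1 ppm equals the Larmor frequency of the nucleus in MHz, expressed in Hz), but the root-sum-of-squares combination and the 600 MHz spectrometer frequencies are assumptions, since the exact formula used for the figure is not stated here.

```python
import math

def combined_csp_hz(d_h_ppm, d_n_ppm, h_freq_mhz=600.0, n_freq_mhz=60.8):
    """Combine amide 1H and 15N shift changes (ppm) into one value in Hz.

    h_freq_mhz and n_freq_mhz are the assumed Larmor frequencies of the two
    nuclei on the spectrometer used.
    """
    d_h_hz = d_h_ppm * h_freq_mhz  # ppm -> Hz for 1H
    d_n_hz = d_n_ppm * n_freq_mhz  # ppm -> Hz for 15N
    return math.sqrt(d_h_hz ** 2 + d_n_hz ** 2)

# Hypothetical residue: 0.05 ppm change in 1H and 0.5 ppm change in 15N
print(round(combined_csp_hz(0.05, 0.5), 1))
```

Expressing both changes in Hz puts the two nuclei on a common scale without an empirical weighting factor.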

Zhou et al (1998) The solution structure of the N-terminal domain of hepatocyte growth factor reveals a potential heparin-binding site. Structure 6, 109-116

Zhou et al (1999) Identification and dynamics of a heparin-binding site in hepatocyte growth factor. Biochemistry 38, 14793-802.

Research conducted at National Cancer Institute - Basic Research Program, Frederick, Maryland.



Interaction of bacterial chemotaxis receptor with CheW. This is a fast exchange binding with weak affinity. CheW (18 kDa) is deuterated except that the methyl groups of Ile, Leu, and Val residues are protonated and ¹³C labeled. The labeling scheme allows only these methyl groups to be detected in the ¹H-¹³C correlation spectrum of CheW mixed with deuterated receptor. The dissociation constant (K_D ~ 300 μM) is measured by the chemical shift change as a function of receptor fragment concentration. The area colored yellow and red in the CheW structure is the receptor-binding region showing resonance perturbations.

Vu et al (2012) The receptor-CheW binding interface in bacterial chemotaxis. J. Mol. Biol. 415, 759-767.

Interaction of methyltransferase HhaI with its DNA target (plot: Chemical Shift Perturbation (Hz) vs Residue Number). This is a case where large, extensive chemical shift perturbations, observed in the ¹H-¹⁵N correlation spectrum, occur upon DNA binding. Exchange is slow with a K_D below a few nM. A 20-residue active site loop switches from an open conformation in the apo-protein to a closed one to lock up the target DNA. The long-range perturbations to remote sites, as can be seen from the color-coded figure, make precise determination of the binding site difficult. Nevertheless, the perturbation map is consistent with the X-ray structure of the complex.

Zhou et al (2009) The recognition pathway for the DNA cytosine methyltransferase M.HhaI. Biochemistry 48, 7807-7816.

Structure Determination and NOESY

Solution structure determination is one of the main applications of NMR techniques. It not only complements X-ray structures but, importantly, also provides a more native-like structure in solution. In some cases, the protein fails to crystallize and NMR becomes the only tool to provide a high-resolution structure.

The structure determination process typically starts with overexpression of isotope-labeled proteins, followed by a series of 3D and/or 4D NMR experiments that enable ¹H, ¹⁵N and ¹³C resonance assignments to specific nuclei in the protein. Sequential assignment is done on the computer by examining crosspeaks that correlate neighboring residues and linking them in sequence order. Once the sequential assignments are made, NOESY (Nuclear Overhauser Effect Spectroscopy) data are analyzed and assigned to proton pairs.

Structure, Distance and NOE

(a) Section of CheY structure formed by a strand, a turn and a helix starting from the N-terminal end. The helix is a compact structure and the strand is an extended one, so the distances between neighboring amide protons (green) are short in a helix and much longer in a strand. This leads to strong amide-amide proton NOEs for a helical structure and weak NOEs for a strand structure. The patterns of amide-amide and amide-CαH proton NOEs are good indicators of secondary structure. 1/r⁶ rule: NOE intensity drops off as 1/r⁶, where r is the distance between the protons. The rapid dropoff only allows NOEs between protons less than 5 angstroms apart, making NOE a sensitive measure of short distances.

(b) Summary of hydrogen exchange experiment, NH-CαH ³J_NHα coupling constants and NOE data. In a hydrogen exchange (HX) experiment, the protein sample in H₂O is rapidly switched into a D₂O buffer and a ¹H-¹⁵N correlation spectrum is collected immediately. The peaks from the amides protected from HX, mostly due to hydrogen bonds and/or burial, remain in the spectrum long after the exchange (minutes, hours, or days), while those that disappear quickly are presumed not to be hydrogen-bonded or buried, and have therefore been replaced with deuterons and are not observable. NH-CαH ³J_NHα coupling constants correlate well with the dihedral angle Φ according to the Karplus equation. Large (> 8 Hz) J-values, shown as filled circles, indicate extended strand structure; smaller (< 7 Hz) values indicate helical structure. The NOEs among various neighboring protons (NH-NH, CβH-NH, CαH-NH) are shown by black lines. The thickness of the lines corresponds to the strength of the NOE. The secondary structure assignment based on these data is drawn above the figure.
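The relation between the ³J_NHα coupling and the dihedral angle Φ can be illustrated with a Karplus-type function. The functional form is standard, but the three coefficients below are one commonly quoted parameterization and should be treated as an assumption. The function name is hypothetical.

```python
import math

def karplus_3j_hn_ha(phi_deg, a=6.51, b=-1.76, c=1.60):
    """3J(HN-Halpha) in Hz from the backbone dihedral phi (degrees).

    J = a*cos^2(phi - 60) + b*cos(phi - 60) + c, with assumed coefficients.
    """
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c

print(round(karplus_3j_hn_ha(-60.0), 1))   # helix-like phi, small coupling
print(round(karplus_3j_hn_ha(-120.0), 1))  # strand-like phi, large coupling
```

With these coefficients a helix-like Φ near -60° gives a coupling well below 7 Hz and a strand-like Φ near -120° gives one above 8 Hz, in line with the ranges quoted in the caption.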

NOE occurs through the dipole-dipole interaction between protons within a short distance, typically within 5 Angstrom. These interactions are presented in the NOESY spectrum as crosspeaks linking two protons. Distance information is obtained from NOESY spectra and is applied in the form of distance restraints in a structure calculation program such as X-PLOR to fold the polypeptide chain into a 3D structure. Other constraints in the calculation include H-bonds related to secondary structure, determined by a combination of chemical shift index (see above), dihedral angle measurements, NOE patterns, D₂O hydrogen exchange experiments, etc.
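The 1/r⁶ dependence can be turned into a rough distance estimate by calibrating against a proton pair of known separation. The sketch below assumes an isolated spin pair and ignores spin diffusion and differential dynamics. The reference distance and intensities are hypothetical. In practice NOE intensities are usually binned into distance ranges rather than converted to exact distances.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Distance (same unit as ref_distance) from the 1/r^6 intensity dependence."""
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)

def relative_noe(distance, ref_distance):
    """NOE intensity at a given distance relative to the reference pair."""
    return (ref_distance / distance) ** 6

# Doubling the distance from 2.5 to 5.0 angstroms reduces the NOE 64-fold
print(relative_noe(5.0, 2.5))  # 0.015625
print(round(noe_distance(1.0 / 64.0, 1.0, 2.5), 3))
```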

NOESY Spectrum and Reduction of Peak Degeneracy with Isotope Editing Technique

(a) 2D ¹⁵N-edited NOESY from a ~130 amino acid, ¹⁵N,¹³C-labeled phosphotransfer domain (P1) of chemotaxis histidine kinase CheA. The NOE crosspeaks here are between ¹⁵NH amide protons and all other protons.

(b) ¹H-¹H NOESY plane at 120 ppm along ¹⁵N from a 3D ¹⁵N-edited NOESY spectrum. The dramatically reduced crowdedness of the ¹H peaks allows the assignments of thousands of NOE crosspeaks to individual proton pairs. The 2D crosspeaks in (a) are spread over 64 ¹H-¹H planes, each with a unique ¹⁵N shift.

(c) Slice at apparent shifts of ¹⁵N: 120 ppm and ¹³C: 62.5 ppm from 4D ¹⁵N,¹³C-edited NOESY. All peaks seen are NOE crosspeaks between ¹⁵N-attached protons and sidechain ¹³C-attached protons. Compare this figure with the ¹H-¹H plane in the 3D ¹⁵N-edited NOESY in (b) from the same ¹⁵N slice. Note the reduction in the number of peaks in the 4D spectrum due to the additional ¹³C editing, as shown by the appearance of the boxed peaks but not others among the vertical strips of peaks having NOE to these two NH protons.

(d) P1 Structure determined using 2759 NOE distance constraints along with some H-bond and dihedral angle constraints. On the left is a best-fit superposition of 25 structures. On the right is a ribbon diagram of the average helix bundle structure.

Structure of P1-CheY Complex

Intermolecular NOEs detected with 4D ¹⁵N and ¹³C-edited NOESY. One binding partner is ¹⁵N-labeled and the other ¹³C-labeled. Therefore, only intermolecular {¹⁵N}¹H-{¹³C}¹H NOEs are detected.

P1 and CheY complex structure calculated with a number of intermolecular NOE distances and long-range distances obtained with paramagnetic relaxation enhancement (PRE) through spin labeling of engineered cysteine residues.

Mo et al (2012) Solution structure of a complex of the histidine autokinase CheA with its substrate CheY. Biochemistry 51, 3786-3798.

Protein Dynamics

The ability to probe protein dynamics is a hallmark advantage of NMR in addressing a range of questions relevant to protein function. A protein is not a static collection of atoms fixed in a single 3D structure. The structure fluctuates, both on the ps time scale and sometimes on the μs to ms time scale that is often relevant to its function. Protein backbone amide ¹⁵N nuclei are common probes of the local dynamics at each residue. ¹³C and ²H nuclei are also used as probes of backbone or sidechain dynamics. ¹H nuclei are less suitable probes in a macromolecule because their strong interactions with multiple partners tend to complicate data interpretation.

Measured T₂ values and extracted order parameters from the phosphotransfer domain of the histidine kinase CheA. Increased T₂ values, often found in the terminal ends and in turns between the rigid helices (shown as A-E rectangles), reflect more flexibility of the protein backbone in these regions. This trend is also reflected in the extracted order parameter based on Lipari-Szabo's model-free approach assuming the relaxation of the amide ¹⁵N spin is caused by dipolar coupling from the amide ¹H in addition to chemical shift anisotropy, both modulated by molecular tumbling and internal local dynamics.
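The model-free approach describes the motion of each bond vector with a spectral density function that depends on the order parameter S², the overall tumbling time, and an effective internal correlation time. A minimal sketch of that function in its original two-parameter form is given below. The function and argument names are hypothetical, and fitting S² to measured T₁, T₂ and NOE values is not shown.

```python
def model_free_j(omega, s2, tau_m, tau_e):
    """Lipari-Szabo spectral density J(omega).

    omega: angular frequency (rad/s); s2: squared order parameter (0 to 1);
    tau_m: overall rotational correlation time (s);
    tau_e: effective internal correlation time (s).
    """
    tau = tau_m * tau_e / (tau_m + tau_e)  # 1/tau = 1/tau_m + 1/tau_e
    overall = s2 * tau_m / (1.0 + (omega * tau_m) ** 2)
    internal = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * (overall + internal)

# Hypothetical residue: rigid (S2 = 1.0) vs flexible (S2 = 0.5), 10 ns tumbling
rigid = model_free_j(0.0, 1.0, 1.0e-8, 1.0e-11)
flexible = model_free_j(0.0, 0.5, 1.0e-8, 1.0e-11)
print(flexible < rigid)  # True
```

A lower S² reduces J(0), which dominates transverse relaxation, and this is why flexible regions show longer T₂ values.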

Protein backbone dynamics of a 233-residue fragment of chemotaxis histidine kinase CheA revealed by measured T₁, T₂ and extracted overall rotational correlation times of backbone amide ¹⁵N spins. The two domains of the fragment are joined by a flexible linker with little contact between the domains, evidenced by different dynamics parameters and high mobility in the linker. The N-terminal domain (residues 1-134) is twice as large as the C-terminal domain.

Zhou et al (1996) The phosphotransfer domain and the CheY-binding domain of the histidine kinase CheA are joined by a flexible linker. Biochemistry 35, 433-443.

Structure Validation with Residual Dipolar Coupling

Under isotropic Brownian rotation, the energy contribution from dipolar coupling between two closely spaced spins averages to zero in solution. However, partial coupling can be recovered by mixing the macromolecules with a special medium such as bicelles, phage particles or other chemicals that offer rotational restriction but do not interact with the macromolecules or cause severe line-broadening. The macromolecules in such media often exhibit a small degree of non-uniform rotation, sandwiched between the much larger media molecules, due to the alignment of the media molecules with the strong external magnetic field. This molecular alignment enhances the existing but small rotational anisotropy and leads to small but measurable residual dipolar couplings (RDCs) for the macromolecule of interest.

RDCs correlate the bond vector directions, e.g. amide ¹H-¹⁵N bonds in a protein, with the molecular rotational diffusion properties; both are described in the same molecular frame. RDCs can be measured from the differences in ¹H-¹⁵N or ¹H-¹³C couplings with and without alignment in the magnet. By fitting these RDCs to a known structure, the solution structure can be validated and differences can be detected. The measured RDCs can also be used as additional constraints to refine a structure computed with other constraints, primarily distance constraints provided by NOEs. Unlike most other experimental data used as constraints, RDCs provide a global structure profile and therefore are often used in the later stages of the structure calculation.

RDC values (Hz) were measured for DNA methyl transferase M.HhaI bound with a DNA fragment. These values were fit with the crystal structures of the apo- and DNA-bound structures, both bound with a cofactor. The measured RDCs are consistent with the DNA-bound ternary crystal structure and confirm a large closure movement by a 20 amino acid active-site loop to lock down the DNA as observed in the crystal structure. The DNA fragment, looked down towards its helix axis, is shown in green. The cofactor is in red. The alignment media used was a mixture of alkyl-polyethylene glycol C12E5 and hexanol.

Matje et al (2013) Enzyme-Promoted Base Flipping Controls DNA Methylation Fidelity. Biochemistry 52, 1677-1685.

Structure refinement using RDC values as constraints. The precision of DNA structure improves with the inclusion of RDCs in addition to other constraints.

Zhou et al (2001) Incorporating residual dipolar couplings into the NMR solution structure determination of nucleic acids. Biopolymers 52, 168-80.

Amide ¹H-¹⁵N bond vectors in the molecular frame. The vector from the center of the sphere to each dot is an amide ¹H-¹⁵N vector. The useful information content in measured RDCs relies on a wide, more uniform distribution of the vector direction reflecting the molecular shape. The non-spherical shape of the molecule enhances asymmetrical molecular tumbling and enlarges measured RDCs. Measured RDCs in the TM14 chemoreceptor fragment would be heavily biased along one long axis and do not provide independent data content addressing the other perpendicular axes.

Amide ¹H-¹⁵N bond RDC measurement for the phosphotransfer domain (~134 aa) of the histidine kinase CheA. The one-bond ¹J + RDC value (~90-100 Hz) shown along the vertical ¹⁵N dimension is measured in two separate experiments to give the two split peaks, overlaid in black and red. The difference of this value in the presence and absence of the alignment media of Pf1 phage particles is taken as the RDC.
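The subtraction described in this caption is simple enough to sketch directly. The residue numbers and splittings below are hypothetical, and the sketch works with the magnitudes of the measured splittings. Sign conventions for ¹J and the RDC vary and would need to be handled consistently in a real analysis.

```python
def rdc_from_splittings(aligned_hz, isotropic_hz):
    """RDC per residue: splitting in the alignment medium (J + D) minus the
    splitting without alignment (J). Residues missing from either set are skipped."""
    return {res: aligned_hz[res] - isotropic_hz[res]
            for res in aligned_hz if res in isotropic_hz}

# Hypothetical splittings (Hz) keyed by residue number
isotropic = {10: 93.0, 11: 92.5}
aligned = {10: 98.0, 11: 85.5, 12: 90.0}
print(rdc_from_splittings(aligned, isotropic))  # {10: 5.0, 11: -7.0}
```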

RDCs between ¹H and ¹³C_α are measured in a 3D HNCA experiment with (red) and without (black) Pf1 phage as the alignment media.

TROSY: Transverse Relaxation Optimized Spectroscopy

TROSY is a type of NMR experiment in which the cross-correlated relaxation between dipolar coupling (DC) and chemical shift anisotropy (CSA) is exploited to obtain the sharpest component of the coupled NMR signals. Both DC and CSA are modulated by the same time-dependent molecular tumbling, so their contributions to relaxation are tightly correlated. The two tensors, both causing relaxation, decompose into additive and subtractive terms when projected onto the same molecular frame of reference, thereby enhancing certain components of the relaxation and reducing others. To benefit from the DC-CSA cancellation effect on relaxation, decoupling of one spin from the other has to be left off to prevent averaging of the fast and slow relaxation. For a ¹H-¹⁵N pair, one of the four coupled peaks is the sharpest, with the TROSY gain from both nuclei. Coherence selection techniques allow only this particular coherence to be selected for detection.

The CSA contribution to relaxation scales with the magnetic field as B², while the DC contribution is independent of the field. The best cancellation occurs at a field corresponding to a ¹H frequency slightly above 1 GHz, but the effect is also obvious at 600-800 MHz if the protein is deuterated. For more details, see Pervushin, K.; Riek, R.; Wider, G.; Wüthrich, K. Attenuated T2 relaxation by mutual cancellation of dipole-dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 12366-12371.
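The field at which the two mechanisms cancel for the amide ¹⁵N can be estimated by setting the dipolar term equal to the CSA term. The sketch below rests on several assumptions: an axially symmetric ¹⁵N CSA tensor collinear with the N-H bond, an N-H distance of 1.02 angstroms, and a CSA magnitude of 156 ppm. The last two are illustrative inputs, and the result shifts noticeably when they are changed.

```python
import math

MU0_OVER_4PI = 1.0e-7   # vacuum permeability / 4 pi, SI units
HBAR = 1.054571817e-34  # reduced Planck constant, J s
GAMMA_H = 2.6752e8      # 1H gyromagnetic ratio, rad s^-1 T^-1
GAMMA_N = 2.7126e7      # 15N gyromagnetic ratio (magnitude), rad s^-1 T^-1

def trosy_optimum_h_freq_hz(r_nh_m=1.02e-10, csa_ppm=156.0):
    """1H frequency (Hz) at which the 15N CSA term equals the 1H-15N dipolar term.

    Assumes a collinear, axially symmetric CSA tensor; inputs are illustrative.
    """
    # Dipolar term: p = (mu0/4pi) * gamma_H * gamma_N * hbar / (2*sqrt(2) * r^3)
    p = MU0_OVER_4PI * GAMMA_H * GAMMA_N * HBAR / (2.0 * math.sqrt(2.0) * r_nh_m ** 3)
    # CSA term: delta = gamma_N * B0 * csa / (3*sqrt(2)); solve delta == p for B0
    b0 = p * 3.0 * math.sqrt(2.0) / (GAMMA_N * csa_ppm * 1.0e-6)
    return GAMMA_H * b0 / (2.0 * math.pi)

print(round(trosy_optimum_h_freq_hz() / 1.0e9, 2), "GHz")
```

Because the CSA term grows linearly with the field while the dipolar term stays constant, the cancellation improves steadily as the field rises toward this value.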

TROSY is widely used for large molecules when they are deuterated. Deuteration removes other ¹H dipolar couplings that contribute a background broadening to the peaks. Generally, deuteration (or ²H labeling) is necessary in cases where the large size of the macromolecule leads to a shortened T₂ relaxation time due to strong ¹H dipolar couplings and slow molecular tumbling, which in turn lowers the signal-to-noise ratio. ²H (or D) has only a very small magnetic moment and therefore only a small dipolar interaction with other nuclei. Replacing some ¹H with ²H in a protein often dramatically increases the T₂ of the ¹H and ¹³C nuclei of interest and enhances the sensitivity of most experiments. In one scheme, all protons are uniformly deuterated except the backbone amide NH and other labile protons, which remain detectable. In another scheme, using special protein production techniques, only the methyl CH₃ groups are ¹³C-labeled and protonated, leaving all other carbons unlabeled and all other protons deuterated except the labile protons. These labeling schemes enable NMR experiments on macromolecules or complexes of 30 kDa or larger.

Advances in NMR probe technology, increases in field strength, improved isotope-labeling techniques and the development of TROSY have together enormously broadened the biomolecular NMR field in the last two decades. The size of proteins or complexes that can be studied by NMR has increased to several hundred kDa. The precision of NMR structure determination has reached a level comparable to that of X-ray crystal structures in many cases.

Hongjun Zhou, updated 2019