We strive to illuminate biological functions at the molecular level through the discovery and analysis of three-dimensional protein structures.

Form determines function in biology, even at the level of individual molecules. The understanding of biological function derived from three-dimensional structures of key proteins is one of the most stunning outcomes of the molecular revolution in biology, which began with the realization that DNA codes for RNA, and RNA codes for protein. Proteins are the chemical machines of living systems; they are the control, communication and molecular transportation molecules of cells; and they form part of the structural skeleton. All proteins in an organism can be identified from its genome sequence, a parts list, as it were, for the chemical, regulatory, transport and communications systems of the organism. Genome sequences also provide linear amino-acid sequences for the set of proteins encoded in the genome. However, the three-dimensional structure — the unique "fold" of the polypeptide chain — must be known to understand the function and mechanism of a protein. The sequence of amino acids determines the fold of a protein, but at present the fold cannot be deduced from the amino acid sequence. Protein folds must be determined experimentally.

The overall goal of our research is to understand biological function at the molecular level through knowledge of three-dimensional protein structure. X-ray crystallography is the experimental method we use to determine protein structures. We have contributed to the development of new methods for rapid structure determination so that knowledge of key protein structures influences the study of biological problems early rather than retrospectively. This work takes advantage of powerful synchrotron X-ray sources, which are both tunable and extremely intense relative to conventional laboratory sources. The new methodology, multiwavelength anomalous diffraction (MAD), exploits the tunability of synchrotron sources to determine protein crystal structures rapidly and directly. Several years ago we demonstrated the broad applicability of the MAD method by showing that it can be used to solve crystal structures of large proteins. MAD is now used routinely and is the method of choice for structure determination for us and many others.

Even though we can now determine protein structures rapidly, it is not possible to solve structures for all proteins that are relevant to all important biological processes. Therefore, a major application of protein structure information is to predict the function or molecular mechanism of other proteins. This is possible because Nature repeats successful molecular solutions to biological problems by gene duplication and adaptation of the duplicate copy to a new function. A theme throughout our work has been to transfer our understanding of the molecular mechanisms of the proteins we study to other proteins.

One of the biological systems we study illustrates the sophisticated control mechanisms that balance the many biochemical pathways in living cells. In this case, the overall metabolic health of the cell influences the availability of nitrogen for synthesis of new biomolecules by using a central carbohydrate metabolite to deliver nitrogen for biosynthesis rather than a simple nitrogen molecule such as ammonia (NH3). However, cells pay a price for the homeostasis provided by such a nitrogen carrier system. Biosynthetic pathways requiring nitrogen use "complex" enzymes known as glutamine amidotransferases (GATs) to remove nitrogen from the carrier molecule glutamine. Our work has elucidated the structural basis for catalysis and control in GATs, and has uncovered several underlying features of the relevant protein structural families.

We established three-dimensional structures for the two major families among the fifteen different GAT enzymes, represented by glutamine PRPP amidotransferase (GPAT) and guanosine monophosphate synthetase (GMPS). This work showed that the GPAT and GMPS enzymes each have a structural domain for removal of nitrogen from glutamine and another for addition of nitrogen to their respective acceptor substrates.

A fundamental question from the initial structural work was how the dual catalytic domains work together to transfer nitrogen from glutamine in one active site to the acceptor substrate in the other. We showed that during each catalytic cycle a narrow tunnel for transfer of ammonia forms transiently between the two active sites of GPAT. Furthermore, the structural change that forms the tunnel is also a molecular signal between the distant active sites, allowing precise coupling of the catalytic activities. These results led to hypotheses that all GAT enzymes produce simple ammonia in one catalytic domain and channel it to a second catalytic domain, and that complex enzymes are assembled from separately evolved catalytic modules. These ideas have been verified for other GAT enzymes, most recently in our own work on imidazole glycerol phosphate synthase (IGPS).

One of the most important and fascinating aspects of structural biology is the discovery of unanticipated connections between biological systems, and the predictive power this confers. Following the initial GPAT structural work, we discovered, in collaboration with other structural biologists, that the GAT domain of GPAT is a member of an enzyme superfamily that catalyzes a variety of hydrolytic reactions. Members of the superfamily have diverged so far from their common ancestor that their homology was not detectable by analysis of amino acid sequences, but only by comparison of three-dimensional structures and by their similar chemistries. Characterization of structural superfamilies is important for assigning functions to proteins first identified in genome sequences.

The theme of protein families is present throughout our work. For example, the second domain of GPAT is a member of a protein family whose members bind PRPP. We have used the structures of GPAT and other family members to develop a structure-based catalytic mechanism for the entire family. These proteins were all thought to be enzymes catalyzing various additions to the acceptor substrate PRPP. However, Nature has adapted some members of the family to regulatory function. We solved crystal structures of two of the regulatory proteins, and have used our understanding of the molecular mechanisms of the family to explain their regulatory properties.

Another system we have studied is the photosynthetic energy-transducing cytochrome b6f complex. Photosynthesis is the remarkable conversion of light energy to chemical energy. Light is transduced to electrochemical energy by splitting water into protons, electrons and molecular oxygen, and by separating charges across a lipid membrane. Chloroplasts accumulate an electrochemical potential by passing protons and electrons through several proteins in the photosynthetic membrane. We study cytochrome b6f, which transfers electrons between the two light-absorbing protein complexes of photosynthesis and, in the process, contributes to the transmembrane proton gradient that is the basis of the electrochemical potential. We discovered a buried water chain inside cytochrome f and showed that it is highly conserved throughout the biological range of the cytochrome. The water chain may assist in the poorly understood process of proton translocation. We used the structures of cytochrome f and the Rieske protein to build a picture of the intact b6f complex and compared this with the analogous respiratory complex. The parallel systems for energy transduction in photosynthesis and respiration are an excellent example of the combination of conservation and diversity in complex biomolecular systems. Our work has led to an understanding of which energy-transducing steps of photosynthesis are homologous to those of respiration and which differ.

Catalysis, Channeling and Signaling in Complex Enzymes

Living organisms are supported by an enormous number of biochemical pathways. Sophisticated, and sometimes very subtle, control systems regulate these biochemical pathways according to the needs of the cell at any time or place. One example of subtle control is the system for delivery of nitrogen to biochemical pathways that synthesize nitrogen-containing molecules. A carrier system for nitrogen is linked to the central pathway that "burns" carbohydrates to produce energy so that the availability of nitrogen for synthesis of new biomolecules is influenced by the cellular metabolic state. In exchange for this sophisticated control feature, biosynthetic pathways requiring nitrogen must abstract it from the carrier molecule glutamine. Accordingly, Nature has evolved a set of "complex" biological catalysts known as glutamine amidotransferases (GATs). The structural basis for catalysis and control of GATs has been elucidated from crystal structures of three GAT enzymes, which have the dual function of abstracting nitrogen from glutamine and adding it to a variety of acceptor-substrate molecules.

Among the fifteen different GAT enzymes, at least two different protein families transfer glutamine nitrogen to acceptor substrates. Crystal structures of glutamine PRPP amidotransferase (GPAT) and of GMP synthetase (GMPS), which represent the two major GAT families, established that each of the enzymes has two structural domains with widely separated active sites. The initial structural work on GPAT and GMPS also uncovered the detailed structures of the active sites that catalyze removal of nitrogen from glutamine, which are quite different in these enzymes.

However, the initial structures did not explain how the dual catalytic domains work together in each enzyme. A subsequent crystal structure of GPAT, trapped in the form most relevant to catalysis, showed that the enzyme forms a narrow tunnel between the two active sites. The tunnel is created when a floppy protein loop closes over the PRPP acceptor substrate. This established that the glutamine active site is chemically distinct and produces ammonia, which is transferred through the tunnel to the second active site. Based on ideas from the closed-loop structure, it was shown that the closed floppy protein loop also signals the glutamine active site to begin producing ammonia.

GPAT is a prototype for other GAT enzymes, all of which appear to have ammonia tunnels between separated active sites. The complex enzymes are thus assembled from simpler, separately evolved catalytic modules. The newest GAT enzyme structure, that of imidazole glycerol phosphate synthase (IGPS), differs from GPAT in one respect: its ammonia tunnel between the two active sites is permanent rather than transient. The tunnel carries ammonia through the core of the protein, but is blocked by a "gate" in the resting enzyme. We anticipate that the gate will open at the appropriate moment in the catalytic cycle.