DNA–protein interactions are fundamental to the existence of life forms, providing the key to the genetic plan as well as mechanisms for its maintenance and evolution. The study of these interactions is therefore fundamental to our understanding of growth, development, differentiation, evolution, and disease. The manipulation of DNA–protein interactions is also becoming increasingly important to the biotechnology industry, permitting among other things the reprogramming of gene expression.
The success of the first edition of DNA–Protein Interactions: Principles and Protocols was the result of Dr. G. Geoff Kneale's efforts in bringing together a broad range of relevant techniques. In producing the second edition of this book, I have tried to further increase this diversity while presenting the reader with alternative approaches to obtaining the same information.
A major barrier to the study of interactions between biological macromolecules has always been detection and hence the need to obtain sufficient material. The development of molecular cloning and subsequently of protein overexpression systems has essentially breached this barrier. However, in the case of DNA–protein interactions, the problem of quantity and hence of detection is often offset by the high degree of selectivity and stability of these interactions. DNA–protein binding reactions will often go to near completion at very low component concentrations even within crude protein extracts. Thus, although many techniques described in this volume were initially developed to study interactions between highly purified components, these same techniques are often just as applicable to the identification of novel DNA–protein interactions within systems as undefined as a whole cell extract. In general, these techniques use a DNA rather than a protein detection system because the former is more sensitive. Radiolabeled DNA fragments are easily produced by a range of techniques commonly available to molecular biologists.
DNA–protein complexes may be studied at three distinct levels: at the level of the DNA, of the protein, and of the complex. At the level of the DNA, the DNA binding site may be delimited and exact base sequence requirements defined. The DNA conformation can be studied and the exact bases contacted by the protein identified. At the protein level, the protein species binding a given DNA sequence can be identified. The amino acids contacting DNA and the protein surface facing the DNA may be defined and the amino acids essential to the recognition process can be identified. Furthermore, the protein’s tertiary structure and its conformational changes on complex formation can be studied.
Finally, global parameters of a DNA–protein complex such as stoichiometry, the kinetics of its formation and dissociation, its stability, and the energy of interaction can be measured.
Filter binding, electrophoretic mobility shift assay (EMSA/gel shift), DNaseI footprinting, and Southwestern blotting have been the most commonly used techniques to identify potentially interesting DNA target sites and to define the proteins that bind them. For example, gel shift or footprinting of a cloned gene regulation sequence by proteins in a crude cell extract may define binding activities for a given DNA sequence that correlate with gene expression or silencing. These techniques can be used as an assay during subsequent isolation of the protein(s) responsible. Interference assays, SELEX, more refined footprinting techniques such as hydroxyl radical footprinting, and DNA bending assays can then be used to study the DNA component of the DNA–protein complex, whereas the protein binding surface can be probed by amino acid side chain modification, DNA–protein crosslinking, and of course by the production of protein mutants. Genetic approaches have also opened the way to engineer proteins recognizing chosen DNA targets.
DNA–protein crosslinking has in recent years become a very important approach to investigate the relative positions of proteins in multicomponent protein–DNA complexes such as the transcription initiation complex. Here, crosslinkable groups are incorporated at specific DNA sequences and these are used to map out the “positions” of different protein components along the DNA. Extension of this technique can also allow the mapping of the crosslink within the protein sequence. Similar data can be obtained by incorporating crosslinking groups at known sites within the protein and then identifying the nucleotides targeted.
Once the basic parameters of a DNA–protein interaction have been defined, it is inevitable that a deeper understanding of the driving forces behind the DNA–protein interaction and the biological consequences of its formation will require physical and physicochemical approaches. These can be either static or dynamic measurements, but most techniques have been developed to deal with steady-state situations. Equilibrium constants can be obtained by surface plasmon resonance, by spectroscopic assays that differentiate complexed and uncomplexed components, and, for more stable products, by footprinting and gel shift. Spectroscopy can also give specific answers about the conformation of proteins and any conformational changes they undergo on interacting with DNA as well as providing a rapid quantitative measure of complex formation. Microcalorimetry gives a global estimation of the forces stabilizing a given complex. Static pictures of protein–DNA interactions can be obtained by several techniques. At atomic resolution, X-ray crystallography and nuclear magnetic resonance (NMR) studies require large amounts of highly homogeneous material. Lower resolution images can be obtained by electron and, more recently, by atomic force microscopy. Large multiprotein complexes are generally beyond the scope of NMR or even of X-ray crystallography. These are therefore more often studied using the electron microscope, either in a direct imaging mode or via the analysis of data obtained from 2D pseudocrystalline arrays.
Dynamic measurements of complex formation or dissociation can be obtained by biochemical techniques when the DNA–protein complexes have half-lives of several minutes to several hours. For footprinting and crosslinking, a general rule is that the complexes should be stable for a time well in excess of the proposed period of the enzymatic or chemical reaction. For gel shift, the complex half-life should at least approach that of the time of gel migration, although the cage effect may tend to stabilize the complex within the gel matrix, extending the applicability of this technique. More rapid assembly kinetics, multistep assembly processes, and short-lived DNA–protein complexes require much more rapid techniques such as UV laser-induced crosslinking, surface plasmon resonance, and spectroscopic assays. UV laser-induced DNA–protein crosslinking is a promising development because it potentially permits the kinetics of complex assembly to be followed both in vitro and in vivo.
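The half-life rules of thumb above can be made concrete with a small calculation. The sketch below is an illustration only, not a protocol from this volume: it assumes simple first-order dissociation with no rebinding and ignores the cage effect, so it will tend to underestimate what survives in a real gel. The function name and the 90-minute run time are hypothetical values chosen for the example.

```python
def fraction_intact(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of complexes still intact after elapsed_min minutes.

    Assumes first-order dissociation with no rebinding (an idealization;
    the cage effect in a gel matrix is deliberately ignored here).
    """
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    if elapsed_min < 0:
        raise ValueError("elapsed_min must not be negative")
    # One half-life halves the population; n half-lives leave 0.5 ** n.
    return 0.5 ** (elapsed_min / half_life_min)


if __name__ == "__main__":
    gel_run_min = 90.0  # hypothetical gel migration time, for illustration
    for half_life_min in (5.0, 30.0, 90.0, 360.0):
        surviving = fraction_intact(gel_run_min, half_life_min)
        print(f"half-life {half_life_min:6.1f} min: "
              f"{surviving:.3%} of complexes intact after the run")
```

Under these assumptions, a complex whose half-life equals the migration time leaves half of the starting material intact, whereas one with a much shorter half-life is essentially lost, which is the reasoning behind turning to faster techniques for short-lived complexes.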