• Free energy diagrams

(1)

Outline

• Exam question

• Free energy diagrams

• SSBond homework discussion

• Mutagenesis - one amino acid substitution

4 May 2011 Week 10

(2)

(3)

2184 BIOCHEMISTRY VILLAFRANCA ET AL.

[Figure 1 scheme; recoverable labels: (3 free SH), (1 free SH), 1 mM DTT, S-S, NO2, COO-]

FIGURE 1: Probable reaction sequence for oxidation of Cys-39 DHFR by dithionitrobenzoate.

than 1.5 free sulfhydryls (<75% conversion) were found after oxidation by such mixtures of glutathione or cysteine (data not presented). It was observed, however, that treatment of the Cys-39 enzyme with dithionitrobenzoate (DTNB, Ellman's reagent) yielded a species with only one thionitrobenzoate moiety bound per molecule, while the same treatment of the wild-type enzyme yielded a species with two thionitrobenzoate moieties bound per molecule. These results could be explained by three possibilities: (1) formation of a disulfide bond between Cys-39 and Cys-85, (2) formation of oligomers linked by intermolecular disulfide bonds, and (3) formation of an intramolecular disulfide bond between Cys-152 and either Cys-39 or Cys-85. Oligomers were not observed, so the second possibility was eliminated. Since Cys-152 is 28 Å from Cys-39 and Cys-85, an intramolecular disulfide bond between them would require a gross distortion of the native structure. Such a distortion in conformation was not detected by nondenaturing polyacrylamide gel electrophoresis of the oxidized Cys-39 DHFR and seems unlikely in light of the oxidized enzyme's full catalytic activity (see below). It seemed reasonable, therefore, to conclude that DTNB treatment did indeed specifically oxidize the Cys-39 and Cys-85 sulfhydryls to a disulfide bond.

Since the ultraviolet spectra of bound thionitrobenzoate and of the free anion are very different and readily allow quantification, one can easily follow the course of the DTNB reaction. The reaction was determined to be nearly complete with >95% conversion to the Cys-39-Cys-85 disulfide-cross-linked enzyme. The single remaining thionitrobenzoate group, presumably linked to Cys-152, is easily removed by mild reduction with 1 mM dithiothreitol (DTT), yielding a single free sulfhydryl per molecule. Reduction of the disulfide bond, on the other hand, requires more drastic conditions, such as 0.05 M DTT and 6 M urea at 32 °C, to achieve complete reduction.

A probable reaction sequence for oxidation of Cys-39 DHFR by DTNB is depicted in Figure 1. We believe the two thionitrobenzoate moieties in the intermediate are linked to cysteine residues 39 and 152 since the X-ray structure shows that their SH groups are readily accessible to solvent, while that of Cys-85 is not.

Activity of Cys-39 DHFR. Both the reduced and oxidized forms of Cys-39 DHFR have kinetic parameters essentially identical with those of the wild-type enzyme (Howell et al., 1986). The k_cat values for wild-type and reduced and oxidized Cys-39 DHFR are 30, 28, and 27 s⁻¹, respectively. K_M values for dihydrofolate are 1.6, 1.4, and 1.2 µM, and K_M values for NADPH are 1.4, 1.2, and 1.2 µM. The previously reported loss of activity following air oxidation of Cys-39 DHFR (Villafranca et al., 1983) was probably due to the formation of various extraneous byproducts rather than simple disulfide-bond formation.

[Figure 2 plot: fraction denatured vs. Guanidine-HCl concentration, 0 to 5 M]

FIGURE 2: Gdn-HCl denaturation curves of wild-type, reduced Cys-39, and oxidized Cys-39 DHFRs. The fraction of denatured protein, F_D, was measured as (ε_obsd − ε_N)/(ε_D − ε_N), where ε_N and ε_D are the extinction coefficients at 292 nm for native and denatured forms of the protein.

Structural Stability. One of the most widely used approaches for estimation of protein structural stability is analysis of unfolding as a function of Gdn-HCl or urea concentration. These denaturants are capable of causing complete unfolding of proteins in a reversible manner that can be described by a two-state mechanism. Given the two-state assumption, one can make rather straightforward estimates of structural stability in terms of the free energy change on denaturation (ΔG_D) (Greene & Pace, 1974).

We have used both Gdn-HCl and urea to evaluate the effect of the Cys-39 mutation, in both its reduced and oxidized forms, on the structural stability of DHFR. In the case of Gdn-HCl, unfolding of DHFR was monitored by measuring the spectral shift of tryptophan absorbance at 292 nm as a function of denaturant concentration. A blue shift at this wavelength is known to correspond to an increase in the polarity of the environment surrounding the tryptophan indole, as occurs when a buried tryptophan side chain is exposed to aqueous solvent, and thus indicates the extent to which protein unfolding has occurred (Donovan, 1972). This technique is particularly applicable to DHFR since the enzyme contains five tryptophan residues that are spatially distributed throughout the molecule.

We obtain a value of −4350 M⁻¹ cm⁻¹ for all three maximally unfolded DHFRs (wild type and both forms of Cys-39 mutant). On the basis of model compounds (Donovan, 1969), this Δε_292 corresponds to exposure of 2.7 buried tryptophan residues, in good agreement with the expected number of 3.0 obtained by solvent-accessibility calculations (Richards, 1977) on the 2-Å crystal structures, and indicates complete unfolding has been attained in each case by Gdn-HCl denaturation. Figure 2 shows the denaturation curves for wild-type and reduced and oxidized Cys-39 DHFRs. Whereas the wild-type and the reduced forms of Cys-39 DHFR show a sharp transition between 1 and 2 M Gdn-HCl, with complete unfolding by 2.0 M, the oxidized form of Cys-39 DHFR shows a more gradual transition between 1.5 and 4 M, and complete unfolding does not occur until a concentration of 4.0 M Gdn-HCl is reached. The shift of the denaturation curve to higher Gdn-HCl concentration for the oxidized Cys-39 DHFR

- definitely more stable

- broad transition suggests intermediates form

(4)

ENGINEERED DISULFIDE BOND IN DHFR

[Figure 3 plot: ΔG_D vs. [guanidine-HCl]; fitted lines for reduced and oxidized Cys-39 DHFR, the oxidized form with slope m = −1.93 kcal mol⁻¹ M⁻¹]

FIGURE 3: Plot of ΔG_D, the free energy of unfolding, vs. Gdn-HCl concentration; d_1/2 is the midpoint of the unfolding transition where ΔG_D = 0.

indicates an increase in structural stability. However, the shallower slope of the transition represents a diminished folding/unfolding cooperativity, probably due to the appearance of stable intermediates (Pace, 1975).

Estimates of the free energy of unfolding, ΔG_D, can be obtained from analysis of the denaturation curves with

K_D = f_D/f_N = e^(−ΔG_D/RT)    (1)

where f_D and f_N represent the fraction of protein in the denatured (D) and native (N) forms and K_D is the equilibrium constant. One can obtain a measure of the free energy of unfolding in the absence of denaturant (ΔG_D^H2O) by linear extrapolation (Aune & Tanford, 1969; Greene & Pace, 1974) according to

ΔG_D = ΔG_D^H2O − m[denaturant]    (2)

where m is the slope of the plot and the quantity in brackets represents the concentration of denaturant. Such an analysis of the denaturation curves for reduced and oxidized Cys-39 DHFRs is presented in Figure 3. Extrapolation yields a ΔG_D^H2O for the reduced form of 11.8 kcal/mol and 5.0 kcal/mol for the oxidized form, a difference in the energy of unfolding between the oxidized and reduced form (ΔΔG_D^H2O) of −6.8 kcal/mol, implying that the oxidized Cys-39 DHFR is less stable than either the reduced mutant or the wild-type enzyme, this despite the fact that a higher concentration of denaturant is required to cause the oxidized mutant to unfold. Clearly, because of the decreased slope of the curve, this estimate of the ΔG_D^H2O of the oxidized mutant is much too low. Presumably the decreased slope in turn results from the existence of stable intermediates in the unfolding transition, leading to breakdown of the two-state assumption (Pace, 1975). The problem can be partially circumvented, and a reasonable estimate of the difference in stability between the reduced and oxidized forms can be made nevertheless. It has been shown that, even when intermediates are present, the region near the midpoint of the transition will yield a value of the unfolding equilibrium constant (K_D) expected for a two-state mechanism (Cupo & Pace, 1983). Consequently, by taking the difference in the Gdn-HCl concentrations at the midpoint of the reduced and oxidized unfolding transitions (0.86 M) and multiplying it by the slope of the oxidized form transition (m = −1.93 kcal mol⁻¹ M⁻¹), one can obtain a value for the lower limit of the stability difference between the two forms. On the basis of this analysis, the oxidized form appears to be at least 1.8 kcal/mol more stable than the reduced form. From theoretical considerations (Poland & Scheraga, 1965), the increase in conformational stability due to the entropy effect of a disulfide bond cross-linking a 46-residue loop, as in the oxidized Cys-39

VOL. 26, NO. 8, 1987 2185

[Figure 4 gels: panels A, B, and C; urea gradient from 0 to 8 M across each gel; f and u mark band positions]

FIGURE 4: Urea-gradient polyacrylamide gels showing the unfolding profiles of (A) wild-type DHFR, (B) oxidized Cys-39 DHFR, and (C) a mixture of reduced and oxidized Cys-39 DHFR. The letters f and u indicate the position on the gel of the folded and unfolded forms of the protein.

DHFR, can be estimated at 2.6 kcal/mol.

As a check on our results with Gdn-HCl denaturation, experiments with urea-gradient polyacrylamide gels (Creighton, 1979) were also conducted. Samples of wild-type, reduced, and oxidized Cys-39 DHFRs were applied individually to polyacrylamide gels containing urea gradients (0-8 M) perpendicular to the direction of the electrophoretic potential. Electrophoresis was carried out with the gel temperature maintained at 15 °C. Under these conditions electrophoretic mobility is dependent on protein conformation and charge. Protein bands were visualized with Coomassie blue stain.

The urea-gradient gels confirmed and extended the conclusions from the Gdn-HCl experiments. The unfolding profiles of wild-type and reduced and oxidized Cys-39 DHFRs on urea gels are shown in Figure 4. Again, the reduced form shows a sharp reversible transition indicative of two-state behavior, indistinguishable from that of the wild type. The oxidized form, on the other hand, clearly shows a shift in the transition to a higher concentration of denaturant and a marked decrease in its slope.

One of the advantages of urea-gradient gels is their ability to display those stable conformational intermediates present in significant amounts. In the case of the oxidized form of Cys-39 DHFR, at least five distinct conformational species appear, two of which occur under maximally unfolding conditions. It is important to note that these intermediates are unlikely to be products of disulfide bond scrambling since identical Gdn-HCl denaturation profiles are obtained for all

~5 kcal/mol

- extrapolate to [Gdn-HCl] = 0

- indicates ox. Cys39 is less stable, which is not correct.

~3.0 kcal/mol (Yi Jeong Sang)

- two-state hypothesis not valid for ox. Cys39

- midpoint of transition gives K_eq expected for two-state mechanism even when intermediates are present

- compare at ΔG_D = 0 to get 0.86 M Gdn-HCl

≥1.8 kcal/mol
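The two-state analysis summarized in these notes (equation 1 followed by linear extrapolation to zero denaturant) can be sketched in Python as below. This is a minimal illustration, not the authors' procedure: the data points and the 25 °C temperature are hypothetical, and the helper names `delta_g_unfolding` and `linear_extrapolation` are introduced only for this example.

```python
import math

R = 1.987e-3   # gas constant, kcal mol^-1 K^-1
T = 298.15     # assumed temperature in K (not stated in the excerpt)

def delta_g_unfolding(f_d):
    """Two-state free energy of unfolding: dG_D = -RT ln(f_D / f_N), f_N = 1 - f_D."""
    return -R * T * math.log(f_d / (1.0 - f_d))

def linear_extrapolation(conc, dg):
    """Least-squares line through (conc, dG_D); returns (intercept, slope)."""
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(dg) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, dg))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical transition-region data (illustrative only, not from the paper)
conc = [1.2, 1.4, 1.6, 1.8]      # [Gdn-HCl] in M
f_d = [0.10, 0.30, 0.65, 0.90]   # fraction denatured
dg = [delta_g_unfolding(f) for f in f_d]

dg_water, slope = linear_extrapolation(conc, dg)  # intercept = dG_D at 0 M denaturant
midpoint = -dg_water / slope                      # concentration where dG_D = 0
print(f"dG(H2O) = {dg_water:.1f} kcal/mol, slope = {slope:.2f} kcal/mol/M, "
      f"midpoint = {midpoint:.2f} M")
```

The fitted slope is negative, matching the sign of the m value quoted for the oxidized form; only points inside the transition are usable because f_D near 0 or 1 makes the logarithm unreliable.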

(5)

S ⎯ → ←⎯

^K

⎯ ⎯ P

^eq

K

_eq

= [ ] P

eq

[ ] S

_eq

ΔG° = −RT ln K

_eq

reaction coordinate Free energy, ΔG S

P RTlnK_eq

Free energy change for a reaction, ΔG

ΔG = ΔG° + RT lnQ where Q =

[ ]

P

[ ]

S

reaction coordinate Free energy, ΔG S

P RTlnK_eq + RTlnQ

P'

at standard state, ΔG°

at other concentrations, ΔG
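As a small numerical illustration of ΔG° = −RT ln K_eq and ΔG = ΔG° + RT ln Q, the sketch below evaluates ΔG for a hypothetical reaction with K_eq = 10 at several values of Q. The K_eq value, the 25 °C temperature, and the function names are assumptions made for the example.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def standard_free_energy(k_eq, temp=298.15):
    """dG0 = -RT ln K_eq, in J/mol."""
    return -R * temp * math.log(k_eq)

def free_energy_change(k_eq, q, temp=298.15):
    """dG = dG0 + RT ln Q, in J/mol, with Q = [P]/[S]."""
    return standard_free_energy(k_eq, temp) + R * temp * math.log(q)

k_eq = 10.0  # hypothetical equilibrium constant
for q in (0.1, 1.0, 10.0, 100.0):
    dg = free_energy_change(k_eq, q) / 1000.0
    print(f"Q = {q:6.1f}  dG = {dg:+6.2f} kJ/mol")
```

When Q = 1 the result equals ΔG°, and when Q = K_eq the result is zero, which is the equilibrium condition.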

(6)

Free energy of activation, ΔG‡

ΔG‡ = −RT ln(k · constant)

ΔG‡ ∝ −RT ln k

ΔG‡ ∝ RT ln(1/k)

[Diagram: free energy vs. reaction coordinate for S → P through a transition state; the barrier from S to the transition state is labeled ΔG‡ ∝ RT ln(1/k)]
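The slide leaves the constant unspecified. One common choice, assumed here and not stated in these notes, is the transition-state-theory (Eyring) form, in which the constant is h/(k_B T). The sketch below uses that assumption to convert a rate constant into ΔG‡; the function name is illustrative.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J s
R = 8.314462618       # gas constant, J mol^-1 K^-1

def activation_free_energy(k, temp=298.15):
    """Eyring form (an assumption here): dG_act = -RT ln(k * h / (k_B * T)).

    k is a first-order rate constant in s^-1; the result is in J/mol.
    """
    return -R * temp * math.log(k * H / (K_B * temp))

for k in (1.0, 10.0, 100.0):
    print(f"k = {k:6.1f} s^-1  dG_act = {activation_free_energy(k) / 1000:.1f} kJ/mol")
```

Whatever the constant, the proportionality on the slide holds: each tenfold increase in k lowers ΔG‡ by RT ln 10.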

(7)

Enzyme-catalyzed reaction, low [S]

[Diagram: free energy vs. reaction coordinate for E + S → E·S → transition state → E + P. Labels: ΔG = −RT ln(1/K_M) for E + S → E·S; ΔG ∝ RT ln(1/k_cat) for E·S → transition state; ΔG‡ ∝ RT ln(K_M/k_cat) for E + S → transition state]

(8)

Enzyme-catalyzed reaction, high [S]

[Diagram: free energy vs. reaction coordinate for E + S → E·S → transition state → E + P; the barrier from E·S to the transition state is labeled ΔG‡ ∝ RT ln(1/k_cat)]
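Because ΔG‡ at low [S] scales with RT ln(K_M/k_cat), two enzymes can be compared through the ratio of their k_cat/K_M values, and the unknown constant cancels. The sketch below applies this to the wild-type and oxidized Cys-39 DHFR numbers quoted earlier in these notes (k_cat of 30 and 27 s⁻¹, K_M for dihydrofolate of 1.6 and 1.2). The 25 °C temperature and the function name are assumptions.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal mol^-1 K^-1
TEMP = 298.15       # assumed temperature in K

def ddg_activation(kcat_wt, km_wt, kcat_mut, km_mut, temp=TEMP):
    """ddG_act (mutant minus wild type) = -RT ln[(kcat/KM)_mut / (kcat/KM)_wt]."""
    ratio = (kcat_mut / km_mut) / (kcat_wt / km_wt)
    return -R_KCAL * temp * math.log(ratio)

# Wild type vs. oxidized Cys-39 DHFR; K_M units cancel in the ratio
ddg = ddg_activation(kcat_wt=30.0, km_wt=1.6, kcat_mut=27.0, km_mut=1.2)
print(f"ddG_act = {ddg:+.2f} kcal/mol")
```

A value this close to zero is consistent with the statement that the disulfide leaves the kinetic parameters essentially unchanged.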

(9)

Directed evolution: simple idea, complex to put in practice

10 PRODUCE A MUTANT SPECTRUM OF SELF-REPRODUCING TEMPLATES

20 SEPARATE AND CLONE INDIVIDUAL MUTANTS

30 AMPLIFY CLONES

40 EXPRESS CLONES

50 TEST FOR OPTIMAL PHENOTYPES

60 IDENTIFY OPTIMAL GENOTYPES

70 RETURN TO 10 WITH A SAMPLE OF OPTIMAL GENOTYPES

M. Eigen, W. Gardiner (1984), Pure Appl. Chem. 56, 967-978.

(10)

Impossible to make all mutants

• All possible mutants >> number of atoms in the universe

– 5,700 ways to change one amino acid (19*300)

– ~16,200,000 ways to change two amino acids (19*300)(19*299)/2

variants_M = 19^M · 300! / ((300 − M)! M!) for a 300 aa protein, where M = number of amino acids that differ

• Bacterial transformation rarely yields more than 10⁶ colonies.

Bosley & Ostermeier (2005) Mathematical expressions useful in the construction, description and evaluation of protein libraries, Biomol. Eng., 22, 57.
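The library-size formula can be checked directly. The sketch below evaluates 19^M · C(300, M) for small M and compares the result with the roughly 10⁶ colonies a transformation yields; the function name and the default arguments are illustrative choices.

```python
from math import comb

def variant_count(m, length=300, alternatives=19):
    """Number of variants with exactly m substituted positions:
    19^m * length! / ((length - m)! * m!)."""
    return alternatives ** m * comb(length, m)

MAX_COLONIES = 10 ** 6  # practical upper limit per transformation (from the slide)

for m in range(1, 5):
    n = variant_count(m)
    status = "within reach" if n <= MAX_COLONIES else "exceeds one transformation"
    print(f"M = {m}: {n:,} variants ({status})")
```

For M = 1 this gives 5,700 and for M = 2 it gives 16,190,850, which the slide rounds to 16,200,000.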

(11)

Find one substitution that improves the enzyme

• test all single substitutions (systematic saturation mutagenesis)

(find best mutant, rare improvement, unexpected location)

• test some random single substitutions

(improvements are common; error prone PCR yields only ~10-20% of possibilities)

• focus changes at locations most likely for success

(find acceptable mutant with a minimum of screening)

(12)

Systematic saturation mutagenesis

• Systematic complete library of all possible single substitution mutants: saturation mutagenesis using NNK primers at every position. [a lot of work!]

• Example: increase enantioselectivity of a nitrilase from E ~15 to >100 with Ala190His.

• Error-prone PCR would have missed it. Ala (GCN) to His (CAU/C) requires 2 or 3 nucleotide substitutions.

DeSantis et al. (2003), Creation of a productive, highly enantioselective nitrilase through gene site saturation mutagenesis (GSSM), J. Am. Chem. Soc., 125, 11476-11477.
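The claim that Ala to His needs two or three nucleotide substitutions can be verified by counting mismatches between every Ala codon (GCN) and every His codon. The sketch below writes the codons in DNA letters (CAT/CAC rather than CAU/CAC); the function name is illustrative.

```python
def nucleotide_changes(codon_a, codon_b):
    """Count positions at which two codons differ."""
    return sum(a != b for a, b in zip(codon_a, codon_b))

ALA_CODONS = ["GCT", "GCC", "GCA", "GCG"]   # GCN
HIS_CODONS = ["CAT", "CAC"]                 # CAU/CAC written as DNA

changes = {
    (ala, his): nucleotide_changes(ala, his)
    for ala in ALA_CODONS
    for his in HIS_CODONS
}

for (ala, his), n in sorted(changes.items()):
    print(f"{ala} -> {his}: {n} substitutions")
print("minimum:", min(changes.values()), "maximum:", max(changes.values()))
```

No Ala codon is a single substitution away from His, which is why a method dominated by single-nucleotide changes is unlikely to find this variant.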

[Scheme: hydrolysis of a pseudo-prochiral, ¹⁵N-labeled hydroxy dinitrile (Lipitor® precursor). Labels: 15N nitrile hydrolysis; 14N nitrile hydrolysis; products (S), m/z = 130 and (R), m/z = 129]

(13)

Random Mutagenesis:

Error-prone PCR

NOT completely random, NOT complete (16-29% of ideal)

Incomplete library - use when you believe multiple solutions exist.

1. Codon bias 1. Completely random codons (64) do not code for the 20 amino acids equally (e.g., 4 for Gly, 1 for Trp) because the genetic code is unevenly degenerate.

2. Error-prone PCR does not yield completely random codons.

a) Polymerase bias. Polymerases favor some nucleotide substitutions over others.

b) Codon bias 2. One substitution in a codon is more likely than two substitutions, which is in turn more likely than three substitutions.

(14)

• Temperature cycle

1. denature (break double strands) 95°C, 30 sec

2. annealing (bind primer) 55°C, 30 sec

3. elongation (synthesize new DNA strand) 72°C, 60 sec

• Each cycle increases [DNA] 2-fold. 2³⁰ ≈ 10⁹

• Uses thermostable DNA polymerase (bacteria from hot spring in Yellowstone)

What happens in the tube

[Diagram: 1 Denaturation; 2 Annealing; 3 Elongation; repeating steps 1, 2 & 3 gives exponential growth of short product]

• PCR animation

www.dnalc.org/ddnalc/resources/pcr.html

(15)

Error-prone PCR ingredients

• Diversify PCR Random Mutagenesis (Clontech): Taq polymerase, control of mutation rate by Mn²⁺ and dGTP concentration

• GeneMorph (Stratagene): control mutation rate by amount of template (less template, more copying, which leads to more errors)

Caldwell & Joyce (1994) Mutagenic PCR, PCR Methods Appl., 3, S136.

DNA shuffling

[Diagram labels: Parent sequence; DNase treatment; Random fragments; Denaturation; PCR w/o primers; Extension via polymerase; Shuffled product]

Error-prone PCR

• Analysis of large protein regions

• PCR conditions enhance mismatches

• Saturating mutagenesis virtually impossible

• Combination of mutations

Error-prone PCR

• Taq: lacks 3′→5′ exonuclease activity; possesses intrinsic error rate

• dCTP, dTTP ↑; dGTP, dATP ↓: promotes misincorporation

• Mg²⁺ ↑: stabilizes non-complementary base pairs

• Mn²⁺: reduces the specificity of the polymerase

• low annealing temperature

• template amount, cycle number

(16)

Codon bias 1: 64 codons translate to 20 amino acids unequally

• Four codons yield Gly, but only two yield Phe and only one yields Trp.

• Completely random DNA codons favor some amino acids over others.
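The unequal mapping of 64 codons onto 20 amino acids can be tabulated from the standard genetic code. The sketch below builds the codon table from the conventional TCAG ordering and counts codons per amino acid; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one-letter amino acids in TCAG order; '*' marks stop codons
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

codons_per_residue = Counter(CODON_TABLE.values())

for aa, n in sorted(codons_per_residue.items(), key=lambda item: -item[1]):
    label = "stop" if aa == "*" else aa
    print(f"{label:>4}: {n} codons ({n / 64:.1%} of random codons)")
```

With fully random codons, Gly (4 codons) appears four times as often as Trp (1 codon), and 3 of every 64 codons are stops.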

(17)

Types of nucleotide substitutions

• Transitions (purine to purine: A, G; or pyrimidine to pyrimidine: C, T)

• Transversions (purine to pyrimidine or pyrimidine to purine)

Exercise: Write out all the possibilities to show that random mutagenesis should yield twice as many transversions as transitions.

[Structures of the four bases: A, C, T, G]
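One way to work the exercise is to enumerate all twelve possible single-base substitutions and classify each one. The sketch below does this; the function and variable names are illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify(original, replacement):
    """Return 'transition' if both bases are in the same class, else 'transversion'."""
    pair = {original, replacement}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"

substitutions = [(a, b) for a in "ACGT" for b in "ACGT" if a != b]
transitions = [s for s in substitutions if classify(*s) == "transition"]
transversions = [s for s in substitutions if classify(*s) == "transversion"]

print("transitions:  ", ", ".join(f"{a}->{b}" for a, b in transitions))
print("transversions:", ", ".join(f"{a}->{b}" for a, b in transversions))
print("Ts/Tv expected for unbiased mutagenesis:", len(transitions) / len(transversions))
```

Each base has one transition partner and two transversion partners, so unbiased mutagenesis gives Ts/Tv = 0.5.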

(18)

Polymerases biased for some substitutions over others

• Both Taq and Mutazyme I favor transitions over transversions (Ts/Tv > 0.5)

• Taq replaces more AT with GC than vice versa (GC content of DNA increases). Mutazyme I does the reverse.

• Taq makes ~4x more mutations at A and T than at G and C. Mutazyme I does the reverse.

• Mutazyme II is a mixture of Taq and Mutazyme I which minimizes mutational bias.

GeneMorph II EZClone Domain Mutagenesis Kit

TABLE II
Mutational Spectra of Mutazyme and Taq DNA Polymerases

Type(s) of mutations           Mutazyme II DNA    Mutazyme I DNA     Taq DNA polymerase
                               polymerase (a)     polymerase (a)     (reference) (b)
Bias indicators
  Ts/Tv                        0.9                1.2                0.8
  AT→GC/GC→AT                  0.6                0.2                1.9
  A→N, T→N                     50.7%              25.6%              75.9%
  G→N, C→N                     43.8%              72.5%              19.6%
Transitions
  A→G, T→C                     17.5%              10.3%              27.6%
  G→A, C→T                     25.5%              43.7%              13.6%
Transversions
  A→T, T→A                     28.5%              11.1%              40.9%
  A→C, T→G                     4.7%               4.2%               7.3%
  G→C, C→G                     4.1%               8.8%               1.4%
  G→T, C→A                     14.1%              20.0%              4.5%
Insertions and deletions
  Insertions                   0.7%               0.8%               0.3%
  Deletions                    4.8%               1.1%               4.2%
Mutation frequency
  Mutations/kb (per PCR) (c)   3–16 (per PCR)     <1 to 7 (per PCR)  4.9 (per PCR)

(a) The Mutazyme DNA polymerases were used with the corresponding GeneMorph random mutagenesis kits.

(b) The Taq DNA polymerase was used with Mn²⁺-containing buffer and unbalanced dNTP concentrations, which are mutagenic conditions for Taq DNA polymerase.

(c) Initial target amounts of 16 pg to 1 µg (Mutazyme II DNA polymerase), 1 pg to 100 ng (Mutazyme I DNA polymerase), and 0.01 nM template (Taq DNA polymerase) were used to generate data.

As shown in Table II, error-prone enzymes generally favor transitions over transversions, as shown by Ts/Tv ratios greater than 0.5, with Mutazyme II and Taq exhibiting a somewhat higher tendency to create transversions over transitions and Mutazyme I exhibiting a greater tendency for introducing transitions over transversions. Examining transition mutation frequencies shows that Mutazyme II produces AT→GC and GC→AT mutations with similar rates (AT→GC/GC→AT ratio = 0.6), while Mutazyme I is 4 times more likely to generate GC→AT transitions over AT→GC transitions, and Taq is 2 times more likely to introduce AT→GC transitions over GC→AT transitions. In addition, Mutazyme II DNA polymerase introduces mutations at A's and T's only slightly more frequently than G's and C's. In contrast, Mutazyme I is nearly 3 times more likely to mutate G's and C's, while Taq under error-prone conditions is 4 times more likely to mutate A's and T's than G's and C's.

0.5 1.0 50%

50%

(19)

Codon bias 2: Some aa substitutions require 1 nucleotide change; others 2.

G G A   Gly

mutation at 1st position

C G A   Arg

A G A   Arg

T G A   stop

mutation at 2nd position

G C A   Ala

G A A   Glu

G T A   Val

mutation at 3rd position

G G G   Gly

G G C   Gly

G G T   Gly

• Single nucleotide change at GGA codon (Gly) yields not 9, but only 4 amino acid substitutions.

• Average: 5.7 amino acids accessible by a single nucleotide change. Two nucleotide changes are much less likely.
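The GGA example can be reproduced by listing all nine single-nucleotide neighbors of a codon and translating them with the standard genetic code. The sketch below is self-contained; the function names are illustrative, and the average it prints is offered only for comparison with the value quoted on the slide.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def single_change_neighbors(codon):
    """All nine codons that differ from `codon` at exactly one position."""
    return [
        codon[:pos] + base + codon[pos + 1:]
        for pos in range(3)
        for base in BASES
        if base != codon[pos]
    ]

def accessible_amino_acids(codon):
    """Distinct new amino acids reachable by one nucleotide change (stops excluded)."""
    reachable = {CODON_TABLE[n] for n in single_change_neighbors(codon)}
    return reachable - {CODON_TABLE[codon], "*"}

for neighbor in single_change_neighbors("GGA"):
    print(neighbor, CODON_TABLE[neighbor])
print("new amino acids from GGA:", sorted(accessible_amino_acids("GGA")))

sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
average = sum(len(accessible_amino_acids(c)) for c in sense_codons) / len(sense_codons)
print(f"average over sense codons: {average:.1f}")
```

For GGA the nine neighbors give Arg (twice), one stop, Ala, Glu, Val, and three synonymous Gly codons, so only four new amino acids are reachable.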

(20)

Expected result of epPCR

• Ideal: 19*300 = 5700 protein variants

Assume 19 codon substitutions at each codon (not three nucleotides changed randomly): 19*300 = 5700 at DNA level

(screen 4.6*5700 = 26,200 colonies)

• Codon bias. Only 5.7 amino acids accessible by a single nucleotide substitution. 5.7*300 = 1710 (29%)

(This value also accounts for synonymous codons.)

• Polymerase bias. Taq polymerase favors transitions ~2x over transversions and mutations at AT ~4x over mutations at GC. Estimate ~8x bias.

Unequal distribution requires screening ~8x more colonies to find rare ones.

(screen 4.6*8*1710 ≈ 63,000 colonies)

• Screening 26,000 colonies will find only (26/63)*(1710/5700) ≈ 12% of ideal number!

(21)

epPCR successes & failures

• Success when many solutions exist.

Increase the stability of a peroxidase for laundry applications.

- Both rational design and epPCR identified Glu239 to eliminate an electrostatic repulsion and Met242 which can be easily oxidized.

- Error prone PCR found three other substitutions, which contribute to stability, but it is not clear why.

• Failure when only a few solutions exist.

Error-prone PCR failed to expand the substrate range of esterases/lipases to tertiary alcohols, likely because the solution requires introducing two adjacent glycine residues in the oxyanion loop.

Cherry et al. (1999) Directed evolution of a fungal peroxidase, Nature Biotechnol., 17, 379-384.

Henke et al. (2002) Activity of lipases and esterases towards tertiary alcohols: insights into structure-function relationships, Angew. Chem. Intl. Ed., 41, 3211-3213.

(22)

Saturation Mutagenesis

- strategies to encode all amino acids using synthetic oligonucleotides

- predicting the number of colonies that must be screened

(23)

Randomizing synthetic oligonucleotides

first the means by which the DNA itself is randomized during synthesis, and secondly the methodology for incorporating the synthetic oligonucleotide. These two issues will be discussed separately, although some issues raised by one can be dealt with by the other and vice versa.

The synthesis of randomized oligonucleotides

The value of oligonucleotide-based mutagenesis is that control over the chemistry of DNA synthesis allows complete control over the level, identity and position of randomization. Thus, if an oligonucleotide can be synthesized as a mixture, or if a number of synthetic oligonucleotides can be mixed, then this can be incorporated directly into a complete gene sequence. There are a wide range of techniques from the field of combinatorial chemistry that are available to a combinatorial biologist. Indeed, the biologist has an advantage over the chemist as a mixture of genes can be readily separated for analysis by transformation into bacterial cells and isolation of single transformed colonies.

The synthesis of degenerate oligonucleotides is well established; synthetic primers incorporating mixtures of any combination of the four natural bases at any position can be ordered directly from most suppliers. Such pieces of synthetic DNA can be used to completely randomize a specific position within a gene. The synthesis of 'doped' oligonucleotides, where a small proportion have a mutation at a specific position or positions, is a slightly more specialist process, but oligonucleotides of this type can be ordered from most suppliers. These are used to generate libraries where the randomization is spread out but still targets those positions that are doped in the primers. Any synthetic process where a number of reagents are used as mixtures is susceptible to bias arising from greater incorporation of one reagent than another. Quantitative studies indicate that where synthesis is carefully controlled and/or uses optimized reagents (e.g. Transgenomic's 'Precision Nucleotide Mix'), this bias is small in synthetic DNA libraries (30,31). It should be noted that this relative lack of bias is not maintained when these libraries are cloned, although the reason for this is not clear (31).

Another bias problem arises due to the mismatch between the base-by-base synthesis of the oligonucleotide and the triplet nature of the genetic code. To randomize a codon so that it can encode all 20 amino acids, a mixture of all four bases is required at the first two positions and at least three bases in the third position. This in turn leads to a form of codon bias as there are six times as many codons for some amino acids, such as serine, than others such as tryptophan and methionine. In addition, there is the potential for the introduction of stop codons. This can be avoided by limiting the mixture of bases at the third position of the codon to T and C, but this means that codons for a range of amino acids will not be present (Fig. 3). A compromise is to randomize the codon with T, C or G in the final position, giving only one stop codon in every 48 primers, and encoding all 20 amino acids or NNG/T or NNG/C which provide all amino acids with slightly more common stop codons. Another result of this form of codon bias is that it is difficult to insert codons for a subset of amino acids if this is desirable.

A number of solutions have been developed to this problem. The simplest solution is to synthesize the DNA for each desired mutation separately. For relatively small libraries the falling cost of oligonucleotide synthesis makes this possible with the size of the library limited by the size of the budget and not by technical considerations. The oligonucleotides can then either be mixed or used separately to construct the gene

Figure 3. Approaches to randomizing synthetic DNA. Examples show randomization of one codon with mixed nucleotides (NNN, NNT/C, NNG/T or NNT/G/C) and with trinucleotide phosphoramidites. Synthesis in all three cases commences conventionally 3′ of the randomized codon. At the 3′-end of the randomized codon (A) all four nucleotides, (B) a mixture of T and C, (C) a mixture of G and T or (D) a mixture of T, G and C can be added. In each case a mixture of all four nucleotides is added at each of the remaining two positions. Having a mixture of G and C at the 3′-end of the codon will provide 32 codons, all 20 amino acids and one stop codon. (E) Conversely, the codon can be synthesized by the direct addition of a mixture of 20 trinucleotide phosphoramidites in one step. The trinucleotide blocks shown represent 20 presynthesized 3-nt codons, one to code for each amino acid.

Nucleic Acids Research, 2004, Vol. 32, No. 4

NNK best, need 20 primers
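The properties of an NNK codon (N = any base, K = G or T) can be checked by enumerating all of its members and translating them with the standard genetic code. The sketch below is self-contained; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# N = A, C, G or T at positions 1 and 2; K = G or T at position 3
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
nnk_counts = Counter(CODON_TABLE[c] for c in nnk_codons)
nnk_stops = [c for c in nnk_codons if CODON_TABLE[c] == "*"]
nnk_residues = set(nnk_counts) - {"*"}

print("codons:", len(nnk_codons))
print("amino acids encoded:", len(nnk_residues))
print("stop codons:", nnk_stops)
print("codons per amino acid:", dict(sorted(nnk_counts.items())))
```

NNK reduces the library to 32 codons with a single stop (TAG) while still encoding all 20 amino acids, although Leu, Ser, and Arg remain three times as frequent as single-codon residues such as Trp and Met.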

(24)

How many colonies to screen to test each different mutant?

P_i = 1 − (1 − F_i)^T

P_i = probability that sequence i is among the transformants (colonies) tested

F_i = frequency at which sequence i is present in the library

T = number of transformants (colonies) tested

Exercise 1: Show that screening 146 colonies ensures with a 99% probability that you have tested every mutant in an NNK library. (K = G or T)
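Solving P_i = 1 − (1 − F_i)^T for T gives T = ln(1 − P_i) / ln(1 − F_i). The sketch below evaluates this for the rarest codons of an NNK library, assuming each of the 32 codons is equally frequent (F_i = 1/32); the function name is illustrative. Note that the formula gives the probability of sampling one particular sequence, not all sequences simultaneously.

```python
import math

def colonies_needed(frequency, probability):
    """Smallest T with 1 - (1 - frequency)^T >= probability."""
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - frequency))

def probability_sampled(frequency, colonies):
    """P_i = 1 - (1 - F_i)^T."""
    return 1.0 - (1.0 - frequency) ** colonies

F_NNK = 1.0 / 32.0  # assumed: one specific codon out of 32 equally likely NNK codons

t = colonies_needed(F_NNK, 0.99)
print(f"colonies for 99%: {t}")
print(f"P with {t - 1} colonies: {probability_sampled(F_NNK, t - 1):.4f}")
print(f"P with {t} colonies: {probability_sampled(F_NNK, t):.4f}")
```

The result is 146 colonies, the number given in the exercise.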

(25)

Rule: Oversample 4.6-fold for a 99% probability

Exercise: Rearrange the equation P_i = 1 − (1 − F_i)^T to T·F_i = −ln(1 − P_i) using the approximation ln(1 − F_i) ≈ −F_i when F_i << 1.

• T·F_i = number of transformants × frequency of sequence i in library

• For P_i = 99%, T·F_i = −ln(1 − 0.99) = −ln(0.01) = 4.6; must screen 4.6 times more than library size
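The oversampling factor −ln(1 − P_i) depends only on the target probability. The sketch below tabulates it for a few probabilities and compares the approximate colony count for a 32-member library with the exact expression; the function name and the choice of library size are illustrative.

```python
import math

def oversampling_factor(probability):
    """T * F_i = -ln(1 - P_i), valid when F_i << 1."""
    return -math.log(1.0 - probability)

LIBRARY_SIZE = 32  # e.g. the 32 codons of an NNK library, assumed equally frequent

for p in (0.90, 0.95, 0.99):
    factor = oversampling_factor(p)
    approx = factor * LIBRARY_SIZE
    exact = math.log(1.0 - p) / math.log(1.0 - 1.0 / LIBRARY_SIZE)
    print(f"P = {p:.0%}: factor = {factor:.2f}, "
          f"approx T = {approx:.1f}, exact T = {exact:.1f}")
```

The approximation slightly overestimates the exact value, and the gap shrinks as the library grows and F_i becomes smaller.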

(26)

Group problems

1. How many colonies must you screen for an NNK saturation mutagenesis at one position for a 90% probability of testing each mutant?

2. How many colonies did the GSSM experiment require for the 330 amino acid nitrilase for 90% probability?

(27)

3. If you make an NNN library at one position instead of an NNK library, how many more colonies will you need to screen for 90% probability of testing each mutant?

4. If you make twenty primers that each code for one amino acid, how many colonies must you screen to have a 90% probability of testing each mutant?

(28)

• Free energy diagrams

Outline

• Exam question