Codon Modification

Purpose:  attempt to silently eradicate RNA elements that might inhibit protein expression in mammalian cell lines    
Adapted from:  Pastrana et al (2004) Virology 321:205        

Unfortunately, the computer algorithm that automatically converts a native sequence of interest to "as different as possible" codons can no longer be hosted on NCI servers. 

Please contact Chris Buck (buckc@nih.gov) to obtain the software via email.      

After ADAP conversion
Add Kozak translational initiation context upstream of the initiator ATG (GCC RCC ATG)      
For the terminator codon, use TGAG.  TGA is the most common terminator, and the additional G downstream is thought to augment termination.
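
The two additions above can be sketched in Python as follows. This is a minimal illustration, not part of the original software: the function name `add_kozak_and_terminator` is hypothetical, and choosing A for the R position (GCCACC) is an assumption; G (GCCGCC) fits the GCC RCC ATG context equally well.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def add_kozak_and_terminator(orf, kozak="GCCACC"):
    """Return orf with a Kozak context upstream of ATG and a TGAG terminator.

    Assumes orf is an in-frame coding sequence that starts with ATG.
    kozak="GCCACC" picks A for the R in GCC RCC; "GCCGCC" is the other option.
    """
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF must start with the initiator ATG")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    # Drop an existing stop codon, if present, so it can be replaced by TGA.
    if orf[-3:] in STOP_CODONS:
        orf = orf[:-3]
    # TGA plus one extra G downstream gives the TGAG terminator.
    return kozak + orf + "TGAG"
```

For example, `add_kozak_and_terminator("ATGGCCTAA")` yields `GCCACCATGGCCTGAG`.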

Hand edit the output. Manipulations are listed in descending order of hypothetical importance       
Remove anything that resembles a splice donor (see Figure 3, below, from Patel (2003) Nature Reviews Molecular Cell Biology 4:961)
               Specifically: search for and silently remove: GGTR, GTRNG, GGTNNG
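
One way to run the splice-donor search is to translate the IUPAC motifs into regular expressions. The sketch below only reports hits; removing them silently (with synonymous codon changes) is still a hand edit. The function name `find_donor_like_sites` is hypothetical, and only the IUPAC codes needed for these three motifs are defined.

```python
import re

# IUPAC codes used by the motifs above: R = A or G, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}
DONOR_MOTIFS = ["GGTR", "GTRNG", "GGTNNG"]

def find_donor_like_sites(seq):
    """Return sorted (position, motif, matched_text) tuples, 0-based positions."""
    seq = seq.upper()
    hits = []
    for motif in DONOR_MOTIFS:
        pattern = "".join(IUPAC[base] for base in motif)
        # A lookahead keeps overlapping matches from hiding each other.
        for m in re.finditer("(?=(" + pattern + "))", seq):
            hits.append((m.start(), motif, m.group(1)))
    return sorted(hits)
```

Because the three motifs overlap, a single donor-like site such as GGTAAG is typically reported more than once.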

Remove poly A/T tracts of five or more bases (search syntax WWWWW). Especially important to avoid:
                 AATAAA or AAAAAA (polyadenylation signal-like)
                 ATTTA or TTTTT (Stefan Schwartz inhibitory elements)         
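
The WWWWW search can be approximated with a regular expression over A/T runs. This is a rough sketch; the function name `find_at_tracts` and the `min_len` parameter are hypothetical. It reports each maximal A/T run of five or more bases and notes whether the run contains one of the motifs singled out above.

```python
import re

# Motifs listed above as especially important to avoid.
PRIORITY_MOTIFS = ["AATAAA", "AAAAAA", "ATTTA", "TTTTT"]

def find_at_tracts(seq, min_len=5):
    """Return (start, tract, priority_motifs_found) for each maximal A/T run."""
    seq = seq.upper()
    results = []
    for m in re.finditer("[AT]{%d,}" % min_len, seq):
        tract = m.group(0)
        found = [p for p in PRIORITY_MOTIFS if p in tract]
        results.append((m.start(), tract, found))
    return results
```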

Avoid C/T tracts of six or more bases, especially if they are just upstream of an AG (i.e., splice-acceptor-like - see Patel)
                   Search syntax YYYYY, look for an AG dinucleotide within 10 bp downstream
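
A possible sketch of the acceptor-like search: find each maximal C/T run, then look for an AG in the bases immediately downstream. The function name `find_acceptor_like_sites` is hypothetical. The default `min_len=6` follows the "six or more" wording (use `min_len=5` to mirror the YYYYY search string), and requiring the AG to lie entirely within the 10-base window is an assumption.

```python
import re

def find_acceptor_like_sites(seq, min_len=6, window=10):
    """Return (start, tract, ag_offset) for each maximal C/T run.

    ag_offset is the distance from the end of the tract to the first AG
    lying within `window` bases downstream, or None if there is none.
    """
    seq = seq.upper()
    results = []
    for m in re.finditer("[CT]{%d,}" % min_len, seq):
        downstream = seq[m.end():m.end() + window]
        offset = downstream.find("AG")
        results.append((m.start(), m.group(0), offset if offset >= 0 else None))
    return results
```

Tracts with an `ag_offset` other than None are the ones most worth editing.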

Avoid clusters (<150 bp apart) of the relatively rare codons:
                  CAA (Gln), AAA (Lys), AGA (Arg), CAT (His), TTT (Phe), TCA (Ser), or TTG (Leu)
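
Rare-codon clusters can be flagged by walking the reading frame and grouping rare codons that lie close together. In this sketch the function name `find_rare_codon_clusters` is hypothetical, and interpreting "<150bp apart" as the distance between the start positions of consecutive rare codons is an assumption.

```python
RARE_CODONS = {"CAA", "AAA", "AGA", "CAT", "TTT", "TCA", "TTG"}

def find_rare_codon_clusters(orf, max_gap=150):
    """Group in-frame rare codons whose start positions are < max_gap bp apart.

    Returns a list of clusters; each cluster is a list of (position, codon)
    with 0-based nucleotide positions. Isolated rare codons are not reported.
    """
    orf = orf.upper()
    rare = [(i, orf[i:i + 3]) for i in range(0, len(orf) - 2, 3)
            if orf[i:i + 3] in RARE_CODONS]
    clusters, current = [], []
    for pos, codon in rare:
        # Close the current cluster when the next rare codon is too far away.
        if current and pos - current[-1][0] >= max_gap:
            if len(current) > 1:
                clusters.append(current)
            current = []
        current.append((pos, codon))
    if len(current) > 1:
        clusters.append(current)
    return clusters
```

The sequence must be supplied in frame, starting at the first base of a codon.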

Less important motifs that may or may not matter
Avoid T/G rich sequences - e.g., GTTGTTTG, TATATGTTT (HuR class III binding sites, Cumming (2009) Virology 383:142)   

Avoid clusters of GGG triplets (hnRNP-H binding (JV 79:9254))
                  Hypothesis:  GGG triplets are prone to formation of G-quadruplexes?  It may be wise to check the sequence with G4 Calculator software.

Avoid ACCACC (inhibition via SRp20 binding Jia08 JV 83:167)        

Final polish:           

Some converted sequences are very G/C rich.  The gene synthesis process may be easier if the third position of some Gly, Pro, Ala codons is changed to A or T
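
To find candidates for this polish step, one can measure overall G/C content and list the Gly (GGN), Pro (CCN), and Ala (GCN) codons that end in G or C. Any third-position change within these codon families is synonymous. The function names below are hypothetical, and the sketch only lists candidates; which ones to change remains a judgment call.

```python
# First two bases of the fourfold-degenerate Gly, Pro, and Ala codon families.
FAMILIES = {"GG": "Gly", "CC": "Pro", "GC": "Ala"}

def gc_fraction(seq):
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def third_position_candidates(orf):
    """Return (position, codon, amino_acid) for Gly/Pro/Ala codons ending in G or C."""
    orf = orf.upper()
    hits = []
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        if codon[:2] in FAMILIES and codon[2] in "GC":
            hits.append((i, codon, FAMILIES[codon[:2]]))
    return hits
```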

We have some Gateway-adapted Destination constructs that work well in 293TT cells. The synthesized gene can be flanked with attL sequences to allow use of Gateway LR recombinase.

Codon usage in highly expressed genes
Originally from <http://www-igbmc.u-strasbg.fr/>

AmAcid  Codon  Number   /1000  Fraction  ..
Gly     GGG     905.00  18.76  0.24
Gly     GGA     525.00  10.88  0.14
Gly     GGT     441.00   9.14  0.12
Gly     GGC    1867.00  38.70  0.50
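
The Fraction column appears to be each codon's share of the counts for its amino acid. A short check using the four glycine rows above (the function name `codon_fractions` is hypothetical):

```python
# Counts for the four glycine codons, copied from the table above.
gly_counts = {"GGG": 905, "GGA": 525, "GGT": 441, "GGC": 1867}

def codon_fractions(counts):
    """Return each codon's share of the total count, rounded to 2 decimals."""
    total = sum(counts.values())
    return {codon: round(n / total, 2) for codon, n in counts.items()}

print(codon_fractions(gly_counts))
# {'GGG': 0.24, 'GGA': 0.14, 'GGT': 0.12, 'GGC': 0.5}
```

The computed values match the Fraction column for all four rows.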