The importance of location, where one amino acid makes the difference
The C-terminal sequence of a protein is involved in processes such as translation termination efficiency and protein degradation. However, the general relationship between features of this C-terminal sequence and protein expression levels remains unknown. Here, we identified C-terminal amino acid biases that are ubiquitous across the bacterial taxonomy (1582 genomes). We showed that positively charged amino acids (lysine, arginine) occur at higher frequency at the C-terminus, while hydrophobic amino acids and threonine occur at lower frequency. We then studied the impact of C-terminal composition on protein levels in a library of M. pneumoniae mutants covering all possible combinations of the last two codons. We found that charged and polar residues, in particular lysine, led to higher expression, while hydrophobic and aromatic residues led to lower expression, with differences in protein levels of up to 4-fold. We further showed that modulation of the protein degradation rate could be one of the main mechanisms driving these differences. Our results demonstrate that the identity of the last amino acids has a strong influence on protein expression levels and should therefore be taken into account in biotechnological applications.