Introduction to peptide design
Peptides can be designed de novo or based on peptide sequences from native proteins, depending on the desired application. Synthetic peptides can be modified to change their properties or conformation, tagged for purification or detection, conjugated to immunogens for antibody production or isotopically labeled for protein quantitation. Peptides are complex biomolecules that have unique chemical and physical properties that are a direct result of their amino acid composition. This page focuses on the key elements of peptide design that influence synthesis, purity and stability and how they can be modified.
Peptide length is variable and depends on the application. For example, peptides 10-20 amino acids in length are ideal for antibody preparation, while peptides used for structure/function studies can be more variable. Although technological advances have made current peptide synthesis strategies considerably more efficient than ever before, the purity of crude synthesized peptides is still limited by peptide length. As the length of the peptide increases, so does the amount of impurities that must be removed after each deprotection-coupling cycle. Longer sequences also require more coupling reactions between the growing peptide and the next amino acid in the sequence. In each coupling cycle, a small number of coupling reactions on individual peptides in the reaction mixture fail, so the concentration of truncated peptides (deletions) rises as the peptide grows longer. The concentration of full-length peptide synthesized in a reaction is therefore inversely correlated with the length of the proposed peptide, and the yield falls because the low-abundance product becomes increasingly difficult to purify from the crude mixture. While peptides 75 amino acids in length can be synthesized, the yield will be poor compared with that of shorter peptides.
Graphical representation of the relationship between peptide length and full-length peptide yield. Due to the cyclic nature of the methods of peptide synthesis, the concentration of full-length peptide synthesized in a given reaction is inversely correlated with the length of the proposed peptide.
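The trend can be illustrated with a simple model. The sketch below assumes that every coupling step succeeds independently with the same probability, which is a simplification; the function name and the 99% and 97% efficiencies are illustrative assumptions, not values from this page.

```python
def full_length_fraction(length: int, coupling_efficiency: float) -> float:
    """Estimate the fraction of chains that reach full length.

    Simplifying assumption: each of the (length - 1) couplings succeeds
    independently with the same probability.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (length - 1)


if __name__ == "__main__":
    # Compare two assumed per-cycle efficiencies across peptide lengths.
    for n in (10, 20, 50, 75):
        high = full_length_fraction(n, 0.99)
        low = full_length_fraction(n, 0.97)
        print(f"{n:>3} residues: {high:.1%} at 99%, {low:.1%} at 97%")
```

Under this model even a small drop in per-cycle efficiency compounds quickly, which is consistent with the poor yields described for long peptides.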
Amino acids are grouped according to their hydropathy, and the inclusion or exclusion of hydrophobic or hydrophilic amino acids in a peptide sequence influences the ability to synthesize the peptide or solubilize the final product in aqueous solutions.
Amino acid classifications | |
---|---|
Hydrophobic (non-polar) | Ala, Ile, Leu, Met, Phe, Trp, Val |
Uncharged (polar) | Asn, Cys, Gly, Gln, Pro, Ser, Thr, Tyr |
Acidic (polar) | Asp, Glu |
Basic (polar) | His, Lys, Arg |
A high proportion of hydrophobic amino acids reduces the solubility of a peptide in aqueous solutions. A rule of thumb in designing soluble peptides is to ensure that 1 out of every 5 amino acids is charged. If this cannot be achieved, amino acids that are not critical to the function of the peptide can be replaced with charged residues. Because this may change the nature of the peptide, substitutions should be considered carefully.
While it is difficult to determine the exact solubility of peptides without empirical testing, there are general guidelines that can be used to predict peptide solubility:
- Peptides shorter than 5 residues are usually soluble in aqueous solutions, except if the entire sequence consists of hydrophobic amino acids.
- Hydrophilic peptides containing >25% charged residues and <25% hydrophobic amino acids are usually soluble in aqueous solutions.
- Hydrophobic peptides containing 50% or more hydrophobic residues may be insoluble or only partly soluble in aqueous solutions. These peptides should first be dissolved in organic solvents such as dimethylsulfoxide (DMSO), dimethylformamide (DMF) or acetonitrile prior to a careful dilution in aqueous solutions.
- Peptides containing a very high proportion (>75%) of D, E, H, K, N, Q, R, S, T or Y are capable of building intermolecular hydrogen bonds (crosslinks) and thus form gels in aqueous solutions. These peptides should either be solubilized in organic solvents or the pH of the buffer should be modified.
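These guidelines can be turned into a rough screening function. The sketch below is a minimal interpretation, not a validated predictor: the residue sets follow the classification table on this page (acidic and basic residues counted as charged), and the function name and message wording are assumptions made for illustration.

```python
from __future__ import annotations

HYDROPHOBIC = set("AILMFWV")    # hydrophobic (non-polar) residues
CHARGED = set("DEHKR")          # acidic plus basic residues
GEL_PRONE = set("DEHKNQRSTY")   # residues able to form intermolecular hydrogen bonds


def solubility_notes(sequence: str) -> list[str]:
    """Return the solubility guidelines that apply to a one-letter sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    hydrophobic = sum(aa in HYDROPHOBIC for aa in seq) / n
    charged = sum(aa in CHARGED for aa in seq) / n
    gel_prone = sum(aa in GEL_PRONE for aa in seq) / n

    # Very short peptides are handled by their own guideline.
    if n < 5:
        if hydrophobic == 1.0:
            return ["short but entirely hydrophobic: may be insoluble"]
        return ["short peptide: usually soluble"]

    notes = []
    if charged > 0.25 and hydrophobic < 0.25:
        notes.append("hydrophilic: usually soluble")
    if hydrophobic >= 0.5:
        notes.append("hydrophobic: dissolve in DMSO, DMF or acetonitrile first")
    if gel_prone > 0.75:
        notes.append("may gel: use an organic solvent or adjust buffer pH")
    if charged < 0.2:
        notes.append("fewer than 1 in 5 residues charged")
    if not notes:
        notes.append("no guideline applies: test empirically")
    return notes
```

Because several guidelines can apply to the same sequence, the function returns every matching note rather than a single verdict. Empirical testing is still needed to confirm solubility.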
Besides the peptide length, certain amino acids or amino acid combinations can negatively affect peptide synthesis, purification, solubility or stability. These amino acids can be substituted with conservative amino acids such as alanine or glycine, deleted, or replaced with an analogue, depending on the specific amino acid. Depending on the application, the peptide may be based on native proteins, and such sequences often contain both amino acids that are essential for function in a given assay and amino acids that are not essential and act solely in a structural capacity. With these kinds of peptides, the rule of thumb is to make any modifications or substitutions on nonessential residues. Another way to address difficult amino acids or unfavorable combinations in native sequences is to shift the peptide slightly along the native sequence, either to make it more favorable or to break up unfavorable combinations.
Graphical representation of shifting the peptide sequence to avoid unfavorable amino acids. In addition to conservative substitution of amino acids that may interfere with a given application or negatively affect synthesis or purification, sequences based on native proteins can sometimes be shifted slightly to either exclude amino acids (indicated by the red arrow) or break up unfavorable sequences.
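The idea of shifting a peptide along its native sequence can be sketched as a small search over nearby start positions. This is only an illustration under stated assumptions: the function name, the `max_shift` parameter, the tie-breaking rule and the example sequence are all hypothetical, and a real design would also weigh which residues are essential for function.

```python
from __future__ import annotations


def shift_window(native: str, start: int, length: int,
                 unfavorable: set[str], max_shift: int = 3) -> tuple[int, str]:
    """Pick the nearby window of `native` with the fewest unfavorable residues.

    Candidate windows start within `max_shift` residues of `start`.
    Ties are broken in favor of the smallest shift.
    """
    last_start = len(native) - length
    if length < 1 or last_start < 0 or not 0 <= start <= last_start:
        raise ValueError("window does not fit inside the native sequence")

    def unfavorable_count(i: int) -> int:
        return sum(aa in unfavorable for aa in native[i:i + length])

    candidates = range(max(0, start - max_shift),
                       min(last_start, start + max_shift) + 1)
    best = min(candidates, key=lambda i: (unfavorable_count(i), abs(i - start)))
    return best, native[best:best + length]


if __name__ == "__main__":
    # Hypothetical native sequence; avoid the cysteine near the N-terminus.
    print(shift_window("ACDEFGHIKL", start=0, length=5, unfavorable={"C"}))
```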
The following points are guidelines in designing de novo or native-based peptides that have a composition that favors synthesis, purification, storage and solubility.
Cysteine and methionine are susceptible to rapid oxidation, which can negatively influence the cleavage of protecting groups during synthesis and the subsequent peptide purification. To avoid this, cysteine can be replaced with serine and methionine with norleucine (Nle). Multiple cysteines in a peptide are susceptible to forming disulfide linkages unless a reducing agent such as dithiothreitol (DTT) is added to the buffer or the cysteines are replaced with serine residues. Cysteine residues in peptides used for antibody production can affect the avidity of the antibody: free cysteines are uncommon in vivo, so antibodies raised against a peptide that contains them may not recognize the native peptide structure.
N-terminal glutamine is unstable, because it forms cyclic pyroglutamate under acidic conditions during protecting group cleavage. This can be prevented by acetylating the N-terminal glutamine or by substituting glutamine with pre-formed pyroglutamic acid or a conservative amino acid.
N-terminal asparagine should be avoided, because the asparagine N-terminal protecting group can be difficult to remove during cleavage. Therefore, either remove or substitute the N-terminal amino acid.
Aspartic acid can undergo hydrolysis and cause peptide cleavage under acidic conditions when paired with glycine, proline or serine. Avoid these combinations where possible, either by substitution or by shifting the sequence to break them up.
Multiple serine or proline residues in a sequence can cause significant deletions during synthesis. Proline is especially problematic because it can undergo cis/trans isomerization, which reduces peptide purity.
A series of glutamine, isoleucine, leucine, phenylalanine, threonine, tyrosine or valine residues can form β sheets, which cause incomplete solvation during peptide synthesis and result in deletions. β sheets can be broken up by conservative substitution of asparagine for glutamine or serine for threonine, by adding a proline or glycine every third amino acid, or by shifting the sequence.
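The composition guidelines above lend themselves to a simple sequence check. The sketch below is one possible reading of them: the text gives no exact run lengths, so the thresholds (two adjacent Ser/Pro residues, four consecutive β-sheet-prone residues) and the choice to flag Asp only when it is followed by Gly, Pro or Ser are assumptions, as are the function name and message wording.

```python
from __future__ import annotations

import re

# Assumed patterns; the thresholds are illustrative, not taken from the text.
ASP_PAIR = re.compile(r"D[GPS]")          # Asp followed by Gly, Pro or Ser
SER_PRO_RUN = re.compile(r"[SP]{2,}")     # adjacent Ser/Pro residues
BETA_RUN = re.compile(r"[QILFTYV]{4,}")   # run of beta-sheet-prone residues


def design_warnings(sequence: str) -> list[str]:
    """Return warnings for a peptide given in one-letter code."""
    seq = sequence.upper()
    warnings = []
    if "C" in seq or "M" in seq:
        warnings.append("contains Cys/Met: prone to oxidation")
    if seq.count("C") > 1:
        warnings.append("multiple Cys: may form disulfide linkages")
    if seq.startswith("Q"):
        warnings.append("N-terminal Gln: may cyclize to pyroglutamate")
    if seq.startswith("N"):
        warnings.append("N-terminal Asn: protecting group hard to remove")
    # Positions are reported 1-based, counting from the N-terminus.
    for m in ASP_PAIR.finditer(seq):
        warnings.append(f"acid-labile pair {m.group()} at position {m.start() + 1}")
    for m in SER_PRO_RUN.finditer(seq):
        warnings.append(f"Ser/Pro run {m.group()} at position {m.start() + 1}")
    for m in BETA_RUN.finditer(seq):
        warnings.append(f"beta-sheet-prone run {m.group()} at position {m.start() + 1}")
    return warnings
```

An empty list means only that none of these particular patterns was found; it does not guarantee an easy synthesis.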
A powerful aspect of synthetic peptides is the ability to create peptides with the exact conformation or characteristics needed for a given application through the addition of modifications. Peptides in general and specific amino acids have distinct moieties that are amenable to modification, including:
- N-terminal amino group
- C-terminal carboxy group
- ε-Amino group on lysine
- Hydroxyl group on serine, threonine and tyrosine
- Guanidine group on arginine
- Thiol group on cysteine
Common peptide moieties accessible for modification. Peptides are commonly modified at specific moieties on individual amino acids and at sites that are conserved throughout all peptides. (A) N-terminal amino group; (B) ε-amino group on lysines; (C) Thiol group on cysteines; (D) Hydroxyl group on serine, threonine or tyrosine; (E) Guanidinyl group on arginines; and (F) C-terminal carboxy group.
There are multiple types of modifications, and the total number of modifications is vast. Many of them are post-translational modifications that occur in vivo, while others are substitutions of natural amino acids with non-natural or isotopically labeled variants. Additionally, tags or proteins can be chemically conjugated via crosslinking chemistry to the moieties listed previously. Because of the C- to N-terminus synthesis orientation, it is recommended that any tags or dyes be conjugated to the N-terminus so that only full-length peptides are labeled.
While the modifications listed below comprise commonly used modifications, this collection is by no means exhaustive.
Phosphorylated tyrosine, serine or threonine can be positioned anywhere on a given peptide. Although multiple phosphorylated amino acids can be added, they can negatively affect peptide synthesis and purification.
Chemically synthesized peptides carry positively and negatively charged amino and carboxy termini, respectively. Because the corresponding positions are not charged when the sequence sits within a native protein in vivo, the termini can be modified by N-terminal acetylation and C-terminal amidation, which remove the respective charges to mimic natural peptides and increase cell permeability.
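The effect of terminal capping on charge can be estimated with a rough count. The sketch below assumes near-neutral pH and whole-number charges: Lys and Arg count as +1, Asp and Glu as -1, and His is treated as uncharged. These are simplifications, and the function and parameter names are hypothetical.

```python
def approximate_net_charge(sequence: str,
                           acetylated_n_term: bool = False,
                           amidated_c_term: bool = False) -> int:
    """Rough net charge of a peptide near neutral pH (integer estimate)."""
    seq = sequence.upper()
    # Side-chain contributions under the stated simplifications.
    charge = seq.count("K") + seq.count("R") - seq.count("D") - seq.count("E")
    if not acetylated_n_term:
        charge += 1  # free amino terminus carries a positive charge
    if not amidated_c_term:
        charge -= 1  # free carboxy terminus carries a negative charge
    return charge


if __name__ == "__main__":
    # Hypothetical sequence, with and without capped termini.
    print(approximate_net_charge("GKAE"))
    print(approximate_net_charge("GKAE", acetylated_n_term=True))
```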
Methylation of histone proteins is a common method of epigenetic regulation, and mono-, di- and trimethylated lysine residues (or sometimes arginine) can be added to peptides to mimic this post-translational modification. The thiol group on cysteine can also be protected with an acetamidomethyl (Acm) group, which allows disulfide bridges to be formed selectively.
Standard peptide synthesis is performed using L-amino acids, but a common option is to synthesize the peptide using D-isomers, which are the enantiomers, or mirror images, of chiral L-amino acids (all natural amino acids except glycine). While the chemical formula is identical, D-amino acids may modify the function of the peptide.
In contrast to standard amino acids, isotopically labeled "heavy" amino acids are synthesized by substituting ¹²C and ¹⁴N atoms with ¹³C and/or ¹⁵N atoms, respectively. Heavy amino acids are non-radioactive and have known molecular weights that are greater than those of standard amino acids. This molecular weight difference makes heavy peptides useful tools for quantitative peptide analysis by mass spectrometry (MS) or for protein structure and dynamics determination by nuclear magnetic resonance (NMR) spectroscopy.
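The mass difference introduced by a heavy amino acid can be worked out from its carbon and nitrogen counts. The sketch below assumes uniform labeling, meaning every carbon and/or nitrogen in the residue is replaced. The isotope mass differences are rounded, only a few residues are tabulated, and the names used are hypothetical.

```python
# Isotope mass differences in daltons (rounded).
DELTA_13C = 1.003355  # 13C minus 12C
DELTA_15N = 0.997035  # 15N minus 14N

# (carbon count, nitrogen count) for a few amino acids, one-letter code.
ATOM_COUNTS = {
    "K": (6, 2),  # lysine
    "R": (6, 4),  # arginine
    "L": (6, 1),  # leucine
    "V": (5, 1),  # valine
}


def heavy_mass_shift(residue: str,
                     label_carbon: bool = True,
                     label_nitrogen: bool = True) -> float:
    """Mass increase (Da) for a uniformly labeled residue."""
    carbons, nitrogens = ATOM_COUNTS[residue.upper()]
    shift = 0.0
    if label_carbon:
        shift += carbons * DELTA_13C
    if label_nitrogen:
        shift += nitrogens * DELTA_15N
    return shift


if __name__ == "__main__":
    for aa in ("K", "R"):
        print(aa, round(heavy_mass_shift(aa), 4))
```

For fully ¹³C/¹⁵N-labeled lysine and arginine this gives shifts of roughly 8 Da and 10 Da.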
Cyclization can be used to mimic natural peptide structures or to synthesize more stable peptide analogues, resulting in enhanced conformational stability compared to their natural counterparts. Cyclic peptides are also known for their resistance to proteolytic hydrolysis and degradation. Two common methods of peptide cyclization are:
- Amide condensation
- Thiol oxidation (disulfide bridge formation)
Spacers are chemical structures that separate peptides from tags and dyes. They can be hydrophobic or hydrophilic, which modifies the natural hydropathy of the peptide to which they are conjugated. Spacers of different lengths are commonly available to vary the distance between the peptide and the dye or tag. A common hydrophobic spacer is aminohexanoic acid (Ahx), and a common hydrophilic spacer is polyethylene glycol (PEG).
Tags can be conjugated to peptides to aid purification or detection in an experimental system. Depending on the type of tag, they can also be used to facilitate localization to specific cellular compartments (e.g., membranes, cytoplasm). Peptides are commonly tagged with biotin or lipids (e.g., farnesyl, formyl, myristoyl, palmitoyl and stearyl groups).
Dyes are commonly used to aid in localization or protein binding studies and can be broadly separated into fluorescent dyes, quenchers (non-fluorescent; used to quench proximal fluorescent dye molecules) and chromogens. Dyes that are commonly conjugated to peptides are listed below.
Fluorescent Dyes | |
---|---|
7-Amino-4-methyl-coumarin (AMC) | UV-excitable dye used in enzyme assays using cuvettes or by flow cytometry |
5-((2-Aminoethyl)amino)naphthalene-1-sulfonic acid (EDANS) | A commonly used dye in fluorescence resonance energy transfer (FRET) peptides in combination with Dabcyl as a quencher |
Fluorescein derivatives (FITC, FAM) | A commonly used fluorescent dye in confocal laser-scanning microscopy and flow cytometric applications |
7-Nitrobenz-2-oxa-1,3-diazole (NBD) | A fluorescent dye used for amine modification |
Rhodamine derivatives (Rhodamine B, TAMRA) | A commonly used group of fluorescent dyes used in many fluorescent applications |
Fluorescent Quenchers | |
Dabcyl | A non-fluorescent dye predominantly used as a quencher for other fluorophores, especially fluorescein derivatives or EDANS |
Dansyl | A fluorophore quencher that, unlike Dabcyl, has a specific wavelength emission |
2,4-Dinitrophenol (DNP) | A non-fluorescent dye that can be used as a quencher similar to Dabcyl |
Other dyes | |
p-Nitroaniline | A chromogen used as a colorimetric enzyme substrate in many standard enzyme assays |
Peptide-protein conjugates are used to generate antibodies that target a specific peptide. Peptides alone are mostly too small to elicit an immune response sufficient to generate antibodies. Therefore, the peptide of interest is conjugated to a carrier protein containing many epitopes to stimulate T-helper cells, which induce the B-cell response that generates the antibodies. A key factor in this approach is that the immune system reacts to the peptide-protein conjugate as a whole, so some of the elicited antibodies target the linker region or the carrier protein rather than the peptide of interest. These should be removed by purifying the peptide-specific antibody.
Common carrier proteins used for antibody production are:
- Keyhole limpet hemocyanin (KLH) – a copper-containing non-heme protein found in arthropods and mollusks that has a MW of 4.5 × 10⁵ to 1.3 × 10⁷ Da. KLH is the most commonly selected carrier because of its high immunogenicity compared to other proteins.
- Bovine serum albumin (BSA) – a stable and soluble plasma protein in cattle that has a MW of 67 × 10³ Da and contains 59 lysine residues, 30-35 of which are accessible for conjugation. This characteristic specifically makes BSA a popular carrier protein for weakly antigenic compounds.
- Ovalbumin (OVA) – a protein isolated from chicken egg whites with a MW of 45 × 10³ Da that is a good choice as a second carrier protein to verify if antibodies are specific for the peptide alone and not the carrier protein (e.g., BSA, KLH).
- Multiple antigen peptides (MAPs) – branched peptides that are bound via a lysine backbone can be used for antibody production without using a large protein against which antibodies are also generated. MAPs are available with 4, 8 and sometimes 12 branches, depending on the synthesis service.
Unlike the 20 natural, "proteinogenic" amino acids, unnatural amino acids are not encoded by the universal genetic code, although they can usually be found in nature as metabolic products, especially in plants and bacteria. Some examples include:
- Citrulline
- Ornithine
- ε-Acetyl-lysine
- 3-Amino-propionic acid (β-alanine)
- Aminobenzoic acid
- 6-Aminocaproic acid (Aca; 6-Aminohexanoic acid)
- Aminobutyric acid (Abu)
- Hydroxyproline
- Mercaptopropionic acid (MPA)
- 3-Nitro-tyrosine
- Norleucine (Nle)
- Pyroglutamic acid
For research use only. Not for use in diagnostic procedures.