Simplified molecular-input line-entry system

From Sciencemadness Wiki

The Simplified Molecular-Input Line-Entry System, SMILES for short, is a line notation that describes a chemical structure as a short string of text. It is used mainly for organic structures, but it can describe inorganic species as well.

How to use

  • Atoms and charge

Elements are written in square brackets (for example, [He] for helium). A charge is written after the element symbol, either as a repeated sign or as a sign followed by a number. The copper(II) cation, for example, can be written as [Cu++] or [Cu+2]; the sign-and-number form is generally preferred. Hydrogens attached to organic subset atoms written without brackets (see below) are implied and are not written. Inside brackets nothing is implied, so attached hydrogens must be written out (for example, [ClH] for hydrogen chloride). A hydrogen that carries a charge or is a specific isotope is written as a bracket atom of its own, such as [H+].
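The two ways of writing a charge can be reduced to the same integer with a few lines of Python. This is only a minimal sketch: the function name parse_charge is made up for this example, and it accepts the repeated-sign form (++) and the sign-then-number form (+2) only.

```python
import re

def parse_charge(text):
    """Return the integer charge for a SMILES charge suffix such as '++' or '+2'."""
    if text == "":
        return 0  # no charge written means a neutral atom
    match = re.fullmatch(r"([+-])(\d+)", text)
    if match:  # a sign followed by a number, e.g. '+2' or '-1'
        sign, digits = match.groups()
        return int(digits) if sign == "+" else -int(digits)
    if re.fullmatch(r"\++|-+", text):  # repeated signs, e.g. '++' or '---'
        return len(text) if text[0] == "+" else -len(text)
    raise ValueError(f"not a valid charge: {text!r}")
```

With this sketch, parse_charge("++") and parse_charge("+2") both return 2, so [Cu++] and [Cu+2] describe the same ion.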

  • Isotopes

An isotope is specified by writing its mass number in front of the element symbol (for example, [2H] for deuterium). If no mass number is given, the natural isotope mixture is assumed.
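A bracket atom can be taken apart with a regular expression. The sketch below is an illustration, not a full SMILES parser: the names BRACKET_ATOM and parse_bracket_atom are hypothetical, and aromatic (lowercase) symbols, chirality marks and atom classes are deliberately left out.

```python
import re

# Pattern for a simple bracket atom: [isotope? symbol hydrogens? charge?]
BRACKET_ATOM = re.compile(
    r"\[(?P<isotope>\d+)?"              # optional mass number, e.g. 2 in [2H]
    r"(?P<symbol>[A-Z][a-z]?)"          # element symbol, e.g. H, He, Cu
    r"(?P<hcount>H\d*)?"                # optional attached hydrogens, e.g. H4
    r"(?P<charge>[+-]\d+|\++|-+)?\]"    # optional charge, e.g. +2 or ++
)

def parse_bracket_atom(text):
    """Split a bracket atom into isotope, symbol, hydrogen count and charge text."""
    match = BRACKET_ATOM.fullmatch(text)
    if match is None:
        raise ValueError(f"not a simple bracket atom: {text!r}")
    hcount_text = match["hcount"]
    if hcount_text is None:
        hydrogens = 0                          # nothing is implied inside brackets
    else:
        hydrogens = int(hcount_text[1:] or 1)  # a bare 'H' means one hydrogen
    return {
        "isotope": int(match["isotope"]) if match["isotope"] else None,
        "symbol": match["symbol"],
        "hydrogens": hydrogens,
        "charge": match["charge"] or "",
    }
```

For example, parse_bracket_atom("[2H]") reports isotope 2 for the symbol H, while parse_bracket_atom("[He]") reports the isotope as None, meaning that no particular isotope was requested.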

  • Organic subset elements

The organic subset consists of the elements B, C, N, O, P, S, F, Cl, Br, and I. These can be written without brackets, in which case enough hydrogens are implied to fill the atom's normal valence. Methane, for example, is simply a single C.
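How many hydrogens an unbracketed atom carries can be worked out from its bonds. The sketch below assumes the normal valences listed in the OpenSMILES specification; the names NORMAL_VALENCES and implicit_hydrogens are invented for this example, and aromatic atoms are not covered.

```python
# Normal valences of the organic subset (assumed from the OpenSMILES specification).
NORMAL_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,), "P": (3, 5),
    "S": (2, 4, 6), "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}

def implicit_hydrogens(symbol, bond_order_sum):
    """Return the number of implied hydrogens on an unbracketed atom.

    bond_order_sum is the total order of the bonds that are written out
    (single = 1, double = 2, triple = 3).
    """
    for valence in NORMAL_VALENCES[symbol]:
        if valence >= bond_order_sum:
            # Fill up to the lowest normal valence that fits the written bonds.
            return valence - bond_order_sum
    return 0  # more bonds than any normal valence: no hydrogens are added
```

A lone C has no written bonds, so implicit_hydrogens("C", 0) returns 4, which is methane. In the same way a lone O gives 2 hydrogens (water) and a lone Cl gives 1 (hydrogen chloride).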

  • Bonds

Single bonds can be written as a dash (-), as in C-C, but the dash is optional and is usually left out for simplicity. Aromatic bonds are likewise normally left implicit. All other bonds have to be written out: a double bond is an equals sign (=), as in CC(=O)C for acetone, and a triple bond is a hash (#), as in C#N for hydrogen cyanide. Bonds that close a ring are written in a different way, which is described in the section on cyclic bonds.
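Before bonds can be interpreted, a SMILES string has to be split into its symbols. The following tokenizer is a rough sketch under stated assumptions: the names TOKEN and tokenize are hypothetical, and only bracket atoms, organic subset atoms, the three bond symbols above, parentheses and single-digit ring numbers are recognised.

```python
import re

# One alternative per kind of symbol; the group name becomes the token type.
TOKEN = re.compile(
    r"(?P<bracket>\[[^\]]+\])"        # bracket atom such as [Cu+2]
    r"|(?P<atom>Cl|Br|[BCNOPSFI])"    # organic subset atom
    r"|(?P<bond>[-=#])"               # single, double or triple bond
    r"|(?P<branch>[()])"              # start or end of a branch
    r"|(?P<ring>\d)"                  # ring index number
)

def tokenize(smiles):
    """Split a SMILES string into (kind, text) pairs."""
    tokens = []
    position = 0
    while position < len(smiles):
        match = TOKEN.match(smiles, position)
        if match is None:
            raise ValueError(f"unexpected character at position {position}")
        tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens
```

For hydrogen cyanide, tokenize("C#N") returns [("atom", "C"), ("bond", "#"), ("atom", "N")]. Two atom tokens that follow each other without a bond token between them are joined by an implied single bond.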

  • Branching

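A branch occurs where an atom is bonded to three or more other atoms. One of these bonds continues the main chain; each additional bond, together with everything attached to it, is put in parentheses () directly after the branching atom. A good example is acetone, CC(=O)C, where the oxygen is a branch on the central carbon.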
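Reading branches comes down to remembering which atom to return to when a parenthesis closes, which a stack does naturally. The sketch below is a simplified illustration: the function name parse_branches is made up, and it assumes one-letter organic subset atoms with no rings and no brackets.

```python
BOND_ORDER = {"-": 1, "=": 2, "#": 3}

def parse_branches(smiles):
    """Return (atoms, bonds) for a SMILES string made of chains and branches.

    Assumes one-letter organic subset atoms; rings and brackets are not handled.
    """
    atoms = []       # element symbols in order of appearance
    bonds = []       # (first atom index, second atom index, bond order)
    stack = []       # atoms to return to when a branch closes
    previous = None  # index of the atom that the next atom bonds to
    order = 1        # order of the pending bond (single unless stated)
    for char in smiles:
        if char in BOND_ORDER:
            order = BOND_ORDER[char]
        elif char == "(":
            stack.append(previous)   # remember the branching atom
        elif char == ")":
            previous = stack.pop()   # go back to the branching atom
        elif char in "BCNOPSFI":
            atoms.append(char)
            index = len(atoms) - 1
            if previous is not None:
                bonds.append((previous, index, order))
            previous = index
            order = 1
        else:
            raise ValueError(f"unsupported character: {char!r}")
    return atoms, bonds
```

For acetone, parse_branches("CC(=O)C") returns the atoms ["C", "C", "O", "C"] and the bonds [(0, 1, 1), (1, 2, 2), (1, 3, 1)]: the central carbon (index 1) is bonded to three other atoms, one of them by a double bond.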

  • Cyclic bonds

A ring is specified by putting an index number, such as 1, after the first atom of the ring and the same index number after the last atom of the ring; the second occurrence closes the ring by bonding the two atoms. Cyclohexane is therefore written C1CCCCC1. If the compound has multiple rings, a different index number is used for each one as needed.
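Ring closure can be handled by remembering which atom opened each index number. This is again only a sketch under simplifying assumptions: the function name parse_rings is hypothetical, and it handles one-letter organic subset atoms joined by single bonds, with single-digit ring numbers and no branches.

```python
def parse_rings(smiles):
    """Return (atoms, bonds) for a chain of atoms with ring index numbers.

    Assumes one-letter organic subset atoms, single bonds and no branches.
    """
    atoms = []        # element symbols in order of appearance
    bonds = []        # (first atom index, second atom index)
    open_rings = {}   # ring index number -> atom that opened the ring
    previous = None   # index of the most recent atom
    for char in smiles:
        if char in "0123456789":
            if previous is None:
                raise ValueError("ring number before any atom")
            if char in open_rings:
                # Second occurrence: close the ring with a bond.
                bonds.append((open_rings.pop(char), previous))
            else:
                # First occurrence: remember where the ring starts.
                open_rings[char] = previous
        elif char in "BCNOPSFI":
            atoms.append(char)
            index = len(atoms) - 1
            if previous is not None:
                bonds.append((previous, index))
            previous = index
        else:
            raise ValueError(f"unsupported character: {char!r}")
    if open_rings:
        raise ValueError("ring opened but never closed")
    return atoms, bonds
```

For cyclohexane, parse_rings("C1CCCCC1") returns six carbon atoms and six bonds, the last of which, (0, 5), is the ring-closing bond between the first and the last carbon.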

  • Aromaticity


  • Stereo Isomerism


  • Reactions



Relevant Sciencemadness threads