Package uk.ac.cam.ch.wwmm.opsin
Class ParseRules
java.lang.Object
    uk.ac.cam.ch.wwmm.opsin.ParseRules
public class ParseRules extends java.lang.Object
Instantiate via NameToStructure.getOpsinParser().

Performs finite-state allocation of roles ("annotations") to tokens:

The chemical name is broken down into tokens, e.g. ethyl --> eth yl, by applying the chemical grammar in regexes.xml.

Each of the tokens eth and yl is associated with a letter, referred to here as an annotation, which denotes the role of the token. These letters are defined in regexes.xml and in this case mean alkaneStem and inlineSuffix respectively.

The chemical grammar uses the annotations associated with the tokens when deciding what may follow what has already been seen, e.g. a chemical name cannot start with yl, and an optional e is valid after an arylGroup.

Author:
ptc24, dl387
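The idea of assigning annotations under a finite-state grammar can be sketched with a toy example. This is not OPSIN's implementation: the class AnnotationSketch, the token table, the annotation letters 'a' and 's', and the transition table are all assumptions made for illustration (the real tokens and letters live in regexes.xml), and the matching here is greedy with no backtracking.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy illustration only: not OPSIN's grammar, token list, or annotation letters. */
final class AnnotationSketch {
    // Hypothetical token -> annotation table.
    private static final Map<String, Character> TOKENS = new LinkedHashMap<>();
    // Hypothetical transitions: current annotation -> annotations allowed next.
    // '^' marks the start of the word.
    private static final Map<Character, String> ALLOWED_NEXT = new LinkedHashMap<>();
    // Annotations on which a word is allowed to end in this toy grammar.
    private static final String ACCEPTING = "s";

    static {
        TOKENS.put("meth", 'a'); // 'a' stands in for alkaneStem
        TOKENS.put("eth", 'a');
        TOKENS.put("yl", 's');   // 's' stands in for inlineSuffix
        ALLOWED_NEXT.put('^', "a"); // a word may start with a stem, never with yl
        ALLOWED_NEXT.put('a', "s"); // a stem may be followed by a suffix
        ALLOWED_NEXT.put('s', "");  // nothing may follow the suffix here
    }

    /** Returns one annotation letter per token, or null if the word is rejected. */
    static String annotate(String word) {
        StringBuilder annotations = new StringBuilder();
        char state = '^';
        int pos = 0;
        while (pos < word.length()) {
            boolean advanced = false;
            for (Map.Entry<String, Character> entry : TOKENS.entrySet()) {
                String token = entry.getKey();
                char annotation = entry.getValue();
                boolean matches = word.startsWith(token, pos);
                boolean allowed = ALLOWED_NEXT.get(state).indexOf(annotation) >= 0;
                if (matches && allowed) {
                    annotations.append(annotation);
                    state = annotation;
                    pos += token.length();
                    advanced = true;
                    break;
                }
            }
            if (!advanced) {
                return null; // no token fits the grammar at this position
            }
        }
        // The whole word was consumed; it must also end in an accepting state.
        return ACCEPTING.indexOf(state) >= 0 ? annotations.toString() : null;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"ethyl", "yl", "eth"}) {
            System.out.println(word + " -> " + annotate(word));
        }
    }
}
```

In this sketch the annotation of the previous token is the only state carried forward, which is what lets the grammar reject a word that starts with yl even though yl is a known token.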
Method Summary
Modifier and Type | Method | Description
ParseRulesResults | getParses(java.lang.String chemicalWord) | Determines the possible annotations for a chemical word. Returns a list of parses and how much of the word could not be interpreted.
Method Detail
getParses
public ParseRulesResults getParses(java.lang.String chemicalWord) throws ParsingException
Determines the possible annotations for a chemical word. Returns a list of parses and the part of the word that could not be interpreted.

Usually the list will contain only one parse and the string will equal "". For something like ethyloxime, the list will contain the parse for ethyl and the string will equal "oxime", as it was unparsable. For something like eth, no parses would be found and the string will equal "eth".

Parameters:
chemicalWord
Returns:
Results of parsing
Throws:
ParsingException
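The three cases described for getParses (fully parsed, partly parsed, not parsed) can be mimicked with a small self-contained sketch. Everything in it is hypothetical: ParseResultSketch, its Result holder with the fields parses and unparsed, and the two-stem grammar are assumptions for illustration and do not reflect the actual ParseRulesResults API or OPSIN's grammar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy illustration only: the names and grammar here are hypothetical, not OPSIN's. */
final class ParseResultSketch {

    /** Hypothetical stand-in for a results object: the parses found plus the unparsed tail. */
    static final class Result {
        final List<List<String>> parses;
        final String unparsed;

        Result(List<List<String>> parses, String unparsed) {
            this.parses = parses;
            this.unparsed = unparsed;
        }
    }

    // Toy grammar: a complete parse is one stem followed by the suffix "yl".
    private static final String[] STEMS = {"meth", "eth"};
    private static final String SUFFIX = "yl";

    /** Parses the leading stem+suffix of the word, if any, and reports what is left over. */
    static Result parse(String word) {
        List<List<String>> parses = new ArrayList<>();
        for (String stem : STEMS) {
            if (word.startsWith(stem + SUFFIX)) {
                parses.add(Arrays.asList(stem, SUFFIX));
                String rest = word.substring(stem.length() + SUFFIX.length());
                return new Result(parses, rest);
            }
        }
        // No complete parse: the whole word is reported as uninterpreted.
        return new Result(parses, word);
    }

    public static void main(String[] args) {
        for (String word : new String[] {"ethyl", "ethyloxime", "eth"}) {
            Result r = parse(word);
            System.out.println(word + " -> parses=" + r.parses + ", unparsed=\"" + r.unparsed + "\"");
        }
    }
}
```

Returning the leftover text alongside the parses, rather than failing outright, lets a caller decide how to handle the remainder, for example by attempting to interpret "oxime" separately.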