TY - JOUR
T1 - Chemical named entities recognition
T2 - A review on approaches and applications
AU - Eltyeb, Safaa
AU - Salim, Naomie
PY - 2014/4/28
Y1 - 2014/4/28
N2 - The rapid increase in the rate at which digital information is published in all disciplines has resulted in a pressing need for techniques that can simplify the use of this information. The chemistry literature is rich in information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature, then "text mining" the extracted data to determine contextual relationships, helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in the texts. In this review, we briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based, machine learning, and hybrid approaches to chemical named entity recognition, along with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted.
AB - The rapid increase in the rate at which digital information is published in all disciplines has resulted in a pressing need for techniques that can simplify the use of this information. The chemistry literature is rich in information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature, then "text mining" the extracted data to determine contextual relationships, helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in the texts. In this review, we briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based, machine learning, and hybrid approaches to chemical named entity recognition, along with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted.
KW - Chemical entities
KW - Chemical names
KW - Information extraction
UR - http://www.scopus.com/inward/record.url?scp=84901001907&partnerID=8YFLogxK
U2 - 10.1186/1758-2946-6-17
DO - 10.1186/1758-2946-6-17
M3 - Review article
AN - SCOPUS:84901001907
SN - 1758-2946
VL - 6
JO - Journal of Cheminformatics
JF - Journal of Cheminformatics
IS - 1
M1 - 17
ER -