
dc.contributor.author: Lopes, Lucelene [pt_BR]
dc.contributor.author: Vieira, Renata [pt_BR]
dc.contributor.author: Finatto, Maria José Bocorny [pt_BR]
dc.contributor.author: Martins, Daniel [pt_BR]
dc.date.accessioned: 2018-04-03T02:26:06Z [pt_BR]
dc.date.issued: 2010 [pt_BR]
dc.identifier.issn: 0104-6500 [pt_BR]
dc.identifier.uri: http://hdl.handle.net/10183/174302 [pt_BR]
dc.description.abstract: The need for domain ontologies motivates the research on structured information extraction from texts. A foundational part of this process is the identification of domain relevant compound terms. This paper presents an evaluation of compound terms extraction from a corpus of the domain of Pediatrics. Bigrams and trigrams were automatically extracted from a corpus composed by 283 texts from a Portuguese journal, Jornal de Pediatria, using three different extraction methods. Considering that these methods generate an elevated number of candidates, we analyzed the quality of the resulting terms according to different methods and cut-off points. The evaluation is reported by metrics such as precision, recall and f-measure, which are computed on the basis of a hand-made reference list of domain relevant compounds. [en]
dc.format.mimetype: application/pdf
dc.language.iso: eng [pt_BR]
dc.relation.ispartof: Journal of the Brazilian Computer Society. Rio de Janeiro, RJ. Vol. 16 (2010), p. [247]-259 [pt_BR]
dc.rights: Open Access [en]
dc.subject: Term extraction [en]
dc.subject: Ontologia [pt_BR]
dc.subject: Terminologia [pt_BR]
dc.subject: Statistical and linguistic methods [en]
dc.subject: Ontology automatic construction [en]
dc.subject: Extraction from corpora [en]
dc.title: Extracting compound terms from domain corpora [pt_BR]
dc.type: Artigo de periódico [pt_BR]
dc.identifier.nrb: 001057475 [pt_BR]
dc.type.origin: Nacional [pt_BR]



This item is licensed under a Creative Commons License
