Current ISO 24614-2:2011
Language resource management — Word segmentation of written texts — Part 2: Word segmentation for Chinese, Japanese and Korean
Publication date: 2011-08-25
The basic concepts and general principles of word segmentation as defined in ISO 24614-1 apply to Chinese, Japanese and Korean. Text needs to be segmented into tokens, words, phrases or some other types of smaller textual units in order to perform certain computational applications on language resources, such as natural language processing, information retrieval and machine translation. ISO 24614-2:2011 is restricted to the segmentation of a text into words or other word segmentation units (WSUs). This task is distinct from morphological or syntactic analysis per se, although it greatly depends on morphosyntactic analysis. It is also different from the task of laying out a framework for constructing a lexicon and identifying its lexical entries, namely lemmas and lexemes. The frameworks for the latter tasks are provided by ISO 24611, ISO 24613 and ISO 24615. ISO 24614-2:2011 specifies rules for delineating WSUs for Chinese, Japanese and Korean. Some rules are common to all three languages, though each language also has its own distinct rules for identifying WSUs. The common features are discussed, then the distinct rules are laid out for Chinese, for Japanese and for Korean.
Development information
Responsible committee: ISO/TC 37/SC 4
Similar standards, plans and regulations
Current
BS ISO 24614-2-2011
Language resource management. Word segmentation of written texts. Word segmentation for Chinese, Japanese and Korean
2013-04-30
Current
BS ISO 24614-1-2010
Language resource management. Word segmentation of written texts. Basic concepts and general principles
2010-11-30
Current
ISO 24614-1-2010
Language resource management — Word segmentation of written texts — Part 1: Basic concepts and general principles
2010-10-25
Current
GOST R ISO 24614-1-2013
Менеджмент языковых ресурсов. Пословная сегментация письменных текстов. Часть 1. Основные концепции и общие принципы
Language resource management. Word segmentation of written texts. Part 1. Basic concepts and general principles
Current
BS 10/30196857 DC
BS ISO 24614-2. Language resource management. Word segmentation of written texts. Part 2. Word segmentation for Chinese, Japanese and Korean
2010-06-24
Current
BS 09/30196484 DC
BS ISO 24614-1. Language resource management. Word segmentation of written texts for monolingual and multilingual information processing. Part 1. Basic concepts and general principles
2009-03-06