Language resource management — Feature structures — Part 2: Feature system declaration
Publication date: 2011-09-26
ISO 24610-2:2011 provides a format to represent, store or exchange feature structures in natural language applications, for both annotation and production of linguistic data. Its ultimate aim is to provide a computer format for defining a type hierarchy and for declaring the constraints that bear on a set of feature specifications and on operations over feature structures, thus offering a means to check the conformance of each feature structure with regard to a reference specification. Feature structures are an essential part of many linguistic formalisms, as well as an underlying mechanism for representing the information consumed or produced by language engineering applications.
A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that makes use of fs (that is, feature structure) elements. The FSD serves four purposes.
1) It provides an encoding by which types and their subtyping and inheritance relationships can be introduced and defined, thus laying the basis for constructing a feature system.
2) It provides a mechanism by which the encoder can list all of the feature names and feature values and give a prose description as to what each represents.
3) It provides a mechanism by which type constraints can be declared, against which typed feature structures are validated relative to a given theory stated in typed feature logic. These constraints may involve constraints on the range of a feature's value, constraints on which features are permitted within certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value pairs. The source of these constraints is normally the empirical domain being modelled.
4) It provides a mechanism by which the encoder can define the intended interpretation of underspecified feature structures. This involves defining default values (whether literal or computed) for missing features.
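The four purposes above can be illustrated with a minimal Python sketch. It is a toy in-memory model, not the serialization defined by ISO 24610-2:2011; every name in it (FeatureDecl, TypeDecl, FeatureSystem, the types "sign" and "verbal", and their features) is hypothetical and chosen only for illustration. It assumes single inheritance, atomic string values, and literal defaults only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FeatureDecl:
    """Purpose 2: a feature name, a prose description, and its value range."""
    name: str
    description: str
    allowed: frozenset            # permitted atomic values (assumed to be strings)
    default: Optional[str] = None  # purpose 4: literal default, if any


@dataclass
class TypeDecl:
    """Purpose 1: a type, its parent, and the features it introduces."""
    name: str
    parent: Optional[str] = None
    features: dict = field(default_factory=dict)
    # Purpose 3: each entry is a set of feature-value pairs that must not co-occur.
    forbidden: list = field(default_factory=list)


class FeatureSystem:
    def __init__(self):
        self.types = {}

    def declare(self, decl):
        if decl.parent is not None and decl.parent not in self.types:
            raise ValueError(f"unknown parent type {decl.parent!r}")
        self.types[decl.name] = decl

    def lineage(self, type_name):
        """Declarations from the root type down to type_name."""
        chain, current = [], type_name
        while current is not None:
            decl = self.types[current]
            chain.append(decl)
            current = decl.parent
        return list(reversed(chain))

    def features_of(self, type_name):
        """Own and inherited feature declarations; subtypes override supertypes."""
        merged = {}
        for decl in self.lineage(type_name):
            merged.update(decl.features)
        return merged

    def complete(self, type_name, fs):
        """Purpose 4: fill in declared defaults for missing features."""
        out = dict(fs)
        for name, fdecl in self.features_of(type_name).items():
            if name not in out and fdecl.default is not None:
                out[name] = fdecl.default
        return out

    def validate(self, type_name, fs):
        """Purpose 3: return a list of constraint violations (empty if valid)."""
        errors = []
        decls = self.features_of(type_name)
        for name, value in fs.items():
            if name not in decls:
                errors.append(f"feature {name!r} not permitted on type {type_name!r}")
            elif value not in decls[name].allowed:
                errors.append(f"value {value!r} outside the range of {name!r}")
        for decl in self.lineage(type_name):
            for combo in decl.forbidden:
                if all(fs.get(k) == v for k, v in combo.items()):
                    errors.append(f"forbidden co-occurrence: {combo}")
        return errors


# Hypothetical two-type system: "verbal" is a subtype of "sign".
system = FeatureSystem()
system.declare(TypeDecl("sign", features={
    "number": FeatureDecl("number", "grammatical number",
                          frozenset({"singular", "plural"}), default="singular"),
}))
system.declare(TypeDecl("verbal", parent="sign", features={
    "tense": FeatureDecl("tense", "grammatical tense", frozenset({"past", "present"})),
    "mood": FeatureDecl("mood", "grammatical mood",
                        frozenset({"indicative", "imperative"})),
}, forbidden=[{"mood": "imperative", "tense": "past"}]))

print(system.complete("verbal", {"tense": "past"}))
print(system.validate("verbal", {"tense": "past", "mood": "imperative"}))
```

In this sketch an underspecified structure such as {"tense": "past"} is completed with the inherited default for "number", while a structure combining imperative mood with past tense is reported as violating a co-occurrence constraint. Computed defaults and the full typed feature logic described in the standard are deliberately left out.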
The scheme described in ISO 24610-2:2011 may be used to document any feature system, but is primarily intended for use with the typed feature structure representation defined in ISO 24610-1. The feature structure representations of ISO 24610-1 specify data structures that are subject to the typing conventions and constraints specified using ISO 24610-2:2011. The feature structure representations of ISO 24610-1 are also used within some of the elements defined in ISO 24610-2:2011.