Character Description Language

From Wenlin Guide
Revision as of 20:48, 17 January 2015 by Rscook (talk | contribs)

Appendix G of the Wenlin User’s Guide

This appendix introduces features relating to Wenlin Institute’s Character Description Language (CDL), a powerful font and character description technology.

CDL powers Wenlin's Stroking Box (described in Chapter 7) and the Song Hanzi font (Chapter 13), as well as the Brush Tool (Chapter 4) and the list of characters by stroke count (Chapter 6). After you choose Song Hanzi (CDL) or Plain Hanzi (CDL) in the Wenlin Font Menu, the Chinese text you see in Wenlin is rendered using CDL font technology. Likewise, after you choose Monospace Pinyin in the Wenlin Font Menu, all Pinyin, English, and most other Roman text you see in Wenlin is also rendered using CDL.

The following image shows some variants of the character 𤁉[U+24049] / 漢[U+6F22] / 汉[U+6C49] rendered in the CDL Stroking Box.

[Image: Han4s.png]

CDL has always been a part of Wenlin, but the underlying language was invisible until Wenlin version 4.0. Now it is possible to view and manipulate the CDL description for any character that can be viewed in the Stroking Box. To do so, choose Advanced Options from the Options menu and turn on the option labeled Enable advanced CDL (Character Description Language) features. Then, whenever you view a character in the Stroking Box, there will be a checkbox labeled advanced; when it is checked, additional buttons become available. These buttons include:

  • CDL: to display the character's description in XML format.
  • Points: to show the control points for manipulating the arrangement of strokes and components.
  • Strokes: to convert the description into one that uses only <stroke> elements, not <comp> elements.
  • Scale: to ensure, when editing, that the coordinates fit the entire grid.
  • EPS: to convert the character glyph into Encapsulated PostScript, an outline usable in graphics programs.
  • SVG: to convert the character glyph into Scalable Vector Graphics, an outline usable in web browsers and other programs.
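
As a rough illustration of what the Strokes conversion involves, the following Python sketch expands <comp> elements into <stroke> elements recursively. Only the <stroke> and <comp> element names come from the list above. The <cdl> wrapper, the char, type and points attributes, the 128-unit grid, the stroke type name, and the sample descriptions are all assumptions made for illustration; this is not Wenlin's actual data or algorithm.

```python
import xml.etree.ElementTree as ET

# Assumed size of the coordinate grid (hypothetical).
GRID = 128.0

# Hypothetical descriptions; attribute names, stroke types and coordinates
# are illustrative assumptions, not data taken from cdl.wenlindb.
SAMPLE_DB = {
    "二": '<cdl char="二"><stroke type="h" points="20,40 108,40"/>'
          '<stroke type="h" points="8,100 120,100"/></cdl>',
    "三": '<cdl char="三"><stroke type="h" points="16,16 112,16"/>'
          '<comp char="二" points="0,48 128,128"/></cdl>',
}


def parse_points(text):
    """Turn 'x,y x,y ...' into a list of (x, y) float pairs."""
    return [tuple(float(v) for v in pair.split(",")) for pair in text.split()]


def flatten(char, db, box=((0.0, 0.0), (GRID, GRID))):
    """Return (type, points) strokes for char, mapped into the given box."""
    (x0, y0), (x1, y1) = box
    sx, sy = (x1 - x0) / GRID, (y1 - y0) / GRID
    strokes = []
    for child in ET.fromstring(db[char]):
        # Map this element's points from the full grid into the current box.
        pts = [(x0 + x * sx, y0 + y * sy)
               for x, y in parse_points(child.get("points"))]
        if child.tag == "stroke":
            strokes.append((child.get("type"), pts))
        elif child.tag == "comp":
            # Assumption: a component's two points are its bounding box,
            # and the component's own description fills the whole grid.
            strokes.extend(flatten(child.get("char"), db, (pts[0], pts[1])))
    return strokes


for stroke_type, points in flatten("三", SAMPLE_DB):
    print(stroke_type, points)
```

The recursion means a component may itself contain components; each level simply rescales its children into the box it was given.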

All of the CDL descriptions provided by Wenlin are copyright © 2012 Wenlin Institute, Inc., All Rights Reserved. To use CDL in your applications and publications, contact Wenlin Institute. Conventional fonts can be exported from CDL, and the CDL Engine and Database are available for licensing.

The CDL descriptions are stored in a file named “cdl.wenlindb” in the “W4DB” folder; it has an index file named “cdl.wenlintree” (in the same folder) with which it must stay synchronized. Another index file, “bihua.wenlintree”, is also in the same folder; it is derived from “cdl.wenlindb” by indexing all character descriptions by their strokes, and is used by the Brush Tool and the list of characters by stroke count.
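
If you want to confirm that all three files are present, a small check along the following lines may help. This is only a sketch: the file names are those given above, but the function name is hypothetical, the location of the “W4DB” folder on your system must be supplied by you, and the check tests only for presence. It cannot tell whether the index files are actually synchronized with the database.

```python
from pathlib import Path

# File names as given in the text above.
CDL_FILES = ("cdl.wenlindb", "cdl.wenlintree", "bihua.wenlintree")


def missing_cdl_files(w4db_folder):
    """Return the names of the expected CDL files not found in the folder."""
    folder = Path(w4db_folder)
    return [name for name in CDL_FILES if not (folder / name).is_file()]
```

An empty list means all three files were found in the folder you passed in.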

You can watch a video demonstration of CDL.

The CDL website contains pointers to various CDL resources. The Wenlin User’s Guide is gradually being augmented with more information about the CDL specification, and about the interface for creating and modifying CDL descriptions. The Specification for CDL and the Set of Basic Stroke Types together provide a nearly complete explanation of the grammar and vocabulary of Character Description Language. Experiment with the CDL Stroking Box, and let us know if you have any questions.

CDL Technology Overview

  • CDL is the engine (C source code) behind CJK Unicode megafonts, breaking the 64K glyph barrier! (A CDL font can contain an unlimited number of glyphs.)
  • CDL is an XML application, a standards-based font and encoding technology designed for precise and compact description, rendering, and indexing of all 漢/汉 Han (Chinese, Japanese, Korean, and Vietnamese = CJKV) characters, encoded and unencoded.
  • CDL is a font database containing (to date) XML/Unicode descriptions of nearly 100,000 characters, with complete Unicode 7.0 CJK character support, and more.
  • CDL adds a third dimension to the code space, with a variant mechanism for associating an unlimited number of CDL descriptions with any Unicode codepoint.
  • Each CDL description can be associated with zero or more Unicode code points, making CDL the ideal tool for extending The Unicode Standard.
  • CDL means consistent stroke/component analyses, built-in indexing and variant mappings, and high-quality graphic images as outlines convertible to SVG, PostScript, MetaFont, and more.
  • CDL is a compressed binary with an incredibly small memory footprint (~1.4 MB [1,402,091 bytes]!), suitable for limited-memory mobile devices that need full Unicode CJK support.
  • CDL technology has applications for machine learning, for handwriting recognition and input methods, for optical character recognition (OCR), and most importantly for human language-learning.
  • The basic elements of CDL are a two-dimensional coordinate space, and a set of basic stroke types. Using these simple elements, CDL provides a framework for describing characters and components, and for (recursive) reuse of character and component descriptions in the descriptions of other characters and components.
  • CDL has applications beyond CJK, for organizing information underlying the rendering of any complex script.
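
The relationship between descriptions and code points described in the list above (any number of descriptions per code point, and zero or more code points per description) can be pictured as a many-to-many mapping. The Python sketch below is one hypothetical way to model it. The class names, field names, and placeholder XML strings are assumptions for illustration and do not reflect the structure of the actual CDL database.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Description:
    """One CDL description; xml is a placeholder, not real CDL data."""
    xml: str
    codepoints: tuple = ()  # zero or more Unicode code points


class VariantIndex:
    """Maps each code point to every description associated with it."""

    def __init__(self):
        self._by_codepoint = defaultdict(list)
        self.unencoded = []  # descriptions with no code point at all

    def add(self, description):
        if not description.codepoints:
            self.unencoded.append(description)
        for cp in description.codepoints:
            self._by_codepoint[cp].append(description)

    def variants(self, codepoint):
        """Return all descriptions registered for a code point."""
        return list(self._by_codepoint.get(codepoint, []))


index = VariantIndex()
index.add(Description("<cdl>...first form of 漢...</cdl>", (0x6F22,)))
index.add(Description("<cdl>...second form of 漢...</cdl>", (0x6F22,)))
index.add(Description("<cdl>...unencoded character...</cdl>"))
print(len(index.variants(0x6F22)), len(index.unencoded))
```

Keeping unencoded descriptions in their own list reflects the point that a description does not need a Unicode code point in order to exist.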
