Character Description Language

Appendix G of the Wenlin User’s Guide

This appendix introduces features relating to Wenlin Institute’s Character Description Language (CDL), a powerful font and character description technology.

CDL powers Wenlin's stroking box (described in Chapter 7) and Song Hanzi font (Chapter 13), as well as the brush tool (Chapter 4) and the list of characters by stroke count (Chapter 6).

CDL has always been a part of Wenlin, but the underlying language was invisible until Wenlin version 4.0. Now it is possible to view and manipulate the CDL description for any character that can be viewed in the stroking box. To do so, choose Advanced Options from the Options menu and turn on the option labeled Enable advanced CDL (Character Description Language) features. Then, whenever you view a character in the stroking box, a checkbox labeled advanced appears; checking it makes additional buttons available. These buttons include:

  • CDL: to display the character's description in XML format.
  • Points: to show the control points for manipulating the arrangement of strokes and components.
  • Strokes: to convert the description into one that uses only <stroke> elements, not <comp> elements.
  • Scale: to ensure that the coordinates fit the entire grid, when editing.
  • EPS: to convert the character glyph into Encapsulated PostScript, an outline usable in graphics programs.
  • SVG: to convert the character glyph into Scalable Vector Graphics, an outline usable in web browsers and other programs.
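
To give a rough idea of what the Strokes button does, here is a minimal Python sketch that expands <comp> elements into <stroke> elements. Only the <stroke> and <comp> element names come from this appendix; the attribute names (char, type, points), the 128-unit grid, the stroke-type abbreviations, and every coordinate are assumptions made up for illustration. The Specification for CDL remains the authority on the real grammar.

```python
import xml.etree.ElementTree as ET

GRID = 128  # assumed size of the coordinate grid (illustrative)

# Hypothetical descriptions; all coordinates are invented for this sketch.
DESCRIPTIONS = {
    "木": """<cdl char="木">
        <stroke type="h" points="0,32 128,32"/>
        <stroke type="s" points="64,0 64,128"/>
        <stroke type="p" points="64,32 0,128"/>
        <stroke type="n" points="64,32 128,128"/>
    </cdl>""",
    "林": """<cdl char="林">
        <comp char="木" points="0,0 64,128"/>
        <comp char="木" points="64,0 128,128"/>
    </cdl>""",
}


def parse_points(text):
    """Turn 'x,y x,y ...' into a list of (x, y) float tuples."""
    return [tuple(float(v) for v in pair.split(",")) for pair in text.split()]


def flatten(char, box=((0, 0), (GRID, GRID))):
    """Return (stroke_type, points) pairs with every <comp> expanded.

    `box` is the rectangle (top-left, bottom-right) that the description's
    full grid is mapped into.
    """
    (x0, y0), (x1, y1) = box
    sx, sy = (x1 - x0) / GRID, (y1 - y0) / GRID

    def place(point):
        x, y = point
        return (x0 + x * sx, y0 + y * sy)

    strokes = []
    for element in ET.fromstring(DESCRIPTIONS[char]):
        points = parse_points(element.get("points"))
        if element.tag == "stroke":
            strokes.append((element.get("type"), [place(p) for p in points]))
        elif element.tag == "comp":
            # A component's two points are treated as its bounding box;
            # recurse so that nested components are expanded as well.
            inner_box = (place(points[0]), place(points[1]))
            strokes.extend(flatten(element.get("char"), inner_box))
    return strokes


for stroke_type, points in flatten("林"):
    print(stroke_type, points)
```

In this sketch the description of 林 contains no strokes of its own: it reuses the description of 木 twice, and flattening yields eight strokes, four squeezed into each half of the grid.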

All the CDL descriptions provided by Wenlin are copyright © 2012 Wenlin Institute, Inc., All Rights Reserved. For permission to use the descriptions in other applications or publications, contact Wenlin Institute.

The CDL descriptions are stored in a file named “cdl.wenlindb” in the “W4DB” folder; it has an index file named “cdl.wenlintree” (in the same folder) with which it must stay synchronized. Another index file, “bihua.wenlintree”, is also in the same folder; it is derived from “cdl.wenlindb” by indexing all character descriptions by their strokes, and is used by the brush tool and by the listing of characters by stroke count.
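
The idea of deriving a stroke index from the descriptions can be sketched in a few lines of Python. This is an in-memory illustration only: it says nothing about the actual format of “bihua.wenlintree”, and the stroke-type abbreviations (h, s, p, n) and the function name build_indexes are assumptions chosen for the example.

```python
from collections import defaultdict

# Stroke sequences as they might look after each description has been
# reduced to strokes only (abbreviations are assumed, not Wenlin's data).
STROKES = {
    "一": ["h"],
    "二": ["h", "h"],
    "十": ["h", "s"],
    "人": ["p", "n"],
    "木": ["h", "s", "p", "n"],
}


def build_indexes(strokes_by_char):
    """Index characters by stroke count and by exact stroke sequence."""
    by_count = defaultdict(list)
    by_sequence = defaultdict(list)
    for char, sequence in strokes_by_char.items():
        by_count[len(sequence)].append(char)
        by_sequence[tuple(sequence)].append(char)
    return by_count, by_sequence


by_count, by_sequence = build_indexes(STROKES)
print(by_count[2])              # characters written with two strokes
print(by_sequence[("h", "s")])  # characters written as h followed by s
```

Because both indexes are computed from the descriptions, they must be rebuilt whenever a description changes, which is why a derived index has to stay in step with the file it was built from.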

You can watch a video demonstration of CDL.

For more information about CDL, please visit the CDL Project Website. In particular, the Specification for CDL and the Set of Basic Stroke Types together provide a nearly complete explanation of the grammar and vocabulary of Character Description Language.

We will eventually provide more information about the interface for creating and modifying character descriptions. In the meantime, you can experiment in the stroking box, and let us know if you have any questions.

A few more highlights of CDL:

  • CDL is the engine (C source code) behind CJK Unicode megafonts, breaking the 64K glyph barrier! (A CDL font can contain an unlimited number of glyphs.)
  • CDL is an XML application, a standards-based font and encoding technology designed for precise and compact description, rendering, and indexing of all 漢/汉 Han (Chinese, Japanese, Korean, and Vietnamese = CJKV) characters, encoded and unencoded.
  • CDL is a font database containing (to date) XML/Unicode descriptions of over 82,000 characters, complete Unicode 6.0 CJK character support, and more.
  • CDL adds a third dimension to the code space, with a variant mechanism for associating an unlimited number of CDL descriptions with any Unicode codepoint.
  • Each CDL description can be associated with zero or more Unicode code points, making CDL the ideal tool for extending The Unicode Standard.
  • CDL means consistent stroke/component analyses, built-in indexing and variant mappings, and high-quality graphic images as outlines convertible to SVG, PostScript, MetaFont, and more.
  • CDL is a compressed binary with an incredibly small memory footprint (~1.4 MB [1,402,091 bytes]!), suitable for use in limited-memory mobile devices that want full Unicode CJK support.
  • CDL technology has applications for machine learning, for handwriting recognition and input methods, for optical character recognition (OCR), and most importantly for human language-learning.
  • The basic elements of CDL are a two-dimensional coordinate space, and a set of basic stroke types. Using these simple elements, CDL provides a framework for describing characters and components, and for (recursive) reuse of character and component descriptions in the descriptions of other characters and components.
  • CDL has applications beyond CJK, for organizing information underlying the rendering of any complex script.
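
The relationship between descriptions and code points mentioned above (any number of descriptions per code point, and zero or more code points per description) can be pictured as a simple many-to-many table. The Python sketch below is an assumption-laden illustration, not Wenlin's data model: the class name DescriptionTable, its methods, and the placeholder XML strings are all hypothetical.

```python
from collections import defaultdict


class DescriptionTable:
    """Hypothetical many-to-many table of descriptions and code points."""

    def __init__(self):
        self._descriptions = []                 # description id -> XML text
        self._by_codepoint = defaultdict(list)  # code point -> description ids

    def add(self, xml, codepoints=()):
        """Store a description; it may belong to zero or more code points."""
        description_id = len(self._descriptions)
        self._descriptions.append(xml)
        for codepoint in codepoints:
            self._by_codepoint[codepoint].append(description_id)
        return description_id

    def variants(self, codepoint):
        """Return every description for a code point, in insertion order."""
        ids = self._by_codepoint.get(codepoint, [])
        return [self._descriptions[i] for i in ids]


table = DescriptionTable()
table.add("<cdl>first form</cdl>", codepoints=[0x6728])
table.add("<cdl>second form</cdl>", codepoints=[0x6728])  # a variant
table.add("<cdl>unencoded character</cdl>")               # no code point yet

print(len(table.variants(0x6728)))
print(table.variants(0x4E00))
```

The position of a description in the list returned by variants plays the role of the extra dimension: a code point selects a list, and a variant number selects one description from it.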
