4 thoughts on “Chinese Character Web API”

  1. zitao

    Do you plan to make these Web APIs an open source project? I’m developing iPad software for learning Chinese, but I’m not sure I can rely on your web service. It would be great if I could host my own.

    1. Jens-Ingo Farley Post author

      I like the concept, but I’m not sure about licensing. The API is exposing data from unicode.org, and though the data is free to use as long as you don’t charge for any derivatives, it’s not open source in the sense of people being able to contribute back to it.

      So, there are two things: the data itself and the method of access. I’ve been experimenting with the method of access, and I thought a Web API was kind of cool. However, I have to admit that for performance and reliability, I think most app developers would probably want a local DB, trimmed down to just what they need.
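      To illustrate the "local DB, trimmed down to just what they need" idea, here is a minimal sketch in Python. It assumes the input follows the tab-separated layout of the Unihan data files (`U+XXXX`, field name, value, with `#` comment lines). The function names, the table layout, and the two sample lines are hypothetical, chosen only for illustration.

```python
import sqlite3

# Illustrative sample lines in the Unihan tab-separated layout.
# A real app would read these from the downloaded Unihan text files.
SAMPLE_LINES = [
    "# comment lines start with '#'",
    "U+570B\tkSimplifiedVariant\tU+56FD",
    "U+570B\tkTotalStrokes\t11",
]

def build_local_db(lines, wanted_fields, path=":memory:"):
    """Load only the wanted fields into a small SQLite table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS unihan ("
        "codepoint TEXT, field TEXT, value TEXT, "
        "PRIMARY KEY (codepoint, field))"
    )
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip anything that is not codepoint/field/value
        codepoint, field, value = parts
        if field in wanted_fields:
            rows.append((codepoint, field, value))
    conn.executemany("INSERT OR REPLACE INTO unihan VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

def lookup(conn, char, field):
    """Return the raw stored value for one character and field, or None."""
    codepoint = f"U+{ord(char):04X}"
    row = conn.execute(
        "SELECT value FROM unihan WHERE codepoint = ? AND field = ?",
        (codepoint, field),
    ).fetchone()
    return row[0] if row else None

conn = build_local_db(SAMPLE_LINES, {"kSimplifiedVariant"})
print(lookup(conn, "國", "kSimplifiedVariant"))
```

      Passing a file path instead of `":memory:"` would produce a database file that could be bundled with an app. Filtering on `wanted_fields` at load time is what keeps it small.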

      But back to the original topic, let’s say “phase 2” of this API experiment would be making it either open source or at least freely accessible in some way. Sorry I don’t have a timeline, as this is a spare-time activity.

  2. Joe Wicentowski

    Great work! Does your API make it possible to list alternate forms of a given character? For example, given 國, the Unihan database lists these alternates: 囯囶囻国圀. With this method, a web-based version of the Unihan Variant Dictionary http://www.ideographer.com/unihan/ would be possible.

    1. Jens-Ingo Farley Post author

      You mention that for 國, the Unihan database lists these alternates: 囯囶囻国圀. But if I go here:

      http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=%E5%9C%8B

      I see only three variants. Yes, that information is available via my API, as follows:

      http://ccdb.hemiola.com/characters/string/%E5%9C%8B?fields=kSemanticVariant,kZVariant,kSimplifiedVariant
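      As a minimal sketch, a request URL of this shape could be assembled in Python as follows. The path and the `fields` parameter are taken from the example URL above; the helper name `build_ccdb_url` is hypothetical, and the sketch only builds the string, without assuming the service is reachable or what it returns.

```python
from urllib.parse import quote

# Base path copied from the example URL; treat it as an assumption
# about the service rather than a documented endpoint.
BASE = "http://ccdb.hemiola.com/characters/string/"

def build_ccdb_url(text, fields):
    """Percent-encode the characters as UTF-8 and append the field list."""
    encoded = quote(text, safe="")  # "國" becomes "%E5%9C%8B"
    return f"{BASE}{encoded}?fields={','.join(fields)}"

url = build_ccdb_url(
    "國", ["kSemanticVariant", "kZVariant", "kSimplifiedVariant"]
)
print(url)
```

      The field names are joined with plain commas rather than being percent-encoded, matching the example URL.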

      As I was exploring the Unihan database, I looked at a lot of variants (beyond the basic traditional/simplified), and almost none of them made sense to me. I showed a bunch of them to a native speaker, who also could derive no meaning from most of them. My conclusion was that they were getting into esoterica that wasn’t very useful to a learner of the language, so I didn’t create any streamlined APIs for getting variants (beyond dealing with traditional/simplified).
