[R: New Features on pinyin] Convert Chinese Characters into Sijiao and Wubi codes

By Peng Zhao | October 12, 2018

What features did I add?

  • Conversion is now about four times faster.
  • At the beginning of 2018 I received an issue report from psychelzh about a polyphone error. A new pinyin library has now been added, which more or less solves the polyphone problem.
  • Chinese characters can now be converted into Sijiao codes (literally "four-corner" codes).
  • Chinese characters can also be converted into Wubi codes (literally "five-stroke" codes).
  • Some minor bugs were fixed.

Figure 1: Test the new features in RStudio IDE

How did I implement them?

  • Following Qu Cheng's suggestions in personal communications, the pylib() function now loads the pinyin library into an environment, which speeds up the conversion.
  • A new pinyin library, '/inst/lib/zh2.txt', was added, and a parameter dic = c('zh', 'zh2') in the pylib() function lets users choose which library they prefer for polyphones.
  • New functions fclib() and four_corner() import a four-corner library and convert Chinese characters into four-corner codes, according to Qu Cheng's suggestions.
  • A new function wubi() imports a five-stroke library and converts Chinese characters into five-stroke codes, again according to Qu Cheng's suggestions.
  • The downstream functions bookdown2py(), file.rename2py(), file2py() were updated to support the updates mentioned above.
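
To illustrate why an environment helps, here is a minimal sketch in base R of a character-to-code lookup backed by a hashed environment. It is not the package's actual code: toy_dict, build_lib() and to_code() are hypothetical names made up for this example, and the five dictionary entries are toy data rather than the contents of the real library files.

```r
# Toy dictionary (hypothetical data, not the package's library file).
toy_dict <- c("春" = "chun1", "眠" = "mian2", "不" = "bu4",
              "觉" = "jue2", "晓" = "xiao3")

# Hypothetical helper: copy a named character vector into a hashed
# environment, so that each lookup is a hash lookup by key.
build_lib <- function(dict) {
  lib <- new.env(hash = TRUE, parent = emptyenv())
  for (key in names(dict)) {
    assign(key, dict[[key]], envir = lib)
  }
  lib
}

# Hypothetical helper: convert a string character by character.
# Characters that are not in the library pass through unchanged.
to_code <- function(x, lib, sep = "_") {
  chars <- strsplit(x, "")[[1]]
  codes <- vapply(chars, function(ch) {
    if (exists(ch, envir = lib, inherits = FALSE)) {
      get(ch, envir = lib, inherits = FALSE)
    } else {
      ch
    }
  }, character(1), USE.NAMES = FALSE)
  paste(codes, collapse = sep)
}

lib <- build_lib(toy_dict)
result <- to_code("春眠不觉晓", lib)
print(result)
```

The same pattern would apply to any one-code-per-character table, such as a four-corner or five-stroke library: only the dictionary passed to build_lib() changes, while the lookup function stays the same.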

Each part of the functions is documented. Other files were updated automatically during compilation.

Links to the relevant lines of code on GitHub can be found mainly in my latest commit (click to see the details):
