B4A Library B4A Pinyin - B4A拼音

B4A Pinyin - Chinese to Pinyin (B4A拼音)

Pinyin4j is a popular Java library supporting conversion between Chinese characters and most popular Pinyin systems. This wrapper was requested here.

I based my wrapper on a GitHub project found here.
Kudos to the original authors.

Many thanks also to @xky for helping with the testing of the wrapper

Usage:
Add the attached libraries to your Additional Libraries folder and refresh the Libraries pane in B4A. Then take a look at the attached sample project. I have also attached the Java wrapper source in case someone wants to modify it.

This is a useful link for further information:
http://pinyin4j.sourceforge.net/pin.../pinyin4j/format/HanyuPinyinOutputFormat.html

Note: methods present in previous versions of this wrapper are still included for compatibility. However, they should be considered Deprecated and you should now use the new method ConvertToPinyin().
Note2: the old method converterToFirstSpell() can still be useful if you need a pinyin output in an abbreviated format.
Note3: this library works fine with B4J as well.

B4Apinyin
Author:
moster67/Mikael Osterhed
Version: 1.3
Pinyin
Fields:

  • DO_NOT_USE_VCHARTYPE As Int
  • LOWERCASE As Int
  • UPPERCASE As Int
  • WITHOUT_TONE As Int
  • WITH_TONE_MARK As Int
  • WITH_TONE_NUMBER As Int
  • WITH_U_AND_COLON As Int
  • WITH_U_UNICODE As Int
  • WITH_V As Int
Methods:
  • ConvertToPinyin (chinese As String, caseType As Int, toneType As Int, vCharType As Int) As String
    ConvertToPinyin() requires four parameters:
    1) chinese: the Chinese logogram (word) to convert (String)
    2) caseType to apply. Can be LOWERCASE or UPPERCASE
    3) toneType to apply. Can be WITHOUT_TONE, WITH_TONE_MARK or WITH_TONE_NUMBER
    4) vCharType to apply. Can be WITH_U_UNICODE, WITH_U_AND_COLON, WITH_V or DO_NOT_USE_VCHARTYPE
    Note: combining toneType WITH_TONE_MARK with vCharType WITH_U_AND_COLON or WITH_V does not work.
    This is intentional and produces blank output.
    Example usage:
    Private PinyinObject As Pinyin
    Dim str2 As String = PinyinObject.ConvertToPinyin("壮丽", PinyinObject.LOWERCASE, PinyinObject.WITH_TONE_MARK, PinyinObject.WITH_U_UNICODE)
  • HanZiToPinYinWIthToneMark (hanzi As Char) As String
    Deprecated. Included only for compatibility reasons for previous versions of this wrapper.
    Please use the method ConvertToPinyin() instead.
  • converterToFirstSpell (chines As String) As String
    Deprecated. Included only for compatibility reasons for previous versions of this wrapper.
    Please use the method ConvertToPinyin() instead.
  • converterToSpell (chines As String) As String
    Deprecated. Included only for compatibility reasons for previous versions of this wrapper.
    Please use the method ConvertToPinyin() instead.


Hope it will turn out to be useful (I guess mostly for our fellow Chinese B4A-users).


Attachments

  • libs_v1_3.zip
    309.7 KB · Views: 429
  • B4APinyinSampleApp.zip
    7 KB · Views: 266
  • javasource.zip
    3.8 KB · Views: 219

xky

Member
Licensed User
Longtime User
But can you add some option for tone mark in the lib?
such as:
HanyuPinyinOutputFormat hanyuPinyin = new HanyuPinyinOutputFormat();
hanyuPinyin.setCaseType(HanyuPinyinCaseType.LOWERCASE);
hanyuPinyin.setToneType(HanyuPinyinToneType.WITH_TONE_MARK);
hanyuPinyin.setVCharType(HanyuPinyinVCharType.WITH_U_UNICODE);

Output is: [image: upload_2016-10-8_9-8-22.png]
 

xky

Great, this is a big step. Chinese has 4 tones, marked 1: -, 2: /, 3: V, 4: \. You can use TTS or an online dictionary (http://hanyu.dict.cn/只) and listen to 支, 直, 只, 至 to get a feel for them.
Next, some Chinese characters have more than one pronunciation. For example, "长" in the word "长沙" (a city in Hunan, China) is read "chang sha", but in the word "长大" (grow up) it is read "zhang da". Please see the link below; the code there lists every pronunciation of a character with multiple readings. So, can you let me list them?
https://sourceforge.net/p/pinyin4j/discussion/554159/thread/1abc1752/
 

moster67

Expert
Licensed User
Longtime User
xky said: "Great, this is a big step. Chinese has 4 tones, marked 1: -, 2: /, 3: V, 4: \. You can use TTS or an online dictionary (http://hanyu.dict.cn/只) and listen to 支, 直, 只, 至 to get a feel for them."
Do you mean HanyuPinyinToneType.WITH_TONE_NUMBER? ;)
For the second one, I will see.

The thing is that I don't understand Chinese so I cannot determine if results are correct or not.
I will try to write a "general method" where you can put your own parameters.

If you want to help, send me a list with Chinese logograms (words) and their corresponding output in Pinyin. This makes it easier for me since I know what the expected result will be and so I can copy and paste logograms.

So far the wrapper was very simple and did not require much time. If this takes longer, then I will need to do this later when I have more free time.
 

xky

Dear, I just want this:
str2 = PinyinHelper.toHanyuPinyinStringArray(test,outputFormat);
You could give me:
HanZitoPinyinWithToneMarkArray(ChineseChar)
and also:
Dim str2 As String = PinyinObject.converterToSpellwithToneMark("长沙市长")
What I wrote above is some background on spoken Chinese, not about the lib, because you said you are interested in Chinese. ;)
The list of Chinese logograms (words) and their corresponding Pinyin output is already in pinyin4j; you just need to wrap the function for B4A.
 

xky

I mean: if I give you "长", you give me
Cháng,zhǎng
and if I give you "长大", you give me
Cháng dà,zhǎng dà
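
The request above amounts to listing every combination of the readings of each character. A minimal sketch of that idea in plain Java follows. It is not the wrapper's code and does not call pinyin4j: the class name ReadingCombinations and its small hand-written table are hypothetical, and the table holds only the readings quoted in this thread (lowercased). A real version would presumably take the per-character readings from pinyin4j instead. Unicode escapes are used so the source compiles regardless of file encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of pinyin4j or the B4A wrapper.
final class ReadingCombinations {

    // Tiny hand-written table with only the readings quoted in this thread.
    static final Map<Character, List<String>> READINGS = new HashMap<>();
    static {
        // U+957F is the character read "chang2" or "zhang3".
        READINGS.put('\u957F', Arrays.asList("ch\u00E1ng", "zh\u01CEng"));
        // U+5927 is the character read "da4".
        READINGS.put('\u5927', Arrays.asList("d\u00E0"));
    }

    // Returns every combination of per-character readings, syllables separated by spaces.
    static List<String> combinations(String word) {
        List<String> results = new ArrayList<>();
        results.add("");
        for (char c : word.toCharArray()) {
            // Characters missing from the table are passed through unchanged.
            List<String> options = READINGS.get(c);
            if (options == null) {
                options = Arrays.asList(String.valueOf(c));
            }
            List<String> next = new ArrayList<>();
            for (String prefix : results) {
                for (String option : options) {
                    next.add(prefix.isEmpty() ? option : prefix + " " + option);
                }
            }
            results = next;
        }
        return results;
    }

    public static void main(String[] args) {
        // One line per input, readings joined with commas as in the post above.
        System.out.println(String.join(",", combinations("\u957F")));
        System.out.println(String.join(",", combinations("\u957F\u5927")));
    }
}
```

Note that the number of combinations grows multiplicatively with each multi-reading character, so for longer phrases most of the listed combinations will not be valid words.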
 

moster67

You didn't reply to my first question about tone numbers....
 

moster67

With list I meant that you send me a list with a selection of logograms. 20 should be OK.
 

xky

moster67 said: "You didn't reply to my first question about tone numbers...."
That means Cháng,zhǎng (tone mark) would be displayed as Chang2 Zhang3 (tone number).
In real Chinese there is no Pinyin like "Chang2 Zhang3"; tone numbers are only used for ASCII display. Now that UTF-8 is widespread, you can give only "Cháng,zhǎng" (tone mark).
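
To make the relation between the two notations concrete, here is a rough sketch in plain Java that turns a tone-number syllable such as "Chang2" into its tone-mark form. It is not taken from pinyin4j or the wrapper; the class name ToneNumberToMark is hypothetical. It assumes lowercase vowels, a single trailing digit 1 to 4 (5 is treated as the neutral tone and simply dropped, which is an assumption), and the usual placement rule: the mark goes on "a" or "e" if present, on the "o" of "ou", otherwise on the last vowel.

```java
// Hypothetical helper for illustration only; not part of pinyin4j or the B4A wrapper.
final class ToneNumberToMark {

    // Plain vowels; U+00FC is u with diaeresis.
    private static final String VOWELS = "aeiou\u00FC";

    // For each vowel above, its four marked forms in tone order 1..4.
    private static final String[] MARKED = {
        "\u0101\u00E1\u01CE\u00E0", // a
        "\u0113\u00E9\u011B\u00E8", // e
        "\u012B\u00ED\u01D0\u00EC", // i
        "\u014D\u00F3\u01D2\u00F2", // o
        "\u016B\u00FA\u01D4\u00F9", // u
        "\u01D6\u01D8\u01DA\u01DC"  // u with diaeresis
    };

    // Converts one syllable, e.g. "zhang3", to its tone-mark form.
    static String convert(String syllable) {
        if (syllable.isEmpty()) {
            return syllable;
        }
        char last = syllable.charAt(syllable.length() - 1);
        if (last < '1' || last > '5') {
            return syllable; // no tone digit: leave unchanged
        }
        String body = syllable.substring(0, syllable.length() - 1);
        int tone = last - '0';
        if (tone == 5) {
            return body; // neutral tone (assumed convention): no mark
        }
        int pos = markPosition(body);
        if (pos < 0) {
            return body; // no vowel found: nothing to mark
        }
        int vowelIndex = VOWELS.indexOf(body.charAt(pos));
        char marked = MARKED[vowelIndex].charAt(tone - 1);
        return body.substring(0, pos) + marked + body.substring(pos + 1);
    }

    // Index of the vowel that carries the mark, or -1 if there is none.
    private static int markPosition(String body) {
        int a = body.indexOf('a');
        if (a >= 0) {
            return a;
        }
        int e = body.indexOf('e');
        if (e >= 0) {
            return e;
        }
        int ou = body.indexOf("ou");
        if (ou >= 0) {
            return ou; // the "o" of "ou"
        }
        for (int i = body.length() - 1; i >= 0; i--) {
            if (VOWELS.indexOf(body.charAt(i)) >= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(convert("Chang2") + "," + convert("zhang3"));
    }
}
```

Going the other way (from mark to number) is just as mechanical, which is why keeping the tone-number option costs little.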
 

xky

moster67 said: "With list I meant that you send me a list with a selection of logograms. 20 should be OK."
我(wǒ)爱(ài)祖国(zǔguó)
祖国(zǔguó)山水(shānshuǐ)美(měi)如(rú)画(huà),锦绣(jǐnxiù)大地(dàdì)多(duō)壮丽(zhuànglì)。
滚滚(gǔngǔn)长江(chángjiāng)不尽(bújìn)流(liú),滔滔(tāotāo)黄河(huánghé)奔腾(bēnténg)急(jí)。
长城(chángchéng)蜿蜒(wānyán)数万(shùwàn)里(lǐ),青藏高原(qīngzànggāoyuán)称(chēng)屋脊(wūjǐ)。
地下(dìxià)宝藏(bǎozàng)无(wú)穷尽(qióngjìn),石油(shíyóu)花(huā)开(kāi)香(xiāng)千(qiān)里(lǐ)。
北疆(běijiāng)牛(niú)羊(yáng)肥(féi)且(qiě)壮(zhuàng),琼(qióng)地(dì)椰子(yēzi)甜(tián)如(rú)蜜(mì)。
各个(gègè)民族(mínzú)步调(bùdiào)齐(qí),团结(tuánjié)友爱(yǒuài)胜(shèng)兄弟(xiōngdì)。
五星红旗(wǔxīnghóngqí)高(gāo)飘扬(piāoyáng),台湾(táiwān)一定(yídìng)能(néng)统一(tǒngyī)。
天下兴亡(tiānxiàxīngwáng)应有(yīngyǒu)责(zé),祖国(zǔguó)和谐(héxié)是(shì)真(zhēn)理(lǐ)。
Is this ok for your test?
 

moster67

Ok for the list.
So you confirm you don't need tone numbers since utf8 is used today?
 

xky

moster67 said: "Ok for the list. So you confirm you don't need tone numbers since utf8 is used today?"
No. If this lib is not only for me, you need to keep it, because others may need tone numbers for programming. With tone numbers it is easy to find out the tone at runtime.
 

moster67

Updated the library (version 1.30) with a universal method "ConvertToPinyin()" which lets you combine caseType, toneType and vCharType.

See updated information in first post.
 