List of Chinese Characters and their Hakka Pronunciations

Discussions on the Hakka dialects.
Kobo-Daishi

List of Chinese Characters and their Hakka Pronunciations

Post by Kobo-Daishi » Wed Apr 28, 2004 10:00 pm

Dear all,

Where can I find a list of Chinese characters and their corresponding Hakka pronunciations in romanization?

I want the most complete list available for the variety of Hakka with the most speakers.

Thank you.

Kobo-Daishi, PLLA.
Dylan

Re: List of Chinese Characters and their Hakka Pronunciation

Post by Dylan » Thu Apr 29, 2004 6:40 am

The one listed on my site covers a little over 9,500 characters, some of them with multiple pronunciations, giving around 11,400 readings in all. The rest are symbols and characters not in Big5.

http://www.sungwh.freeserve.co.uk/sapienti/hagfa99b.htm

This was the primary source for the Hakka annotator:

http://www.sungwh.freeserve.co.uk/misc/dialect.htm

BTW, I have finished a version of a Sino-Viet annotator, for Big5. I've not placed it online yet. It has some 8,500 readings, if you're interested...

Dyl.
Kobo-Daishi

Re: List of Chinese Characters and their Hakka Pronunciation

Post by Kobo-Daishi » Thu Apr 29, 2004 8:07 am

Dear Dylan,

Thanks for the information.

Have you thought about expanding the lists to include all the characters in the Unicode character set?

For instance,

? [giam4]
剑 [giam4]
? [giam4]
? [giam4]
?? [giam4]
? [giam4]
? [giam4]

and so forth.
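
As a rough illustration of the idea, a reading list keyed by Unicode characters rather than by Big5 codes could be sketched as below. This is only a minimal sketch: the `READINGS` table and the `annotate` function are hypothetical names, and the single entry is the one reading quoted above; a real list would need an entry for every variant form.

```python
# Hypothetical reading table keyed by Unicode character.
# Only this one entry (U+5251 -> giam4) is taken from the post above.
READINGS = {
    "剑": ["giam4"],
}


def annotate(text: str) -> str:
    """Append the known Hakka reading(s) in brackets after each character.

    Characters without an entry in READINGS are passed through unchanged.
    """
    out = []
    for ch in text:
        readings = READINGS.get(ch)
        if readings:
            out.append(f"{ch} [{'/'.join(readings)}]")
        else:
            out.append(ch)
    return "".join(out)


print(annotate("剑"))
```

Because the keys are ordinary Unicode strings, characters outside Big5 could be added without changing the lookup code.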

Dylan wrote:

>>BTW, I have finished a version of Sino-Viet annotator, for Big5.
>>I've not placed it on line yet. It has some 8500 readings, if
>>you're interested...

Wow. That was quick. Did you type out all the SV readings?

Can’t wait until you post the Sino-Viet annotator online.

Kobo-Daishi, PLLA.
Dylan

Re: List of Chinese Characters and their Hakka Pronunciation

Post by Dylan » Thu Apr 29, 2004 12:54 pm

The problem with Unicode text is that there are so many varieties or formats in use. There's UTF-8, UTF-7, an HTML version, and so on...

I've found that programs written to read UTF-8 text tend to corrupt it. I'm not sure why, though. Maybe they don't recognise the characters... I've noticed that characters in certain ranges take different numbers of bytes; for instance, one-, two- and three-byte characters all occur in UTF-8. That's why I decided on one of the Chinese encodings, and Big5 had the larger selection of characters.

Since there are diacritics involved, I decided to convert the Big5 characters in situ into HTML format. When you give it Big5 input text, the output is HTML-ised so that both Chinese characters and Vietnamese diacritics are displayable, without phantom characters appearing on account of the accented characters occurring in the upper ASCII range.
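
A minimal sketch of that kind of conversion, assuming Python and its standard `big5` codec (this is not the annotator's actual code, and the function names are made up for illustration): decode the Big5 bytes, then write every non-ASCII character as an HTML numeric character reference so the output is plain ASCII. The Vietnamese string is just an example of accented text.

```python
def to_html_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric reference (&#NNNN;)."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def big5_to_html(raw: bytes) -> str:
    """Decode Big5 input bytes and return pure-ASCII HTML text."""
    return to_html_entities(raw.decode("big5"))


def utf8_length(ch: str) -> int:
    """Number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


big5_input = "劍".encode("big5")            # stand-in for a Big5 input file
print(big5_to_html(big5_input))             # the character as a numeric reference
print(to_html_entities("ki\u1ebfm"))        # accented Vietnamese text, escaped the same way

# One-, two- and three-byte characters in UTF-8:
print(utf8_length("a"), utf8_length("\u00ea"), utf8_length("劍"))
```

Since the result contains only ASCII, it displays the same way whatever encoding the page is served in, which sidesteps the upper-ASCII clash described above.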

I've run out of space on the freeserve site. Will look to upload to my other site once I've finished painting the house. Spring cleaning and all that.

Dyl.