Hello,
I've got a question regarding different / localized character encodings.
In the app I'm working on, I receive some text from an internet server. I use two different transport methods to get this text: one is HTTP and the other is POP3 (mail). The text does not depend on which transport method I choose, and it always uses the same character encoding.
The text is then displayed in a TextBox control.
The text contains non-US-ASCII characters (German umlauts, for example). For the HTTP method I create the WebResponse object with an explicit code page:
B4X:
Response.New2(1252)
which works fine and as expected (Response.New1 leaves me with unprintable characters).
The text is in the ISO 8859-1 encoding, which is an 8-bit extension of 7-bit ASCII (see the Wikipedia article on ISO/IEC 8859).
Code page 1252, which I use in the response, is pretty much the same as ISO 8859-1 except for the range 0x80-0x9F, where ISO 8859-1 has control characters; that does not matter in this case (see the Wikipedia article on Windows-1252).
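To illustrate the relationship between the two encodings, here is a minimal sketch in Python (not Basic4PPC; the variable names are just for illustration). It only shows that byte F6 maps to "ö" in both encodings, and that the two differ in the 0x80-0x9F range:

```python
# Byte 0xF6 is "ö" in both ISO 8859-1 and Windows-1252.
raw = bytes([0xF6])
latin1 = raw.decode("iso-8859-1")
cp1252 = raw.decode("cp1252")

# The encodings differ in the range 0x80-0x9F.
# Example: 0x80 is the euro sign in cp1252, but a control character in ISO 8859-1.
euro_byte = bytes([0x80])
euro_cp1252 = euro_byte.decode("cp1252")
euro_latin1 = euro_byte.decode("iso-8859-1")

print(latin1, cp1252, repr(euro_cp1252), repr(euro_latin1))
```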
If I receive the same text from the POP3 mail server, I end up with unprintable characters (squares). Using a network sniffer, I can see that the bytes on the wire are encoded in exactly the same way as when the text is received through HTTP.
So let's say the text contains a German "ö", which has the hex code F6 in ISO 8859-1. Due to the lack of any code page handling in B4P (except for a few instructions such as the Response.New2 mentioned above), my plan was simply to substitute the ISO codes for the umlauts using a
text = StrReplace(text,Chr(246),"ö")
but this does not work, probably because Chr() does not know about code page 1252 or ISO 8859-1.
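One possible explanation, sketched in Python rather than Basic4PPC: if the POP3 path decodes the incoming bytes as UTF-8 (this is an assumption, I have not confirmed how the mail library decodes), the lone byte F6 is invalid and gets turned into a replacement character before my code ever sees it. In that case a replace on character 246 has nothing left to match:

```python
# Bytes as seen on the wire: "schön" in ISO 8859-1.
wire = b"sch\xf6n"

# ASSUMPTION: the receiving side decodes as UTF-8. 0xF6 is not valid UTF-8,
# so it becomes U+FFFD (the replacement character, often drawn as a square).
as_utf8 = wire.decode("utf-8", errors="replace")

# Replacing character 246 changes nothing, because the "ö" is already lost.
after_replace = as_utf8.replace(chr(246), "ö")

# Decoding the same bytes with the correct encoding keeps the umlaut.
as_latin1 = wire.decode("iso-8859-1")

print(repr(as_utf8), repr(after_replace), as_latin1)
```

If this is what happens, the fix has to be applied at decoding time, on the raw bytes, rather than on the already decoded string.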
Questions regarding that matter:
- How does Basic4PPC handle different code pages?
- Does it handle them at all, or does it rely completely on UTF-8 encoding (Chr() does not appear to be able to cope with UTF-8), or on ASCII encoding?
- What about MIME / quoted-printable encoding?
- How can I solve the problem outlined above? Manual character conversion is relatively complex and time-consuming.
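For the quoted-printable question, this is roughly the two-step conversion I have in mind, sketched in Python since I don't know an equivalent in Basic4PPC. The sample body is made up; `quopri` is Python's standard quoted-printable module:

```python
import quopri

# Hypothetical mail body: "ö" (byte F6) transferred as quoted-printable "=F6".
qp_body = b"sch=F6n"

# Step 1: undo the quoted-printable transfer encoding to get the raw bytes back.
raw_bytes = quopri.decodestring(qp_body)

# Step 2: decode the raw bytes with the character set the mail declares
# (assumed here to be ISO 8859-1, as in my case).
text = raw_bytes.decode("iso-8859-1")

print(text)
```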
Kind regards
TWELVE