Read a txt file and Spanish chars not displayed correctly

billy047

Member
Licensed User
Longtime User
Spanish characters are not displayed correctly; a <?> character is shown instead.
I have a .txt file that contains accented characters. When I read the file, those characters are not displayed correctly.

I saved the file with ANSI encoding and the data is read, but with that problem.

I tried saving the text file as UTF-8, but the program could not read the file. I also tried saving it as Unicode, and that could not be read either.

Is there any way to do encoding conversion?

Thank you and have a nice day

Guillermo
 

billy047

Member
Licensed User
Longtime User
You should use UTF-8. If it doesn't work please upload the text file.

Erel, thanks for your quick response. I'm now using UTF-8 and it works well.
When I first changed from ANSI to UTF-8, I thought the text file could not be read, because:
1. The variable that receives the data (a number) is an Int variable, and the assignment raises an error.
That error did not occur with the file in ANSI.

2. In fact, the data is read, but an error is generated when trying to convert it to an integer.

Dim numreg As Int
numreg = TextReader1.ReadLine

I also tried it this way:
Dim numreg As Int
Dim strNum As String
strNum = TextReader1.ReadLine
numreg = strNum
but this also generates the error.

I do not understand why that conversion should not be possible.

I have attached a test zip file with example code, which includes the UTF-8 file.

Thanks and best regards
Guillermo
 

Attachments

  • TestErrInteger.zip
    6.4 KB · Views: 377

Erel

B4X founder
Staff member
Licensed User
Longtime User
Your file starts with a UTF-8 BOM (see "Byte order mark" on Wikipedia).

Add Log(strNum.Length) and you will see that the length is 4 instead of 3. The BOM causes this problem.
I'm using Notepad++, which I believe is the best Windows text editor. It allows you to get rid of the BOM:
(screenshot: notepadplus.png)

Selecting "Convert to UTF-8 without BOM" solves this problem.

Another solution is to read (and discard) the first three bytes of the file and only then use TextReader.
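
To illustrate the two points above, here is a minimal sketch in Python (not B4A, chosen only so the behaviour can be checked easily). It assumes a file whose first line is "123" preceded by the three-byte UTF-8 BOM (EF BB BF). The helper name strip_utf8_bom and the sample data are hypothetical, not taken from the attached project.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical file content: BOM followed by the number on the first line.
raw = codecs.BOM_UTF8 + b"123\n"

# Decoding as plain UTF-8 keeps the BOM as an invisible U+FEFF character.
first_line_with_bom = raw.decode("utf-8").splitlines()[0]
# Skipping the first three bytes before decoding removes it.
first_line_clean = strip_utf8_bom(raw).decode("utf-8").splitlines()[0]

print(len(first_line_with_bom))  # 4: the BOM character plus "123"
print(len(first_line_clean))     # 3
print(int(first_line_clean))     # the conversion now succeeds
```

The extra invisible character is why the string looks like "123" in the log but cannot be converted to a number.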
 

billy047

Member
Licensed User
Longtime User
Erel, thanks again. Windows Notepad was generating the BOM; I'm now using Notepad++ and everything works fine.

I wonder how you manage to answer so many questions, develop examples, work on B4A and much more, when the day has only 24 hours. Thanks for your effort and your commitment to us all.

Greetings.

Guillermo
 

Rusty

Well-Known Member
Licensed User
Longtime User
Read a txt file with Spanish chars not displayed correctly

Erel,
I followed your instructions and saved the file without a BOM. I included the new file in my DirAssets. When I read the file from DirAssets, it works fine. However, when I copy the file from DirAssets to File.DirDefaultExternal and then use that copy, it does not work.
If File.Exists(File.DirAssets, Filename & ".txt") Then
File.Copy(File.DirAssets, Filename & ".txt", SDCard.Data, Filename & ".txt")
...
I then go through the same read code:
Dim Reader As TextReader
Reader.Initialize(File.OpenInput(File.{either DirAssets or DirDefaultExternal}, Filename & ".txt"))
ConsentText = Reader.ReadAll

When I copy the file and then read it, I get question marks in place of the Spanish characters.
When I read it from DirAssets, it works fine.
Any ideas? I want to have the file on my SD card and not in DirAssets.
Thanks,
 

manios

Active Member
Licensed User
Longtime User
If the file comes from a Windows system, try the following:


Reader.Initialize2(File.OpenInput(File.DirInternal, filename & ".txt"), "ISO-8859_1")

This converts the file's encoding while it is read. It works for me with German characters!
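
As a rough illustration of why naming the encoding helps, here is a small Python sketch (again not B4A). It assumes the file was saved by a Windows editor as "ANSI", which for Western European text is close to ISO-8859-1; the sample text is made up.

```python
# Hypothetical sample text with Spanish accented characters.
text = "año, niño, canción"

# Bytes as a Windows editor would write them when saving as ANSI / ISO-8859-1.
latin1_bytes = text.encode("iso-8859-1")

# Reading those bytes as UTF-8 fails on the accented letters; with
# errors="replace" each bad byte becomes U+FFFD, shown as a question mark.
wrong = latin1_bytes.decode("utf-8", errors="replace")

# Naming the encoding the file was really saved in recovers the text.
right = latin1_bytes.decode("iso-8859-1")

print(wrong)
print(right)
```

The same reasoning applies in the other direction: a file saved as UTF-8 must be read as UTF-8, so the encoding passed to the reader should match how the file was actually saved.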
 