B4A Library Pdf To Text

Hi all.
Pdf To Text
This library converts PDF files to plain text (.txt).
I was looking for a library that could convert a PDF file to text. Following a tip from @Johan Schoeman (thank you!), I am sharing this wrapper for itextpdf-5-5-6.jar ( https://sourceforge.net/projects/itext/ ).


pdftotext
Author: DevilApp
Version: 1
  • PdftToText
    Events:
    • onMessage (Success As String)
    Methods:
    • Initialize (EventName As String)
    • ParsePdf (filepdf As String, filetxt As String)


You need both itextpdf-5-5-6.jar and the pdftotext wrapper (attached).
That gives you 3 files:
pdftotext.xml
pdftotext.jar
itextpdf-5-5-6.jar
Copy all three files into your libraries folder.

Here is some example code (the same code is in the attached example):
B4X:
Sub Activity_Create(FirstTime As Boolean)
    'Do not forget to load the layout file created with the visual designer. For example:
    'Activity.LoadLayout("Layout1")
    File.Copy(File.DirAssets, "test-armen.pdf", File.DirRootExternal, "test-armen.pdf")
    Dim filepdf As String = File.DirRootExternal & "/test-armen.pdf"
    Dim filetxt As String = File.DirRootExternal & "/test-armen.txt"
 
    Dim pdf As PdftToText
 
    pdf.Initialize("pdf")
    pdf.ParsePdf(filepdf, filetxt)

End Sub

Sub pdf_onMessage(Success As String)
    Log("Status conversion: " & Success)
End Sub
 

Attachments

  • PdfToText-Example.zip (212.6 KB)
  • PdfToText-Library.zip (2.1 KB)
  • pdftotext-source.zip (4.1 KB)

DonManfred

no response
Uploading the PDF you want to convert could help.

Note that the result may be empty if you have just images in the pdf.
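
As a rough sketch of how that case could be detected from B4A: the handler below checks the output file once the event arrives. It assumes the onMessage event fires only after the text file has been written (the thread does not confirm this) and reuses the file names from the example in the first post.

B4X:
Sub pdf_onMessage(Success As String)
    Log("Status conversion: " & Success)
    'Assumption: the output file is complete by the time this event is raised.
    If File.Exists(File.DirRootExternal, "test-armen.txt") = False Then
        Log("No text file was created.")
        Return
    End If
    Dim extracted As String = File.ReadString(File.DirRootExternal, "test-armen.txt")
    If extracted.Trim.Length = 0 Then
        Log("The text file is empty; the PDF may contain only images.")
    Else
        Log("Extracted " & extracted.Length & " characters.")
    End If
End Sub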
 

yaqoob

Hello MarcoRome,

Thank you for providing the PDFtoText library. I have tried it, and it works perfectly for English. Is it possible to use it for other languages, or for text encoded as UTF-8?

Thank you.
 

MarcoRome

You should be able to. You can run some tests.
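
One possible way to run such a test is to convert a PDF in the target language and then log the extracted text for inspection. The helper below is hypothetical (LogExtractedText is not part of the library), and it assumes the library writes the text file as UTF-8, which is not stated in this thread.

B4X:
'Hypothetical helper: logs the converted text so non-English output can be inspected.
'Assumption: the text file produced by ParsePdf is encoded as UTF-8.
Sub LogExtractedText (FileName As String)
    Dim reader As TextReader
    reader.Initialize2(File.OpenInput(File.DirRootExternal, FileName), "UTF-8")
    Dim content As String = reader.ReadAll
    reader.Close
    Log(content)
End Sub

It could be called from pdf_onMessage, for example LogExtractedText("test-armen.txt"). If the logged characters look wrong, the file may be in a different encoding.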

 