B4J Question Reading in large amounts of data in DB

strupp01

Active Member
Licensed User
Longtime User
How can I import a file with 40,000 - 100,000 lines into a database as fast as possible? The values on each line are separated by commas.
"LOAD DATA INFILE" probably does not work. Can "LoadCSV" only load into a list, or can it load directly into a DB?
Who can help and show me the command?
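
A minimal sketch of one common approach, shown in Python with the standard-library `sqlite3` and `csv` modules rather than in B4J: read the file row by row and insert everything inside a single transaction using a parameterised statement. The table name `readings`, its three columns and the function name `import_csv` are assumptions for illustration only; the thread does not specify the schema.

```python
import csv
import sqlite3


def import_csv(conn, csv_path, table="readings", columns=("a", "b", "c")):
    """Bulk-insert all rows of a comma-separated file in one transaction.

    `table` and `columns` are hypothetical; adapt them to the real schema.
    Returns the number of rows inserted.
    """
    n = len(columns)
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_list})")
    sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"

    with open(csv_path, newline="", encoding="utf-8") as fh:
        # Pad short rows with None and cut long rows so that every row
        # supplies exactly one value per column.
        rows = ((row + [None] * n)[:n] for row in csv.reader(fh) if row)
        # `with conn:` commits once at the end (or rolls back on error),
        # instead of committing after every single row.
        with conn:
            cur = conn.executemany(sql, rows)
    return cur.rowcount
```

Calling `import_csv(sqlite3.connect("test.db"), "data.csv")` would create the table if needed and load the whole file in one commit.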
 

strupp01

Active Member
Licensed User
Longtime User
@PCastagnetti,
Thank you for your program code. Unfortunately your program crashes shortly before the end with this error message:

upload_2017-6-28_21-57-56.png
 

keirS

Well-Known Member
Licensed User
Longtime User
@keirS,
Thank you for your program code. First, I have Windows 7 64-bit.
At first your program ran in about 8 seconds. After 10-12 runs it takes more than 70 seconds, and sometimes it hangs completely. Very strange. All tests used the same file.

Are you running a 64-bit JDK? If not, try it.
 

rboeck

Well-Known Member
Licensed User
Longtime User
I tried it a second time; I changed the path of the database and the txt file. I also removed the ProgressDialog...
My timing: 427 ms between Begin- and TransactionSuccessful, 668 ms after NonQuery: true
I had no problem with the database; you can check whether the file and the records are correct.
System: 64-bit Java 8_131, 8 GB RAM, i7-4790K @ 4 GHz, system drive SSD
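
The interval reported above, from beginning the transaction to marking it successful, can be measured with a small harness like the one below. This is a Python sketch using the standard `sqlite3` and `time` modules, not the code from the attached project; the table `t`, the function `timed_insert` and the `per_row_commit` switch are hypothetical. Committing after every row is one common reason bulk inserts become slow, though nothing in this thread establishes that it is the cause here.

```python
import sqlite3
import time


def timed_insert(db_path, rows, per_row_commit=False):
    """Insert `rows` (2-tuples) into a fresh table and return elapsed seconds."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # BEGIN/COMMIT statements below control the transactions explicitly.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("DROP TABLE IF EXISTS t")
        conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
        start = time.perf_counter()
        if per_row_commit:
            # One transaction per row: every row is committed separately.
            for row in rows:
                conn.execute("BEGIN")
                conn.execute("INSERT INTO t VALUES (?, ?)", row)
                conn.execute("COMMIT")
        else:
            # One transaction for the whole batch.
            conn.execute("BEGIN")
            conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
            conn.execute("COMMIT")
        return time.perf_counter() - start
    finally:
        conn.close()
```

Running it twice against the same file-based database, once with `per_row_commit=True` and once with `False`, shows how much of the run time on a given machine is commit overhead.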
 

Attachments

  • Test_Sql1.zip
    2.6 KB · Views: 274
  • Schlaf_gut_Test.zip
    78.3 KB · Views: 329

OliverA

Expert
Licensed User
Longtime User
@strupp01, you need the latest jSQL library, which works with the new Wait For. Version 1.50 can be found here.
 

strupp01

Active Member
Licensed User
Longtime User
I downloaded the new library, unpacked it and copied it into the 'Additional Libraries' folder. I restarted B4J, but the jSQL library still shows 1.3.
What else needs to be done?

upload_2017-6-29_18-26-8.png
 

OliverA

Expert
Licensed User
Longtime User
I downloaded the new library, unpacked it and copied it into the 'Additional Libraries' folder. I restarted B4J, but the jSQL library still shows 1.3.
What else needs to be done?
See the first post (here) in the thread where you downloaded v1.50.
 

rboeck

Well-Known Member
Licensed User
Longtime User
You have lost the s of sqlite...; besides that, everything is ok.
But in the library manager you should see jSQL (version: 1.50) - it's an internal library.
 

strupp01

Active Member
Licensed User
Longtime User
Thanks to all for your help.
I've come a long way, but I'm not satisfied and will continue testing. I have reduced the import time for my approximately 400,000 records from about 36 minutes to 1.5 to 2 minutes. I can't get more out of it at the moment. Sometimes the program hangs, and sometimes it takes much longer. That probably depends on Windows, the processor and the RAM.
Thanks again for your trouble.
Greetings Strupp01
 

OliverA

Expert
Licensed User
Longtime User
How can I import a file with 40,000 - 100,000 lines as fast as possible into a database
Your program code runs in 1.2 seconds, but there is no data in the database. With my big file it aborts after about 30 seconds.
At first your program ran in about 8 seconds. After 10-12 runs it takes more than 70 seconds, and sometimes it hangs completely. Very strange. All tests used the same file.
I have reduced the import time for my approximately 400,000 records

My timing: 427 ms between Begin- and TransactionSuccessful, 668 ms after NonQuery: true
This code took 55 seconds to generate 4282824 records running B4J in debug mode.

Ok, I know you’re closing this down, but something just does not make any sense here. You went from 40K-100K records to 400K records for the import, a 4-10x increase in the amount of data to import. Even so, 400K records in a CSV file should be nothing (400K records of 1 KB each come to about 400 MB in RAM, a pittance by today’s standards). Also, unless @keirS has a typo, an import of 4 million records (40x to 100x the original requirement) took only 55 seconds in DEBUG mode. Unless there is something strange about your CSV files or something off with your code, it makes no sense that you are so much slower and have so many issues with the import. To get good help from this forum, you may want to post your complete code (unaltered) and either the same test file or a sanitized version of it (same size, same amount of data, just data you don’t have to worry about the public seeing); until then, nobody can determine what the issues are. For now, everyone is just guessing with their own code and their own data (which may be totally different from yours), and everyone may just be talking past each other.
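
A sanitized test file of the kind suggested above could be produced with a sketch like this one (Python, standard library only). It replaces every letter and digit with a random one of the same class and leaves commas, other punctuation and line breaks untouched, so the number of lines and the character length of every field stay the same. The function names `scramble` and `sanitize_file` are hypothetical, and this character-wise substitution is only one assumption about what "sanitized" could mean; it does not keep value ranges or dates valid.

```python
import random
import string


def scramble(text, rng):
    """Replace letters and digits with random ones; keep everything else."""
    out = []
    for ch in text:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # commas, dots, quotes and newlines stay as they are
    return "".join(out)


def sanitize_file(src_path, dst_path, seed=None):
    """Write a scrambled copy of src_path with identical line structure."""
    rng = random.Random(seed)
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(scramble(line, rng))
```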
 

strupp01

Active Member
Licensed User
Longtime User
Putting the test program online is no problem. But the zipped text file is 1972 KB, which is too big to put on the web. That's the problem.
 

OliverA

Expert
Licensed User
Longtime User
The zipped text file is 1972 KB, which is too big to put on the web. That's the problem.
No access to a web server, Google Drive, Dropbox, an FTP server, ... anything?
 

keirS

Well-Known Member
Licensed User
Longtime User
Ok, I know you’re closing this down, but something just does not make any sense here. You went from 40K-100K records to 400K records for the import, a 4-10x increase in the amount of data to import. Even so, 400K records in a CSV file should be nothing (400K records of 1 KB each come to about 400 MB in RAM, a pittance by today’s standards). Also, unless @keirS has a typo, an import of 4 million records (40x to 100x the original requirement) took only 55 seconds in DEBUG mode. Unless there is something strange about your CSV files or something off with your code, it makes no sense that you are so much slower and have so many issues with the import. To get good help from this forum, you may want to post your complete code (unaltered) and either the same test file or a sanitized version of it (same size, same amount of data, just data you don’t have to worry about the public seeing); until then, nobody can determine what the issues are. For now, everyone is just guessing with their own code and their own data (which may be totally different from yours), and everyone may just be talking past each other.

Not a typo. My machine is pretty fast, and that was writing to an SSD, but just a bog-standard SATA one. The CSV file I used is 127 MB, and it's just the records of the original CSV repeated.
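
A stress-test file like the one described, the original records repeated until the file is large, could be built along these lines. This is a Python sketch and not necessarily how the 127 MB file was actually produced; the function name `repeat_csv` and the `times` parameter are hypothetical.

```python
def repeat_csv(src_path, dst_path, times):
    """Write the records of src_path to dst_path `times` times in a row.

    Returns the number of lines written.
    """
    with open(src_path, encoding="utf-8", newline="") as src:
        lines = src.readlines()
    # Make sure the last record ends with a line break so that repeated
    # copies do not run together on one line.
    if lines and not lines[-1].endswith(("\n", "\r")):
        lines[-1] += "\n"
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for _ in range(times):
            dst.writelines(lines)
    return len(lines) * times
```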
 

strupp01

Active Member
Licensed User
Longtime User
I have just run the program you sent back to me. After about 30-45 seconds it aborts without an error. I am at a loss.
 

strupp01

Active Member
Licensed User
Longtime User
I have shared the file, split into several parts. Each part must be renamed to remove the .txt extension, e.g. BRP.zip.001.txt becomes BRP.zip.001.
 

Attachments

  • BRP.zip.001.txt
    439.5 KB · Views: 314
  • BRP.zip.002.txt
    439.5 KB · Views: 319
  • BRP.zip.003.txt
    439.5 KB · Views: 343
  • BRP.zip.004.txt
    439.5 KB · Views: 315
  • BRP.zip.005.txt
    213.8 KB · Views: 292
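
Instead of renaming each part by hand, the parts could be joined with a sketch like the one below (Python, standard library only). It assumes the parts are plain byte-wise slices of a single zip archive, which is how split volumes named `.001`, `.002`, ... are commonly produced; the thread does not say which tool created them. The function name `join_parts` is hypothetical.

```python
import re
from pathlib import Path


def join_parts(folder, base_name="BRP.zip"):
    """Concatenate base_name.001.txt, base_name.002.txt, ... into base_name.

    Assumes the parts are plain byte-wise slices of one archive.
    Returns the path of the reassembled file.
    """
    folder = Path(folder)
    # Accept the parts with or without the extra .txt extension.
    pattern = re.compile(re.escape(base_name) + r"\.(\d{3})(\.txt)?")
    parts = sorted(
        (p for p in folder.iterdir() if pattern.fullmatch(p.name)),
        key=lambda p: int(pattern.fullmatch(p.name).group(1)),
    )
    if not parts:
        raise FileNotFoundError(f"no parts of {base_name} found in {folder}")
    target = folder / base_name
    with open(target, "wb") as out:
        for part in parts:  # write the parts back to back, in numeric order
            out.write(part.read_bytes())
    return target
```

If the assumption holds, the resulting `BRP.zip` can be opened with any zip tool.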

OliverA

Expert
Licensed User
Longtime User
@strupp01: Code? Can you clean and then create a zip of your project file and post it?
 

keirS

Well-Known Member
Licensed User
Longtime User
There is a problem with your CSV file. It ends with a ",". I had to delete this to get my code to work. But anyway, it takes around 3.5 to 4 seconds to process the file.
As @OliverA requested, please post your code.
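
Rather than editing the file by hand, an importer could tolerate a dangling trailing comma. The sketch below (Python, standard `csv` module) drops one empty field at the end of a record and skips records that hold no data at all. The function name `clean_records` is hypothetical, and the approach assumes the last column is never legitimately empty; if it can be, this would silently shorten valid records.

```python
import csv


def clean_records(lines):
    """Yield CSV records, ignoring a dangling trailing comma and empty lines."""
    for fields in csv.reader(lines):
        # A trailing "," produces one extra empty field at the end.
        if fields and fields[-1] == "":
            fields = fields[:-1]
        # Skip lines that were blank or consisted only of commas.
        if any(fields):
            yield fields
```

Because `csv.reader` accepts any iterable of lines, an open file object can be passed to `clean_records` directly.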
 