Thanks for the encouragement - I'm learning, and that's okay. That's what I want to get to: the file is sorted, but I don't know how to create the pointer - how to say "go to the middle of the file, and if bigger, go halfway between this and the last, or if smaller, go halfway between this and the first"... I don't know if I'm being clear.
Ok, let's have some fun, shall we?
First, I can raw read + loop search the 10,000 rows in 0.04 seconds (brute force). Also, here are the load times for your sample 10,000 row file:
Now, if you look close? The read time for 10,000 rows was 0.3 seconds. So, 100,000 rows? Probably 3 seconds. 400,000 rows? About 12 by my guess.
That is not too bad. As noted, if you're doing some kind of bar code lookup? Then I would still consider a database. But let's keep this really simple.
Next up - a search/scan against the 10,000 rows. Well, let's look for the last row (brute force). And we get this:
Now, that speed REALLY did surprise me. So, 10,000 rows in a loop: time = 0.05 seconds. So, 100,000 rows should be about 0.5, and say 300,000 rows about 1.5.
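To make the brute-force approach concrete, here is a quick sketch of that loop search in Python (a sketch only - the part numbers and function name are my own examples, not from the attached project). It simply walks every row until it finds a match, which is why searching for the last row is the worst case:

```python
import time

# Build a fake 10,000-row "file" in memory (part number, description pairs).
# The real project reads these rows from a text file instead.
rows = [("P%05d" % i, "part %d" % i) for i in range(10_000)]

def brute_force_find(rows, target):
    # Walk every row until the part number matches - no sort needed.
    for idx, (pnum, _desc) in enumerate(rows):
        if pnum == target:
            return idx
    return -1

start = time.perf_counter()
idx = brute_force_find(rows, "P09999")   # worst case: the very last row
elapsed = time.perf_counter() - start
print(idx, elapsed)                      # index 9999, just a few milliseconds
```

Because the cost grows in a straight line with the row count, the extrapolation above (10x the rows = roughly 10x the time) is what you would expect from this kind of loop.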
Now, someone suggested a "pointer" search. Man, talk about putting candy in front of me? What I decided to do was to SORT the list. Now, I can use an old-style binary chop search. They are EASY to write. If you look at the first image? Note how I converted the read list into something a bit more friendly to sorting. Better yet, you can now use a "name" in your code to reference the 3 columns. Again, that sort speed IS STUNNING!!!
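The "friendlier" row format amounts to parsing each line into a little record so the columns have names, then sorting on the key column. Here is that idea sketched in Python (the column names and sample lines are my own guesses for illustration, not the project's actual layout):

```python
from collections import namedtuple

# Hypothetical 3-column row: part number, description, price.
Row = namedtuple("Row", ["PNum", "Descript", "Price"])

raw_lines = ["P0300,widget,1.50", "P0100,gadget,2.25", "P0200,gizmo,0.99"]

# Parse each CSV line into a named record...
rows = [Row(*line.split(",")) for line in raw_lines]

# ...then sort on the part-number column so a chop search can work.
rows.sort(key=lambda r: r.PNum)
print([r.PNum for r in rows])  # → ['P0100', 'P0200', 'P0300']
```

Once the rows are records, code like `OneRow.PNum` reads much better than indexing into a raw split line, and the sorted key column is exactly what the chop search below depends on.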
Now, with a binary chop search? The search time is LESS than 1/1000th of a second!!! And it still will be for 400,000 rows.
I simply could not, but could not refrain from writing that chop search. It's really simple: you have a sorted list/array, you check in the middle, and then chop the list in half based on the search value. On 400,000 rows, this SIMPLE search code will run without any delay.
I have attached a working zip project of the above. What is really nice? Well, you were VERY smart and kind to provide a sample file, so it is included in this project. (When you move up to bigger files, delete the sample text file from the project - use the Files tab in B4A.)
For those wondering about that chop search? Oh, that is a classic computer algorithm. One that I have not had to write for more than 20 years. In fact, I think the last time I did this was on an Apple II (in Pascal!!!).
Here is the search routine - it just assumes the list is in order, and given that assumption, you just chop the list in half down the middle, going left or right, until you get your answer - it is STUPID fast on this 10,000 row list.
The code (snip) for that search is this:
Sub cmdByLine_Click
.. ' (snipped setup - Lower and Upper start at 0 and MyList.Size - 1)
    Do While bolFound = False
        If Lower > Upper Then
            ' The bounds crossed - the value is not in the list.
            bolFound = False
            Exit
        End If
        Half = (Lower + Upper) / 2
        OneRow = MyList.Get(Half)
        If txtSearch.Text > OneRow.PNum Then
            Lower = Half + 1   ' search value is in the upper half
        Else If txtSearch.Text < OneRow.PNum Then
            Upper = Half - 1   ' search value is in the lower half
        Else
            bolFound = True
            intFound = Half
            Exit
        End If
    Loop
End Sub
I did snip out a bit - the routine is in the attached sample.
(It's been so long - I had to think hard about the correct way to terminate! - but the above seems to work just fine.)
(Just oh so much fun - I remember this algorithm from my computer science book at university!!
However, I'm not 100% sure my termination code is right - I think I still have that book on algorithms!)
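For anyone who wants the termination condition spelled out, here is the same chop-search idea sketched in Python (a sketch only - the list contents and function name are my own examples, not from the attached project). The loop is guaranteed to end because the low/high bounds shrink on every pass, and the search stops the moment they cross:

```python
def chop_search(sorted_keys, target):
    """Classic binary (chop) search over a sorted list.

    Returns the index of target, or -1 when it is not present.
    """
    low, high = 0, len(sorted_keys) - 1
    while low <= high:                # bounds crossing = not found
        mid = (low + high) // 2       # chop the remaining range in half
        if target > sorted_keys[mid]:
            low = mid + 1             # discard the left half
        elif target < sorted_keys[mid]:
            high = mid - 1            # discard the right half
        else:
            return mid                # found it
    return -1

parts = ["P0001", "P0500", "P2500", "P7000", "P9999"]
print(chop_search(parts, "P7000"))   # → 3
print(chop_search(parts, "P4242"))   # → -1
```

Each pass throws away half of what is left, so even 400,000 rows take at most about 19 comparisons - which is why the search time stays under a thousandth of a second.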
Now, I have attached the B4A project here. You should be able to open it in B4A, and it should just work. I avoided any libraries that are not part of B4A, and I attempted to write the code in VB/VBA style for your ease of learning.
Give the attached sample project a try. Then, from B4A, remove the 10,000 row sample and bump it up to 100,000 rows. Try that. Then try the larger 400k table. This way, you can progress up to that final large table.
However, right now?
Performance with 10,000 rows, or 100,000 rows? Not an issue.
And me? Well, in place of opening up a Christmas shopping list for 10 people in Notepad?
I will fire up a database!!!
For me, every solution is a database, since that's the hammer I carry! I do think that with a 400k list, a database would be better, since the list does NOT have to be pre-read and then pre-loaded into memory, and overall that will be a better choice. But let's push B4A - see how big your list can go before we damage the little reactor.
Anyway, do give the attached sample a try - if it works, then try a 100k text file in the project (remove the 10k one).
So our read times are less than 1 second. I would, as noted, try a 100k row file.
If that speed is ok, then go with reading the file.
If not, then we start to consider a database. But the above project, "ready as is" for you to try, should work. Have fun - I REALLY had a giggle writing this out!!
So you have options! The choices we have here CAN, I think, deal with this problem - it's just not 100% clear which road is the least effort - but B4A as a tool is MOST certainly up to this task.
Regards,
Albert D. Kallal
Edmonton, Alberta Canada