B4J Question Best way to get 15K integers from spreadsheet to B4J

SeaBee

Member
Licensed User
I have computed with Excel about 15K integers which are polynomial arguments, and which need to be entered in an array, list, table, type or whatever. They must all be available for calculations as soon as the program starts. They will be broken down into multiple subsets to simplify their use, but I do not want them in an external file or database to be loaded on program start. They must be internal to the program itself.

It will be a trivial matter to export the data to a comma delimited or tab delimited file, but I have no idea of how to handle the data in the most efficient and fast way in B4J. I don't mind reformatting the raw data to get it loaded as I can do that with macros - but I really don't want to retype it all!

I would be most grateful if someone more experienced than I with B4X could provide some suggestions or pointers.

Thanks very much!
 

SeaBee

Member
Licensed User
Just to add a little more information, I had originally planned to use Create2DArray(……) (which I have used before in B4A) and then just copy and paste blocks of lines from a csv file with an underscore at the end of each line, but too my chagrin there is no such facility in B4J.
 
Upvote 0

DonManfred

Expert
Licensed User
Longtime User
but I do not want them in an external file or database to be loaded on program start
it is a mistake to add them in code if that is your idea...

Put them in a small database (SQLite), a textfile.

Load the data at programstart (15k line reading should not effect much time).
 
Upvote 0

SeaBee

Member
Licensed User
I am loath to do that as I have spent several years developing these polynomials, and I do want to make it just a little bit harder for them to be copied. I know the final project can be hacked, but with the parts I have already written I plan to use my own obfuscator.

As I said above, I just need a series of 2D arrays containing integers ranging roughly from -100 to +100 - hardly appropriate for the overhead of a database. The data is already correctly arranged in a series csv files saved from the Excel spreadsheet. I just need to get it into a 2D array.
 
Upvote 0

emexes

Expert
Licensed User
@DonManfred has a valid argument against adding them in code, but sometimes it's a case of gotta do what you gotta do...

I have occasionally used the string functions within Excel to construct cut-and-pasteable program code that I can import directly. The polynomial coefficients will probably each compile down to 4-byte binary format floats, plus a few bytes of eg, push and multiply instructions, so you should not hit any Java or B4A code-size limit.

If the polynomial coefficients are indeed all integers -100 to 100, then you could just store them in an initialized list (or array) of byte arrays, and then feed the appropriate array into a general polynomial calculation routine, eg:

 
Last edited:
Upvote 0

emexes

Expert
Licensed User
This code:
B4X:
Sub Activity_Create(FirstTime As Boolean)

    InitPolyArray
   
    Dim UseThisPoly() As Byte = Polywaffle(17)    'arbitrary example, could be any from 0..PollyWaffle.Length-1
   
    Dim X As Double
    For X = 4.56 To 7.89 Step 0.125
        Log(X & " = " & CalcPoly(UseThisPoly, X))
    Next
   
End Sub

Sub CalcPoly(C() As Byte, X As Double) As Double
   
    Dim Y As Double = 0
   
    For I = 0 To C.Length - 1
        Y = Y * X + C(I)
    Next
   
    Return Y
   
End Sub

Sub InitPolyArray
   
    Dim Polywaffle(100) As Object
   
    Polywaffle(0) = Array As Byte(96 ,-71 ,-9 ,71 ,22 ,-116)
    Polywaffle(1) = Array As Byte(64 ,52 ,36 ,65 ,30 ,45)
    Polywaffle(2) = Array As Byte(-111 ,-89 ,-56 ,-104 ,-65 ,71)
    Polywaffle(3) = Array As Byte(-75 ,98 ,-64 ,-72 ,33 ,12)
    Polywaffle(4) = Array As Byte(118 ,42 ,14 ,-34 ,-55 ,107)
    Polywaffle(5) = Array As Byte(9 ,-99 ,-93 ,49 ,-89 ,114)
    Polywaffle(6) = Array As Byte(38 ,-43 ,-27 ,78 ,75 ,56)
    Polywaffle(7) = Array As Byte(114 ,-13 ,33 ,-80 ,16 ,89)
    Polywaffle(8) = Array As Byte(-114 ,-49 ,111 ,-51 ,59 ,118)
    Polywaffle(9) = Array As Byte(88 ,-34 ,-62 ,-34 ,-3 ,43)
    Polywaffle(10) = Array As Byte(70 ,102 ,58 ,52 ,-50 ,5)
    Polywaffle(11) = Array As Byte(39 ,99 ,78 ,29 ,86 ,97)
    Polywaffle(12) = Array As Byte(42 ,-112 ,27 ,-55 ,69 ,43)
    Polywaffle(13) = Array As Byte(-58 ,36 ,-106 ,-83 ,44 ,-106)
    Polywaffle(14) = Array As Byte(-8 ,18 ,-3 ,-77 ,69 ,115)
    Polywaffle(15) = Array As Byte(84 ,117 ,106 ,100 ,15 ,-66)
    Polywaffle(16) = Array As Byte(80 ,69 ,21 ,19 ,16 ,75)
    Polywaffle(17) = Array As Byte(98 ,-12 ,84 ,40 ,-32 ,81)
    Polywaffle(18) = Array As Byte(9 ,59 ,97 ,-50 ,-72 ,81)
    Polywaffle(19) = Array As Byte(-72 ,-85 ,114 ,-2 ,73 ,-100)
    Polywaffle(20) = Array As Byte(84 ,-96 ,-11 ,-70 ,-107 ,-5)
    Polywaffle(21) = Array As Byte(-89 ,18 ,117 ,59 ,-46 ,-19)
    Polywaffle(22) = Array As Byte(95 ,92 ,12 ,75 ,-41 ,48)
    Polywaffle(23) = Array As Byte(93 ,-30 ,-31 ,29 ,-118 ,107)
    Polywaffle(24) = Array As Byte(20 ,-87 ,98 ,109 ,-62 ,32)
    Polywaffle(25) = Array As Byte(116 ,-103 ,-95 ,15 ,68 ,-33)
    Polywaffle(26) = Array As Byte(98 ,-25 ,95 ,-52 ,62 ,-100)
    Polywaffle(27) = Array As Byte(-118 ,-13 ,43 ,65 ,-33 ,65)
    Polywaffle(28) = Array As Byte(22 ,-44 ,-99 ,43 ,17 ,-111)
    Polywaffle(29) = Array As Byte(-44 ,27 ,53 ,104 ,-64 ,-58)
    Polywaffle(30) = Array As Byte(69 ,-78 ,-43 ,34 ,99 ,31)
    Polywaffle(31) = Array As Byte(-41 ,18 ,-27 ,22 ,-25 ,94)
    Polywaffle(32) = Array As Byte(-75 ,109 ,-58 ,-95 ,-77 ,70)
    Polywaffle(33) = Array As Byte(-21 ,-88 ,-103 ,80 ,112 ,103)
    Polywaffle(34) = Array As Byte(19 ,98 ,-58 ,-12 ,108 ,85)
    Polywaffle(35) = Array As Byte(-110 ,-59 ,20 ,64 ,-57 ,-67)
    Polywaffle(36) = Array As Byte(-47 ,-117 ,29 ,-4 ,-21 ,-3)
    Polywaffle(37) = Array As Byte(-96 ,-105 ,-61 ,-21 ,29 ,87)
    Polywaffle(38) = Array As Byte(-44 ,-114 ,109 ,111 ,-6 ,-120)
    Polywaffle(39) = Array As Byte(71 ,-59 ,-62 ,-115 ,-82 ,84)
    Polywaffle(40) = Array As Byte(-53 ,23 ,-120 ,10 ,-94 ,-85)
    Polywaffle(41) = Array As Byte(114 ,-86 ,-11 ,1 ,-59 ,84)
    Polywaffle(42) = Array As Byte(46 ,87 ,-28 ,-74 ,48 ,44)
    Polywaffle(43) = Array As Byte(-19 ,23 ,-91 ,-108 ,-90 ,42)
    Polywaffle(44) = Array As Byte(-58 ,67 ,88 ,32 ,-57 ,48)
    Polywaffle(45) = Array As Byte(73 ,83 ,-115 ,-70 ,6 ,62)
    Polywaffle(46) = Array As Byte(-103 ,12 ,85 ,74 ,-1 ,80)
    Polywaffle(47) = Array As Byte(-91 ,-31 ,109 ,-33 ,-45 ,81)
    Polywaffle(48) = Array As Byte(-83 ,120 ,-108 ,-7 ,-76 ,-69)
    Polywaffle(49) = Array As Byte(30 ,104 ,56 ,88 ,114 ,41)
    Polywaffle(50) = Array As Byte(34 ,11 ,-113 ,-42 ,-93 ,-16)
    Polywaffle(51) = Array As Byte(107 ,110 ,-14 ,49 ,-25 ,-6)
    Polywaffle(52) = Array As Byte(94 ,80 ,88 ,35 ,81 ,48)
    Polywaffle(53) = Array As Byte(-15 ,44 ,-10 ,18 ,-89 ,-39)
    Polywaffle(54) = Array As Byte(-78 ,75 ,63 ,-85 ,92 ,-7)
    Polywaffle(55) = Array As Byte(15 ,35 ,96 ,-83 ,-12 ,92)
    Polywaffle(56) = Array As Byte(63 ,-38 ,30 ,56 ,70 ,8)
    Polywaffle(57) = Array As Byte(27 ,16 ,-7 ,67 ,5 ,-115)
    Polywaffle(58) = Array As Byte(-22 ,-12 ,13 ,45 ,-45 ,97)
    Polywaffle(59) = Array As Byte(-69 ,-96 ,45 ,-85 ,92 ,-45)
    Polywaffle(60) = Array As Byte(-12 ,-89 ,-42 ,120 ,-9 ,-9)
    Polywaffle(61) = Array As Byte(-11 ,-15 ,-70 ,-76 ,-29 ,-56)
    Polywaffle(62) = Array As Byte(91 ,-25 ,26 ,-11 ,-91 ,91)
    Polywaffle(63) = Array As Byte(11 ,-31 ,-35 ,-81 ,-83 ,115)
    Polywaffle(64) = Array As Byte(26 ,28 ,72 ,19 ,107 ,89)
    Polywaffle(65) = Array As Byte(99 ,56 ,93 ,-32 ,53 ,-38)
    Polywaffle(66) = Array As Byte(-41 ,-12 ,-82 ,71 ,-51 ,-23)
    Polywaffle(67) = Array As Byte(95 ,54 ,77 ,-60 ,-4 ,-90)
    Polywaffle(68) = Array As Byte(-90 ,71 ,114 ,22 ,114 ,-33)
    Polywaffle(69) = Array As Byte(27 ,99 ,66 ,-1 ,89 ,-12)
    Polywaffle(70) = Array As Byte(-50 ,101 ,112 ,46 ,69 ,17)
    Polywaffle(71) = Array As Byte(-108 ,-89 ,-72 ,13 ,34 ,55)
    Polywaffle(72) = Array As Byte(4 ,46 ,19 ,16 ,24 ,51)
    Polywaffle(73) = Array As Byte(93 ,-49 ,-15 ,-43 ,-80 ,30)
    Polywaffle(74) = Array As Byte(116 ,-97 ,-35 ,61 ,-59 ,106)
    Polywaffle(75) = Array As Byte(-23 ,95 ,16 ,18 ,101 ,-69)
    Polywaffle(76) = Array As Byte(-26 ,-65 ,71 ,60 ,43 ,63)
    Polywaffle(77) = Array As Byte(-98 ,119 ,118 ,-63 ,-13 ,56)
    Polywaffle(78) = Array As Byte(-73 ,-17 ,-54 ,26 ,6 ,105)
    Polywaffle(79) = Array As Byte(-120 ,-49 ,32 ,-52 ,-57 ,-13)
    Polywaffle(80) = Array As Byte(70 ,82 ,-12 ,-87 ,-89 ,-89)
    Polywaffle(81) = Array As Byte(116 ,15 ,51 ,52 ,-91 ,-90)
    Polywaffle(82) = Array As Byte(-43 ,-72 ,118 ,26 ,48 ,-38)
    Polywaffle(83) = Array As Byte(-107 ,-13 ,1 ,35 ,-101 ,-80)
    Polywaffle(84) = Array As Byte(-77 ,-47 ,34 ,-54 ,31 ,-113)
    Polywaffle(85) = Array As Byte(99 ,71 ,-94 ,-17 ,24 ,50)
    Polywaffle(86) = Array As Byte(-53 ,52 ,-33 ,103 ,-94 ,105)
    Polywaffle(87) = Array As Byte(-90 ,27 ,-90 ,18 ,20 ,92)
    Polywaffle(88) = Array As Byte(-32 ,7 ,16 ,38 ,42 ,96)
    Polywaffle(89) = Array As Byte(69 ,-61 ,52 ,71 ,-88 ,73)
    Polywaffle(90) = Array As Byte(-71 ,25 ,33 ,-55 ,-86 ,-106)
    Polywaffle(91) = Array As Byte(87 ,59 ,45 ,44 ,-11 ,73)
    Polywaffle(92) = Array As Byte(106 ,85 ,27 ,22 ,10 ,19)
    Polywaffle(93) = Array As Byte(-54 ,16 ,-115 ,87 ,-25 ,-18)
    Polywaffle(94) = Array As Byte(104 ,31 ,-52 ,17 ,-23 ,-23)
    Polywaffle(95) = Array As Byte(20 ,75 ,95 ,43 ,89 ,-81)
    Polywaffle(96) = Array As Byte(-6 ,-34 ,-29 ,64 ,-57 ,-45)
    Polywaffle(97) = Array As Byte(-4 ,-56 ,47 ,-77 ,-117 ,34)
    Polywaffle(98) = Array As Byte(29 ,-85 ,-71 ,-77 ,98 ,92)
    Polywaffle(99) = Array As Byte(53 ,-98 ,-99 ,-99 ,-37 ,28)

End Sub
produces this log:
B4X:
Logger connected to:  TCL 5009A
--------- beginning of main
*** Service (starter) Create ***
--------- beginning of system
** Service (starter) Start **
** Activity (main) Create, isFirst = true **
4.56 = 196762.3231054847
4.685 = 224860.08661963238
4.81 = 256096.95746420973
4.935 = 290730.13793511514
5.06 = 329030.5699766847
5.185 = 371283.29406841035
5.31 = 417787.8081116596
5.435 = 468858.42631639325
5.56 = 524824.6380878846
5.685 = 586031.4669134385
5.81 = 652839.8292491096
5.935 = 725626.8934064213
6.06 = 804786.4384390847
6.185 = 890729.2130297166
6.31 = 983883.2943765597
6.435 = 1084694.4470801994
6.56 = 1193626.4820302846
6.685 = 1311161.6152922448
6.81 = 1437800.8269940093
6.935 = 1574064.2202127273
7.06 = 1720491.3798614843
7.185 = 1877641.7315760225
7.31 = 2046094.9006014592
7.435 = 2226451.0706790057
7.56 = 2419331.342932685
7.685 = 2625378.0947560505
7.81 = 2845255.3386989096
** Activity (main) Resume **
 
Upvote 0

emexes

Expert
Licensed User
They will be broken down into multiple subsets ... I just need a series of 2D arrays containing integers ... I just need to get it into a 2D array.
Hang on... one 2D array, or many 2D arrays?

Could you post a dummy (but realistic) sample with multiple (eg, 2 or 3) subsets and polynomials? Perhaps I have gone down the wrong trail.
 
Last edited:
Upvote 0

SeaBee

Member
Licensed User
Hang on... one 2D array, or many 2D arrays?

Could you post a dummy (but realistic) sample with multiple (eg, 2 or 3) subsets and polynomials? Perhaps I have gone down the wrong trail.
Thank you for all the effort you have put into this. I find your solution extremely elegant. An array of byte arrays will fill my requirements perfectly, and the use of Excel to add the necessary strings a master stroke!

Last night I hacked up a kludge of adding a space and underscore to the end of each row of csv data, and copy/pasting the data into B4J to be sucked into a one dimensional array, and then using nested For... Step... Next loops to break it down into a 2D array, but I have yet to test it with real data.

I now intend to complete my kludge and try it with real data, and also implement your method, and have a race! Those few arrays that would exceed 100 entries can be truncated to 100 with very little loss of accuracy, and no byte arrays exceed 10 entries.

Personally, I hope your method wins, as it is so much more elegant than mine, but bearing in mind that all 54 arrays, plus 5k of simple arrays have to be loaded on start-up, speed is important.

Again, thank you for your work and very elegant method.
 
Last edited:
Upvote 0

emexes

Expert
Licensed User
It seems there is a 64kB limit on method size, and initializing byte arrays using Array As Byte(data) isn't as code-space efficient as I thought it would be.

If the data is all small integers, then a workaround is to encode each number into 2 characters, eg if you just used uppercase letters A..Z then two would give you 26x26 = 676 values = say values -338 to 337. Join it all together into a long string (perhaps using those $" "$ quotes) and then in your program translate those double-characters back to numbers and load them into the polynomial arrays.

That plan should be no worries if string literals are compiled to ASCII or UTF-8, but might get a bit squeezed if they compile to 16-bit characters.
 
Last edited:
Upvote 0

emexes

Expert
Licensed User
Excel formula to translate numbers to double-letters, eg number in cell B2968:

=CHAR(FLOOR((B2968+338)/26,1)+65)&CHAR(MOD(B2968+338,26)+65)

No min/max value checking, though, so perhaps run a MIN() and MAX() over your Excel data ranges to make sure it's all within -338 to 337. Although it'll probably still work for larger/smaller values, but with the leading character outside of A..Z

edit: slightly tidier with QUOTIENT function eg:

=CHAR(QUOTIENT(B2968+338,26)+65)&CHAR(MOD(B2968+338,26)+65)
 
Last edited:
Upvote 0

emexes

Expert
Licensed User
I used a 65512 character string literal in B4J no problem, but fell over when added another 100 characters eg 65612.
 
Upvote 0

SeaBee

Member
Licensed User
Interesting! I wrote a very primitive, inexact and smaller version of what I wish to do now - about 35 years ago! Then I had to fit it into an HP75C with 32K total memory (24k + 8k RAM). In that case I used single characters in a continuous string, where the character with ascii value 100 represented 0, and characters higher or lower represented positive or negative values respectively. These strings were declared as DATA 'zxertygunjimk' and followed by a READ statement. It worked! It has bugged me for years that I couldn't do the job properly, but now I am in semi-retirement I have the time to it right!

I am surprised that byte storage is not particularly efficient, as a java byte is two's complement, i.e. ranges from -128 to +127 - exactly the range I want as all values range from -100 to +100. Logic tells me that within the required range it should require less space than a double character. I am wondering if having an array of byte arrays is adding a lot of overhead.

I think I will proceed with my original plan and see what happens. If end up with a size problem, I will have to reconsider, and perhaps try to implement my old ascii method.

Again, many thanks for all your work on this, it is really appreciated.
 
Upvote 0

SeaBee

Member
Licensed User
I have just found out that all java 2D arrays are, in fact, arrays of arrays, row first, so either of the solutions end up using the same data structure, thus there may be very little in it timewise.

I have also just realized another problem - not only will there be a reasonably compact array of byte arrays, but there will be a much larger series of data - strings of numbers and commas - sitting in the code, which won't go away after the arrays are loaded.

I fear @DonManfred is right again - as you mentioned earlier!

I think the answer will have to be an external file containing rows of heavily obfuscated ascii character strings, which can then be translated into two's complement bytes.
 
Upvote 0

emexes

Expert
Licensed User
Then I had to fit it into an HP75C with 32K total memory (24k + 8k RAM).
With experience like that, this should be a relaxing stroll in the park.
I am surprised that byte storage is not particularly efficient
The byte storage for arrays is efficient, it's the initializing byte arrays using Array As Byte(data) that takes more of the limited 64kB-per-method code space than expected, eg, these lines in B4A:
B4X:
Dim Polywaffle() As Object = Array As Object( _
Array As Byte(96 ,-71 ,-9 ,71 ,22 ,-116), _
Array As Byte(64 ,52 ,36 ,65 ,30 ,45), _
Array As Byte(-111 ,-89 ,-56 ,-104 ,-65 ,71), _
compile to these lines in Java:
B4X:
main._polywaffle = RemoteObject.createNewArray("Object",new int[] {476},new Object[] {

    (RemoteObject.createNewArray("byte",new int[] {6},new Object[] {
        BA.numberCast(byte.class, 96),
        BA.numberCast(byte.class, -(double) (0 + 71)),
        BA.numberCast(byte.class, -(double) (0 + 9)),
        BA.numberCast(byte.class, 71),
        BA.numberCast(byte.class, 22),
        BA.numberCast(byte.class, -(double) (0 + 116))
    })),

    (RemoteObject.createNewArray("byte",new int[] {6},new Object[] {
        BA.numberCast(byte.class, 64),
        BA.numberCast(byte.class, 52),
        BA.numberCast(byte.class, 36),
        BA.numberCast(byte.class, 65),
        BA.numberCast(byte.class, 30),
        BA.numberCast(byte.class, 45)
    })),

    (RemoteObject.createNewArray("byte",new int[] {6},new Object[] {
        BA.numberCast(byte.class, -(double) (0 + 111)),
        BA.numberCast(byte.class, -(double) (0 + 89)),
        BA.numberCast(byte.class, -(double) (0 + 56)),
        BA.numberCast(byte.class, -(double) (0 + 104)),
        BA.numberCast(byte.class, -(double) (0 + 65)),
        BA.numberCast(byte.class, 71)
    })),
which I don't have a problem with as long as the BASIC code works like it should, and the Java Virtual Machine Just-In_Time compiler seems to be doing some mighty impressive optimisation - it regularly has my eyebrows shooting up in surprise. But the (doubles) make me think that it is using 8 bytes of code space just to store a single byte-sized number. Normally not a problem, because memory and disk space are bordering on free and unlimited nowadays. Plus the code gets compressed for distribution. And besides, what kind of a nutcase would write a method that gets even halfway to 64 kB anyway? ;-)

perhaps try to implement my old ascii method
If that worked before, then use that. Avoid the first 32 characters (control characters) and perhaps character 127 (delete), but that still leaves you 223 character values to use for representing the 201 coefficient values you need. Heaps easy! Perhaps even skip character 32 (space) also, so that you don't need worry about spurious trailing spaces screwing up your data alignment, and you could also then use spaces to separate the polynomial data for visual confirmation.
 
Last edited:
Upvote 0

emexes

Expert
Licensed User
there will be a much larger series of data - strings of numbers and commas - sitting in the code
It is only readable numbers and commas in the source code - once it gets compiled to Java bytecode, it is not at all obvious. Although it can be decompiled to something vaguely readable. If you use the small-integers-encoded-to-ascii-characters, then things will be harder for somebody to work out wtf is going on, and certainly no worse than to have the same string in a file.
I fear @DonManfred is right again - as you mentioned earlier!
The prospect of that being true (again) just drives me even more to:



(usually I save this victory lap until a problem is completely solved, but... desperate times call for desperate measures ;-)
 
Upvote 0

emexes

Expert
Licensed User
The StringUtils library has EncodeBase64 and DecodeBase64 functions. 25% less efficient, but 100% already working.
 
Upvote 0

Philip Chatzigeorgiadis

Active Member
Licensed User
Longtime User
If I understand the problem correctly: Saving your data in a file is a good approach (like DonManfred suggested)- but you worry about data safety.

Here is how i tackled this:
Create an algorithm to change (obfuscate or encrypt) your initial data and then save them to a file (SQLite is OK).
The algorithm can be very simple (e.g. adding or multiplying with a constant or a variable like record number) or complex, it is really up to you. But, in the end, the data saved in the file will be of no use to anyone in their saved form.
You can even add "salt", like adding dummy records (e.g. every 3rd record can be a dummy).

Then, in your program, write an algorithm to read the correct records (i.e. avoid dummy records) and then "decrypt" your data.
In this way, only your program will be able to get the correct values from the file.
 
Upvote 0

SeaBee

Member
Licensed User
The issue is not just data security - it is also how to get the data into the program, but you are right about the requirements for obfuscation. I enjoy writing obfuscation algorithms, and some of the are very mean indeed. What would you make of the entirety of Hamlet's soliloquy imbedded in a string, which was then read into meaningless substrings at various places?
 
Upvote 0
Cookies are required to use this site. You must accept them to continue using the site. Learn more…