Phew
What a long day. This morning I receive a request to consolidate 1.1GB worth of data with 511 files, excel spreadsheet files from a particular tab that has about 120+ columns and I should use a similar template to write all the consolidated rows.
Thanks to b4j and jPoi, following on Erels way of doing things, all went well, well with challenges..
Challenges & Solutions
C1. I'm running a 64 bit windows machine and I have 32 bit java jdk installed. I experienced heap size errors. The excel file sizes descend from 10MB up until 4MB at most.
S1. Install 64 Bit JDK and configure paths and in Main set
C2. jPoi reads everything on the sheet rows and columns provided it has data. I'm running a sub that builds column names from a to z... this creates about 702 columns. The maximum number of columns in excel are in the range of 16K. I know for a fact that the columns in my spreadsheets are 120-, so I need to tweak my code to read the columns to 702. There is a file that shows column size of 802 and this crashes my code.
S2. Limit the number of columns read to 701
For each row, this sub creates a map from a specified row number that will have a=data, b=data, c=data, d=data etc, where data is the actual contents of the file.
C3. We are using a template that is already defined with R1 - R5000, this has some empty rows.
S3 Because jPoi reads all available specified content, some and most of the rows are blank. On first run the database had 1M records from all consolidated records from the excel sheets, so we needed to mark some columns as required so that of the information is not available in those columns, the rows are not needed. This resulted in an overall 85K records with 120 columns and a 20MB sqlite db.
C4. Write the data back to the original template.
S4. This I quickly did manually with export to excel and then used copy and paste. My excel kept on hanging but will explore writing directly to a new template file next time.
*hand clap* for jPoi.
Ta!
PS: I'm using an excel module I wrote with consolidated methods for jPoi.
What a long day. This morning I receive a request to consolidate 1.1GB worth of data with 511 files, excel spreadsheet files from a particular tab that has about 120+ columns and I should use a similar template to write all the consolidated rows.
Thanks to b4j and jPoi, following on Erels way of doing things, all went well, well with challenges..
Challenges & Solutions
C1. I'm running a 64 bit windows machine and I have 32 bit java jdk installed. I experienced heap size errors. The excel file sizes descend from 10MB up until 4MB at most.
S1. Install 64 Bit JDK and configure paths and in Main set
B4X:
#VirtualMachineArgs: -Xms3G -Xmx3G
C2. jPoi reads everything on the sheet rows and columns provided it has data. I'm running a sub that builds column names from a to z... this creates about 702 columns. The maximum number of columns in excel are in the range of 16K. I know for a fact that the columns in my spreadsheets are 120-, so I need to tweak my code to read the columns to 702. There is a file that shows column size of 802 and this crashes my code.
B4X:
'define the column names as a list from a > zz
Sub PoiColumnNames() As List
Dim lst As List
lst.initialize
Dim alphalist1 As List = CreateList(",","a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z")
Dim alphalist2 As List = CreateList(",","a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z")
Dim l1Tot As Int = alphalist1.Size - 1
Dim l2Tot As Int = alphalist2.Size - 1
Dim l1cnt As Int
Dim l2cnt As Int
'first alphabets
For l1cnt = 0 To l1Tot
lst.Add(alphalist1.Get(l1cnt))
Next
'rest of the alphabets
For l1cnt = 0 To l1Tot
For l2cnt = 0 To l2Tot
Dim strKey As String = $"${alphalist1.Get(l1cnt)}${alphalist2.Get(l2cnt)}"$
lst.add(strKey)
Next
Next
Return lst
End Sub
S2. Limit the number of columns read to 701
B4X:
return a list of excel rows as maps, each records key is the column name
'this reads all rows and columns in the excel sheet
Sub PoiGetRows(ws As PoiSheet) As List
'define the column names list
Dim colNames As List = PoiColumnNames
Dim lst As List
lst.Initialize
Dim rowEnd As Int = ws.LastRowNumber
Dim rowStart As Int = ws.FirstRowNumber
Dim rowCnt As Int
Dim celCnt As Int
Dim celTot As Int
'loop through each row
For rowCnt = rowStart To rowEnd
Dim row As PoiRow = ws.GetRow(rowCnt)
row.IsInitialized
'how many columns are there
celTot = row.Cells.Size - 1
If celTot > 701 Then celTot = 701
'each row should be saved as a map using column alphabets
Dim rowMap As Map
rowMap.Initialize
For celCnt = 0 To celTot
'get the cellname from the list
Dim cellName As String = colNames.Get(celCnt)
'define the poicell
Dim cellP As PoiCell = row.GetCell(celCnt)
cellP.IsInitialized
'get the cell value
Dim cellValue As String = PoiGetCell(cellP)
rowMap.Put(cellName,cellValue)
Next
lst.add(rowMap)
Next
Return lst
End Sub
For each row, this sub creates a map from a specified row number that will have a=data, b=data, c=data, d=data etc, where data is the actual contents of the file.
C3. We are using a template that is already defined with R1 - R5000, this has some empty rows.
S3 Because jPoi reads all available specified content, some and most of the rows are blank. On first run the database had 1M records from all consolidated records from the excel sheets, so we needed to mark some columns as required so that of the information is not available in those columns, the rows are not needed. This resulted in an overall 85K records with 120 columns and a 20MB sqlite db.
C4. Write the data back to the original template.
S4. This I quickly did manually with export to excel and then used copy and paste. My excel kept on hanging but will explore writing directly to a new template file next time.
*hand clap* for jPoi.
Ta!
PS: I'm using an excel module I wrote with consolidated methods for jPoi.