Android Question Store fast-arriving data locally

Mike1970

Well-Known Member
Licensed User
Longtime User
Hi everyone,
I'm developing a project where I use MQTT to receive telemetry data on my device.
The MQTT broker runs on the device itself, because the system is designed to be self-sufficient and not depend on an Internet connection.

It also already forwards the telemetry to the cloud server.

The problem is:
Data arrives FAST, and IF there is NO INTERNET connection to forward it to the cloud, what's the best way (in B4X/B4A) to temporarily store this information locally and send it to the cloud later?
My concern is that SQLite may not be fast enough to handle telemetry data from tens of sensors and could cause overhead.


Thanks in advance
 

emexes

Expert
Licensed User
Longtime User
Queue it up, and if the queue buffer fills up, dilute it by throwing out every 16th sample, then every 8th sample, then every 4th sample, etc. From memory, the math wasn't that hard: we used an extra byte per sample time (i.e. per set of correlated channels, not per individual sample) that indicated how many times the data had been halved, which takes you to a lower "save ratio" than you'd ever conceivably need.

That way, let's say you've got 10 MB allocated to queuing 50 channels of 16-bit samples: that's 100,000 samples you can store locally per channel, so if your internet connection fails and it takes a day to fix, you'll still have a tad over one sample per second (per channel) of unlost data.
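
To make the thinning concrete, here is a minimal B4X sketch of one way it could work (the class layout, buffer size and keep-every-2nd-sample rule are illustrative choices, not the original system's):
B4X:
'Fixed-capacity queue that thins itself when full.
'Each thinning keeps every 2nd sample and bumps a counter, so the
'effective sample rate can be reconstructed later as Rate / 2^Halvings.
Sub Class_Globals
    Private Queue As List
    Private Halvings As Int
    Private Const MaxSamples As Int = 100000
End Sub

Public Sub Initialize
    Queue.Initialize
    Halvings = 0
End Sub

Public Sub AddSample (Sample() As Byte)
    If Queue.Size >= MaxSamples Then
        Dim thinned As List
        thinned.Initialize
        For i = 0 To Queue.Size - 1 Step 2
            thinned.Add(Queue.Get(i))
        Next
        Queue = thinned
        Halvings = Halvings + 1 'one more halving of the effective rate
    End If
    Queue.Add(Sample)
End Sub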
 
Upvote 0

emexes

Expert
Licensed User
Longtime User
Data arrives FAST

How fast? Like, for some use cases, 20 samples per second is fast.

tens of sensors

I originally read this as tons of sensors and thought ok, you must be doing train testing or similar with readings from each wheel individually or something.

Tens of sensors implies fewer than 100, which sounds much more manageable. We had an engine dyno + data acquisition system with that many actual sensors: a few were at 50 Hz (i.e. mains frequency), about half were at 10 Hz (also related to mains frequency), and the slower-moving remainder (like temperatures and tank levels) were at 1 Hz. Maybe you have similar opportunities to reduce the avalanche of data.

Another idea: if data is being held up locally during an internet connection failure, perhaps that data can be "archived", aka compressed, either with Zip etc. or with hand-rolled delta and run-length compression. Take a channel for ambient air temperature with 0.1 degC resolution: how fast is that going to change? Why store thousands of readings that are all the same, when you can store once that it's been e.g. 23.4 degC for the past n readings?
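
As a rough illustration of the run-length part, a minimal sketch (my own, not from the dyno system) that collapses repeated readings of a slow channel into (value, count) pairs:
B4X:
'Minimal run-length encoder for a slowly-changing channel,
'e.g. ambient temperature in tenths of a degree C.
Sub Class_Globals
    Private Runs As List    'each entry: Array As Int(Value, Count)
    Private LastValue As Int
    Private RunCount As Int
End Sub

Public Sub Initialize
    Runs.Initialize
    RunCount = 0
End Sub

Public Sub AddReading (Value As Int)
    If RunCount > 0 And Value = LastValue Then
        RunCount = RunCount + 1    'same reading: extend the current run
    Else
        If RunCount > 0 Then Runs.Add(Array As Int(LastValue, RunCount))
        LastValue = Value          'start a new run
        RunCount = 1
    End If
End Sub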

that many actual sensors

Now that I think about it, there were 96 digital inputs alone (plus 48 analog inputs plus a few timers). Those were the capabilities rather than the actual sensor counts, but we were definitely using more than 32 analog channels, otherwise we'd have gone with 2 analog boards instead of 3.
 
Last edited:
Upvote 0

Mike1970

Well-Known Member
Licensed User
Longtime User
Thanks for your response! Very interesting.

How fast?
As you said, there are actually different data types being sent.
Some of them have a very low frequency (e.g. state changes); others, like fluxmeters/motors RPM, send out data at a higher frequency. I did not measure it, and I can't right now, but assume 50+ Hz for each of the n devices, each of which has many of these sensors/actuators.


Another idea is that if data is being held up locally during internet connection failure, perhaps that data can be "archived" aka compressed, either with Zip etc, or by a hand-rolled delta and run-length compression.
I did not fully understand this sentence because I'm not sure what you mean by "locally".
In this particular scenario "locally" could mean "on the device with the sensors (❌)" or "on the Android device (✅)".

To clarify the scenario (simplified version), here is a sketch.
Untitled Diagram.png


The orange path can be considered always available, since it is a direct wired connection (and the broker is on the Android device itself).
The problem is when the blue path is interrupted... so "locally" for me means the "Android Panel PC".

So my concern is: how can I implement storage on the Android Panel PC capable of handling the arriving data?
You said "archiving"/"zipping" and I found it interesting: were you thinking about Android or on the devices?

On Android, how could this be achieved? Maybe by keeping a file OutputStream open when the internet connection fails and using it to save the data coming from MQTT somehow? Is it safe? I don't know.

(I'm not considering SQLite, but I know that it could be an option)
 
Upvote 0

Erel

B4X founder
Staff member
Licensed User
Longtime User
Start with the simplest possible solution and make some tests.
Upvote 0

Mike1970

Well-Known Member
Licensed User
Longtime User
Start with the simplest possible solution and make some tests
Hi Erel!!
You mean using KVS/CKVS or SQLite?

Do you think it's a good idea to store data using KVS, with the MQTT topic as the key and the list of arriving messages for that topic as the value?
Maybe with an associated timestamp?

Could it have memory/storage overhead or some read/write limitations?

Does KVS handle concurrency? (E.g. multiple MQTT messages arriving at the same time)

In the cloud, the telemetry data is stored in MongoDB configured for time-series data (the server is written in Node.js + Express).
 
Last edited:
Upvote 0

emexes

Expert
Licensed User
Longtime User
You said "archiving"/"zipping" ... were you thinking about Android or on the devices?

I was thinking buffering and compression would be done just before the most-likely point of failure, i.e. the internet connection.

In your (impressive!) diagram, that would be on the Android panel PC.

But only archive / zip / compress if the internet connection is down; the aim being to maximize the amount of data that can be saved, i.e. to maximize the internet outage duration that can be handled without data loss.

But I might also be jumping the gun - perhaps compression is not necessary, if the buffer is a file and there's oodles of space.

First get it working, then get it working right, then get it working fast.


fluxmeters/motors

Now I'm imagining this project is for lunar or martian explorer vehicles, or for a UAV. Sounds interesting.
 
Last edited:
Upvote 0

f0raster0

Well-Known Member
Licensed User
Longtime User
The problem is:
Data arrives FAST, and IF there is NO INTERNET connection to forward it to the cloud, what's the best way (in B4X/B4A) to temporarily store this information locally and send it to the cloud later?
My concern is that SQLite may not be fast enough to handle telemetry data from tens of sensors and could cause overhead.
SQLite is not good enough... storing in a txt file is better/faster (from my experience with sensors).
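
For example, a minimal sketch of the text-file approach (folder, file name and CSV layout are illustrative placeholders):
B4X:
'Append one CSV line per MQTT message to a plain text file.
Sub Class_Globals
    Private Writer As TextWriter
End Sub

Public Sub OpenLog
    'append = True so an existing log is continued after a restart
    Writer.Initialize(File.OpenOutput(File.DirInternal, "telemetry.csv", True))
End Sub

Public Sub LogMessage (Topic As String, Payload As String)
    Writer.WriteLine($"${DateTime.Now},${Topic},${Payload}"$)
End Sub

Public Sub CloseLog
    Writer.Flush
    Writer.Close
End Sub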
 
Upvote 0

Erel

B4X founder
Staff member
Licensed User
Longtime User
Does KVS handle concurrency? (E.g. multiple MQTT messages arriving at the same time)
The event that you handle always runs on the main thread. You don't need to worry about concurrent access.

What I meant is that you shouldn't be too worried about the performance unless you actually see that there is a problem. Make some tests (in release mode!).
Do you think it's a good idea to store data using KVS, with the MQTT topic as the key and the list of arriving messages for that topic as the value?
Maybe with an associated timestamp?
Only if you are fine with overwriting the existing key / value pair.

Buffering is very important. Don't write every single point of data separately. Find ways to group them.
 
Upvote 0

Erel

B4X founder
Staff member
Licensed User
Longtime User
Small example:
B4X:
Sub Class_Globals
    Private Root As B4XView
    Private xui As XUI
    Private Collector As List
    Private CollectedLength As Int
    Private Output As OutputStream
    Private serializator As B4XSerializator
    Private bc As ByteConverter
End Sub

Public Sub Initialize
'    B4XPages.GetManager.LogEvents = True
End Sub

'This event will be called once, before the page becomes visible.
Private Sub B4XPage_Created (Root1 As B4XView)
    Root = Root1
    Root.LoadLayout("MainPage")
    Collector.Initialize
    Output = File.OpenOutput(xui.DefaultFolder, "1.dat", True) 'append = true?
    For i = 1 To 10
        DataGenerator(20, 50)
    Next
End Sub

'Simulates one sensor: sends BytesPerMessage random bytes at roughly Hz messages per second.
Private Sub DataGenerator (BytesPerMessage As Int, Hz As Int)
    Dim sr As SecureRandom
    Dim b(BytesPerMessage) As Byte
    Do While True
        sr.GetRandomBytes(b)
        CollectData(b)
        Sleep(800 / Hz)
    Loop
End Sub

'Buffers each message and flushes the whole batch to disk once ~100 KB has accumulated.
Private Sub CollectData(b() As Byte)
    Collector.Add(b)
    CollectedLength = CollectedLength + b.Length
    If CollectedLength > 100000 Then
        Dim n As Long = DateTime.Now
        Dim serialized() As Byte = serializator.ConvertObjectToBytes(Collector)
        Dim n1 As Long = DateTime.Now - n
        n = DateTime.Now
        Output.WriteBytes(IntToBytes(serialized.Length), 0, 4) 'write the message length
        Output.WriteBytes(serialized, 0, serialized.Length)
        Output.Flush
        Dim n2 As Long = DateTime.Now - n
        Log($"$Time{DateTime.Now}: wrote data. Serialization: ${n1}ms, Writing: ${n2}ms"$)
        CollectedLength = 0
        Collector.Clear
    End If
End Sub

Private Sub IntToBytes(i As Int) As Byte()
    Return bc.IntsToBytes(Array As Int(i))
End Sub

Output:
*** Service (starter) Create ***
** Service (starter) Start **
** Activity (main) Create (first time) **
Call B4XPages.GetManager.LogEvents = True to enable logging B4XPages events.
** Activity (main) Resume **
10:04:15: wrote data. Serialization: 74ms, Writing: 0ms
10:04:24: wrote data. Serialization: 105ms, Writing: 1ms
10:04:32: wrote data. Serialization: 101ms, Writing: 0ms
10:04:41: wrote data. Serialization: 105ms, Writing: 1ms
10:04:49: wrote data. Serialization: 90ms, Writing: 1ms
10:04:58: wrote data. Serialization: 106ms, Writing: 1ms
10:05:06: wrote data. Serialization: 110ms, Writing: 0ms
10:05:15: wrote data. Serialization: 52ms, Writing: 1ms


Using B4XSerializator makes it easy to later read the data.
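
For illustration, a minimal sketch of the read-back side, reusing the globals from the example above (ReadCollectedData and the cloud-forwarding placeholder are assumptions, not part of the tested code):
B4X:
Private Sub ReadCollectedData
    'Read the file written above: 4-byte length prefix followed by a serialized List.
    Dim InStream As InputStream = File.OpenInput(xui.DefaultFolder, "1.dat")
    Dim lengthBytes(4) As Byte
    Do While InStream.ReadBytes(lengthBytes, 0, 4) = 4
        Dim lengths() As Int = bc.IntsFromBytes(lengthBytes)
        Dim data(lengths(0)) As Byte
        InStream.ReadBytes(data, 0, data.Length)
        Dim messages As List = serializator.ConvertBytesToObject(data)
        Log($"Read batch of ${messages.Size} messages"$)
        'forward the batch to the cloud here, then optionally delete / truncate the file
    Loop
    InStream.Close
End Sub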
 
Upvote 0

Mike1970

Well-Known Member
Licensed User
Longtime User
Ok!! Thank you, everyone!
At the moment I'm on vacation and I don't have the instrumentation with me to make tests.

As soon as I come back I will try your suggestions and give feedback here, hopefully with the final solution.
 
Upvote 0