As far as I understand, if a system generates a huge quantity of files while it runs, that's a very bad situation for absolutely any operating system?
Because if you need to work with the files, you have to list them all first in order to iterate over them, and for a huge quantity you'll just hang your app waiting for the listing result...
Correct?
If so, maybe someone has already tried to solve this situation and written some code to store/get files from a huge file storage?
What are you doing such that generating a list can "hang" your app?
How long does it take to list the files? How many files are there to list?
Is there any mechanism to delete old, already-parsed ones? If not: why not?
How much data is inside the files?
An NN-controlling app receives .jpg frames at 2-3 FPS, 24/7, and stores them for weeks.
That's a really long list for a single folder, changing every second.
And I need the possibility to manually list, show, compare, delete...
I would use a server step/routine in between. You can run a cron job on Linux, for example, to list all the files and write the list to a file.
Your app then just reads that file whenever it wants to do something with the files.
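If it helps, here is a minimal sketch of that listing step in Java, so cron can run it alongside the main app. The folder and manifest paths are placeholders:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;

// Meant to be run by cron (e.g. once a minute); rewrites the manifest
// file that the main app reads instead of listing the folder itself.
public class WriteManifest {
    public static void main(String[] args) throws IOException {
        Path folder = Paths.get("/var/frames");         // placeholder storage folder
        Path manifest = folder.resolve("manifest.txt"); // placeholder manifest path
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> dir = Files.newDirectoryStream(folder, "*.jpg")) {
            for (Path p : dir) names.add(p.getFileName().toString());
        }
        // Write to a temp file first, then rename, so readers never see a half-written list.
        Path tmp = folder.resolve("manifest.txt.tmp");
        Files.write(tmp, names);
        Files.move(tmp, manifest, StandardCopyOption.REPLACE_EXISTING,
                   StandardCopyOption.ATOMIC_MOVE);
    }
}
```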
On what operating system are they stored?
Do you store everything in a single folder or grouped by week/day/month/whatever?
Ubuntu host; my Java app receives the frames through an API.
Right now everything is stored in a single folder, and I have found that normal (fast) operation is only possible with maybe 7-20K files per folder at most, or thereabouts...
A RAM disk is already used as an intermediate step, but after all the processing, long-term storage is still required, with listing/checking/comparing/deleting...
So some storage system is needed for millions of files...
I guess that for such storage a class should be programmed (a rough sketch follows this list), with:
1) folder structure created on the fly, based on hours (or maybe a smaller interval)
2) calculation of each new file's name/path
3) full file list generation in the background
4) file search within this list
5) deletion of outdated files and folders (the storage limit)
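A rough sketch of what that class could look like, assuming a Linux host, a UTC hour-per-folder layout, and made-up names for the base folder, retention period, and filename scheme. It covers points 1, 2, 4, and 5; point 3 could reuse the cron/manifest idea from earlier in the thread:

```java
import java.io.IOException;
import java.nio.file.*;
import java.time.*;
import java.time.format.DateTimeFormatter;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch of a time-bucketed store: one folder per hour, so at 2-3 FPS a
// folder never holds more than ~11K files (inside the 7-20K comfort zone).
public class FrameStore {
    private static final DateTimeFormatter BUCKET = DateTimeFormatter.ofPattern("yyyy/MM/dd/HH");
    private final Path base;          // e.g. /var/frames (assumption)
    private final Duration retention; // e.g. Duration.ofDays(14) (assumption)

    public FrameStore(Path base, Duration retention) {
        this.base = base;
        this.retention = retention;
    }

    // Points 1 and 2: the path for a new frame, creating the hour folder on the fly.
    public Path pathFor(Instant ts) throws IOException {
        Path dir = base.resolve(LocalDateTime.ofInstant(ts, ZoneOffset.UTC).format(BUCKET));
        Files.createDirectories(dir); // no-op if it already exists
        return dir.resolve("IMG_" + ts.toEpochMilli() + ".jpg");
    }

    // Point 4: search a single hour bucket instead of listing the whole store.
    public Stream<Path> listHour(LocalDateTime hour) throws IOException {
        Path dir = base.resolve(hour.format(BUCKET));
        return Files.isDirectory(dir) ? Files.list(dir) : Stream.empty();
    }

    // Point 5: drop whole hour folders older than the retention limit.
    // "yyyy/MM/dd/HH" sorts lexicographically, so a plain string compare
    // works here (on the Linux host, where the path separator is "/").
    public void purgeOutdated() throws IOException {
        String cutoff = LocalDateTime.now(ZoneOffset.UTC).minus(retention).format(BUCKET);
        List<Path> outdated;
        try (Stream<Path> dirs = Files.walk(base, 4)) {
            outdated = dirs.filter(p -> base.relativize(p).getNameCount() == 4)
                           .filter(p -> base.relativize(p).toString().compareTo(cutoff) < 0)
                           .collect(Collectors.toList());
        }
        for (Path dir : outdated) deleteTree(dir);
    }

    private static void deleteTree(Path dir) throws IOException {
        try (Stream<Path> s = Files.walk(dir)) {
            // Delete children before their parents.
            for (Path p : s.sorted(Comparator.reverseOrder()).collect(Collectors.toList())) {
                Files.delete(p);
            }
        }
    }
}
```

Usage would be roughly `store.pathFor(Instant.now())` for each incoming frame, plus a timer that calls `purgeOutdated()` once an hour.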
NN... I see... but I can't get the idea (saving JPEGs of what?)... OK, that is something you know the reason for...
1) @peacemaker ... hmm... are you using a custom way (code, an app created by you) to save the jpg files (screenshots), or ffmpeg, or an app that saves to the folder you need...?
2) Do the files have the date and hour in their filename?
1) The jpg files are received from a 3rd-party API. I have to store them myself, as the task requires.
And again: a huge quantity of files is bad for any folder in any OS, if I'm right...
We can just save tons of files into a single folder, but it's impossible to list them all _quickly_ any time you want... due to the quantity and the time it takes to list them...
2) Filtering how? I do not understand how, without first getting the full list of names.
3) 200 KB per .jpg × 10 million = a 2 TB+ DB... better not to try...
So what's needed is a continuously working file system on top of the OS file system: changing 2-5 times per second, never freezing the host app, storing 10 million files over a fixed time interval, and deleting outdated ones in the background...
2) Well, I am sure you already know it... at the command line / terminal,
for example, on Linux, list only the files of a specific hour, minute, or second...
(on Linux/Ubuntu/Debian) ls IMG_20230918200001*.*
(on Windows) DIR IMG_20230918200001*.*
I am sure there will also be a way to do that in B4J or Java (see the sketch after this post)...
This will get a list of all images taken on 2023-09-18 at 20:00:01... or you can do it for the first minute, or... for a whole hour...
3) Sure, saving in a DB is not a good option... maybe saving only the filename and whether it was parsed... (true/false)
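For point 2, a glob over a directory stream is the usual Java equivalent of that wildcard. A small sketch (folder path and pattern are just examples):

```java
import java.io.IOException;
import java.nio.file.*;

public class FilterByTimestamp {
    public static void main(String[] args) throws IOException {
        Path folder = Paths.get("/var/frames"); // example folder
        // Same idea as `ls IMG_20230918200001*.*`: match on the timestamp prefix.
        try (DirectoryStream<Path> dir =
                 Files.newDirectoryStream(folder, "IMG_20230918200001*")) {
            for (Path p : dir) System.out.println(p.getFileName());
        }
    }
}
```

Note that the OS still has to scan every directory entry to apply the pattern, so this avoids holding the full name list in memory, but not the scan time of a huge folder.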
No. Let the "serverapp" store all filenames in Database. Only the Filename together with some fields to search for them....
Do the search then in the database and get the files which are needed only. Should be FAST even after YEARS of storing.
You just need to have to use a good indexing in your folderstructure. year/month/day/hour where hour can be just hours 0-23 or even more up to minute
year/month/day/hour(0-23/minute(0-59)
There can be millions of files without any problem. Using the filenames on disc to search for anything may be a intensive task without Database....
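A minimal sketch of such a filename index, assuming SQLite through JDBC (the sqlite-jdbc driver on the classpath; the table, columns, and example values are made up for illustration):

```java
import java.sql.*;

public class FrameIndex {
    public static void main(String[] args) throws SQLException {
        try (Connection c = DriverManager.getConnection("jdbc:sqlite:/var/frames/index.db")) {
            try (Statement s = c.createStatement()) {
                s.execute("CREATE TABLE IF NOT EXISTS frames (" +
                          " path     TEXT PRIMARY KEY," +  // relative path on disk
                          " taken_at INTEGER NOT NULL," +  // epoch millis
                          " parsed   INTEGER DEFAULT 0)"); // the true/false flag from above
                s.execute("CREATE INDEX IF NOT EXISTS idx_taken ON frames(taken_at)");
            }
            // One INSERT per stored frame (values are examples)...
            try (PreparedStatement ins = c.prepareStatement(
                     "INSERT OR IGNORE INTO frames(path, taken_at) VALUES (?, ?)")) {
                ins.setString(1, "2023/09/18/20/IMG_1695067201000.jpg");
                ins.setLong(2, 1695067201000L);
                ins.executeUpdate();
            }
            // ...then search by time range instead of listing folders.
            try (PreparedStatement q = c.prepareStatement(
                     "SELECT path FROM frames WHERE taken_at BETWEEN ? AND ?")) {
                q.setLong(1, 1695067200000L); // 2023-09-18 20:00:00 UTC
                q.setLong(2, 1695070800000L); // 2023-09-18 21:00:00 UTC
                try (ResultSet rs = q.executeQuery()) {
                    while (rs.next()) System.out.println(rs.getString("path"));
                }
            }
        }
    }
}
```

The row is a few dozen bytes per frame, so 10 million files is a small, fully indexable table, while the 2 TB of jpg data stays on disk.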
I cannot answer this. YOU need to count them. You know what is coming in and how fast the number of files is rising.
If I remember correctly, Linux has no problem with large directories (number of files). On Windows it can be problematic.
So better to split them by year/month/day and, if you want, hour.
On every change you can run a batch job to write the list of available files to a txt file for each folder.