!kXpdasWeCTAvtfgULQ:matrix.org

librecube

58 Members
open source space and earth exploration: https://librecube.org/
18 Servers



Timestamp Message
2 Feb 2020
12:37:22@rayan:hackerspaces.berayan joined the room.
13:11:29@sharma:diasp.insharma joined the room.
13:25:34@matt:openintents.modular.immatt joined the room.
5 Feb 2020
14:25:55@pedro:utwente.io@pedro:utwente.io left the room.
8 Feb 2020
11:33:51@artur.librecube:matrix.orgArtur ScholzHi! Who knows about dealing with big data? For example, if I want to plot a parameter sampled every 5 seconds over a duration of one year, that's over 6 million data points to store, collect, and plot.
11:35:18@artur.librecube:matrix.orgArtur ScholzI am thinking of using MongoDB (store), Python requests (collect/transfer) and bokeh/datashader (plot).
9 Feb 2020
18:17:01@bilderbuchi:matrix.orgbilderbuchiI have used bokeh+datashader to plot 1 GB csv time series files with ~60 columns. It works, although loading the data and creating the plot took quite some time, O(1 min), at least when used in a Jupyter Notebook. Once the plots were created, all was quite smooth. One drawback I found about datashader is that, while it works nicely to render data that is bigger than what the plot displays, it seems mainly geared to plotting large amounts of data of a single quantity. When plotting e.g. several (6+) traces, you have to do a lot of legwork, as now all of the traces have the same color, you don't have a legend anymore (compared to when not using datashader), etc.
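[Editor's note: one common workaround for the multi-trace limitation described above is to downsample each trace to per-bucket min/max pairs before handing it to an ordinary bokeh/matplotlib line plot, which keeps legends and per-trace colors. A minimal sketch (the function name and bucket count are illustrative, not from any library discussed here):

```python
import numpy as np

def minmax_downsample(t, y, n_buckets=1000):
    """Reduce a long trace to per-bucket (min, max) pairs so an
    ordinary line plot keeps the signal envelope, legend and color."""
    n = len(y)
    if n <= 2 * n_buckets:
        return t, y
    edges = np.linspace(0, n, n_buckets + 1, dtype=int)
    ts, ys = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        seg = y[lo:hi]
        i_min, i_max = lo + seg.argmin(), lo + seg.argmax()
        for i in sorted((i_min, i_max)):  # keep time order within the bucket
            ts.append(t[i])
            ys.append(y[i])
    return np.array(ts), np.array(ys)

# 6 million samples at 5 s spacing, downsampled to 2000 points per trace
t = np.arange(6_000_000) * 5.0
y = np.sin(t / 86_400.0) + 0.01 * np.random.randn(t.size)
td, yd = minmax_downsample(t, y)
print(td.size)  # 2000 points instead of 6 million
```

Because every bucket keeps its extremes, spikes survive the downsampling, unlike naive striding.]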
18:18:22@bilderbuchi:matrix.orgbilderbuchiAlso, is 6 million datapoints really "big data"? I think the classical understanding of the term is "bigger than what fits in the memory of the machine you are processing it with", so possibly you don't need a "big data" solution.
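[Editor's note: the "does it fit in memory" question above can be settled with a back-of-envelope calculation for the stated case of one sample every 5 seconds for a year:

```python
# one year at one sample per 5 seconds
n = 365 * 24 * 3600 // 5
print(n)  # 6_307_200 samples

# one float64 value plus one int64 nanosecond timestamp per sample
bytes_total = n * (8 + 8)
print(bytes_total / 1e6)  # ~100.9 MB -- comfortably in memory
```

So a single year-long parameter is on the order of 100 MB in RAM, well below a "big data" threshold.]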
18:19:30@bilderbuchi:matrix.orgbilderbuchiAlso, check out Dask and Feather/parquet/arrow as file formats for big amounts of data, as they should map quite nicely into the whole pandas/python dataviz ecosystem. I have no personal experience with it (yet), but it's what I would investigate first when I have the need.
10 Feb 2020
12:49:56@artur.librecube:matrix.orgArtur ScholzThanks! Yes, it's not really BIG data but big enough to slow me down :( The use case is to store e.g. satellite telemetry on a machine and make those data available via language-agnostic REST API. So there are two bottlenecks: loading of data and transfer via HTTP...
12:52:44@artur.librecube:matrix.orgArtur ScholzThe datasets have the fields "time" and "value" (and are in separate collections per parameter). For MongoDb for example, setting an index on the time field yields a speed increase of at least factor 10.
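[Editor's note: the speedup from indexing the time field can be illustrated in plain pandas as well: a sorted DatetimeIndex lets a range query use binary search instead of scanning every row. A minimal sketch with hypothetical telemetry data (the parameter, dates, and sizes are made up for illustration):

```python
import numpy as np
import pandas as pd

# hypothetical telemetry parameter: one value every 5 seconds
idx = pd.date_range("2020-01-01", periods=500_000, freq="5s")
df = pd.DataFrame({"value": np.random.randn(len(idx))}, index=idx)

# with a sorted DatetimeIndex, a time-range query is a binary search,
# analogous to what the MongoDB index on "time" provides
window = df.loc["2020-01-10":"2020-01-11"]
print(len(window))  # two full days of 5 s samples = 34_560 rows
```

The same principle applies in MongoDB: `create_index` on the time field turns range queries from collection scans into index lookups.]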
12:53:33@artur.librecube:matrix.orgArtur ScholzOn the other hand, this increases the needed disk space, as indexing creates additional hash keys.
12:54:42@artur.librecube:matrix.orgArtur ScholzI am right now looking into InfluxDB, which should be optimized for time-value datasets. Not sure however about disk space usage.
13:15:12@bilderbuchi:matrix.orgbilderbuchiI have heard that at TU München they are using Grafana (https://grafana.com/) as a telemetry platform for most (all?) their student satellite projects. Maybe that is suitable for you.
17:36:58@artur.librecube:matrix.orgArtur Scholzthanks, I had looked into this a couple of times before, but never particularly liked it (in particular not for large datasets...). My favorite is bokeh server + datashader
17:37:20@artur.librecube:matrix.orgArtur Scholzregarding the data storing option, this looks promising: https://medium.com/@aroussi/fast-data-store-for-pandas-time-series-data-using-pystore-89d9caeef4e2
11 Feb 2020
07:00:58@bilderbuchi:matrix.orgbilderbuchiyes it seems great. However, I kinda miss a quantitative analysis - the author seems excited that stuff takes milliseconds! Pretty fast, right? But there is e.g. no mention of the size of the data set (37 years of data - how many MB/entries is that?), and no comparison to loading/saving speed with other common approaches - csv files, sqlite database, excel,..., so it's impossible to tell if the library is actually fast(er than others)...
23 Feb 2020
14:55:36@bkecman:matrix.orgBkecMan joined the room.
16:04:29@bkecman:matrix.orgBkecMan changed their display name from bkecman to BkecMan.
3 Mar 2020
17:25:34@artur.librecube:matrix.orgArtur Scholz
In reply to @bilderbuchi:matrix.org
yes it seems great. However, I kinda miss a quantitative analysis - the author seems excited that stuff takes milliseconds! Pretty fast, right? But there is e.g. no mention of the size of the data set (37 years of data - how many MB/entries is that?), and no comparison to loading/saving speed with other common approaches - csv files, sqlite database, excel,..., so it's impossible to tell if the library is actually fast(er than others)...
I've done a (not complete) implementation of PyStore into StretchyDb (https://gitlab.com/librecube/lib/python-stretchydb/-/compare/master...develop) and here is what I got: 11.5 million samples (time&value) retrieved via GET request within 1 min 5 s, local storage consumption 125 MB. Just using plain parquet files via dask. That's about 3 times faster than the official parameter store we use here (Hadoop HBase). PS: Takes 7 sec to generate a plot with matplotlib.
4 Mar 2020
07:05:50@bilderbuchi:matrix.orgbilderbuchinice, thanks for the update!
6 Mar 2020
17:58:44@priyanshurohilla:matrix.orgpriyanshurohilla joined the room.
17 Mar 2020
04:19:28@jgaspa01:matrix.orgjgaspa01 joined the room.
24 Mar 2020
21:39:53@artur.librecube:matrix.orgArtur Scholzhttps://gitlab.com/librecube/prototypes/python-webview3d
27 Mar 2020
11:09:12@nicolas_martinod:matrix.orgnicolas_martinod joined the room.
28 Mar 2020
20:50:47@artur.librecube:matrix.orgArtur ScholzHi all! Someone interested/volunteering to do the PCB layout of the PCDU (Power Control and Distribution Unit) prototype: https://gitlab.com/librecube/prototypes/system-pcdu
20:52:13@artur.librecube:matrix.orgArtur ScholzThe goal is to have all components on the CubeSat PC104 board, but not necessarily in a final configuration. Just to demonstrate that it's feasible to fit the components and to run tests with it.
20:53:47@artur.librecube:matrix.orgArtur ScholzFor the PCB layout (like ground planes etc.) one can refer to the eval boards (in the /test folder) or the reference layouts in the datasheets.
20:54:42@artur.librecube:matrix.orgArtur ScholzThe point is, it's a great exercise to get to work with KiCad for PCB design, and it would help us to get this project ready soon!
29 Mar 2020
16:29:48@thodcrs:matrix.orgthodcrs joined the room.
