!CJoUbovqKaCGrFkbrY:matrix.org

Spark with Scala

403 Members
A place to discuss and ask questions about using Scala for Spark programming.
3 Servers



Sender Message Time
18 Oct 2023
@_discord_750453990151684127:t2bot.iogeirolz changed their display name from Geirolz to geirolz#0.10:51:44
@_discord_750453990151684127:t2bot.iogeirolz changed their display name from geirolz#0 to geirolz.10:51:51
20 Oct 2023
@_discord_316762215975878666:t2bot.ioAlexITC 03:29:53
@_discord_316762215975878666:t2bot.ioAlexITC 03:29:57
@_discord_818401984230981642:t2bot.iocheapsolutionarchitect The ticket states it was resolved in 3.5.1, perhaps you could try that version? 06:34:59
@_discord_818401984230981642:t2bot.iocheapsolutionarchitect * For snapshot versions, you need to add https://repository.apache.org/snapshots/ as a repository and append -SNAPSHOT to your dependency version. It looks like there is no official 3.5.1 release yet. 07:23:17
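The snapshot advice above could look roughly like this in an sbt build. This is only a sketch: the resolver name, the spark-sql artifact, and the assumption that a 3.5.1-SNAPSHOT build is actually published are illustrative and not taken from the conversation.

```scala
// build.sbt (sketch)
// Assumption: the project uses sbt and depends on spark-sql; adjust to your modules.

// Make the Apache snapshot repository available for dependency resolution.
resolvers += "Apache Snapshots" at "https://repository.apache.org/snapshots/"

// Append -SNAPSHOT to the version that the ticket names as the fix version.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.5.1-SNAPSHOT" % "provided"
```

Snapshot builds can change from day to day, so switching back to the official release once it exists is probably the safer long-term choice.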
@_discord_89507544619315200:t2bot.ioderya changed their display name from derya to derya#0.21:38:12
@_discord_89507544619315200:t2bot.ioderya changed their profile picture.21:38:46
@_discord_89507544619315200:t2bot.ioderya changed their display name from derya#0 to derya.21:43:39
24 Oct 2023
@_discord_255141202347687937:t2bot.iowhiteturq 19:11:37
25 Oct 2023
@_discord_264442795245174785:t2bot.iolafeychine changed their display name from Vincent Lafeychine to lafeychine.13:35:17
26 Oct 2023
@_discord_213494046864179210:t2bot.ioprogamermatt changed their display name from pgm to progamermatt.15:15:17
27 Oct 2023
@_discord_140207550816714752:t2bot.iotxdv 06:36:27
@_discord_818401984230981642:t2bot.iocheapsolutionarchitect * I do not think Spark supports such a use case. Usually you would trigger the application via a cron job or some kind of cluster manager task. If your driver does not run directly on the Spark cluster, you can, however, try to wait in your driver application and then call start() on your stream. 09:45:58
@_discord_818401984230981642:t2bot.iocheapsolutionarchitect As far as I understand, StreamingContext is the old API and Structured Streaming is its successor. And I would not call the alignment on batch interval borders a feature. It breaks down really fast, e.g. how do you trigger a batch every hour, starting at half past a given hour? However, my experience is confined to my specific cluster architecture. I run Spark in standalone mode on a k8s cluster, so every Spark driver app is a running Pod. This allows me, for example, to sleep-wait within the entry point script or within the driver app. 20:21:09
@_discord_397996873354444810:t2bot.ioufodivebomb nice! I've been wanting to try a small k8s spark instance 20:45:20
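The "wait in the driver, then call start()" idea from the messages above might be sketched as follows. The object and method names (StartDelay, untilNextMinuteOfHour, awaitThenStart) are hypothetical helpers, not Spark API; the sketch only computes how long to sleep until the next half past the hour and leaves the actual stream start to the caller.

```scala
import java.time.{Duration, LocalDateTime}
import java.time.temporal.ChronoUnit

object StartDelay {
  /** Time from `now` until the next wall-clock instant whose minute-of-hour equals `minute`. */
  def untilNextMinuteOfHour(now: LocalDateTime, minute: Int): Duration = {
    require(minute >= 0 && minute < 60, "minute must be in 0..59")
    // Candidate in the current hour, e.g. 09:30:00 for now = 09:10:12.
    val candidate = now.truncatedTo(ChronoUnit.HOURS).plusMinutes(minute.toLong)
    // If that instant is not strictly in the future, use the same minute of the next hour.
    val target = if (candidate.isAfter(now)) candidate else candidate.plusHours(1)
    Duration.between(now, target)
  }

  /** Sleeps in the driver until the target minute, then runs `start` (e.g. a stream's start() call). */
  def awaitThenStart[A](minute: Int)(start: => A): A = {
    Thread.sleep(untilNextMinuteOfHour(LocalDateTime.now(), minute).toMillis)
    start
  }

  def main(args: Array[String]): Unit = {
    val delay = untilNextMinuteOfHour(LocalDateTime.now(), 30)
    println(s"Would sleep for ${delay.toMillis} ms before starting the stream")
    // Hypothetical usage, assuming `writer` is a DataStreamWriter built elsewhere:
    // val query = awaitThenStart(30)(writer.start())
  }
}
```

This only delays the first start. Whether later batches stay aligned to the same minute depends on the trigger and the cluster setup, which the conversation leaves open.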
29 Oct 2023
@_discord_840147810765766656:t2bot.iocu4381 changed their display name from cu4381#0 to cu4381.23:32:40
1 Nov 2023
@_discord_510415959480336393:t2bot.iopotatoef changed their profile picture.16:21:38
7 Nov 2023
@softinio:matrix.orgSalar Rahmanian (softinio) changed their profile picture.17:56:04
12 Nov 2023
@_discord_818401984230981642:t2bot.iocheapsolutionarchitect * Take a look at the class comment here: https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/KeyValueGroupedDataset.html. Your K is the first element of the tuple and your V is the whole tuple, which results in the given return type. If you call .collect, you will probably see your expected result. If you want a DF of type (String, Array[Int]), you could probably make use of mapGroups and turn the second position of each tuple into an Array. 06:13:26
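A minimal sketch of the mapGroups suggestion above, assuming the input is a Dataset[(String, Int)] grouped by its first element. The original question is not preserved in this log, so the input shape, the sample data, and the names GroupToArray and collectValues are assumptions for illustration.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object GroupToArray {
  /** Groups by the first tuple element and gathers the second elements into an Array per key. */
  def collectValues(ds: Dataset[(String, Int)]): Dataset[(String, Array[Int])] = {
    import ds.sparkSession.implicits._
    // groupByKey gives KeyValueGroupedDataset[String, (String, Int)]:
    // K is the key, V is still the whole tuple.
    val grouped = ds.groupByKey(_._1)
    // Explicit parameter types select the Scala overload of mapGroups.
    grouped.mapGroups { (key: String, rows: Iterator[(String, Int)]) =>
      (key, rows.map(_._2).toArray)
    }
  }

  def main(args: Array[String]): Unit = {
    // Local session for demonstration only.
    val spark = SparkSession.builder().master("local[*]").appName("group-to-array").getOrCreate()
    import spark.implicits._

    val input = Seq(("a", 1), ("a", 2), ("b", 3)).toDS()
    collectValues(input).collect().foreach { case (k, vs) =>
      println(s"$k -> ${vs.mkString(", ")}")
    }
    spark.stop()
  }
}
```

The order of values inside each array is not guaranteed, and mapGroups handles each group's rows on a single executor, so very large groups may be better served by an aggregation such as collect_list.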
13 Nov 2023
@_discord_936395565154054144:t2bot.ioeje4073 changed their display name from eje to eje4073.18:30:41
14 Nov 2023
@_discord_734434444433293464:t2bot.ioatk91 changed their display name from atk91#0 to atk91.19:22:01
28 Nov 2023
@_discord_305362010374406144:t2bot.iomarouan28 changed their display name from marouan28#0 to marouan28.12:18:26
@_discord_902417901972258868:t2bot.iokagaku2340 changed their display name from kagaku to kagaku2340.20:37:26
30 Nov 2023
@_discord_736256129323106384:t2bot.iovazand 08:41:11
2 Dec 2023
@_discord_89507544619315200:t2bot.ioderya changed their profile picture.00:07:35
