!snChXSvNrZKnAAMbRV:matrix.org

kale

112 Members
2 Servers


Sender  Message  Time
5 Apr 2022
@_slack_kubeflow_U03260LLNES:matrix.orgDave Scott set a profile picture.18:00:29
7 Apr 2022
@_slack_kubeflow_U03AZF4FF2M:matrix.org_slack_kubeflow_U03AZF4FF2M joined the room.19:06:54
12 Apr 2022
@_slack_kubeflow_U03BVHCHD4G:matrix.org_slack_kubeflow_U03BVHCHD4G joined the room.13:14:39
18 Apr 2022
@wybpip:matrix.org@wybpip:matrix.org joined the room.08:53:20
@wybpip:matrix.org@wybpip:matrix.org left the room.08:53:21
@_slack_kubeflow_U03260LLNES:matrix.orgDave Scott changed their profile picture.14:49:33
@_slack_kubeflow_U02U6A4ED1S:matrix.orgVlad Hey guys, I have a question. Have you ever had this issue with PyTorch and Kubeflow?
RuntimeError: unable to write to file  /torch_253_338982586_0 : No space left on device (28)
19:49:55
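A file name of the form /torch_... combined with "No space left on device" is consistent with the pod's shared-memory mount filling up, which matches the shm volume fix mentioned later in this thread. The sketch below is one hedged way to check that from inside the pod; the /dev/shm path and the 1024 MiB threshold are assumptions for illustration, not values taken from the conversation.

```python
import os
import shutil


def free_mib(path="/dev/shm"):
    """Return the free space under `path` in MiB."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 2)


def has_room(path, needed_mib):
    """True if `path` has at least `needed_mib` MiB free."""
    return free_mib(path) >= needed_mib


if __name__ == "__main__":
    shm = "/dev/shm"  # assumed mount point for shared memory in the pod
    if os.path.isdir(shm):
        print(f"{shm}: {free_mib(shm):.1f} MiB free")
        # 1024 MiB is an arbitrary illustrative threshold.
        if not has_room(shm, 1024):
            print("Shared memory looks small; a larger shm volume may be needed.")
    else:
        print(f"{shm} not found on this system")
```

If the reported size is only a few tens of MiB, the container is probably using the runtime's default shared-memory size rather than a dedicated volume.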
19 Apr 2022
@_slack_kubeflow_U02U6A4ED1S:matrix.orgVlad Hi, has anyone worked with PyTorch and Kale? 19:02:59
20 Apr 2022
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan Is this in a VM? 01:09:39
@_slack_kubeflow_U02U6A4ED1S:matrix.orgVlad I think that this is from the pod 01:11:38
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan What about K8s, is this from Vagrant or a full-blown cloud K8s? 01:38:29
@_slack_kubeflow_U01K24XKKK9:matrix.orgJoseph Olaide joined the room.08:25:34
21 Apr 2022
@_slack_kubeflow_U02U6A4ED1S:matrix.orgVlad It is a full EKS cluster 23:35:47
22 Apr 2022
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan that's interesting. Any idea how big the generated files are locally? 01:57:30
@_slack_kubeflow_U02U6A4ED1S:matrix.orgVlad Mm, nope, but we created the shm volume manually (without Kale) and now we have the following issue: 18:57:52
@_slack_kubeflow_U02U6A4ED1S:matrix.orgVlad
RuntimeError: CUDA out of memory. Tried to allocate 162.00 MiB (GPU 0; 14.76 GiB total capacity; 13.53 GiB already allocated; 36.75 MiB free; 13.63 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
18:57:52
@_slack_kubeflow_U02U6A4ED1S:matrix.orgVlad I think this is a GPU memory limit, maybe... 20:04:52
@_slack_kubeflow_U02U6A4ED1S:matrix.orgVlad Denise Mariel Cari Martinez 20:06:51
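The numbers in the CUDA error above can be used to sanity-check the guess that this is a plain GPU memory limit rather than fragmentation. The sketch below parses the quoted message and compares reserved with allocated memory; the helper name parse_oom and the regular expressions are illustrative assumptions, not part of PyTorch.

```python
import re

# The message text as quoted in the chat above.
ERROR = (
    "CUDA out of memory. Tried to allocate 162.00 MiB (GPU 0; "
    "14.76 GiB total capacity; 13.53 GiB already allocated; "
    "36.75 MiB free; 13.63 GiB reserved in total by PyTorch)"
)

_UNITS = {"MiB": 1.0, "GiB": 1024.0}


def parse_oom(message):
    """Return the sizes quoted in the message, converted to MiB."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[key] = float(match.group(1)) * _UNITS[match.group(2)]
    return sizes


sizes = parse_oom(ERROR)
# Memory PyTorch has reserved from the GPU but not handed out to tensors.
slack = sizes["reserved"] - sizes["allocated"]
print(f"reserved - allocated = {slack:.2f} MiB")
print(f"slack + free = {slack + sizes['free']:.2f} MiB, "
      f"requested = {sizes['requested']:.2f} MiB")
```

Reserved exceeds allocated by only about 0.10 GiB here, so the "reserved >> allocated" condition in the hint does not hold. On this rough reading, max_split_size_mb is unlikely to help much, and a smaller batch size or a GPU with more memory seems the more plausible route.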
26 Apr 2022
@_slack_kubeflow_U02U6A4ED1S:matrix.orgVlad Hey guys, how are you? I have a question: what size can the pods for each step in the pipeline be? And how should I define them? Thanks!! 18:24:32
27 Apr 2022
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan Size as in disk space ? 01:25:48
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan I think if you run kubectl describe pod -n namespace and look under the resources section, you should be able to tell 01:26:54
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan But thinking a bit, maybe you can store it in MinIO 01:28:08
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan i.e. use Output to do it 01:28:25
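The kubectl describe pod suggestion above can also be scripted. This is a minimal sketch that reads the same information from kubectl get pod -o json, where per-container requests and limits sit under spec.containers[].resources. The function names, the pod name, the namespace, and the sample values are all hypothetical.

```python
import json
import subprocess


def fetch_pod(name, namespace):
    """Fetch a pod manifest as a dict (requires kubectl and cluster access)."""
    out = subprocess.run(
        ["kubectl", "get", "pod", name, "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def container_resources(pod):
    """Map each container name to its resource requests and limits."""
    result = {}
    for container in pod.get("spec", {}).get("containers", []):
        resources = container.get("resources", {})
        result[container["name"]] = {
            "requests": resources.get("requests", {}),
            "limits": resources.get("limits", {}),
        }
    return result


# Hypothetical manifest fragment standing in for a pipeline step's pod.
SAMPLE_POD = {
    "spec": {
        "containers": [
            {
                "name": "main",
                "resources": {
                    "requests": {"cpu": "500m", "memory": "1Gi"},
                    "limits": {"memory": "4Gi", "nvidia.com/gpu": "1"},
                },
            },
            {"name": "wait"},  # a container with no resources set
        ]
    }
}

for name, res in container_resources(SAMPLE_POD).items():
    print(name, res)
```

Against a real cluster, container_resources(fetch_pod("my-step-pod", "my-namespace")) would report what each step's pod actually received; both arguments are placeholders.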
28 Apr 2022
@_slack_kubeflow_U0385A64B1B:matrix.orgAtra Akandeh joined the room.14:09:37
@_slack_kubeflow_U02U6A4ED1S:matrix.orgVlad Cool! Thanks Benjamin, I'll try 20:02:17
29 Apr 2022
@_slack_kubeflow_U03E7U44FC0:matrix.orgBogdan Kowalczyk joined the room.15:14:00
@_slack_kubeflow_U03E7U44FC0:matrix.orgBogdan Kowalczyk changed their display name from _slack_kubeflow_U03E7U44FC0 to Bogdan Kowalczyk.15:14:50
@_slack_kubeflow_U03E7U44FC0:matrix.orgBogdan Kowalczyk set a profile picture.15:14:58
1 May 2022
@wybpip:matrix.org@wybpip:matrix.org joined the room.18:31:30
@wybpip:matrix.org@wybpip:matrix.org left the room.18:31:31