24 May 2022 |
| Benjamin Tan changed their profile picture. | 15:25:16 |
droctothorpe | Why would you want to use EFS? It’s more expensive than EBS and you’ll probably run into performance issues.
If anything, use RDS. | 18:25:39 |
droctothorpe | We just migrated to RDS for the metadata store. It’s pretty straightforward and provides a ton of advantages by virtue of it being externally managed. | 18:27:34 |
| Ricardo Martinelli de Oliveira joined the room. | 18:50:19 |
Clark Updike | Is there any way to update a pipeline referenced in an experiment as a recurring run? If I bump the pipeline to a later version, it doesn't seem to have any effect on the experiment, which keeps running the older version. Hoping to not have to delete the recurring run experiment every time... | 20:53:11 |
25 May 2022 |
Rahul Mehta | I also looked into this, and unfortunately couldn't find anything/wound up needing to delete and recreate the recurring run with the kfp client. This is especially unfortunate since a network flake when running this could result in inadvertently deleting a recurring run without recreating it | 00:00:07 |
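A minimal sketch of one way to narrow the failure window described above: create the replacement recurring run first and delete the old one only after that succeeds. The helper `replace_recurring_run` is hypothetical, and `client` is assumed to behave like the kfp v1 SDK's `kfp.Client` (`create_recurring_run` returning an object with an `id`, and `delete_job`). With this ordering a network flake leaves two schedules briefly rather than none, so the old and new runs may overlap until the delete goes through.

```python
from typing import Any


def replace_recurring_run(client: Any, old_job_id: str, **create_kwargs: Any) -> str:
    """Create the replacement recurring run, then delete the old one.

    Hypothetical helper. `client` is assumed to expose the kfp v1
    `kfp.Client` methods `create_recurring_run` and `delete_job`.
    """
    # If this call fails, nothing has been deleted yet.
    new_job = client.create_recurring_run(**create_kwargs)
    try:
        client.delete_job(old_job_id)
    except Exception:
        # The old run still exists; the delete can simply be retried.
        print(f"created {new_job.id} but could not delete {old_job_id}")
        raise
    return new_job.id
```

The keyword arguments are passed straight through, so they would be whatever the recurring run was originally created with (for example `experiment_id`, `job_name`, `cron_expression`, `pipeline_id`, `version_id`).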
Ferdinand von den Eichen [Kineo.ai] | Our use case makes the Kubeflow cluster(s) really fluid. On a given day we may be running 20 clusters, and only 1 the next day, or none at all. In terms of cost efficiency it would be problematic to run 20 RDS instances at all times just to support those clusters when they are up. We used EBS in the past and it worked really well, but EBS has huge limitations. Just the fact that a volume can only be attached to a single node makes the whole setup much trickier (it is limited to one availability zone, and scheduling tasks across nodes becomes harder).
Regarding cost: we have found that EFS can even be cheaper, due to the pay-per-GB pricing of EFS vs. EBS, where you have to reserve 20 or 30 GB and pay for it at all times. | 07:33:39 |
Ferdinand von den Eichen [Kineo.ai] | Follow-up question on RDS though, droctothorpe: did you try the serverless Aurora variant, by any chance? The one that can shut down and spin up again? | 07:34:57 |
Ferdinand von den Eichen [Kineo.ai] | That actually looks very promising for our use case 😍 | 07:46:39 |
Cornelis Boon | Hi, I would like to keep track of the status of different branches in my pipeline. Specifically, I would like a step to be able to die gracefully and record in my db that a step in the branch has failed. Are there tools in Kubeflow (or Kubernetes) for this, or should I just add some lines to the step's code that update the state in my db and exit gracefully on failure before throwing an error? | 11:33:50 |
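A minimal sketch of the in-step option from the question above: wrap the step's body so that a failure is recorded and then re-raised, which keeps the step marked as failed. The names `report_failure` and `record_status` are hypothetical; `record_status` stands in for whatever writes to the db. On the Kubeflow side, the kfp v1 SDK also has `dsl.ExitHandler` for running a cleanup op when a group of steps exits, which may be worth a look, though this sketch does not use it.

```python
import functools
from typing import Any, Callable


def report_failure(step_name: str, record_status: Callable[[str, str], None]):
    """Decorator that records a step's outcome and re-raises on failure.

    Hypothetical helper: `record_status(step_name, status)` is assumed to
    be a user-supplied function that writes the status to a database.
    """
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                record_status(step_name, f"failed: {exc}")
                raise  # re-raise so the step itself still fails
            record_status(step_name, "succeeded")
            return result
        return wrapper
    return decorator
```

Re-raising matters here: swallowing the exception would let downstream steps run as if the branch had succeeded.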
Clark Updike | Hmmm, that's too bad. Ok, thanks for sharing your experience. | 11:39:20 |
droctothorpe | I did not, mostly because I couldn't find any precedent online and didn't want to spend my "innovation tokens" there, but I don't see any reason why it wouldn't work. It's all MySQL, regardless of the underlying infra, after all. I recommend starting with the AWS RDS overlay manifests and documentation; they're the most comprehensive. | 13:20:23 |
droctothorpe | Everything past provisioning is the same. All KF needs is an endpoint and credentials. | 13:20:54 |
Ferdinand von den Eichen [Kineo.ai] | Yeah I looked into it earlier today. Seems doable. I might reach out to you if we run into major issues 👼 | 13:21:44 |
droctothorpe | Happy to help (assuming I can, heh). | 13:22:08 |
| _slack_kubeflow_U03GW0Y4L3X joined the room. | 15:41:16 |
| Ricardo Martinelli de Oliveira changed their display name from _slack_kubeflow_U03GTL2Q3HR to Ricardo Martinelli de Oliveira. | 16:22:02 |
| Ricardo Martinelli de Oliveira set a profile picture. | 16:22:04 |
| _slack_kubeflow_U03H39N6V28 joined the room. | 21:39:15 |
Rahul Mehta | If a recurring run is deleted from Kubeflow during scheduled execution, will Kubeflow lose track of the job? We've observed this behavior with a particular recurring run that's scheduled around the same time that we re-deploy recurring runs (we remove and re-schedule them since they involve different dependencies etc). | 21:56:44 |
26 May 2022 |
Gerard Casas Saez | Hi folks, qq: is there any estimate on dates for the kfp==2.0.0rc0 release and full 2.0.0 support from the backend? (Trying to understand better the current story around this.) | 00:50:57 |
| Payman Touliat joined the room. | 01:46:11 |
| _slack_kubeflow_U03H27PMSMR joined the room. | 06:03:21 |
Atra Akandeh | Hello everyone. Could someone please direct me to a link with an example of a pipeline that consumes large amounts of data stored in Google Cloud Storage? Any suggestions on that would also be welcome. | 14:07:55 |
| _slack_kubeflow_U03H4QLQ202 joined the room. | 15:36:23 |
Kishan Savant | I think you can download the data using the URL from GCS and then pass on the path of the data using InputPath() and OutputPath(). Check out the following link. There are some examples with implementation that you can check out in the kubeflow/examples repo | 17:19:31 |
Atra Akandeh | Thanks, Kishan. Why would you not use pvc over InputPath()/OutputPath()? | 23:19:19 |
27 May 2022 |
| Changshin Yoo changed their display name from _slack_kubeflow_U03C4M32V1U to Changshin Yoo. | 02:10:35 |
| Changshin Yoo set a profile picture. | 02:10:37 |
Kishan Savant | https://kubeflow.slack.com/archives/CE10KS9M4/p1645603722299249?thread_ts=1645458077.332789&cid=CE10KS9M4 | 05:17:18 |