!caaCTdhYlLGpbJVvly:matrix.org

kubeflow-pipelines

741 Members
2 Servers

Sender Message Time
19 May 2022
@_slack_kubeflow_U03FXUQEWFN:matrix.org_slack_kubeflow_U03FXUQEWFN joined the room.23:17:47
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan joined the room.23:32:14
20 May 2022
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan changed their display name from _slack_kubeflow_U01444C5K89 to jaklan.00:32:55
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan set a profile picture.00:32:56
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan Hi guys, we have the following issue:
• we would like to have a root pipeline (== global pipeline) which, after running some common jobs, triggers child pipelines (== country-specific pipelines)
• from the root's perspective, when one country pipeline fails, we should get a failure status for it without blocking the other country pipelines (so just parallel tasks with something like continueOn: failed: true in Argo Workflows)
• on top of that, there are a few business requirements:
  ◦ we should be able to open a detailed view of a selected child pipeline from the root pipeline (it can simply be a link to another pipeline run; it doesn't have to be a dynamically expandable DAG-of-DAGs view)
  ◦ users have to be able to either re-run or retry the child pipeline, depending on which step failed (for now, just manually)

And now the question: how can we achieve something like that with KFP? KFP doesn't have any concept of a pipeline of pipelines, so we imagine we can create a root pipeline that just triggers child pipelines via API calls. But then it becomes very problematic to follow their status: we would need to send GET requests for the child pipelines' status every 30 seconds or so, and if a child pipeline fails and someone decides to re-run or retry it, we won't be able to get the new status, because the child was already marked as failed (at least not without dirty workarounds like manually modifying the status in the database).

A good example of what we want to achieve is GitLab CI child pipelines. You can create a parent pipeline that just triggers child pipelines, and define whether the root pipeline should wait for the status of the child pipelines (and fail if they fail) or always pass no matter what happens in them. In the first scenario, if you retry a failed job and it succeeds, the child pipeline simply continues, and if it goes green, so does the rest of the root pipeline.
And if you want to re-run the whole pipeline, that's also pretty easy. 00:32:57
@_slack_kubeflow_U036ZCFAFLP:matrix.orgJonny Browning Hi Jakub - what's the timeline for your project? I think KFP V2 (later this year) will be adding support for graph-based components, so you should then be able to achieve pipelines-of-pipelines 10:21:01
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan Oh that's interesting, can you provide any more details about it? Any article, presentation, docs link? 10:21:46
@_slack_kubeflow_U036ZCFAFLP:matrix.orgJonny Browning I will have a look! As I'm sure you know, a lot of the documentation isn't fully up to date! 10:22:30
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan haha that's the never-ending problem with KFP 😄 10:22:52
@_slack_kubeflow_U036ZCFAFLP:matrix.orgJonny Browning some hints at ongoing work in the GH repo https://github.com/kubeflow/pipelines/pull/7551 10:24:39
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan thanks! 11:29:02
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan and what would you recommend for the next few months, so that we don't end up with dirty, unmaintainable workarounds? 11:30:37
@_slack_kubeflow_U036ZCFAFLP:matrix.orgJonny Browning Tricky - KFP v2 will likely require a decent migration effort anyway as it's a major release 11:31:45
@_slack_kubeflow_U036ZCFAFLP:matrix.orgJonny Browning some interesting discussion here also (slightly older but you might find it useful) - https://github.com/kubeflow/pipelines/issues/4555 11:33:23
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan thanks again, will have a look 11:34:33
@_slack_kubeflow_U036ZCFAFLP:matrix.orgJonny Browning You might be able to use this piece from the client library https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.client.html#kfp.Client.wait_for_run_completion 11:34:52
@_slack_kubeflow_U036ZCFAFLP:matrix.orgJonny Browning (within a pipeline component for your "parent" pipeline) 11:35:03
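A minimal sketch of the suggestion above, assuming the KFP v1 SDK, where `kfp.Client.wait_for_run_completion(run_id, timeout)` returns a run detail object exposing `.run.status` and raises `TimeoutError` on timeout. The helper name `wait_for_children`, the `"TimedOut"` marker, and the idea of running this inside a component of the parent pipeline are illustrative assumptions, not something confirmed in this thread. It records each child's final status instead of failing on the first error, so one failed country does not hide the others.

```python
def wait_for_children(client, run_ids, timeout_s=3600):
    """Wait for every child run and return {run_id: final status}.

    `client` is assumed to behave like a KFP v1 `kfp.Client`.
    A failed child is recorded, not raised, so the remaining
    children are still waited for and reported.
    """
    statuses = {}
    for run_id in run_ids:
        try:
            detail = client.wait_for_run_completion(run_id, timeout_s)
            statuses[run_id] = detail.run.status
        except TimeoutError:
            # Hypothetical marker for a child that did not finish in time.
            statuses[run_id] = "TimedOut"
    return statuses
```

The children are waited on one after another, so the worst-case wall time is the sum of the per-child timeouts. The parent step can then decide for itself whether any non-`Succeeded` status should fail the root run.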
@_slack_kubeflow_U036ZCFAFLP:matrix.orgJonny Browning Don't know what platform you are running on, but you could also use something like Google Cloud Composer (/Airflow) for orchestrating the whole thing - but I appreciate that's quite a heavyweight solution! 11:37:07
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan definitely useful, but it's still challenging to monitor retried pipelines: a retry wouldn't affect the parent pipeline, because it's just one-way communication 11:37:22
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan we have Kubeflow deployed on AWS and in general we try to leverage Argo for scheduling (but we're still only learning it) 11:38:22
@_slack_kubeflow_U01444C5K89:matrix.orgjaklan but even with Argo I think the issue is exactly the same: how to build that two-way communication 11:39:01
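One way to approximate the two-way communication discussed here is for the parent step to keep re-reading every child's status until a deadline, instead of treating the first `Failed` as final. The sketch below assumes a KFP v1 style client whose `get_run(run_id)` returns an object with `.run.status`, and it assumes that a manual retry keeps the same run ID, which should be verified against the KFP version in use. The function name `poll_children` and its parameters are hypothetical.

```python
import time


def poll_children(client, run_ids, deadline_s=6 * 3600, interval_s=30,
                  sleep=time.sleep, clock=time.monotonic):
    """Re-read each child's status until all succeed or the deadline passes.

    A child seen as "Failed" is polled again on the next cycle, so a
    manual retry that later succeeds is picked up by the parent.
    """
    start = clock()
    statuses = {}
    while True:
        for run_id in run_ids:
            statuses[run_id] = client.get_run(run_id).run.status
        all_ok = all(s == "Succeeded" for s in statuses.values())
        if all_ok or clock() - start >= deadline_s:
            return statuses
        sleep(interval_s)
```

The trade-off is that the parent step stays alive until the deadline whenever a child remains failed, so the deadline effectively becomes the window in which someone can retry a child and still have the root run observe the result.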
@_slack_kubeflow_U03GAQYGV4K:matrix.orgKrzysztof Romanowski joined the room.11:42:27
@_slack_kubeflow_U03GAQYGV4K:matrix.orgKrzysztof Romanowski changed their display name from _slack_kubeflow_U03GAQYGV4K to Krzysztof Romanowski.11:45:14
@_slack_kubeflow_U03GAQYGV4K:matrix.orgKrzysztof Romanowski set a profile picture.11:45:15
@_slack_kubeflow_U03GAQYGV4K:matrix.orgKrzysztof Romanowski Hello, I work with jaklan and I'm the admin of our Kubeflow instance. This is very interesting Jonny Browning, will definitely take a look! 🙂 11:45:16
@fredkid:matrix.orgfredkid joined the room.11:49:24
@willykin:matrix.orgwillykin joined the room.18:24:04
@_slack_kubeflow_U03FGUC1VGA:matrix.orgVu Dat joined the room.21:45:41
@_slack_kubeflow_U03FGUC1VGA:matrix.orgVu Dat Do you have a solution? 21:52:49
@_slack_kubeflow_U03FGUC1VGA:matrix.orgVu Dat Vinay Anantharaman 21:52:58