27 May 2022 |
| laserK3000 joined the room. | 12:11:18 |
laserK3000 | Hi, what is the intended way in v2 to output multiple files of a priori unknown number as an Artifact? Do I have to compress everything into a single file, or can I treat Output.path as a directory? I did the latter. It works (the files end up in MinIO), but I wanted to know if this is intended behavior or has some caveats. One thing that did not work with this approach, though, was downloading the artifact via the UI. | 12:31:00 |
Chase Christensen | laserK3000 I believe the output path is supposed to be a directory. Everything is pushed to S3 (MinIO) for outputs. My understanding is that in v2 you can choose the URL or S3 endpoint. You should be able to look at the DAG and see where the output artifacts are stored. I think it's a good approach. You can also get real spicy and do VolumeOps: just write to local volumes for marshaling and build pipelines that write to expected local directories. Then you aren't as reliant on MinIO and can just write plain ol' Python functions and drop stuff wherever without worrying about how KFP handles marshaling. | 14:01:17 |
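A stdlib-only sketch of the pattern discussed above — treating the output path as a directory and writing an a priori unknown number of files into it. The `write_shards` name and file contents are made up for illustration; in a KFP v2 component this would be the component body, with `output_path` supplied by the Output artifact:

```python
import os
import tempfile

def write_shards(output_path: str, records: list) -> int:
    """Write each record to its own file under output_path.

    Treats the artifact path as a directory, as in the approach
    described above (files then get uploaded to MinIO as one artifact).
    """
    os.makedirs(output_path, exist_ok=True)
    for i, record in enumerate(records):
        with open(os.path.join(output_path, f"part-{i}.txt"), "w") as f:
            f.write(record)
    return len(records)

# Local smoke test, with a temp directory standing in for the artifact path.
with tempfile.TemporaryDirectory() as tmp:
    n = write_shards(tmp, ["alpha", "beta", "gamma"])
    print(n, sorted(os.listdir(tmp)))
# → 3 ['part-0.txt', 'part-1.txt', 'part-2.txt']
```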
28 May 2022 |
laserK3000 | Thank you! | 15:42:45 |
laserK3000 | I used mount_pvc from kfp.onprem before but apparently this has not yet been ported to the final v2 implementation. That's why I was looking into using minio for passing large amounts of data. | 15:45:48 |
Chase Christensen | You can just create the VolumeOp and pass the object to steps | 16:07:29 |
Chase Christensen | Especially if you are using lightweight Python components. I'm away from my desk; I can get you an example when I'm back. | 16:08:02 |
29 May 2022 |
mobin nikkhesal | Hello everyone,
When using kfp, I would like to pull images from a private container registry. Could you please explain how to set this up?
---
kubeflow: 1.5 (wg distribution)
Kubernetes: 1.21 (on-premise)
kfp: 1.8 | 06:17:34 |
Yingding Wang | Please take a look at this kubeflow notebook from which I start a pipeline with a component using a private container registry. https://github.com/careforrare/kf-pipelines/blob/main/demo-examples/pipeline_image_registry_builder_sdk_v2.ipynb
I am still on kfp 1.7.0, kubeflow 1.4, and K8s 1.21 (on-prem), but I think the principle for kf 1.5 should be the same. | 09:58:33 |
mobin nikkhesal | thanks Yingding Wang | 11:31:14 |
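For the private-registry question above, the usual Kubernetes-level mechanism is an `imagePullSecrets` entry on the service account the pipeline pods run as. A stdlib-only sketch that builds the JSON patch body you would hand to `kubectl patch serviceaccount`; the secret name `regcred` and the `default-editor` service account are assumptions based on a typical Kubeflow profile namespace:

```python
import json

def image_pull_secrets_patch(secret_names: list) -> str:
    """JSON patch body that adds imagePullSecrets to a service account."""
    patch = {"imagePullSecrets": [{"name": n} for n in secret_names]}
    return json.dumps(patch)

# e.g.  kubectl patch serviceaccount default-editor -n <profile-ns> -p '<patch>'
print(image_pull_secrets_patch(["regcred"]))
# → {"imagePullSecrets": [{"name": "regcred"}]}
```

The kfp v1 SDK also has a per-pipeline hook, `dsl.get_pipeline_conf().set_image_pull_secrets(...)`, if you prefer not to touch the service account.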
30 May 2022 |
Frédéric Kaczynski | The only thing I think could be used for your use-case is Workflow Events (https://argoproj.github.io/argo-workflows/workflow-events/), but it states it should not be used for automation. :/ | 08:43:26 |
Cornelis Boon | Thanks! Will have a look, but I guess I’ll just write code that catches any errors and reports them as such before raising/throwing them further | 08:47:34 |
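A stdlib-only sketch of that catch-report-reraise pattern; the `report_failure` hook is a placeholder for whatever reporting mechanism you actually use:

```python
import traceback

def report_failure(step_name: str, exc: BaseException) -> None:
    # Placeholder: swap in your real reporting (log aggregator, webhook, etc.).
    print(f"step {step_name!r} failed: {exc}")
    traceback.print_exc()

def run_step(step_name: str, fn, *args, **kwargs):
    """Run a pipeline step, report any failure, then re-raise it
    so the workflow engine still marks the step as failed."""
    try:
        return fn(*args, **kwargs)
    except Exception as exc:
        report_failure(step_name, exc)
        raise

print(run_step("add", lambda a, b: a + b, 1, 2))
# → 3
```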
Chase Christensen | import kfp.components as comp
import kfp.dsl as dsl

# define a function to add 2 numbers
def add(a: float, b: float) -> float:
    return a + b

add_op = comp.func_to_container_op(add)  # factory that creates kfp.dsl.ContainerOp instances for your pipeline

# function to write a float to a path
def write(path: str, x: float) -> str:
    num = str(x)
    f = open(path, "a")
    f.write(num)
    f.close()
    return path

write("demofile2.txt", 1)  # testing our function locally

# function to read a file back and print its contents
def read(path: str):
    f = open(path, "r")
    print(f.read())
    f.close()

write_op = comp.func_to_container_op(write)  # converting our function to a ContainerOp
read_op = comp.func_to_container_op(read)  # converting our function to a ContainerOp

# new pipeline with a volume being passed
@dsl.pipeline(
    name='Volume Pipeline',
    description='simple pipeline to create a FRESH volume, add some numbers, and attach that volume to the different pods for their runs'
)
def volume_pipeline(
    a='1',
    b='2',
    c='3',
):
    vop = dsl.VolumeOp(
        name="volume_creation",
        resource_name="mypvc",
        size="5Gi",
        modes=dsl.VOLUME_MODE_RWM
    )
    add_task1 = add_op(a, 3).add_pvolumes({"/mnt": vop.volume})
    add_task2 = add_op(add_task1.output, b)
    add_task3 = add_op(add_task2.output, c)
    write_task = write_op("/mnt/output.txt", add_task3.output).add_pvolumes({"/mnt": vop.volume})
    read_task = read_op(write_task.output).add_pvolumes({"/mnt": vop.volume}) | 11:50:09 |
laserK3000 | Thanks for your help! | 13:00:57 |
Shrinath Suresh | The PR https://github.com/kubeflow/pipelines/pull/7615 is still blocked due to an upstream build failure. Can someone help? | 17:35:32 |
Yingding Wang | Is there an API to define an init container in the KFP Python SDK? I would like to use an init container to mount a MinIO bucket with FUSE (https://github.com/minio/minfs) for the kf components. | 18:51:05 |
Yingding Wang | I found the API to define an init container (https://kubeflow-pipelines.readthedocs.io/en/stable/source/kfp.dsl.html?highlight=pvolumes), but I think it will not work the way I thought. | 20:07:04 |
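If the SDK route does not pan out, the underlying Kubernetes shape is just an `initContainers` entry in the pod spec. A stdlib-only sketch of that fragment — the container name, image, and mount path here are made up, and whether KFP lets you inject this (e.g. via `dsl.UserContainer` and the `init_containers` argument in the v1 SDK) depends on your version:

```python
import json

def init_container_spec(name: str, image: str, command: list) -> dict:
    """Pod-spec fragment declaring one init container (plain Kubernetes shape)."""
    return {"initContainers": [{"name": name, "image": image, "command": command}]}

# Hypothetical minfs mount; image and paths are illustrative only.
spec = init_container_spec("mount-minfs", "example/minfs:latest",
                           ["minfs", "myminio/bucket", "/mnt/bucket"])
print(json.dumps(spec, indent=2))
```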
31 May 2022 |
Nicholas Kosteski | Has anyone had trouble getting their visualizations to render in the Run Outputs tab with kubeflow pipelines 1.8.1? If it matters, I'm using the standalone deployment and my pipeline code is using the v1 compiler. I found an open issue for v2 but no mentions of v1 breaking. Any insight/experience from anyone would be super helpful! | 14:51:01 |
Joseph Olaide | Hi Nicholas, what are your issues? | 14:52:39 |
Joseph Olaide | Are you passing the visualization metadata to the right path? Nicholas Kosteski | 14:53:19 |
Nicholas Kosteski | I've attached some screenshots. The markdown shows up accurately in the pod visualizations tab, but when I go to the Run Output tab nothing shows up. The data is inline, and the artifacts seem to be generated fine. | 14:56:13 |
Nicholas Kosteski | Here's the test code:
from typing import NamedTuple

import kfp

@kfp.components.create_component_from_func
def metadata_and_metrics() -> NamedTuple(
    "Outputs",
    [("mlpipeline_ui_metadata", "UI_metadata"), ("mlpipeline_metrics", "Metrics")],
):
    metadata = {
        "outputs": [
            {"storage": "inline", "source": "this should be bold", "type": "markdown"}
        ]
    }
    metrics = {
        "metrics": [
            {
                "name": "train-accuracy",
                "numberValue": 0.9,
            },
            {
                "name": "test-accuracy",
                "numberValue": 0.7,
            },
        ]
    }
    from collections import namedtuple
    import json

    return namedtuple("output", ["mlpipeline_ui_metadata", "mlpipeline_metrics"])(
        json.dumps(metadata), json.dumps(metrics)
    )

@kfp.dsl.pipeline()
def pipeline():
    metadata_and_metrics()
| 14:57:02 |
Joseph Olaide | If I get you, it's not displaying in the pod visualization?
However in the pipeline's run output it shows the metrics. | 15:01:09 |
Nicholas Kosteski | Yeah, in the version of kfp we run in prod currently those pod visualizations are seen in the Run Output tab. However it seems something might have broken between v1.0.4 and v1.8.1 (I know big jump haha) | 15:03:09 |
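When debugging cases like this one, it can help to sanity-check the shape of the `mlpipeline-ui-metadata` payload the UI parses. A stdlib-only sketch that builds the same inline-markdown structure as the snippet above (field names taken from that snippet; the helper name is made up):

```python
import json

def make_ui_metadata(markdown_source: str) -> str:
    """Build the mlpipeline-ui-metadata payload for an inline markdown viz."""
    metadata = {
        "outputs": [
            {"storage": "inline", "source": markdown_source, "type": "markdown"}
        ]
    }
    return json.dumps(metadata)

payload = make_ui_metadata("**this should be bold**")
parsed = json.loads(payload)
print(parsed["outputs"][0]["type"], parsed["outputs"][0]["storage"])
# → markdown inline
```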
Irvin Tang | hey sorry for the late reply. forgot how this was resolved, but i’m not seeing this error anymore | 15:11:03 |