!caaCTdhYlLGpbJVvly:matrix.org

kubeflow-pipelines

453 Members
1 Servers

21 Jan 2022
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown
Thank you Rahul Mehta, here is how we can do it.
Pipeline-wide config:
# Imports assumed by the snippets below (not shown in the original message)
import kfp
from kfp import compiler
from kfp.aws import use_aws_secret
from kubernetes.client import V1Toleration

pipeline_conf = kfp.dsl.PipelineConf()
pipeline_conf.set_image_pull_policy("Always")
pipeline_conf.set_default_pod_node_selector("your-node-label", "your-label-value")
pipeline_conf.add_op_transformer(use_aws_secret(aws_region="us-east-2"))

compiler.Compiler(mode=kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE).compile(
    pipeline_func=semantic_segmentation_baseline_pipeline,
    package_path=PIPELINE_DEFINITION_FILE_NAME,
    type_check=True,
    pipeline_conf=pipeline_conf,
)
Component-specific config:
cpu_toleration = V1Toleration(effect="NoSchedule", key="mdinfinity.com/compute", operator="Equal", toleration_seconds=None, value="true")

download_image_dataset_task = download_image_dataset_task(
    dataset_region=dataset_region,
    s3_images_path=s3_images_path,
    s3_labels_path=s3_masks_path,
).add_toleration(cpu_toleration).set_caching_options(False)

...

train_segmentation_baseline_model_task = train_segmentation_baseline_model_task(
    number_of_classes=number_of_classes,
    backbone=backbone,
    head=head,
    pretrained=pretrained,
    learning_rate=learning_rate,
    max_epochs=max_epochs,
    batch_size=batch_size,
    number_of_gpus=number_of_gpus,
    precision=precision,
    seed=seed,
    train_images=split_image_dataset_component_task.outputs["train_x"],
    valid_images=split_image_dataset_component_task.outputs["valid_x"],
    train_masks=split_image_dataset_component_task.outputs["train_y"],
    valid_masks=split_image_dataset_component_task.outputs["valid_y"],
).set_gpu_limit(1).add_node_selector_constraint("k8s.amazonaws.com/accelerator", "nvidia-tesla-v100")
I found how to do it by checking the SDK doc: https://kubeflow-pipelines.readthedocs.io/en/stable/source/kfp.dsl.html?highlight=PipelineConf#kfp.dsl.PipelineConf There are a ton more; these are just the ones I use at the moment. For instance, there seems to be one to add a pod annotation to a component, see https://kubeflow-pipelines.readthedocs.io/en/stable/source/kfp.dsl.html?highlight=annotation#kfp.dsl.BaseOp.add_pod_annotation
13:48:40
@_slack_kubeflow_U02LW7FHWLS:matrix.orgRahul Mehta
That all makes sense, but how is the component-specific config applied to a specific lightweight component?
13:51:29
@_slack_kubeflow_U02LW7FHWLS:matrix.orgRahul Mehta
Basically, adding annotations was supposed to make the workflow you just mentioned work with the new decorator. I'll look at what you posted and if it doesn't appear to work will post an example. OTOH, while the other approach may work I think it's cleaner to allow the arg in the decorator
13:52:33
@_slack_kubeflow_U02LW7FHWLS:matrix.orgRahul Mehta
Is download_dataset_image_task a v2 function component?
13:53:19
@_slack_kubeflow_U02LW7FHWLS:matrix.orgRahul Mehta
The factory produced by the decorator returns a TaskSpec, not a ContainerOp.
13:53:53
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown
Yes, sorry for the bad communication on my part: the samples I sent are based on the v2 DSL, which is the new recommended way to create components.
from kfp.v2.dsl import (
    component,
    Output,
    Dataset
)


@component(
    base_image="my_base_image"
)
def download_s3_image_dataset(
13:57:55
@_slack_kubeflow_U02LW7FHWLS:matrix.orgRahul Mehta
Aha, I think I understand your solution. Which KFP SDK version are you on? I’ll give it a try today. If that works, I’m happy to close this PR, but I think the developer experience could be significantly better. Perhaps a first step would be improving the typings of the object returned by the factory function.
14:04:34
@_slack_kubeflow_U02LW7FHWLS:matrix.orgRahul Mehta
Alexandre Brown to close this out, this works equivalently for me:
generate_task = generate_data()\
    .add_pod_annotation("capability", "compute-optimized")
And then have the logic to apply the various requests/limits/tolerations still in the op_transformer
14:12:46
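The workflow described above (tag a task with a pod annotation, then let an op transformer apply the matching settings) might look roughly like the sketch below. The PROFILES table, the apply_capability_profile name and the FakeTask stand-in are all hypothetical; the sketch assumes the task exposes pod_annotations, set_cpu_limit and set_memory_limit the way a KFP 1.8 ContainerOp does, and the stand-in exists only so the snippet runs without kfp or a cluster.

```python
# Hypothetical mapping from the "capability" annotation value to resource settings.
PROFILES = {
    "compute-optimized": {"cpu_limit": "8", "memory_limit": "16G"},
}


def apply_capability_profile(op):
    """Op transformer sketch: configure a task from its 'capability' pod annotation."""
    capability = getattr(op, "pod_annotations", {}).get("capability")
    profile = PROFILES.get(capability)
    if profile is None:
        return op  # no known capability requested: leave the task untouched
    op.set_cpu_limit(profile["cpu_limit"])
    op.set_memory_limit(profile["memory_limit"])
    return op


class FakeTask:
    """Hypothetical stand-in for a KFP task (assumption, not the real class)."""

    def __init__(self):
        self.pod_annotations = {}
        self.limits = {}

    def add_pod_annotation(self, name, value):
        self.pod_annotations[name] = value
        return self

    def set_cpu_limit(self, cpu):
        self.limits["cpu"] = cpu
        return self

    def set_memory_limit(self, memory):
        self.limits["memory"] = memory
        return self


# Tag the task the same way as in the message above, then run the transformer.
tagged = FakeTask().add_pod_annotation("capability", "compute-optimized")
apply_capability_profile(tagged)
print(tagged.limits)
```

With the real SDK the transformer would presumably be registered with pipeline_conf.add_op_transformer(apply_capability_profile), and tolerations could be applied inside it in the same way via add_toleration.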
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown
I'm using kfp==1.8.6 and you can find how to use the v2 @component dsl here : https://www.kubeflow.org/docs/components/pipelines/sdk-v2/python-function-components/#example-python-function-based-component
14:12:57
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown
Yes, you can still use the other set_*_limit functions; my first sample showed how to limit GPU, but you can do the same for memory, for instance.
14:14:57
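As a rough illustration of the point above, the chaining style used for set_gpu_limit should extend to CPU and memory. The sketch assumes the KFP 1.8 setters set_cpu_request, set_cpu_limit, set_memory_request, set_memory_limit and set_gpu_limit, each returning the task so calls can be chained; the quantities, the function name and the RecordingTask stand-in (used so the snippet runs without kfp) are illustrative assumptions.

```python
def apply_training_resources(task):
    """Chain CPU, memory and GPU settings on a task; values are illustrative only."""
    return (
        task
        .set_cpu_request("2")
        .set_cpu_limit("4")
        .set_memory_request("8G")
        .set_memory_limit("16G")
        .set_gpu_limit(1)
    )


class RecordingTask:
    """Hypothetical stand-in that records every setter call and returns itself."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the setters.
        def method(*args):
            self.calls.append((name, args))
            return self
        return method


task = apply_training_resources(RecordingTask())
for name, args in task.calls:
    print(name, args)
```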
@_slack_kubeflow_U02LW7FHWLS:matrix.orgRahul Mehta
Sounds good. Think we’ll still have some work to do to make the interface nicer for our end-users but this is a good starting point. Closing the PR
14:15:24
@_slack_kubeflow_U02LW7FHWLS:matrix.orgRahul Mehta
I’m happy to make a docs contribution to make this more explicit
14:15:35
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown
Also, there is the more generic add_resource_request: https://kubeflow-pipelines.readthedocs.io/en/stable/source/kfp.dsl.html?highlight=annotation#kfp.dsl.Sidecar.add_resource_request There are just a ton of them, but yes, you are right, the doc is clearly lacking on this; I had to dig a bit to find it.
14:16:47
@_slack_kubeflow_U02LW7FHWLS:matrix.orgRahul Mehta
I was aware of all those operations, but previously had used them only for explicit instances of ContainerOp (i.e. legacy components). The main confusion came from inspecting the type returned by the factory produced by the v2 @component decorator: inspecting it manually or looking at the source, it seems to produce a TaskSpec, so it’s not immediately clear it’s actually a ContainerOp 😅
14:18:24
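One quick way to check what the object returned by a component factory actually supports, whatever its declared type, is to probe it for the configuration methods mentioned in this thread. This is a minimal sketch: supported_config_methods and the DemoTask stand-in are hypothetical names, and CONFIG_METHODS lists only the methods discussed above, not the full SDK surface.

```python
# Configuration methods mentioned in this thread (a subset, not the full API).
CONFIG_METHODS = (
    "set_gpu_limit",
    "set_memory_limit",
    "add_toleration",
    "add_pod_annotation",
    "add_node_selector_constraint",
    "set_caching_options",
)


def supported_config_methods(task):
    """Return the names in CONFIG_METHODS that `task` exposes as callables."""
    return [name for name in CONFIG_METHODS if callable(getattr(task, name, None))]


class DemoTask:
    """Hypothetical stand-in exposing only two of the methods."""

    def set_gpu_limit(self, gpu):
        return self

    def add_toleration(self, toleration):
        return self


demo = DemoTask()
print(type(demo).__name__, supported_config_methods(demo))
```

Passing the task created inside a pipeline function instead of DemoTask would show which of these calls can be chained on it, without relying on the factory's type annotations.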
@_slack_kubeflow_U02UX46H74N:matrix.orgVictor Sadkov
Can I please have a review on this 1-line change: https://github.com/kubeflow/pipelines/pull/6624/files
19:11:17
@_slack_kubeflow_U02LW7FHWLS:matrix.orgRahul Mehta
FWIW the scope of the PR should probably be feat(sdk) instead of feat(backend)
19:31:05
22 Jan 2022
@_slack_kubeflow_U02UX46H74N:matrix.orgVictor Sadkov
Thanks Rahul
00:05:01
@_slack_kubeflow_UTNUW80K1:matrix.orgChen Sun
Replied to you in the PR.
08:30:30
@gh0st-w00lf:matrix.orgGh0st W0lf
Yo what is this? What is kubeflow? Is it something I can use to replace the crappy Azure DevOps solution I'm forced to use?
09:20:26
@_slack_kubeflow_U02TAA5NKS7:matrix.orgBabis Samothrakis
In reply to@_slack_kubeflow_U02AYBVSLSK:matrix.org
Also your code looks fine and you could even remove the packages_to_install and install the dependencies inside your docker image instead (eg: RUN pip install numpy==1.19.4)
Alexandre Brown 🙏 My first ever container image on Docker Hub is a reality, and it works in my Kubeflow pipeline. I also removed packages_to_install and installed the dependencies (numpy and tensorflow) directly in the image, as you advised.
21:24:26
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown
Awesome work Babis Samothrakis 🎉!
21:34:42
