kubeflow-pipelines - Public Room Timeline

	kubeflow-pipelines	741 Members
		2 Servers

Sender	Message	Time
17 May 2022
Amit Jha	Do you have namespace selected on top of page drop down?	18:53:14
Amit Jha	If it does not let you select one or does not stick, some of the pods may need to be restarted.	18:54:06
Clark Updike	I think I know what happened. I was using the KFP Client in a Jupyter Notebook and it provided URLs directly to the Pipeline UI (outside of Kubeflow)... and those URLs don't apply any namespace. When I went back into experiments from the main Kubeflow UI (where the namespace is shown at the top), it started working...	18:58:38
Clark Updike	Thanks for the pointer that made me realize what was happening!	18:58:50
	leke onilude changed their display name from _slack_kubeflow_U038V0U5YFM to leke onilude.	19:39:03
	leke onilude set a profile picture.	19:39:16
	Jonny Browning changed their display name from _slack_kubeflow_U036ZCFAFLP to Jonny Browning.	22:20:28
	Jonny Browning set a profile picture.	22:20:31
	@californiatokens:matrix.org joined the room.	23:09:17
18 May 2022
	Abhishek Sharma joined the room.	08:05:47
Abhishek Sharma	Hi everyone, I am facing an issue while trying to pass a `dict` as a parameter to a kfp component: `Error: Structure "OrderedDict()" is incompatible with type "typing.Union[str, int, float, bool, NoneType]" - none of the types in Union are compatible.` According to the Kubeflow SDK v2 docs we can pass a `dict` as a pipeline parameter [1], but another KFP document [2] says that `Parameters are passed into your component by value, and can be of any of the following types: int, double, float, or str.` I am getting conflicting information on this. Can you please help me understand what is going on? Is this a versioning issue or am I missing something? I am loading the KFP component as follows: `component = kfp.components.load_component_from_text(manifest_string)` [1] https://www.kubeflow.org/docs/components/pipelines/sdk-v2/v2-component-io/ [2] https://www.kubeflow.org/docs/components/pipelines/sdk-v2/build-pipeline/	08:06:30
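One possible workaround for the `dict` error above, not confirmed anywhere in this thread, is to serialize the dict to a JSON string before passing it and decode it inside the component, so the parameter stays within the primitive types named in the error message. The sketch below uses only the standard library; `encode_param` and `train_component` are hypothetical names, and `train_component` merely stands in for whatever code runs inside the component.

```python
import json


def encode_param(value: dict) -> str:
    """Serialize a dict so it can travel as a plain string parameter."""
    return json.dumps(value, sort_keys=True)


def train_component(config_json: str) -> str:
    """Hypothetical component body: takes a string and decodes it back to a dict."""
    config = json.loads(config_json)
    return f"lr={config['lr']} epochs={config['epochs']}"


# The pipeline would pass `encoded` (a str) instead of the raw dict.
encoded = encode_param({"lr": 0.01, "epochs": 5})
result = train_component(encoded)
print(result)
```

Whether a native `dict` parameter works appears to depend on the SDK version and on the types declared in the component manifest, so this sidesteps the question rather than answering it.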
Cornelis Boon	Been using KF pipelines (the one deployed by GCP AI platform) for a while. Something I’ve not really looked into is automated testing of the pipelines. Has anyone set this up for themselves? Are these the way to go? Or is someone using an alternative toolset for e2e testing of pipelines? https://github.com/kubeflow/pipelines/wiki/Tests https://github.com/kubeflow/testing	08:06:42
	@californiatokens:matrix.org left the room.	10:56:55
	@billykin:matrix.org joined the room.	12:36:35
droctothorpe	Weird question for folks who use the KFP CLI: why? As in, why not just use the SDK? Does the CLI provide any particular features above and beyond what the SDK provides that motivate you to use it? Thanks!	13:24:03
	_slack_kubeflow_ULV26QXM4 joined the room.	14:07:12
	@billykin:matrix.org left the room.	16:53:13
19 May 2022
Ian Miller	Hi all, question for the community. Are most organizations backing kfp with Argo Workflows or Tekton? Is it possible to back it with both? We currently back with Tekton, but have users wanting to leverage kfp v2 features which (to my knowledge) are not yet supported by kfp-tekton. This results in users frequently finding tutorials/examples which don't work on our clusters.	01:45:41
	레몬버터구이 joined the room.	02:59:35
Ferdinand von den Eichen [Kineo.ai]	Who here is using spot instances for training as part of KF pipelines? I love the financial opportunities, but long-running training seems to be a uniquely poor fit for spot. 1. Has anyone figured out how to handle eviction properly during training, i.e. to pause and pick up on a new pod? 2. Failing that, would setting a `.set_retry(X)` on training steps be good enough? The intuition being that if we have to get evicted, we can just retry our training on the new node…	14:55:09
	Chris Chase joined the room.	16:20:47
	Amira Menfis joined the room.	16:55:06
	Amira Menfis changed their display name from _slack_kubeflow_U01S6RV9U9M to Amira Menfis.	16:55:23
	Amira Menfis set a profile picture.	16:55:24
Amira Menfis	Hi, could you let me know how to run a Kubeflow pipeline outside of Kubeflow?	16:55:24
Amit Jha	`kfp.Client.run_pipeline` - https://www.kubeflow.org/docs/components/pipelines/sdk/sdk-overview/	18:49:29
droctothorpe	Long-running components are a bad use case for spot instances, IMO, but `set_retry` should theoretically help.	20:14:59
Rahul Mehta	In the case of model training, most libraries support some notion of checkpointing. If you include the checkpoint dir (in s3/other cloud storage) as an argument to your component & appropriately set the retry policy, then you should be able to resume from the checkpoint when the pod is rescheduled.	20:27:02
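The checkpoint-and-retry idea above can be sketched roughly as follows. This is a minimal illustration, not anyone's actual training code: a local temp directory stands in for the s3/cloud checkpoint dir, and the `train` and `load_step` helpers, the `fail_at` argument, and the JSON checkpoint format are all assumptions made for the example. The simulated failure plays the role of a spot eviction, and the second call plays the role of the retried pod.

```python
import json
import os
import tempfile
from typing import Optional


def load_step(checkpoint_dir: str) -> int:
    """Return the last completed step, or 0 if no checkpoint exists yet."""
    path = os.path.join(checkpoint_dir, "checkpoint.json")
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["step"]


def train(checkpoint_dir: str, total_steps: int, fail_at: Optional[int] = None) -> int:
    """Run (or resume) a toy training loop, checkpointing after every step."""
    path = os.path.join(checkpoint_dir, "checkpoint.json")
    step = load_step(checkpoint_dir)  # resume point; 0 on a fresh run
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated eviction")
        step += 1  # stand-in for one unit of real training work
        # Write to a temp file and rename, so an eviction mid-write
        # cannot leave a half-written checkpoint behind.
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"step": step}, f)
        os.replace(tmp, path)
    return step


ckpt_dir = tempfile.mkdtemp()
try:
    train(ckpt_dir, total_steps=10, fail_at=4)  # first attempt gets "evicted"
except RuntimeError:
    pass
resumed_from = load_step(ckpt_dir)
final = train(ckpt_dir, total_steps=10)  # the retry picks up where it left off
print(resumed_from, final)
```

In a real pipeline the checkpoint dir would be passed to the component as an argument, as suggested above, and the retry policy on the step would take care of re-running it.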
Rahul Mehta	Re (1), the kubernetes scheduler should handle that for you; when a node is removed from the cluster, k8s will taint that node with `NoSchedule` -- when the pod retries after being evicted, it will only be able to schedule on nodes without that taint	20:27:49
David Aronchick	Hi, can you say what you're looking for? Not sure I understand	21:52:04
