1 Jun 2022 |
Nicholas Kosteski | Yeah I definitely didn’t get any specific errors, it seems like the js function that is supposed to grab the artifacts is just kinda coming back with 0 things when I select the Run output tab. So I’m not sure if we’re having the same issue or not but it certainly could be that they’re related? ¯\_(ツ)_/¯
I don’t know if this problem persists in a multi-user env or not. I’m just using the standalone feature (since it’s all we need/use) right now. But I might try to look at KF 1.5 next since this is taking up much more time than I was expecting…
What’s super bizarre is that I can sometimes get it to display in the Run output. It requires that I go to the run before the execution is done, click into the pod/task visualizations until they show up. Then as long as I don’t refresh the page, everything appears exactly how I would expect in both the pod visualizations and the run output tab. Once I refresh the page however, the visualizations disappear in the Run output tab only. It seems something is changing/clearing some state that it shouldn’t be 😖 | 18:58:46 |
Rahul Mehta | I think the idea is that you don't delete anything (I encountered this too when I found there isn't an out-of-the-box way to enforce a retention policy) -- my understanding is that it's helpful for experiment tracking/archival and audit purposes if only archives are permitted instead of deletions | 20:51:45 |
Rahul Mehta | For example, ml-metadata is designed to be immutable | 20:52:21 |
droctothorpe | Interesting! Thanks, Rahul. I guess that helps contextualize the design choice. You got around this with the retention policy you introduced though, right? | 21:20:54 |
Rahul Mehta | Yes, basically a k8s CronJob that periodically deletes rows that are older than 30 days from the various tables in the MySQL DB | 21:21:33 |
Rahul Mehta | Though, we don't enforce anything like that for experiments (just runs & assoc. metadata) | 21:21:48 |
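For anyone wanting to replicate that retention policy, below is a minimal sketch of the row-deletion logic such a CronJob could execute. The database name (mlpipeline), table name (run_details), and column (CreatedAtInSec) are assumptions that should be checked against your KFP deployment's schema, and the connection settings are placeholders the CronJob would inject as environment variables.

# Sketch of a retention-policy cleanup script a k8s CronJob could run.
# DB/table/column names below are assumptions; verify against your KFP install.
import os
from datetime import datetime, timedelta, timezone

import pymysql

RETENTION_DAYS = 30


def purge_old_runs() -> None:
    cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)
    conn = pymysql.connect(
        host=os.environ["MYSQL_HOST"],
        user=os.environ["MYSQL_USER"],
        password=os.environ["MYSQL_PASSWORD"],
        database="mlpipeline",  # hypothetical DB name
    )
    try:
        with conn.cursor() as cur:
            # "run_details" and "CreatedAtInSec" are assumed names; check the
            # schema of your KFP version before deleting anything.
            cur.execute(
                "DELETE FROM run_details WHERE CreatedAtInSec < %s",
                (int(cutoff.timestamp()),),
            )
        conn.commit()
    finally:
        conn.close()


if __name__ == "__main__":
    purge_old_runs()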
2 Jun 2022 |
| Balu joined the room. | 03:24:48 |
3 Jun 2022 |
| Balu changed their display name from _slack_kubeflow_U03HVCQDWVB to Balu. | 08:53:24 |
| Balu set a profile picture. | 08:53:26 |
Balu | Can someone help me with how to add a label to dsl.ResourceOp? The label value must be dynamic. | 08:53:27 |
NSLog0 | any experience with the GCP Transfer jobs service? I have a requirement to build a pipeline that works with Transfer jobs. The job will sync data from S3 to GCS daily, and once the sync is done the next pipeline should run:
1. sync data -> wait until the sync is done
2. train on the data after the sync is done
Can a pipeline work with Transfer jobs in a scenario like this? | 11:02:01 |
Brett Koonce | i have done that | 13:24:43 |
Brett Koonce | i believe you can have gcp emit an event after the transfer is done | 13:24:54 |
Brett Koonce | you could use that to trigger your job | 13:25:03 |
Brett Koonce | if you don't have a ton of data changes | 13:25:17 |
Brett Koonce | then you might do a simple cron | 13:25:25 |
Brett Koonce | eg sync at 12am | 13:25:30 |
Brett Koonce | run job at 12 pm | 13:25:34 |
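If the simple-cron route is enough, a rough sketch of that suggestion (Transfer job on its own 12am schedule, training pipeline scheduled as a KFP recurring run at 12pm) might look like the following, assuming the kfp 1.8 SDK client. The host URL, experiment name, and pipeline package path are placeholders, not values from this thread.

# Sketch: schedule the training pipeline as a recurring run at 12pm daily,
# leaving the 12am Transfer job time to finish. Host/experiment/package names
# are placeholders for illustration.
import kfp

client = kfp.Client(host="http://localhost:8080")  # placeholder KFP endpoint

experiment = client.create_experiment(name="daily-training")  # placeholder name

client.create_recurring_run(
    experiment_id=experiment.id,
    job_name="train-after-transfer",
    cron_expression="0 0 12 * * *",  # KFP cron includes a seconds field: 12:00 daily
    pipeline_package_path="train_pipeline.yaml",  # compiled pipeline, placeholder path
    max_concurrency=1,
)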
droctothorpe | Very cool. Any plans to open source that logic? If not, no worries. We can always reinvent the wheel, heh. That script sounds like a good candidate for an upstream contribution IMO. | 13:27:28 |
Joseph Olaide | Hi Nicholas Kosteski, I noticed it shows the output artifact as soon as the pipeline run ends; however, after some time, it disappears. I don't know why this is happening. I'm trying to figure out the reason | 20:35:11 |
James Harr | Hi, I'm writing components and pipelines using the Python v2 SDK (kfp==1.8.12) so I have several functions that are decorated with @component and then a @pipeline that assembles the component functions. I'm attempting to use set_env_variable to assign a runtime environment variable with a value obtained from a different environment variable that is available when the pipeline is compiled.
import os
import kfp
from kfp.v2 import dsl
from kfp.v2.dsl import component


@component
def do_a_thing(input: str):
    import os
    print(f"--->>> {input} {os.getenv('MY_ENV_VAR')}")


@dsl.pipeline(name="A name", description="A description")
def my_pipeline(start_with_this: str):
    the_thing = do_a_thing(start_with_this).set_env_variable(
        "MY_ENV_VAR", os.getenv("ENV_VAR_FROM_BUILD")
    )


def compile_pipeline():
    print(f"--->>> {os.getenv('ENV_VAR_FROM_BUILD')}")
    kfp.compiler.Compiler(mode=kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE).compile(
        pipeline_func=my_pipeline, package_path="my_pipeline.yaml"
    )


if __name__ == "__main__":
    compile_pipeline()
If I save this file as experiment.py and then compile this pipeline to YAML with the following command, ENV_VAR_FROM_BUILD=Hello python experiment.py, I get a YAML file that looks like a pipeline I can run. However, I don't see my compile-time environment variable MY_ENV_VAR or the value (Hello) in the YAML file (but I do see the print statement that shows the compile-time value of ENV_VAR_FROM_BUILD). Shouldn't it be there in the env: section of the do-a-thing component? Is there a better way to achieve what I'm trying to do? | 21:29:04 |
4 Jun 2022 |
| Andrej Albrecht changed their display name from _slack_kubeflow_U02QG484SPM to Andrej Albrecht. | 18:35:43 |
| Andrej Albrecht set a profile picture. | 18:35:48 |
5 Jun 2022 |
| wafaa abdelhafez joined the room. | 13:20:17 |
6 Jun 2022 |
| @clifontaive:matrix.org joined the room. | 06:00:13 |