!GVzqqhOOirSFrkrnzX:matrix.org

platform-aws

229 Members
3 Servers

5 May 2022
@_slack_kubeflow_U03E8N9U56E:matrix.org_slack_kubeflow_U03E8N9U56E joined the room.16:07:19
@_slack_kubeflow_U03DZDPD7DH:matrix.orgSam Gaunt I am trying to run through the setup process here and keep getting stuck waiting for CloudFormation stacks to finish. The script times out waiting for the IAM service account stack. I can see in the AWS Console that the stack finishes within ~30 seconds, but the script just keeps printing the same waiting message. 23:33:56
@_slack_kubeflow_U03DZDPD7DH:matrix.orgSam Gaunt
=================================================================
                      Cluster Secrets Setup                      
=================================================================
Creating secrets IAM service account...
2022-05-06 09:27:19 [ℹ]  eksctl version 0.95.0
2022-05-06 09:27:19 [ℹ]  using region ap-southeast-2
2022-05-06 09:27:21 [ℹ]  1 iamserviceaccount (kubeflow/kubeflow-secrets-manager-sa) was included (based on the include/exclude rules)
2022-05-06 09:27:21 [!]  metadata of serviceaccounts that exist in Kubernetes will be updated, as --override-existing-serviceaccounts was set
2022-05-06 09:27:21 [ℹ]  1 task: { 
    2 sequential sub-tasks: { 
        create IAM role for serviceaccount "kubeflow/kubeflow-secrets-manager-sa",
        create serviceaccount "kubeflow/kubeflow-secrets-manager-sa",
    } }
2022-05-06 09:27:21 [ℹ]  building iamserviceaccount stack "eksctl-kubeflow-spike-addon-iamserviceaccount-kubeflow-kubeflow-secrets-manager-sa"
2022-05-06 09:27:21 [ℹ]  deploying stack "eksctl-kubeflow-spike-addon-iamserviceaccount-kubeflow-kubeflow-secrets-manager-sa"
2022-05-06 09:27:21 [ℹ]  waiting for CloudFormation stack "eksctl-kubeflow-spike-addon-iamserviceaccount-kubeflow-kubeflow-secrets-manager-sa"
2022-05-06 09:27:51 [ℹ]  waiting for CloudFormation stack "eksctl-kubeflow-spike-addon-iamserviceaccount-kubeflow-kubeflow-secrets-manager-sa"
2022-05-06 09:28:35 [ℹ]  waiting for CloudFormation stack "eksctl-kubeflow-spike-addon-iamserviceaccount-kubeflow-kubeflow-secrets-manager-sa"
2022-05-06 09:30:12 [ℹ]  waiting for CloudFormation stack "eksctl-kubeflow-spike-addon-iamserviceaccount-kubeflow-kubeflow-secrets-manager-sa"
2022-05-06 09:31:47 [ℹ]  waiting for CloudFormation stack "eksctl-kubeflow-spike-addon-iamserviceaccount-kubeflow-kubeflow-secrets-manager-sa"
2022-05-06 09:33:36 [ℹ]  waiting for CloudFormation stack "eksctl-kubeflow-spike-addon-iamserviceaccount-kubeflow-kubeflow-secrets-manager-sa"
2022-05-06 09:34:20 [ℹ]  waiting for CloudFormation stack "eksctl-kubeflow-spike-addon-iamserviceaccount-kubeflow-kubeflow-secrets-manager-sa"
23:34:29
6 May 2022
@_slack_kubeflow_U03D067RTJN:matrix.orgRyan McCaffrey There's another permission you're missing. I forget which one, but I think you can find it by opening CloudFormation in the console and checking the failure log. I'm pretty sure I ran into the same issue. 00:01:26
@_slack_kubeflow_U03DZDPD7DH:matrix.orgSam Gaunt There is no failure in CloudFormation though. It shows that all events and resources completed with no errors. 00:02:52
@_slack_kubeflow_U03D067RTJN:matrix.orgRyan McCaffrey For the CloudFormation parts I had to add these permissions:
"cloudformation:ListStacks",
"cloudformation:CreateStack",
"iam:CreateRole",
"cloudformation:DescribeStacks",
"cloudformation:DescribeStackEvents"
Maybe check that you have those.
00:09:48
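The actions listed in the message above can be collected into a single IAM policy document. The following is a minimal sketch only: the helper name `build_policy` is hypothetical, and the blanket `"Resource": "*"` is an assumption made for illustration. Neither comes from the thread, and a real policy would likely scope resources more tightly and may need further actions.

```python
import json

# Actions quoted in the message above.
CLOUDFORMATION_SETUP_ACTIONS = [
    "cloudformation:ListStacks",
    "cloudformation:CreateStack",
    "iam:CreateRole",
    "cloudformation:DescribeStacks",
    "cloudformation:DescribeStackEvents",
]


def build_policy(actions, resource="*"):
    """Return an IAM policy document (as a dict) that allows the given actions.

    `resource="*"` is an illustrative assumption, not a recommendation.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # De-duplicate and sort so the output is stable.
                "Action": sorted(set(actions)),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print JSON that could be pasted into the IAM console policy editor.
    print(json.dumps(build_policy(CLOUDFORMATION_SETUP_ACTIONS), indent=2))
```

Comparing this output with the policy attached to the user running the script is one way to check that none of the five actions is missing.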
@_slack_kubeflow_U02U3PBAWMD:matrix.orgKartik Kalamadi Good point. The IAM user that you pass as a parameter to the automated script only requires read and write access to objects in an S3 bucket, but to run the scripts themselves you need a lot of different permissions. We tested all the automated scripts with admin credentials, so we never ran into any errors. We will fix the documentation. Thanks.
GitHub issues:
https://github.com/awslabs/kubeflow-manifests/issues/219
https://github.com/awslabs/kubeflow-manifests/issues/215
03:18:15
@_slack_kubeflow_UK2BNJCJW:matrix.orgGautam Kumar Looking at this: https://github.com/awslabs/kubeflow-manifests/blob/main/docs/deployment/cognito/README-automated.md It seems it's not possible without a custom domain. 03:26:40
@_slack_kubeflow_U03DZDPD7DH:matrix.orgSam Gaunt Thanks, thought so. Just going without Cognito for now then. 03:27:10
@_slack_kubeflow_U03DZDPD7DH:matrix.orgSam Gaunt OK, I have finally got it working. I think there was a conflict between the IAM account created in the automated setup option and my own user account. I am using saml2aws to authenticate, and I think the export of the access key and secret access key in step 4 overrode the saml2aws auth and caused the issues. I got it to work by passing the access key and secret access key directly to the script rather than through environment variables, like so:
PYTHONPATH=.. python utils/rds-s3/auto-rds-s3-setup.py --region $CLUSTER_REGION --cluster $CLUSTER_NAME --bucket $S3_BUCKET --s3_aws_access_key_id AWS_ACCESS_KEY_ID_HERE --s3_aws_secret_access_key AWS_SECRET_ACCESS_KEY_HERE
04:32:47
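The workaround above amounts to handing the S3 user's keys to the script as arguments while keeping them out of the environment that the AWS tooling reads. Below is a minimal sketch under stated assumptions: the helper name `build_setup_command` is hypothetical, the flag names are taken from the command above, and dropping the `AWS_*` variables from the child environment reflects one reading of the conflict (AWS SDKs and the CLI check these environment variables before the shared credentials file). It assumes the saml2aws session lives in a profile in the shared credentials file and is not itself exported as environment variables.

```python
import os

# Variables that AWS SDKs and the CLI consult before the shared credentials file.
OVERRIDING_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def build_setup_command(region, cluster, bucket, s3_key_id, s3_secret, base_env=None):
    """Return (argv, env) for running the RDS/S3 setup script.

    The S3 user's keys are passed as arguments; any exported AWS key
    variables are dropped from the child environment (an assumption
    about the cause of the conflict, not a confirmed diagnosis).
    """
    env = dict(os.environ if base_env is None else base_env)
    for name in OVERRIDING_VARS:
        env.pop(name, None)
    env["PYTHONPATH"] = ".."  # same as the PYTHONPATH=.. prefix in the command above

    argv = [
        "python", "utils/rds-s3/auto-rds-s3-setup.py",
        "--region", region,
        "--cluster", cluster,
        "--bucket", bucket,
        "--s3_aws_access_key_id", s3_key_id,
        "--s3_aws_secret_access_key", s3_secret,
    ]
    return argv, env
```

The returned pair can be passed to `subprocess.run(argv, env=env, check=True)` from the directory in which the original command is run.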
@_slack_kubeflow_UK2BNJCJW:matrix.orgGautam Kumar That comment is coming from the eksctl command. 05:09:10
@_slack_kubeflow_U03E5S89214:matrix.org_slack_kubeflow_U03E5S89214 joined the room.09:41:05
@_slack_kubeflow_U03EETBT66P:matrix.org_slack_kubeflow_U03EETBT66P joined the room.18:05:07
7 May 2022
@_slack_kubeflow_U03EDC2FA3X:matrix.org_slack_kubeflow_U03EDC2FA3X joined the room.03:40:42
10 May 2022
@_slack_kubeflow_U9D64K2P9:matrix.orgYihong Wang joined the room.15:28:36
12 May 2022
@_slack_kubeflow_U03F94272TU:matrix.orgJim Nolan joined the room.21:01:50
@_slack_kubeflow_U03F94272TU:matrix.orgJim Nolan changed their display name from _slack_kubeflow_U03F94272TU to Jim Nolan.21:01:54
@_slack_kubeflow_U03F94272TU:matrix.orgJim Nolan set a profile picture.21:01:55
@_slack_kubeflow_U03EUHGBS07:matrix.org_slack_kubeflow_U03EUHGBS07 joined the room.21:09:34
16 May 2022
@_slack_kubeflow_U03FYE5JUKT:matrix.org_slack_kubeflow_U03FYE5JUKT joined the room.10:58:23
@idahotokens:matrix.orgidahotokens joined the room.11:22:51
@idahotokens:matrix.orgidahotokens left the room.16:41:03
@_slack_kubeflow_U033RA6APQQ:matrix.orgHaris Farooqui I am seeing an issue after upgrading to EKS 1.22 (Terraform deployment):
kubectl describe pod my_pipeline -n kubeflow
-----
...
  Warning  FailedMount       9m46s (x3 over 9m48s)  kubelet             MountVolume.SetUp failed for volume "kube-api-access-kz47c" : object "kubeflow"/"kube-root-ca.crt" not registered
  Warning  FailedMount       9m46s (x3 over 9m48s)  kubelet             MountVolume.SetUp failed for volume "mlpipeline-minio-artifact" : object "kubeflow"/"mlpipeline-minio-artifact" not registered
17:36:02
@_slack_kubeflow_U033RA6APQQ:matrix.orgHaris Farooqui I found a post suggesting explicitly setting automountServiceAccountToken: false to avoid rootCAConfigMap publishing kube-root-ca.crt in the kubeflow namespace. While this takes care of the warning MountVolume.SetUp failed for volume "kube-api-access-kz47c" : object "kubeflow"/"kube-root-ca.crt" not registered, it creates other issues: pipeline jobs start failing with the following error:
This step is in Error state with this message: Error (exit code 2): invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable
17:36:20
@_slack_kubeflow_U033RA6APQQ:matrix.orgHaris Farooqui https://stackoverflow.com/questions/69038012/mountvolume-setup-failed-for-volume-kube-api-access-fcz9j-object-default 17:36:28
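The second error is consistent with how in-cluster Kubernetes clients locate their credentials: they read the service account token and CA bundle from the directory mounted at /var/run/secrets/kubernetes.io/serviceaccount, and that directory is not mounted when automountServiceAccountToken: false is set. The sketch below, meant to be run inside an affected container, only checks whether those files are present. The function name `missing_in_cluster_files` is hypothetical, and the check is an illustration rather than a diagnosis of this particular cluster.

```python
from pathlib import Path

# Default mount point of the projected service account volume.
SERVICE_ACCOUNT_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")

# Files an in-cluster client needs to authenticate to the API server.
REQUIRED_FILES = ("token", "ca.crt")


def missing_in_cluster_files(base=SERVICE_ACCOUNT_DIR):
    """Return the names of required credential files absent under `base`."""
    base = Path(base)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_in_cluster_files()
    if missing:
        print("Missing in-cluster credential files:", ", ".join(missing))
    else:
        print("Service account token and CA bundle are mounted.")
```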
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown Note that Kubeflow doesn't support k8s 1.22 yet 17:48:37
