!LuUSGaeArTeoOgUpwk:matrix.org

kubeflow-kfserving

433 Members
2 Servers

Sender Message Time
27 May 2022
@_slack_kubeflow_U019TS29HLN:matrix.orgNithin R Hi, I have Grafana and Prometheus running in a different cluster and would like to integrate them with Kubeflow model metrics. Can anyone please suggest an approach? 10:08:00
@_slack_kubeflow_U03CN7QAHN3:matrix.orgzorba(손주형) I'm using Thanos 12:27:01
@_slack_kubeflow_U03CN7QAHN3:matrix.orgzorba(손주형) it's good for managing multi-cluster Prometheus 12:27:21
@_slack_kubeflow_U019TS29HLN:matrix.orgNithin R Ok, how did you integrate it with kserve metrics? 12:35:48
@_slack_kubeflow_U01B8DPEY01:matrix.orgJohn Paulett Thanks Mark Winter for the suggestion. I'm also looking at submitting a PR to lightgbm to re-use the training's categories for the prediction 14:47:04
@_slack_kubeflow_U03CN7QAHN3:matrix.orgzorba(손주형) add the Prometheus Operator annotation on the isvc (InferenceService) 15:01:41
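A rough sketch of what annotating an isvc could look like. The chat does not name the annotation, so the `prometheus.io/*` keys below are an assumption based on the common scrape-annotation convention; they only take effect if the Prometheus scrape configuration is set up to honor them. The model name, storage URI, port, and the helper `with_scrape_annotations` are hypothetical.

```python
import json


def with_scrape_annotations(isvc: dict, port: int, path: str = "/metrics") -> dict:
    """Return a copy of an InferenceService manifest with scrape annotations added."""
    # Deep-copy via JSON round trip so the caller's manifest is left untouched.
    annotated = json.loads(json.dumps(isvc))
    annotations = annotated.setdefault("metadata", {}).setdefault("annotations", {})
    # Assumed annotation keys (conventional, not confirmed by the chat).
    annotations.update({
        "prometheus.io/scrape": "true",
        "prometheus.io/port": str(port),
        "prometheus.io/path": path,
    })
    return annotated


isvc = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "example-model"},  # hypothetical name
    "spec": {
        "predictor": {
            "sklearn": {"storageUri": "gs://example-bucket/model"}  # hypothetical URI
        }
    },
}

annotated = with_scrape_annotations(isvc, port=8080)  # port is an assumption
print(json.dumps(annotated["metadata"], indent=2))
```

The printed metadata can be merged into the manifest that is applied to the cluster; which port and path actually expose metrics depends on the model server in use.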
@_slack_kubeflow_U034329LRB2:matrix.org_slack_kubeflow_U034329LRB2 joined the room.15:41:34
@_slack_kubeflow_U022U7KG24W:matrix.orgRachit Chauhan Dan Sun: what about kserve itself? Does it need all of its control plane resources to be in the kserve namespace, or can we change that too? Or any pointers I should be aware of? 22:37:21
28 May 2022
@_slack_kubeflow_U038QCM9C4D:matrix.orgรัชพล เเขมภูเขียว
I have a problem with a very long Terminating state; is there any solution?
08:48:37
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun you can tune the timeout to control the grace period for termination 14:22:50
30 May 2022
@_slack_kubeflow_U03EE7VFCDN:matrix.org레몬버터구이 changed their display name from _slack_kubeflow_U03EE7VFCDN to 레몬버터구이.02:19:23
@_slack_kubeflow_U03EE7VFCDN:matrix.org레몬버터구이 set a profile picture.02:19:26
@_slack_kubeflow_U03ADPJCZBJ:matrix.org_slack_kubeflow_U03ADPJCZBJ joined the room.07:36:48
@_slack_kubeflow_U03HSETDZPA:matrix.orgKuba Dawczynski Dan Sun thx, it helped a lot, it's working 🙂 09:38:24
@californiatl:matrix.org@californiatl:matrix.org joined the room.15:45:21
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun I think there is an issue for supporting this out of the box 16:07:41
31 May 2022
@_slack_kubeflow_U03CN7QAHN3:matrix.orgzorba(손주형)image.png
05:12:36
@_slack_kubeflow_U03CN7QAHN3:matrix.orgzorba(손주형) kserve user survey is not available 05:12:36
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun yes the survey is closed, just made a PR to update that https://github.com/kserve/website/pull/141/files 06:14:50
@_slack_kubeflow_U022U7KG24W:matrix.orgRachit Chauhan I missed the last bi-weekly meeting, but I do see a KServe 0.9 release tracking item in the notes. Can someone summarize what was discussed around this? Or do we have an issue to track its timeline? 17:27:08
@_slack_kubeflow_U03H9RTMF3R:matrix.orgThomas Watkin joined the room.19:11:04
@_slack_kubeflow_U03H9RTMF3R:matrix.orgThomas Watkin changed their display name from _slack_kubeflow_U03H9RTMF3R to Thomas Watkin.19:16:44
@_slack_kubeflow_U03H9RTMF3R:matrix.orgThomas Watkin set a profile picture.19:16:45
@_slack_kubeflow_U01JLCXHSJY:matrix.orgJuergen Stary joined the room.19:24:14
@_slack_kubeflow_U01JLCXHSJY:matrix.orgJuergen Stary changed their display name from _slack_kubeflow_U01JLCXHSJY to Juergen Stary.19:25:37
@_slack_kubeflow_U01JLCXHSJY:matrix.orgJuergen Stary set a profile picture.19:25:38
@_slack_kubeflow_U01JLCXHSJY:matrix.orgJuergen Stary Hi there, I wanted to join your community meeting from time to time, but it seems there is no fixed schedule, or at least no day given in the Group Sync log? Cheers 19:25:38
@_slack_kubeflow_U022U7KG24W:matrix.orgRachit Chauhan Hi all, I am installing knative-serving in a namespace other than knative-serving and have enabled istio-injection (for that NS too). All other control plane components seem to be working fine except the activator. Seeing this in the activator’s logs:
{"severity":"WARNING","timestamp":"2022-05-31T08:49:23.436527148Z","logger":"activator","caller":"handler/healthz_handler.go:36","message":"Healthcheck failed: received SIGTERM from kubelet","commit":"6ec4509","knative.dev/controller":"activator","knative.dev/pod":"activator-848f9bfddf-txhbz"}
{"severity":"ERROR","timestamp":"2022-05-31T08:49:25.693721434Z","logger":"activator","caller":"websocket/connection.go:142","message":"Websocket connection could not be established","commit":"6ec4509","knative.dev/controller":"activator","knative.dev/pod":"activator-848f9bfddf-txhbz","error":"websocket: bad handshake","request":"HTTP/1.1 503 Service Unavailable\r\nConnection: close\r\nContent-Length: 195\r\nContent-Type: text/plain\r\nDate: Tue, 31 May 2022 08:49:25 GMT\r\nServer: envoy\r\n\r\nupstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED","stacktrace":"knative.dev/pkg/websocket.NewDurableConnection.func1\n\tknative.dev/pkg@v0.0.0-20220412134708-e325df66cb51/websocket/connection.go:142\nknative.dev/pkg/websocket.(*ManagedConnection).connect.func1\n\tknative.dev/pkg@v0.0.0-20220412134708-e325df66cb51/websocket/connection.go:226\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\tk8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:220\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\tk8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:233\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection\n\tk8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:226\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\tk8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:421\nknative.dev/pkg/websocket.(*ManagedConnection).connect\n\tknative.dev/pkg@v0.0.0-20220412134708-e325df66cb51/websocket/connection.go:223\nknative.dev/pkg/websocket.NewDurableConnection.func2\n\tknative.dev/pkg@v0.0.0-20220412134708-e325df66cb51/websocket/connection.go:163"}
{"severity":"ERROR","timestamp":"2022-05-31T08:49:26.692820496Z","logger":"activator","caller":"websocket/connection.go:192","message":"Failed to send ping message to  ws://autoscaler.data-mlplatform-knativeserving3-usw2-dev.svc.cluster.local:8080 ","commit":"6ec4509","knative.dev/controller":"activator","knative.dev/pod":"activator-848f9bfddf-txhbz","error":"connection has not yet been established","stacktrace":"knative.dev/pkg/websocket.NewDurableConnection.func3\n\tknative.dev/pkg@v0.0.0-20220412134708-e325df66cb51/websocket/connection.go:192"}
And this in istio-proxy (envoy) logs:
{ "time":"2022-05-31T20:02:02.586Z", "hostname":"activator-848f9bfddf-srghq", "txId":"e0bc1498-1bee-4c95-82cf-5a08526d92ab", "sourceIP":"-", "xFor":"-", "originatingIp":"-", "upstream_host":"10.199.173.222:8080", "user-agent":"Go-http-client/1.1", "downstreamRemoteAddress":"10.199.163.212:58164", "req":"/", "method":"GET", "protocol":"HTTP/1.1", "xHost":"autoscaler.data-mlplatform-knativeserving3-usw2-dev.svc.cluster.local:8080", "status":"503", "response_flags":"UF,URX", "msg":"upstream_reset_before_response_started{connection_failure,TLS_error:_268435581:SSL_routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED}", "authority":"autoscaler.data-mlplatform-knativeserving3-usw2-dev.svc.cluster.local:8080", "reqSize":"0", "respSize":"195", "upstreamTime":"-", "requestDuration":"-", "responseDuration":"-", "envoyTime":"-", "txTime":"11", "routeName":"default", "upstreamCluster":"outbound,8080,,autoscaler.data-mlplatform-knativeserving3-usw2-dev.svc.cluster.local", "upstreamTransportFailureReason":"TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED", "downstreamLocalSAN":"-", "downstreamPeerSAN":"-", "downstreamLocalSubject":"-", "downstreampeerSubject":"-", "event":"source", "destinationAsset":"autoscaler.data-mlplatform-knativeserving3-usw2-dev.svc.cluster.local:8080", "sourceAsset":"Intuit.data.mlplatform.mlpinfrastructure", "app":"knative-serving" }
20:02:44
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun I guess mTLS is turned on? 23:48:57
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun seems like the activator is not able to connect to the autoscaler 23:49:11
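One way to check that reading against the istio-proxy output is to pull out the few fields that matter. The sketch below assumes the access log is JSON with the same field names as the line pasted above (`status`, `response_flags`, `authority`, `upstreamTransportFailureReason`); that format looks deployment-specific, so other setups will likely differ. The helper name `summarize_envoy_failure` and the shortened sample host are hypothetical.

```python
import json


def summarize_envoy_failure(line: str) -> dict:
    """Extract the fields that indicate an upstream TLS failure from one JSON log line."""
    record = json.loads(line)
    # response_flags is a comma-separated string such as "UF,URX"; "-" means none.
    flags = [f for f in record.get("response_flags", "").split(",") if f and f != "-"]
    reason = record.get("upstreamTransportFailureReason", "")
    return {
        "status": record.get("status"),
        "upstream": record.get("authority"),
        "flags": flags,
        "tls_verify_failed": "CERTIFICATE_VERIFY_FAILED" in reason,
    }


# Trimmed-down sample with a hypothetical host; field names follow the pasted log.
sample = json.dumps({
    "status": "503",
    "response_flags": "UF,URX",
    "authority": "autoscaler.example-namespace.svc.cluster.local:8080",
    "upstreamTransportFailureReason": (
        "TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED"
    ),
})

summary = summarize_envoy_failure(sample)
print(summary)
```

In Envoy's access log flags, `UF` marks an upstream connection failure and `URX` an exhausted retry limit, so a 503 with both flags plus a certificate verification error is consistent with the activator's sidecar failing the TLS handshake to the autoscaler.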
