!LuUSGaeArTeoOgUpwk:matrix.org

kubeflow-kfserving

Sender  Message  Time
27 Apr 2022
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): yes, it is open 02:16:29
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): this is the Prometheus data source installed by the Helm chart 02:16:53
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): and it collects the InferenceService pod’s metrics but not the Triton metrics 02:17:12
@_slack_kubeflow_UFVUV2UFP:matrix.org Dan Sun: these are CPU metrics 02:17:52
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): yep, there’s no Triton metric 02:18:17
@_slack_kubeflow_UFVUV2UFP:matrix.org Dan Sun: what’s your query? 02:18:33
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): nv_inference_compute_infer_duration_us 02:18:46
@_slack_kubeflow_UFVUV2UFP:matrix.org Dan Sun: can we see that in Prometheus? 02:19:07
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): and when I query it using the Prometheus data source from the Istio addons, the Triton metric is shown 02:19:18
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): Prometheus from Helm: no; Prometheus from the Istio addon: yes 02:19:31
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): image.png 02:20:39
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): image.png 02:20:40
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): left is Prometheus from Helm, right is Prometheus from the addon 02:20:59
@_slack_kubeflow_UFVUV2UFP:matrix.org Dan Sun: what annotations did you add on the isvc? 02:21:32
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): you mean the InferenceService? 02:21:54
@_slack_kubeflow_UFVUV2UFP:matrix.org Dan Sun: yes 02:22:00
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형):
Annotations: map[string]string{
   "prometheus.io/scrape": "true",
   "prometheus.io/port":   "8002",
},
02:22:05
@_slack_kubeflow_UFVUV2UFP:matrix.org Dan Sun: you might need to compare the two installations; it is probably a setup issue 02:23:10
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): okay, I will try. 02:23:36
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): Is it possible to make Triton automatically pull the latest model? I have a model in S3 at s3://~~/torchscript/cifar/1 and created an isvc from it. Then I added model version 2 at s3://~~/torchscript/cifar/2, but the Triton server in the isvc does not pull the latest model. 08:15:17
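The layout in the question uses numeric subdirectories (`cifar/1`, `cifar/2`) as model versions. As a rough illustration only, the sketch below shows how a watcher script could work out the newest version from a list of object keys. It does not make Triton or KServe reload anything; the function name, the key list, and the file names are hypothetical and not taken from the thread.

```python
from typing import Iterable, Optional


def latest_version(keys: Iterable[str], model_prefix: str) -> Optional[int]:
    """Return the highest numeric version directory found under model_prefix."""
    prefix = model_prefix.rstrip("/") + "/"
    versions = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        # First path component after the model prefix, e.g. "2" in "cifar/2/model.pt".
        first = key[len(prefix):].split("/", 1)[0]
        if first.isascii() and first.isdigit():
            versions.add(int(first))
    return max(versions) if versions else None


# Hypothetical object keys, as a bucket listing might return them.
keys = [
    "torchscript/cifar/1/model.pt",
    "torchscript/cifar/2/model.pt",
    "torchscript/cifar/config.pbtxt",
]
print(latest_version(keys, "torchscript/cifar"))  # 2
```

Comparing this value against the version currently being served would tell such a script when a redeploy or reload is due.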
@_slack_kubeflow_U023ZTGHZ41:matrix.org Julius von Kohout: These are the default configmaps from the official kserve_kubeflow.yaml at https://github.com/kserve/kserve/blob/release-0.7/install/v0.7.0/kserve_kubeflow.yaml There is a misconfiguration somewhere in this official release. I want to find it and fix it upstream. Here are the routes. I do not see any Seldon entries, yet Seldon is working while KServe is not.
[~/deployment-master]$ ISTIO_TAG=1.11.7
[~/deployment-master]$ curl -L https://github.com/istio/istio/releases/download/${ISTIO_TAG}/istio-${ISTIO_TAG}-linux-amd64.tar.gz -o istio.tar.gz
[~/deployment-master]$ tar xzf istio.tar.gz
[~/deployment-master]$ istio-${ISTIO_TAG}/bin/istioctl proxy-config routes  cluster-local-gateway-696db88886-8g54r -n istio-system
NAME          DOMAINS     MATCH                  VIRTUAL SERVICE
http.8080                /                     404
http.8081                /                     404
                         /stats/prometheus     
                         /healthz/ready        
[~/deployment-master]$ istio-${ISTIO_TAG}/bin/istioctl proxy-config routes  istio-ingressgateway-7669f996d7-mk4r5 -n istio-system
NAME          DOMAINS     MATCH                   VIRTUAL SERVICE
http.8080                /dex/                  dex.auth
http.8080                /jupyter/              jupyter-web-app-jupyter-web-app.kubeflow
http.8080                /katib/                katib-ui.kubeflow
http.8080                /ml_metadata           metadata-grpc.kubeflow
http.8080                /pipeline              ml-pipeline-ui.kubeflow
http.8080                /kfam/                 profiles-kfam.kubeflow
http.8080                /volumes/              volumes-web-app-volumes-web-app.kubeflow
http.8080                /kserve-endpoints/     kserve-models-web-app.kubeflow
http.8080                /                      centraldashboard.kubeflow
                         /stats/prometheus      
                         /healthz/ready   
I will read up on https://knative.dev/docs/serving/setting-up-custom-ingress-gateway/
08:48:24
@_slack_kubeflow_U023ZTGHZ41:matrix.org Julius von Kohout: 1. seldon_cluster-local-gateway 2. seldon_istio-ingressgateway 3. kserve_knative-local-gateway 4. kserve_cluster-local-gateway(error) 5. kserve_istio-ingressgateway(error) 6. kserve_error HTML 09:34:51
@_slack_kubeflow_U023ZTGHZ41:matrix.org Julius von Kohout: Dan Sun, here is the error 09:35:37
@_slack_kubeflow_U023ZTGHZ41:matrix.org Julius von Kohout: I have also created an issue with all the relevant information: https://github.com/kserve/kserve/issues/2157 10:18:03
@_slack_kubeflow_U0113251MGS:matrix.org Peilun Li: Hey community! A follow-up question from today's community meeting regarding Knative vs. raw deployment mode: in which cases would you generally recommend raw deployment mode over Knative Serving? We are using Knative Serving right now, and in our benchmarking it gives lower long-tail latency than a vanilla K8s and Istio based service, which is in line with the findings in the KServe benchmark. Also, considering that KPA reacts faster than HPA under traffic spikes and that it supports GPU autoscaling, we don't yet see a reason to favor raw deployment mode, other than reducing the operational workload of maintaining the Knative component. Are we missing any vital areas where raw deployment mode could outperform Knative? Thanks! 17:58:40
@_slack_kubeflow_UFVUV2UFP:matrix.org Dan Sun: Agree with you. Other than reducing the operational workload of maintaining Knative, I do not see other advantages of raw deployment. I think raw deployment is more suitable for a small team that only needs to deploy a handful of models. Knative revision management also addresses the limitations of Kubernetes Deployment rolling upgrades, namely the inability to stage traffic and do deeper rollout validations (I am going to talk about this at KnativeCon). 21:59:45
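For readers comparing the two modes, here is a minimal sketch of how the choice is usually expressed per InferenceService. It assumes the `serving.kserve.io/deploymentMode: RawDeployment` annotation described in the KServe documentation; check it against the installed KServe version. The function name, service name, and storage URI are hypothetical.

```python
from typing import Any, Dict


def inference_service(name: str, storage_uri: str, raw: bool = False) -> Dict[str, Any]:
    """Build an InferenceService manifest as a plain dict."""
    annotations: Dict[str, str] = {}
    if raw:
        # Assumption: this annotation opts the service out of Knative Serving.
        annotations["serving.kserve.io/deploymentMode"] = "RawDeployment"
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "annotations": annotations},
        "spec": {"predictor": {"triton": {"storageUri": storage_uri}}},
    }


# Hypothetical name and bucket, for illustration only.
manifest = inference_service("cifar", "s3://example-bucket/torchscript", raw=True)
print(manifest["metadata"]["annotations"])
```

Leaving `raw` at its default produces a manifest with no deployment-mode annotation, so the cluster-wide default (Knative Serving in the setup discussed here) applies.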
@_slack_kubeflow_U0113251MGS:matrix.org Peilun Li: Sounds great, thanks! Looking forward to the talk (if there's a recording available) 23:38:05
28 Apr 2022
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): I had to add an additional scrape config when installing Prometheus with Helm: https://raw.githubusercontent.com/istio/istio/release-1.13/samples/addons/prometheus.yaml 02:14:23
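The `prometheus.io/*` annotations posted earlier are only a convention: Prometheus acts on them only when its scrape config contains a pod-discovery job with relabel rules that read them, which is presumably what the added config supplies. The sketch below mimics that decision in plain Python, assuming the common annotation-based pattern; it is not Prometheus code, and the function name and pod IP are hypothetical.

```python
from typing import Dict, Optional


def scrape_url(pod_ip: str, annotations: Dict[str, str]) -> Optional[str]:
    """Return the URL an annotation-aware scrape job would hit, or None."""
    # Only pods that opt in explicitly are kept.
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    port = annotations.get("prometheus.io/port")
    if port is None:
        # Simplification: this sketch requires an explicit port annotation.
        return None
    # Without a path annotation, the conventional default is /metrics.
    path = annotations.get("prometheus.io/path", "/metrics")
    return f"http://{pod_ip}:{port}{path}"


# The annotations from the thread; the pod IP is made up.
triton = {"prometheus.io/scrape": "true", "prometheus.io/port": "8002"}
print(scrape_url("10.0.0.12", triton))  # http://10.0.0.12:8002/metrics
print(scrape_url("10.0.0.12", {}))      # None
```

A Prometheus install without such a job never evaluates the annotations at all, which would match the symptom of pod-level CPU metrics showing up while the Triton metrics on port 8002 do not.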
@_slack_kubeflow_U03D3KEEYSJ:matrix.org _slack_kubeflow_U03D3KEEYSJ joined the room. 08:06:24
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형): How do you use Logger? I’m thinking of using Loki with Grafana or the ELK stack to save `InferenceService` logs, but I’m not sure whether this is best practice. 08:37:00
