Sender | Message | Time |
---|---|---|
27 Apr 2022 | ||
zorba(손주형) | yes it is opened | 02:16:29 |
zorba(손주형) | this is prometheus data source by helm chart | 02:16:53 |
zorba(손주형) | and it collects the InferenceService pod's metrics but not Triton metrics | 02:17:12 |
Dan Sun | these are cpu metrics | 02:17:52 |
zorba(손주형) | yap there’s no triton metric | 02:18:17 |
Dan Sun | what’s your query | 02:18:33 |
zorba(손주형) | nv_inference_compute_infer_duration_us | 02:18:46 |
Dan Sun | can we see that in the prometheus ? | 02:19:07 |
zorba(손주형) | and when I query it with the Prometheus data source from the Istio addons, the Triton metrics are shown | 02:19:18 |
zorba(손주형) | prom with helm: no; prom with istio addon: yes | 02:19:31 |
zorba(손주형) | (image: image.png) | 02:20:39 |
zorba(손주형) | (image: image.png) | 02:20:40 |
zorba(손주형) | left is prom with helm, right is prom with addon | 02:20:59 |
Dan Sun | what’s your annotation added on isvc? | 02:21:32 |
zorba(손주형) | you mean inferenceService? | 02:21:54 |
Dan Sun | yes | 02:22:00 |
zorba(손주형) | `Annotations: map[string]string{ "prometheus.io/scrape": "true", "prometheus.io/port": "8002", }` | 02:22:05 |
Dan Sun | you might need to compare the differences between the two installations... it is probably a setup issue | 02:23:10 |
zorba(손주형) | okay I will try. | 02:23:36 |
zorba(손주형) | Is it possible to make Triton automatically pull the latest model?
I have a model in S3 at s3://~~/torchscript/cifar/1 and create an isvc, and then I add model version 2 at s3://~~/torchscript/cifar/2, but the Triton server in the isvc does not pull the latest model. | 08:15:17 |
Julius von Kohout | These are the default configmaps from the official kserve_kubeflow.yaml at https://github.com/kserve/kserve/blob/release-0.7/install/v0.7.0/kserve_kubeflow.yaml There is a misconfiguration somewhere in this official release. I want to find it and fix it upstream.
Here are the routes. I do not see Seldon stuff, but Seldon is working, whereas KServe is not.
[~/deployment-master]$ ISTIO_TAG=1.11.7
[~/deployment-master]$ curl -L https://github.com/istio/istio/releases/download/${ISTIO_TAG}/istio-${ISTIO_TAG}-linux-amd64.tar.gz -o istio.tar.gz
[~/deployment-master]$ tar xzf istio.tar.gz
[~/deployment-master]$ istio-${ISTIO_TAG}/bin/istioctl proxy-config routes cluster-local-gateway-696db88886-8g54r -n istio-system
NAME DOMAINS MATCH VIRTUAL SERVICE
http.8080 / 404
http.8081 / 404
/stats/prometheus
/healthz/ready
[~/deployment-master]$ istio-${ISTIO_TAG}/bin/istioctl proxy-config routes istio-ingressgateway-7669f996d7-mk4r5 -n istio-system
NAME DOMAINS MATCH VIRTUAL SERVICE
http.8080 /dex/ dex.auth
http.8080 /jupyter/ jupyter-web-app-jupyter-web-app.kubeflow
http.8080 /katib/ katib-ui.kubeflow
http.8080 /ml_metadata metadata-grpc.kubeflow
http.8080 /pipeline ml-pipeline-ui.kubeflow
http.8080 /kfam/ profiles-kfam.kubeflow
http.8080 /volumes/ volumes-web-app-volumes-web-app.kubeflow
http.8080 /kserve-endpoints/ kserve-models-web-app.kubeflow
http.8080 / centraldashboard.kubeflow
/stats/prometheus
/healthz/ready
I will read into https://knative.dev/docs/serving/setting-up-custom-ingress-gateway/ | 08:48:24 |
Julius von Kohout | 1. seldon_cluster-local-gateway
2. seldon_istio-ingressgateway
3. kserve_knative-local-gateway
4. kserve_cluster-local-gateway (error)
5. kserve_istio-ingressgateway (error)
6. kserve_error HTML | 09:34:51 |
Julius von Kohout | Dan Sun here is the error | 09:35:37 |
Julius von Kohout | I have also created an issue with all relevant information https://github.com/kserve/kserve/issues/2157 | 10:18:03 |
Peilun Li | Hey community! Maybe a follow-up question on today's community meeting regarding Knative vs. raw deployment mode: in which cases would you generally recommend raw deployment mode rather than Knative serving? We are using Knative serving right now, and our benchmarking shows that, compared to a vanilla K8s & Istio based service, Knative can help with lower long-tail latency, which is in line with the findings in the KServe benchmark. Also, considering that KPA reacts faster than HPA under traffic spikes and supports GPU autoscaling, we don't yet see a reason to favor raw deployment mode, other than reducing the operational workload of maintaining the Knative components. Are we missing any vital areas where raw deployment mode could potentially outperform Knative? Thanks! | 17:58:40 |
Dan Sun | Agree with you, other than reducing the operational workload of maintaining Knative I do not see other advantages to raw deployment. I think raw deployment is more suitable for a small team that only needs to deploy a handful of models. Knative revision management also addresses the Kubernetes Deployment rolling-upgrade limitation of being unable to stage traffic and do deeper rollout validations (I am going to talk about this at KnativeCon). | 21:59:45 |
Peilun Li | Sounds great, thanks! Looking forward to the talk (if there's recording available) | 23:38:05 |
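[Editor's note] The annotations zorba pasted at 02:22 (as a Go map literal) correspond to metadata on the `InferenceService` manifest. A sketch, assuming the KServe v1beta1 API; the resource name is hypothetical and the S3 path is kept elided as in the chat. Port 8002 is Triton's default metrics port:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: cifar-triton                 # hypothetical name
  annotations:
    prometheus.io/scrape: "true"     # opt this pod into annotation-based scraping
    prometheus.io/port: "8002"       # Triton's default metrics port
spec:
  predictor:
    triton:
      storageUri: s3://~~/torchscript/cifar   # path elided as in the chat
```

These annotations only have an effect if the Prometheus doing the scraping runs a discovery job that honors them, which is the difference between the two installations that Dan points to.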
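[Editor's note] On the 08:15 question about Triton pulling the latest model version: a hedged sketch, not verified on this cluster. KServe's storage initializer copies the model out of S3 once at pod startup, so a Triton started from `storageUri` never sees versions uploaded later. One approach is to point Triton directly at the S3 repository and enable its poll mode (`--model-control-mode=poll`), so it rescans the bucket and loads new versions (e.g. `.../cifar/2`) on its own:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: cifar-triton            # hypothetical name
spec:
  predictor:
    triton:
      args:
        - --model-repository=s3://~~/torchscript   # bucket path elided as in the chat
        - --model-control-mode=poll                # rescan the repository periodically
        - --repository-poll-secs=60
      # S3 credentials (e.g. AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY) still need
      # to be injected, e.g. via a service account or secret; omitted here.
```

Whether the triton predictor accepts a bare `args` override without `storageUri` may depend on the KServe version, so treat this as a starting point rather than a known-good manifest.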
28 Apr 2022 | ||
zorba(손주형) | I had to add an additional scrape config when installing Prometheus via Helm. https://raw.githubusercontent.com/istio/istio/release-1.13/samples/addons/prometheus.yaml | 02:14:23 |
zorba(손주형) | How do you use Logger?
I'm thinking of using loki-grafana or an ELK stack to save `InferenceService` logs, but I'm not sure whether this is best practice. | 08:37:00 |
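[Editor's note] zorba's 02:14 fix works because the Istio addon's prometheus.yaml ships a pod-discovery scrape job that honors the `prometheus.io/*` annotations, while a plain Helm-installed Prometheus does not include one by default. A sketch of that job as an extra scrape config (the exact values key, e.g. `extraScrapeConfigs` or `additionalScrapeConfigs`, depends on which chart is used):

```yaml
- job_name: kubernetes-pods
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    # Only scrape pods annotated prometheus.io/scrape: "true"
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: "true"
    # Honor a custom metrics path if prometheus.io/path is set
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    # Rewrite the target address to the port from prometheus.io/port (e.g. 8002)
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      target_label: __address__
```

With this job in place, the `prometheus.io/scrape`/`prometheus.io/port` annotations on the InferenceService pods are enough for the Helm-installed Prometheus to pick up the Triton metrics endpoint.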
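[Editor's note] On the Logger question at 08:37: KServe has a built-in payload logger that can be enabled per component on the `InferenceService`. It emits each request and/or response as a CloudEvent to a sink URL; forwarding from that sink into Loki or an ELK stack is a separate backend choice. A sketch; the resource name and sink service are hypothetical:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: cifar-triton                  # hypothetical name
spec:
  predictor:
    logger:
      mode: all                       # all | request | response
      url: http://message-dumper.default.svc.cluster.local  # hypothetical CloudEvents sink
    triton:
      storageUri: s3://~~/torchscript/cifar   # path elided as in the chat
```

The logger only handles CloudEvents delivery; a small collector behind the sink URL would do the actual shipping to Loki/Elasticsearch.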