!LuUSGaeArTeoOgUpwk:matrix.org

kubeflow-kfserving

433 Members
2 Servers



SenderMessageTime
29 Apr 2022
@_slack_kubeflow_U03E7U44FC0:matrix.orgBogdan Kowalczyk joined the room.15:13:45
@_slack_kubeflow_U03E7U44FC0:matrix.orgBogdan Kowalczyk changed their display name from _slack_kubeflow_U03E7U44FC0 to Bogdan Kowalczyk.15:14:55
@_slack_kubeflow_U03E7U44FC0:matrix.orgBogdan Kowalczyk set a profile picture.15:14:57
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Fantastic... let me try this out 15:40:42
30 Apr 2022
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun Jan Migoń this seems like the related issue https://github.com/kserve/kserve/issues/1342 00:26:51
@_slack_kubeflow_UL3871NM6:matrix.org_slack_kubeflow_UL3871NM6 joined the room.10:27:25
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun Let me know if that works, would be great to see your monitoring doc contribution!! 14:27:39
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Here's what I have come up with so far:
• I could see a bunch of metrics on ports 9091 and 9090 that are exported by the queue-proxy.
• The ones on 9090 were a little more interesting to me, such as requests_per_second, etc. The ones on 9091 are about Go.
I made the changes to Prometheus as suggested here and also imported the Grafana dashboards. However, I do not see all the metrics being scraped by Prometheus. In particular, the Knative Serving - Revision HTTP Requests dashboard shows up with No Data. I see that there are no activator_request_count metrics in Prometheus. I don't even know if these are exported by any component. Are the activator metrics available to be scraped by Prometheus?
20:59:52
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun Yes you will need to add the Prometheus annotation on the activator pod 21:40:01
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun Also autoscaler pod 21:40:02
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun Did you see the revision http request metrics ? 21:41:04
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Oh... I see. I just added the following annotations to the activator, autoscaler (and also the controller) pods. I see they have metrics being exported on port 9090.
prometheus.io/scrape: 'true'
prometheus.io/port: '9090'
Will know the results shortly..
23:16:59
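For reference, annotations set directly on a running pod are lost when the pod is recreated, so they are usually placed on the Deployment's pod template instead. Below is a minimal sketch of a patch for the activator, assuming its Deployment is named activator and lives in the knative-serving namespace (both are assumptions here, as is the file name). Note that prometheus.io/scrape and prometheus.io/port are a convention: they only take effect if the Prometheus scrape configuration has a pod-discovery job that reads them.

```yaml
# activator-metrics-patch.yaml (hypothetical file name)
# Strategic-merge patch: adds the scrape annotations from the message above
# to the pod template, so that newly created pods carry them too.
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: 'true'   # opt this pod in to annotation-based scraping
        prometheus.io/port: '9090'     # port where the metrics were observed
```

A patch like this could be applied with something like kubectl patch deployment activator -n knative-serving --patch-file activator-metrics-patch.yaml, and the same fragment would apply to the autoscaler and controller Deployments.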
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Hmm.. does Prometheus config need to be configured for explicitly including specific namespaces? 23:32:50
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Ok.. I think I see some metrics in Prometheus. This article helped a lot.
• Basically, the config serviceMonitorNamespaceSelector: {} in the Prometheus CRD means all namespaces will be watched for ServiceMonitor objects.
• The serviceMonitorSelector: field in the Prometheus CRD indicates the labels that should be put on ServiceMonitor objects. I had this set to release: kube-prometheus-stack-1651295153 because I used --generate-name when installing the Helm chart.
• The ServiceMonitors created in the knative-serving namespace didn't have this label.
• I added this label to the ServiceMonitor objects.
• Next, each ServiceMonitor needs a selector for the Services it should scrape.
• I saw that all Services had the label serving.knative.dev/release=v0.22.1. I added this to the ServiceMonitor, and I'm now seeing the metrics in Prometheus.
23:56:05
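The steps above can be sketched roughly as the two resources below. This is an illustration under stated assumptions, not the exact manifests used in this thread: the ServiceMonitor name, the http-metrics port name, and the scrape interval are hypothetical, while the release and serving.knative.dev/release label values are the ones quoted in the message.

```yaml
# Fragment of the Prometheus custom resource (Prometheus Operator).
# An empty namespace selector watches all namespaces for ServiceMonitors;
# the label selector decides which ServiceMonitors are picked up.
spec:
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack-1651295153
---
# ServiceMonitor in the Knative namespace.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: knative-serving-metrics        # hypothetical name
  namespace: knative-serving
  labels:
    # Must match serviceMonitorSelector above, otherwise it is ignored.
    release: kube-prometheus-stack-1651295153
spec:
  namespaceSelector:
    matchNames:
      - knative-serving
  selector:
    matchLabels:
      # Label reported on the Knative Services in this thread.
      serving.knative.dev/release: v0.22.1
  endpoints:
    - port: http-metrics               # assumed name of the Service port serving 9090
      interval: 30s                    # illustrative value
```

The key point is the two-level match: Prometheus selects ServiceMonitors by label, and each ServiceMonitor in turn selects Services by label, so a mismatch at either level results in nothing being scraped.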
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Let me look at the Grafana dashboards 23:56:30
1 May 2022
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Seems to be working 😄 00:02:09
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar I think I have everything I need at this point. I will add these details to the https://github.com/knative/docs repo and send out a PR by Monday. 00:04:00
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Thanks a lot Dan Sun! 00:04:05
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar I wanted to explore how I could get the prediction metrics themselves (e.g. confidence score of the predictions) into Prometheus. But I will explore that later. 00:05:29
@_slack_kubeflow_U03DQAW3Z36:matrix.org_slack_kubeflow_U03DQAW3Z36 joined the room.03:28:20
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan I think those are custom, so u would have to push those metrics yourself 05:47:32
@wybpip:matrix.org@wybpip:matrix.org joined the room.16:03:35
@wybpip:matrix.org@wybpip:matrix.org left the room.16:03:36
2 May 2022
@_slack_kubeflow_U024SH58F43:matrix.orgAjay kumar saini joined the room.12:02:32
@_slack_kubeflow_U02TG7VHLF4:matrix.orgJan Migoń Thanks, it helped me. Problem solved. I deleted the mutating and validating webhooks for v1alpha2, as I don't need them and they had the same names as the ones for v1beta1, which was causing the error. 12:27:52
@_slack_kubeflow_U9UFLSBM4:matrix.org_slack_kubeflow_U9UFLSBM4 I have some fairly vague questions. Hopefully, someone can help. Are there any docs around that deal with various aspects of monitoring models served with modelmesh-serving? Specifically, data drift, model drift, outlier detection. Also, is there a way to get explanations for predictions or to figure out feature attribution? 18:56:03
@_slack_kubeflow_U022U7KG24W:matrix.orgRachit Chauhan Is there a version compatibility matrix showing which versions of Kubeflow work with which versions of KServe? 19:20:09
@_slack_kubeflow_U03E5L0V0SV:matrix.org_slack_kubeflow_U03E5L0V0SV joined the room.19:29:19
@_slack_kubeflow_U0127AUTPMH:matrix.orgNick Hi croberts, we don't have this kind of thing in modelmesh yet, but have discussed it in the past. There's some in-progress work (mostly complete, I think) to support kserve transformers with modelmesh predictors; I expect something similar could be done with explainers. 20:54:45
@_slack_kubeflow_U9UFLSBM4:matrix.org_slack_kubeflow_U9UFLSBM4 Thanks Nick. Is there support for batch inference in mm-serving? The other thing I couldn't find anything on was the possibility of canary rollout/a-b testing. Anything on those? 20:56:15



Room Version: 6