28 Apr 2022 |
Johnu | Dan Sun Talk contents are revealed 🙂 | 09:13:30 |
| Atra Akandeh joined the room. | 14:55:41 |
Rachit Chauhan | Question around custom predictors. I created a custom predictor like this (imports added for completeness):
from typing import Dict

import kserve

class AlexNetModel(kserve.Model):
    def __init__(self, name: str):
        super().__init__(name)
        self.name = name
        self.load()

    def load(self):
        print("loaded model successfully")

    def predict(self, request: Dict) -> Dict:
        print("I am an ECHO model. You get what you pass...")
        return {"predictions": request}

if __name__ == "__main__":
    print("From inside main logs")
    model = AlexNetModel("custom-model")
    model.load()
    kserve.ModelServer(workers=1).start([model])
And then created an ISVC using this predictor image:
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: echo-model-rac
spec:
  predictor:
    containers:
      - name: kserve-container
        image: rachitchauhan885/custom-model:v4
Now when invoking this ISVC I used this curl:
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/echo-model-rac:predict -d @./iris-input.json
and I got this error:
{"error": "Model with name echo-model-rac does not exist."}
and when I used the model name custom-model in my curl it worked fine, i.e.:
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/custom-model:predict -d @./iris-input.json
I observed that with cluster serving runtimes, whatever name we give the ISVC can be used to invoke the model's predictor, but not with custom runtimes. Why so? What can we do to assign the same name as the ISVC to the model in the case of a custom predictor? | 17:37:11 |
Rachit Chauhan | What is the need for knative-local-gateway? In mesh mode, all the east-west traffic (for example, predictor to transformer) can anyway be done via the Envoy sidecars.
And in non-mesh mode, k8s anyway creates DNS records for its services, like my-svc.my-namespace.svc.cluster-domain.example, and KServe creates ClusterIP services for every ISVC. Why not just use those ClusterIP services instead of going via knative-local-gateway? | 21:59:44 |
29 Apr 2022 |
Dan Sun | in non-mesh mode, you need the local gateway for transformer/predictor communication; there is an indirection through the virtual service of the Knative service, which does not call the Kubernetes service directly. | 01:51:36 |
Dan Sun | every time you roll out a new revision, the virtual service changes the pointer to the latest revision | 01:52:39 |
Dan Sun | with non-mesh mode you have to go through the gateway | 01:53:17 |
Dan Sun | I mean envoy proxy which has the routing info | 01:53:38 |
Dan Sun | with mesh mode you don’t need that because the transformer/predictor pods are directly injected with the istio sidecar | 01:54:21 |
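To make the indirection concrete: in serverless mode the transformer reaches the predictor through a cluster-local Knative hostname rather than a plain ClusterIP service. A minimal sketch of the host format (the `-predictor-default` suffix matches the KServe/Knative defaults of this era; the name and namespace are hypothetical):

```python
# The cluster-local host a transformer would call. Knative's virtual
# service resolves this host at the local gateway (an Envoy proxy that
# holds the routing info), which forwards to the latest revision —
# this is why a plain ClusterIP service is not enough.
name, namespace = "echo-model-rac", "default"
predictor_host = f"{name}-predictor-default.{namespace}.svc.cluster.local"
print(predictor_host)  # → echo-model-rac-predictor-default.default.svc.cluster.local
```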
Dan Sun | because you are assigning the name explicitly: AlexNetModel("custom-model") | 02:05:18 |
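One way to make the served name match whatever the ISVC is called is to stop hard-coding it and read it at startup. A minimal sketch, assuming you pass the ISVC name into the container yourself (the MODEL_NAME variable is an assumption for illustration, not a KServe convention):

```python
import os

# Read the model name at startup instead of hard-coding "custom-model".
# Set MODEL_NAME (hypothetical env var) on kserve-container in the ISVC
# spec so it equals the ISVC name, e.g. "echo-model-rac".
model_name = os.environ.get("MODEL_NAME", "custom-model")
print(model_name)  # falls back to "custom-model" when the variable is unset

# Then construct the model with it:
# model = AlexNetModel(model_name)
# kserve.ModelServer(workers=1).start([model])
```

The same effect can be achieved with a command-line argument on the container; either way, the name registered with the ModelServer is what appears in the /v1/models/{name}:predict path.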
Shri Javadekar | Hey everyone.. I have gotten an sklearn model to be served via KServe's InferenceService. Are there any prometheus metrics exported by the predictor service for a given model? | 03:32:34 |
Benjamin Tan | I think u just need to add the prometheus-related annotations | 03:37:34 |
Benjamin Tan | It should be covered in the KServe docs | 03:37:45 |
Shri Javadekar | I found this.. not sure if that is specific to Torchserve | 03:38:26 |
Shri Javadekar | I added the two annotations for my sklearn model and there are no metrics exported (nor is there an open port):
annotations:
  prometheus.io/port: "8082"
  prometheus.io/scrape: "true" | 03:41:24 |
Shri Javadekar | This yaml showed the manager pod's metrics being scraped. But it doesn't have metrics specific to a given model (predictor) service.
If someone can point me to how to get metrics for a specific model, I can volunteer to contribute it to the documentation for future use. | 03:47:42 |
Benjamin Tan | I have this for my model:
prometheus.io/scrape: "true"
prometheus.io/path: "/metrics"
prometheus.io/port: "8000" | 03:50:25 |
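For context, these annotations would sit on the InferenceService metadata — a sketch assuming ISVC annotations propagate down to the predictor pods (the name and storageUri below are placeholders, not from this thread):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                       # hypothetical name
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "8000"
spec:
  predictor:
    sklearn:
      storageUri: gs://example-bucket/model  # placeholder
```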
Benjamin Tan | i assume u have made some inferences so that metrics are generated in the first place 😄 | 03:51:51 |
Shri Javadekar | 🙂 . yes.. I am continuously sending it some inputs and it is correctly returning the outputs. | 03:59:29 |
Rachit Chauhan | Dan Sun: Ok, so the indirection of the virtual service with the Knative service is by design of Knative Serving then, since it creates a virtual service for every service, right? | 05:14:45 |
Dan Sun | That’s correct | 11:14:42 |
Dan Sun | If you are running serverless mode you can scrape the queue proxy metrics on port 9091 | 11:18:56 |
Dan Sun | https://knative.dev/docs/serving/observability/metrics/collecting-metrics/ | 11:19:35 |
Dan Sun | https://github.com/knative-sandbox/monitoring | 11:28:19 |
Dan Sun | Shri Javadekar | 11:31:33 |
| Jan Migoń joined the room. | 14:16:44 |
Jan Migoń | Hey everyone! I have been using Kubeflow for some time now, and a while ago I added a KFServing component (not KServe yet). When I create a v1beta1 InferenceService I get this error from the conversion webhook:
[500] Internal error occurred: conversion webhook for serving.kubeflow.org/v1beta1, Kind=InferenceService failed: Post "https://kfserving-webhook-server-service.kubeflow.svc:443/convert?timeout=30s": EOF https://host/models/api/namespaces/kubeflow-shortid/inferenceservices
Looking into logs of kfserving-controller-manager I see
http: panic serving 192.168.0.195:43244: runtime error: invalid memory address or nil pointer dereference
goroutine 82750 [running]:
net/http.(*conn).serve.func1(0xc000d5b680)
/usr/local/go/src/net/http/server.go:1800 +0x139
panic(0x19ea7e0, 0x2e45000)
/usr/local/go/src/runtime/panic.go:975 +0x3e3
github.com/kubeflow/kfserving/pkg/apis/serving/v1alpha2.(*InferenceService).ConvertFrom(0xc0005a2b40, 0x2001260, 0xc00028a000, 0x2001260, 0xc00028a000)
/go/src/github.com/kubeflow/kfserving/pkg/apis/serving/v1alpha2/inferenceservice_conversion.go:236 +0x1801
sigs.k8s.io/controller-runtime/pkg/webhook/conversion.(*Webhook).convertObject(0xc00007d3c0, 0x1fdaaa0, 0xc00028a000, 0x1fda9e0, 0xc0005a2b40, 0x1fda9e0, 0xc0005a2b40)
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.7.0/pkg/webhook/conversion/conversion.go:142 +0x7bc
sigs.k8s.io/controller-runtime/pkg/webhook/conversion.(*Webhook).handleConvertRequest(0xc00007d3c0, 0xc0002a0140, 0xc000b4c960, 0x0, 0x0)
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.7.0/pkg/webhook/conversion/conversion.go:107 +0x1f8
sigs.k8s.io/controller-runtime/pkg/webhook/conversion.(*Webhook).ServeHTTP(0xc00007d3c0, 0x7f6bc47bca30, 0xc0005396d0, 0xc000a0c300)
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.7.0/pkg/webhook/conversion/conversion.go:74 +0x10b
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerInFlight.func1(0x7f6bc47bca30, 0xc0005396d0, 0xc000a0c300)
/go/pkg/mod/github.com/prometheus/client_golang@v1.7.1/prometheus/promhttp/instrument_server.go:40 +0xab
net/http.HandlerFunc.ServeHTTP(0xc0007bc930, 0x7f6bc47bca30, 0xc0005396d0, 0xc000a0c300)
/usr/local/go/src/net/http/server.go:2041 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1(0x2010360, 0xc000f04620, 0xc000a0c300)
/go/pkg/mod/github.com/prometheus/client_golang@v1.7.1/prometheus/promhttp/instrument_server.go:100 +0xda
net/http.HandlerFunc.ServeHTTP(0xc0007bca80, 0x2010360, 0xc000f04620, 0xc000a0c300)
/usr/local/go/src/net/http/server.go:2041 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2(0x2010360, 0xc000f04620, 0xc000a0c300)
/go/pkg/mod/github.com/prometheus/client_golang@v1.7.1/prometheus/promhttp/instrument_server.go:76 +0xb2
net/http.HandlerFunc.ServeHTTP(0xc0007bcb70, 0x2010360, 0xc000f04620, 0xc000a0c300)
/usr/local/go/src/net/http/server.go:2041 +0x44
net/http.(*ServeMux).ServeHTTP(0xc000860f00, 0x2010360, 0xc000f04620, 0xc000a0c300)
/usr/local/go/src/net/http/server.go:2416 +0x1a5
net/http.serverHandler.ServeHTTP(0xc0009b61c0, 0x2010360, 0xc000f04620, 0xc000a0c300)
/usr/local/go/src/net/http/server.go:2836 +0xa3
net/http.(*conn).serve(0xc000d5b680, 0x20166a0, 0xc00076ac00)
/usr/local/go/src/net/http/server.go:1924 +0x86c
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:2962 +0x35c
What I don't understand is why the conversion is even happening, as I am no longer using any v1alpha2 inference services and my CRDs also serve only v1beta1 versions. And why is the conversion using the v1alpha2 packages if my manifests do not include any v1alpha2 versions for inference services?
I would appreciate it if you could guide me in some direction for further debugging. Thanks! | 14:23:13 |
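One likely reason the v1alpha2 converter runs at all (generic Kubernetes CRD behaviour, not confirmed in this thread): if any InferenceService object was ever written under v1alpha2, the CRD keeps that version in status.storedVersions, and the API server invokes the conversion webhook when reading those objects back, even if every new manifest is v1beta1. A small sketch of the check, using a hypothetical CRD status:

```python
import json

# Hypothetical CRD status, e.g. from:
#   kubectl get crd inferenceservices.serving.kubeflow.org \
#     -o jsonpath='{.status}'
# If v1alpha2 is still listed, old objects in etcd are stored under it
# and will be converted on read.
crd_status = json.loads('{"storedVersions": ["v1alpha2", "v1beta1"]}')

needs_migration = "v1alpha2" in crd_status["storedVersions"]
print(needs_migration)  # → True
```

If v1alpha2 does appear there, rewriting the stored objects (reading and re-saving them as v1beta1) and then dropping v1alpha2 from storedVersions is the usual migration path.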