Sender | Message | Time |
---|---|---|
6 May 2022 | ||
Saurabh Agarwal | I want to curl my InferenceService from inside the cluster. Do I need to use http://{InferenceService name}.{namespace}/v1/models or http://{service name}.{namespace}/v1/models? | 11:19:25 |
Alexandre Brown | For me this worked:
```yaml
apiVersion: "serving.kubeflow.org/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    sklearn:
      storageUri: "gs://kfserving-samples/models/sklearn/iris"
```
```python
import requests

data = {
    "instances": [
        [6.8, 2.8, 4.8, 1.4],
        [6.0, 3.4, 4.5, 1.6],
    ]
}
url = "http://sklearn-iris.dev.svc.cluster.local/v1/models/sklearn-iris:predict"
headers = {"Host": "sklearn-iris.dev.svc.cluster.local"}

response = requests.post(url, headers=headers, json=data)
print("Status Code", response.status_code)
print("JSON Response ", response.json())
```
In this example my model name is sklearn-iris and my namespace is dev. | 11:24:04 |
Chris Chase | So, I'm new to this. Out of curiosity, why etcd at all? Would it be architecturally flawed to just use/watch Kube custom resources? | 14:12:21 |
Shri Javadekar | Look at the status field in your InferenceService object. You should see the URL that you can use internally.
e.g.
```
$ k get inferenceservice my-model-b -o yaml
...
status:
  address:
    url: http://my-model-b.kserve-test.svc.cluster.local/v1/models/my-model-b:predict
```
| 23:49:12 |
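Not part of the original thread, but the same lookup can be done programmatically once the object has been fetched from the cluster. A minimal sketch; the `manifest` dict and the `internal_url` helper are hypothetical, standing in for the parsed object `kubectl get ... -o yaml` returns:

```python
# Hypothetical stand-in for a fetched InferenceService manifest;
# the nested keys mirror the kubectl output shown above.
manifest = {
    "status": {
        "address": {
            "url": "http://my-model-b.kserve-test.svc.cluster.local/v1/models/my-model-b:predict"
        }
    }
}

def internal_url(isvc: dict):
    """Return status.address.url, or None if the service isn't ready yet."""
    return isvc.get("status", {}).get("address", {}).get("url")

print(internal_url(manifest))
```

Using `.get()` with empty-dict defaults avoids a KeyError on a freshly created service whose status has not been populated yet.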
7 May 2022 | ||
Saurabh Agarwal | Shri Javadekar This URL doesn't work, but the predictor URL does. | 10:53:24 |
9 May 2022 | ||
Nick | Hey Chris Chase sorry for being late to the thread. Re your last question, I wrote an explanation of this in response to the same question here | 21:41:21 |
Nick | W.r.t. suitable etcd cluster config for production use, typically a cluster of 3 members should be sufficient. The amount of memory/storage required will depend a bit on how many models you have but generally should be pretty small since the stored metadata is minimal. | 21:44:31 |
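To make the three-member recommendation concrete, here is a sketch of a static etcd bootstrap. The flags are real etcd flags, but the hostnames and cluster layout are illustrative assumptions, not anything from the thread:

```
# One of three members; repeat on each host with the matching --name and URLs.
etcd --name etcd-0 \
  --initial-advertise-peer-urls http://etcd-0:2380 \
  --listen-peer-urls http://0.0.0.0:2380 \
  --listen-client-urls http://0.0.0.0:2379 \
  --advertise-client-urls http://etcd-0:2379 \
  --initial-cluster etcd-0=http://etcd-0:2380,etcd-1=http://etcd-1:2380,etcd-2=http://etcd-2:2380 \
  --initial-cluster-state new
```

Three members tolerate one failure while keeping a quorum, which matches the "typically a cluster of 3" guidance above; an operator would normally generate this configuration for you.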
Nick | As Paul Van Eck mentioned, we have an internal Kube/OpenShift operator for managing such clusters. There is also the "original" CoreOS etcd operator, though it's no longer maintained. But I just came across another one here that also looks promising. | 21:52:42 |
10 May 2022 | ||
Diego Kiner | Trying to figure out if it's possible to do MAB deployments - it's on the 0.9 project list but just links to this short thread: https://github.com/kserve/kserve/issues/1324. I tried manually editing the traffic spec in the Knative service spec to distribute among multiple revisions as suggested, but it seems to get reverted immediately by the controller. Of course this also wouldn't be a full solution since it's presumably not available via the SDK. Was hoping there was another recommended way to do this, or at least that there is some update on the work to add the feature. | 00:31:00 |
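As an editorial aside: for a fixed two-way split (a subset of what a multi-armed-bandit deployment needs), KServe's v1beta1 spec does expose `canaryTrafficPercent` on the predictor, which the controller won't revert the way a hand-edited Knative traffic spec is. A sketch; the name, model URI, and percentage are illustrative, and note this gives a static split, not adaptive MAB routing:

```yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    # Route 20% of traffic to the latest revision, 80% to the previous one.
    canaryTrafficPercent: 20
    sklearn:
      storageUri: "gs://kfserving-samples/models/sklearn/iris"
```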
Chris Chase | Nick thanks for the explanation. Very helpful! | 12:29:02 |
Timos | (posted an image: image.png) | 19:27:43 |