!LuUSGaeArTeoOgUpwk:matrix.org

kubeflow-kfserving

6 May 2022
@_slack_kubeflow_U03E5SSBG3G:matrix.orgPriyanka Choudhary set a profile picture.09:46:03
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal I want to curl my inferenceservice from inside cluster. Do I need to use http://{Inferenceservice name}.{namespace}/v1/models or http://{service name}.{namespace}/v1/models 11:19:25
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown For me this worked:
apiVersion: "serving.kubeflow.org/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    sklearn:
      storageUri: "gs://kfserving-samples/models/sklearn/iris"
import requests

data = {
    "instances": [
        [6.8, 2.8, 4.8, 1.4],
        [6.0, 3.4, 4.5, 1.6]
    ]
}

url = "http://sklearn-iris.dev.svc.cluster.local/v1/models/sklearn-iris:predict"
headers = {
    "Host": "sklearn-iris.dev.svc.cluster.local",
}

response = requests.post(url, headers=headers, json=data)

print("Status Code", response.status_code)
print("JSON Response ", response.json())
In this example my model name is sklearn-iris and my namespace is dev
11:24:04
@_slack_kubeflow_U013VE77D62:matrix.orgChris Chase so, i'm new to this. Out of curiosity, why etcd at all? Would it be architecturally flawed to just use/watch kube custom resources? 14:12:21
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Look at the status field in your inferenceservice object. You should see the url that you can use internally. e.g.
$ k get inferenceservice my-model-b -o yaml
...
status:
  address:
    url: http://my-model-b.kserve-test.svc.cluster.local/v1/models/my-model-b:predict
23:49:12
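[Editor's note: the status lookup above can also be done in code. A minimal sketch, assuming you have fetched the InferenceService as JSON (e.g. via kubectl get inferenceservice my-model-b -o json); the isvc_json sample here is illustrative, shaped like the output shown above:

```python
import json

def internal_url(isvc: dict) -> str:
    # The cluster-local address lives under status.address.url
    # once the InferenceService is ready.
    return isvc["status"]["address"]["url"]

# Illustrative object shaped like the `kubectl get ... -o yaml` output above.
isvc_json = """
{
  "status": {
    "address": {
      "url": "http://my-model-b.kserve-test.svc.cluster.local/v1/models/my-model-b:predict"
    }
  }
}
"""

print(internal_url(json.loads(isvc_json)))
# -> http://my-model-b.kserve-test.svc.cluster.local/v1/models/my-model-b:predict
```

In a live cluster the same dict could come from the kubernetes Python client's CustomObjectsApi instead of a file.]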
7 May 2022
@_slack_kubeflow_U03EDC2FA3X:matrix.org_slack_kubeflow_U03EDC2FA3X joined the room.03:38:44
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal Shri Javadekar This url doesn't work but the predictor url is working 10:53:24
9 May 2022
@_slack_kubeflow_U0127AUTPMH:matrix.orgNick Hey Chris Chase sorry for being late to the thread. Re your last question, I wrote an explanation of this in response to the same question here 21:41:21
@_slack_kubeflow_U0127AUTPMH:matrix.orgNick W.r.t. suitable etcd cluster config for production use, typically a cluster of 3 members should be sufficient. The amount of memory/storage required will depend a bit on how many models you have but generally should be pretty small since the stored metadata is minimal. 21:44:31
@_slack_kubeflow_U0127AUTPMH:matrix.orgNick As Paul Van Eck mentioned, we have an internal Kube/OpenShift operator for managing such clusters. There is also the "original" coreos etcd operator though it's no longer maintained. But I just came across another one here that also looks promising 21:52:42
10 May 2022
@_slack_kubeflow_U02PHBULPDZ:matrix.orgDiego Kiner Trying to figure out if it's possible to do MAB deployments - it's on the 0.9 project list but just links to this short thread: https://github.com/kserve/kserve/issues/1324. I tried manually editing the traffic spec in the Knative service spec to distribute among multiple revisions as suggested, but it seems to get reverted immediately by the controller. Of course this also wouldn't be a full solution since it's presumably not available via the SDK. Was hoping there was another recommended way to do this, or at least that there is some update on the work to add the feature. 00:31:00
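[Editor's note: for readers unfamiliar with the term, a multi-armed-bandit (MAB) deployment shifts traffic among model revisions adaptively based on observed rewards, rather than with fixed percentages. A minimal epsilon-greedy sketch of that selection logic, purely illustrative; the revision names and reward bookkeeping are hypothetical, and KServe does not expose such an API:

```python
import random

class EpsilonGreedyRouter:
    """Pick a revision: explore with probability epsilon, else exploit the best mean reward."""

    def __init__(self, revisions, epsilon=0.1, seed=None):
        self.revisions = list(revisions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {r: 0 for r in self.revisions}
        self.rewards = {r: 0.0 for r in self.revisions}

    def choose(self):
        if self.rng.random() < self.epsilon:
            # Explore: pick a revision uniformly at random.
            return self.rng.choice(self.revisions)
        # Exploit: highest mean observed reward (unseen revisions count as 0.0).
        return max(self.revisions,
                   key=lambda r: self.rewards[r] / self.counts[r] if self.counts[r] else 0.0)

    def record(self, revision, reward):
        # Feed back an outcome (e.g. 1.0 for a good prediction) for a revision.
        self.counts[revision] += 1
        self.rewards[revision] += reward

# Hypothetical revision names for illustration.
router = EpsilonGreedyRouter(["sklearn-iris-rev-1", "sklearn-iris-rev-2"], epsilon=0.1, seed=0)
router.record("sklearn-iris-rev-2", 1.0)  # pretend rev-2 scored well
print(router.choose())
```

Actually deploying this would still require the router to rewrite the Knative traffic spec (or front the revisions itself), which is exactly the part the thread says the controller currently reverts.]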
@_slack_kubeflow_U013VE77D62:matrix.orgChris Chase Nick thanks for the explanation. Very helpful! 12:29:02
@_slack_kubeflow_U02SF1C1Y67:matrix.orgTimos sent an image: image.png
19:27:43
