kubeflow-kfserving

Sender | Message | Time
6 May 2022
@_slack_kubeflow_U0169DM0D8T:matrix.org Aniruddha Choudhury joined the room. 09:45:56
@_slack_kubeflow_U03E5SSBG3G:matrix.org Priyanka Choudhary joined the room. 09:45:59
@_slack_kubeflow_U03E5SSBG3G:matrix.org Priyanka Choudhary changed their display name from _slack_kubeflow_U03E5SSBG3G to Priyanka Choudhary. 09:46:01
@_slack_kubeflow_U03E5SSBG3G:matrix.org Priyanka Choudhary set a profile picture. 09:46:03
@_slack_kubeflow_U032CM1LH3N:matrix.org Saurabh Agarwal I want to curl my InferenceService from inside the cluster. Do I need to use http://{inferenceservice name}.{namespace}/v1/models or http://{service name}.{namespace}/v1/models? 11:19:25
@_slack_kubeflow_U02AYBVSLSK:matrix.org Alexandre Brown For me this worked:
apiVersion: "serving.kubeflow.org/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    sklearn:
      storageUri: "gs://kfserving-samples/models/sklearn/iris"
import requests

data = {
    "instances": [
        [6.8, 2.8, 4.8, 1.4],
        [6.0, 3.4, 4.5, 1.6]
    ]
}

url = "http://sklearn-iris.dev.svc.cluster.local/v1/models/sklearn-iris:predict"
headers = {
    "Host": "sklearn-iris.dev.svc.cluster.local",
}

response = requests.post(url, headers=headers, json=data)

print("Status Code", response.status_code)
print("JSON Response ", response.json())
In this example my model name is sklearn-iris and my namespace is dev.
11:24:04
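For reference, a curl equivalent of the Python snippet above (a minimal sketch, assuming the same sklearn-iris model in the dev namespace):

curl -H "Host: sklearn-iris.dev.svc.cluster.local" \
     -H "Content-Type: application/json" \
     -d '{"instances": [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]]}' \
     http://sklearn-iris.dev.svc.cluster.local/v1/models/sklearn-iris:predict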
@_slack_kubeflow_U013VE77D62:matrix.org Chris Chase So, I'm new to this. Out of curiosity, why etcd at all? Would it be architecturally flawed to just use/watch Kube custom resources? 14:12:21
@_slack_kubeflow_U0315UY2WRM:matrix.org Shri Javadekar Look at the status field in your InferenceService object. You should see the URL that you can use internally, e.g.
$ k get inferenceservice my-model-b -o yaml
...
status:
  address:
    url: http://my-model-b.kserve-test.svc.cluster.local/v1/models/my-model-b:predict
23:49:12
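For reference, that URL can also be pulled straight out of the status with a jsonpath query (a minimal sketch, reusing the my-model-b object above):

$ kubectl get inferenceservice my-model-b -o jsonpath='{.status.address.url}'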
7 May 2022
@_slack_kubeflow_U03EDC2FA3X:matrix.org _slack_kubeflow_U03EDC2FA3X joined the room. 03:38:44
@_slack_kubeflow_U032CM1LH3N:matrix.org Saurabh Agarwal Shri Javadekar: This URL doesn't work, but the predictor URL is working. 10:53:24
9 May 2022
@_slack_kubeflow_U0127AUTPMH:matrix.org Nick Hey Chris Chase, sorry for being late to the thread. Re your last question, I wrote an explanation of this in response to the same question here 21:41:21
@_slack_kubeflow_U0127AUTPMH:matrix.org Nick W.r.t. a suitable etcd cluster config for production use, typically a cluster of 3 members should be sufficient. The amount of memory/storage required will depend a bit on how many models you have, but it should generally be pretty small since the stored metadata is minimal. 21:44:31
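For illustration, one member of such a 3-node cluster can be bootstrapped with etcd's static configuration flags (a sketch only; the etcd-0/etcd-1/etcd-2 host names are hypothetical placeholders):

etcd --name etcd-0 \
  --listen-peer-urls http://0.0.0.0:2380 \
  --listen-client-urls http://0.0.0.0:2379 \
  --initial-advertise-peer-urls http://etcd-0:2380 \
  --advertise-client-urls http://etcd-0:2379 \
  --initial-cluster etcd-0=http://etcd-0:2380,etcd-1=http://etcd-1:2380,etcd-2=http://etcd-2:2380 \
  --initial-cluster-state new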
@_slack_kubeflow_U0127AUTPMH:matrix.org Nick As Paul Van Eck mentioned, we have an internal Kube/OpenShift operator for managing such clusters. There is also the "original" CoreOS etcd operator, though it's no longer maintained. But I just came across another one here that also looks promising 21:52:42
10 May 2022
@_slack_kubeflow_U02PHBULPDZ:matrix.org Diego Kiner Trying to figure out if it's possible to do MAB (multi-armed bandit) deployments - it's on the 0.9 project list but just links to this short thread: https://github.com/kserve/kserve/issues/1324. I tried manually editing the traffic spec in the Knative Service spec to distribute among multiple revisions as suggested, but it seems to get reverted immediately by the controller. Of course this also wouldn't be a full solution, since it's presumably not available via the SDK. Was hoping there was another recommended way to do this, or at least that there is some update on the work to add the feature. 00:31:00
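For context, the manual edit being described is roughly this fragment of the underlying Knative Service (revision names are hypothetical), which the KServe controller then reconciles away:

spec:
  traffic:
    - revisionName: my-model-predictor-default-00001  # hypothetical revision
      percent: 70
    - revisionName: my-model-predictor-default-00002  # hypothetical revision
      percent: 30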
@_slack_kubeflow_U013VE77D62:matrix.org Chris Chase Nick, thanks for the explanation. Very helpful! 12:29:02
11 May 2022
@_slack_kubeflow_U0104H1616Z:matrix.org iamlovingit Hi Diego Kiner, the community plans to support A/B/N testing via the new inference graph feature, which is currently in review; you can try it if you're interested. 01:00:11
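For a rough idea, the in-review inference graph API sketches a weighted split along these lines (fields may change before merge; the service names are hypothetical):

apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: model-ab-test
spec:
  nodes:
    root:
      routerType: Splitter       # split traffic by weight across steps
      steps:
        - serviceName: model-a   # hypothetical InferenceService
          weight: 60
        - serviceName: model-b   # hypothetical InferenceService
          weight: 40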
@_slack_kubeflow_U02JYD39G57:matrix.org _slack_kubeflow_U02JYD39G57 changed their display name from _slack_kubeflow_U02JYD39G57 to Zoltán R. Jánki. 11:24:06
@_slack_kubeflow_U02JYD39G57:matrix.org _slack_kubeflow_U02JYD39G57 set a profile picture. 11:24:08
@_slack_kubeflow_UFVUV2UFP:matrix.org Dan Sun We are cancelling today's community meeting as many folks are not available. Also a reminder that KubeCon EU is next week and we have quite a few contributors giving KServe talks there! 12:51:04
@_slack_kubeflow_U03D067RTJN:matrix.org Ryan McCaffrey joined the room. 19:01:26
@_slack_kubeflow_U9UFLSBM4:matrix.org croberts I'm trying out the PVC example with kserve-raw on OpenShift. I have the model on my PV, but when I try to spin up the InferenceService, I get the following from the storage-initializer: https://paste.centos.org/view/e5848b4b Has anyone run into something similar or, better yet, solved it? 20:23:11
@_slack_kubeflow_U9UFLSBM4:matrix.org croberts Here is the example I'm working with: https://kserve.github.io/website/modelserving/storage/pvc/pvc/ 20:29:32
@_slack_kubeflow_U9UFLSBM4:matrix.org croberts Might be an issue with my storage class being set to WaitForFirstConsumer. Tweaking that to Immediate seems to get me rolling again. 21:08:57
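For anyone hitting the same thing, the knob in question is volumeBindingMode on the StorageClass; a minimal sketch (the name and provisioner values are placeholders for whatever your cluster uses):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: immediate-binding                   # placeholder name
provisioner: kubernetes.io/no-provisioner   # placeholder; use your cluster's provisioner
volumeBindingMode: Immediate                # instead of WaitForFirstConsumer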
12 May 2022
@_slack_kubeflow_U01T25HRREK:matrix.org Mark Winter Seems like maybe it can't find the file in the PVC? Is your model file called model.joblib like it expects? /mnt/pvc/model.joblib 03:09:17
@_slack_kubeflow_U01T25HRREK:matrix.org Mark Winter It seems scikit-learn model serving is hardcoded to the model.joblib file name at the moment. https://github.com/kserve/kserve/issues/2079 03:27:21
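Concretely, per the linked PVC doc, the expectation is a storageUri along these lines, with a file literally named model.joblib inside the referenced directory (the PVC name and path here are hypothetical):

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-from-pvc
spec:
  predictor:
    sklearn:
      storageUri: "pvc://model-pvc/model"   # model-pvc's model/ dir must contain model.joblib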
@_slack_kubeflow_U03CN7QAHN3:matrix.org zorba(손주형) Does KServe not support TensorRT? I thought it would be possible because of Triton, but TensorRT is not in the guide. 06:14:48
@_slack_kubeflow_U01T25HRREK:matrix.org Mark Winter When you use Triton with KServe, you get a normal Triton server, so you can use TensorRT with Triton as you would normally. 06:20:31
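In other words, a sketch: point a triton predictor at an ordinary Triton model repository and declare the TensorRT backend in the model's config.pbtxt (model name and bucket are hypothetical):

# <storageUri>/mymodel/config.pbtxt
name: "mymodel"
platform: "tensorrt_plan"   # tells Triton this is a serialized TensorRT engine
max_batch_size: 8

with an InferenceService predictor such as:

spec:
  predictor:
    triton:
      storageUri: "gs://my-bucket/triton-models"   # hypothetical repository location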
