!LuUSGaeArTeoOgUpwk:matrix.org

kubeflow-kfserving

433 Members
2 Servers

5 May 2022
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou I mean, is it normal that the first revision works while the second revision cannot deploy? The two revisions do not behave consistently. 06:18:04
@_slack_kubeflow_U02NJHK0Z19:matrix.orgPierre Prange Hey folks, great work with KServe. I'm currently looking into using KServe with Knative Eventing and OTEL tracing. I built and deployed some custom Python models from kserve.model, and configured a Broker/Trigger as well as tracing in Knative. I use the InferenceService response logger of Model A to send events to the broker ingress and a Knative Trigger to forward those to Model B. I can see 2 traces created but they don't correlate. Inspecting the message headers (Kafka Broker implementation) I can see a traceparent field. What's in your opinion the most convenient way to correlate those traces? Is it advisable to use the OTel SDK in custom model preprocessing to extract the trace-id and set the context? 08:58:58
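As a rough illustration of the "extract the trace-id" step asked about above, the traceparent value can be parsed by hand using the W3C Trace Context layout (version-traceid-parentid-flags). This is only a sketch: the function name parse_traceparent and the assumption that headers arrive as a plain dict of strings are hypothetical, and it does not show how KServe or the Kafka broker actually exposes headers.

```python
import re
from typing import NamedTuple, Optional

# W3C Trace Context layout: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    flags: str


def parse_traceparent(headers: dict) -> Optional[TraceParent]:
    """Return the parsed traceparent header, or None if missing or malformed.

    Assumes `headers` maps header names to string values (hypothetical shape).
    """
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    value = lowered.get("traceparent")
    if value is None:
        return None
    match = _TRACEPARENT.match(value.strip())
    if match is None:
        return None
    parsed = TraceParent(*match.groups())
    # Version "ff" and all-zero ids are invalid per the spec.
    if parsed.version == "ff":
        return None
    if set(parsed.trace_id) == {"0"} or set(parsed.parent_id) == {"0"}:
        return None
    return parsed
```

If both models log or propagate the same trace_id obtained this way, the two traces can at least be matched up by id; whether to do this by hand or through an OTel propagator is a design choice left open here.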
@_slack_kubeflow_U017QCZSQ48:matrix.orgPaul Van Eck Hey, sorry to hear that. The quickstart minio and etcd instances should probably be deployments, so I will convert them to provide some resiliency in these cases. These were meant for dev/experimentation, but I believe if you were to apply the quickstart dependencies again to bring the pods back up, the modelmesh controller would repopulate etcd based on the currently deployed predictors/isvcs. 20:04:00
@_slack_kubeflow_U013VE77D62:matrix.orgChris Chase yup, I plan on recreating it as needed for dev. What do you guys use/expect on production deployments of modelmesh? 20:51:18
@_slack_kubeflow_U013VE77D62:matrix.orgChris Chase (for etcd) 20:52:26
@_slack_kubeflow_U017QCZSQ48:matrix.orgPaul Van Eck Nick can answer better than I can, but definitely a multi-node cluster of etcd. At IBM, I believe we use some OpenShift-based etcd operator for deploying and managing HA etcd 21:08:11
6 May 2022
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun when you deploy the second revision it generates the prev tag for the previous version, so additional characters are added to the dns name. 00:40:22
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun Pierre Prange sounds like a great idea, would you be interested in opening an issue for this? 00:41:10
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun We probably want to document a production etcd setup, since this is critical for production modelmesh deployments 00:44:03
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou Yeah, but the first revision and the second revision cannot be deployed consistently. Is that the right design? 00:44:10
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun Well, nothing is wrong with the design; it is rather a side effect of adding the tag to the previous deployment 00:47:21
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun We probably need validation for the DNS name length 00:48:25
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou Oh, got it 00:48:43
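The DNS name length validation suggested in the thread above could look roughly like the following. The 63-character limit for a single DNS label comes from RFC 1035; everything else is an assumption for illustration. In particular, tagged_label is a hypothetical stand-in for however the tag is actually combined with the service name, which this log does not spell out.

```python
MAX_DNS_LABEL = 63  # RFC 1035 limit for a single DNS label


def tagged_label(service_name: str, tag: str) -> str:
    """Hypothetical naming scheme: the tag is prepended to the service name."""
    return f"{tag}-{service_name}"


def fits_dns_label(label: str) -> bool:
    """Check only the length constraint, not the allowed character set."""
    return 0 < len(label) <= MAX_DNS_LABEL


name = "a" * 60                                     # 60 characters: within the limit
print(fits_dns_label(name))                         # True
print(fits_dns_label(tagged_label(name, "prev")))   # False: 65 characters
```

This mirrors the behaviour described above: a name that is fine for the first revision can exceed the limit once a tag such as prev is added for the previous revision.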
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal I want to curl my InferenceService from inside the cluster. Do I need to use http://{Inferenceservice name}.{namespace}/v1/models or http://{service name}.{namespace}/v1/models? 11:19:25
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown For me this worked:
apiVersion: "serving.kubeflow.org/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    sklearn:
      storageUri: "gs://kfserving-samples/models/sklearn/iris"

import requests

# Two iris samples, each with four features
data = {
    "instances": [
        [6.8, 2.8, 4.8, 1.4],
        [6.0, 3.4, 4.5, 1.6],
    ]
}

# Cluster-local address: <name>.<namespace>.svc.cluster.local
url = "http://sklearn-iris.dev.svc.cluster.local/v1/models/sklearn-iris:predict"
headers = {
    "Host": "sklearn-iris.dev.svc.cluster.local",
}

response = requests.post(url, headers=headers, json=data)

print("Status Code", response.status_code)
print("JSON Response", response.json())
In this example my model name is sklearn-iris and my namespace is dev.
11:24:04
@_slack_kubeflow_U013VE77D62:matrix.orgChris Chase so, i'm new to this. Out of curiosity, why etcd at all? Would it be architecturally flawed to just use/watch kube custom resources? 14:12:21
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Look at the status field in your inferenceservice object. You should see the url that you can use internally. e.g.
$ k get inferenceservice my-model-b -o yaml
...
status:
  address:
    url: http://my-model-b.kserve-test.svc.cluster.local/v1/models/my-model-b:predict
23:49:12
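For scripting, the same status field can be read programmatically. The sketch below assumes the object was fetched with kubectl's -o json output and parsed into a dict; the helper name internal_url is hypothetical, and the sample JSON only mirrors the fragment quoted above rather than a full InferenceService object.

```python
import json
from typing import Optional


def internal_url(isvc: dict) -> Optional[str]:
    """Return status.address.url from an InferenceService object, if set."""
    status = isvc.get("status") or {}
    address = status.get("address") or {}
    return address.get("url")


# Sample input shaped like the status fragment quoted above.
raw = """
{
  "status": {
    "address": {
      "url": "http://my-model-b.kserve-test.svc.cluster.local/v1/models/my-model-b:predict"
    }
  }
}
"""
print(internal_url(json.loads(raw)))
```

Returning None when the field is absent lets the caller fall back to another address, for example when the service is not ready yet.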
7 May 2022
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal Shri Javadekar This url doesn't work but the predictor url is working 10:53:24
9 May 2022
@_slack_kubeflow_U0127AUTPMH:matrix.orgNick Hey Chris Chase sorry for being late to the thread. Re your last question, I wrote an explanation of this in response to the same question here 21:41:21
@_slack_kubeflow_U0127AUTPMH:matrix.orgNick W.r.t. suitable etcd cluster config for production use, typically a cluster of 3 members should be sufficient. The amount of memory/storage required will depend a bit on how many models you have but generally should be pretty small since the stored metadata is minimal. 21:44:31
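The reasoning behind a 3-member cluster can be checked with a little arithmetic: etcd needs a majority (quorum) of members to agree, so a cluster of n members keeps working as long as no more than n minus quorum members are down. A minimal sketch, with helper names chosen here for illustration:

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given member count."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can be lost while a majority is still available."""
    return members - quorum(members)


for n in (1, 3, 5):
    print(n, quorum(n), failures_tolerated(n))
# 1 1 0
# 3 2 1
# 5 3 2
```

So 3 members is the smallest size that survives the loss of one member, which fits the recommendation above.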
@_slack_kubeflow_U0127AUTPMH:matrix.orgNick As Paul Van Eck mentioned, we have an internal Kube/OpenShift operator for managing such clusters. There is also the "original" coreos etcd operator though it's no longer maintained. But I just came across another one here that also looks promising 21:52:42
10 May 2022
@_slack_kubeflow_U02PHBULPDZ:matrix.orgDiego Kiner Trying to figure out if it's possible to do MAB deployments - it's on the 0.9 project list but just links to this short thread: https://github.com/kserve/kserve/issues/1324. I tried manually editing the traffic spec in the Knative service spec to distribute among multiple revisions as suggested, but it seems to get reverted immediately by the controller. Of course this also wouldn't be a full solution since it's presumably not available via the SDK. Was hoping there was another recommended way to do this, or at least that there is some update on the work to add the feature. 00:31:00
@_slack_kubeflow_U013VE77D62:matrix.orgChris Chase Nick thanks for the explanation. Very helpful! 12:29:02
