!LuUSGaeArTeoOgUpwk:matrix.org

kubeflow-kfserving

434 Members
3 Servers



16 May 2022
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Saurabh Agarwal, if this still doesn't work, feel free to DM me and we can set up a Zoom call to go over it. I'm no expert, but happy to look at it with a fresh perspective. If we do get the solution, we can update this thread so that others learn from it too. 16:41:38
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal Sure, thanks. Let me try a few ways. 16:42:06
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown Also keep in mind that depending on your model server runtime protocol you will have a different URL. The KServe V1 protocol uses
/v1/models/model_name:predict
while the KServe V2 protocol uses
/v2/models/model_name/infer
16:42:16
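For example, with an InferenceService named sklearn-iris and the ingress host/port already exported, the two protocols would be called roughly like this (a sketch; the model name and input files are illustrative, and the V2 body schema differs from V1):
MODEL_NAME=sklearn-iris
# V1 protocol: the verb (:predict) is part of the path
curl -H "Host: ${SERVICE_HOSTNAME}" \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict \
  -d @./iris-input.json
# V2 protocol: explicit /infer endpoint, request body follows the V2 inference schema
curl -H "Host: ${SERVICE_HOSTNAME}" \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer \
  -d @./iris-input-v2.json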
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal using v1 only 16:42:48
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar A KServe InferenceService provides a lot more capabilities out of the box:
• (option of) automatic logging of inputs and outputs
• scalable pre- and post-processing options
• ability to add explainers
It's still in the works, but also the ability to do A/B testing, champion-challenger, etc. (with InferenceGraph CRDs). Yes, one could set up all of these things using Kubernetes primitives or one's own code. But then you need to maintain that code forever, and sometimes you might lag behind the open-source features and have to figure out migration plans, etc. IMHO, nothing right or wrong about any of the methods, just each team's preference. The best option (IMO) is to use open source and contribute bug fixes/features back, which is a win-win for everyone. 17:16:37
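A rough sketch of how a couple of those capabilities attach to a single InferenceService spec; the sklearn model, storage URI, and message-dumper logger URL below are placeholders, not something from this thread:
kubectl apply -f - <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    logger:                # automatic logging of inputs and outputs
      mode: all
      url: http://message-dumper.default.svc.cluster.local
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
  transformer:             # pre/post processing runs and scales as its own component
    containers:
      - image: kserve/image-transformer:latest
        name: kserve-container
EOF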
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar I've been thinking about this problem too. I wanted to consume the output of one of my models in my own logger service and have the model's confidence scores exported as Prometheus metrics. But, not sure how I could do that across models which could have different types of outputs. If there was a way to declare the output format, I could parse that in the logger service and accordingly look for the confidence scores. 17:19:44
@_slack_kubeflow_U035NJU7P9N:matrix.orgRoman Solovyev Thanks 17:22:36
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown (Redacted or Malformed Event) 17:47:43
@_slack_kubeflow_U03D067RTJN:matrix.orgRyan McCaffrey Hi everyone, I'm new to KServe and trying to start by running things locally. I am running minikube on my laptop and have done the full Kubeflow v1.4 install successfully with the appropriate dependency versions shown here: https://www.kubeflow.org/docs/releases/kubeflow-1.4/ I am trying to serve my first sklearn example model locally and have followed these instructions:
Istio gateway/ingress setup: https://istio.io/latest/docs/tasks/traffic-management/ingress/ingress-control/
Running a prediction: https://github.com/kserve/kserve/tree/release-0.6/docs/samples/v1beta1/sklearn/v1
And for some reason I keep getting 302 errors. Has anyone encountered this before?
MODEL_NAME=sklearn-iris
INPUT_PATH=@./iris-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
*   Trying 192.168.64.16:31781...
* Connected to 192.168.64.16 (192.168.64.16) port 31781 (#0)
> POST /v1/models/sklearn-iris:predict HTTP/1.1
> Host: sklearn-iris.default.example.com
> User-Agent: curl/7.78.0
> Accept: */*
> Content-Length: 76
> Content-Type: application/x-www-form-urlencoded
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 302 Found
< location: /dex/auth?client_id=kubeflow-oidc-authservice&redirect_uri=%2Flogin%2Foidc&response_type=code&scope=profile+email+groups+openid&state=MTY1MjcyODM0NnxFd3dBRUZCWlN6ZDJiWFZpTWxnNGRWaEpkVFk9fFIBBaMr4YRh4f0g-Q7-ZeT_vBPw2OWyqQUXSI7asDpv
< date: Mon, 16 May 2022 19:12:26 GMT
< x-envoy-upstream-service-time: 1
< server: istio-envoy
< content-length: 0
< 
* Connection #0 to host 192.168.64.16 left intact
19:14:21
@_slack_kubeflow_U03GCSE8U48:matrix.org_slack_kubeflow_U03GCSE8U48 joined the room.20:45:52
@_slack_kubeflow_U02LE3KB53M:matrix.orgVivian Pan Does anyone know how the version control works for this branch? Where is the image of the kserve controller deployed to? 22:46:52
17 May 2022
@_slack_kubeflow_U03CN7QAHN3:matrix.orgzorba(손주형) I have a question about how KServe exposes an InferenceService. 02:22:24
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan Yeah because you're using Dex, you'll need to specify session cookies 04:00:11
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan one sec lemme find the docs 04:00:17
@_slack_kubeflow_UM56LA7N3:matrix.orgBenjamin Tan https://github.com/kserve/kserve/tree/master/docs/samples/istio-dex 04:00:45
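The istio-dex sample linked above boils down to authenticating against Dex once, grabbing the session cookie set by oidc-authservice, and sending it with the prediction request. A minimal sketch, reusing the variables from the earlier curl command and assuming the default authservice_session cookie name; the cookie value is a placeholder:
# after logging in through Dex (browser or the sample's auth script), copy the session cookie
SESSION="<authservice_session cookie value>"
curl -v -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Cookie: authservice_session=${SESSION}" \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict \
  -d ${INPUT_PATH}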
@_slack_kubeflow_U03CN7QAHN3:matrix.orgzorba(손주형) What happens if I put more than one container in the Transformer YAML?
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: torch-transformer
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier
  transformer:
    containers:
      - image: kserve/image-transformer:latest
        name: kserve-container
        command:
          - "python"
          - "-m"
          - "model"
        args:
          - --model_name
          - mnist
08:40:29
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun You can horizontally scale the ingress gateway as much as you want; it is a reverse proxy, so it can handle ~1000 requests/s per replica 08:58:10
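For example, scaling the Istio ingress gateway Deployment directly (names assume a stock istio-system install):
kubectl -n istio-system scale deployment istio-ingressgateway --replicas=3
# or hand it to an HPA instead of a fixed replica count
kubectl -n istio-system autoscale deployment istio-ingressgateway --min=2 --max=10 --cpu-percent=80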
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal image.png
12:11:44
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal Facing an issue where the InferenceService is created but never gets to Running (the respective pod is not initializing) 12:11:44
@_slack_kubeflow_UAYJJUQJZ:matrix.orgtheofpa https://twitter.com/techatbloomberg/status/1526554734680850432?s=21&t=_gw_HEdPK0TMgWhEnvtYxg 14:23:10
@_slack_kubeflow_U02Q0DARM8B:matrix.org_slack_kubeflow_U02Q0DARM8B joined the room.15:46:18
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar I've seen the transformer (or even the predictor) components get translated into Kubernetes Deployments. So it could become a Deployment with two containers (unless there are checks in place to prevent it). Care to try it out and let us know? 🙂 16:01:07
@_slack_kubeflow_U02PHBULPDZ:matrix.orgDiego Kiner It's also not clear how to actually make a request to a graph - there doesn't appear to be any extra virtual service stood up to provide an entrypoint? 19:29:15
@_slack_kubeflow_U036ZCFAFLP:matrix.orgJonny Browning changed their display name from _slack_kubeflow_U036ZCFAFLP to Jonny Browning.22:20:28
@_slack_kubeflow_U036ZCFAFLP:matrix.orgJonny Browning set a profile picture.22:20:33
@_slack_kubeflow_U022U7KG24W:matrix.orgRachit Chauhan (commenting to follow the thread) 22:58:55
18 May 2022
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun Diego Kiner We create a graph orchestrator as a service to chain the requests, so the entry point is the graph orchestrator service 00:27:36
@_slack_kubeflow_U02PHBULPDZ:matrix.orgDiego Kiner Ok, we're trying to work off of the changes in the PR to get it working - how is the orchestrator created? 00:28:32
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun what’s the use case of having two containers on transformer? 00:29:57
@_slack_kubeflow_UFVUV2UFP:matrix.orgDan Sun There is a Dockerfile in the PR to build the graph orchestrator image 00:32:10


