16 May 2022
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou I have released several revisions over a period of time. 03:09:02
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou The old revision cannot terminate until it finishes pending. 03:09:47
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou And the new revision cannot allocate enough resources until the old revision releases its resources. 03:10:33
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou Could we improve this? 03:10:54
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou The current revision is 15 with 100% of the traffic, and this time I have rolled out revision 16 with 100%. 03:14:19
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou After a few minutes, revision 16 released its resources, but 13 and 17 are still holding theirs. 03:16:32
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou Is this a Knative problem? 03:23:19
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou Screenshot 2022-05-16 11.32.16 AM.png
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou So confused! 03:32:58
@_slack_kubeflow_U02S0UGJWCV:matrix.orgwenyang zhou Revision 13 can not terminate~ 03:33:31
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal Shri Javadekar host not found error 06:27:47
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal v0.7.0 this is the version 07:36:59
@_slack_kubeflow_U02R5JH4KNK:matrix.orgCesar Flores Hi everyone, is anyone using a way to define the output of their models for downstream applications? For example, an OpenAPI spec so that consumers of the models know what the response output will look like? 16:30:46
@_slack_kubeflow_U02R5JH4KNK:matrix.orgCesar Flores this is pretty annoying, but if it helps in any way, I solved my issue by setting the Host header to the URL that appears when you run kubectl get ksvc -n namespace 16:31:53
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown Saurabh Agarwal Make sure to add the host to the headers 16:32:43
@_slack_kubeflow_U02R5JH4KNK:matrix.orgCesar Flores thanks! yes, I meant that the host in the headers should be the URL that appears for the ksvc in the cluster 16:33:35
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown I think you can even omit the http part. This is the host in my working sample:
host = f"{model_server_name}.{stage}.svc.cluster.local"
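A minimal sketch of the idea above, assuming the in-cluster hostname follows the `<name>.<namespace>.svc.cluster.local` pattern from the sample. The function name `build_host_header` is hypothetical, and `namespace` plays the role of `stage` in the line above.

```python
def build_host_header(model_server_name: str, namespace: str) -> dict:
    """Return a headers dict whose Host is the in-cluster service name.

    Note there is no http:// prefix: the Host header carries only the hostname.
    """
    host = f"{model_server_name}.{namespace}.svc.cluster.local"
    return {"Host": host}


# Example values are illustrative only.
headers = build_host_header("my-model-a", "kserve-test")
print(headers)
```

The resulting dict can be passed as `headers=` to an HTTP client such as `requests.post`.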
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Also, are you calling this endpoint from inside the cluster? These are my python code snippets for calling from inside the cluster. Alexandre Brown and Cesar Flores are correct that the HOST header needs to be set.
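from http.server import BaseHTTPRequestHandler

import requests


def set_header(hostname):
    # Build the Host header used to route the request to the InferenceService.
    headers = {
        'Host': hostname
    }
    return headers


class ProxyHTTPRequestHandler(BaseHTTPRequestHandler):
    protocol_version = 'HTTP/1.1'

    def req_send(self, url, hostname):
        # Read the incoming request body so it can be forwarded unchanged.
        content_len = int(self.headers.get('content-length', 0))
        post_body = self.rfile.read(content_len)
        # parse_headers and merge_two_dicts are helpers not shown in this snippet.
        req_header = self.parse_headers()

        resp = requests.post(url, data=post_body, headers=merge_two_dicts(req_header, set_header(hostname)), verify=False)
        return resp

# Called from inside a handler method; hostname is set elsewhere.
url = "http://my-model-a.kserve-test.svc.cluster.local/v1/models/my-model-a:predict"
resp = self.req_send(url, hostname)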
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar Saurabh Agarwal, if this still doesn't work, feel free to DM me and we can set up a Zoom call to go over it. I'm no expert, but happy to look at it with a fresh perspective. If we do find the solution, we can update this thread so that others learn from it too. 16:41:38
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal Sure, thanks. Let me try these approaches. 16:42:06
@_slack_kubeflow_U02AYBVSLSK:matrix.orgAlexandre Brown Also keep in mind that depending on your model server runtime protocol you will have a different url. KServe V1 protocol uses
KServe V2 protocol uses
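For reference, a small sketch of how the two paths differ, based on the KServe REST conventions: V1 predictions go to `/v1/models/<name>:predict`, and V2 (Open Inference Protocol) inference goes to `/v2/models/<name>/infer`. The helper name `predict_url` and the example base URL are illustrative assumptions, not part of the messages above.

```python
def predict_url(base_url: str, model_name: str, protocol: str = "v1") -> str:
    """Build the inference URL for a model, depending on the runtime protocol."""
    base = base_url.rstrip("/")
    if protocol == "v1":
        # V1 protocol: the verb is appended after a colon.
        return f"{base}/v1/models/{model_name}:predict"
    if protocol == "v2":
        # V2 protocol: inference is a sub-path of the model.
        return f"{base}/v2/models/{model_name}/infer"
    raise ValueError(f"unknown protocol: {protocol!r}")


# Hypothetical in-cluster base URL, reusing the names from the earlier snippet.
base = "http://my-model-a.kserve-test.svc.cluster.local"
print(predict_url(base, "my-model-a", "v1"))
print(predict_url(base, "my-model-a", "v2"))
```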
@_slack_kubeflow_U032CM1LH3N:matrix.orgSaurabh Agarwal using v1 only 16:42:48
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar A KServe InferenceService provides a lot more capabilities out of the box:
• (option of) automatic logging of inputs and outputs
• scalable pre- and post-processing options
• ability to add explainers
It's still in the works, but there is also the ability to do A/B testing, champion-challenger, etc. (with InferenceGraph CRDs). Yes, one could set up all of these things oneself using some Kubernetes primitives or one's own code. But then you need to maintain that code forever, and sometimes you might lag behind the open-source features and have to figure out migration plans, etc. IMHO, there is nothing right or wrong about any of the methods; it is just each team's preference. The best option (IMO) is to use open source and contribute bug fixes and features back to it, which is a win-win for everyone. 17:16:37
@_slack_kubeflow_U0315UY2WRM:matrix.orgShri Javadekar I've been thinking about this problem too. I wanted to consume the output of one of my models in my own logger service and have the model's confidence scores exported as Prometheus metrics. But, not sure how I could do that across models which could have different types of outputs. If there was a way to declare the output format, I could parse that in the logger service and accordingly look for the confidence scores. 17:19:44
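One way to picture the idea, as a rough sketch only: each model declares where its confidence scores live in the response, and the logger service follows that declaration. The `OUTPUT_SPECS` mapping, the model names, and the response shapes are all hypothetical; nothing here is an existing KServe feature.

```python
from typing import Any

# Hypothetical declarations: model name -> path of keys/indices leading to the scores.
OUTPUT_SPECS = {
    "iris-classifier": ["predictions", 0, "scores"],
    "sentiment": ["outputs", "confidence"],
}


def extract_confidence(model_name: str, response: Any) -> Any:
    """Walk the declared path for this model and return whatever sits at the end."""
    node = response
    for key in OUTPUT_SPECS[model_name]:
        node = node[key]
    return node


# Made-up response for illustration.
resp = {"predictions": [{"label": "setosa", "scores": [0.9, 0.07, 0.03]}]}
print(extract_confidence("iris-classifier", resp))
```

A logger service could then export the extracted values as metrics without knowing each model's output shape in advance.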
@_slack_kubeflow_U035NJU7P9N:matrix.orgRoman Solovyev Thanks 17:22:36
@_slack_kubeflow_U03D067RTJN:matrix.orgRyan McCaffrey Hi everyone, I'm new to KServe and trying to start by running things locally. I am running minikube on my laptop and have done the full Kubeflow v1.4 install successfully, with the appropriate dependency versions shown here: https://www.kubeflow.org/docs/releases/kubeflow-1.4/
I am trying to serve my first sklearn example model locally and have followed these instructions:
Istio gateway/ingress setup: https://istio.io/latest/docs/tasks/traffic-management/ingress/ingress-control/
Running a prediction: https://github.com/kserve/kserve/tree/release-0.6/docs/samples/v1beta1/sklearn/v1
For some reason I keep getting 302 responses. Has anyone encountered this before?
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
* Connected to  ( ) port 31781 (#0)
> POST /v1/models/sklearn-iris:predict HTTP/1.1
> Host: sklearn-iris.default.example.com
> User-Agent: curl/7.78.0
> Accept: */*
> Content-Length: 76
> Content-Type: application/x-www-form-urlencoded
* Mark bundle as not supporting multiuse
< HTTP/1.1 302 Found
< location: /dex/auth?client_id=kubeflow-oidc-authservice&redirect_uri=%2Flogin%2Foidc&response_type=code&scope=profile+email+groups+openid&state=MTY1MjcyODM0NnxFd3dBRUZCWlN6ZDJiWFZpTWxnNGRWaEpkVFk9fFIBBaMr4YRh4f0g-Q7-ZeT_vBPw2OWyqQUXSI7asDpv
< date: Mon, 16 May 2022 19:12:26 GMT
< x-envoy-upstream-service-time: 1
< server: istio-envoy
< content-length: 0
* Connection #0 to host left intact
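The `location` header in the 302 above points at `/dex/auth`, which suggests the request was redirected to the Dex login flow rather than reaching the model server. A small sketch for detecting that case in a client; the function name `is_dex_login_redirect` is hypothetical, and it only inspects the status code and the `Location` header.

```python
from urllib.parse import urlparse

REDIRECT_CODES = (301, 302, 303, 307, 308)


def is_dex_login_redirect(status_code: int, location: str) -> bool:
    """Return True when a response looks like a redirect to the Dex auth endpoint."""
    if status_code not in REDIRECT_CODES:
        return False
    # Compare only the path so that the query string does not matter.
    return urlparse(location).path.startswith("/dex/auth")


# Values taken from the curl output above (query string shortened).
print(is_dex_login_redirect(302, "/dex/auth?client_id=kubeflow-oidc-authservice"))
```

If this returns True, the request likely needs to authenticate with the Kubeflow auth service before it can reach the InferenceService.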
