Message | Time |
---|---|
16 May 2022 | ||
I have released several revisions over a period of time. | 03:09:02 | |
The old revision cannot terminate before it finishes pending. | 03:09:47 | |
And the new revision cannot allocate enough resources before the old revision releases its resources. | 03:10:33 | |
Could this be improved? | 03:10:54 | |
The current revision is 15 with 100% of the traffic, and I have rolled out revision 16 with 100% this time. | 03:14:19 | |
After a few minutes, revision 16 has released its resources, but 13 and 17 keep theirs. | 03:16:32 | |
Is this a Knative problem? | 03:23:19 | |
[Screenshot attachment: 截屏2022-05-16 上午11.32.16.png] | 03:32:32 | |
So confused! | 03:32:58 | |
Revision 13 cannot terminate~ | 03:33:31 | |
Shri Javadekar: host not found error | 06:27:47 | |
v0.7.0, this is the version | 07:36:59 | |
Hi everyone, is anyone using a way to define the output of the models for downstream applications? Like, for example, having OpenAPI so the consumers of the models know what the output of the response will be? | 16:30:46 | |
This is pretty annoying, but if it helps in any way, I solved my issue by using the Host header set to the URL that appears when you run kubectl get ksvc -n namespace | 16:31:53 | |
Saurabh Agarwal Make sure to add the host to the headers | 16:32:43 | |
Thanks! Yes, I meant that the host in the headers should be the URL that appears in the ksvc type in the cluster | 16:33:35 | |
I think you can even omit the http part; I have the host like this in my working sample:
host = f"{model_server_name}.{stage}.svc.cluster.local" | 16:34:37 | |
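For anyone following along, here is a minimal sketch of the pattern being described, with made-up names (an InferenceService called my-model in a namespace called models) and the KServe V1 predict path that comes up later in the thread:

```python
import requests

# Hypothetical example names: service "my-model" in namespace "models".
# The host below is the cluster-local hostname shown by `kubectl get ksvc -n models`.
host = "my-model.models.svc.cluster.local"

resp = requests.post(
    f"http://{host}/v1/models/my-model:predict",
    json={"instances": [[6.8, 2.8, 4.8, 1.4]]},  # V1 protocol payload (illustrative values)
    headers={"Host": host},                      # the Host header discussed above
)
print(resp.status_code, resp.json())
```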
Also, are you calling this endpoint from inside the cluster?
These are my Python code snippets for calling from inside the cluster. Alexandre Brown and Cesar Flores are correct that the Host header needs to be set.

import requests
from http.server import BaseHTTPRequestHandler

def set_header(hostname):
    headers = {'Host': hostname}
    return headers

class ProxyHTTPRequestHandler(BaseHTTPRequestHandler):
    protocol_version = 'HTTP/1.1'

    def req_send(self, url, hostname):
        content_len = int(self.headers.get('content-length', 0))
        post_body = self.rfile.read(content_len)
        req_header = self.parse_headers()
        # merge_two_dicts combines the incoming request headers with the Host header
        resp = requests.post(url,
                             data=post_body,
                             headers=merge_two_dicts(req_header, set_header(hostname)),
                             verify=False)
        return resp

# called from inside the handler:
url = "http://my-model-a.kserve-test.svc.cluster.local/v1/models/my-model-a:predict"
resp = self.req_send(url, hostname)
print(resp.content) | 16:40:12 | |
Saurabh Agarwal, if this still doesn't work, feel free to DM me and we can set up a Zoom call to go over it. I'm no expert, but happy to see it with a fresh perspective. If we do get the solution, we can update this thread so that others learn from it too. | 16:41:38 | |
Sure, thanks. Let me try a few approaches. | 16:42:06 | |
Also keep in mind that, depending on your model server runtime protocol, you will have a different URL.
The KServe V1 protocol uses /v1/models/model_name:predict
The KServe V2 protocol uses /v2/models/model_name/infer | 16:42:16 | |
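To make the difference concrete, a small illustrative sketch of the two paths and request bodies, using a made-up model name; the V2 payload shape below is an assumption for a simple tabular model, so check your runtime's docs:

```python
# Illustrative only: "my-model", the namespace, and the feature values are made up.
model_name = "my-model"
base = f"http://{model_name}.models.svc.cluster.local"

# KServe V1 protocol: ":predict" path with an "instances" payload
v1_url = f"{base}/v1/models/{model_name}:predict"
v1_body = {"instances": [[6.8, 2.8, 4.8, 1.4]]}

# KServe V2 (open inference) protocol: "/infer" path with named tensor inputs
v2_url = f"{base}/v2/models/{model_name}/infer"
v2_body = {
    "inputs": [
        {"name": "input-0", "shape": [1, 4], "datatype": "FP32",
         "data": [6.8, 2.8, 4.8, 1.4]}
    ]
}
```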
using v1 only | 16:42:48 | |
A KServe InferenceService provides a lot more capabilities out of the box:
• (optional) automatic logging of inputs and outputs
• scalable pre- and post-processing options
• the ability to add explainers
It's still in the works, but also the ability to do A/B testing, champion-challenger, etc. (with InferenceGraph CRDs). Yes, one could set all of these things up themselves using Kubernetes primitives or one's own code. But then you need to maintain that code forever, and sometimes you might lag behind the open-source features and have to figure out migration plans, etc. IMHO, nothing is right or wrong about any of the methods; it's just each team's preference. The best option (IMO) is to use open source and contribute bug fixes/features back, which is a win-win for everyone. | 17:16:37 | |
I've been thinking about this problem too. I wanted to consume the output of one of my models in my own logger service and have the model's confidence scores exported as Prometheus metrics. But I'm not sure how I could do that across models that could have different types of outputs.
If there were a way to declare the output format, I could parse that in the logger service and accordingly look for the confidence scores. | 17:19:44 | |
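A very rough sketch of the kind of logger service being described, under heavy assumptions: the logged body is taken to follow the KServe V1 "predictions" shape, and the model name is read from a CloudEvents-style header whose exact name is a guess, which is exactly the per-model variability being asked about:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

from prometheus_client import Gauge, start_http_server

# Assumption: the logged response body is KServe V1-style JSON with a top-level
# "predictions" list of numeric confidence scores.
CONFIDENCE = Gauge("model_confidence", "Last confidence score per model", ["model"])

class LoggerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("content-length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        # Assumption: the KServe logger delivers CloudEvents and the service name
        # arrives in a Ce-* header; the exact header name may differ.
        model = self.headers.get("Ce-Inferenceservicename", "unknown")
        for score in body.get("predictions", []):
            if isinstance(score, (int, float)):
                CONFIDENCE.labels(model=model).set(score)
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    start_http_server(9100)                               # Prometheus scrape endpoint
    HTTPServer(("", 8080), LoggerHandler).serve_forever() # receives logged events
```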
Thanks | 17:22:36 | |
Hi everyone, I'm new to KServe and trying to start by running things locally. I am running minikube on my laptop and have done the full Kubeflow v1.4 install successfully with the appropriate dependency versions shown here: https://www.kubeflow.org/docs/releases/kubeflow-1.4/
I am trying to serve my first sklearn example model locally and have followed these instructions:
Istio gateway/ingress setup:
https://istio.io/latest/docs/tasks/traffic-management/ingress/ingress-control/
Running a prediction:
https://github.com/kserve/kserve/tree/release-0.6/docs/samples/v1beta1/sklearn/v1
And for some reason I keep getting 302 errors; has anyone encountered this before?
MODEL_NAME=sklearn-iris
INPUT_PATH=@./iris-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH

*   Trying 192.168.64.16:31781...
* Connected to 192.168.64.16 (192.168.64.16) port 31781 (#0)
> POST /v1/models/sklearn-iris:predict HTTP/1.1
> Host: sklearn-iris.default.example.com
> User-Agent: curl/7.78.0
> Accept: */*
> Content-Length: 76
> Content-Type: application/x-www-form-urlencoded
* Mark bundle as not supporting multiuse
< HTTP/1.1 302 Found
< location: /dex/auth?client_id=kubeflow-oidc-authservice&redirect_uri=%2Flogin%2Foidc&response_type=code&scope=profile+email+groups+openid&state=MTY1MjcyODM0NnxFd3dBRUZCWlN6ZDJiWFZpTWxnNGRWaEpkVFk9fFIBBaMr4YRh4f0g-Q7-ZeT_vBPw2OWyqQUXSI7asDpv
< date: Mon, 16 May 2022 19:12:26 GMT
< x-envoy-upstream-service-time: 1
< server: istio-envoy
< content-length: 0
* Connection #0 to host 192.168.64.16 left intact | 19:14:21 | |
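For what it's worth, the 302 above redirects to /dex/auth with client_id=kubeflow-oidc-authservice, which suggests the request is being intercepted by Kubeflow's authentication layer before it reaches the InferenceService. A rough Python equivalent of the same call, with the ingress address copied from the session above and a payload assumed to mirror iris-input.json, makes the redirect easy to inspect:

```python
import requests

# Values copied from the curl session above; the payload is an assumption
# meant to mirror iris-input.json from the KServe sklearn sample.
ingress = "http://192.168.64.16:31781"
service_hostname = "sklearn-iris.default.example.com"

resp = requests.post(
    f"{ingress}/v1/models/sklearn-iris:predict",
    json={"instances": [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]]},
    headers={"Host": service_hostname},
    allow_redirects=False,  # surface the 302 instead of following it
)
print(resp.status_code, resp.headers.get("location"))
```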