4 Dec 2021 |
Dan Sun | (edited) ... gRPC with more ... => ... gRPC for more ... | 00:04:10 |
Alexandre Brown | In reply to @_slack_kubeflow_U02NN0J9K5G:matrix.org Thanks for the info Alexandre Brown. My question is when we use the GPU on a Kubernetes cluster, it won't be serverless (always running), right? I want to turn the GPU on only when its event handler or API endpoint is triggered.
Yes that is totally possible and what KServe is for.
By setting the min replicas to 0, KServe will automatically scale the pods of your model server down to 0. So there's really not much configuration, it's a one-liner in the inference service definition.
See https://kserve.github.io/website/modelserving/autoscaling/autoscaling/#enable-scale-down-to-zero
Now, this scales the pods to and from 0, but if you want your GPU node to scale to and from 0, that's outside the scope of KServe. You handle that at the cluster level, so you must use the Kubernetes Cluster Autoscaler. Here is the setup for AWS: https://docs.aws.amazon.com/eks/latest/userguide/autoscaling.html#cluster-autoscaler
In your cluster configuration you set the min to 0 for your GPU node group and then set up the autoscaler.
Check your cloud provider's docs on autoscaling for more details, but for AWS it's not too complicated if you follow the doc. | 00:38:30 |
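A minimal sketch of the "one-liner" described above, written in Python so it can be printed as JSON and inspected. The `apiVersion`, `kind`, and `spec.predictor.minReplicas` fields follow the KServe v1beta1 InferenceService API as I understand it; the service name, the storage URI, the choice of a TensorFlow predictor, and the GPU resource limit are all hypothetical placeholders, not something taken from this thread.

```python
import json


def gpu_inference_service(name: str, storage_uri: str) -> dict:
    """Build an InferenceService manifest whose predictor may scale to zero."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                # The one-liner: allow the model server to scale down to 0 pods.
                "minReplicas": 0,
                # Assumption: a TensorFlow model; other predictors work similarly.
                "tensorflow": {
                    "storageUri": storage_uri,
                    # Assumption: request one GPU so the pod lands on a GPU node.
                    "resources": {"limits": {"nvidia.com/gpu": "1"}},
                },
            }
        },
    }


# Hypothetical name and bucket, for illustration only.
manifest = gpu_inference_service("gpu-model", "gs://example-bucket/models/gpu-model")
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON as well as YAML.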
Alexandre Brown | (edited) ... the doc. => ... the doc.
With both of these points covered, you'll have a GPU node group with a min of 0 and no node running initially. Then when a request comes in, KServe will try to schedule a pod for the model server. The pod will stay in Pending because the number of running GPU nodes is 0. The autoscaler will react and start a GPU node, so the number of running GPU nodes is now 1. Kubernetes can now schedule the pending pod to the GPU node.
Once the request is over, since we set minReplicas to 0 in the inference service definition, KServe will automatically scale the model server pod down from 1 to 0.
After X minutes (configurable), the autoscaler will see that the GPU node has no running pods, meaning it can scale the node down from 1 to 0.
And voilà, that's the gist of it. | 00:43:42 |
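The sequence described above can be traced with a toy simulation. This is only an illustration of the ordering of events, not how KServe, the Kubernetes scheduler, or the Cluster Autoscaler are implemented; the `Cluster` class and all of its method names are made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model: one GPU node group (min 0) and one model server (minReplicas 0)."""
    gpu_nodes: int = 0
    pending_pods: int = 0
    running_pods: int = 0
    events: list = field(default_factory=list)

    def request_arrives(self) -> None:
        # KServe wants one model-server pod to handle the request.
        self.pending_pods += 1
        self.events.append("pod pending")
        self._reconcile()

    def _reconcile(self) -> None:
        # Autoscaler: a pending pod and no GPU node means a node is started.
        if self.pending_pods and self.gpu_nodes == 0:
            self.gpu_nodes = 1
            self.events.append("node scaled up")
        # Scheduler: the pending pod is placed on the available node.
        if self.pending_pods and self.gpu_nodes:
            self.pending_pods -= 1
            self.running_pods += 1
            self.events.append("pod running")

    def request_finishes(self) -> None:
        # minReplicas 0: the idle model server is scaled from 1 to 0.
        self.running_pods = 0
        self.events.append("pod scaled to zero")

    def idle_timeout_elapses(self) -> None:
        # Autoscaler: an empty GPU node is removed after the configured delay.
        if self.gpu_nodes and not (self.running_pods or self.pending_pods):
            self.gpu_nodes = 0
            self.events.append("node scaled down")


cluster = Cluster()
cluster.request_arrives()
cluster.request_finishes()
cluster.idle_timeout_elapses()
for event in cluster.events:
    print(event)
```

Both scale-downs are separate steps: the pod disappears first, and the node only goes away once the autoscaler's idle delay has passed.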
Alexandre Brown | (edited) Yes that ... => Amit Singh Yes that ... | 00:44:07 |
Alexandre Brown | (edited) ... but fir aws ... => ... but for aws ... | 00:44:53 |
Alexandre Brown | (edited) ... pod to ... => ... pod (model server) to ... | 00:45:44 |
Alexandre Brown | (edited) ... is for.
By ... => ... is for (being serverless).
By ... | 00:48:56 |
Alexandre Brown | (edited) ... Yes that is totally possible and what KServe is for (being serverless).
By ... => ... Yes Serverless that is totally possible and what KServe is for.
By ... | 00:49:11 |
Alexandre Brown | (edited) ... And voilà, that's the jist of it => ... And voilà | 15:00:36 |
5 Dec 2021 |
Alexandre Brown | (edited) ... to 0.
And voilà => ... to 0. | 21:13:43 |
7 Dec 2021 |
| _slack_kubeflow_U02NWEG4PD1 joined the room. | 07:44:58 |
| Dimitris Poulopoulos joined the room. | 10:07:09 |
| Dimitris Poulopoulos changed their display name from _slack_kubeflow_U019640DQ06 to Dimitris Poulopoulos. | 10:24:41 |
| Dimitris Poulopoulos set a profile picture. | 10:24:43 |
Dimitris Poulopoulos | Hello to the community. We want to serve a TensorFlow Recommenders model, which contains a ScaNN layer (https://www.tensorflow.org/recommenders/api_docs/python/tfrs/layers/factorized_top_k/ScaNN). Is there out-of-the-box support on TFServing, Triton, or Seldon MLServer backend for this? | 10:24:43 |
Benjamin Tan | In reply to @_slack_kubeflow_U019640DQ06:matrix.org Hello to the community. We want to serve a TensorFlow Recommenders model, which contains a ScaNN layer (https://www.tensorflow.org/recommenders/api_docs/python/tfrs/layers/factorized_top_k/ScaNN). Is there out-of-the-box support on TFServing, Triton, or Seldon MLServer backend for this?
If you can convert it to a SavedModel, then TFServing can serve it. | 10:33:02 |
8 Dec 2021 |
Sidhartha Panigrahi | (edited) ... on Kubeflow,
Su => ... on Kubeflow, | 05:02:58 |
| Yoshihiro NISHIWAKI joined the room. | 07:25:13 |
| Yoshihiro NISHIWAKI changed their display name from _slack_kubeflow_U02B94TFPCJ to Yoshihiro NISHIWAKI. | 07:36:03 |
| Yoshihiro NISHIWAKI set a profile picture. | 07:36:05 |
Yoshihiro NISHIWAKI | Hi.
I’m trying to build a custom model image with docker build and I’m getting the following error.
docker build -t username/custom -f python/custom_model.Dockerfile python
[+] Building 7.4s (8/10)
=> [internal] load build definition from custom_model.Dockerfile 0.0s
=> => transferring dockerfile: 50B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:3.7-slim 0.8s
=> [internal] load build context 0.0s
=> => transferring context: 12.03kB 0.0s
=> [1/6] FROM docker.io/library/python:3.7-slim@sha256:9e51c1a3fea7e0a2b93df2538c02f1afe31d2c69b10d6dcbd372c10c72b325aa 0.0s
=> CACHED [2/6] COPY custom_model custom_model 0.0s
=> CACHED [3/6] COPY kserve kserve 0.0s
=> ERROR [4/6] RUN pip install --upgrade pip && pip install -e ./kserve 6.5s
------
> [4/6] RUN pip install --upgrade pip && pip install -e ./kserve:
#8 0.864 Requirement already satisfied: pip in /usr/local/lib/python3.7/site-packages (21.2.4)
#8 0.958 Collecting pip
#8 1.012 Downloading pip-21.3.1-py3-none-any.whl (1.7 MB)
#8 1.104 Installing collected packages: pip
#8 1.104 Attempting uninstall: pip
#8 1.104 Found existing installation: pip 21.2.4
#8 1.170 Uninstalling pip-21.2.4:
#8 1.269 Successfully uninstalled pip-21.2.4
#8 1.727 Successfully installed pip-21.3.1
#8 1.727 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
#8 1.962 Obtaining file:///kserve
#8 1.962 Preparing metadata (setup.py): started
#8 2.146 Preparing metadata (setup.py): finished with status 'done'
#8 2.230 Collecting certifi>=14.05.14
#8 2.270 Downloading certifi-2021.10.8-py2.py3-none-any.whl (149 kB)
#8 2.310 Collecting six>=1.15
#8 2.318 Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
#8 2.341 Collecting python_dateutil>=2.5.3
#8 2.353 Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
#8 2.364 Requirement already satisfied: setuptools>=21.0.0 in /usr/local/lib/python3.7/site-packages (from kserve==0.7.0) (57.5.0)
#8 2.396 Collecting urllib3>=1.15.1
#8 2.405 Downloading urllib3-1.26.7-py2.py3-none-any.whl (138 kB)
#8 2.439 Collecting kubernetes>=12.0.0
#8 2.450 Downloading kubernetes-20.13.0-py2.py3-none-any.whl (1.8 MB)
#8 2.567 Collecting tornado>=6.0.0
#8 2.578 Downloading tornado-6.1-cp37-cp37m-manylinux2014_aarch64.whl (428 kB)
#8 2.607 Collecting argparse>=1.4.0
#8 2.615 Downloading argparse-1.4.0-py2.py3-none-any.whl (23 kB)
#8 2.653 Collecting minio<7.0.0,>=4.0.9
#8 2.663 Downloading minio-6.0.2-py2.py3-none-any.whl (73 kB)
#8 2.706 Collecting google-cloud-storage==1.41.1
#8 2.719 Downloading google_cloud_storage-1.41.1-py2.py3-none-any.whl (105 kB)
#8 2.738 Collecting adal>=1.2.2
#8 2.747 Downloading adal-1.2.7-py2.py3-none-any.whl (55 kB)
#8 2.763 Collecting table_logger>=0.3.5
#8 2.771 Downloading table_logger-0.3.6-py3-none-any.whl (14 kB)
#8 3.000 Collecting numpy>=1.17.3
#8 3.010 Downloading numpy-1.21.4-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (13.0 MB)
#8 3.475 Collecting azure-storage-blob==12.8.1
#8 3.485 Downloading azure_storage_blob-12.8.1-py2.py3-none-any.whl (345 kB)
#8 3.515 Collecting azure-identity>=1.6.0
#8 3.527 Downloading azure_identity-1.7.1-py2.py3-none-any.whl (129 kB)
#8 3.545 Collecting cloudevents>=1.2.0
#8 3.553 Downloading cloudevents-1.2.0-py3-none-any.whl (26 kB)
#8 3.571 Collecting avro>=1.10.1
#8 3.603 Downloading avro-1.11.0.tar.gz (83 kB)
#8 3.676 Installing build dependencies: started
#8 4.841 Installing build dependencies: finished with status 'done'
#8 4.846 Getting requirements to build wheel: started
#8 4.948 Getting requirements to build wheel: finished with status 'done'
#8 4.950 Preparing metadata (pyproject.toml): started
#8 5.053 Preparing metadata (pyproject.toml): finished with status 'done'
#8 5.303 Collecting boto3==1.18.18
#8 5.315 Downloading boto3-1.18.18-py3-none-any.whl (131 kB)
#8 5.650 Collecting botocore==1.21.18
#8 5.665 Downloading botocore-1.21.18-py3-none-any.whl (7.8 MB)
#8 6.061 Collecting psutil>=5.0
#8 6.071 Downloading psutil-5.8.0.tar.gz (470 kB)
#8 6.121 Preparing metadata (setup.py): started
#8 6.255 Preparing metadata (setup.py): finished with status 'done'
#8 6.317 ERROR: Could not find a version that satisfies the requirement ray[serve]==1.5.0 (from kserve) (from versions: none)
#8 6.317 ERROR: No matching distribution found for ray[serve]==1.5.0
------
executor failed running [/bin/sh -c pip install --upgrade pip && pip install -e ./kserve]: exit code: 1
| 07:36:05 |
Yoshihiro NISHIWAKI | In reply to @_slack_kubeflow_U02B94TFPCJ:matrix.org (the docker build log above)
Buildpacks also don't work.
pack build --builder=heroku/buildpacks:20 username/custom-model:v1
20: Pulling from heroku/buildpacks
Digest: sha256:09935c3a5011d5c5720b7c6cb56ca2cd8d4a042808edfc15f4c3adc91469894b
Status: Image is up to date for heroku/buildpacks:20
20: Pulling from heroku/pack
Digest: sha256:b5a4da988ac2918ba50d9ab8e6ab685acf18a34bfef63714d52c6e6266237e66
Status: Image is up to date for heroku/pack:20
===> DETECTING
heroku/go 0.3.1
heroku/procfile 0.6.2
===> ANALYZING
Previous image with name "nishiwakidf/custom-model:v1" not found
===> RESTORING
===> BUILDING
-----> Fetching jq... done
-----> Fetching stdlib.sh.v8... done
----->
Detected go modules via go.mod
----->
Detected Module Name: github.com/kserve/kserve
----->
!! The go.mod file for this project does not specify a Go version
!!
!! Defaulting to go1.12.17
!!
!! For more details see: https://devcenter.heroku.com/articles/go-apps-with-modules#build-configuration
!!
-----> New Go Version, clearing old cache
-----> Installing go1.12.17
-----> Fetching go1.12.17.linux-amd64.tar.gz... done
-----> Determining packages to install
| 07:40:09 |
Yoshihiro NISHIWAKI | (edited) ... -----> Determining packages to install => ... -----> Determining packages to install
ERROR: failed to build: exit status 1
ERROR: failed to build: executing lifecycle: failed with status code: 51 | 07:55:12 |
Bhagat Khemchandani | In reply to @_slack_kubeflow_U02B94TFPCJ:matrix.org Buildpacks also don't work. (the pack build log above, ending in "ERROR: failed to build: executing lifecycle: failed with status code: 51")
Mind sharing the Dockerfile here, or in a personal chat? | 08:35:03 |
Bhagat Khemchandani | In reply to @_slack_kubeflow_U026RKS3A87:matrix.org Mind sharing the Dockerfile here, or in a personal chat?
To understand this better, please send the Git repo URL of the sample you are referring to. | 09:01:27 |
| _slack_kubeflow_U02PXTTNGKC joined the room. | 09:01:39 |
Yoshihiro NISHIWAKI | In reply to @_slack_kubeflow_U026RKS3A87:matrix.org To understand this better, please send the Git repo URL of the sample you are referring to.
Here is the URL:
https://github.com/kserve/kserve | 09:03:49 |
Mark Winter | In reply to @_slack_kubeflow_U02B94TFPCJ:matrix.org Here is the URL: https://github.com/kserve/kserve
ray==1.5.0 doesn't have a release for M1 (arm64), unfortunately. | 12:24:47 |
Mark Winter | In reply to @_slack_kubeflow_U01T25HRREK:matrix.org ray==1.5.0 doesn't have a release for M1 (arm64), unfortunately.
They started doing arm64 builds from ray 1.8 or something like that. | 12:25:01 |
Mark Winter | In reply to @_slack_kubeflow_U01T25HRREK:matrix.org They started doing arm64 builds from ray 1.8 or something like that.
Maybe we can get ray updated in kserve. | 12:25:27 |
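The docker build log earlier in the thread is consistent with this diagnosis: the wheels pip downloaded carry `aarch64` platform tags, so the build ran on an ARM image, and `(from versions: none)` suggests no compatible `ray[serve]==1.5.0` distribution was found for that platform. A small sketch of how one might read the platform tag out of wheel filenames copied from such a log; the helper name `wheel_platform_tag` is made up here, and it relies only on the standard wheel naming convention, where the platform tag is the last dash-separated field.

```python
def wheel_platform_tag(filename: str) -> str:
    """Return the platform tag: the last dash-separated field of a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    return filename[: -len(".whl")].split("-")[-1]


# Filenames copied from the docker build log in this thread.
log_wheels = [
    "tornado-6.1-cp37-cp37m-manylinux2014_aarch64.whl",
    "numpy-1.21.4-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
    "six-1.16.0-py2.py3-none-any.whl",
]

for name in log_wheels:
    tag = wheel_platform_tag(name)
    # "any" means pure Python; an aarch64 tag means an ARM-specific binary wheel.
    print(f"{name}: platform={tag} arm64={'aarch64' in tag}")
```

Besides updating ray in kserve as suggested above, a workaround sometimes used on Apple Silicon (not mentioned in this thread) is to build for amd64 with `docker build --platform linux/amd64 ...`, which runs under emulation and is therefore slower.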