The KFServing Python SDK can be installed with `pip` or `Setuptools`.
### pip install
```sh
pip install kfserving
```
### Setuptools
Install via [Setuptools](http://pypi.python.org/pypi/setuptools).
```sh
python setup.py install --user
```
(or `sudo python setup.py install` to install the package for all users)
## KFServing Server
KFServing's Python server libraries implement a standardized KFServing library that model serving frameworks such as Scikit Learn, XGBoost and PyTorch extend. The library encapsulates data plane API definitions and storage retrieval for models.
KFServing provides, among other things:
* Registering a model and starting the server
* Prediction Handler
* Liveness Handler
* Readiness Handler
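
The responsibilities listed above can be illustrated without the SDK itself. The sketch below is a plain-Python stand-in, not the KFServing implementation: `EchoModel`, `ModelRegistry`, and all of their methods are hypothetical names, and the request/response shape (`instances` in, `predictions` out) is an assumption about the data plane payload.

```python
from typing import Dict


class EchoModel:
    """Hypothetical model that doubles every input number."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ready = False

    def load(self) -> None:
        # A real model would fetch and deserialize its artifacts here.
        self.ready = True

    def predict(self, request: Dict) -> Dict:
        return {"predictions": [2 * x for x in request["instances"]]}


class ModelRegistry:
    """Hypothetical stand-in for model registration and the three handlers."""

    def __init__(self) -> None:
        self._models: Dict[str, EchoModel] = {}

    def register(self, model: EchoModel) -> None:
        self._models[model.name] = model

    def liveness(self) -> Dict:
        # Liveness only reports that the process is up.
        return {"status": "alive"}

    def readiness(self, name: str) -> bool:
        # Readiness is per model: it must be registered and loaded.
        model = self._models.get(name)
        return model is not None and model.ready

    def predict(self, name: str, request: Dict) -> Dict:
        if not self.readiness(name):
            raise RuntimeError(f"model {name!r} is not ready")
        return self._models[name].predict(request)


registry = ModelRegistry()
model = EchoModel("echo")
model.load()
registry.register(model)
print(registry.predict("echo", {"instances": [1, 2, 3]}))
```

Keeping liveness independent of any model while readiness depends on a loaded model mirrors the usual split between "the process is running" and "the process can serve traffic".
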
KFServing supports the following storage providers:
* Google Cloud Storage with the prefix "gs://"
    * By default, it uses the `GOOGLE_APPLICATION_CREDENTIALS` environment variable for user authentication.
    * If `GOOGLE_APPLICATION_CREDENTIALS` is not provided, an anonymous client is used to download the artifacts.
* S3 Compatible Object Storage with the prefix "s3://"
    * By default, it uses the `S3_ENDPOINT`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY` environment variables for user authentication.
* Azure Blob Storage with the format "https://{$STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{$CONTAINER}/{$PATH}"
    * By default, it uses an anonymous client to download the artifacts.
    * For example: https://kfserving.blob.core.windows.net/triton/simple_string/
* Local filesystem, either without any prefix or with the prefix "file://". For example:
    * Absolute path: `/absolute/path` or `file:///absolute/path`
    * Relative path: `relative/path` or `file://relative/path`
    * For the local filesystem, we recommend using a relative path without any prefix.
* Persistent Volume Claim (PVC) with the format "pvc://{$pvcname}/[path]"
    * The `pvcname` is the name of the PVC that contains the model.
    * The `[path]` is the relative path to the model on the PVC.
    * For example: `pvc://mypvcname/model/path/on/pvc`
* Generic URI over either `HTTP`, prefixed with `http://`, or `HTTPS`, prefixed with `https://`. For example:
    * `https://<some_url>.com/model.joblib`
    * `http://<some_url>.com/model.joblib`
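
The prefix rules in the list above amount to a dispatch on the shape of the URI. The following is a minimal sketch of such a dispatch in plain Python, not the SDK's own storage code: `classify_storage_uri`, `split_pvc_uri`, and the returned labels are hypothetical names chosen for illustration.

```python
import re

# Azure Blob Storage URLs, following the format given in the list above.
_AZURE_BLOB_RE = re.compile(r"^https://(.+?)\.blob\.core\.windows\.net/(.+)$")


def classify_storage_uri(uri: str) -> str:
    """Return a label for the storage provider implied by the URI."""
    if uri.startswith("gs://"):
        return "gcs"
    if uri.startswith("s3://"):
        return "s3"
    if uri.startswith("pvc://"):
        return "pvc"
    # Check Azure before generic HTTPS, since both start with https://.
    if _AZURE_BLOB_RE.match(uri):
        return "azure-blob"
    if uri.startswith(("http://", "https://")):
        return "http"
    # Anything else, with or without file://, is treated as a local path.
    return "local"


def split_pvc_uri(uri: str):
    """Split pvc://{pvcname}/[path] into (pvcname, path)."""
    name, _, path = uri[len("pvc://"):].partition("/")
    return name, path


for example in (
    "gs://bucket/model",
    "https://kfserving.blob.core.windows.net/triton/simple_string/",
    "https://example.com/model.joblib",
    "relative/path",
):
    print(example, "->", classify_storage_uri(example))
```

The order of the checks matters: an Azure Blob URL is also a valid `https://` URL, so the more specific pattern has to be tested first.
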
## KFServing Client
### Getting Started
KFServing's Python client interacts with KFServing APIs to execute operations on a remote KFServing cluster, such as creating, patching and deleting an InferenceService instance. See the [Sample for KFServing Python SDK Client](../../docs/samples/client/kfserving_sdk_sample.ipynb) to get started.
### Documentation for Client API
Class | Method | Description
------------ | ------------- | -------------
[KFServingClient](docs/KFServingClient.md) | [set_credentials](docs/KFServingClient.md#set_credentials) | Set credentials
[KFServingClient](docs/KFServingClient.md) | [get](docs/KFServingClient.md#get) | Get or watch the specified InferenceService or all InferenceServices in the namespace
[KFServingClient](docs/KFServingClient.md) | [patch](docs/KFServingClient.md#patch) | Patch the specified InferenceService
[KFServingClient](docs/KFServingClient.md) | [replace](docs/KFServingClient.md#replace) | Replace the specified InferenceService
[KFServingClient](docs/KFServingClient.md) | [rollout_canary](docs/KFServingClient.md#rollout_canary) | Roll out traffic to the `canary` version of the specified InferenceService
[KFServingClient](docs/KFServingClient.md) | [promote](docs/KFServingClient.md#promote) | Promote the `canary` version of the InferenceService to `default`
[KFServingClient](docs/KFServingClient.md) | [delete](docs/KFServingClient.md#delete) | Delete the specified InferenceService
[KFServingClient](docs/KFServingClient.md) | [wait_isvc_ready](docs/KFServingClient.md#wait_isvc_ready) | Wait for the InferenceService to be ready
[KFServingClient](docs/KFServingClient.md) | [is_isvc_ready](docs/KFServingClient.md#is_isvc_ready) | Check if the InferenceService is ready
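
The relationship between a one-shot readiness check and a blocking wait, as in the last two rows of the table, can be sketched as a polling loop. This is an illustration of the pattern only, not the client's implementation: `wait_until_ready`, its `timeout_seconds` and `polling_interval` parameters, and `fake_is_isvc_ready` are hypothetical names, and the fake check stands in for a real call against a cluster.

```python
import time
from typing import Callable


def wait_until_ready(is_ready: Callable[[], bool],
                     timeout_seconds: float = 600,
                     polling_interval: float = 10) -> bool:
    """Poll is_ready() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(polling_interval)


# Stand-in for a readiness check: reports ready on the third call.
calls = {"n": 0}


def fake_is_isvc_ready() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until_ready(fake_is_isvc_ready, timeout_seconds=5, polling_interval=0.01))
```

Returning a boolean instead of raising on timeout is a design choice made for this sketch; a caller that prefers exceptions can wrap the result.
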
[[Back to Model list]](../README.md#documentation-for-models)[[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md)

Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**last_transition_time** | [**KnativeVolatileTime**](KnativeVolatileTime.md) | LastTransitionTime is the last time the condition transitioned from one status to another. We use VolatileTime in place of metav1.Time to exclude this from creating equality.Semantic differences (all other things held constant). | [optional]
**message** | **str** | A human readable message indicating details about the transition. | [optional]
**reason** | **str** | The reason for the condition's last transition. | [optional]
**severity** | **str** | Severity with which to treat failures of this type of condition. When this is not specified, it defaults to Error. | [optional]
**status** | **str** | Status of the condition, one of True, False, Unknown. |
**type** | **str** | Type of condition. |
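
To show how these condition fields are typically consumed, here is a small sketch that works on plain dictionaries rather than SDK objects. The function names are hypothetical, and the condition type `"Ready"` is an assumption borrowed from common Knative usage; the string statuses and the `Error` default for `severity` come from the field descriptions above.

```python
from typing import Dict, List


def is_condition_true(conditions: List[Dict[str, str]],
                      condition_type: str = "Ready") -> bool:
    """Return True only if the named condition exists with status "True"."""
    for condition in conditions:
        if condition.get("type") == condition_type:
            # Status is a string: "True", "False", or "Unknown".
            return condition.get("status") == "True"
    return False


def effective_severity(condition: Dict[str, str]) -> str:
    """Severity defaults to Error when it is not specified."""
    return condition.get("severity") or "Error"


conditions = [
    {"type": "Ready", "status": "False", "reason": "RevisionFailed"},
    {"type": "RoutesReady", "status": "True", "severity": "Info"},
]
print(is_condition_true(conditions))
print(effective_severity(conditions[0]))
```

Treating `"Unknown"` the same as `"False"` is deliberate here: a caller waiting for readiness should only proceed on an explicit `"True"`.
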

Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**force_query** | **bool** | append a query ('?') even if RawQuery is empty |
**fragment** | **str** | fragment for references, without '#' |
**host** | **str** | host or host:port |
**opaque** | **str** | encoded opaque data |
**path** | **str** | path (relative paths may omit leading slash) |
**raw_path** | **str** | encoded path hint (see EscapedPath method) |
**raw_query** | **str** | encoded query values, without '?' |
**scheme** | **str** | |
**user** | [**NetUrlUserinfo**](NetUrlUserinfo.md) | username and password information |

Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**resources** | [**V1ResourceRequirements**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ResourceRequirements.md) | Defaults to requests and limits of 1CPU, 2Gb MEM. | [optional]
**runtime_version** | **str** | Alibi docker image version which defaults to latest release | [optional]
**storage_uri** | **str** | The location of a trained explanation model | [optional]
**type** | **str** | The type of Alibi explainer |

Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**max_batch_size** | **int** | MaxBatchSize of batcher service | [optional]
**max_latency** | **int** | MaxLatency of batcher service | [optional]
**timeout** | **int** | Timeout of batcher service | [optional]

Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**max_replicas** | **int** | Upper bound for the autoscaler to scale to | [optional]
**min_replicas** | **int** | Minimum number of replicas, which defaults to 1. When minReplicas = 0, pods scale down to 0 if there is no traffic | [optional]
**parallelism** | **int** | Parallelism specifies how many requests can be processed concurrently. This sets the hard limit of the container concurrency (https://knative.dev/docs/serving/autoscaling/concurrency). | [optional]
**service_account_name** | **str** | ServiceAccountName is the name of the ServiceAccount to use to run the service | [optional]

Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**explainer** | [**V1alpha2ExplainerSpec**](V1alpha2ExplainerSpec.md) | Explainer defines the model explanation service spec. The explainer service calls the predictor, or the transformer if one is specified. | [optional]
**predictor** | [**V1alpha2PredictorSpec**](V1alpha2PredictorSpec.md) | Predictor defines the model serving spec |
**transformer** | [**V1alpha2TransformerSpec**](V1alpha2TransformerSpec.md) | Transformer defines the pre- and post-processing applied before and after the predictor call. The transformer service calls the predictor service. | [optional]

Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**max_replicas** | **int** | Upper bound for the autoscaler to scale to | [optional]
**min_replicas** | **int** | Minimum number of replicas, which defaults to 1. When minReplicas = 0, pods scale down to 0 if there is no traffic | [optional]
**parallelism** | **int** | Parallelism specifies how many requests can be processed concurrently. This sets the hard limit of the container concurrency (https://knative.dev/docs/serving/autoscaling/concurrency). | [optional]
**service_account_name** | **str** | ServiceAccountName is the name of the ServiceAccount to use to run the service | [optional]

Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional]
**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional]
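
As an illustration of where `api_version` and `kind` sit in a complete object, the sketch below builds an InferenceService manifest as a plain dictionary instead of using the SDK model classes. Several details are assumptions rather than facts taken from this document: the group/version string `serving.kubeflow.org/v1alpha2`, the `default` endpoint key, the `sklearn` framework key with its `storageUri` field, and the example name and bucket. `build_inference_service` is a hypothetical helper.

```python
import json


def build_inference_service(name: str, namespace: str, storage_uri: str) -> dict:
    """Build a minimal InferenceService manifest as a plain dict."""
    return {
        # Assumed group/version string for the v1alpha2 API.
        "apiVersion": "serving.kubeflow.org/v1alpha2",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "default": {
                # Only the predictor is set; explainer and transformer are optional.
                "predictor": {"sklearn": {"storageUri": storage_uri}},
            }
        },
    }


manifest = build_inference_service(
    "sklearn-iris", "default", "gs://my-bucket/models/iris"
)
print(json.dumps(manifest, indent=2))
```
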