Model#

pai.model.container_serving_spec(command: str, image_uri: Union[str, ImageInfo], source_dir: Optional[str] = None, git_config: Optional[Dict[str, Any]] = None, port: Optional[int] = None, environment_variables: Optional[Dict[str, str]] = None, requirements: Optional[List[str]] = None, requirements_path: Optional[str] = None, health_check: Optional[Dict[str, Any]] = None, session: Optional[Session] = None) InferenceSpec#

A convenient function to create an InferenceSpec instance that serves the model with the given container and script.

Examples:

infer_spec: InferenceSpec = container_serving_spec(
    command="python run.py",
    source_dir="./model_server/",
    image_uri="<ServingImageUri>",
)

m = Model(
    model_data="oss://<YourOssBucket>/path/to/your/model",
    inference_spec=infer_spec,
)
m.deploy(
    instance_type="ecs.c6.xlarge"
)
Parameters:
  • command (str) -- The command used to launch the Model server.

  • source_dir (str) --

    A relative path or an absolute path to the source code directory used to load the model and launch the HTTP server. It will be uploaded to the OSS bucket and mounted into the container. If there is a requirements.txt file under the directory, it will be installed before the prediction server starts.

    If 'git_config' is provided, 'source_dir' should be a relative location to a directory in the Git repo. With the following GitHub repo directory structure:

    |----- README.md
    |----- src
                |----- train.py
                |----- test.py
    

    If you need the 'src' directory as the source code directory, you can assign source_dir='./src/'.

  • git_config (Dict[str, str]) --

    Git configuration used to clone the repo, including repo, branch, commit, username, password and token. The repo field is required; all other fields are optional. repo specifies the Git repository. If you don't provide branch, the default value 'master' is used. If you don't provide commit, the latest commit in the specified branch is used. username, password and token are used for authentication (a usage sketch that passes git_config to this function appears after the return description below). For example, the following config:

    git_config = {
        'repo': 'https://github.com/modelscope/modelscope.git',
        'branch': 'master',
        'commit': '9bfc4a9d83c4beaf8378d0a186261ffc1cd9f960'
    }
    

    results in cloning the repo specified in 'repo', then checking out the 'master' branch, and checking out the specified commit.

  • image_uri (str) -- The Docker image used to run the prediction service.

  • port (int) -- The exposed port of the server in the container; prediction requests will be forwarded to this port. The environment variable LISTENING_PORT in the container will be set to this value. Defaults to 8000.

  • environment_variables (Dict[str, str], optional) -- Dictionary of environment variable key-value pairs to set on the running container.

  • requirements (List[str], optional) -- A list of Python package dependencies; they will be installed before the serving container runs.

  • requirements_path (str, optional) -- An absolute path to the requirements.txt file in the container.

  • health_check (Dict[str, Any], optional) -- The health check configuration. If not set, a TCP readiness probe will be used to check the health of the HTTP server.

  • session (Session, optional) -- A PAI session instance used for communicating with PAI service.

Returns:

An InferenceSpec instance.

Return type:

pai.model.InferenceSpec
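
As a complement to the example above, the following sketch pulls the serving code from a Git repository and declares extra dependencies. The repository URL is the one from the git_config example; the source directory, port, environment variable, and package list are illustrative placeholders.

from pai.model import InferenceSpec, container_serving_spec

infer_spec: InferenceSpec = container_serving_spec(
    command="python run.py",
    source_dir="./src/",
    git_config={
        "repo": "https://github.com/modelscope/modelscope.git",
        "branch": "master",
    },
    image_uri="<ServingImageUri>",
    port=8000,
    environment_variables={"LOG_LEVEL": "INFO"},
    requirements=["flask>=2.0"],
)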

class pai.model.ResourceConfig(cpu: int, memory: int, gpu: Optional[int] = None, gpu_memory: Optional[int] = None)#

A class that represents the resource used by a PAI prediction service instance.

ResourceConfig initializer.

The public resource group does not support requesting GPU resources with ResourceConfig. Use the gpu and gpu_memory parameters only for services deployed to dedicated resource groups that provide GPU machine instances.

Parameters:
  • cpu (int) -- The number of CPUs that each instance requires.

  • memory (int) -- The amount of memory that each instance requires, must be an integer, Unit: MB.

  • gpu (int) -- The number of GPUs that each instance requires.

  • gpu_memory (int) --

    The amount of GPU memory that each instance requires. The value must be an integer, Unit: GB.

    PAI allows memory resources of a GPU to be allocated to multiple instances. If you want multiple instances to share the memory resources of a GPU, set the gpu parameter to 0. If you set the gpu parameter to 1, each instance occupies a GPU and the gpu_memory parameter does not take effect.

    Note

    Important: PAI does not enable strict isolation of GPU memory. To prevent out-of-memory (OOM) errors, make sure that the GPU memory used by each instance does not exceed the requested amount.
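
For instance, a ResourceConfig in which instances share the memory of a single GPU (as described under gpu_memory above) might look like the following sketch for a service deployed to a dedicated resource group; the values are illustrative:

from pai.model import ResourceConfig

resource_config = ResourceConfig(
    cpu=4,
    memory=8000,   # Unit: MB
    gpu=0,         # 0 allows multiple instances to share one GPU
    gpu_memory=8,  # Unit: GB
)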

class pai.model.Model(model_data: Optional[str] = None, inference_spec: Optional[InferenceSpec] = None, session: Optional[Session] = None)#

A class that represents a ready-to-deploy model.

A Model instance includes the model artifact path and information on how to create a prediction service in PAI (specified by the inference_spec). By calling the model.deploy method, a prediction service is created in PAI and a pai.predictor.Predictor instance is returned that can be used to make predictions against the service.

Example:

from pai.model import Model
from pai.model import InferenceSpec

m: Model = Model(
    inference_spec=InferenceSpec(processor="xgboost"),
    model_data="oss://bucket-name/path/to/model",
)

# register model to PAI ModelRegistry
registered_model = m.register(
    model_name="example_xgb_model",
    version="1.0.0",
)

# Deploy the model to create a prediction service.
p: Predictor = m.deploy(
    service_name="xgb_model_service",
    instance_count=2,
    instance_type="ecs.c6.large",
    options={
        "metadata.rpc.batching": true,
        "metadata.rpc.keepalive": 10000
    }
)

# Make a prediction by sending data to the prediction service.
result = p.predict([[2,3,4], [54.12, 2.9, 45.8]])

Model initializer.

Parameters:
  • model_data (str) -- An OSS URI or a local file path that specifies the location of the model. If model_data is a local file path, it will be uploaded to an OSS bucket before deployment or model registration.

  • inference_spec (pai.model.InferenceSpec, optional) -- An InferenceSpec object representing how to create the prediction service using the model.

  • session (pai.session.Session, optional) -- A PAI session object that manages interactions with the PAI REST API.

deploy(service_name: str, instance_count: Optional[int] = 1, instance_type: Optional[str] = None, resource_config: Optional[Union[Dict[str, int], ResourceConfig]] = None, resource_id: Optional[str] = None, service_type: Optional[str] = None, options: Optional[Dict[str, Any]] = None, wait: bool = True, serializer: Optional[SerializerBase] = None, **kwargs)#

Deploy an online prediction service.

Parameters:
  • service_name (str, optional) -- Name for the online prediction service. The name must be unique in a region.

  • instance_count (int) -- Number of instances requested for the service deployment (Default 1).

  • instance_type (str, optional) -- Type of the machine instance, for example, 'ecs.c6.large'. For all supported instance types, see the appendix of the document: https://help.aliyun.com/document_detail/144261.htm?#section-mci-qh9-4j7

  • resource_config (Union[ResourceConfig, Dict[str, int]], optional) --

    Requested resources for each instance of the service. Required if instance_type is not set. Example config:

    resource_config = {
        "cpu": 2,          # The number of CPUs that each instance requires.
        "memory": 4000,    # The amount of memory that each instance requires,
                           # must be an integer, Unit: MB.
        # "gpu": 1,        # The number of GPUs that each instance requires.
        # "gpu_memory": 3  # The amount of GPU memory that each instance
                           # requires, must be an integer, Unit: GB.
    }
    

  • resource_id (str, optional) --

    The ID of the resource group. The service can be deployed to the public resource group or a dedicated resource group.

    • If resource_id is not specified, the service is deployed to the public resource group.

    • If the service is deployed in a dedicated resource group, provide the ID of the resource group. Example: "eas-r-6dbzve8ip0xnzte5rp".

  • service_type (str, optional) -- The type of the service.

  • options (Dict[str, Any], optional) -- Advanced deploy parameters used to create the online prediction service.

  • wait (bool) -- Whether the call should wait until the online prediction service is ready (Default True).

  • serializer (pai.predictor.serializers.BaseSerializer, optional) -- A serializer object used to serialize the prediction request and deserialize the prediction response.

Returns:

A PredictorBase instance used for making predictions against the prediction service.
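
For example, a deployment to a dedicated resource group can combine resource_config and resource_id, as in the following sketch (the service name, resource values, and resource group ID are illustrative placeholders):

from pai.model import ResourceConfig

predictor = m.deploy(
    service_name="example_model_service",
    instance_count=1,
    resource_config=ResourceConfig(cpu=2, memory=4000),
    resource_id="eas-r-6dbzve8ip0xnzte5rp",
    wait=True,
)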

class pai.model.InferenceSpec(*args, **kwargs)#

A class used to describe how to create a prediction service.

InferenceSpec is used to describe how the model is served in PAI. To view the full list of supported parameters, please see the following hyperlink: Parameters of model services.

Example of how to configure an InferenceSpec:

>>> # build an inference_spec that uses the XGBoost processor.
>>> infer_spec = InferenceSpec(processor="xgboost")
>>> infer_spec.metadata.rpc.keepalive  = 1000
>>> infer_spec.warm_up_data_path = "oss://bucket-name/path/to/warmup-data"
>>> infer_spec.add_option("metadata.rpc.max_batch_size", 8)
>>> print(infer_spec.processor)
xgboost
>>> print(infer_spec.metadata.rpc.keepalive)
1000
>>> print(infer_spec.metadata.rpc.max_batch_size)
8
>>> print(infer_spec.to_dict())
{'processor': 'xgboost', 'metadata': {'rpc': {'keepalive': 1000, 'max_batch_size': 8}},
'warm_up_data_path': 'oss://bucket-name/path/to/warmup-data'}

InferenceSpec initializer.

Parameters:

**kwargs -- Parameters of the inference spec.

to_dict() Dict#

Return a dictionary that represents the InferenceSpec.

add_option(name: str, value)#

Add an option to the inference_spec instance.

Parameters:
  • name (str) --

    Name of the option to set, represented as the JSON path of the parameter for the InferenceSpec. To view the full supported parameters, please see the following hyperlink: Parameters of model services.

  • value -- Value for the option.

Example

>>> infer_spec = InferenceSpec(processor="tensorflow_gpu_1.12")
>>> infer_spec.add_option("metadata.rpc.keepalive", 10000)
>>> infer_spec.metadata.rpc.keepalive
10000
>>> infer_spec.to_dict()
{'processor': 'tensorflow_gpu_1.12', 'metadata': {'rpc': {'keepalive': 10000}}}

merge_options(options: Dict[str, Any])#

Merge options from a dictionary.
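
Example (a minimal sketch, assuming merge_options accepts the same JSON-path-style keys as add_option):

>>> infer_spec = InferenceSpec(processor="xgboost")
>>> infer_spec.merge_options({
...     "metadata.rpc.keepalive": 5000,
...     "metadata.rpc.max_batch_size": 16,
... })
>>> infer_spec.metadata.rpc.keepalive
5000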

classmethod from_dict(config: Dict[str, Any]) InferenceSpec#

Initialize an InferenceSpec from a dictionary.

You can use this method to initialize an InferenceSpec instance from a dictionary.

Returns:

An InferenceSpec instance.

Return type:

pai.model.InferenceSpec
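
Example (a minimal sketch using the nested dictionary form shown by to_dict above):

>>> config = {
...     "processor": "xgboost",
...     "metadata": {"rpc": {"keepalive": 1000}},
... }
>>> infer_spec = InferenceSpec.from_dict(config)
>>> infer_spec.metadata.rpc.keepalive
1000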

mount(source: str, mount_path: str, session: Optional[Session] = None, properties: Optional[Dict[str, Any]] = None) Dict[str, Any]#

Mount a source storage to the running container.

Note

If source is a local path, it will be uploaded to the OSS bucket and mounted. If source is an OSS path, it will be mounted directly.

Parameters:
  • source (str) -- The source storage to be attached. Currently, only OSS paths in OSS URI format and local paths are supported.

  • mount_path (str) -- The mount path in the container.

  • session (Session, optional) -- A PAI session instance used for communicating with PAI service.

Returns:

The storage config.

Return type:

Dict[str, Any]

Raises:

DuplicateMountException -- If the mount path is already in use or the source OSS path is already mounted to the container.

Examples:

>>> # Mount an OSS storage path to the running container.
>>> inference_spec.mount("oss://<YourOssBucket>/path/to/directory/model.json",
...                      "/ml/model/")

>>> # Mount a local path to the running container.
>>> inference_spec.mount("/path/to/your/data/", "/ml/model/")

set_model_data(model_data: str, mount_path: Optional[str] = None)#

Set the model data for the InferenceSpec instance.

Parameters:
  • model_data (str) -- The model data to be set. It must be an OSS URI.

  • mount_path (str, optional) -- The mount path in the container.

Raises:

DuplicatedMountException -- If the model data is already mounted to the container.
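
Example (a minimal sketch; the OSS URI and mount path are placeholders):

>>> inference_spec.set_model_data(
...     model_data="oss://<YourOssBucket>/path/to/model/",
...     mount_path="/ml/model/",
... )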

class pai.model.RegisteredModel(model_name: str, model_version: Optional[str] = None, model_provider: Optional[str] = None, session: Optional[Session] = None, **kwargs)#

A class that represents a registered model in PAI model registry.

A RegisteredModel instance has a unique tuple of (model_name, model_version, model_provider), and can be used for downstream tasks such as creating an online prediction service, or creating an AlgorithmEstimator to start a training job.

Examples:

from pai.model import RegisteredModel

# retrieve a registered model from PAI model registry by
# specifying the model_name, model_version and model_provider
m = RegisteredModel(
    model_name="easynlp_pai_bert_small_zh",
    model_version="0.1.0",
    model_provider="pai",
)


# deploy the Registered Model to create an online prediction
# service if the model has inference_spec
m.deploy()


# create an AlgorithmEstimator to start a training job if the
# model has training_spec
est = m.get_estimator()
inputs = m.get_estimator_inputs()
est.fit(inputs)

Get a RegisteredModel instance from PAI model registry.

Parameters:
  • model_name (str) -- The name of the registered model.

  • model_version (str, optional) -- The version of the registered model. If not provided, the latest version is retrieved from the model registry.

  • model_provider (str, optional) -- The provider of the registered model. Currently, only "pai", "huggingface" or None are supported.

  • session (pai.session.Session, optional) -- A PAI session object used for interacting with PAI Service.

classmethod list(model_name: Optional[str] = None, model_provider: Optional[str] = None, task: Optional[str] = None, session: Optional[Session] = None) Iterator[RegisteredModel]#

List registered models in model registry.

Parameters:
  • model_name (str, optional) -- The name of the registered model. Default to None.

  • model_provider (str, optional) -- The provider of the registered model. Optional values are "pai", "huggingface" or None. If None, list registered models in the workspace of the current session. Default to None.

  • task (str, optional) -- The task of the registered model. Default to None.

  • session (Session, optional) -- A PAI session object used for interacting with PAI Service.

Returns:

An iterator of RegisteredModel instances matching the given criteria.

Return type:

Iterator[RegisteredModel]
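
Example (a minimal sketch; the task name is illustrative):

from pai.model import RegisteredModel

for m in RegisteredModel.list(model_provider="pai", task="text-classification"):
    print(m.model_name, m.model_version)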

list_versions(model_version: Optional[str] = None) Iterator[RegisteredModel]#

List all versions of the registered model.

Parameters:

model_version (str, optional) -- The version of the registered model. Default to None.

delete(delete_all_version: bool = False)#

Delete the specific registered model from PAI model registry.

Parameters:

delete_all_version (bool) -- Whether to delete all versions of the registered model.

deploy(service_name: Optional[str] = None, instance_count: Optional[int] = None, instance_type: Optional[str] = None, resource_config: Optional[Union[Dict[str, int], ResourceConfig]] = None, resource_id: Optional[str] = None, options: Optional[Dict[str, Any]] = None, service_type: Optional[str] = None, wait: bool = True, serializer: Optional[SerializerBase] = None, **kwargs)#

Deploy an online prediction service with the registered model.

If the RegisteredModel already has a registered inference_spec, the model can be deployed directly. Provide more specific arguments to override the registered inference_spec. Otherwise, the model is deployed through the same process as the deploy() method of pai.model.Model.

Parameters:
  • service_name (str, optional) -- Name for the online prediction service. The name must be unique in a region.

  • instance_count (int, optional) -- Number of instances requested for the service deployment.

  • instance_type (str, optional) -- Type of the machine instance, for example, 'ecs.c6.large'. For all supported instance types, see the appendix of the document: https://help.aliyun.com/document_detail/144261.htm?#section-mci-qh9-4j7

  • resource_config (Union[ResourceConfig, Dict[str, int]], optional) --

    Requested resources for each instance of the service. Required if instance_type is not set. Example config:

    resource_config = {
        "cpu": 2,          # The number of CPUs that each instance requires.
        "memory": 4000,    # The amount of memory that each instance requires,
                           # must be an integer, Unit: MB.
        # "gpu": 1,        # The number of GPUs that each instance requires.
        # "gpu_memory": 3  # The amount of GPU memory that each instance
                           # requires, must be an integer, Unit: GB.
    }
    

  • resource_id (str, optional) --

    The ID of the resource group. The service can be deployed to the public resource group or a dedicated resource group.

    • If resource_id is not specified, the service is deployed to the public resource group.

    • If the service is deployed in a dedicated resource group, provide the ID of the resource group. Example: "eas-r-6dbzve8ip0xnzte5rp".

  • service_type (str, optional) -- The type of the service.

  • options (Dict[str, Any], optional) -- Advanced deploy parameters used to create the online prediction service.

  • wait (bool) -- Whether the call should wait until the online prediction service is ready (Default True).

  • serializer (pai.predictor.serializers.BaseSerializer, optional) -- A serializer object used to serialize the prediction request and deserialize the prediction response.

Returns:

A PredictorBase instance used for making predictions against the prediction service.
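
For example, a registered model that already carries an inference_spec can be deployed while overriding a few arguments (continuing the class-level example above; the service name and instance type are illustrative):

predictor = m.deploy(
    service_name="bert_small_service",
    instance_type="ecs.c6.xlarge",
    instance_count=1,
)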

get_estimator(training_method: Optional[str] = None, instance_type: Optional[str] = None, instance_count: Optional[int] = None, hyperparameters: Optional[Dict[str, Any]] = None, base_job_name: Optional[str] = None, output_path: Optional[str] = None, max_run_time: Optional[int] = None, **kwargs) AlgorithmEstimator#

Generate an AlgorithmEstimator.

Generate an AlgorithmEstimator object from RegisteredModel's training_spec.

Parameters:
  • training_method (str, optional) -- Used to select the training algorithm supported by the model. If not specified, the default training algorithm will be retrieved from the model version.

  • instance_type (str, optional) -- The machine instance type used to run the training job. If not provided, the default instance type will be retrieved from the algorithm definition. To view the supported machine instance types, please refer to the document: https://help.aliyun.com/document_detail/171758.htm#section-55y-4tq-84y.

  • instance_count (int, optional) -- The number of machines used to run the training job. If not provided, the default instance count will be retrieved from the algorithm definition.

  • hyperparameters (dict, optional) -- A dictionary that represents the hyperparameters used in the training job. Default hyperparameters will be retrieved from the algorithm definition.

  • base_job_name (str, optional) -- The base name used to generate the training job name. If not provided, a default job name will be generated.

  • output_path (str, optional) -- An OSS URI to store the outputs of the training jobs. If not provided, an OSS URI will be generated using the default OSS bucket in the session. When the estimator.fit method is called, a specific OSS URI under the output_path for each channel is generated and mounted to the training container.

  • max_run_time (int, optional) -- The maximum time in seconds that the training job can run. The training job will be terminated after the time is reached (Default None).

Returns:

An AlgorithmEstimator object.

Return type:

pai.estimator.AlgorithmEstimator
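
Example (continuing the class-level example above; a minimal sketch that overrides the instance type, where the hyperparameter name is illustrative and depends on the algorithm definition):

est = m.get_estimator(
    instance_type="ecs.c6.xlarge",
    hyperparameters={"max_epochs": 1},
)
inputs = m.get_estimator_inputs()
est.fit(inputs)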

get_estimator_inputs() Dict[str, str]#

Get the AlgorithmEstimator's default input channels

Get the AlgorithmEstimator's default input channels from RegisteredModel's training_spec.

Returns:

A dict of input channels.

Return type:

dict[str, str]

get_eval_processor(base_job_name: Optional[str] = None, output_path: Optional[str] = None, parameters: Optional[Dict[str, Any]] = None, max_run_time: Optional[int] = None, instance_type: Optional[str] = None, instance_count: Optional[int] = None, user_vpc_config: Optional[UserVpcConfig] = None)#

Generate a Processor for model evaluation.

Generate a Processor object from RegisteredModel's evaluation_spec.

Parameters:
  • parameters (dict, optional) -- A dictionary that represents the parameters used in the job. Default parameters will be retrieved from the evaluation spec.

  • base_job_name (str, optional) -- The base name used to generate the job name. If not provided, a default job name will be generated.

  • output_path (str, optional) -- An OSS URI to store the outputs of the jobs. If not provided, an OSS URI will be generated using the default OSS bucket in the session. When the job runs, a specific OSS URI under the output_path for each channel is generated and mounted to the container.

  • max_run_time (int, optional) -- The maximum time in seconds that the job can run. The job will be terminated after the time is reached (Default None).

  • instance_type (str, optional) -- The machine instance type used to run the job. If not provided, the default instance type will be retrieved from the evaluation spec. To view the supported machine instance types, please refer to the document: https://help.aliyun.com/document_detail/171758.htm#section-55y-4tq-84y.

  • instance_count (int, optional) -- The number of machines used to run the job. If not provided, the default instance count will be retrieved from the evaluation spec.

  • user_vpc_config (pai.estimator.UserVpcConfig, optional) -- The VPC configuration used to enable the job instance to connect to the specified user VPC. If provided, an Elastic Network Interface (ENI) will be created and attached to the job instance, allowing the instance to access the resources within the specified VPC. Default to None.

Returns:

A Processor object.

Return type:

pai.processor.Processor
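
Example (a minimal sketch; the instance type is illustrative, and submitting the evaluation job then follows the pai.processor.Processor API):

processor = m.get_eval_processor(instance_type="ecs.c6.xlarge")
inputs = m.get_evaluation_inputs()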

get_evaluation_inputs() Dict[str, Any]#

Get the Processor's default input channels

Get the Processor's default input channels from RegisteredModel's evaluation_spec.

Returns:

A dict of input channels.

Return type:

dict[str, str]