This document describes how to set a virtual machine (VM) instance's host maintenance policy to control how the VM behaves when a host event occurs.
Before you begin
- If you haven't already, set up authentication. Authentication is the process by which your identity is verified for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine as follows.
Select the tab for how you plan to use the samples on this page:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
- Install the Google Cloud CLI, then initialize it by running the following command:
gcloud init
- Set a default region and zone.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
Install the Google Cloud CLI, then initialize it by running the following command:
gcloud init
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Limitations
- You can't change the host maintenance policy of a preemptible VM. When there is a maintenance event, the preemptible VM stops and it does not migrate. You must manually restart the preempted VM.
- After you create a VM using an E2 machine type, you can't change the VM's host maintenance settings from MIGRATE to TERMINATE or the other way around.
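The two limitations above can be expressed as a small pre-flight check. This is a minimal sketch, not part of any Google library: the function name `policy_change_blocker` is hypothetical, and it assumes that E2 machine type names start with `e2-` (for example, `e2-standard-4`).

```python
from typing import Optional


def policy_change_blocker(machine_type: str, preemptible: bool) -> Optional[str]:
    """Return why the host maintenance policy can't be changed, or None."""
    if preemptible:
        # Preemptible VMs stop during maintenance and do not migrate.
        return "preemptible VMs can't change their host maintenance policy"
    # Assumption: E2 machine type names start with "e2-".
    if machine_type.lower().startswith("e2-"):
        return "E2 VMs can't switch between MIGRATE and TERMINATE after creation"
    return None
```

A caller could run this check before attempting an update and show the returned reason to the user instead of sending a request that is expected to fail.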
Available host maintenance properties
You can configure a VM's maintenance behavior, restart behavior, and behavior after a host error occurs with the following properties.
Compute Engine configures each VM with the default values unless you specify otherwise.
During host events, depending on the configured host maintenance policy, VMs that don't support live migration are terminated or automatically restarted.
- onHostMaintenance: determines the behavior when a maintenance event occurs that might cause your VM to reboot.
  - MIGRATE (Default): causes Compute Engine to live migrate an instance when there is a maintenance event.
  - TERMINATE: stops a VM instead of migrating it.
- automaticRestart: determines the behavior when a VM crashes or is stopped by the system.
  - true (Default): Compute Engine restarts an instance if the instance crashes or is stopped.
  - false: Compute Engine does not restart a VM if the VM crashes or is stopped.
- localSsdRecoveryTimeout: sets the Local SSD recovery timeout. This is the maximum amount of time, in hours, that Compute Engine waits to recover Local SSD data after a host error. This setting only applies to VMs with attached Local SSD disks.
  - Unset (Default): Compute Engine waits up to 1 hour to recover the disk. For Z3 VMs, the default wait time is 6 hours.
  - A number from 0 to 168: specifies how long Compute Engine waits to recover the disk. The number must be an integer, in increments of 1 hour, with a maximum value of 7 days. A value of 0 means that Compute Engine won't wait to recover the data.
- hostErrorTimeoutSeconds (Preview): sets the maximum amount of time, in seconds, that Compute Engine waits to restart or terminate a VM after detecting that the VM is unresponsive.
  - Unset (Default): Compute Engine waits up to 5.5 minutes (330 seconds) before restarting an unresponsive VM.
  - A number from 90 to 330: specifies the number of seconds, in increments of 30, that Compute Engine waits before restarting an unresponsive VM.
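The value ranges listed above can be checked locally before you send a request. The following is a minimal sketch under that assumption; the function `validate_scheduling` and its parameter names are hypothetical and are not part of the gcloud CLI or the Compute Engine API.

```python
from typing import List, Optional

VALID_POLICIES = {"MIGRATE", "TERMINATE"}


def validate_scheduling(
    on_host_maintenance: str = "MIGRATE",
    local_ssd_recovery_hours: Optional[int] = None,
    host_error_timeout_seconds: Optional[int] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the values are in range."""
    problems = []
    if on_host_maintenance not in VALID_POLICIES:
        problems.append("onHostMaintenance must be MIGRATE or TERMINATE")
    hours = local_ssd_recovery_hours
    if hours is not None:
        # Whole hours from 0 to 168 (7 days); 0 means don't wait for recovery.
        if type(hours) is not int or not 0 <= hours <= 168:
            problems.append("localSsdRecoveryTimeout must be an integer from 0 to 168")
    seconds = host_error_timeout_seconds
    if seconds is not None:
        # From 90 to 330 seconds, in increments of 30.
        if type(seconds) is not int or not 90 <= seconds <= 330 or seconds % 30 != 0:
            problems.append("hostErrorTimeoutSeconds must be 90 to 330 in steps of 30")
    return problems
```

Leaving a timeout as `None` models the unset case, in which Compute Engine applies its default.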
Set host maintenance policy of a VM
You can change the host maintenance policy of a VM when you first create the VM or after the VM is created.
Set host maintenance policy during VM creation
The information in this section focuses on how to set the host maintenance policy when you create a VM. For more VM creation examples, see Create and start a VM instance.
You can set the host maintenance policy of a VM at creation using the Google Cloud console, gcloud CLI or the Compute Engine API.
Console
In the Google Cloud console, go to the Create an instance page.
Specify a Name for the VM.
Select a Region and Zone for the VM.
In the Machine configuration section, do the following:
- Specify the details of the machine type for the VM.
- Expand the VM provisioning model advanced settings menu.
- In the On host maintenance menu, select one of the following options:
- To migrate VMs during maintenance events, select Migrate VM instance.
- To stop VMs during maintenance events, select Terminate VM instance.
To create the VM, click Create.
gcloud
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
To set the host maintenance policy of a new VM, use the gcloud compute instances create command. Include one or more of the following parameters:
- --maintenance-policy: whether the VM is migrated or stopped during host maintenance. The VM is migrated by default if you omit this property.
- --no-restart-on-failure or --restart-on-failure: whether the VM restarts automatically after a host error. By default, the VM will always restart when a failure is detected.
- --local-ssd-recovery-timeout: how much time Compute Engine spends recovering any attached Local SSD disks after a host error. The default is 1 hour.
Set the host maintenance policy of a new VM with the following command. If you omit any of the flags, the flag's default is used.
gcloud compute instances create VM_NAME \
    --maintenance-policy=MAINTENANCE_POLICY \
    --RESTART_ON_FAILURE_BEHAVIOR \
    --local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT
Replace the following:
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
- SSD_RECOVERY_TIMEOUT: the number of hours to spend recovering a Local SSD attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
Set the host error detection timeout
To set the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, use the gcloud beta compute instances create command, because this option is available in Preview. Specify the timeout with the --host-error-timeout-seconds flag.
gcloud beta compute instances create VM_NAME \
    --maintenance-policy=MAINTENANCE_POLICY \
    --RESTART_ON_FAILURE_BEHAVIOR \
    --local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT \
    --host-error-timeout-seconds=ERROR_DETECTION_TIMEOUT
Replace the following:
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
- SSD_RECOVERY_TIMEOUT: the number of hours Compute Engine spends recovering a Local SSD that was attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
- ERROR_DETECTION_TIMEOUT: the number of seconds Compute Engine waits before restarting an unresponsive VM, from 90 to 330, in increments of 30.
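If you generate these commands from a script, the flag selection described above can be sketched as follows. The helper `build_create_command` is hypothetical; it only assembles the flags shown in this section and assumes the `beta` command group is needed only when the host error timeout is set.

```python
import shlex
from typing import List, Optional


def build_create_command(
    vm_name: str,
    maintenance_policy: str = "MIGRATE",
    restart_on_failure: bool = True,
    ssd_recovery_timeout_hours: Optional[int] = None,
    host_error_timeout_seconds: Optional[int] = None,
) -> List[str]:
    """Assemble the argument list for creating a VM with a maintenance policy."""
    cmd = ["gcloud"]
    # The host error timeout is in Preview, so use the beta command group for it.
    if host_error_timeout_seconds is not None:
        cmd.append("beta")
    cmd += [
        "compute", "instances", "create", vm_name,
        f"--maintenance-policy={maintenance_policy}",
        "--restart-on-failure" if restart_on_failure else "--no-restart-on-failure",
    ]
    # Omitted flags fall back to their defaults.
    if ssd_recovery_timeout_hours is not None:
        cmd.append(f"--local-ssd-recovery-timeout={ssd_recovery_timeout_hours}")
    if host_error_timeout_seconds is not None:
        cmd.append(f"--host-error-timeout-seconds={host_error_timeout_seconds}")
    return cmd


# Print the command as a single shell-safe line ("example-vm" is a placeholder).
print(shlex.join(build_create_command("example-vm", "TERMINATE", False, 2)))
```

Returning a list rather than a string lets you pass the result directly to `subprocess.run` without shell quoting concerns.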
REST
To set the host maintenance policy of a new VM using the Compute Engine API, use the instances.insert method.
Include one or more of the following properties in the scheduling object of the request body:
- onHostMaintenance: whether the VM is migrated or stopped during host maintenance. The VM is migrated by default.
- automaticRestart: whether the VM restarts automatically after a host error. VMs are restarted automatically by default.
- localSsdRecoveryTimeout: how much time Compute Engine spends recovering any attached Local SSD disks after detecting a host error. The default is 1 hour.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances
{
  "name": "VM_NAME",
  "scheduling": {
    "onHostMaintenance": "MAINTENANCE_POLICY",
    "automaticRestart": RESTART_POLICY,
    "localSsdRecoveryTimeout": SSD_RECOVERY_TIMEOUT
  }
}
Replace the following:
- PROJECT_ID: the project for the VM.
- ZONE: the zone where you want to create the VM.
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_POLICY: the restart policy for this VM, either true or false.
- SSD_RECOVERY_TIMEOUT: the number of hours Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
Set the host error detection timeout
To set the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, use the beta instances.insert method because this option is available in Preview.
Add the hostErrorTimeoutSeconds property to the scheduling object of the request body.
POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances
{
  "name": "VM_NAME",
  "scheduling": {
    "onHostMaintenance": "MAINTENANCE_POLICY",
    "automaticRestart": RESTART_POLICY,
    "localSsdRecoveryTimeout": SSD_RECOVERY_TIMEOUT,
    "hostErrorTimeoutSeconds": HOST_ERROR_TIMEOUT
  }
}
Replace the following:
- PROJECT_ID: the project for the VM.
- ZONE: the zone where you want to create the VM.
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_POLICY: the restart policy for this VM, either true or false.
- SSD_RECOVERY_TIMEOUT: the number of hours Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
- HOST_ERROR_TIMEOUT: the number of seconds Compute Engine waits before restarting or terminating an unresponsive VM. Valid values are from 90 to 330, in increments of 30.
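The choice between the v1 and beta endpoints in the two REST templates above can be made programmatically. The sketch below is illustrative only: `build_insert_request` is a hypothetical helper, and like the templates it includes only the name and scheduling fields, so other VM properties still need to be added to form a complete create request.

```python
import json
from typing import Any, Dict, Tuple

BASE_URL = "https://compute.googleapis.com/compute"


def build_insert_request(
    project_id: str, zone: str, vm_name: str, scheduling: Dict[str, Any]
) -> Tuple[str, str]:
    """Return (url, json_body) for an instances.insert call."""
    # hostErrorTimeoutSeconds is in Preview, so it requires the beta endpoint.
    version = "beta" if "hostErrorTimeoutSeconds" in scheduling else "v1"
    url = f"{BASE_URL}/{version}/projects/{project_id}/zones/{zone}/instances"
    # json.dumps writes Python True/False as JSON true/false (unquoted).
    body = json.dumps({"name": vm_name, "scheduling": scheduling}, indent=2)
    return url, body


url, body = build_insert_request(
    "example-project", "us-central1-a", "example-vm",
    {"onHostMaintenance": "TERMINATE", "automaticRestart": False},
)
print(url)
print(body)
```

Serializing the body with a JSON library rather than string templating avoids quoting mistakes, such as sending the boolean automaticRestart value as a string.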
Update the host maintenance policy of an existing VM
Console
In the Google Cloud console, go to the VM instances page.
Click the VM for which you want to change settings. The VM details page displays.
On the VM details page, complete the following steps:
- Click the Edit button at the top of the page.
- Go to the Management section. From the Availability policies section, you can set the On host maintenance and Automatic restart options.
- Click Save.
gcloud
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
Update the host maintenance policy of an existing VM with the gcloud compute instances set-scheduling command. Use the same parameters described in the VM creation command in the preceding section.
gcloud compute instances set-scheduling VM_NAME \
    --maintenance-policy=MAINTENANCE_POLICY \
    --RESTART_ON_FAILURE_BEHAVIOR \
    --local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT
Replace the following:
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the policy for this VM, either TERMINATE or MIGRATE.
- RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
- SSD_RECOVERY_TIMEOUT: the time, in hours, Compute Engine spends recovering a Local SSD disk attached to an unresponsive VM. Valid values are from 0 to 168.
Update the host error detection timeout
To update the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, use the gcloud beta compute instances set-scheduling command, because this feature is only available in Preview.
Update the timeout with the --host-error-timeout-seconds parameter.
For example:
gcloud beta compute instances set-scheduling VM_NAME \
    --maintenance-policy=MAINTENANCE_POLICY \
    --RESTART_ON_FAILURE_BEHAVIOR \
    --local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT \
    --host-error-timeout-seconds=NUMBER_OF_SECONDS
Replace the following:
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
- SSD_RECOVERY_TIMEOUT: the time, in hours, Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168.
- NUMBER_OF_SECONDS: the number of seconds Compute Engine waits before restarting or terminating an unresponsive VM, from 90 to 330, in increments of 30.
REST
Update the host maintenance policy of an existing VM with a POST request to the instances.setScheduling method.
Include one or more of the following properties in the request body:
- onHostMaintenance: whether the VM is migrated or stopped during host maintenance. The VM is migrated by default.
- automaticRestart: whether the VM restarts automatically after a host error. VMs are restarted automatically by default.
- localSsdRecoveryTimeout: how much time Compute Engine spends recovering any attached Local SSD disks after detecting a host error. If omitted, the default is 1 hour.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/setScheduling
{
  "onHostMaintenance": "MAINTENANCE_POLICY",
  "automaticRestart": RESTART_POLICY,
  "localSsdRecoveryTimeout": SSD_RECOVERY_TIMEOUT
}
Replace the following:
- PROJECT_ID: the project for the VM.
- ZONE: the zone where the VM is located.
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_POLICY: the restart policy for this VM, either true or false.
- SSD_RECOVERY_TIMEOUT: the time, in hours, that Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168.
Update the host error detection timeout
To update the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, you must use the beta instances.setScheduling method because this feature is available in Preview.
Add the hostErrorTimeoutSeconds parameter to the request body.
POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/setScheduling
{
  "hostErrorTimeoutSeconds": NUMBER_OF_SECONDS
}
Replace the following:
- PROJECT_ID: the project for the VM.
- ZONE: the zone where the VM is located.
- VM_NAME: the VM name.
- NUMBER_OF_SECONDS: the number of seconds Compute Engine waits before restarting or terminating an unresponsive VM, from 90 to 330, in increments of 30.
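To call setScheduling from a script, the template above can be turned into an HTTP request object. The following is a minimal sketch using only the Python standard library. The helper name `set_scheduling_request` is hypothetical, and it assumes that you supply an OAuth 2.0 access token yourself, for example one printed by `gcloud auth print-access-token`.

```python
import json
import urllib.request
from typing import Any, Dict


def set_scheduling_request(
    project_id: str,
    zone: str,
    vm_name: str,
    scheduling: Dict[str, Any],
    access_token: str,
) -> urllib.request.Request:
    """Build (but don't send) a POST request for instances.setScheduling."""
    # Use the beta endpoint only when the Preview property is present.
    version = "beta" if "hostErrorTimeoutSeconds" in scheduling else "v1"
    url = (
        f"https://compute.googleapis.com/compute/{version}/projects/{project_id}"
        f"/zones/{zone}/instances/{vm_name}/setScheduling"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(scheduling).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`. Keeping construction separate from sending makes the URL and body easy to inspect or log before anything changes on the VM.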
View host maintenance policy settings of a VM
Console
Go to the VM instances page.
Click the Name of the VM for which you want to view settings. The VM instance details page opens.
Go to the Management section. The Availability policies subsection shows your current settings for On host maintenance and Automatic restart.
gcloud
View the host maintenance option settings for a VM with the gcloud compute instances describe command:
gcloud compute instances describe VM_NAME --format="yaml(scheduling)"
Replace VM_NAME with the VM name.
The output includes the VM's host maintenance settings, for example:
scheduling:
  automaticRestart: true
  localSsdRecoveryTimeout:
    nanos: 0
    seconds: '10800'
  onHostMaintenance: MIGRATE
  preemptible: false
  provisioningModel: STANDARD
View the host error detection timeout setting
View the current value of the hostErrorTimeoutSeconds property with the gcloud beta compute instances describe command, because this option is only available in Preview.
gcloud beta compute instances describe VM_NAME --format="yaml(scheduling)"
Replace VM_NAME with the VM name.
The output includes the VM's host error detection timeout, for example:
scheduling:
  automaticRestart: true
  hostErrorTimeoutSeconds: 120
  localSsdRecoveryTimeout:
    nanos: 0
    seconds: '10800'
  onHostMaintenance: MIGRATE
  preemptible: false
  provisioningModel: STANDARD
REST
To view the host maintenance settings for a VM, use the instances.get method:
GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME
Replace the following:
- PROJECT_ID: the project where the VM is located.
- ZONE: the zone where the VM is located.
- VM_NAME: the VM name.
In the output, the scheduling object contains the VM's host maintenance policy, for example:
"scheduling": {
  "onHostMaintenance": "MIGRATE",
  "automaticRestart": true,
  "preemptible": false,
  "provisioningModel": "STANDARD",
  "localSsdRecoveryTimeout": {
    "seconds": "10800",
    "nanos": 0
  }
}
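The example output reports the Local SSD recovery timeout in seconds, as a string, while the setting itself is expressed in hours. The following sketch shows one way to read the scheduling object from a parsed response and convert the timeout back to hours. The function `summarize_scheduling` and the `localSsdRecoveryHours` key are hypothetical names chosen for this example.

```python
import json


def summarize_scheduling(instance: dict) -> dict:
    """Extract the host maintenance settings from an instances.get response."""
    scheduling = instance.get("scheduling", {})
    summary = {
        "onHostMaintenance": scheduling.get("onHostMaintenance"),
        "automaticRestart": scheduling.get("automaticRestart"),
    }
    timeout = scheduling.get("localSsdRecoveryTimeout")
    if timeout is not None:
        # Seconds arrive as a string; the setting is in whole hours.
        summary["localSsdRecoveryHours"] = int(timeout["seconds"]) // 3600
    return summary


response = json.loads("""
{
  "scheduling": {
    "onHostMaintenance": "MIGRATE",
    "automaticRestart": true,
    "preemptible": false,
    "provisioningModel": "STANDARD",
    "localSsdRecoveryTimeout": {"seconds": "10800", "nanos": 0}
  }
}
""")
print(summarize_scheduling(response))
```

For the sample response, the 10800 seconds reported by the API correspond to a recovery timeout of 3 hours.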
View the host error timeout settings
View the current hostErrorTimeoutSeconds setting with a GET request to the beta instances.get method, because this option is only available in Preview.
GET https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME
Replace the following:
- PROJECT_ID: the project for the VM.
- ZONE: the zone where the VM is located.
- VM_NAME: the VM name.
In the output, the scheduling object includes the VM's host error detection timeout, for example:
"scheduling": {
  "hostErrorTimeoutSeconds": 120
}
What's next
- Learn more about host maintenance.
- Learn more about live migration.
- Learn how to detect a live migration event.