Amazon CloudWatch Health Check
To add an Amazon CloudWatch monitor or alarm as a health check, you'll need three pieces of information:
- Your AWS account ID.
- An AWS access key. If you need to create one, see Managing access keys for IAM users.
- The URL of the CloudWatch monitor or alarm you wish to use.
Note
The AWS integration requires the cloudwatch:DescribeAlarms permission when used with a CloudWatch health check.
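The cloudwatch:DescribeAlarms permission mentioned in the note above is typically granted through an IAM policy attached to the user that owns the access key. The following is a minimal sketch of such a policy, built in Python so it can be printed as JSON. The function name, the Sid value, and the wildcard Resource are illustrative assumptions, not values prescribed by Gremlin; narrow the resource scope if your security requirements call for it.

```python
import json


def build_describe_alarms_policy() -> dict:
    """Return a minimal IAM policy document that allows cloudwatch:DescribeAlarms.

    Assumption: the Sid and the "*" resource scope are placeholders chosen
    for illustration only.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowDescribeAlarmsForHealthChecks",  # hypothetical name
                "Effect": "Allow",
                "Action": ["cloudwatch:DescribeAlarms"],
                "Resource": "*",  # assumption: not scoped to specific alarms
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be pasted into the IAM console.
    print(json.dumps(build_describe_alarms_policy(), indent=2))
```

Running the script prints a JSON document that can be attached to the IAM user or role whose access key you enter during authentication.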
To add an Amazon CloudWatch health check:
- Open the Health Checks page in the Gremlin web app, click + Health Check, then select AWS from the Integrations drop-down.
- If you've already authenticated AWS, continue to step 3. Otherwise, follow these steps:
  - Enter your AWS account ID in the appropriate text box.
  - In the Authentication section, enter your AWS Access Key ID and AWS Secret Access Key in the respective boxes.
  - You can optionally change the Authentication Endpoint, e.g. to test authenticating with a different AWS region. In most cases, you can leave this as the default.
  - Click Authenticate Observability Tool. If the authentication passes, click Save Authentication. Otherwise, try changing your settings and retry authenticating.
  - Click Next.
- Confirm that your newly added AWS account is selected in the Health Checks drop-down, then click Next.
- Choose whether to base the health check on an Amazon CloudWatch alarm or on the AWS API. Using a CloudWatch alarm is the simpler, recommended method, while using the API gives you more control over the success criteria.
- If you're using a CloudWatch alarm:
  - Select Create a Health Check from an Amazon CloudWatch alarm URL.
  - Enter the URL of the alarm. To get the URL, open the alarm (or monitor) in a web browser, then copy and paste the URL into the text box.
  - Click Test Health Check to confirm that it works.
- If you're using the AWS API:
  - Enter a name for the health check in the Health Check Name box.
  - Enter the URL of the alarm. To get the URL, open the alarm (or monitor) in a web browser, then copy and paste the URL into the text box.
  - Click Test Connection to confirm that the health check works. This also shows the response code and body, which you can use to adjust your success criteria.
  - Adjust the Success Evaluation criteria to your needs. By default, Gremlin considers the check to be successful if it returns an HTTP 200 status code within 1000 milliseconds, and the .DescribeAlarmsResponse.DescribeAlarmsResult.MetricAlarms[0].StateValue field contains OK. You can change these values to fit your requirements or keep the defaults. Read adding success evaluation criteria for more information.
  - Click Test Evaluation to send another test request to your endpoint. This is to ensure the response meets your criteria.
- Click Create Health Check.
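To make the default success criteria from the AWS API steps above concrete, here is a minimal Python sketch of the same three checks: status code, response time, and the StateValue field. It is not Gremlin's implementation. The names lookup, is_healthy, and DEFAULT_PATH are hypothetical, the response body is assumed to be parsed JSON, and treating exactly 1000 milliseconds as passing is an assumption.

```python
from typing import Any

# Path equivalent to
# .DescribeAlarmsResponse.DescribeAlarmsResult.MetricAlarms[0].StateValue
DEFAULT_PATH = [
    "DescribeAlarmsResponse",
    "DescribeAlarmsResult",
    "MetricAlarms",
    0,
    "StateValue",
]


def lookup(body: Any, path: list) -> Any:
    """Walk a nested dict/list structure; return None if any step is missing."""
    current = body
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return None
    return current


def is_healthy(
    status_code: int,
    elapsed_ms: float,
    body: Any,
    expected_status: int = 200,
    max_ms: float = 1000.0,
    expected_text: str = "OK",
) -> bool:
    """Apply the three default criteria: status, latency, and field content."""
    if status_code != expected_status:
        return False
    if elapsed_ms > max_ms:  # assumption: exactly max_ms still passes
        return False
    value = lookup(body, DEFAULT_PATH)
    return isinstance(value, str) and expected_text in value


if __name__ == "__main__":
    sample = {
        "DescribeAlarmsResponse": {
            "DescribeAlarmsResult": {"MetricAlarms": [{"StateValue": "OK"}]}
        }
    }
    print(is_healthy(200, 120.0, sample))
```

A response whose first alarm reports ALARM or INSUFFICIENT_DATA, or a response with an empty MetricAlarms list, fails the check in this sketch, as does any response that is slow or returns a non-200 status.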