- Amazon API Gateway
- Description
Amazon API Gateway is an AWS service for creating, publishing, maintaining, monitoring, and securing REST, HTTP, and WebSocket APIs at any scale.
- Build to run service included in the OTC
- Build service pre-requisite
- Refer to generic description.
- Build to run service
- Refer to generic description.
- RUN services included in the MRC
- Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the API Gateway.
- This file can be executed with a CI/CD and the execution has been tested successfully.
- KPI & alerts
Monitoring
Yes
KPI monitored
Metrics supported for Amazon API Gateway:
- 4XXError
- 5XXError
- CacheHitCount
- CacheMissCount
- Count
- IntegrationLatency
- Latency
Alerts observed
- 4XXError
- 5XXError
- Latency
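As an illustrative sketch of how the observed alerts could be wired up, the following pure-Python helper builds the parameters one might pass to CloudWatch's PutMetricAlarm API (boto3: `cloudwatch.put_metric_alarm(**params)`). The API name and thresholds are assumptions, not contractual values.

```python
# Hypothetical sketch: alarm parameters for the three observed API Gateway
# alerts. Error counts use Sum; Latency uses Average over the period.

def api_gateway_alarm(api_name: str, metric: str, threshold: float) -> dict:
    """Build PutMetricAlarm keyword arguments for one API Gateway metric."""
    assert metric in {"4XXError", "5XXError", "Latency"}  # the observed alerts
    statistic = "Average" if metric == "Latency" else "Sum"
    return {
        "AlarmName": f"{api_name}-{metric}",
        "Namespace": "AWS/ApiGateway",
        "MetricName": metric,
        "Dimensions": [{"Name": "ApiName", "Value": api_name}],
        "Statistic": statistic,
        "Period": 300,              # evaluate over 5-minute windows
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }

alarms = [
    api_gateway_alarm("orders-api", "5XXError", 1),      # any server error
    api_gateway_alarm("orders-api", "Latency", 2000.0),  # ms, illustrative
]
```

The same dictionaries can be versioned in the Git referential file and applied by the CI/CD chain.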
- Backup and restore
Restore from Infra as Code. No native backup exists for this service.
- Disaster Recovery:
No native Disaster Recovery is available for this service.
- AWS SLA High Availability
The service is HA by design in a single AWS Region.
- Limitations & pre-requisite
Whenever the API is customized, there should be procedures provided by the customer describing how to monitor and troubleshoot the API.
- Charging model
Work Unit |
Per API |
Changes examples | Effort |
Modify API behavior | On quote |
Other changes | Estimation in tokens based on time spent |
- Elastic Load Balancing – Application Load Balancer (ALB)
- Description
An Application Load Balancer functions at the application layer, the seventh layer of the Open Systems Interconnection (OSI) model. After the load balancer receives a request, it evaluates the listener rules in priority order to determine which rule to apply, and then selects a target from the target group for the rule action.
- Build to run service included in the OTC
- Build service pre-requisite
- Refer to generic description.
- Build to run service
- Refer to generic description.
- RUN services included in the MRC
- Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Application Load Balancer (ALB).
- This file can be executed with a CI/CD and the execution has been tested successfully.
- KPI & alerts
Monitoring
Yes
KPI monitored
Metrics supported for the Application Load Balancer (ALB):
- ActiveConnectionCount
- ClientTLSNegotiationErrorCount
- ConsumedLCUs
- DesyncMitigationMode_NonCompliant_Request_Count
- DroppedInvalidHeaderRequestCount
- ForwardedInvalidHeaderRequestCount
- GrpcRequestCount
- HTTP_Fixed_Response_Count
- HTTP_Redirect_Count
- HTTP_Redirect_Url_Limit_Exceeded_Count
- HTTPCode_ELB_3XX_Count
- HTTPCode_ELB_4XX_Count
- HTTPCode_ELB_5XX_Count
- HTTPCode_ELB_500_Count
- HTTPCode_ELB_502_Count
- HTTPCode_ELB_503_Count
- HTTPCode_ELB_504_Count
- IPv6ProcessedBytes
- IPv6RequestCount
- NewConnectionCount
- NonStickyRequestCount
- ProcessedBytes
- RejectedConnectionCount
- RequestCount
- RuleEvaluations
Alerts observed:
- HTTPCode_ELB_3XX_Count
- HTTPCode_ELB_4XX_Count
- HTTPCode_ELB_5XX_Count (Backend down)
- HTTPCode_ELB_500_Count
- HTTPCode_ELB_502_Count
- HTTPCode_ELB_503_Count
- HTTPCode_ELB_504_Count
- RejectedConnectionCount
- Backup and restore
Data backup and restore
Can be exported from CI/CD Pipeline.
Service restore
The Continuous Deployment chain is used to redeploy the Elastic Load Balancing – Application Load Balancer from the configuration file of reference for production environment committed in the Git.
- AWS SLA High Availability and Disaster Recovery inter-region
The service is Highly Available by design by AWS.
There is no native Disaster Recovery for this service.
- Charging model
Work Unit |
Per Application Load Balancer |
- Changes catalogue – in Tokens, per act
Changes examples | Effort |
Add/modify Backend | 1 Token |
Certificate Installation | 2 Tokens |
Add a pool member (if members are static) | 2 Tokens |
Modify configuration (routing/Web Application Firewall) | Estimation in tokens based on time spent |
Add a listener | 2 Tokens |
Add a Target Group | 2 Tokens |
Modify listener rule (Re-writing URL) | Estimation in tokens based on time spent |
Other changes | Estimation in tokens based on time spent |
- Amazon Route 53
- Description
Amazon Route 53 is a highly available and scalable Domain Name System (DNS) web service. You can use Route 53 to perform three main functions in any combination: domain registration, DNS routing, and health checking.
- Build to run service included in the OTC
- Build service pre-requisite
- Refer to generic description.
- Build to run service
- Refer to generic description.
- RUN services included in the MRC
- Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of Route 53.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
- Co-manage option
For the Public Hosted Zones, OBS works with the customer on the public domain naming context.
For the Private Hosted Zones, a RACI must be done.
- Limitations & pre-requisite
For domain registration, we highly recommend that the customer handles the intellectual property aspects and registers the domain with the registrar of their choice. The domain hosting configuration on AWS will be handled by OBS.
OBS can act as the registrar.
The Customer must be the registrant.
- KPI & alerts
Monitoring
Yes
KPI monitored
Global metrics supported by Route 53:
- ChildHealthCheckHealthyCount
- ConnectionTime
- HealthCheckPercentageHealthy
- HealthCheckStatus
- SSLHandshakeTime
- TimeToFirstByte
Hosted Zone metrics supported by Route 53:
- DNSQueries
- DNSSECInternalFailure
- DNSSECKeySigningKeysNeedingAction
- DNSSECKeySigningKeyMaxNeedingActionAge
- DNSSECKeySigningKeyAge
Alerts observed
- Default: DNSQueries
- Optional: DNSSECKeySigningKeyMaxNeedingActionAge (to detect a security breach)
- Backup and restore
Data backup and restore
No data to backup.
Service restore
Recovery will be from Infra as Code.
- AWS SLA High Availability and Disaster Recovery inter-region
Route 53 is a native Global AWS service. The service is Highly Available by design by AWS. Disaster Recovery is native.
- Charging model
Work Unit |
Per Hosted Zone |
Changes examples | Effort |
Create / update Zone | 2 tokens |
Zone delegation* | 4 tokens |
Configure inbound and outbound resolvers | 8 tokens |
Other changes | Estimation in tokens based on time spent |
Zone Delegation*: Specification should be received as a prerequisite.
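The delegation specification received as a prerequisite could, for illustration, be turned into the ChangeBatch submitted to Route 53's ChangeResourceRecordSets API (boto3: `route53.change_resource_record_sets(HostedZoneId=..., ChangeBatch=...)`). The subdomain and name servers below are placeholders.

```python
# Hypothetical sketch: an NS record set that delegates a child zone to the
# name servers of its own hosted zone.

def delegation_change_batch(subdomain: str, name_servers: list) -> dict:
    """Build the ChangeBatch for a zone delegation (NS record UPSERT)."""
    fqdn = subdomain if subdomain.endswith(".") else subdomain + "."
    return {
        "Comment": f"Delegate {fqdn} to the child hosted zone",
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": fqdn,
                "Type": "NS",
                "TTL": 300,
                "ResourceRecords": [{"Value": ns} for ns in name_servers],
            },
        }],
    }

batch = delegation_change_batch(
    "app.example.com",
    ["ns-1.awsdns-00.org", "ns-2.awsdns-01.co.uk"],  # placeholder NS hosts
)
```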
- AWS Lambda
- Description
Lambda is a compute service that lets you run code without provisioning or managing servers. Lambda runs your code on a high-availability compute infrastructure and performs all of the administration of the compute resources, including server and operating system maintenance, capacity provisioning and automatic scaling, code monitoring and logging.
- Build to run service included in the OTC
- Build service pre-requisite
- Refer to generic description.
- Build to run service
- Refer to generic description.
- RUN services included in the MRC
- Run service pre-requisite
- A referential file exists in the Git including the reference code of the Lambda function.
- This file can be executed with a CI/CD and the execution has been tested successfully.
- Co-manage option
Yes, based on RACI determined during pre-sales or build jointly with the customer.
- KPI
KPIs monitored:
Invocation metrics
- Invocations
- Errors
- DeadLetterErrors
- DestinationDeliveryFailures
- Throttles
- ProvisionedConcurrencyInvocations
- ProvisionedConcurrencySpilloverInvocations
Performance metrics
- Duration
- PostRuntimeExtensionsDuration
- IteratorAge
- OffsetLag
Concurrency metrics
- ConcurrentExecutions
- ProvisionedConcurrentExecutions
- ProvisionedConcurrencyUtilization
- UnreservedConcurrentExecutions
Alerts Observed:
Errors
Customized alerting can be added as an option based on customer needs.
- Backup and restore
Data backup and restore
Backup is not used by default.
Service restore
By default, the Lambda function code in the Git is the referential, and the Continuous Deployment chain workflow is used to deploy it. Should a problem occur on a function, the Continuous Deployment chain is used to redeploy the function from the version of reference in the Git.
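Under the assumption that the CI/CD chain stores its build artifacts in S3 keyed by commit, the redeploy of the reference version could boil down to one call to Lambda's UpdateFunctionCode API (boto3: `lambda_client.update_function_code(**payload)`). The bucket layout and names here are illustrative.

```python
# Hypothetical sketch: build the UpdateFunctionCode payload that points a
# function at the artifact built from the Git reference commit.

def redeploy_payload(function_name: str, artifact_bucket: str, commit_sha: str) -> dict:
    """Arguments for UpdateFunctionCode, keyed to a Git reference commit."""
    return {
        "FunctionName": function_name,
        "S3Bucket": artifact_bucket,
        "S3Key": f"lambda/{function_name}/{commit_sha}.zip",  # assumed layout
        "Publish": True,  # publish a new immutable version on redeploy
    }

payload = redeploy_payload("billing-fn", "obs-artifacts", "a1b2c3d")
```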
- AWS SLA High Availability
The service is HA by design by AWS. The high availability of the business function depends on its design, its interfaces with other business functions and external services, and its dependencies on operating systems, middleware, databases, micro-services, Kubernetes services, big data services, and cloud-native services, mainly event-driven services (e.g. SNS, SQS, API Gateway, EventBridge, etc.).
- Disaster Recovery inter-region
Redeploy in another region from Infra as Code.
- Charging model
Work Unit |
Per package of 100 lines of Lambda function code |
Changes examples | Effort |
Activate / deactivate a function | 2 tokens |
Deploy a new Lambda version | 2 tokens |
Deploy a new Lambda | Estimation in tokens based on time spent |
Develop and deploy a new Lambda | Estimation in tokens based on time spent |
- Amazon Simple Storage Service (S3)
- Description
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
- Build to run service included in the OTC
- Build service pre-requisite
- Please refer to generic description
- Build to run service
- Build to run services for Simple Storage Service (S3) are necessary. They encompass the parameter settings, e.g. Intelligent-Tiering, encryption, versioning, access policies, etc. Optionally, if a recurring managed service using S3 has been requested, the build to run tasks will include the selection of KPIs to be observed and alerts to be set up based on KPI thresholds, or external calls to test the availability of the Simple Storage Service (S3). Please refer to the generic build to run description.
Running a managed Simple Storage Service (S3) is optional. Depending on the Customer's interest in monitoring the storage KPIs and in alerting based on KPIs, the Customer may request the service. By default, there is no recurring task proposed on Simple Storage Service (S3), only on-demand changes and on-demand investigations.
- Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Simple Storage Service (S3).
- This file can be executed with a CI/CD and the execution has been tested successfully.
- Co-manage option
Yes
- KPI & alerts
Monitoring
S3 is monitored through AWS Health Dashboard.
This service is also monitored through the other services using it (CloudFront, Lambda Function, etc.).
- Backup and restore
Data backup
Optional: Simple Storage Service is a highly available service. Backup is done only through replication in another region. Replication can be added as an option based on customer needs.
If the customer requests the replication in another region, it will imply additional fees (network transfer and additional storage).
Service restore
Optional: subject to customer having ordered replication in another region.
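The optional cross-region replication described above could, as an illustrative sketch, be expressed as the configuration passed to S3's PutBucketReplication API (boto3: `s3.put_bucket_replication(Bucket=..., ReplicationConfiguration=...)`). The role ARN and bucket names are placeholder assumptions.

```python
# Hypothetical sketch: a whole-bucket replication rule to a bucket in
# another region (V2 replication configuration shape).

def replication_config(role_arn: str, destination_bucket_arn: str) -> dict:
    """Replicate every new object to a bucket in another region."""
    return {
        "Role": role_arn,  # IAM role S3 assumes to perform the replication
        "Rules": [{
            "ID": "cross-region-backup",
            "Priority": 1,
            "Status": "Enabled",
            "Filter": {},  # empty filter: replicate the whole bucket
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": destination_bucket_arn},
        }],
    }

config = replication_config(
    "arn:aws:iam::123456789012:role/s3-replication",  # placeholder account
    "arn:aws:s3:::my-bucket-backup-eu-central-1",     # placeholder bucket
)
```

Network transfer and the additional storage in the destination region are the AWS fees mentioned above.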
- AWS SLA High Availability and Disaster Recovery inter-region
Yes, by default by AWS.
- Charging model
Work Unit |
Per S3 Bucket |
Changes examples | Effort |
Change Access Tier | 2 Tokens |
Other Changes | Estimation in tokens based on time spent |
- Amazon CloudFront
- Description
Amazon CloudFront is a web service that speeds up distribution of your static and dynamic web content, such as .html, .css, .js, and image files, to your users. CloudFront delivers your content through a worldwide network of data centers called edge locations. When a user requests content that you're serving with CloudFront, the request is routed to the edge location that provides the lowest latency (time delay), so that content is delivered with the best possible performance.
- Build to run service included in the OTC
- Build service pre-requisite
- Refer to generic description.
- Build to run service
- Refer to generic description.
- RUN services included in the MRC
- Run service pre-requisite
- A referential file exists in the Git including the reference configuration of CloudFront.
- This file can be executed with a CI/CD and the execution has been tested successfully.
- Co-manage option
Yes, based on RACI determined during pre-sales or build.
- KPI & alerts
Monitoring
Yes
We can optionally configure CloudFront to create log files that contain detailed information about every user request that CloudFront receives.
KPI monitored
- Requests
- Bytes downloaded
- Bytes uploaded
- 4xx error rate
- 5xx error rate
- Total error rate
Alerts observed
- Requests
- 4xx error rate
- 5xx error rate
- Total error rate
- Backup and restore
Data backup and restore
Can be exported from Infra as Code
Service restore
The Continuous Deployment chain is used to redeploy the CloudFront from the configuration file of reference for production environment committed in the Git.
- AWS SLA High Availability and Disaster Recovery
The service is Highly available by AWS (a global service).
- Charging model
Work Unit |
per CloudFront Distribution |
Changes examples | Effort |
Modify Origin | 2 Tokens |
Customize HTTP headers | 4 Tokens |
Modify cache Rules | Estimation in tokens based on time spent |
Specify cache and compression settings | Estimation in tokens based on time spent |
Specify the values to include in origin requests | Estimation in tokens based on time spent |
Specify the HTTP headers to add to viewer responses | Estimation in tokens based on time spent |
Other changes | Estimation in tokens based on time spent |
- AWS Key Management Service (KMS)
- Description
AWS Key Management Service (AWS KMS) is a managed service that makes it easy for you to create and control the cryptographic keys that are used to protect your data. AWS KMS uses hardware security modules (HSM) to protect and validate your AWS KMS keys under the FIPS 140-2 Cryptographic Module Validation Program, except in the China (Beijing) and China (Ningxia) Regions.
- Build to run service included in the OTC
- Build service pre-requisite
- Refer to generic description.
- Build to run service
- Refer to generic description.
- RUN services included in the MRC
- Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the AWS Key Management Service (AWS KMS).
- This file can be executed with a CI/CD and the execution has been tested successfully.
- KPI & alerts
KPIs monitored:
- SecondsUntilKeyMaterialExpiration
Alerts observed:
- SecondsUntilKeyMaterialExpiration, if the customer imports their own key material
- Backup and restore
Data backup and restore
There is no data to back up. Key durability is ensured by AWS (a roll-back on deletion can be set up: the deletion delay can be configured between 7 and 30 days).
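The 7-to-30-day deletion delay can be guarded before calling KMS's ScheduleKeyDeletion API (boto3: `kms.schedule_key_deletion(KeyId=..., PendingWindowInDays=...)`); a minimal sketch:

```python
# Hypothetical sketch: validate the pending-deletion window KMS accepts.

def deletion_window(days: int) -> int:
    """Return a valid PendingWindowInDays, rejecting out-of-range values."""
    if not 7 <= days <= 30:
        raise ValueError("KMS deletion delay must be between 7 and 30 days")
    return days
```

Scheduling a deletion with the maximum window (30 days) leaves the longest roll-back period.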
- AWS SLA High Availability and Disaster Recovery inter-region
High Availability is supported by AWS for this service.
- Security
Security recommendations can be part of an optional security scope of work based on customer request.
By default, the MRC does not cover security recommendations.
- Charging model
Work Unit |
Per KMS key |
Changes examples | Effort |
Add/remove key | 1 token |
Configure access policy | 2 tokens |
Configure AWS native services to use key KMS | Estimation in tokens based on time spent |
Other changes | Estimation in tokens based on time spent |
- Elastic Load Balancing – Network Load Balancer (NLB)
- Description
A Network Load Balancer functions at the fourth layer of the Open Systems Interconnection (OSI) model. It can handle millions of requests per second. After the load balancer receives a connection request, it selects a target from the target group for the default rule. It attempts to open a TCP connection to the selected target on the port specified in the listener configuration.
- Build to run service included in the OTC
- Build service pre-requisite
- Refer to generic description.
- Build to run service
- Refer to generic description.
- RUN services included in the MRC
- Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Elastic Load Balancing – Network Load Balancer (NLB).
- This file can be executed with a CI/CD and the execution has been tested successfully.
- Co-manage option
No, OBS manages the Network Load Balancer.
- KPI & alerts
Monitoring
Yes
KPI monitored
- ActiveFlowCount
- ActiveFlowCount_TCP
- ActiveFlowCount_TLS
- ActiveFlowCount_UDP
- ClientTLSNegotiationErrorCount
- ConsumedLCUs
- ConsumedLCUs_TCP
- ConsumedLCUs_TLS
- ConsumedLCUs_UDP
- HealthyHostCount
- NewFlowCount
- NewFlowCount_TCP
- NewFlowCount_TLS
- NewFlowCount_UDP
- PeakBytesPerSecond
- PeakPacketsPerSecond
- ProcessedBytes
- ProcessedBytes_TCP
- ProcessedBytes_TLS
- ProcessedBytes_UDP
- ProcessedPackets
- TargetTLSNegotiationErrorCount
- TCP_Client_Reset_Count
- TCP_ELB_Reset_Count
- TCP_Target_Reset_Count
- UnHealthyHostCount
Alerts observed
UnHealthyHostCount (correlated with an EC2 instance Down)
- Backup and restore
Not applicable. The load balancer does not store data.
Service restore
The Continuous Deployment chain is used to redeploy the Load Balancer from the configuration file of reference for production environment committed in the Git.
- AWS SLA High Availability and Disaster Recovery inter-region
The high availability is ensured by AWS.
There is no native DR for this service.
Maintaining a cross-region Disaster Recovery requires a specific design and is subject to specific additional charging.
- Charging model
Work Unit |
Per Network Load Balancer instance |
Changes examples | Effort | Impact on MRC |
Add a pool member (if members are static) | 1 token | |
Add a listener | 2 tokens | |
Add a Target Group | 4 tokens | |
Other changes | Estimation in tokens based on time spent |
- Security Groups
- Description
A security group acts as a virtual firewall for your instance to control inbound and outbound traffic. When you launch an instance in a VPC, you can assign up to five security groups to the instance. Security groups act at the instance level, not the subnet level. Therefore, each instance in a subnet in your VPC can be assigned to a different set of security groups.
At the basic level, managing a security group consists of building, deploying, and maintaining its Infra as Code and managing the changes.
The management of Security Groups is included as part of a larger bundle of Network and Security Managed services which provides network and security design, maintenance, network watching, intrusion detection, troubleshooting depending on an agreed Scope of Work.
Optionally, AWS Config can be used on demand with extra fees.
- Backup and restore
Data backup and restore
Can be exported from Infra as Code.
Work Unit | OTC & MRC |
Network and security management services | Custom, depending on agreed Scope of Work |
The OCB team has to validate the rules for security compliance before they are applied.
Example: for the frontend we can open port 80 or 443 (1 token); for the backend we open port 3306 (1 token).
Changes examples | Effort |
Add / modify / delete Security group (up to 5 rules) excluding dependencies* | 2 tokens |
Add / modify / delete Security rules (up to 5 rules) excluding dependencies* | 1 token |
Other changes | Estimation in tokens based on time spent |
*Dependencies include all triggered applications like, EC2, AWS Firewall, AWS DB services and other native services.
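The "up to 5 rules" pricing above can be sketched as a small helper; the port numbers mirror the frontend/backend example and the effort strings follow the catalogue values:

```python
# Hypothetical sketch: price a security-rule change request in tokens.
# Anything beyond five rules falls back to time-spent estimation, as in
# the changes catalogue.

def change_effort(rules: list) -> str:
    """Return the catalogue effort for a batch of security-rule changes."""
    if len(rules) <= 5:
        return "1 token"
    return "estimation based on time spent"

# Illustrative rule batches from the example above.
frontend = [{"port": 443, "protocol": "tcp", "direction": "ingress"}]
backend = [{"port": 3306, "protocol": "tcp", "direction": "ingress"}]
```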
1.10 Amazon Elastic Compute Cloud (EC2) and OS
The Managed Service for EC2 is called Managed OS. OBS manages both the OS and the EC2.
Amazon Elastic Compute Cloud (Amazon EC2) provides scalable computing capacity in the Amazon Web Services (AWS) Cloud. Using Amazon EC2 eliminates your need to invest in hardware up front, so you can develop and deploy applications faster. You can use Amazon EC2 to launch as many or as few virtual servers as you need, configure security and networking, and manage storage. Amazon EC2 enables you to scale up or down to handle changes in requirements or spikes in popularity, reducing your need to forecast traffic.
- Build to run service included in the OTC
- Build service pre-requisite
- Refer to generic description.
- Build to run service
- Refer to generic description.
- Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the EC2.
- This file can be executed with a CI/CD and the execution has been tested successfully.
- KPI & alerts
KPI monitored for Instances:
- CPUUtilization
- DiskReadOps
- DiskWriteOps
- DiskReadBytes
- DiskWriteBytes
- MetadataNoToken
- NetworkIn
- NetworkOut
- NetworkPacketsIn
- NetworkPacketsOut
Other metrics can be collected from CloudWatch agent (Deployed in each EC2 instance) like MemoryUsage and DiskUsage.
Alerts observed:
Alert on CPU, MemoryUsage and DiskUsage.
Alerts also on Status Checks: SystemStatusChecks and InstanceStatusChecks (aggregated in StatusCheckMetric)
Optional: Depending on the criticality of the application, we may activate Detailed monitoring. With Basic monitoring, data is available automatically in 5-minute periods. With Detailed monitoring, data is available in 1-minute periods. To get this level of data, you must specifically enable it for the instance.
Activating Detailed monitoring will be charged by AWS.
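The difference between Basic and Detailed monitoring is simply the metric period; a minimal sketch of the resulting datapoint density (Detailed monitoring itself is enabled per instance, e.g. via EC2's MonitorInstances API, boto3: `ec2.monitor_instances(InstanceIds=[...])`):

```python
# Hypothetical sketch: datapoints CloudWatch stores per metric per hour
# at each monitoring level (Basic: 5-minute periods, Detailed: 1-minute).

def datapoints_per_hour(level: str) -> int:
    """Return the number of datapoints per metric per hour."""
    period_seconds = {"basic": 300, "detailed": 60}[level]
    return 3600 // period_seconds
```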
- OS patching
AWS Systems Manager Patch Manager
For managed OS, OBS leverages AWS Systems Manager Patch Manager for the patching of the Operating System (OS).
Behavior: With AWS Systems Manager Patch Manager, patches are selected by Amazon, and all mandatory patches are applied to the EC2 instances, for both Windows and Linux.
Additional reporting can be requested by the customer, and extra fees will be charged.
- Antivirus
For managed OS, OBS leverages its central anti-virus system based on Sophos. This requires the installation of the anti-virus agent on the OS of each EC2 instance, as well as the VPN connectivity to the OBS Centralized Administration Zone. OBS systems allow for central reporting on malware from the backend console system.
Should the Customer prefer to keep its own antivirus system, OBS shall not be held responsible for protection against viruses.
- Backup and restore
Data backup and restore
By default, OBS leverages AWS Backup on the EC2 instances for Managed OS. The AWS Backup pattern and the retention period shall be agreed with the Customer prior to the RUN. The first backup is full; the following backups are incremental. The backup frequency is configurable, for example: 1 full backup per week and 1 incremental backup per day per EC2 instance. The retention period depends on the customer request. AWS charges will be calculated based on the change rate.
Restores of EC2 instances are performed from the backup.
- In case of incident, the latest backup can be restored.
- Upon change request, a previous backup version can be restored.
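The agreed backup pattern could be sketched as AWS Backup plan rules; the schedule times and the 35-day retention below are illustrative values, to be agreed with the Customer.

```python
# Hypothetical sketch: one weekly rule plus one daily rule, expressed as
# the rule dictionaries of an AWS Backup plan (ScheduleExpression uses
# AWS cron syntax). Snapshots after the first full one are incremental.

def backup_rules(retention_days: int) -> list:
    """Weekly plus daily backup rules with a common retention period."""
    return [
        {
            "RuleName": "weekly",
            "ScheduleExpression": "cron(0 3 ? * SUN *)",  # Sundays 03:00 UTC
            "Lifecycle": {"DeleteAfterDays": retention_days},
        },
        {
            "RuleName": "daily",
            "ScheduleExpression": "cron(0 3 * * ? *)",    # every day 03:00 UTC
            "Lifecycle": {"DeleteAfterDays": retention_days},
        },
    ]

rules = backup_rules(35)  # 35-day retention, illustrative
```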
- AWS SLA High Availability and Disaster Recovery inter-region
Service is Highly Available within a single Availability Zone.
Multi-Availability Zones design requires specific design and subject to a specific additional charging.
This service is covered by AWS Backup which enables the creation of backup copies across AWS Regions.
If this option is activated, traffic between regions and storage will be charged by Amazon.
- Administration tasks tracing
Actions performed by OBS managed teams on the managed OS are done from the OBS Administration Zone through an access controlled by a CyberArk bastion. The OBS CyberArk bastion protects the access and keeps a trace of the actions performed by the maintenance team, allowing for audit.
The VPN connectivity to the OBS Administration Zone is necessary for the management.
- Login on to the Virtual Machine
For Windows OS based EC2, access shall be granted by the Customer to OBS managed application operations staff through a domain account configured with proper privilege groups.
For Linux OS based EC2, an encrypted key is created and provided to OBS managed application operations staff to log onto the VM.
For applications, in the case of a managed application, a secret stored in a safe is used.
- Logs
Log management is not included in the managed OS / managed EC2 service.
Optionally it can be activated through AWS CloudWatch Logs through Change Request process.
- Security
By default, the MRC includes the use of security policies and groups as per customer’s configuration request.
The MRC does not cover security recommendations. Security recommendations can be part of an optional security scope of work based on customer request.
- Limitations
Managed Application services are provided only for OS versions supported by the CSP vendor.
- Charging model
Work Unit |
Per EC2 Instance |
Changes examples | Effort |
Create an EC2 Instance | 4 Tokens |
Create an EC2 instance integrated with AD | 2 Tokens |
Modify/delete Security Groups | 2 Tokens |
Extend an existing volume | 3 Tokens |
Attach a new volume | 2 Tokens |
Modify EC2 | Estimation in tokens based on time spent |
Delete EC2 | 2 Tokens |
Start/Stop/Restart EC2 | 1 Token |
Other changes | Estimation in tokens based on time spent |
- AWS Web Application Firewall (WAF)
- Description
AWS Web Application Firewall (WAF) provides centralized protection of your web applications from common exploits and vulnerabilities. Web applications are increasingly targeted by malicious attacks that exploit commonly known vulnerabilities. SQL injection and cross-site scripting are among the most common attacks.
- Build to run service included in the OTC
- Build service pre-requisite
- Refer to generic description.
- Build to run service
- Refer to generic description.
- RUN services included in the MRC
- Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
- Co-manage option
No, OBS manages the WAF.
The customer provides the WAF rules to OBS Team who will review, configure and apply them.
- KPI & alerts
Monitoring
Yes
KPI monitored
- AllowedRequests
- BlockedRequests
- CountedRequests
- CaptchaRequests
- RequestsWithValidCaptchaToken
- PassedRequests
Alerts observed
BlockedRequests
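A rule handed to the OBS team for review could look like the following wafv2 rate-based blocking rule (the JSON shape used inside a Web ACL, e.g. with boto3's `wafv2` client); the request limit and names are illustrative assumptions.

```python
# Hypothetical sketch: block any client IP exceeding `limit` requests per
# 5-minute window. Blocked requests feed the BlockedRequests metric above.

def rate_limit_rule(name: str, limit: int, priority: int) -> dict:
    """Build a wafv2 rate-based blocking rule for a Web ACL."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {"Limit": limit, "AggregateKeyType": "IP"},
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,  # surfaces BlockedRequests
            "MetricName": name,
        },
    }

rule = rate_limit_rule("rate-limit-per-ip", 2000, 1)  # 2000 is illustrative
```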
1.11.3.4 Reporting
By default, no. Reporting can be requested by customer through change request to have point in time report.
- Backup and restore
Data backup and restore
No persistent data to back up.
Service restore
The Continuous Deployment chain is used to redeploy the rules from the configuration file of reference for production environment committed in the Git.
- AWS SLA High Availability and Disaster Recovery inter-region
The service is Highly available by design by AWS.
WAF is a global AWS service. Disaster Recovery is native.
- Network and security managed services
Additional Network and Security Managed services might be added optionally depending on Scope of Work.
- Charging model
Work Unit |
Access Control list (ACL) |
Changes examples | Effort |
Add already existing rule | 2 tokens |
Modify/delete rules (up to 5) | 2 tokens |
Create a simple rule | Estimation in tokens based on time spent |
Create a complex rule | Estimation in tokens based on time spent |
Other changes | Estimation in tokens based on time spent |
- Amazon Elastic File System (EFS)
- Description
Amazon EFS provides simple, scalable, elastic file storage for use with compute instances on the AWS Cloud and on-premises servers.
- Build to run service included in the OTC
- Build service pre-requisite
- Please refer to generic description
- Build to run service
- Please refer to generic description.
- Co-manage option
Yes, based on RACI determined during pre-sales or build.
- KPI & alerts
Monitoring
Yes
KPI monitored
- PermittedThroughput
- ClientConnections
- StorageBytes
- TotalIOBytes
Alerts observed
StorageBytes
1.12.3.4 Reporting
By default, no. Reporting can be requested by customer through change request to have point in time report on the total storage size.
- Backup and restore
Data backup and restore
AWS Backup native service is used for backup. The backup is at block level: if the customer wants to restore a specific file, they must implement their own file-level backup solution.
Service restore
Restore from Backup
- AWS SLA High Availability and Disaster Recovery inter-region
Yes, by default by AWS. There is no native disaster recovery.
- Charging model
Work Unit |
per EFS filesystem |
Changes examples | Effort |
Change Access Tier | Estimation in tokens based on time spent |
Change permission policy (for encrypted file systems) | Estimation in tokens based on time spent |
Other Changes | Estimation in tokens based on time spent |
- AWS Elastic Beanstalk
- Description
AWS Elastic Beanstalk is an easy-to-use AWS service for deploying and managing applications developed with Python, Ruby, Java, .NET, PHP, Node.js, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS. Elastic Beanstalk reduces management complexity by automatically handling the details of capacity provisioning, load balancing, scaling, and application health monitoring.
- Build to run service included in the OTC
- Build service pre-requisite
- Please refer to generic description.
- Build to run service
- Please refer to generic description.
- Co-manage option
Yes, based on RACI determined during pre-sales or build.
- KPI & alerts
Monitoring
Yes
KPI monitored
- EnvironmentHealth
- InstancesSevere
- InstancesDegraded
- InstancesWarning
- InstancesInfo
- InstancesOk
- InstancesPending
- InstancesUnknown
- InstancesNoData
- ApplicationRequestsTotal
- ApplicationRequests5xx
- ApplicationRequests4xx
- ApplicationRequests3xx
- ApplicationRequests2xx
- ApplicationLatencyP10
- ApplicationLatencyP50
- ApplicationLatencyP75
- ApplicationLatencyP85
- ApplicationLatencyP90
- ApplicationLatencyP95
- ApplicationLatencyP99
- ApplicationLatencyP99.9
- InstanceHealth
Available metrics—Linux
- CPUIrq
- CPUIdle
- CPUUser
- CPUSystem
- CPUSoftirq
- CPUIowait
- CPUNice
- LoadAverage1min
- RootFilesystemUtil
Available metrics—Windows
- CPUIdle
- CPUUser
- CPUPrivileged
Alerts observed
Alert on InstanceHealth
Alert on CPUIdle and CPUUser for both Linux and Windows
1.13.3.4 Reporting
By default, no. Reporting can be requested by customer through change request to have point in time report.
- Backup and restore
Data backup and restore
There is no data to backup.
Service restore
Recovery will be from Infra as Code.
- AWS SLA High Availability and Disaster Recovery inter-region
Disaster Recovery is optional for this service. The service will be rebuilt in another region based on Infra as Code.
- Charging model
Work Unit |
per Web Application |
Changes examples | Effort |
Add a custom domain on an AWS Elastic Beanstalk | 3 Tokens |
Configure a connection string to access another resource | 2 Tokens |
Deploy a new version of an existing webapp | 2 Tokens |
Migrate an on-Premises webapp on AWS Elastic Beanstalk | Estimation in tokens based on time spent |
Move Elastic Beanstalk in another region | Estimation in tokens based on time spent |
Create and deploy a new webapp | Estimation in tokens based on time spent |
Other Changes | Estimation in tokens based on time spent |
- Amazon GuardDuty
- Description
Amazon GuardDuty is a threat detection service that continuously monitors your AWS accounts and workloads for malicious activity and delivers detailed security findings for visibility and remediation.
It is highly recommended to activate Amazon GuardDuty regardless of the AWS services used.
- Build to run service included in the OTC
- Build service pre-requisite
- Please refer to generic description
- Build to run service
- Please refer to generic description.
- Co-manage option
Yes, based on RACI determined during pre-sales or build.
- KPI & alerts
Amazon GuardDuty analyzes and processes the following Data sources: VPC Flow Logs, AWS CloudTrail management event logs, CloudTrail S3 data event logs, and DNS logs.
An action is taken if a threat is detected (trigger a Lambda function, event management, workflow logs).
By default, we will set up CloudWatch Events.
Optionally, a Lambda Function for automatic remediation can be requested by customer. The estimation will be based on time spent.
- Backup and restore
Data backup and restore
There is no native backup for this service.
Service restore
Recovery will be from Infra as Code.
- AWS SLA High Availability and Disaster Recovery inter-region
The service is highly available by design by AWS.
There is no native Disaster Recovery for this service.
- Charging model
Work Unit |
per security threat |
Changes examples | Effort |
Add Trusted IP List or threat list | 2 Tokens |
Modify/delete an existing rule | Estimation in tokens based on time spent |
Other Changes | Estimation in tokens based on time spent |
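The "Add Trusted IP List" change above could, for illustration, amount to the following payload for GuardDuty's CreateIpSet API (boto3: `guardduty.create_ip_set(DetectorId=..., **payload)`); the list name and S3 location are placeholders.

```python
# Hypothetical sketch: a plaintext trusted-IP list that GuardDuty will
# not generate findings for once activated.

def trusted_ip_set(name: str, s3_uri: str) -> dict:
    """Build the CreateIpSet payload for a trusted IP list."""
    return {
        "Name": name,
        "Format": "TXT",     # one CIDR per line in the list file
        "Location": s3_uri,  # where the list file is stored
        "Activate": True,
    }

payload = trusted_ip_set("corporate-egress", "s3://example-lists/trusted.txt")
```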
1.15 Amazon MQ
1.15.1 Description
Amazon MQ is a managed message broker service. A message broker allows software applications and components to communicate using various programming languages, operating systems, and formal messaging protocols. Currently, Amazon MQ supports Apache ActiveMQ and RabbitMQ engine types.
1.15.2 Build to run service included in the OTC
1.15.2.1 Build service pre-requisite
- Refer to generic description.
1.15.2.2 Build to run service
- Refer to generic description.
1.15.3 RUN services included in the MRC
1.15.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.15.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.15.3.3 KPI & alerts
ActiveMQ
Monitoring
Yes
KPI monitored
Amazon MQ for ActiveMQ metrics:
- BurstBalance
- CpuCreditBalance
- CpuUtilization
- CurrentConnectionsCount
- EstablishedConnectionsCount
- HeapUsage
- InactiveDurableTopicSubscribersCount
- JobSchedulerStorePercentUsage
- JournalFilesForFastRecovery
- JournalFilesForFullRecovery
- NetworkIn
- NetworkOut
- OpenTransactionCount
- StorePercentUsage
- TempPercentUsage
- TotalConsumerCount
- TotalMessageCount
- TotalProducerCount
- VolumeReadOps
- VolumeWriteOps
ActiveMQ destination (queue and topic) metrics:
- ConsumerCount
- EnqueueCount
- EnqueueTime
- ExpiredCount
- DispatchCount
- DequeueCount
- InFlightCount
- ReceiveCount
- MemoryUsage
- ProducerCount
- QueueSize
- TotalEnqueueCount
- TotalDequeueCount
Alerts observed
Specific ActiveMQ alerts: CpuUtilization and HeapUsage
Message Broker alerts: EnqueueCount and EnqueueTime
RabbitMQ
Monitoring
Yes
KPI monitored
RabbitMQ broker metrics:
- ExchangeCount
- QueueCount
- ConnectionCount
- ChannelCount
- ConsumerCount
- MessageCount
- MessageReadyCount
- MessageUnacknowledgedCount
- PublishRate
- ConfirmRate
- AckRate
RabbitMQ node metrics:
- SystemCpuUtilization
- RabbitMQMemLimit
- RabbitMQMemUsed
- RabbitMQDiskFreeLimit
- RabbitMQDiskFree
- RabbitMQFdUsed
RabbitMQ queue metrics:
- ConsumerCount
- MessageReadyCount
- MessageUnacknowledgedCount
- MessageCount
Alerts observed
Message Broker alerts: AckRate
Node alerts: SystemCpuUtilization, RabbitMQMemUsed and RabbitMQDiskFree
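The broker and node alerts above amount to threshold checks on the collected metrics. A minimal sketch, where the threshold values are illustrative assumptions, not OBS defaults:

```python
# Sketch: flagging Amazon MQ metrics that cross an alert threshold.
# Threshold values below are assumptions for illustration.

THRESHOLDS = {
    "CpuUtilization": 80.0,        # ActiveMQ broker CPU, percent
    "HeapUsage": 90.0,             # ActiveMQ JVM heap, percent
    "EnqueueTime": 1000.0,         # ms; messages waiting too long
    "SystemCpuUtilization": 80.0,  # RabbitMQ node CPU, percent
}

def breached(metrics: dict) -> list:
    """Return the metric names whose current value crosses its threshold."""
    return [name for name, value in metrics.items()
            if name in THRESHOLDS and value >= THRESHOLDS[name]]

print(breached({"CpuUtilization": 85.0, "EnqueueTime": 120.0}))
# ['CpuUtilization']
```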
1.15.3.4 Backup and restore
Data backup and restore
There is no data to back up.
Service restore
Recovery will be from Infra as Code. The messages in queue when the incident occurs won’t be recovered.
1.15.3.5 AWS SLA High Availability and Disaster Recovery inter-region
There is no native Disaster Recovery for this service.
Optionally, a Disaster Recovery design can be provided on request. We recommend activating this option for production workloads that require High Availability and message durability.
1.15.4 Charging model
Work Unit |
Per broker |
1.15.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Upgrade broker | 6 Tokens |
Reboot broker | 2 Tokens |
Change maintenance window | 1 Token |
Change Broker configuration* | Estimation in tokens based on time spent |
Other changes | Estimation in tokens based on time spent |
*Each broker has its own specific settings that can be configured.
1.16 Amazon Simple Notification Service (SNS)
1.16.1 Description
Amazon Simple Notification Service (Amazon SNS) is a managed service that provides message delivery from publishers to subscribers (also known as producers and consumers). Publishers communicate asynchronously with subscribers by sending messages to a topic, which is a logical access point and communication channel. Clients can subscribe to the SNS topic and receive published messages using a supported endpoint type, such as Amazon Kinesis Data Firehose, Amazon SQS, AWS Lambda, HTTP, email, mobile push notifications, and mobile text messages (SMS).
1.16.2 Build to run service included in the OTC
1.16.2.1 Build service pre-requisite
- Refer to generic description.
1.16.2.2 Build to run service
- Refer to generic description.
1.16.3 RUN services included in the MRC
1.16.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.16.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.16.3.3 KPI & alerts
Monitoring
Optional
KPI monitored
- NumberOfMessagesPublished
- NumberOfNotificationsDelivered
- NumberOfNotificationsFailed
- NumberOfNotificationsFilteredOut
- NumberOfNotificationsFilteredOut-InvalidAttributes
- NumberOfNotificationsFilteredOut-NoMessageAttributes
- NumberOfNotificationsRedrivenToDlq
- NumberOfNotificationsFailedToRedriveToDlq
- PublishSize
- SMSMonthToDateSpentUSD
- SMSSuccessRate
Alerts observed
Alert on NumberOfNotificationsFailed.
Optionally, other alerts could be observed. The selection of these additional alerts depends on the Application’s requirements.
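A hedged sketch of how the NumberOfNotificationsFailed alert above could be derived from the delivery metrics: the failure rate is computed from delivered and failed counts, and the 1% threshold is an assumption for illustration.

```python
# Sketch: deriving an SNS delivery-failure alert from the metrics above.
# The 1% failure-rate threshold is an illustrative assumption.

def failure_rate(delivered: int, failed: int) -> float:
    """Fraction of notifications that failed, 0.0 when nothing was sent."""
    total = delivered + failed
    return failed / total if total else 0.0

def should_alert(delivered: int, failed: int, threshold: float = 0.01) -> bool:
    return failure_rate(delivered, failed) > threshold

print(should_alert(delivered=9_900, failed=150))  # True (~1.5% > 1%)
```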
1.16.3.4 Backup and restore
Data backup and restore
There is no data to back up.
Service restore
Recovery will be from Infra as Code.
1.16.3.5 AWS SLA High Availability and Disaster Recovery inter-region
The service is Highly Available by default by AWS.
1.16.4 Charging model
Work Unit |
Per SNS topic |
1.16.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Adding a subscription | 1 Token |
Other changes | Estimation in tokens based on time spent |
1.17 Amazon Simple Queue Service (SQS)
1.17.1 Description
Amazon Simple Queue Service (Amazon SQS) offers a secure, durable, and available hosted queue that lets you integrate and decouple distributed software systems and components. Amazon SQS offers common constructs such as dead-letter queues and cost allocation tags.
1.17.2 Build to run service included in the OTC
1.17.2.1 Build service pre-requisite
- Refer to generic description.
1.17.2.2 Build to run service
- Refer to generic description.
1.17.3 RUN services included in the MRC
1.17.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.17.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.17.3.3 KPI & alerts
Monitoring
Optional
KPI monitored
- ApproximateAgeOfOldestMessage
- ApproximateNumberOfMessagesDelayed
- ApproximateNumberOfMessagesNotVisible
- ApproximateNumberOfMessagesVisible
- NumberOfEmptyReceives
- NumberOfMessagesDeleted
- NumberOfMessagesReceived
- NumberOfMessagesSent
- SentMessageSize
Alerts observed
Alert on ApproximateAgeOfOldestMessage.
Optionally, other alerts could be observed. The selection of these additional alerts depends on the Application’s requirements.
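The ApproximateAgeOfOldestMessage alert above is in essence a consumer-lag check. A minimal sketch, where the 15-minute limit is an assumed application requirement:

```python
# Sketch: alerting when the oldest SQS message has waited too long,
# i.e. consumers are stalled or falling behind.
# The 15-minute limit is an illustrative assumption.

MAX_AGE_SECONDS = 15 * 60

def queue_is_stalled(approximate_age_of_oldest_message: int) -> bool:
    """True when the oldest message exceeds the allowed consumer lag."""
    return approximate_age_of_oldest_message > MAX_AGE_SECONDS

print(queue_is_stalled(1200))  # True: 20 minutes exceeds the 15-minute limit
```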
1.17.3.4 Backup and restore
Data backup and restore
There is no data to back up.
Service restore
Recovery will be from Infra as Code.
1.17.3.5 AWS SLA High Availability and Disaster Recovery inter-region
The service is Highly Available by default by AWS. There is no native Disaster Recovery.
1.17.4 Charging model
Work Unit |
Per SQS Queue |
1.17.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Adding a subscription | 1 Token |
Other changes | Estimation in tokens based on time spent |
1.18 Amazon CloudWatch – basic monitoring with class 2 transition
1.18.1 Description
Amazon CloudWatch is a monitoring and observability service. CloudWatch provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, and optimize resource utilization. CloudWatch collects monitoring and operational data in the form of logs, metrics, and events. You get a unified view of operational health and gain complete visibility of your AWS resources, applications, and services running on AWS and on-premises. You can use CloudWatch to detect anomalous behavior in your environments, set alarms, visualize logs and metrics side by side, take automated actions, troubleshoot issues, and discover insights to keep your applications running smoothly.
1.18.2 Build to run service included in the OTC
1.18.2.1 Build to run service pre-requisite
The pre-requisite to Amazon CloudWatch basic monitoring with class 2 transition is that it has been configured by the Customer, including:
- Resources monitored
- CloudWatch Agent deployed on the resources when applicable
- Metrics and alerts forwarded to Amazon CloudWatch
- Performance dashboards using CloudWatch Dashboards
1.18.2.2 Build to run service
For Amazon CloudWatch basic monitoring with class 2 transition, the build to run service included in the OTC consists of integrating the alerts into the OBS supervision backend.
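The integration step above essentially normalizes CloudWatch alarm state changes into supervision tickets. The sketch below assumes the alarm fields of CloudWatch's SNS alarm notification format (AlarmName, NewStateValue, NewStateReason); the ticket-side field names are assumptions, since the actual supervision backend schema is OBS-internal.

```python
# Sketch: turning a CloudWatch alarm state-change payload into a ticket
# record for a supervision backend. Ticket field names are assumptions.

def alarm_to_ticket(alarm: dict) -> dict:
    """Normalize a CloudWatch alarm notification into a ticket record."""
    return {
        "summary": f"{alarm['AlarmName']}: {alarm['NewStateValue']}",
        "detail": alarm.get("NewStateReason", ""),
        "open": alarm["NewStateValue"] == "ALARM",  # OK/INSUFFICIENT_DATA close
    }

ticket = alarm_to_ticket({
    "AlarmName": "cpu-high",
    "NewStateValue": "ALARM",
    "NewStateReason": "Threshold crossed",
})
print(ticket["summary"])  # cpu-high: ALARM
```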
1.18.3 RUN services included in the MRC
1.18.3.1 Run service pre-requisite
- The resource monitored is in the inventory Scope of Work of managed services: infrastructure resource, middleware resource, application resource, database resource, Kubernetes cluster resource, microservice resource, etc.
- A referential file exists in the Git including the reference configuration of Amazon CloudWatch.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.18.3.2 KPI & alerts
Monitoring
Yes
Alerts observed
- Alerts defined in Amazon CloudWatch for resources in the Scope of Work of managed services.
1.18.3.3 Monitoring service
As part of the Application basic monitoring service, OBS operations will monitor the alerts, raise tickets and inform the Customer of incidents. The basic service excludes incident remediation.
1.18.3.4 Backup and restore
Backup and restore of Amazon CloudWatch: N/A
Service restore of Amazon CloudWatch: The configuration of Amazon CloudWatch can be recovered from Infrastructure-as-code if its configuration has been done through infrastructure as code.
Backup and restore of resources monitored by Amazon CloudWatch: N/A
Restore from IaC for resources monitored by Amazon CloudWatch: N/A
1.18.3.5 Limitations & pre-requisite
The Amazon CloudWatch basic monitoring service is monitoring only.
1.18.4 Charging model
Work Unit |
Per managed resource |
1.18.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Other changes | Estimation in tokens based on time spent |
1.19 AWS Backup – basic backup with class 2 transition
1.19.1 Description
AWS Backup enables you to centralize and automate data protection across AWS services and hybrid workloads. AWS Backup offers a cost-effective, fully managed, policy-based service that further simplifies data protection at scale. AWS Backup also helps you support your regulatory compliance or business policies for data protection. Together with AWS Organizations, AWS Backup enables you to centrally deploy data protection policies to configure, manage, and govern your backup activity across your organization’s AWS accounts and resources, including Amazon Elastic Compute Cloud (Amazon EC2) instances, Amazon Elastic Block Store (Amazon EBS) volumes, Amazon Simple Storage Service (Amazon S3) buckets, Amazon Relational Database Service (Amazon RDS) databases (including Amazon Aurora clusters), Amazon DynamoDB tables, Amazon Neptune databases, Amazon DocumentDB (with MongoDB compatibility) databases, Amazon Elastic File System (Amazon EFS) file systems, Amazon FSx for Lustre file systems, Amazon FSx for Windows File Server file systems, and AWS Storage Gateway volumes, and VMware workloads on premises and in VMware CloudTM on AWS.
1.19.2 Build to run service included in the OTC
1.19.2.1 Build to run service pre-requisite
The pre-requisite to AWS Backup – basic backup with class 2 transition is that it has been configured by the Customer, including:
- Resources backed up when applicable
- Backup configured and VSS deployed when applicable
- Metrics and alerts on backup status forwarded to Amazon CloudWatch
1.19.2.2 Build to run service
For AWS Backup with class 2 transition, the build to run service included in the OTC consists of integrating the alerts on backup status into the OBS supervision backend.
1.19.3 RUN services included in the MRC
1.19.3.1 Run service pre-requisite
- The resource backed-up is in the inventory Scope of Work of managed services supported by AWS Backup
- A referential file exists in the Git including the reference configuration of AWS Backup.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.19.3.2 KPI & alerts
Monitoring
Yes
Job status monitored KPIs
- CREATED
- PENDING
- RUNNING
- ABORTED
- COMPLETED
- FAILED
- EXPIRED
Alerts observed
- Alerts on ABORTED and FAILED for each resource supported by AWS Backup
1.19.3.3 Backup service
As part of the AWS Backup service, OBS operations will monitor the alerts related to backup status, raise tickets and inform the Customer of incidents. The basic service excludes data recovery. Data recovery is requested through a change request.
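The alerting rule above (raise a ticket only on ABORTED and FAILED jobs, out of the full job-state list) can be sketched as a simple filter; the job-record shape is an assumption:

```python
# Sketch: filtering AWS Backup job states down to those that raise an
# alert (ABORTED, FAILED), as per the run service description above.
# The job-record dict shape is an assumption for illustration.

ALERTING_STATES = {"ABORTED", "FAILED"}

def jobs_to_alert(jobs: list) -> list:
    """Return (resource, state) pairs for jobs that must raise a ticket."""
    return [(j["resource"], j["state"]) for j in jobs
            if j["state"] in ALERTING_STATES]

print(jobs_to_alert([
    {"resource": "vol-1", "state": "COMPLETED"},
    {"resource": "db-2", "state": "FAILED"},
]))  # [('db-2', 'FAILED')]
```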
1.19.3.4 Backup and restore
Backup and restore of AWS Backup: N/A
Service restore of AWS Backup: The configuration of AWS Backup can be recovered from Infrastructure-as-code if its configuration has been done through infrastructure as code.
1.19.3.5 Limitations & pre-requisite
AWS Backup native service is used for backup. The backup provided is block-level: to restore a specific file, the customer needs to implement their own file-level backup solution.
1.19.4 Charging model
Work Unit |
Per managed resource |
1.19.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Change the configuration of backup plan | 3 Tokens |
Other changes | Estimation in tokens based on time spent |
1.20 Amazon Elastic Container Service (ECS)
1.20.1 Description
Amazon Elastic Container Service (ECS) is a highly scalable, high-performance container management service that supports Docker containers and allows you to easily run applications on a managed cluster of Amazon Elastic Compute Cloud (Amazon EC2) instances. Amazon ECS eliminates the need for you to install, operate, and scale your own cluster management infrastructure. With simple API calls, you can launch and stop container-enabled applications, query the complete state of your cluster, and access many familiar features like security groups, Elastic Load Balancing, Amazon Elastic Block Store (EBS) volumes, and Identity and Access Management (IAM) roles. You can use Amazon ECS to schedule container placement across your cluster based on your resource needs and availability requirements. You can also integrate your own scheduler or third-party schedulers to meet business or application specific requirements.
1.20.2 Build to run service included in the OTC
1.20.2.1 Build service pre-requisite
- Refer to generic description.
1.20.2.2 Build to run service
- Refer to generic description.
1.20.3 RUN services included in the MRC
1.20.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.20.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.20.3.3 KPI & alerts
Monitoring
Optional
KPI monitored
- CPUReservation
- CPUUtilization
- MemoryReservation
- MemoryUtilization
- GPUReservation
When Amazon ECS runs containers on top of EC2 instances, EC2 metrics will be collected as well (Please refer to EC2 section).
When “EC2 launch type” is used with Linux container instances, the Amazon ECS container agent relies on Docker stats metrics to gather CPU and memory data for each container running on the instance. For burstable performance instances (T3, T3a, and T2 instances), the CPU utilization metric may reflect different data compared to instance-level CPU metrics.
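The cluster-level CPUReservation and CPUUtilization percentages above are defined by CloudWatch as reserved (respectively, used) CPU units over the CPU units registered in the cluster. A small sketch of that arithmetic, with illustrative numbers:

```python
# Sketch: how ECS cluster-level CPU percentages relate to raw CPU units,
# following the CloudWatch definitions (reserved or used units divided
# by the units registered by the container instances).

def cpu_reservation(reserved_units: int, registered_units: int) -> float:
    """Percentage of registered CPU units reserved by running tasks."""
    return 100.0 * reserved_units / registered_units

def cpu_utilization(used_units: int, registered_units: int) -> float:
    """Percentage of registered CPU units actually in use."""
    return 100.0 * used_units / registered_units

# A cluster of four 1024-unit instances (4096 registered units) running
# tasks that reserve 2048 units and currently use 1024:
print(cpu_reservation(2048, 4096))  # 50.0
print(cpu_utilization(1024, 4096))  # 25.0
```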
Alerts observed
- CPUReservation
- CPUUtilization
- MemoryReservation
- MemoryUtilization
- GPUReservation (optional, only if the application requires GPU)
Alerts for EC2 when Amazon ECS runs containers on top of EC2 instances (please refer to EC2 section)
1.20.3.4 Backup and restore
Data backup and restore
There is no data to back up.
Service restore
Recovery will be from Infra as Code.
1.20.3.5 AWS SLA High Availability and Disaster Recovery inter-region
When Amazon ECS runs containers on top of EC2, High Availability depends on the service configuration and is optional.
When Amazon ECS is used with Amazon Fargate, the service is natively Highly Available.
Disaster Recovery requires specific configuration and is optional.
1.20.4 Charging model
Work Unit |
Per Docker Image |
1.20.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Modify a container Version | 4 Tokens |
Adjust the CPU and Memory | 2 Tokens |
Create ECS cluster | 6 Tokens |
Task deployment | 4 Tokens |
Service deployment | 4 Tokens |
Other changes | Estimation in tokens based on time spent |
1.21 Elastic Container Registry (ECR)
1.21.1 Description
Amazon Elastic Container Registry (Amazon ECR) is an AWS managed container image registry service that is secure, scalable, and reliable.
1.21.2 Build to run service included in the OTC
1.21.2.1 Build service pre-requisite
- Refer to generic description.
1.21.2.2 Build to run service
- Refer to generic description.
1.21.3 RUN services included in the MRC
1.21.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.21.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.21.3.3 KPI & alerts
Monitoring
Optional
KPI monitored
- CallCount
Alerts observed
No alerts observed
1.21.3.4 Backup and restore
Data backup and restore
There is no native backup for this service. In case of an issue with ECR, all Docker images will be lost.
Service restore
Recovery will be from Infra as Code.
1.21.3.5 AWS SLA High Availability and Disaster Recovery inter-region
The service is highly available by design by AWS.
Disaster Recovery is optional. Setting up a multi-Region backup (synchronization with another Region) can be requested by the customer. This will have an impact on storage cost.
1.21.4 Charging model
Work Unit |
Per Docker Image |
1.21.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Set up access to image | 2 Tokens |
Other changes | Estimation in tokens based on time spent |
1.22 AWS Directory Service
1.22.1 Description
AWS Directory Service provides multiple ways to use Microsoft Active Directory (AD) with other AWS services. Directories store information about users, groups, and devices, and administrators use them to manage access to information and resources. AWS Directory Service provides multiple directory choices for customers who want to use existing Microsoft AD or Lightweight Directory Access Protocol (LDAP)–aware applications in the cloud. It also offers those same choices to developers who need a directory to manage users, groups, devices, and access.
1.22.2 Build to run service included in the OTC
1.22.2.1 Build service pre-requisite
- Refer to generic description.
1.22.2.2 Build to run service
- Refer to generic description.
1.22.3 RUN services included in the MRC
1.22.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.22.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.22.3.3 KPI & alerts
Monitoring
Yes
KPI monitored
- Active
- Creating
- Deleted
- Deleting
- Failed
- Impaired
- Inoperable
- Requested
- RestoreFailed
- Restoring
- Processor
- Memory
- Logical Disk
- Network Interface
- LDAP searches
- Binds
- DNS queries
- Directory reads
- Directory writes
Alerts
Alert on Failed, Impaired and Inoperable.
We will also trigger alerts on Processor and Memory.
Optionally, other alerts could be requested by the customer based on quote.
1.22.3.4 Backup and restore
Data backup and restore
Optionally, the integration of AD backup tools can be requested by the customer.
Service restore
Recovery will be from Infra as Code.
1.22.3.5 AWS SLA High Availability and Disaster Recovery inter-region
The service is highly available by design by AWS.
There is no native Disaster Recovery for this service.
Multi-Region replication is only supported for the Enterprise Edition of AWS Managed Microsoft AD. You can use automated multi-Region replication in all Regions where AWS Managed Microsoft AD is available.
1.22.4 Charging model
Work Unit |
Per AD |
1.22.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Modify the maintenance window | 1 Token |
Set trusted relationship | Estimation in tokens based on time spent |
Share Directory Service with another AWS account | 6 Tokens |
Other changes | Estimation in tokens based on time spent |
1.23 Amazon Cognito
1.23.1 Description
Amazon Cognito handles user authentication and authorization for your web and mobile apps. With user pools, you can easily and securely add sign-up and sign-in functionality to your apps. With identity pools (federated identities), your apps can get temporary credentials that grant users access to specific AWS resources, whether the users are anonymous or are signed in.
1.23.2 Build to run service included in the OTC
1.23.2.1 Build service pre-requisite
- Refer to generic description.
1.23.2.2 Build to run service
- Refer to generic description.
1.23.3 RUN services included in the MRC
1.23.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.23.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.23.3.3 KPI & alerts
Monitoring
Yes
KPI monitored
- SignUpSuccesses
- SignUpThrottles
- SignInSuccesses
- SignInThrottles
- TokenRefreshSuccesses
- TokenRefreshThrottles
- FederationSuccesses
- FederationThrottles
- CallCount
- ThrottleCount
Alerts observed
Alert on SignInThrottles
Optionally, other alerts can be observed.
1.23.3.4 Backup and restore
Data backup and restore
No native backup option is provided. Optionally, the customer can request automatic account backups.
Service restore
Recovery will be from Infra as Code for user pools and identity pools.
Optionally, a customized backup can be provided on quote.
If the customer chooses to activate this option, the infrastructure cost will be impacted.
1.23.3.5 AWS SLA High Availability and Disaster Recovery inter-region
The service is Highly available by design by AWS. There is no native Disaster Recovery.
Optionally, the customized backup solution can be implemented cross-Region.
1.23.4 Charging model
Work Unit |
Per Pool |
1.23.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Add IAM role for Cognito | 2 Tokens |
Create a User pool | 6 Tokens |
Create Identity pool | 6 Tokens |
Other changes | Estimation in tokens based on time spent |
1.24 Amazon DynamoDB
1.24.1 Description
Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database designed to run high-performance applications at any scale. DynamoDB offers built-in security, continuous backups, automated multi-Region replication, in-memory caching, and data export tools.
1.24.2 Build to run service included in the OTC
1.24.2.1 Build service pre-requisite
- Refer to generic description.
1.24.2.2 Build to run service
- Refer to generic description.
1.24.3 RUN services included in the MRC
1.24.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.24.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.24.3.3 KPI & alerts
Monitoring
Yes
Available metrics
- AccountMaxReads
- AccountMaxTableLevelReads
- AccountMaxTableLevelWrites
- AccountMaxWrites
- AccountProvisionedReadCapacityUtilization
- AccountProvisionedWriteCapacityUtilization
- AgeOfOldestUnreplicatedRecord
- ConditionalCheckFailedRequests
- ConsumedChangeDataCaptureUnits
- ConsumedReadCapacityUnits
- ConsumedWriteCapacityUnits
- FailedToReplicateRecordCount
- MaxProvisionedTableReadCapacityUtilization
- MaxProvisionedTableWriteCapacityUtilization
- OnlineIndexConsumedWriteCapacity
- OnlineIndexPercentageProgress
- OnlineIndexThrottleEvents
- PendingReplicationCount
- ProvisionedReadCapacityUnits
- ProvisionedWriteCapacityUnits
- ReadThrottleEvents
- ReplicationLatency
- ReturnedBytes
- ReturnedItemCount
- ReturnedRecordsCount
- SuccessfulRequestLatency
- SystemErrors
- TimeToLiveDeletedItemCount
- ThrottledPutRecordCount
- ThrottledRequests
- TransactionConflict
- UserErrors
- WriteThrottleEvents
Alerts observed
An alert on SystemErrors reveals an issue with a DynamoDB table.
An alert on UserErrors reveals an issue on the application side.
In case of multi-Region replication, an alert is set on ReplicationLatency.
Alerts on AccountProvisionedReadCapacityUtilization and AccountProvisionedWriteCapacityUtilization can explain application latency.
Optionally, other alerts can be observed.
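The alert-to-cause mapping described above can be sketched as a small triage table; the wording of the causes paraphrases the text, and the fallback is an assumption:

```python
# Sketch: mapping the DynamoDB alerts above to their likely root cause,
# as described in the text. The fallback message is an assumption.

CAUSES = {
    "SystemErrors": "issue with the DynamoDB table (service side)",
    "UserErrors": "issue on the application side",
    "ReplicationLatency": "multi-Region replication falling behind",
}

def likely_cause(alert_name: str) -> str:
    return CAUSES.get(alert_name, "unknown; investigate manually")

print(likely_cause("UserErrors"))  # issue on the application side
```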
1.24.3.4 Backup and restore
Data backup and restore
DynamoDB offers two methods to back up your table data. Continuous backups with point-in-time recovery (PITR) provide an ongoing backup of your table for the preceding 35 days. You can restore your table to the state of any specified second in the preceding five weeks. On-demand backups create snapshots of your table to archive for extended periods to help you meet corporate and governmental regulatory requirements.
- On demand backup: On-demand backup allows you to create full backups of your Amazon DynamoDB table at specified points in time. Recovery Point Objective will depend on the backup frequency chosen by the customer. This option is suitable for long-term retention and archival. It can help you to comply with regulatory requirements.
There are two options available for creating and managing DynamoDB on-demand backups:
- AWS Backup service
- DynamoDB
With AWS Backup, you can configure backup policies and monitor activity for your AWS resources and on-premises workloads in one place. Using DynamoDB with AWS Backup, you can copy your on-demand backups across AWS accounts and Regions, add cost allocation tags to on-demand backups, and transition on-demand backups to cold storage for lower costs.
DynamoDB charges for on-demand backups based on the storage size of the table (table data and local secondary indexes). The size of each backup is determined at the time of each backup request. The total backup storage size billed each month is the sum of all backups of DynamoDB tables. DynamoDB monitors the size of on-demand backups continuously throughout the month to determine your backup charges.
- Continuous Backup: With continuous backups, you can restore your AWS Backup-supported resource by rewinding it back to a specific time that you choose, within 1 second of precision (going back a maximum of 35 days). This built-in feature protects against accidental writes or deletes. Continuous backup works by first creating a full backup of your resource, and then constantly backing up your resource’s transaction logs. PITR restore works by accessing your full backup and replaying the transaction log to the time that you tell AWS Backup to recover.
The Recovery Point Objective (RPO) is close to zero.
DynamoDB charges for PITR based on the size of each DynamoDB table (table data and local secondary indexes) on which it is enabled. DynamoDB monitors the size of your PITR-enabled tables continuously throughout the month to determine your backup charges and continues to bill you until you disable PITR on each table.
Both options can be combined to obtain a longer retention period and minimize the RPO.
By default, we will set up on-demand backup.
Restore will be done from backup depending on the option chosen by the customer: full backup or point-in-time recovery.
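Before a PITR restore is requested, the target timestamp must fall inside the 35-day window described above. A minimal validation sketch under that assumption:

```python
# Sketch: validating a requested restore point against DynamoDB's PITR
# window (up to 35 days back, per the description above).

from datetime import datetime, timedelta

PITR_WINDOW = timedelta(days=35)

def restore_point_is_valid(requested: datetime, now: datetime) -> bool:
    """True when the requested point-in-time lies within the PITR window."""
    return now - PITR_WINDOW <= requested <= now

now = datetime(2024, 6, 1)
print(restore_point_is_valid(datetime(2024, 5, 10), now))  # True
print(restore_point_is_valid(datetime(2024, 4, 1), now))   # False: > 35 days
```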
Service restore
Recovery will be from Infra as Code.
1.24.3.5 AWS SLA High Availability and Disaster Recovery inter-region
DynamoDB automatically spreads the data and traffic for your tables over a sufficient number of servers to handle your throughput and storage requirements, while maintaining consistent and fast performance.
All your data is stored on solid-state disks (SSDs) and is automatically replicated across multiple Availability Zones in an AWS Region, providing built-in high availability and data durability.
You can use global tables to keep DynamoDB tables in sync across AWS Regions.
Within the same Region, High Availability is a built-in feature.
Across Regions, we will optionally use global tables to replicate tables.
1.24.4 Charging model
Work Unit |
per Dynamo DB table |
1.24.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Edit table capacity | 2 Tokens |
Update table class | 2 Tokens |
Create snapshot | 2 Tokens |
Delete table | Estimation in tokens based on table size |
Create item | Estimation in tokens based on number of rows |
Create index | Estimation in tokens based on table size |
Create replica | Estimation in tokens based on table size |
Other changes | Estimation in tokens based on time spent |
1.25 ElastiCache for Redis
1.25.1 Description
Amazon ElastiCache is a fully managed, in-memory caching service supporting flexible, real-time use cases. You can use ElastiCache for caching, which accelerates application and database performance, or as a primary data store for use cases that don’t require durability like session stores, gaming leaderboards, streaming, and analytics.
Built on open-source Redis and compatible with the Redis APIs, ElastiCache for Redis works with your Redis clients and uses the open Redis data format to store your data.
1.25.2 Build to run service included in the OTC
1.25.2.1 Build service pre-requisite
- Refer to generic description.
1.25.2.2 Build to run service
- Refer to generic description.
1.25.3 RUN services included in the MRC
1.25.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.25.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.25.3.3 KPI & alerts
Monitoring
Yes
Available metrics
- CPUUtilization
- CPUCreditBalance
- CPUCreditUsage
- FreeableMemory
- NetworkBytesIn
- NetworkBytesOut
- NetworkPacketsIn
- NetworkPacketsOut
- NetworkBandwidthInAllowanceExceeded
- NetworkConntrackAllowanceExceeded
- NetworkLinkLocalAllowanceExceeded
- NetworkBandwidthOutAllowanceExceeded
- NetworkPacketsPerSecondAllowanceExceeded
- SwapUsage
- ActiveDefragHits
- AuthenticationFailures
- BytesUsedForCache
- BytesReadFromDisk
- BytesWrittenToDisk
- CacheHits
- CacheMisses
- CommandAuthorizationFailures
- CacheHitRate
- CurrConnections
- CurrItems
- CurrVolatileItems
- DatabaseMemoryUsagePercentage
- DatabaseMemoryUsageCountedForEvictPercentage
- DB0AverageTTL
- EngineCPUUtilization
- Evictions
- GlobalDatastoreReplicationLag
- IsPrimary
- KeyAuthorizationFailures
- KeysTracked
- MemoryFragmentationRatio
- NewConnections
- NumItemsReadFromDisk
- NumItemsWrittenToDisk
- PrimaryLinkHealthStatus
- Reclaimed
- ReplicationBytes
- ReplicationLag
- SaveInProgress
Alerts observed
- CPUUtilization (Host Level)
- EngineCPUUtilization (Node Level) {analyze the load of the Redis process}
- Evictions (Node Level) {if the number is very high we need to increase maxmemory limit}
- CurrConnections (Node Level)
- ReplicationLag (Node Level)
- DatabaseMemoryUsagePercentage (Node Level)
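As an illustration of how the alerts above might be evaluated, the sketch below checks sampled metric values against thresholds. The threshold values are hypothetical examples only, not OBS defaults.

```python
# Illustrative sketch: evaluate sampled node-level metrics against alert
# thresholds. The threshold values below are hypothetical examples only.
THRESHOLDS = {
    "EngineCPUUtilization": 90.0,           # percent
    "DatabaseMemoryUsagePercentage": 80.0,  # percent
    "ReplicationLag": 30.0,                 # seconds
}

def raised_alerts(samples):
    """Return the names of metrics whose latest sample breaches its threshold."""
    return [name for name, limit in THRESHOLDS.items()
            if samples.get(name, 0.0) > limit]
```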
1.25.3.4 Backup and restore
Data backup and restore
ElastiCache for Redis offers two methods to back up your cluster data:
- Automatic Backup: For any Redis cluster, you can enable automatic backups. When automatic backups are enabled, ElastiCache creates a backup of the cluster on a daily basis. Automatic backups can help guard against data loss. In the event of a failure, you can create a new cluster, restoring your data from the most recent backup. The result is a warm-started cluster, preloaded with your data and ready for use. The minimum length for the backup window is 60 minutes. The maximum backup retention limit is 35 days.
- Manual backups: In addition to automatic backups, you can create a manual backup at any time. Unlike automatic backups, which are automatically deleted after a specified retention period, manual backups do not have a retention period after which they are automatically deleted. You must manually delete any manual backup. Even if you delete a cluster or node, any manual backups from that cluster or node are retained. Manual backups are useful for testing and archiving.
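The backup limits stated above (retention of 1 to 35 days, backup window of at least 60 minutes) can be captured in a small validation helper. This is a hypothetical sketch based only on the limits in this section; the function name and interface are illustrative.

```python
# Hypothetical helper validating an automatic-backup configuration against
# the limits described above: retention 1-35 days, window >= 60 minutes.
from datetime import datetime, timedelta

def backup_config_problems(retention_days, window_start, window_end):
    """Return a list of problems; an empty list means the config is acceptable."""
    problems = []
    if not 1 <= retention_days <= 35:
        problems.append("retention must be between 1 and 35 days")
    start = datetime.strptime(window_start, "%H:%M")
    end = datetime.strptime(window_end, "%H:%M")
    if end <= start:  # window wraps past midnight
        end += timedelta(days=1)
    if end - start < timedelta(minutes=60):
        problems.append("backup window must be at least 60 minutes")
    return problems
```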
Service restore
Recovery will be from Infra as Code.
1.25.3.5 AWS SLA High Availability and Disaster Recovery inter-region
Beginning with Redis version 3.2, you have the ability to create one of two distinct types of Redis clusters (API/CLI: replication groups). A Redis (cluster mode disabled) cluster always has a single shard (API/CLI: node group) with up to 5 read replica nodes. A Redis (cluster mode enabled) cluster has up to 500 shards with 1 to 5 read replica nodes in each.
With cluster mode enabled, your Redis Cluster gains enhanced scalability and high availability.
In addition, Amazon ElastiCache offers multiple Availability Zone (Multi-AZ) support with auto failover that enables you to set up a cluster with one or more replicas across zones. In the event of a failure on the primary node, Amazon ElastiCache for Redis automatically fails over to a replica to ensure high availability.
You can enable Multi-AZ only on Redis (cluster mode disabled) clusters that have at least one available read replica. Clusters without read replicas do not provide high availability or fault tolerance.
Creating a replication group from an existing Redis (cluster mode disabled) cluster is optional.
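The shard and replica limits stated above can be expressed as a small validity check. This is a sketch based solely on the limits named in this section (single shard with up to 5 replicas when cluster mode is disabled; up to 500 shards with 1 to 5 replicas each when enabled); the function is illustrative.

```python
# Sketch of the topology limits stated above: cluster mode disabled means a
# single shard with up to 5 read replicas; cluster mode enabled allows up to
# 500 shards with 1 to 5 read replicas each.
def valid_redis_topology(cluster_mode_enabled, shards, replicas_per_shard):
    if cluster_mode_enabled:
        return 1 <= shards <= 500 and 1 <= replicas_per_shard <= 5
    return shards == 1 and 0 <= replicas_per_shard <= 5
```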
1.25.4 Charging model
Work Unit |
Per node |
1.25.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Set up specific alert | 1 token |
Purge cache | 1 token |
Configuration modification | 1 token |
Other changes | Estimation in tokens based on time spent |
1.26 Amazon MemoryDB for Redis
1.26.1 Description
MemoryDB for Redis is a durable, in-memory database service that delivers ultra-fast performance. It is purpose-built for modern applications with microservices architectures. MemoryDB is compatible with Redis, a popular open-source data store, enabling you to quickly build applications using the same flexible and friendly Redis data structures, APIs, and commands that you already use today.
1.26.2 Build to run service included in the OTC
1.26.2.1 Build service pre-requisite
- Refer to generic description.
1.26.2.2 Build to run service
- Refer to generic description.
1.26.3 RUN services included in the MRC
1.26.3.1 Run service pre-requisite
- A referential file exists in the Git repository used by OBS, which includes the reference configuration of the service.
- This file can be executed with the CI/CD pipeline used by OBS, and its execution has been tested successfully.
1.26.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.26.3.3 KPI & alerts
Monitoring
Yes
Available metrics
- CPUUtilization
- FreeableMemory
- NetworkBytesIn
- NetworkBytesOut
- NetworkPacketsIn
- NetworkPacketsOut
- SwapUsage
- ActiveDefragHits
- AuthenticationFailures
- BytesUsedForMemoryDB
- CommandAuthorizationFailures
- CurrConnections
- CurrItems
- DatabaseMemoryUsagePercentage
- DB0AverageTTL
- EngineCPUUtilization
- Evictions
- IsPrimary
- KeyAuthorizationFailures
- KeyspaceHits
- KeyspaceMisses
- KeysTracked
- MaxReplicationThroughput
- MemoryFragmentationRatio
- NewConnections
- PrimaryLinkHealthStatus
- Reclaimed
- ReplicationBytes
- ReplicationDelayedWriteCommands
- ReplicationLag
Alerts observed
- CPUUtilization (Host Level)
- EngineCPUUtilization (Node Level): analyzes the load of the Redis process
- Evictions (Node Level): if the number is very high, increase the maxmemory limit
- CurrConnections (Node Level)
- ReplicationLag (Node Level)
- DatabaseMemoryUsagePercentage (Node Level)
1.26.3.4 Backup and restore
Data backup and restore
MemoryDB for Redis clusters automatically back up data to a multi-AZ transactional log, but you can choose to create point-in-time snapshots of a cluster either periodically or on-demand. These snapshots can be used to recreate a cluster at a previous point or to seed a brand-new cluster. The snapshot consists of the cluster’s metadata, along with all of the data in the cluster. All snapshots are written to Amazon Simple Storage Service (Amazon S3), which provides durable storage. At any time, you can restore your data by creating a new MemoryDB cluster and populating it with data from a snapshot.
Service restore
Recovery will be from Infra as Code.
1.26.4 Charging model
Work Unit |
Per node |
1.26.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Modify Associated subnets | 1 token |
Modify parameter groups | 1 token |
Modify Node type | 1 token |
Modify Security Groups | 1 token |
Modify ACL (Access Control List) | 1 token |
Modify Snapshot | 1 token |
Modify Maintenance window | 1 token |
Take snapshot | Estimation in tokens based on table size |
Other changes | Estimation in tokens based on time spent |
1.27 Amazon Neptune
1.27.1 Description
Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. Neptune supports the popular graph query languages Apache TinkerPop Gremlin, the W3C’s SPARQL, and Neo4j’s openCypher, enabling you to build queries that efficiently navigate highly connected datasets. Neptune powers graph use cases such as recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.
1.27.2 Build to run service included in the OTC
1.27.2.1 Build service pre-requisite
- Refer to generic description.
1.27.2.2 Build to run service
- Refer to generic description.
1.27.3 RUN services included in the MRC
1.27.3.1 Run service pre-requisite
- A referential file exists in the Git repository used by OBS, which includes the reference configuration of the service.
- This file can be executed with the CI/CD pipeline used by OBS, and its execution has been tested successfully.
1.27.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.27.3.3 KPI & alerts
Monitoring
Yes
Available metrics
- BufferCacheHitRatio: The percentage of requests that are served by the buffer cache. Cache misses add significant latency to query execution. If the cache hit ratio is below 99.9% and latency is an issue for your application, consider upgrading the instance type to cache more data in memory.
- CPUUtilization: The percentage of CPU capacity used. High values for CPU consumption might be appropriate, depending on your query-performance goals.
- FreeableMemory: How much RAM is available on the DB instance, in megabytes. Neptune has its own memory manager, so this metric may be lower than you expect. If queries often throw out-of-memory exceptions, consider upgrading your instance class to one with more RAM.
- BackupRetentionPeriodStorageUsed
- ClusterReplicaLag
- ClusterReplicaLagMaximum
- ClusterReplicaLagMinimum
- EngineUptime
- GremlinRequestsPerSec
- GremlinWebSocketOpenConnections
- LoaderRequestsPerSec
- MainRequestQueuePendingRequests
- NetworkReceiveThroughput
- NetworkThroughput
- NetworkTransmitThroughput
- NumTxCommitted
- NumTxOpened
- NumTxRolledBack
- SnapshotStorageUsed
- SparqlRequestsPerSec
- StatsNumStatementsScanned
- TotalBackupStorageBilled
- TotalRequestsPerSec
- TotalClientErrorsPerSec
- TotalServerErrorsPerSec
- VolumeBytesUsed
- VolumeReadIOPs
- VolumeWriteIOPs
Alerts observed
- ClusterReplicaLag
- CPUUtilization
- FreeableMemory
- TotalClientErrorsPerSec
- TotalServerErrorsPerSec
- BufferCacheHitRatio
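The BufferCacheHitRatio guidance described earlier in this section (below 99.9%, with latency an issue, consider a larger instance type) can be sketched as a simple decision helper. The function name is an assumption for illustration.

```python
# Sketch of the BufferCacheHitRatio guidance above: below a 99.9% hit ratio
# on a latency-sensitive workload, a larger instance class is worth considering.
def should_consider_upgrade(buffer_cache_hit_ratio, latency_sensitive):
    return latency_sensitive and buffer_cache_hit_ratio < 99.9
```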
1.27.3.4 Backup and restore
Data backup and restore
Neptune backs up your cluster volume automatically and retains restore data for the length of the backup retention period. Neptune backups are continuous and incremental so you can quickly restore to any point within the backup retention period. No performance impact or interruption of database service occurs as backup data is being written. You can specify a backup retention period, from 1 to 35 days, when you create or modify a DB cluster.
If you want to retain a backup beyond the backup retention period, you can also take a snapshot of the data in your cluster volume. Storing snapshots incurs the standard storage charges for Neptune.
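A point-in-time restore target must fall inside the configured retention window (1 to 35 days, per the text above). A minimal sketch of that check, with an illustrative function name:

```python
# Sketch: a point-in-time restore target must fall inside the configured
# retention window (1-35 days, per the text above).
from datetime import datetime, timedelta

def restorable(restore_point, now, retention_days):
    if not 1 <= retention_days <= 35:
        raise ValueError("retention must be between 1 and 35 days")
    return now - timedelta(days=retention_days) <= restore_point <= now
```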
Service restore
Recovery will be from Infra as Code.
1.27.3.5 AWS SLA High Availability and Disaster Recovery inter-region
A Neptune DB cluster is fault tolerant by design. The cluster volume spans multiple Availability Zones in a single AWS Region, and each Availability Zone contains a copy of the cluster volume data. This functionality means that your DB cluster can tolerate a failure of an Availability Zone without any loss of data and only a brief interruption of service.
1.27.4 Charging model
Work Unit |
Per database instance |
1.27.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create clone | 1 token |
Restore to point in time, depends on DB size | Estimation in tokens based on Database size |
Upgrade | Estimation in tokens based on time spent |
Operational (start, stop, reboot, failover) | 1 token |
Modify and change configuration | 1 token |
Create reader replica instance | 1 token |
Other changes | Estimation in tokens based on time spent |
1.28 Amazon Keyspaces (for Apache Cassandra)
1.28.1 Description
Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service. With Amazon Keyspaces, you don’t have to provision, patch, or manage servers, and you don’t have to install, maintain, or operate software.
Amazon Keyspaces is serverless, so you pay for only the resources that you use, and the service automatically scales tables up and down in response to application traffic.
1.28.2 Build to run service included in the OTC
1.28.2.1 Build service pre-requisite
- Refer to generic description.
1.28.2.2 Build to run service
- Refer to generic description.
1.28.3 RUN services included in the MRC
1.28.3.1 Run service pre-requisite
- A referential file exists in the Git repository used by OBS, which includes the reference configuration of the service.
- This file can be executed with the CI/CD pipeline used by OBS, and its execution has been tested successfully.
1.28.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.28.3.3 KPI & alerts
Monitoring
Yes
Available metrics
- AccountMaxTableLevelReads (The maximum number of read capacity units that can be used by a table of the account.)
- AccountMaxTableLevelWrites (The maximum number of write capacity units that can be used by a table of the account.)
- AccountProvisionedReadCapacityUtilization (The percentage of provisioned read capacity units utilized by an account.)
- AccountProvisionedWriteCapacityUtilization (The percentage of provisioned write capacity units utilized by an account.)
- ConditionalCheckFailedRequests
- ConsumedReadCapacityUnits (The number of read capacity units consumed over the specified time period.)
- ConsumedWriteCapacityUnits (The number of write capacity units consumed over the specified time period.)
- MaxProvisionedTableReadCapacityUtilization (The maximum percentage of provisioned read capacity units utilized by the highest provisioned read table of the account.)
- MaxProvisionedTableWriteCapacityUtilization
- PerConnectionRequestRateExceeded (Requests to Amazon Keyspaces that exceed the per-connection request rate quota)
- ProvisionedReadCapacityUnits
- ProvisionedWriteCapacityUnits
- ReadThrottleEvents (Requests to Amazon Keyspaces that exceed the provisioned read capacity for a table.)
- ReturnedItemCount (The number of rows returned by multi-row SELECT queries during the specified time period.)
- StoragePartitionThroughputCapacityExceeded (Requests to an Amazon Keyspaces storage partition that exceed the throughput capacity of the partition.)
- SuccessfulRequestLatency (The latency of successful requests to Amazon Keyspaces during the specified time period.)
- SystemErrors (The requests to Amazon Keyspaces that generate a ServerError during the specified time period.)
- TTLDeletes (The units consumed to delete or update data in a row by using Time to Live (TTL).)
- UserErrors (Requests to Amazon Keyspaces that generate an InvalidRequest error during the specified time period.)
- WriteThrottleEvents (Requests to Amazon Keyspaces that exceed the provisioned write capacity for a table.)
Alerts observed
- SystemErrors (usually indicates an internal service error.)
- UserErrors (usually indicates a client-side error, such as an attempt to update a nonexistent table)
- AccountProvisionedReadCapacityUtilization
- AccountProvisionedWriteCapacityUtilization
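The capacity-utilization metrics above are percentages of consumed versus provisioned capacity units. A minimal sketch of that derivation (the helper name is an assumption, not a Keyspaces API):

```python
# Sketch of how the capacity-utilization metrics above are derived:
# consumed capacity units as a percentage of provisioned capacity units.
def capacity_utilization(consumed_units, provisioned_units):
    if provisioned_units <= 0:
        raise ValueError("provisioned capacity must be positive")
    return 100.0 * consumed_units / provisioned_units
```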
1.28.3.4 Backup and restore
Data backup and restore
Point-in-time recovery (PITR) helps protect your Amazon Keyspaces tables from accidental write or delete operations by providing you continuous backups of your table data.
For example, suppose that a test script accidentally writes to a production Amazon Keyspaces table. With point-in-time recovery, you can restore that table’s data to any second within the last 35 days, as long as PITR was enabled at that time. If you delete a table with point-in-time recovery enabled, you can query for the deleted table’s data for 35 days (at no additional cost), and restore it to the state it was in just before the point of deletion.
Point-in-time operations have no performance or availability impact on the base table and restoring a table doesn’t consume additional throughput.
Amazon Keyspaces PITR uses two timestamps to maintain the time frame for which restorable backups are available for a table.
- Earliest restorable time – Marks the time of the earliest restorable backup. The earliest restorable backup goes back up to 35 days or when PITR was enabled, whichever is more recent. The maximum backup window of 35 days can’t be modified.
- Current time – The timestamp for the latest restorable backup is the current time. If no timestamp is provided during a restore, current time is used.
When PITR is enabled, you can restore to any point in time between EarliestRestorableDateTime and CurrentTime. You can only restore table data to a time when PITR was enabled.
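The window rule above reduces to a single expression: the earliest restorable time is either 35 days ago or the moment PITR was enabled, whichever is more recent. A sketch:

```python
# Sketch of the PITR window rule above: the earliest restorable backup goes
# back up to 35 days, or to when PITR was enabled, whichever is more recent
# (the 35-day window itself is fixed and can't be modified).
from datetime import datetime, timedelta

def earliest_restorable_time(now, pitr_enabled_at):
    return max(now - timedelta(days=35), pitr_enabled_at)
```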
Service restore
Recovery will be from Infra as Code.
1.28.3.5 AWS SLA High Availability and Disaster Recovery inter-region
Amazon Keyspaces replicates data automatically three times in multiple AWS Availability Zones within the same AWS Region for durability and high availability.
1.28.4 Charging model
Work Unit |
Per table within a keyspace |
1.28.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create a Keyspace | 1 token |
Create table | 1 token |
Restore PITR | Estimation in tokens based on table size |
Other changes | Estimation in tokens based on time spent |
1.29 ElastiCache for Memcached
1.29.1 Description
Amazon ElastiCache for Memcached is a Memcached-compatible in-memory key-value store service that can be used as a cache or a data store. It delivers the performance, ease of use, and simplicity of Memcached. ElastiCache for Memcached is fully managed, scalable, and secure, making it an ideal candidate for use cases where frequently accessed data must be in memory. It is a popular choice for use cases such as web, mobile apps, gaming, ad tech, and e-commerce.
1.29.2 Build to run service included in the OTC
1.29.2.1 Build service pre-requisite
- Refer to generic description.
1.29.2.2 Build to run service
- Refer to generic description.
1.29.3 RUN services included in the MRC
1.29.3.1 Run service pre-requisite
- A referential file exists in the Git repository used by OBS, which includes the reference configuration of the service.
- This file can be executed with the CI/CD pipeline used by OBS, and its execution has been tested successfully.
1.29.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.29.3.3 KPI & alerts
Monitoring
Yes
KPI monitored
Host-Level Metrics:
- CPUUtilization
- CPUCreditBalance
- CPUCreditUsage
- FreeableMemory
- NetworkBytesIn
- NetworkBytesOut
- NetworkPacketsIn
- NetworkPacketsOut
- NetworkBandwidthInAllowanceExceeded
- NetworkConntrackAllowanceExceeded
- NetworkLinkLocalAllowanceExceeded
- NetworkBandwidthOutAllowanceExceeded
- NetworkPacketsPerSecondAllowanceExceeded
- SwapUsage
Cache Node Level:
- BytesReadIntoMemcached
- BytesUsedForCacheItems
- BytesWrittenOutFromMemcached
- CasBadval
- CasHits
- CasMisses
- CmdFlush
- CmdGets
- CmdSet
- CurrConnections
- CurrItems
- DecrHits
- DecrMisses
- DeleteHits
- DeleteMisses
- Evictions
- GetHits
- GetMisses
- IncrHits
- IncrMisses
- Reclaimed
Alerts observed
- CPUUtilization (Host Level): Because Memcached is multi-threaded, this metric can be as high as 90%. If you exceed this threshold, scale your cache cluster up by using a larger cache node type, or scale out by adding more cache nodes.
- Evictions (Node Level): This is a cache engine metric. We recommend that you determine your own alarm threshold for this metric based on your application needs. If you exceed your chosen threshold, scale your cluster up by using a larger node type, or scale out by adding more nodes.
- CurrConnections (Node Level): This is a cache engine metric. We recommend that you determine your own alarm threshold for this metric based on your application needs. An increasing number of CurrConnections might indicate a problem with your application; you will need to investigate the application behavior to address this issue.
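The CPUUtilization guidance above maps directly to a scale decision. A minimal sketch, with an illustrative helper name and the 90% threshold taken from the text:

```python
# Sketch of the CPUUtilization guidance above: Memcached is multi-threaded,
# so the alarm threshold can be as high as 90%; past it, scale up or out.
def memcached_cpu_action(cpu_utilization, threshold=90.0):
    if cpu_utilization > threshold:
        return "scale up (larger node type) or scale out (more nodes)"
    return "no action"
```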
1.29.3.4 Backup and restore
Data backup and restore
The backup feature is not available for Memcached Clusters.
Service restore
Recovery will be from Infra as Code.
1.29.3.5 AWS SLA High Availability and Disaster Recovery inter-region
High Availability is not supported because the service does not support replication.
When running the Memcached engine, you have the following options for minimizing the impact of a failure. There are two types of failures to address in your failure mitigation plans: node failures and Availability Zone failures.
- Mitigating node failures: spread your cached data over more nodes. Because Memcached does not support replication, a node failure always results in some data loss from your cluster.
- Mitigating Availability Zone failures: locate your nodes in as many Availability Zones as possible.
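The rationale for spreading data over more nodes can be quantified: with data spread evenly and no replication, losing k of n nodes loses roughly k/n of the cached data. A sketch of that arithmetic (the helper is illustrative):

```python
# Sketch of the node-failure mitigation above: with data spread evenly and
# no replication, losing k of n nodes loses roughly k/n of the cached data,
# which is why more, smaller nodes limit the impact of a single failure.
def data_loss_fraction(node_count, failed_nodes=1):
    if node_count <= 0 or failed_nodes < 0:
        raise ValueError("invalid node counts")
    return min(failed_nodes, node_count) / node_count
```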
1.29.4 Charging model
Work Unit |
Per node |
1.29.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Modify Engine version compatibility | 2 tokens |
Modify VPC Security Group | 2 tokens |
Modify Parameter group | 2 tokens |
Modify Maintenance Window | 2 tokens |
Modify Topic for SNS Notification | 2 tokens |
Reboot | 1 token |
Other changes | Estimation in tokens based on time spent |
1.30 Amazon Aurora PostgreSQL Compatible
1.30.1 Description
Amazon Aurora PostgreSQL is a fully managed, PostgreSQL-compatible, and ACID-compliant relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. Aurora PostgreSQL is a drop-in replacement for PostgreSQL and makes it simple and cost-effective to set up, operate, and scale your new and existing PostgreSQL deployments, thus freeing you to focus on your business and applications.
1.30.2 Build to run service included in the OTC
1.30.2.1 Build service pre-requisite
- Refer to generic description.
1.30.2.2 Build to run service
- Refer to generic description.
1.30.3 RUN services included in the MRC
1.30.3.1 Run service pre-requisite
- A referential file exists in the Git repository used by OBS, which includes the reference configuration of the service.
- This file can be executed with the CI/CD pipeline used by OBS, and its execution has been tested successfully.
1.30.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.30.3.3 KPI & alerts
Monitoring
Yes
Available metrics
Cluster level Metrics:
- AuroraGlobalDBDataTransferBytes
- AuroraGlobalDBProgressLag
- AuroraGlobalDBReplicatedWriteIO
- AuroraGlobalDBReplicationLag
- AuroraGlobalDBRPOLag
- AuroraVolumeBytesLeftTotal
- BacktrackChangeRecordsCreationRate
- BacktrackChangeRecordsStored
- BackupRetentionPeriodStorageUsed
- ServerlessDatabaseCapacity
- SnapshotStorageUsed
- TotalBackupStorageBilled
- VolumeBytesUsed
- VolumeReadIOPs
- VolumeWriteIOPs
Instance Level Metrics:
- AbortedClients
- ActiveTransactions
- AuroraBinlogReplicaLag
- AuroraReplicaLag
- AuroraReplicaLagMaximum
- AuroraReplicaLagMinimum
- BacktrackWindowActual
- BacktrackWindowAlert
- BlockedTransactions
- BufferCacheHitRatio
- CommitLatency
- CommitThroughput
- CPUCreditBalance
- CPUCreditUsage
- CPUUtilization
- DatabaseConnections
- DDLLatency
- DDLThroughput
- Deadlocks
- DeleteLatency
- DeleteThroughput
- DiskQueueDepth
- DMLLatency
- DMLThroughput
- EBSByteBalance%
- EBSIOBalance%
- EngineUptime
- FreeableMemory
- FreeLocalStorage
- InsertLatency
- InsertThroughput
- LoginFailures
- MaximumUsedTransactionIDs
- NetworkReceiveThroughput
- NetworkThroughput
- NetworkTransmitThroughput
- NumBinaryLogFiles
- Queries
- RDSToAuroraPostgreSQLReplicaLag
- ReadIOPS
- ReadLatency
- ReadThroughput
- ReplicationSlotDiskUsage
- ResultSetCacheHitRatio
- RollbackSegmentHistoryListLength
- RowLockTime
- SelectLatency
- SelectThroughput
- StorageNetworkReceiveThroughput
- StorageNetworkThroughput
- StorageNetworkTransmitThroughput
- SumBinaryLogSize
- SwapUsage
- TransactionLogsDiskUsage
- UpdateLatency
- UpdateThroughput
- WriteIOPS
- WriteLatency
- WriteThroughput
Alerts observed
- WriteLatency
- ReadLatency
- FreeableMemory
- Deadlocks
- CPUUtilization
- DatabaseConnections
- BlockedTransactions
- BufferCacheHitRatio
- CommitLatency
- AbortedClients
- AuroraGlobalDBReplicationLag
1.30.3.4 Backup and restore
Data backup and restore
Aurora backs up your cluster volume automatically and retains restore data for the length of the backup retention period.
Aurora backups are continuous and incremental so you can quickly restore to any point within the backup retention period.
You can specify a backup retention period, from 1 to 35 days, when you create or modify a DB cluster.
Aurora backups are stored in Amazon S3.
AWS Backup can also be used for snapshot backups.
If you want to retain a backup beyond the backup retention period, you can also take a snapshot of the data in your cluster volume.
Restores from a backup or snapshot are done to a new instance.
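Per the text above, continuous backups cover any point in the retention window, while older points can only be recovered from a retained snapshot. A hypothetical sketch of that restore-path choice (names and interface are illustrative, not an AWS API):

```python
# Hypothetical sketch of choosing a restore path, per the text above:
# continuous backups cover any point in the retention window; older points
# can only be recovered from a matching snapshot; either way the restore
# goes to a new instance.
from datetime import datetime, timedelta

def restore_source(target, now, retention_days, snapshot_times):
    if now - timedelta(days=retention_days) <= target <= now:
        return "point-in-time restore"
    if target in snapshot_times:
        return "snapshot restore"
    return "not restorable"
```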
Service restore
Recovery will be from Infra as Code.
1.30.3.5 AWS SLA High Availability and Disaster Recovery inter-region
An Aurora DB cluster is fault tolerant by design.
The cluster volume spans multiple Availability Zones in a single AWS Region, and each Availability Zone contains a copy of the cluster volume data.
This functionality means that your DB cluster can tolerate a failure of an Availability Zone without any loss of data and only a brief interruption of service.
1.30.4 Charging model
Work Unit |
Per database instance |
1.30.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Provision database | 2 tokens |
Reboot an instance | 2 tokens |
Delete an instance | 2 tokens |
Instance failover | 2 tokens |
Take snapshot of an instance | 2 tokens |
Stop & start a cluster | 2 tokens |
Delete a cluster | 2 tokens |
Add reader Instance | 2 tokens |
Add AWS region at cluster level | 2 tokens |
Create clone at cluster level | 2 tokens |
Restore a cluster to point in Time | Estimation in tokens based on the database size |
Modify cluster configuration | 1 token |
Upgrade a database | Estimation in tokens based on time spent |
Minor Version patching | Estimation in tokens based on time spent |
Export database snapshot to S3 | Estimation in tokens based on the size of extracted data |
Other changes | Estimation in tokens based on time spent |
1.31 Amazon Aurora MySQL Compatible
1.31.1 Description
Amazon Aurora is a relational database management system (RDBMS) built for the cloud with full MySQL and PostgreSQL compatibility.
1.31.2 Build to run service included in the OTC
1.31.2.1 Build service pre-requisite
- Refer to generic description.
1.31.2.2 Build to run service
- Refer to generic description.
1.31.3 RUN services included in the MRC
1.31.3.1 Run service pre-requisite
- A referential file exists in the Git repository used by OBS, which includes the reference configuration of the service.
- This file can be executed with the CI/CD pipeline used by OBS, and its execution has been tested successfully.
1.31.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.31.3.3 KPI & alerts
Monitoring
Yes
Available metrics
Cluster level Metrics:
- AuroraGlobalDBDataTransferBytes
- AuroraGlobalDBProgressLag
- AuroraGlobalDBReplicatedWriteIO
- AuroraGlobalDBReplicationLag
- AuroraGlobalDBRPOLag
- AuroraVolumeBytesLeftTotal
- BacktrackChangeRecordsCreationRate
- BacktrackChangeRecordsStored
- BackupRetentionPeriodStorageUsed
- ServerlessDatabaseCapacity
- SnapshotStorageUsed
- TotalBackupStorageBilled
- VolumeBytesUsed
- VolumeReadIOPs
- VolumeWriteIOPs
Instance Level Metrics:
- AbortedClients
- ActiveTransactions
- AuroraBinlogReplicaLag
- AuroraReplicaLag
- AuroraReplicaLagMaximum
- AuroraReplicaLagMinimum
- BacktrackWindowActual
- BacktrackWindowAlert
- BlockedTransactions
- BufferCacheHitRatio
- CommitLatency
- CommitThroughput
- CPUCreditBalance
- CPUCreditUsage
- CPUUtilization
- DatabaseConnections
- DDLLatency
- DDLThroughput
- Deadlocks
- DeleteLatency
- DeleteThroughput
- DiskQueueDepth
- DMLLatency
- DMLThroughput
- EBSByteBalance%
- EBSIOBalance%
- EngineUptime
- FreeableMemory
- FreeLocalStorage
- InsertLatency
- InsertThroughput
- LoginFailures
- MaximumUsedTransactionIDs
- NetworkReceiveThroughput
- NetworkThroughput
- NetworkTransmitThroughput
- NumBinaryLogFiles
- Queries
- RDSToAuroraPostgreSQLReplicaLag
- ReadIOPS
- ReadLatency
- ReadThroughput
- ReplicationSlotDiskUsage
- ResultSetCacheHitRatio
- RollbackSegmentHistoryListLength
- RowLockTime
- SelectLatency
- SelectThroughput
- StorageNetworkReceiveThroughput
- StorageNetworkThroughput
- StorageNetworkTransmitThroughput
- SumBinaryLogSize
- SwapUsage
- TransactionLogsDiskUsage
- UpdateLatency
- UpdateThroughput
- WriteIOPS
- WriteLatency
- WriteThroughput
Alerts observed
- WriteLatency
- ReadLatency
- FreeableMemory
- Deadlocks
- CPUUtilization
- DatabaseConnections
- BlockedTransactions
- BufferCacheHitRatio
- CommitLatency
- AbortedClients
- AuroraGlobalDBReplicationLag
1.31.3.4 Backup and restore
Data backup and restore
Aurora backs up your cluster volume automatically and retains restore data for the length of the backup retention period.
Aurora backups are continuous and incremental so you can quickly restore to any point within the backup retention period.
You can specify a backup retention period, from 1 to 35 days, when you create or modify a DB cluster.
Aurora backups are stored in Amazon S3.
AWS Backup can also be used for snapshot backups.
If you want to retain a backup beyond the backup retention period, you can also take a snapshot of the data in your cluster volume.
Restores from a backup or snapshot are done to a new instance.
With Amazon Aurora MySQL-Compatible Edition, you can backtrack a DB cluster to a specific time without restoring data from a backup.
Backtracking quickly moves an Aurora database to a prior point in time, letting you recover from user errors such as dropping the wrong table or deleting the wrong row.
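The choice between backtracking and a restore can be sketched as a simple decision: targets inside the cluster's configured backtrack window can be reached in place, while older targets need a point-in-time restore. The helper name and the window value in the test are illustrative assumptions.

```python
# Sketch of the recovery choice enabled by backtracking: targets inside the
# configured backtrack window are reached in place; older targets need a
# point-in-time restore to a new cluster.
def mysql_recovery_method(seconds_back, backtrack_window_seconds):
    if seconds_back <= backtrack_window_seconds:
        return "backtrack (in place, no restore from backup)"
    return "point-in-time restore (new cluster)"
```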
Service restore
Recovery will be from Infra as Code.
1.31.3.5 AWS SLA High Availability and Disaster Recovery inter-region
An Aurora DB cluster is fault tolerant by design.
The cluster volume spans multiple Availability Zones in a single AWS Region, and each Availability Zone contains a copy of the cluster volume data.
This functionality means that your DB cluster can tolerate a failure of an Availability Zone without any loss of data and only a brief interruption of service.
1.31.4 Charging model
Work Unit |
Per database instance |
1.31.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Provision database | 2 tokens |
Reboot an instance | 2 tokens |
Delete an instance | 2 tokens |
Instance failover | 2 tokens |
Take snapshot of an instance | 2 tokens |
Stop & start a cluster | 2 tokens |
Delete a cluster | 2 tokens |
Add reader Instance | 2 tokens |
Add AWS region at cluster level | 2 tokens |
Create clone at cluster level | 2 tokens |
Restore a cluster to point in Time | Estimation in tokens based on the database size |
Modify cluster configuration | 1 token |
Export database snapshot to S3 | Estimation in tokens based on the size of extracted data |
Upgrade a database | Estimation in tokens based on time spent |
Minor Version patching | Estimation in tokens based on time spent |
Other changes | Estimation in tokens based on time spent |
1.32 Amazon Quantum Ledger Database
1.32.1 Description
Amazon QLDB is a new class of database that helps eliminate the need to engage in the complex development effort of building your own ledger-like applications. With QLDB, the history of changes to your data is immutable—it can’t be altered, updated, or deleted. And using cryptography, you can easily verify that there have been no unintended changes to your application’s data. QLDB uses an immutable transactional log, known as a journal. The journal is append-only and is composed of a sequenced and hash-chained set of blocks that contain your committed data.
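The hash-chained journal described above is what makes the history verifiable: each block's digest depends on every block before it, so any alteration of committed data changes all later digests. The sketch below illustrates the principle only; it is not QLDB's actual algorithm or API.

```python
# Illustrative sketch (not QLDB's actual algorithm): how an append-only,
# hash-chained journal makes any alteration of committed data detectable.
import hashlib

def chain_digests(blocks):
    """Hash each block together with the previous digest, journal-style."""
    digests, prev = [], b""
    for block in blocks:
        prev = hashlib.sha256(prev + block).digest()
        digests.append(prev)
    return digests

def verify_journal(blocks, digests):
    """Recompute the chain; any altered block changes every later digest."""
    return chain_digests(blocks) == digests
```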
1.32.2 Build to run service included in the OTC
1.32.2.1 Build service pre-requisite
- Refer to generic description.
1.32.2.2 Build to run service
- Refer to generic description.
1.32.3 RUN services included in the MRC
1.32.3.1 Run service pre-requisite
- A referential file exists in the Git repository used by OBS, which includes the reference configuration of the service.
- This file can be executed with the CI/CD pipeline used by OBS, and its execution has been tested successfully.
1.32.3.2 Reporting
By default, no. Reporting can be requested by the customer through a change request to obtain a point-in-time report.
1.32.3.3 KPI & alerts
Monitoring
Yes
Available metrics
- JournalStorage
- IndexedStorage
- ReadIOs
- WriteIOs
- CommandLatency
- IsImpaired
- OccConflictExceptions
- Session4xxExceptions
- Session5xxExceptions
- SessionRateExceededExceptions
Alerts observed
- JournalStorage
- IndexedStorage
- ReadIOs
- WriteIOs
- CommandLatency
1.32.3.4 Backup and restore
Data backup and restore
QLDB doesn’t provide a dedicated backup and restore feature at this time.
QLDB provides an on-demand journal export feature. You can access the contents of your journal by exporting journal blocks from your ledger into an Amazon Simple Storage Service (Amazon S3) bucket. You can use this data for various purposes such as data retention, analytics, and auditing.
Service restore
Recovery will be from Infra as Code.
1.32.3.5 AWS SLA High Availability and Disaster Recovery inter-region
The service is highly available by design.
QLDB journal storage features synchronous replication to multiple Availability Zones on transaction commits. This ensures that even a full Availability Zone failure of journal storage would not compromise data integrity or the ability to maintain an active service. Additionally, the QLDB journal features asynchronous archives to fault-tolerant storage. This feature supports disaster recovery in the highly unlikely event of simultaneous storage failure for multiple Availability Zones.
QLDB doesn’t provide an automated recovery feature for logical corruption scenarios at this time.
1.32.4 Charging model
Work Unit |
Per table within ledger |
1.32.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create Ledger | 1 token |
Other changes | Estimation in tokens based on time spent |
1.33 Microsoft SQL Server on Amazon RDS
1.33.1 Description
SQL Server is a relational database management system developed by Microsoft. Amazon RDS for SQL Server makes it easy to set up, operate, and scale SQL Server deployments in the cloud. Amazon RDS frees you up to focus on application development by managing time-consuming database administration tasks including provisioning, backups, software patching, monitoring, and hardware scaling.
Amazon RDS supports DB instances running several versions and editions of Microsoft SQL Server. For the full list of supported versions, editions, and RDS engine versions, see Microsoft SQL Server versions on Amazon RDS.
For information about licensing for SQL Server, see Licensing Microsoft SQL Server on Amazon RDS. For information about SQL Server builds, see this Microsoft support article about the latest SQL Server builds.
1.33.2 Build to run service included in the OTC
1.33.2.1 Build service pre-requisite
- Refer to generic description.
1.33.2.2 Build to run service
- Refer to generic description.
1.33.3 RUN services included in the MRC
1.33.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.33.3.2 Reporting
By default, no. The customer can request a point-in-time report through a change request.
1.33.3.3 KPI & alerts
Monitoring
Yes
Available metrics
- BinLogDiskUsage
- BurstBalance
- CPUUtilization
- CPUCreditUsage
- CPUCreditBalance
- DatabaseConnections
- DiskQueueDepth
- EBSByteBalance%
- EBSIOBalance%
- FailedSQLServerAgentJobsCount
- FreeableMemory
- FreeLocalStorage
- FreeStorageSpace
- MaximumUsedTransactionIDs
- NetworkReceiveThroughput
- NetworkTransmitThroughput
- OldestReplicationSlotLag
- ReadIOPS
- ReadIOPSLocalStorage
- ReadLatency
- ReadLatencyLocalStorage
- ReadThroughput
- ReadThroughputLocalStorage
- ReplicaLag
- ReplicationSlotDiskUsage
- SwapUsage
- TransactionLogsDiskUsage
- TransactionLogsGeneration
- WriteIOPS
- WriteIOPSLocalStorage
- WriteLatency
- WriteLatencyLocalStorage
- WriteThroughput
- WriteThroughputLocalStorage
Alerts observed
- DatabaseConnections
- FreeStorageSpace
- FreeableMemory
- ReadLatency
- ReadThroughput
- WriteLatency
- WriteThroughput
- ReadIOPS
- DiskQueueDepth
- WriteIOPS
- NetworkTransmitThroughput
- NetworkReceiveThroughput
- SwapUsage
- EBSByteBalance%
- EBSIOBalance%
- CPUSurplusCreditBalance
- CPUCreditUsage
- CPUCreditBalance
- CPUSurplusCreditsCharged
- CPUUtilization
- BurstBalance
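One of the observed alerts could, for example, be expressed as a CloudWatch alarm along these lines. The instance identifier, threshold, and SNS topic ARN are illustrative assumptions:

```python
# Illustrative alarm definition for the FreeStorageSpace alert.
# The call would be: boto3.client("cloudwatch").put_metric_alarm(**alarm)
alarm = {
    "AlarmName": "rds-example-free-storage-low",
    "Namespace": "AWS/RDS",
    "MetricName": "FreeStorageSpace",
    "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": "example-db"}],
    "Statistic": "Average",
    "Period": 300,                    # seconds per datapoint
    "EvaluationPeriods": 3,           # alarm after 15 minutes below threshold
    "Threshold": 5 * 1024**3,         # alarm below 5 GiB free (assumed)
    "ComparisonOperator": "LessThanThreshold",
    "AlarmActions": ["arn:aws:sns:eu-west-1:123456789012:ops-alerts"],  # placeholder
}
```

The other alerted metrics follow the same pattern with their own statistics, thresholds, and comparison operators.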
1.33.3.4 Backup and restore
Data backup and restore
Amazon RDS creates and saves automated backups of your DB instance during the backup window of your DB instance. RDS creates a storage volume snapshot of your DB instance, backing up the entire DB instance and not just individual databases. RDS saves the automated backups of your DB instance according to the backup retention period that you specify. If necessary, you can recover your database to any point in time during the backup retention period.
You can also back up your DB instance manually by creating a DB snapshot.
The first snapshot of a DB instance contains the data for the full DB instance. Subsequent snapshots of the same DB instance are incremental, which means that only the data that has changed after your most recent snapshot is saved.
You can set the backup retention period to between 0 and 35 days.
You can have up to 100 manual snapshots per Region; this limit does not apply to automated backups.
Restore will be done from backup depending on the option chosen by the customer: full backup or point in time recovery (automated backup).
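The retention setting implies a point-in-time-recovery window, which can be sketched with some toy arithmetic. This is illustrative only; the real restorable window is reported by the RDS API:

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35  # RDS allows a retention period of 0-35 days

def pitr_window(now: datetime, retention_days: int):
    """Return (earliest, latest) restorable times for a retention setting."""
    if not 0 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention must be between 0 and 35 days")
    if retention_days == 0:
        return None  # automated backups disabled: no PITR window
    return (now - timedelta(days=retention_days), now)

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
earliest, latest = pitr_window(now, 7)
assert latest - earliest == timedelta(days=7)
assert pitr_window(now, 0) is None  # retention 0 turns off automated backups
```

A restore to any timestamp inside that window uses the automated backups; anything older requires a manual snapshot taken at the time.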
Service restore
Recovery will be from Infra as Code.
1.33.3.5 AWS SLA High Availability and Disaster Recovery inter-region
Multi-AZ deployments provide increased availability, data durability, and fault tolerance for DB instances. Multi-AZ deployments for SQL Server are implemented using SQL Server’s native Database Mirroring (DBM) or Always On Availability Groups (AGs) technology.
1.33.4 Charging model
Work Unit |
Per DB Instance |
1.33.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Creating a database user | 1 Token |
Dropping a Microsoft SQL Server database | 1 Token |
Resetting the db_owner role password | 1 Token |
Restoring license-terminated DB instances | 1 Token |
Transitioning a Microsoft SQL Server database from OFFLINE to ONLINE | 1 Token |
Enable Change data capture (CDC) | 1 Token |
Enable and modify Database Mail | 1 Token |
Native backup and restore | Estimation in tokens based on backup size |
Amazon S3 file transfer | Estimation in tokens based on file size |
Enable Microsoft Distributed Transaction Coordinator (MSDTC) | 1 Token |
Enable Microsoft Business Intelligence (MSBI) | 1 Token |
Enable Microsoft SQL Server Integration Services (SSIS) | 1 Token |
Enable Microsoft SQL Server Reporting Services (SSRS) | 1 Token |
SQL Server Audit (track changes not related to the data, such as table creation or user creation) | 1 Token |
Other changes | Estimation in tokens based on time spent |
1.34 Amazon RDS for MariaDB
1.34.1 Description
Amazon RDS supports an array of database engines to store and organize data, among them MariaDB. It also helps with relational database management tasks, such as data migration, backup, recovery, and patching.
1.34.2 Build to run service included in the OTC
1.34.2.1 Build service pre-requisite
- Refer to generic description.
1.34.2.2 Build to run service
- Refer to generic description.
1.34.3 RUN services included in the MRC
1.34.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.34.3.2 Reporting
By default, no. The customer can request a point-in-time report through a change request.
1.34.3.3 KPI & alerts
Monitoring
Yes
Available metrics
- BinLogDiskUsage
- BurstBalance
- CPUUtilization
- CPUCreditUsage
- CPUCreditBalance
- DatabaseConnections
- DiskQueueDepth
- EBSByteBalance%
- EBSIOBalance%
- FailedSQLServerAgentJobsCount
- FreeableMemory
- FreeLocalStorage
- FreeStorageSpace
- MaximumUsedTransactionIDs
- NetworkReceiveThroughput
- NetworkTransmitThroughput
- OldestReplicationSlotLag
- ReadIOPS
- ReadIOPSLocalStorage
- ReadLatency
- ReadLatencyLocalStorage
- ReadThroughput
- ReadThroughputLocalStorage
- ReplicaLag
- ReplicationSlotDiskUsage
- SwapUsage
- TransactionLogsDiskUsage
- TransactionLogsGeneration
- WriteIOPS
- WriteIOPSLocalStorage
- WriteLatency
- WriteLatencyLocalStorage
- WriteThroughput
- WriteThroughputLocalStorage
Alerts observed
- DatabaseConnections
- FreeStorageSpace
- FreeableMemory
- ReadLatency
- ReadThroughput
- WriteLatency
- WriteThroughput
- ReadIOPS
- DiskQueueDepth
- WriteIOPS
- NetworkTransmitThroughput
- NetworkReceiveThroughput
- SwapUsage
- EBSByteBalance%
- EBSIOBalance%
- CPUSurplusCreditBalance
- CPUCreditUsage
- CPUCreditBalance
- CPUSurplusCreditsCharged
- CPUUtilization
- BurstBalance
1.34.3.4 Backup and restore
Data backup and restore
Amazon RDS creates and saves automated backups of your DB instance during the backup window of your DB instance. RDS creates a storage volume snapshot of your DB instance, backing up the entire DB instance and not just individual databases. RDS saves the automated backups of your DB instance according to the backup retention period that you specify. If necessary, you can recover your database to any point in time during the backup retention period.
You can also back up your DB instance manually by creating a DB snapshot.
The first snapshot of a DB instance contains the data for the full DB instance. Subsequent snapshots of the same DB instance are incremental, which means that only the data that has changed after your most recent snapshot is saved.
You can set the backup retention period to between 0 and 35 days.
You can have up to 100 manual snapshots per Region; this limit does not apply to automated backups.
Restore will be done from backup depending on the option chosen by the customer: full backup or point in time recovery (automated backup).
Service restore
Recovery will be from Infra as Code.
1.34.3.5 AWS SLA High Availability and Disaster Recovery inter-region
For your MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server database (DB) instances, you can use Amazon RDS Multi-AZ deployments. When you provision a Multi-AZ DB instance, Amazon RDS automatically creates a primary DB instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ). In case of an infrastructure failure, Amazon RDS performs an automatic failover to the standby DB instance. Since the endpoint for your DB instance remains the same after a failover, your application can resume database operation without the need for manual administrative intervention.
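Because the endpoint is unchanged after failover, client-side handling reduces to reconnect-and-retry logic against the same DNS name. A minimal sketch, with a stubbed driver call (the `connect` callable and endpoint name are illustrative, not a real MariaDB driver):

```python
import time

def connect_with_retry(connect, endpoint, retries=5, backoff=0.0):
    """Retry the SAME endpoint: Multi-AZ failover does not change it."""
    last_error = None
    for attempt in range(retries):
        try:
            return connect(endpoint)
        except ConnectionError as exc:
            last_error = exc
            time.sleep(backoff * (2 ** attempt))  # exponential backoff
    raise last_error

# Stub that fails twice (failover in progress), then succeeds.
attempts = {"n": 0}
def fake_connect(endpoint):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("failover in progress")
    return f"connected:{endpoint}"

assert connect_with_retry(fake_connect, "example-db.rds.amazonaws.com") \
       == "connected:example-db.rds.amazonaws.com"
```

In a real application, the same pattern applies at the connection-pool level: drop broken connections and re-resolve the unchanged endpoint.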
1.34.4 Charging model
Work Unit |
Per DB Instance |
1.34.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Provision a database | 2 Tokens |
Reboot a DB instance | 2 Tokens |
Delete a DB instance | 2 Tokens |
Stop a DB instance | 2 Tokens |
Create replica Database | 2 Tokens |
Take a snapshot | 2 Tokens |
Restore to point in time (Instance level) | 2 Tokens |
Start database activity stream (Instance level) | 2 Tokens |
Restore from S3 (Instance level) | Estimation in tokens based on database size |
Other changes | Estimation in tokens based on time spent |
1.35 Amazon RDS for Oracle
1.35.1 Description
Amazon RDS is a managed service for relational databases, including Oracle. RDS is offered with AWS licensing or in a bring your own license (BYOL) model. Once you set up your Oracle database on RDS, you can use the AWS platform to monitor, configure, back up, secure, and scale your workloads.
1.35.2 Build to run service included in the OTC
1.35.2.1 Build service pre-requisite
- Refer to generic description.
1.35.2.2 Build to run service
- Refer to generic description.
1.35.3 RUN services included in the MRC
1.35.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.35.3.2 Reporting
By default, no. The customer can request a point-in-time report through a change request.
1.35.3.3 KPI & alerts
Monitoring
Yes
Available metrics
- BinLogDiskUsage
- BurstBalance
- CPUUtilization
- CPUCreditUsage
- CPUCreditBalance
- DatabaseConnections
- DiskQueueDepth
- EBSByteBalance%
- EBSIOBalance%
- FailedSQLServerAgentJobsCount
- FreeableMemory
- FreeLocalStorage
- FreeStorageSpace
- MaximumUsedTransactionIDs
- NetworkReceiveThroughput
- NetworkTransmitThroughput
- OldestReplicationSlotLag
- ReadIOPS
- ReadIOPSLocalStorage
- ReadLatency
- ReadLatencyLocalStorage
- ReadThroughput
- ReadThroughputLocalStorage
- ReplicaLag
- ReplicationSlotDiskUsage
- SwapUsage
- TransactionLogsDiskUsage
- TransactionLogsGeneration
- WriteIOPS
- WriteIOPSLocalStorage
- WriteLatency
- WriteLatencyLocalStorage
- WriteThroughput
- WriteThroughputLocalStorage
Alerts observed
- DatabaseConnections
- FreeStorageSpace
- FreeableMemory
- ReadLatency
- ReadThroughput
- WriteLatency
- WriteThroughput
- ReadIOPS
- DiskQueueDepth
- WriteIOPS
- NetworkTransmitThroughput
- NetworkReceiveThroughput
- SwapUsage
- EBSByteBalance%
- EBSIOBalance%
- CPUSurplusCreditBalance
- CPUCreditUsage
- CPUCreditBalance
- CPUSurplusCreditsCharged
- CPUUtilization
- BurstBalance
1.35.3.4 Backup and restore
Data backup and restore
Amazon RDS creates and saves automated backups of your DB instance during the backup window of your DB instance. RDS creates a storage volume snapshot of your DB instance, backing up the entire DB instance and not just individual databases. RDS saves the automated backups of your DB instance according to the backup retention period that you specify. If necessary, you can recover your database to any point in time during the backup retention period.
You can also back up your DB instance manually by creating a DB snapshot.
The first snapshot of a DB instance contains the data for the full DB instance. Subsequent snapshots of the same DB instance are incremental, which means that only the data that has changed after your most recent snapshot is saved.
You can set the backup retention period to between 0 and 35 days.
You can have up to 100 manual snapshots per Region; this limit does not apply to automated backups.
Restore will be done from backup depending on the option chosen by the customer: full backup or point in time recovery (automated backup).
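A point-in-time restore of this kind might be requested as follows. This is a sketch of the boto3 `rds` client's `restore_db_instance_to_point_in_time` parameters; the identifiers and restore time are placeholders:

```python
from datetime import datetime, timezone

# Illustrative parameters for an RDS point-in-time restore.
restore_request = {
    "SourceDBInstanceIdentifier": "example-oracle-prod",      # placeholder
    "TargetDBInstanceIdentifier": "example-oracle-restored",  # placeholder
    "RestoreTime": datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc),
    # Alternatively: "UseLatestRestorableTime": True
    # (mutually exclusive with RestoreTime).
}
# The call would be:
#   boto3.client("rds").restore_db_instance_to_point_in_time(**restore_request)
```

Note that the restore always creates a new DB instance; the target identifier must differ from the source.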
Service restore
Recovery will be from Infra as Code.
1.35.3.5 AWS SLA High Availability and Disaster Recovery inter-region
- Automatic Host Replacement – Amazon RDS will automatically replace the compute instance powering your deployment in the event of a hardware failure.
- Multi-AZ Deployments – A deployment option for your production DB Instances that enhances database availability while protecting your latest database updates against unplanned outages. When you create or modify your DB Instance to run as a Multi-AZ deployment, Amazon RDS will automatically provision and manage a “standby” replica in a different Availability Zone (independent infrastructure in a physically separate location). Database updates are made concurrently on the primary and standby resources to prevent replication lag. In the event of planned database maintenance, DB Instance failure, or an Availability Zone failure, Amazon RDS will automatically failover to the up-to-date standby so that database operations can resume quickly without administrative intervention. Prior to failover you cannot directly access the standby, and it cannot be used to serve read traffic.
1.35.4 Charging model
Work Unit |
Per DB Instance |
1.35.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Provision a database | 2 Tokens |
Reboot a DB instance | 2 Tokens |
Delete a DB instance | 2 Tokens |
Stop a DB instance | 2 Tokens |
Create replica Database | 2 Tokens |
Take a snapshot | 2 Tokens |
Restore to point in time (Instance level) | 2 Tokens |
Start database activity stream (Instance level) | 2 Tokens |
Restore from S3 (Instance level) | Estimation in tokens based on database size |
Other changes | Estimation in tokens based on time spent |
1.36 Amazon RDS for PostgreSQL
1.36.1 Description
Amazon RDS for PostgreSQL is a fully managed relational database service that makes it easy to set up, operate, and scale PostgreSQL deployments in the cloud.
1.36.2 Build to run service included in the OTC
1.36.2.1 Build service pre-requisite
- Refer to generic description.
1.36.2.2 Build to run service
- Refer to generic description.
1.36.3 RUN services included in the MRC
1.36.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.36.3.2 Reporting
By default, no. The customer can request a point-in-time report through a change request.
1.36.3.3 KPI & alerts
Monitoring
Yes
Available metrics
- BinLogDiskUsage
- BurstBalance
- CPUUtilization
- CPUCreditUsage
- CPUCreditBalance
- DatabaseConnections
- DiskQueueDepth
- EBSByteBalance%
- EBSIOBalance%
- FailedSQLServerAgentJobsCount
- FreeableMemory
- FreeLocalStorage
- FreeStorageSpace
- MaximumUsedTransactionIDs
- NetworkReceiveThroughput
- NetworkTransmitThroughput
- OldestReplicationSlotLag
- ReadIOPS
- ReadIOPSLocalStorage
- ReadLatency
- ReadLatencyLocalStorage
- ReadThroughput
- ReadThroughputLocalStorage
- ReplicaLag
- ReplicationSlotDiskUsage
- SwapUsage
- TransactionLogsDiskUsage
- TransactionLogsGeneration
- WriteIOPS
- WriteIOPSLocalStorage
- WriteLatency
- WriteLatencyLocalStorage
- WriteThroughput
- WriteThroughputLocalStorage
Alerts observed
- CPUUtilization
- DatabaseConnections
- WriteIOPS
- ReadIOPS
- DiskQueueDepth
- FreeableMemory
- SwapUsage
- WriteLatency
- ReadLatency
- WriteThroughput
- ReadThroughput
1.36.3.4 Backup and restore
Data backup and restore
Amazon RDS creates and saves automated backups of your DB instance during the backup window of your DB instance. RDS creates a storage volume snapshot of your DB instance, backing up the entire DB instance and not just individual databases. RDS saves the automated backups of your DB instance according to the backup retention period that you specify. If necessary, you can recover your database to any point in time during the backup retention period.
You can also back up your DB instance manually by creating a DB snapshot.
The first snapshot of a DB instance contains the data for the full DB instance. Subsequent snapshots of the same DB instance are incremental, which means that only the data that has changed after your most recent snapshot is saved.
You can set the backup retention period to between 0 and 35 days.
You can have up to 100 manual snapshots per Region; this limit does not apply to automated backups.
Restore will be done from backup depending on the option chosen by the customer: full backup or point in time recovery (automated backup).
Service restore
Recovery will be from Infra as Code.
1.36.3.5 AWS SLA High Availability and Disaster Recovery inter-region
For your MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server database (DB) instances, you can use Amazon RDS Multi-AZ deployments. When you provision a Multi-AZ DB instance, Amazon RDS automatically creates a primary DB instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ). In case of an infrastructure failure, Amazon RDS performs an automatic failover to the standby DB instance. Since the endpoint for your DB instance remains the same after a failover, your application can resume database operation without the need for manual administrative intervention.
1.36.4 Charging model
Work Unit |
Per DB Instance |
1.36.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Provision database | 2 tokens |
Reboot an instance | 2 tokens |
Delete an instance | 2 tokens |
Instance failover | 2 tokens |
Take snapshot of an instance | 2 tokens |
Stop & start a cluster | 2 tokens |
Delete a cluster | 2 tokens |
Add reader Instance | 2 tokens |
Add AWS region at cluster level | 2 tokens |
Create clone at cluster level | 2 tokens |
Restore a cluster to point in Time | Estimation in tokens based on the database size |
Modify cluster configuration | 1 token |
Upgrade a database | Estimation in tokens based on time spent |
Minor Version patching | Estimation in tokens based on time spent |
Export database snapshot to S3 | Estimation in tokens based on the size of extracted data |
Other changes | Estimation in tokens based on time spent |
1.37 Amazon DocumentDB
1.37.1 Description
Amazon DocumentDB is an AWS database service that is fully managed and compatible with MongoDB. You can use this service to migrate and host MongoDB workloads and application data while working with native Mongo code, tools, and drivers.
Through DocumentDB you gain access to the following features:
- Automatic scaling to match the size of your storage needs. Scaling occurs in increments of 10GB up to 64TB.
- Ability to create up to 15 read replicas for higher throughput. Storage is shared so writes only need to be performed to centralized volumes, not duplicated across replicas.
- Enables you to scale memory and compute resources independently for greater flexibility and cost optimization.
- Operates in a Virtual Private Cloud (VPC) with firewalls for greater isolation and security.
- Supports encryption with keys managed in AWS Key Management Service (AWS KMS). Active data, backups, replicas, and snapshots are all encrypted.
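The 10 GB / 64 TB scaling behaviour above can be sketched with some toy arithmetic. This is illustrative only; the service manages scaling automatically:

```python
import math

INCREMENT_GB = 10        # storage grows in 10 GB increments
MAX_GB = 64 * 1024       # 64 TB maximum, expressed in GB

def allocated_storage_gb(used_gb: float) -> int:
    """Round used storage up to the next 10 GB increment, capped at 64 TB."""
    if used_gb > MAX_GB:
        raise ValueError("exceeds the 64 TB DocumentDB storage maximum")
    # Assumed floor of one increment for a non-empty cluster (illustrative).
    return min(MAX_GB, max(INCREMENT_GB, math.ceil(used_gb / INCREMENT_GB) * INCREMENT_GB))

assert allocated_storage_gb(3) == 10        # 3 GB used -> one 10 GB increment
assert allocated_storage_gb(25) == 30       # rounds up to the next increment
assert allocated_storage_gb(64 * 1024) == 64 * 1024  # hard ceiling
```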
1.37.2 Build to run service included in the OTC
1.37.2.1 Build service pre-requisite
- Refer to generic description.
1.37.2.2 Build to run service
- Refer to generic description.
1.37.3 RUN services included in the MRC
1.37.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the service.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.37.3.2 Reporting
By default, no. The customer can request a point-in-time report through a change request.
1.37.3.3 KPI & alerts
Monitoring
Yes
Available metrics
Amazon CloudWatch metrics for Amazon DocumentDB are available at: Monitoring Amazon DocumentDB with CloudWatch – Amazon DocumentDB
Alerts observed
- DatabaseConnections
- DatabaseConnectionsMax
- NetworkThroughput
- CPUUtilization
- CPUCreditUsage
- CPUCreditBalance
- FreeLocalStorage
- FreeableMemory
- BufferCacheHitRatio
- VolumeBytesUsed
- VolumeReadIOPs
- VolumeWriteIOPs
1.37.3.4 Backup and restore
Data backup and restore
Amazon DocumentDB (with MongoDB compatibility) continuously backs up your data to Amazon Simple Storage Service (Amazon S3) for 1–35 days so that you can quickly restore to any point within the backup retention period. Amazon DocumentDB also takes automatic snapshots of your data as part of this continuous backup process.
You can also retain backup data beyond the backup retention period by creating a manual snapshot of your cluster’s data. The backup process does not impact your cluster’s performance.
Service restore
Recovery will be from Infra as Code.
1.37.3.5 AWS SLA High Availability and Disaster Recovery inter-region
You can achieve high availability and read scaling in Amazon DocumentDB (with MongoDB compatibility) by using replica instances. A single Amazon DocumentDB cluster supports a single primary instance and up to 15 replica instances. These instances can be distributed across Availability Zones within the cluster’s Region. The primary instance accepts read and write traffic, and replica instances accept only read requests.
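The single-primary, multi-replica read-scaling model above can be sketched as a toy read/write router. The endpoint names are illustrative; real clients would typically use the cluster's reader endpoint rather than routing themselves:

```python
import itertools

class ClusterRouter:
    """Toy router: writes go to the primary, reads rotate over replicas."""

    MAX_REPLICAS = 15  # a cluster supports up to 15 replica instances

    def __init__(self, primary, replicas):
        if len(replicas) > self.MAX_REPLICAS:
            raise ValueError("a cluster supports at most 15 replica instances")
        self.primary = primary
        # With no replicas, reads fall back to the primary.
        self._reads = itertools.cycle(replicas or [primary])

    def endpoint_for(self, operation: str) -> str:
        return self.primary if operation == "write" else next(self._reads)

router = ClusterRouter("primary-1", ["replica-1", "replica-2"])
assert router.endpoint_for("write") == "primary-1"
assert router.endpoint_for("read") == "replica-1"
assert router.endpoint_for("read") == "replica-2"
```

Because storage is shared across the cluster, adding replicas raises read throughput without duplicating writes.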
Sharding Option:
No. Amazon DocumentDB’s distributed storage architecture is a different approach to scaling than MongoDB sharding.
1.37.4 Charging model
Work Unit |
per DB Instance |
1.37.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Adding a Replica to an Amazon DocumentDB Cluster | 1 token |
Creating a Cluster Snapshot | 1 token |
Restoring from a Snapshot | 1 token |
Removing an Instance from a Cluster | 1 token |
Deleting a Cluster | 1 token |
Other changes | Estimation in tokens based on time spent |