Customers' business applications deployed on AWS depend on AWS cloud native services (IaaS, PaaS). Orange Business Services (OBS) provides the managed services necessary to ensure service assurance and change management for those dependencies, as well as the configuration and deployment needed to build and recover them.
Three categories of services can typically be distinguished:
- User plane services: if a business application depends on such a service, a defect in the service is likely to affect the application. The service holds no persistent data, so recovery does not require a data restore.
- Data services: if a business application depends on a data service, a defect in the service is likely to affect the application. The service holds persistent data, so recovery may require a data restore. Data loss or data corruption may also affect the business application.
- Other services: the business application does not depend on them. Most are used for automation, observation, or migration, and losing one is unlikely to affect the business application. Some are used to manage the user plane and data plane services of the business application; others have specific uses, for which a scope of work shall be established should the customer require OBS to leverage them as part of the managed service.
User plane services | Data services | Other services |
Compute: AWS Elastic Beanstalk, AWS Lambda, Amazon Elastic Compute Cloud (EC2) | Storage: Simple Storage Services (S3), Elastic Block Store (EBS) | Management & Governance: Trusted Advisor, AWS Backup, AWS CloudWatch, AWS Organizations |
Networking: Application Load Balancer, Route 53, AWS Network Firewall, AWS Direct Connect, Amazon Elastic Load Balancer (ELB) | Databases: DynamoDB, Amazon RDS, Amazon DocumentDB, ElastiCache for Redis, Amazon MemoryDB for Redis, Amazon Aurora | Security management: AWS Security Hub, Amazon Inspector |
AWS Cloud Native services by category
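The two questions behind this categorization (does the business application depend on the service, and does the service hold persistent data) can be sketched as a small decision function. This is only an illustrative sketch: the names `CloudService`, `Category`, and `categorize` are hypothetical and not part of any OBS or AWS tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    USER_PLANE = "user plane service"
    DATA = "data service"
    OTHER = "other service"


@dataclass(frozen=True)
class CloudService:
    # Hypothetical descriptor, used for illustration only.
    name: str
    application_depends_on_it: bool  # does the business application depend on it?
    has_persistent_data: bool        # does the service hold persistent data?


def categorize(service: CloudService) -> Category:
    """Apply the two criteria described in the text, in order."""
    if not service.application_depends_on_it:
        return Category.OTHER
    if service.has_persistent_data:
        return Category.DATA
    return Category.USER_PLANE


def recovery_may_need_data_restore(service: CloudService) -> bool:
    """Only data services may require a data restore during recovery."""
    return categorize(service) is Category.DATA


# Example entries taken from the table above.
lambda_fn = CloudService("AWS Lambda", True, False)
rds = CloudService("Amazon RDS", True, True)
advisor = CloudService("Trusted Advisor", False, False)
```

With these example entries, `categorize` places AWS Lambda in the user plane, Amazon RDS in the data services, and Trusted Advisor in the other services, matching the table.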
The tasks involved in managing a cloud native service depend on the service. They consist of:
- Configuring and deploying the service: Infrastructure as Code (IaC) is used to configure the service, its observability, and its backup. Level 3 expertise on the service is applied, as defined in the scope of work, to ensure proper implementation (refer to the detailed description of build and SRE services).
- Applying the security group and access control policy defined by the customer.
- Service recovery from Infrastructure as Code: in case of failure, most services have to be recovered through redeployment. Re-configuring a service manually from scratch is slow and error prone, which is why recovery by redeployment from Infrastructure as Code is preferred.
- Supervision and remediation: alarms raised on the service are watched during the monitoring window (typically 8×5 or 24×7). When an alarm occurs, an incident ticket is raised, a priority is assigned, and the customer is notified. Remedial action is then taken using the procedures that Level 3 makes available to Level 2 / 1. Remediation of a cloud native service may be necessary to restore the business application. Should the procedure not resolve the incident, the incident is escalated to Level 3. Should the root cause lie with the CSP itself, Level 3 raises the incident with the CSP.
- Backup and restore: if the service has persistent data, that data must be backed up. The managed service consists of configuring the backup solution and monitoring that it runs properly. Note: the backup solution, e.g. AWS Backup, has to be subscribed to separately. Restoring the service after an incident may involve restoring the data from a backup.
- OS patching and anti-virus: keeping the OS up to date and virus free is a managed service for Managed Virtual Machine / Managed OS. Please refer to the detailed description.
- Specifics: some cloud native services may have specific configuration or management tasks.
- Business application specifics: by default, standard alerts are watched. Configuring alerts or logs on a cloud native service that are specific to a business application is subject to a specific scope of work.
Tasks involved in managed services for cloud native service
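The supervision and remediation task described above follows a fixed escalation order, which can be sketched as follows. This is a simplified model under stated assumptions: `Incident`, `handle_alarm`, and the `run_procedure` callback are hypothetical names, and a real implementation would sit behind the ITSM and supervision tooling rather than in a single function.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Incident:
    # Hypothetical ticket model, used for illustration only.
    alarm: str
    priority: str
    log: List[str] = field(default_factory=list)


def handle_alarm(
    alarm: str,
    priority: str,
    run_procedure: Callable[[str], bool],
    root_cause_is_csp: bool,
) -> Incident:
    """Follow the escalation order described in the text.

    run_procedure stands for the Level 2 / 1 remedial procedure provided by
    Level 3; it returns True when the incident is resolved.
    """
    incident = Incident(alarm=alarm, priority=priority)
    incident.log.append("incident ticket raised")
    incident.log.append(f"priority assigned: {priority}")
    incident.log.append("customer notified")

    if run_procedure(alarm):
        incident.log.append("resolved by Level 2 / 1 procedure")
        return incident

    incident.log.append("escalated to Level 3")
    if root_cause_is_csp:
        incident.log.append("raised to the CSP by Level 3")
    return incident


# Example: the procedure fails and the root cause lies with the CSP.
escalated = handle_alarm("HighCPU", "P2", lambda alarm: False, root_cause_is_csp=True)
# Example: the procedure resolves the incident.
resolved = handle_alarm("HighCPU", "P2", lambda alarm: True, root_cause_is_csp=False)
```

The alarm name and priority label in the examples are placeholders; the point of the sketch is only the ordering of ticket creation, notification, remediation, and escalation.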
The number of management tasks that are necessary and included in the managed service depends on the cloud native service being managed. This drives the complexity of the managed service.
The tasks involved typically depend on the category of the cloud native service: user plane or data plane services, on which the business application depends, or other services, on which it does not.
Task | Charging model | User plane services | Data plane services | Other services |
Purpose | | Used to support customer application | Used to support customer application | Used to operate user plane or data plane |
Build | One-time charge based on SoW | IaC in Git, pushed via CI / CD | IaC in Git, pushed via CI / CD | IaC in Git, pushed via CI / CD |
Maintaining IaC without changes | Monthly recurring charge | Yes | Yes | Yes |
Monitoring & alerts | Monthly recurring charge | Yes | Yes | |
Configuration restore on incident | Included in MRC | Yes, from IaC or export | Yes, from IaC or backup | Yes, from IaC when applicable |
Data backup and restore on incident | Included in MRC | Yes | ||
Network and Security Management | Based on SoW | Optional: Based on SoW | Optional: Based on SoW | |
Service Desk | Per incident ticket or percentage | Yes | Yes | Yes |
Change Management | Per change, in Tokens vs complexity | Via IaC in Git, pushed via CI / CD. | Via IaC in Git, pushed via CI / CD. | Via IaC in Git, pushed via CI / CD |
Disaster recovery | Specific design and quote | Optional: Based on SoW | Optional: Based on SoW |
AWS service | Type | Configuration | Monitoring and alerts configured in Amazon CloudWatch | Backup configured in AWS Backup | Recovery procedure | Patch management | Antivirus management | Specificities |
Pre-requisite in case of | Class 2, Class 4 when no AWS backup available for the service | Class 2 | Class 2 | Class 2 If different from a restore then Class 4, Class 5 | Class 2 | n/a | ||
Amazon Elastic Compute Cloud (EC2) – per instance | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | AWS Backup | From Backup | AWS Systems Manager Patch Manager | OBS Sophos | Only supported OS versions |
Elastic Block Store (EBS) – included in Amazon EC2 | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | AWS Backup | From Backup | n/a | n/a | Part of managed Amazon EC2 |
Auto Scaling – per Auto Scaling Group | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | n/a | Only supported OS versions | ||
Elastic Kubernetes Service (EKS) – per cluster per vCPU | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | From IaC | n/a | n/a | Patch management is included in the service |
Elastic Kubernetes Service (EKS) Fargate – per pod | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | From IaC | n/a | n/a | |
AWS Elastic Beanstalk – per Web Application | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | From IaC | n/a | n/a | Python, Ruby, Java, .NET, PHP, Node.js, Go and Docker | |
AWS Lambda – per 100 lines of code | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | From IaC | n/a | n/a | Customer to provide AWS Lambda code |
AWS Key Management Service (KMS) (mutualized HSM) | Managed | Terraform Plan or AWS CloudFormation | Amazon CloudWatch | From IaC | n/a | n/a | Natively Redundant | |
Amazon Elastic Load Balancer | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | From IaC | n/a | n/a | |
Application Load Balancer – per Application Load Balancer | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | From IaC | n/a | n/a | ||
Route 53 – Per zone | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | From IaC | n/a | n/a | |
CloudFront – per endpoint | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | From IaC | n/a | n/a | |
AWS Direct Connect | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | Configured once, need to re-configure if an incident occurs | n/a | n/a | On-premises connection/routing is excluded from SoW |
Network ACLs | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | From IaC | n/a | n/a | Recovery from IaC is based on SoW |
AWS Network Firewall | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | From IaC | n/a | n/a | Recovery from IaC is based on SoW |
AWS WAF (web application firewall) | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | From IaC | n/a | n/a | Recovery is based on SoW |
VPN Gateway – per connection (cloud side MS only – link and e2e excluded) | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | n/a | From IaC | n/a | n/a | e2e link excluded, SIC required on top export is SoW (e.g. pre-shared keys) |
Network Security Groups – per 5 security groups with limitation of number of rules inside the security group | Change mgt | Terraform or AWS CloudFormation | n/a | n/a | From IaC | n/a | n/a | |
VPC (up to 5) – included in managed tenant | Change mgt | Terraform or AWS CloudFormation | n/a | n/a | From IaC | n/a | n/a | |
Simple Storage Services (S3) | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Optional: AWS Backup | From Data backup | n/a | n/a | SoW necessary for data backup |
DynamoDB – per DynamoDB table | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in continuous backup and on-demand backup using AWS Backup or DynamoDB | From Backup | n/a | n/a | Execution of customer-provided script: SoW |
DynamoDB – per additional DynamoDB table | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in continuous backup and on-demand backup using AWS Backup or DynamoDB | From Backup | n/a | n/a | Execution of customer-provided script: SoW |
Amazon Aurora MySQL Compatible – per DB instance | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in continuous backup, snapshot using AWS Backup, and Backtrack | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon Aurora MySQL Compatible – per additional DB instance | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in continuous backup, snapshot using AWS Backup, and Backtrack | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon RDS – per DB instance | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in automated backup or manual backup | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon RDS – per additional DB instance | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in automated backup or manual backup | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon Aurora PostgreSQL Compatible – per DB instance | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in continuous backup and snapshot using AWS Backup | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon Aurora PostgreSQL Compatible – per additional DB instance | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in continuous backup and snapshot using AWS Backup | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon Neptune – per DB instance | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in automated backup or manual backup | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon Neptune – per additional DB instance | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in automated backup or manual backup | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
ElastiCache for Redis – per node | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in automated backup or manual backup | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
ElastiCache for Redis – per additional node | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in automated backup or manual backup | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon MemoryDB for Redis – per node | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in automated backup or snapshot | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon MemoryDB for Redis – per additional node | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Built-in automated backup or snapshot | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon Keyspaces (for Apache Cassandra) – per table within keyspace | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Amazon Keyspaces PITR | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
Amazon Keyspaces (for Apache Cassandra) – per additional table within keyspace | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | Amazon Keyspaces PITR | From Backup | IaC | n/a | Execution of customer-provided script: SoW |
ElastiCache for Memcached – per node | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | not available | From IaC | IaC | n/a | Execution of customer-provided script: SoW |
ElastiCache for Memcached – per additional node | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | not available | From IaC | IaC | n/a | Execution of customer-provided script: SoW |
Amazon Quantum Ledger Database – per table within ledger | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | not available / on-demand journal export feature | From IaC | IaC | n/a | Execution of customer-provided script: SoW |
Amazon Quantum Ledger Database – per additional table within ledger | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | not available / on-demand journal export feature | From IaC | IaC | n/a | Execution of customer-provided script: SoW |
Amazon MQ for Apache ActiveMQ | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | From Terraform IaC or CloudFormation | n/a | n/a | ||
Amazon Simple Queue Service | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | From Terraform IaC or CloudFormation | n/a | n/a | ||
Amazon Simple Notification Service (SNS) | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | From Terraform IaC or CloudFormation | n/a | n/a | ||
Amazon MQ for RabbitMQ | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | From Terraform IaC or CloudFormation | n/a | n/a | ||
Cognito | Terraform or AWS CloudFormation | From Infra as Code for user pools and Identity pools | n/a | n/a | ||||
AWS Directory Service (managed AD) | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | From Terraform IaC or CloudFormation | n/a | n/a | ||
Amazon API Gateway | Managed | Terraform or AWS CloudFormation | Amazon CloudWatch | From Terraform IaC or CloudFormation | n/a | n/a | ||
CloudWatch (option) – per managed resource | Change mgt | Terraform or AWS CloudFormation | n/a | n/a | From IaC | n/a | n/a | Optional based on change mgt |
SSM – per SSM Document | Change mgt | n/a | n/a | custom SSM Document configured and maintained by customer | ||||
Trusted Advisor | Use & Change mgt | n/a | n/a | Used by the Service Reliability Engineer | ||||
AWS Backup | Use & Change mgt | Terraform or AWS CloudFormation | Backup policy from Terraform IaC. Data restored from backup. | n/a | n/a | Used by default for backup when available for the resource. Data restore is based on SoW | ||
AWS Code Star / Code Commit / Code Pipeline / Code Deploy / Code Artifact | Use & Change mgt | n/a | n/a | SoW | ||||
AWS Organizations | Change mgt | Terraform or AWS CloudFormation | n/a | n/a | ||||
AWS Management Console | Use & Change mgt | n/a | n/a | |||||
AWS Security Hub | SoW | n/a | n/a | Specific SoW | ||||
AWS Batch | SoW | n/a | n/a | Specific SoW |
Table of tasks involved in the management of cloud services (extract of services)
AWS tooling and OBS backend operations tooling are leveraged to deliver the managed services. Should the customer require different tooling, its feasibility shall be confirmed with OBS, and the RACI and work units may be revised.
Process | Tool used by OBS MA delivery |
Configuration of the infrastructure | Terraform plan, CloudFormation (option), Git repository, CI / CD |
Supervision solution | Amazon CloudWatch with connector to OBS supervision |
Backup | AWS Backup (incl. snapshots) |
OS patching solution | AWS Systems Manager Patch Manager, OBS MA patching tool (BRAC), OBS OS factory |
Antivirus solution | OBS MA Sophos tool |
Logging solution | Amazon CloudWatch Logs (on demand) |
Recovery | From backup when it exists; from Terraform plan in Git when it exists; ideally from up-to-date Infrastructure as Code with CI / CD |
Admin connectivity | VPN to OBS CASA Zone – connection through CyberArk |
Portal for access to MA contract, incident & change ITSM | OBS CloudStore |
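The Recovery row of the tooling table lists sources that apply only "when they exist". A minimal sketch of that selection, assuming two boolean facts about a resource, could look like this. The function name `recovery_sources` and its labels are hypothetical; the order simply mirrors the order in which the table lists the sources.

```python
from typing import List


def recovery_sources(has_backup: bool, has_iac_in_git: bool) -> List[str]:
    """Return the recovery sources available for a resource.

    The labels are illustrative and listed in the same order as the
    Recovery row of the tooling table.
    """
    sources: List[str] = []
    if has_backup:
        sources.append("backup")
    if has_iac_in_git:
        sources.append("IaC in Git")
    # An empty list means no automated recovery source exists for the resource.
    return sources


both = recovery_sources(has_backup=True, has_iac_in_git=True)
iac_only = recovery_sources(has_backup=False, has_iac_in_git=True)
```

A service without persistent data, such as a load balancer, would typically fall in the `iac_only` case, which is consistent with the "From IaC" entries in the task table.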
The following prerequisites apply to all managed services:
- The Customer shall have defined a valid architecture. (OBS can optionally provide Professional Services for architecture definition.)
- The Customer shall hold a valid AWS subscription, including an AWS Support plan, and shall procure the AWS resources. OBS can optionally supply this subscription inclusive of AWS support (refer to the Multi-Cloud Ready offer for AWS); however, the subscription, the IaaS resources, and the AWS support are not part of the Managed Services. The Managed Services leverage this support contract to escalate incidents to AWS as the CSP.
- The Customer's AWS platform shall be structured according to AWS landing zone best practices, or shall offer comparable services.
- OBS proposes a default RACI depending on the class of transition and the resource managed. As a prerequisite to the project, OBS and the Customer shall agree on the RACI.
- OBS and the Customer shall agree on the tooling used for Git, the CI / CD chain, and the monitoring, logging, and alerting solution.
- Additional prerequisites apply when the transition is not entirely the responsibility of OBS (not Full Build; refer to the Build chapter of the document).
In the case of a Fully Managed service, OBS uses its own Git, CI / CD chain, and monitoring, logging, and alerting solution.
In the case of a Co-managed service, OBS and the Customer agree on the Git, CI / CD chain, and monitoring, logging, and alerting solution to be used. By default, the tooling is:
- Either based on AWS tools, i.e. CloudFormation and Amazon CloudWatch
- Or based on generic multi-cloud tooling proposed by OBS, e.g. CaasCad (Prometheus, Grafana, …)
This tooling is not included in the Managed Applications work units and can be purchased separately, either as part of the AWS subscription or as a multi-cloud tooling proposal made by OBS.
The following criteria shall be met, and approved by Level 2, before a cloud native component becomes an active managed service (i.e. Run) operated by Level 2 / Level 1. The owner of the Build and of the Level 3 support is responsible for making sure that the criteria are met:
- The architecture and deployment of the service shall be defined.
- The service shall be deployed through Infrastructure as Code and tested before it is transitioned to the run team. Typically, this means successful testing in a pre-production environment that is identical to production. Note: IaC is necessary to recover the services in case of major failure.
- The use of the service shall be explained to the operations team.
- The security policies and access control shall have been configured.
- Access for the OBS Level 2 teams shall have been configured.
- The service shall export the necessary metrics to Amazon CloudWatch.
- The data backup shall be configured in AWS Backup when backup is applicable.
- The disaster recovery shall be configured when applicable.
- The troubleshooting and service restoration procedures shall be provided to Level 2.
- Where a procedure requires logs or dashboards, those shall have been developed and deployed before the transfer to the run phase.
- A remedial procedure on incident shall not last more than 15 minutes. Beyond that duration, the effort is charged on a time basis.
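The transition criteria behave like a checklist that must be fully satisfied before Level 2 approves the move to Run, and the 15-minute limit can be expressed as a simple threshold. The sketch below is illustrative only: the checklist keys and function names are hypothetical, criteria marked "when applicable" are simply left out of the checklist when they do not apply, and it assumes that only the time beyond 15 minutes is charged, which the text does not state explicitly.

```python
from typing import Dict, List

REMEDIAL_LIMIT_MINUTES = 15  # limit stated in the criteria above


def unmet_criteria(checklist: Dict[str, bool]) -> List[str]:
    """Return the names of the criteria that are not yet met."""
    return [name for name, met in checklist.items() if not met]


def ready_for_run(checklist: Dict[str, bool]) -> bool:
    """A component is ready for Run only when every listed criterion is met."""
    return not unmet_criteria(checklist)


def chargeable_minutes(procedure_minutes: int) -> int:
    """Minutes charged on a time basis.

    Assumption: only the time beyond the 15-minute limit is charged.
    """
    return max(0, procedure_minutes - REMEDIAL_LIMIT_MINUTES)


# Hypothetical checklist for one component; backup is applicable here.
checklist = {
    "architecture and deployment defined": True,
    "deployed through IaC and tested": True,
    "metrics exported to Amazon CloudWatch": True,
    "backup configured in AWS Backup": False,
    "procedures provided to Level 2": True,
}
```

In this example the component is not ready for Run because the backup criterion is unmet, and a 40-minute remedial procedure would leave 25 minutes to be charged under the stated assumption.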