Cloud Load Balancing
Description
Google Cloud Load Balancing operates at layer 4 or layer 7 of the Open Systems Interconnection (OSI) model. It is a software-based managed service that distributes traffic across multiple application instances in a single region or in multiple regions, and acts as the single point of contact for clients. The load balancer distributes inbound flows arriving at its frontend to backend pool instances, according to the configured load-balancing rules and health probes. The backend instances can be GCP virtual machines or instances in a managed instance group.
A public load balancer can provide outbound connections for virtual machines (VMs) inside your virtual network. These connections are accomplished by translating their private IP addresses to public IP addresses. Public Load Balancers are used to load balance internet traffic to your VMs.
An internal (or private) load balancer is used where private IPs are needed at the frontend only. Internal load balancers are used to load balance traffic inside a virtual network. A load balancer frontend can be accessed from an on-premises network in a hybrid scenario.
Build to run service included in the OTC
Build service pre-requisite
- Refer to generic description.
Build to run service
- Refer to generic description.
RUN services included in the MRC
Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the load balancer.
- This file can be executed with a CI/CD and the execution has been tested successfully.
Co-manage option
Yes, if the CI/CD is shared with the customer.
KPI & alerts
Monitoring
Yes: Insights, Metrics, Health probes; new metrics possible with logs.
KPI monitored
L4/TCP
- l3/external/rtt_latencies >= xms
- l3/internal/rtt_latencies >= xms
L7 / HTTP(s)
- https/backend_latencies >= xms
- https/internal/backend_latencies >= xms
HTTP codes ratio (on demand)
- https/backend_request_count response_code_class = 500 / https/backend_request_count response_code_class = 200 >= x%
- https/backend_request_count response_code_class = 400 / https/backend_request_count response_code_class = 200 >= x%
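As an illustration, the ratio rules above can be evaluated with a small helper. The function name, sample counts, and the 2% threshold below are assumptions for this sketch; in practice the request counts would come from Cloud Monitoring.

```python
def error_ratio_alert(count_err: int, count_2xx: int, threshold_pct: float) -> bool:
    """Return True when the error/2xx backend request-count ratio breaches the
    threshold, mirroring the catalogue rule
    `backend_request_count(5xx or 4xx) / backend_request_count(2xx) >= x%`."""
    if count_2xx == 0:
        # No successful requests at all: treat any error as a breach.
        return count_err > 0
    return (count_err / count_2xx) * 100.0 >= threshold_pct


print(error_ratio_alert(50, 1000, 2.0))  # 5% >= 2% -> alert
print(error_ratio_alert(10, 1000, 2.0))  # 1% < 2% -> no alert
```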
Alerts observed
L4/TCP
- l3/external/rtt_latencies >= xms
- l3/internal/rtt_latencies >= xms
L7 / HTTP(s)
- https/backend_latencies >= xms
- https/internal/backend_latencies >= xms
HTTP codes ratio (on demand)
- https/backend_request_count response_code_class = 500 / https/backend_request_count response_code_class = 200 >= x%
- https/backend_request_count response_code_class = 400 / https/backend_request_count response_code_class = 200 >= x%
Backup and restore
Data backup and restore
Not applicable. Load balancer does not store data persistently.
Service restore
GCP SLA High Availability and Disaster Recovery inter-region
GCP ensures high availability of the load balancer service natively.
Maintaining cross-region disaster recovery requires a specific design and is subject to additional charges.
Charging model
Work Unit | Per Load Balancer instance |
Changes catalogue – in Tokens, per act
Changes examples | Effort | Impact on MRC |
Setup / modify / delete URI | 1 token | |
Change health probes / Add new backend | 2 tokens | |
Other changes | Estimation in tokens based on time spent |
Cloud DNS
Description
Cloud DNS hosts your Domain Name System (DNS) domains in GCP. It offers both public and private managed DNS zones. A public zone is visible to the public internet, while a private zone is visible only from one or more Virtual Private Cloud (VPC) networks that you specify.
Build to run service included in the OTC
Build service pre-requisite
- Refer to generic description.
Build to run service
- Refer to generic description.
RUN services included in the MRC
Running the managed Cloud DNS service is optional; the Customer may request it depending on their interest. By default, there is no recurring task on the Cloud DNS service, only on-demand changes and on-demand investigations.
Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the DNS.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
Co-manage option
For the public part, OBS works with the customer on the public domain naming context.
For the private part, a RACI must be defined.
KPI & alerts
Monitoring
Yes, Metrics
KPI monitored
Number of changes in the DNS database.
Alerts observed
Number of changes in the DNS rules
Backup and restore
Data backup and restore
Yes. Backup is proposed based on regular export.
Service restore
The CI/CD chain is used to redeploy the records into the native DNS service, either from a backup zone or from an export.
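As a sketch of the restore path, the following parses a minimal BIND-style zone export into records that a CI/CD job could re-apply. The one-record-per-line format handled here is a simplifying assumption: real `gcloud dns record-sets export` output covers more record types and layouts.

```python
def parse_zone_export(text: str) -> list[dict]:
    """Parse a minimal BIND-style zone export into record dictionaries.

    Illustrative only: each non-comment line is assumed to be
    `name ttl class type rdata`.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):  # skip blanks and comments
            continue
        name, ttl, _rclass, rtype, rdata = line.split(None, 4)
        records.append({"name": name, "ttl": int(ttl), "type": rtype, "rdata": rdata})
    return records


export = """\
; exported zone
www.example.com. 300 IN A 203.0.113.10
example.com. 3600 IN MX 10 mail.example.com.
"""
for r in parse_zone_export(export):
    print(r["name"], r["type"], r["rdata"])
```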
GCP SLA High Availability and Disaster Recovery inter-region
Cloud DNS is a high-performance, resilient, global Domain Name System (DNS) service that publishes your domain names to the global DNS.
For public DNS, the customer remains responsible for domain registration with the registrar.
Charging model
Work Unit | Per resource group |
Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create / update/ delete zone (one zone including reverse) | 1 token |
Create / update/ delete record (up to 10 records) | 1 token |
Zone delegation* | 1 token |
Configure Firewall DNS | 2 tokens |
Other changes | Estimation in tokens based on time spent |
Content Delivery Network (CDN)
Description
Google Cloud CDN is a fast, reliable, and secure content delivery network that delivers content with low latency. It caches your static content on Google's fast, globally distributed edge servers so that static assets are delivered quickly and efficiently, and it gives you the option to keep your data public or private. Cloud CDN thus makes an organization's website load easily, quickly, and securely, for the organization as well as for its end users.
Build to run service included in the OTC
Build service pre-requisite
- Refer to generic description.
Build to run service
- Refer to generic description.
RUN services included in the MRC
Running the managed Cloud CDN service is optional: mandatory if the offer is Managed Applications, optional if the offer is Managed Infrastructure.
Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the CDN.
- This file can be executed with a CI/CD and the execution has been tested successfully.
Co-manage option
Yes based on RACI determined during pre-sales or build.
KPI & alerts
Monitoring
Yes: Metrics and diagnostic logs
KPI monitored
- Byte Hit Ratio
- Request Count
- Response Size
- Total Latency
- Customized ping page per zone
Alerts observed
- Customized ping page per zone
- Latency per zone
- Log analysis on metrics
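The Byte Hit Ratio KPI listed above is the share of response bytes served from the CDN cache rather than the origin. This helper and its sample figures are illustrative assumptions, not part of the catalogue.

```python
def byte_hit_ratio(cache_bytes: int, total_bytes: int) -> float:
    """Byte Hit Ratio in percent: bytes served from cache over total bytes."""
    if total_bytes == 0:
        return 0.0
    return 100.0 * cache_bytes / total_bytes


print(byte_hit_ratio(900, 1000))  # 90.0
```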
Backup and restore
Data backup and restore
The configuration can be exported from the CI/CD pipeline.
Service restore
The Continuous Deployment chain is used to redeploy the CDN from the reference configuration file for the production environment committed in Git.
GCP SLA High Availability and Disaster Recovery inter-region
Based on design SOW, the service can be built in multiple regions.
Charging model
Work Unit | Per Endpoint |
Changes catalogue – in Tokens, per act
Changes examples | Effort |
Purge CDN | 1 Token |
Add URL | 1 Token |
Other changes | Estimation in tokens based on time spent |
Cloud NAT
Description
Cloud NAT is a distributed, software-defined managed service. Cloud NAT configures the Andromeda software that powers your Virtual Private Cloud (VPC) network so that it provides source network address translation (source NAT or SNAT) for VMs without external IP addresses. Cloud NAT also provides destination network address translation (destination NAT or DNAT) for established inbound response packets.
1.1.2 Build to run service included in the OTC
1.1.2.1 Build service pre-requisite
- Refer to generic description.
1.1.2.2 Build to run service
- Refer to generic description.
1.1.3 RUN services included in the MRC
1.1.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Cloud NAT.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.1.3.2 Co-manage option
No, OBS manages the Cloud NAT service.
1.1.3.3 KPI & alerts
Monitoring
Yes, Metrics
KPI monitored
- nat_allocation_failed = 1
- dropped_sent_packets_count >= x%
- dropped_received_packets_count >= x%
Alerts observed
- nat_allocation_failed = 1
- dropped_sent_packets_count >= x%
- dropped_received_packets_count >= x%
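As an illustration, the two alert rules above can be evaluated together; `max_drop_pct` stands in for the placeholder `x%` threshold and the sample values are assumptions.

```python
def nat_alerts(allocation_failed: int, dropped_sent: int, sent: int,
               max_drop_pct: float) -> list[str]:
    """Evaluate the catalogue's Cloud NAT alert rules on sampled metric values."""
    fired = []
    if allocation_failed == 1:  # nat_allocation_failed = 1
        fired.append("nat_allocation_failed")
    if sent > 0 and 100.0 * dropped_sent / sent >= max_drop_pct:
        fired.append("dropped_sent_packets_count")
    return fired


print(nat_alerts(1, 0, 1000, 5.0))   # allocation failure fires
print(nat_alerts(0, 80, 1000, 5.0))  # 8% dropped >= 5% fires
```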
1.1.3.4 Backup and restore
Data backup and restore
The configuration can be exported from the CI/CD pipeline.
Service restore
The Continuous Deployment chain is used to redeploy the Cloud NAT service from the reference configuration file for the production environment committed in Git.
1.1.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA by design. Based on the design SOW, the service can be built in multiple regions.
1.1.4 Charging model
Work Unit | Per Endpoint |
1.1.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create / update/ delete (including reverse) | 1 token |
Configure Firewall NAT | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.2 Cloud Router
1.2.1 Description
Cloud Router is a fully distributed and managed Google Cloud service that uses the Border Gateway Protocol (BGP) to advertise IP address ranges. It programs custom dynamic routes based on the BGP advertisements that it receives from a peer. Instead of a physical device or appliance, each Cloud Router is implemented by software tasks that act as BGP speakers and responders. A Cloud Router also serves as the control plane for Cloud NAT, and provides BGP services for Google Cloud products such as Cloud VPN and Cloud Interconnect.
1.2.2 Build to run service included in the OTC
1.2.2.1 Build service pre-requisite
- Refer to generic description.
1.2.2.2 Build to run service
- Refer to generic description.
1.2.3 RUN services included in the MRC
1.2.3.1 Run service pre-requisite
- A referential file exists in the Git used by OBS which includes the reference configuration of the Cloud Router.
- This file can be executed with a CI/CD used by OBS and the execution has been tested successfully.
1.2.3.2 Co-manage option
No, Orange Business Services manages the Cloud Router service.
1.2.3.3 KPI & alerts
Monitoring
Yes, Metrics, Logs, Probes
Cloud Router can be monitored through Cloud Monitoring, using alerts and metrics. Real-time native reporting from GCP (Cloud Monitoring, Cloud Logging) can be used by OBS; specific reporting is available on quote.
KPI monitored
gcp.router.best_received_routes_count | Current number of best routes received by router. |
gcp.router.bgp.received_routes_count | Current number of routes received on a bgp session. |
gcp.router.bgp.sent_routes_count | Current number of routes sent on a bgp session. |
gcp.router.bgp.session_up | Indicator for successful bgp session establishment. |
gcp.router.bgp_sessions_down_count | Number of BGP sessions on the router that are down. |
gcp.router.bgp_sessions_up_count | Number of BGP sessions on the router that are up. |
gcp.router.nat.allocated_ports | The number of ports allocated to all VMs by the NAT gateway |
gcp.router.nat.closed_connections_count | The number of connections to the NAT gateway that are closed |
gcp.router.nat.dropped_received_packets_count | The number of received packets dropped by the NAT gateway |
gcp.router.nat.new_connections_count | The number of new connections to the NAT gateway |
gcp.router.nat.open_connections | The number of connections open to the NAT gateway |
gcp.router.nat.port_usage (gauge) | The highest port usage among all VMs connected to the NAT gateway |
gcp.router.nat.received_bytes_count | The number of bytes received by the NAT gateway |
gcp.router.nat.received_packets_count | The number of packets received by the NAT gateway |
gcp.router.nat.sent_bytes_count | The number of bytes sent by the NAT gateway |
gcp.router.nat.sent_packets_count | The number of packets sent by the NAT gateway |
gcp.router.router_up | Router status, up or down |
gcp.router.sent_routes_count | Current number of routes sent by router. |
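As an illustration of how the session-count metrics above can feed an alert, a coarse router status can be derived from the up/down session counts. The status labels are assumptions for this sketch, not OBS alert names.

```python
def bgp_health(sessions_up: int, sessions_down: int) -> str:
    """Derive a coarse status from gcp.router.bgp_sessions_up_count and
    gcp.router.bgp_sessions_down_count."""
    if sessions_down == 0:
        return "OK"
    if sessions_up == 0:
        return "CRITICAL"  # no established BGP session left
    return "DEGRADED"


print(bgp_health(2, 0))  # all sessions established
print(bgp_health(1, 1))  # partial loss of redundancy
```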
Alerts observed
Orange Business Services will set alerts depending on the SOW of the Customer.
1.2.3.4 Backup and restore
Data backup and restore
The backup is based on on-demand export of the IaC template.
Service restore
Recovery will be from Infra as Code or by Orange Business Services Operation Team actions.
The Continuous Deployment chain is used to redeploy the Cloud Router service from the reference configuration file for the production environment committed in Git.
1.2.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA and non-HA modes are provided by Google Cloud Platform depending on the design and service parameter configuration.
Recovery after a region loss is based on the design SOW; the service can be built in multiple regions.
1.2.4 Charging model
Work Unit | Per router |
1.2.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Modify/delete router (simple modification) | 1 token |
Create router (complex modification) | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.3 Cloud VPN
1.3.1 Description
Cloud VPN securely extends your peer network to Google’s network through an IPsec VPN tunnel. Traffic is encrypted and travels between the two networks over the public internet.
1.3.2 Build to run service included in the OTC
1.3.2.1 Build service pre-requisite
- Refer to generic description.
1.3.2.2 Build to run service
- Refer to generic description.
1.3.3 RUN services included in the MRC
1.3.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Cloud VPN.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.3.3.2 Co-manage option
No, Orange Business Services manages the Cloud VPN service.
1.3.3.3 KPI & alerts
Monitoring
Yes, Metrics, Logs, Probes
Cloud VPN can be monitored through Cloud Monitoring, using alerts and metrics. Real-time native reporting from GCP (Cloud Monitoring, Cloud Logging) can be used by OBS; specific reporting is available on quote.
KPI monitored
gcp.vpn.network.dropped_received_packets_count | Ingress packets dropped for tunnel. |
gcp.vpn.network.dropped_sent_packets_count | Egress packets dropped for tunnel. |
gcp.vpn.network.received_bytes_count | Ingress bytes for tunnel. |
gcp.vpn.network.sent_bytes_count | Egress bytes for tunnel. |
gcp.vpn.tunnel_established | Indicates successful tunnel establishment if greater than 0. |
gcp.router.best_received_routes_count | Number of best routes received by router. |
gcp.router.bgp.received_routes_count | Number of routes received on a bgp session. |
gcp.router.bgp.sent_routes_count | Number of routes sent on a bgp session. |
gcp.router.bgp.session_up | Indicator for successful bgp session establishment. |
gcp.router.bgp_sessions_down_count | Number of BGP sessions on the router that are down. |
gcp.router.bgp_sessions_up_count | Number of BGP sessions on the router that are up. |
gcp.router.router_up | Router status up or down |
gcp.router.sent_routes_count | Number of routes sent by router. |
Alerts observed
Orange Business Services will set alerts depending on the SOW of the Customer.
1.3.3.4 Backup and restore
Data backup and restore
The backup is based on on-demand export of the IaC template.
Service restore
Recovery will be from Infra as Code or by Orange Business Services Operation Team actions.
The Continuous Deployment chain is used to redeploy the Cloud VPN service from the reference configuration file for the production environment committed in Git.
1.3.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA is provided by Google Cloud Platform by default.
HA VPN is a high-availability (HA) Cloud VPN solution that lets you securely connect your on-premises network to your VPC network through an IPsec VPN connection in a single region. HA VPN provides an SLA of 99.99% service availability.
Recovery after a region loss is based on the design SOW; the service can be built in multiple regions.
1.3.4 Charging model
Work Unit | Per VPN tunnel |
1.3.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Modify/delete tunnel | 1 token |
Create tunnel | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.4 Cloud SQL
1.4.1 Description
Cloud SQL is a fully-managed database service that helps you set up, maintain, manage, and administer your relational databases on Google Cloud Platform. You can use Cloud SQL with MySQL, PostgreSQL, or SQL Server. Cloud SQL provides a cloud-based alternative to local MySQL, PostgreSQL, and SQL Server databases. Many applications running on Compute Engine, App Engine and other services in Google Cloud use Cloud SQL for database storage.
Each Cloud SQL instance is powered by a virtual machine (VM) running on a host Google Cloud server. Each VM operates the database program, such as MySQL Server, PostgreSQL, or SQL Server, and service agents that provide supporting services, such as logging and monitoring.
1.4.2 Build to run service included in the OTC
1.4.2.1 Build service pre-requisite
- Refer to generic description.
1.4.2.2 Build to run service
- Refer to generic description.
1.4.3 RUN services included in the MRC
1.4.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Cloud SQL service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.4.3.2 Co-manage option
Yes, if the CI/CD is shared with the customer (IaC part).
1.4.3.3 KPI & alerts
Monitoring
Yes, Metrics, SlowQuery Log (MySQL)
KPI monitored
- CPU utilization
- Storage usage
- Memory usage
- Read/write operations
- Ingress/Egress bytes
- MySQL queries
- MySQL questions
- Read/write InnoDB pages
- InnoDB data fsyncs
- InnoDB log fsyncs
- Active connections
Alerts observed
- CPU and memory utilization
- Disk utilization
- MySQL connections
- Auto-failover requests and replication lag
1.4.3.4 Backup and restore
Data backup and restore
The backup is based on regular export.
Service restore
Recovery will be from Infra as Code + Backup of the data.
1.4.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA and non-HA modes are provided by Google Cloud Platform depending on the design and service parameter configuration.
Recovery after a region loss is based on the design SOW; the service can be built in multiple regions.
1.4.4 Charging model
Work Unit | Per Instance |
1.4.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/update/delete instance or database (MySQL, PostgreSQL, or SQL Server); run SQL script | 1 token |
Database cloning | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.5 Cloud Storage
1.5.1 Description
Google Cloud Storage is a RESTful online file storage web service for storing and accessing data on Google Cloud Platform infrastructure. The service combines the performance and scalability of Google’s cloud with advanced security and sharing capabilities.
1.5.2 Build to run service included in the OTC
1.5.2.1 Build service pre-requisite
- Refer to generic description.
1.5.2.2 Build to run service
- Refer to generic description.
- In addition, build to run service for Cloud Storage service will include lifecycle rules, IAM policies.
1.5.3 RUN services included in the MRC
Running the managed Cloud Storage service is optional; depending on the Customer's interest in monitoring storage KPIs and alerting on them, the Customer may request the service. By default, there is no recurring task on the Cloud Storage service, only on-demand changes and on-demand investigations.
1.5.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Cloud Storage service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.5.3.2 Co-manage option
Yes
1.5.3.3 KPI & alerts
Monitoring
Yes, Metrics
The Cloud Storage service is monitored through Cloud Monitoring. Orange Business Services will examine Cloud Storage usage (e.g., how many bytes are stored, how many download requests come from your applications) and will set alerts according to your SOW.
Orange Business Services will collect metrics from Cloud Storage to:
- Visualize the performance of your Storage services
- Correlate the performance of your Storage services with your applications
gcp.storage.api.request_count | The number of API calls |
gcp.storage.authn.authentication_count | The number of HMAC/RSA signed requests |
gcp.storage.authz.acl_based_object_access_count | The number of requests that result in an object being granted access solely due to object ACLs. |
gcp.storage.authz.acl_operations_count | The usage of ACL operations |
gcp.storage.authz.object_specific_acl_mutation_count | The number of changes made to object specific ACLs |
gcp.storage.network.received_bytes_count | The number of bytes received over the network |
gcp.storage.network.sent_bytes_count | The number of bytes sent over the network |
gcp.storage.storage.object_count | The total number of objects per bucket |
gcp.storage.storage.total_byte_seconds | The total daily storage in byte seconds used |
gcp.storage.storage.total_bytes | The total size of all objects in the bucket |
1.5.3.4 Backup and restore
Data backup and restore
No backup.
Service restore
Recovery of the service configuration will be from Infrastructure as Code; data restore depends on backups managed by the Customer.
1.5.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA is provided by Google Cloud Platform by default for the Cloud Storage service.
1.5.4 Charging model
Work Unit | Per Bucket |
1.5.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Modify lifecycle rules / Data loading | 1 token |
Bucket synchronization | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.6 Storage Transfer Service
1.6.1 Description
Storage Transfer Service is a Google Cloud product that enables you to:
- Move or back up data to a Cloud Storage bucket, either from other cloud storage providers or from your on-premises environment.
- Move data from one Cloud Storage bucket to another, so that it is available to different groups of users or applications.
- Periodically move data as part of a data processing pipeline or analytical workflow.
With Storage Transfer Service, you can transfer data from other clouds, HTTP(S) and filesystems in private data centers, as well as transfer data between Google Cloud Storage buckets.
1.6.2 Build to run service included in the OTC
1.6.2.1 Build service pre-requisite
- Refer to generic description.
1.6.2.2 Build to run service
- Refer to generic description.
1.6.3 RUN services included in the MRC
Running the managed Storage Transfer Service is optional; depending on the Customer's interest in monitoring the transfer KPIs and alerting on them, the Customer may request the service. By default, there is no recurring task on the Storage Transfer Service, only on-demand changes and on-demand investigations.
1.6.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Storage Transfer Service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.6.3.2 Co-manage option
No, OBS fully manages the Storage Transfer Service.
1.6.3.3 KPI & alerts
Monitoring
Yes, Metrics, Logs, Probes
KPI monitored
- CPU
- Disk
- HTTP request and response status
- Memory
- Network
- Number of active instances
Alerts observed
- CPU
- Disk
- HTTP request and response status
- Memory
- Network
- Number of active instances
1.6.3.4 Backup and restore
Data backup and restore
The backup is based on on-demand export of the IaC template.
Using Google data transfer services, you can easily back up data from another cloud storage provider to a Cloud Storage bucket via the Storage Transfer Service.
Service restore
Recovery will be from Infra as Code + Backup of the data.
1.6.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA and non HA are provided by Google Cloud Platform by default.
1.6.4 Charging model
Work Unit | Per Job |
1.6.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Modify/delete Job | 1 token |
Create Job | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.7 Google Kubernetes Engine (Std)
1.7.1 Description
Google Kubernetes Engine (GKE) is a Google Cloud Platform (GCP) service. It is a hosted platform that allows you to run and orchestrate containerized applications. GKE manages Docker containers deployed on a cluster of machines.
GKE offers two modes of operation:
- Standard: You manage the underlying infrastructure of the cluster, which provides greater flexibility in configuring nodes.
- Autopilot: Google provisions and manages all of the underlying cluster infrastructure, including nodes and node pools. This gives you a cluster that is optimized for autonomous operation.
1.7.2 Build to run service included in the OTC
1.7.2.1 Build service pre-requisite
- Refer to generic description.
1.7.2.2 Build to run service
- Refer to generic description.
1.7.3 RUN services included in the MRC
1.7.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the GKE cluster.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.7.3.2 Co-manage option
Yes, if the CI/CD is shared with the customer
KPI & alerts
Monitoring
Yes, Insights, Metrics, logs, Health probes.
Orange Business Services will collect metrics from Docker, Kubernetes, and your containerized applications
KPI monitored
- Disk I/O
- CPU and memory usage
- Container and pod events
- Network throughput
- Individual request traces
Alerts observed
- Disk I/O
- CPU and memory usage
- Container and pod events
- Network throughput
1.7.3.3 Backup and restore
Data backup and restore
The backup is based on the IaC, the Kubernetes resources, and the data.
Service restore
Recovery will be from Infra as Code + Backup of the data.
1.7.3.4 GCP SLA High Availability and Disaster Recovery inter-region
HA and non-HA modes are provided by Google Cloud Platform depending on the design and service parameter configuration.
Recovery is based on the design SOW and requires actions from Orange Business Services operations teams.
1.7.4 Charging model
Work Unit | Per Cluster |
1.7.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Add/delete node | 1 token |
Update Cluster | 2 tokens |
Modify network ranges / modify autoscaling parameters | 4 tokens |
Other changes | Estimation in tokens based on time spent |
1.8 Google Kubernetes Engine (Autopilot)
1.8.1 Description
Google Kubernetes Engine (GKE) is a Google Cloud Platform (GCP) service. It is a hosted platform that allows you to run and orchestrate containerized applications. GKE manages Docker containers deployed on a cluster of machines.
GKE offers two modes of operation:
- Standard: You manage the underlying infrastructure of the cluster, which provides greater flexibility in configuring nodes.
- Autopilot: Google provisions and manages all of the underlying cluster infrastructure, including nodes and node pools. This gives you a cluster that is optimized for autonomous operation.
1.8.2 Build to run service included in the OTC
1.8.2.1 Build service pre-requisite
- Refer to generic description.
1.8.2.2 Build to run service
- Refer to generic description.
1.8.3 RUN services included in the MRC
1.8.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the GKE cluster.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.8.3.2 Co-manage option
Yes, if the CI/CD is shared with the customer
KPI & alerts
Monitoring
Yes, Insights, Metrics, logs, Health probes.
Orange Business Services will collect metrics from Docker, Kubernetes, and your containerized applications
KPI monitored
- Disk I/O
- CPU and memory usage
- Container and pod events
- Network throughput
- Individual request traces
Alerts observed
- Disk I/O
- CPU and memory usage
- Container and pod events
- Network throughput
1.8.3.3 Backup and restore
Data backup and restore
The backup is based on the IaC, the Kubernetes resources, and the data.
Service restore
Recovery will be from Infra as Code + Backup of the data.
1.8.3.4 GCP SLA High Availability and Disaster Recovery inter-region
HA and non-HA modes are provided by Google Cloud Platform depending on the design and service parameter configuration.
Recovery is based on the design SOW and requires actions from Orange Business Services operations teams.
1.8.4 Charging model
Work Unit | Per Cluster |
1.8.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Force update cluster | 1 token |
Other changes | Estimation in tokens based on time spent |
1.9 Compute Engine
1.9.1 Description
The managed service for Compute Engine is called Managed OS: OBS manages both the OS and the Compute Engine instance.
Orange Business Services can manage service units such as the OS, middleware, and databases on the managed Compute Engine.
Four managed service options are possible:
- Managed OS only
- Managed OS + Managed MW
- Managed OS + Managed DB
- Managed OS + Managed MW + Managed DB
Compute Engine is a computing and hosting service that lets you create and run virtual machines on Google infrastructure. Compute Engine offers scale, performance, and value that lets you easily launch large compute clusters on Google’s infrastructure.
1.9.2 Build to run service included in the OTC
1.9.2.1 Build service pre-requisite
- Refer to generic description.
1.9.2.2 Build to run service
- Refer to generic description.
1.9.3 RUN services included in the MRC
1.9.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Compute Engine.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.9.3.2 Co-manage option
Yes, but the RACI between OBS and the Customer must be defined carefully.
1.9.3.3 KPI & alerts
Monitoring is performed through the configuration and activation of Cloud Monitoring.
The OBS backend supervision system collects alerts from Cloud Monitoring and Cloud Logging.
Monitoring
Yes, Insights, Metrics, logs, Health probes.
Metrics do not require installation of the Monitoring or Logging agent, but you must enable the Container-Optimized OS Health Monitoring feature.
KPI monitored for Instances:
- CPU Utilization
- Count of disk read/write bytes
- Count of disk read/write operations
- Count of throttled read/write operations
- Count of sent bytes/received bytes
- Count of incoming bytes dropped due to firewall policy
- Count of incoming packets dropped due to firewall policy
Alerts observed:
Alert on CPU, Memory Usage and Disk Usage.
Project metrics:
Like most cloud service providers, Google Compute Engine limits the number of resources a project may consume. If the customer is approaching (or has reached) the quota for a specific resource, OBS will tune the quota metrics for the customer if needed.
Activating Detailed Monitoring will be charged by GCP.
- OS patching
GCP VM Manager
For managed OS, OBS leverages GCP VM Manager for the patching of the Operating System (OS).
Behavior: with GCP VM Manager, patch baselines are decided by Google, and all mandatory patches are applied to the Compute Engine instances, for both Windows and Linux.
Additional reporting could be asked by the Customer and extra fees will be charged.
- Antivirus
For managed OS, OBS leverages its central anti-virus system based on Sophos. This requires the installation of the anti-virus agent on the OS for each Compute Engine as well as the VPN connectivity to OBS Centralized Administration Zone. OBS systems allows for central reporting on Malware from its backend console system.
Should the Customer wish to keep its own antivirus system, OBS shall not be held responsible for protection against viruses.
- Backup and restore
Data backup and restore
By default, OBS leverages GCBDR on Compute Engine for managed OS. The GCBDR pattern and the retention period shall be agreed with the Customer prior to the RUN phase. The first backup is full; the following backups are incremental. The backup frequency can be configured, for example one full backup per week and one incremental backup per day per Compute Engine instance. The retention period depends on the Customer's request. GCP charges are calculated based on the change rate.
Restores of Compute Engine instances are performed from the backup.
- In case of incident, latest version of backup can be restored
- Upon change request, a previous version of backup can be restored.
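The weekly-full / daily-incremental pattern and the retention rule described above can be sketched as follows; the 7-day cycle and retention value are example assumptions to be agreed with the Customer:

```python
# Sketch of the weekly-full / daily-incremental backup pattern; the
# 7-day cycle and retention value are example assumptions.
from datetime import date, timedelta

def backup_plan(start, days, full_every=7):
    """Label each day in the window as a 'full' or 'incremental' backup."""
    return [(start + timedelta(d), "full" if d % full_every == 0 else "incremental")
            for d in range(days)]

def expired(plan, today, retention_days):
    """Backups older than the retention period are eligible for deletion."""
    cutoff = today - timedelta(days=retention_days)
    return [day for day, _ in plan if day < cutoff]

plan = backup_plan(date(2024, 1, 1), 14)       # two weeks of daily backups
old = expired(plan, date(2024, 1, 15), 7)      # keep only the last 7 days
```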
- GCP SLA High Availability and Disaster Recovery inter-region
The service is highly available within a single availability zone. HA can be configured using instance groups.
A multi-availability-zone design requires specific design work and is subject to specific additional charges.
This service is covered by GCBDR which enables the creation of backup copies across GCP Regions.
If this option is activated, traffic between regions and storage will be charged by GCP.
- Administration tasks tracing
Actions performed by OBS managed teams on the managed OS are done from OBS Administration Zone through an access controlled by a CyberArk bastion. OBS CyberArk bastion protects the access and keep trace of the actions performed by the maintenance team allowing for audit.
The VPN connectivity to the OBS Administration Zone is necessary for the management.
- Login on to the Virtual Machine
For Windows OS based Compute Engine, access shall be granted by the Customer to OBS managed application operations staff through a domain account configured with proper privilege groups.
For Linux OS based Compute Engine, an encrypted key is created and provided to OBS managed application operations staff to log onto the VM.
For managed applications, access is granted through a secret stored in a safe.
- Logs
Log management is not included in the managed OS / managed Compute Engine service.
Optionally it can be activated through GCP Cloud Logging through Change Request process.
- Security
By default, the MRC includes the use of security policies and groups as per customer’s configuration request.
The MRC does not cover security recommendations. Security recommendations can be part of an optional security scope of work based on customer request.
- Limitations
Managed Applications services is provided only for OS versions supported by the CSP vendor.
1.9.4 Charging model
Work Unit |
Per Virtual Machine instance |
1.9.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create a Virtual Machine | 2 Tokens |
Attach a Disk to a Virtual Machine | 2 Tokens |
Restore a Virtual Machine from a snapshot | 1 Token |
Backup a Virtual Machine | 1 Token |
Create and Deploy VMs in an Instance Group | 2 Tokens |
Start/Stop/Restart Virtual Machine | 2 Tokens |
Create/modify/delete Storage Accounts | 2 Tokens |
Other changes | Estimation in tokens based on time spent |
1.10 Virtual Private Cloud
1.10.1 Description
Virtual Private Cloud (VPC) provides networking functionality to Compute Engine virtual machine (VM) instances, Google Kubernetes Engine (GKE) clusters, and the App Engine flexible environment. VPC provides networking for your cloud-based resources and services that is global, scalable, and flexible.
A VPC network is a global resource which consists of a list of regional virtual subnetworks (subnets) in data centers, all connected by a global wide area network. VPC networks are logically isolated from each other in the Google Cloud Platform.
At the basic level, managing Virtual Private Cloud consists of building, deploying, and maintaining its Infra as Code and managing the changes.
OBS has 2 prices for Managed Virtual Private Cloud depending on the number of subnets of the customer projects:
- VPC with 1 to 2 subnets
- VPC with 3 or more subnets
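As a hypothetical illustration of the two-tier pricing rule above, assuming the second tier starts at three subnets (the tier labels are illustrative, not OBS price-book identifiers):

```python
# Hypothetical mapping of subnet count to the managed-VPC price tier;
# the tier labels are illustrative, not OBS price-book identifiers.
def vpc_price_tier(subnet_count):
    if subnet_count < 1:
        raise ValueError("a managed VPC has at least one subnet")
    return "tier 1 (1-2 subnets)" if subnet_count <= 2 else "tier 2 (3+ subnets)"
```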
The management of Virtual Private Cloud is included as part of a larger bundle of Network and Security Managed services which provides network and security design, maintenance, network watching, intrusion detection, troubleshooting depending on an agreed Scope of Work.
1.10.2 Build to run service included in the OTC
1.10.2.1 Build service pre-requisite
- Refer to generic description.
1.10.2.2 Build to run service
- Refer to generic description.
1.10.3 RUN services included in the MRC
1.10.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Virtual Private Cloud
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.10.3.2 Co-manage option
No, Orange Business Services manages the Virtual Private Cloud service.
1.10.3.3 KPI & alerts
Monitoring
Yes, Metrics, Logs (option)
Alerts observed:
Packet loss, up/down network
1.10.3.4 Backup and restore
Data backup and restore
Can be exported from Infra as Code.
Service restore
Recovery will be from Infra as Code + Backup
1.10.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA by design.
No automatic recovery after a region loss; the IaC must be re-run in another region (for the subnets only).
1.10.4 Charging model
Work Unit |
Per Virtual Private Cloud instance |
1.10.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Add subnet/add range IP on subnet/reservation of static address | 1 token |
Creation network peering | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.11 Persistent Disk
1.11.1 Description
Persistent disks are durable network storage devices that your instances can access like physical disks in a desktop or a server. The data on each persistent disk is distributed across several physical disks.
Persistent Disk is Google Cloud's block storage offering, storing data for VM instances running in Compute Engine or GKE.
OBS proposes four types of Persistent Disk:
- Managed Standard Persistent Disk
- Managed Balanced Persistent Disk
- Managed SSD Persistent Disk
- Managed Extreme Persistent Disk
1.11.2 Build to run service included in the OTC
1.11.2.1 Build service pre-requisite
- Refer to generic description.
1.11.2.2 Build to run service
- Refer to generic description.
1.11.3 RUN services included in the MRC
Running the managed Persistent Disk service is optional. When a Persistent Disk is attached to managed services, the Customer may request the service.
1.11.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Persistent Disk.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.11.3.2 Co-manage option
No, Orange Business Services manages the Persistent Disk service.
1.11.3.3 KPI & alerts
Monitoring
Yes, Metrics
The Persistent Disk service is monitored through Cloud Monitoring. Orange Business Services will examine Persistent Disk usage (e.g., how many bytes are stored, how many download requests are coming from your applications) and will set alerts according to your SOW.
Orange Business Services will collect metrics from Cloud Monitoring to:
- Graph multiple persistent disk performance metrics with the Metrics Explorer page
- Graph average IOPS by using the Disk read operations metric
- Graph average throughput rates by using the Disk read bytes metric
- Graph maximum per-second read operations by using the Peak disk read operations metric
- Graph average throttled operations rates by using the Throttled read operations metric
- Graph average throttled bytes rates by using the Throttled read bytes metric
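As an illustration of how average IOPS and throughput are derived from the cumulative counters listed above, a sketch with invented sample values:

```python
# Deriving average rates from cumulative Cloud Monitoring counters, the
# way Metrics Explorer graphs them; sample values are invented.
def rate(samples):
    """Average per-second rate between the first and last (t, count) sample."""
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    return (c1 - c0) / (t1 - t0)

read_ops = [(0, 0), (60, 12_000)]            # Disk read operations counter
read_bytes = [(0, 0), (60, 3_932_160_000)]   # Disk read bytes counter

avg_iops = rate(read_ops)                # 200 read ops per second
avg_mb_s = rate(read_bytes) / 1_000_000  # ~65.5 MB/s average throughput
```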
1.11.3.4 Backup and restore
Data backup and restore
Backup of IaC + disk + data
Service restore
Recovery will be from Infra as Code + Backup of the data.
1.11.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA by design but not DR by design.
A regional Persistent Disk can be used depending on application needs; otherwise the IaC must be re-run in another region and the data restored (option).
1.11.4 Charging model
Work Unit |
Per Disk |
1.11.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create Disk / Attach Disk to a VM | 1 token |
Extend Disk / Mount/Format Disk | 2 tokens |
Enable Encryption | 4 tokens |
Other changes | Estimation in tokens based on time spent |
1.12 Cloud Interconnect
1.12.1 Description
Cloud Interconnect provides low latency, high availability connections that enable you to reliably transfer data between your on-premises and Google Cloud Virtual Private Cloud (VPC) networks. Also, Interconnect connections provide internal IP address communication, which means internal IP addresses are directly accessible from both networks.
Cloud Interconnect offers two options for extending your on-premises network:
- Dedicated Interconnect provides a direct physical connection between your on-premises network and Google’s network.
- Partner Interconnect provides connectivity between your on-premises and VPC networks through a supported service provider.
1.12.2 Build to run service included in the OTC
1.12.2.1 Build service pre-requisite
- Refer to generic description.
1.12.2.2 Build to run service
- Refer to generic description.
1.12.3 RUN services included in the MRC
1.12.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Cloud Interconnect service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.12.3.2 Co-manage option
No, Orange Business Services manages the Cloud Interconnect service.
1.12.3.3 KPI & alerts
Monitoring
Yes, Insights, Metrics, Health probes
Metric
gcp.interconnect.network.attachment.capacity | Network capacity of the attachment |
gcp.interconnect.network.attachment.received_bytes_count | Number of inbound bytes received. |
gcp.interconnect.network.attachment.received_packets_count | Number of inbound packets received. |
gcp.interconnect.network.attachment.sent_bytes_count | Number of outbound bytes sent. |
gcp.interconnect.network.attachment.sent_packets_count | Number of outbound packets sent. |
gcp.interconnect.network.interconnect.capacity | Active capacity of the interconnect. |
gcp.interconnect.network.interconnect.dropped_packets_count | Number of outbound packets dropped due to link congestion. |
gcp.interconnect.network.interconnect.link.operational | Whether the operational status of the circuit is up. |
gcp.interconnect.network.interconnect.link.rx_power | Light level received over physical circuit. |
gcp.interconnect.network.interconnect.link.tx_power | Light level transmitted over physical circuit. |
gcp.interconnect.network.interconnect.operational | Whether the operational status of the interconnect is up. |
gcp.interconnect.network.interconnect.receive_errors_count | Number of errors encountered while receiving packets. |
gcp.interconnect.network.interconnect.received_bytes_count | Number of inbound bytes received. |
gcp.interconnect.network.interconnect.received_unicast_packets_count | Number of inbound unicast packets received. |
gcp.interconnect.network.interconnect.send_errors_count | Number of errors encountered while sending packets. Shown as error |
gcp.interconnect.network.interconnect.sent_bytes_count | Number of outbound bytes sent. |
gcp.interconnect.network.interconnect.sent_unicast_packets_count | Number of outbound unicast packets sent. |
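An alert rule combining the link.operational and link.rx_power metrics above might look like the following sketch; the optical power bounds are assumed example thresholds, not Google-published limits:

```python
# Sketch of an alert rule over the interconnect link metrics; the
# rx_power bounds are assumed example thresholds.
def link_alerts(operational, rx_power_dbm, min_dbm=-14.0, max_dbm=2.0):
    """Return alert strings for one physical circuit."""
    alerts = []
    if not operational:
        alerts.append("link down")                   # circuit status is down
    elif not (min_dbm <= rx_power_dbm <= max_dbm):
        alerts.append(f"rx power out of range: {rx_power_dbm} dBm")
    return alerts
```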
1.12.3.4 Backup and restore
Data backup and restore
Backup of Iac
Service restore
Recovery will be from Infra as Code and actions from Operation Team.
1.12.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA (SLA 99.9% or 99.99%) by design, depending on the chosen options.
Recovery after a region loss is based on the Customer's WAN architecture requirements.
1.12.4 Charging model
Work Unit |
Per Cloud Interconnect |
1.12.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Disable my interconnect connection | 1 token |
Restrict interconnect usage | 2 tokens |
Create interconnect with customer configuration | > 9 tokens |
Other changes | Estimation in tokens based on time spent |
1.13 Big Query
1.13.1 Description
Google BigQuery is a Data Warehouse designed to allow companies to perform SQL queries very quickly thanks to the processing power of the Google Cloud infrastructure. Thus, it is part of the Infrastructure as a Service (IaaS) family. Designed for Big Data, this platform can analyze billions of rows of data.
Google BigQuery is the Big Data analysis platform offered by Google via the Cloud.
1.13.2 Build to run service included in the OTC
1.13.2.1 Build service pre-requisite
- Refer to generic description.
- Interaction loop necessary with the customer at each Build
1.13.2.2 Build to run service
- Refer to generic description.
1.13.3 RUN services included in the MRC
1.13.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the BigQuery service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.13.3.2 Co-manage option
No by default; the IaC is fully managed by OBS, which is responsible for the CI/CD up to the dataset level (the Customer can be granted access to table modifications on a case-by-case basis; requests for table changes are handled through tokens).
1.13.3.3 KPI & alerts
Monitoring
Yes, Metrics, Logs
Alerts observed:
Alerts on KPIs, customer by customer:
- Slot usage
- Job Concurrency
- Job performance
- Failed jobs
- Bytes processed by default in BigQuery
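The failed-jobs KPI above is typically expressed as a ratio alert; a sketch assuming an example 5% threshold set per SOW:

```python
# Failed-jobs ratio alert in the style of the error-ratio KPIs; the 5%
# threshold is an assumed example value set per SOW.
def failed_job_alert(failed, total, threshold=0.05):
    """True when failed/total reaches the threshold (no alert on zero jobs)."""
    return total > 0 and failed / total >= threshold
```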
1.13.3.4 Backup and restore
Data backup and restore
Yes, IaC template, backup of regional tables
Service restore
Recovery from snapshot, logs, and ingestion code.
1.13.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA and non-HA configurations are provided by Google Cloud Platform by default for the BigQuery service.
BigQuery does not automatically provide a backup or replica of your data in another geographic region. You can create cross-region dataset copies to enhance your disaster recovery strategy.
1.13.4 Charging model
Work Unit |
Per Table |
1.13.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/modify/delete table / add/modify/update/delete user with policies / copy table | 1 token |
Load data from a bucket | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.14 Pub/Sub
1.14.1 Description
Create scalable messaging and ingestion for event-driven systems and streaming analytics. Ingest events for streaming into BigQuery, data lakes or operational databases.
Pub/Sub offers a broader range of features, per-message parallelism, global routing, and automatically scaling resource capacity.
Pub/Sub allows services to communicate asynchronously, with latencies on the order of 100 milliseconds. Pub/Sub is used for streaming analytics and data integration pipelines to ingest and distribute data. It is equally effective as a messaging-oriented middleware for service integration or as a queue to parallelize tasks.
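The asynchronous decoupling described above can be illustrated with a minimal in-memory sketch — this shows the messaging pattern only, not the google-cloud-pubsub API:

```python
# Minimal in-memory sketch of publish/subscribe decoupling: publishers
# return immediately; delivery to subscribers happens independently.
from collections import defaultdict, deque

class MiniPubSub:
    def __init__(self):
        self.topics = defaultdict(deque)       # topic -> buffered messages
        self.subscribers = defaultdict(list)   # topic -> callbacks

    def publish(self, topic, message):
        """Producer does not wait for consumers."""
        self.topics[topic].append(message)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def deliver(self, topic):
        """Drain the buffer, fanning each message out to all subscribers."""
        while self.topics[topic]:
            msg = self.topics[topic].popleft()
            for cb in self.subscribers[topic]:
                cb(msg)

bus = MiniPubSub()
seen = []
bus.subscribe("events", seen.append)
bus.publish("events", "order-created")   # publisher returns immediately
bus.deliver("events")                    # consumer processes asynchronously
```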
1.14.2 Build to run service included in the OTC
1.14.2.1 Build service pre-requisite
- Refer to generic description.
1.14.2.2 Build to run service
- Refer to generic description.
1.14.3 RUN services included in the MRC
1.14.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Pub/Sub service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.14.3.2 Co-manage option
No by default; the IaC is fully managed by Orange Business Services.
1.14.3.3 KPI & alerts
Monitoring
Yes, Metrics
Monitoring/alarms on:
- Publisher status,
- Throughput,
- Publish request size,
- Topic,
- Access rights
Alerts observed:
Alerts on KPIs, customer by customer:
- pubsub_snapshot
- pubsub_subscription
- pubsub_topic
1.14.3.4 Backup and restore
Data backup and restore
Yes, from IaC and snapshot.
Service restore
Recovery will be from snapshot.
1.14.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA is provided by Google Cloud Platform by default for the Pub/Sub service. Pub/Sub is global/multi-regional, with SLAs guaranteed by Google. For the highest degree of redundancy, OBS can create Pub/Sub publisher clients in different GCP regions. Pub/Sub keeps any given message within a single region, replicated across zones.
1.14.4 Charging model
Work Unit |
Per instance |
1.14.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/modify/delete instance | 1 token |
Create snapshot msg | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.15 Pub/Sub Lite
1.15.1 Description
Pub/Sub and Pub/Sub Lite are both horizontally scalable and managed messaging services. Pub/Sub is usually the default solution for most application integration and analytics use cases. Pub/Sub Lite is only recommended for applications where achieving extremely low cost justifies some additional operational work.
Pub/Sub Lite is a cost-effective solution that trades off operational workload, availability, and features for cost efficiency. Pub/Sub Lite requires you to manually reserve and manage resource capacity. Within Pub/Sub Lite, you can choose either zonal or regional Lite topics. Regional Lite topics offer the same availability SLA as Pub/Sub topics. However, there are reliability differences between the two services in terms of message replication.
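The manual capacity planning Pub/Sub Lite requires can be sketched as a back-of-envelope partition count; the per-partition throughput limit below is an assumed example value — check Google's current Lite quotas before sizing a real reservation:

```python
# Back-of-envelope partition sizing for a Lite topic; the per-partition
# publish limit is an assumed example value, not a Google-published quota.
import math

def partitions_needed(publish_mib_s, per_partition_mib_s=4):
    """Partitions required to absorb the expected publish throughput."""
    return max(1, math.ceil(publish_mib_s / per_partition_mib_s))
```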
1.15.2 Build to run service included in the OTC
1.15.2.1 Build service pre-requisite
- Refer to generic description.
1.15.2.2 Build to run service
- Refer to generic description.
1.15.3 RUN services included in the MRC
1.15.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Pub/Sub Lite service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.15.3.2 Co-manage option
No by default; the IaC is fully managed by Orange Business Services.
1.15.3.3 KPI & alerts
Monitoring
Yes, Metrics
Monitoring/alarms on:
- Publisher status,
- Throughput,
- Publish request size,
- Reservation
Alerts observed:
Alerts on KPIs, customer by customer:
- pubsublite_reservation
- pubsublite_subscription_partition
- pubsublite_topic_partition
1.15.3.4 Backup and restore
Data backup and restore
Yes, from IaC and snapshot.
Service restore
Recovery will be from snapshot.
1.15.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA is provided by Google Cloud Platform by default for the Pub/Sub Lite service, with lower resiliency and reliability than the Pub/Sub service.
1.15.4 Charging model
Work Unit |
Per instance |
1.15.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/modify/delete instance / manage reservation throughput capacity | 1 token |
Create snapshot msg | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.16 Dataproc
1.16.1 Description
Dataproc is a managed Spark and Hadoop service that lets you take advantage of open source data tools for batch processing, querying, streaming, and machine learning. Dataproc automation helps you create clusters quickly, manage them easily, and save money by turning clusters off when you don’t need them. With less time and money spent on administration, you can focus on your jobs and your data.
1.16.2 Build to run service included in the OTC
1.16.2.1 Build service pre-requisite
- Refer to generic description.
1.16.2.2 Build to run service
- Refer to generic description.
1.16.3 RUN services included in the MRC
1.16.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Dataproc service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.16.3.2 Co-manage option
No by default; the IaC is fully managed by Orange Business Services.
1.16.3.3 KPI & alerts
Monitoring
Yes, Metrics
gcp.dataproc.cluster.hdfs.datanodes | Indicates the number of HDFS DataNodes that are running inside a cluster. |
gcp.dataproc.cluster.hdfs.storage_capacity | Indicates capacity of HDFS system running on cluster in GB. |
gcp.dataproc.cluster.hdfs.storage_utilization | The percentage of HDFS storage currently used. |
gcp.dataproc.cluster.hdfs.unhealthy_blocks | Indicates the number of unhealthy blocks inside the cluster. |
gcp.dataproc.cluster.job.completion_time.avg | The time jobs took to complete from the time the user submits a job to the time Dataproc reports it is completed. |
gcp.dataproc.cluster.job.completion_time.samplecount | Sample count for cluster job completion time |
gcp.dataproc.cluster.job.completion_time.sumsqdev | Sum of squared deviation for cluster job completion time |
gcp.dataproc.cluster.job.duration.avg | The time jobs have spent in a given state. |
gcp.dataproc.cluster.job.duration.samplecount | Sample count for cluster job duration |
gcp.dataproc.cluster.job.duration.sumsqdev | Sum of squared deviation for cluster job duration |
gcp.dataproc.cluster.job.failed_count | Indicates the number of jobs that have failed on a cluster. |
gcp.dataproc.cluster.job.running_count | Indicates the number of jobs that are running on a cluster. |
gcp.dataproc.cluster.job.submitted_count | Indicates the number of jobs that have been submitted to a cluster. |
gcp.dataproc.cluster.operation.completion_time.avg | The time operations took to complete from the time the user submits an operation to the time Dataproc reports it is completed. |
gcp.dataproc.cluster.operation.completion_time.samplecount | Sample count for cluster operation completion time |
gcp.dataproc.cluster.operation.completion_time.sumsqdev | Sum of squared deviation for cluster operation completion time |
gcp.dataproc.cluster.operation.duration.avg | The time operations have spent in a given state. |
gcp.dataproc.cluster.operation.duration.samplecount | Sample count for cluster operation duration |
gcp.dataproc.cluster.operation.duration.sumsqdev | Sum of squared deviation for cluster operation duration |
gcp.dataproc.cluster.operation.failed_count | Indicates the number of operations that have failed on a cluster. |
gcp.dataproc.cluster.operation.running_count | Indicates the number of operations that are running on a cluster. |
gcp.dataproc.cluster.operation.submitted_count | Indicates the number of operations that have been submitted to a cluster. |
gcp.dataproc.cluster.yarn.allocated_memory_percentage | The percentage of YARN memory allocated. |
gcp.dataproc.cluster.yarn.apps | Indicates the number of active YARN applications. |
gcp.dataproc.cluster.yarn.containers | Indicates the number of YARN containers. |
gcp.dataproc.cluster.yarn.memory_size | Indicates the YARN memory size in GB. |
gcp.dataproc.cluster.yarn.nodemanagers | Indicates the number of YARN NodeManagers running inside cluster. |
gcp.dataproc.cluster.yarn.pending_memory_size | The current memory request, in GB, that is pending to be fulfilled by the scheduler. |
gcp.dataproc.cluster.yarn.virtual_cores | Indicates the number of virtual cores in YARN. |
1.16.3.4 Backup and restore
Data backup and restore
Yes, from IaC.
Service restore
Recovery will be from Infra as Code
1.16.3.5 GCP SLA High Availability and Disaster Recovery inter-region
Standard, single-node, and HA cluster modes are provided by Google Cloud Platform for the Dataproc service.
1.16.4 Charging model
Work Unit |
Per Cluster |
1.16.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/delete cluster | 1 token |
Bench/config cluster | 4 tokens |
Other changes | Estimation in tokens based on time spent |
1.17 Dataflow
1.17.1 Description
Google Cloud Dataflow is a fully managed service for executing Apache Beam pipelines within the Google Cloud Platform ecosystem.
1.17.2 Build to run service included in the OTC
1.17.2.1 Build service pre-requisite
- Refer to generic description.
1.17.2.2 Build to run service
- Refer to generic description.
1.17.3 RUN services included in the MRC
1.17.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Dataflow service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.17.3.2 Co-manage option
No by default; the IaC is fully managed by Orange Business Services.
1.17.3.3 KPI & alerts
Monitoring
Yes, Metrics, Logs
Overview metrics:
- Autoscaling
- Throughput
- CPU utilization
- Worker error log count
Streaming metrics (streaming pipelines only):
- Data freshness (with and without Streaming Engine)
- System latency (with and without Streaming Engine)
- Backlog bytes (with and without Streaming Engine)
- Parallelism (Streaming Engine only)
- Duplicates (Streaming Engine only)
Input metrics:
- Pub/Sub read, BigQuery read, etc.
Output metrics:
- Pub/Sub write, BigQuery write, etc.
1.17.3.4 Backup and restore
Data backup and restore
Yes, from IaC + backup of the pipeline by the Customer
Service restore
Recovery from IaC or through Operations Team actions (restoration); re-ingestion by the Customer or by OBS following a procedure.
1.17.3.5 GCP SLA High Availability and Disaster Recovery inter-region
Not HA by design for Dataflow service.
Dataflow does not automatically provide a backup or replica of your data in another geographic region; actions from the Operations teams are required.
If there are no grouping/time-windowing operations, a failover to another Dataflow job in another zone or region by reusing the subscription leads to no data loss in pipeline output data.
- If the region fails over, the job fails: deploy two or more Dataflow jobs for streaming purposes.
- Streaming from Pub/Sub (no grouping/time-windowing): messages are acked only when persisted in the destination.
- Streaming from Pub/Sub (windowing, not relying on data from before the outage): use the Pub/Sub Seek functionality.
- Streaming from Pub/Sub (grouping, relying on data from after the outage): use the Dataflow Snapshot functionality (in preview).
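The rule that messages are acked only once persisted is what makes a failover lossless; a toy simulation under a simplified ack model (not the Dataflow runner itself):

```python
# Toy simulation of ack-after-persist: an unacked message survives a crash
# and is redelivered, so the pipeline output loses no data.
class Subscription:
    def __init__(self, messages):
        self.pending = list(messages)     # unacked messages get redelivered

    def pull(self):
        return list(self.pending)

    def ack(self, message):
        self.pending.remove(message)

def run_pipeline(sub, sink, crash_after=None):
    """Persist to the sink first, ack second; optionally crash mid-run."""
    for i, msg in enumerate(sub.pull()):
        if crash_after is not None and i >= crash_after:
            return                        # simulated zone/worker failure
        if msg not in sink:               # idempotent write (dedup on replay)
            sink.append(msg)
        sub.ack(msg)

sub, sink = Subscription(["m1", "m2", "m3"]), []
run_pipeline(sub, sink, crash_after=1)    # crashes after handling "m1"
run_pipeline(sub, sink)                   # failover job drains the backlog
```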
1.17.4 Charging model
Work Unit |
Per Job |
1.17.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Delete Job | 1 token |
Deploy/Create Job | 1 Business Hour day |
Other changes | Estimation in tokens based on time spent |
1.18 Cloud Composer
1.18.1 Description
Cloud Composer is a managed Apache Airflow service that helps you create, schedule, monitor and manage workflows. Cloud Composer automation helps you create Airflow environments quickly and use Airflow-native tools, such as the powerful Airflow web interface and command line tools, so you can focus on your workflows and not your infrastructure.
1.18.2 Build to run service included in the OTC
1.18.2.1 Build service pre-requisite
- Refer to generic description.
1.18.2.2 Build to run service
- Refer to generic description.
1.18.3 RUN services included in the MRC
1.18.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the Cloud Composer service.
- This file can be executed with a CI/CD and the execution has been tested successfully.
1.18.3.2 Co-manage option
Yes, only if the CI/CD is shared with the customer.
1.18.3.3 KPI & alerts
Monitoring
Yes, Metrics, logs, Health probes
gcp.composer.environment.api.request_count | Number of Composer API requests seen so far. |
gcp.composer.environment.api.request_latencies.avg | Distribution of Composer API call latencies. |
gcp.composer.environment.api.request_latencies.samplecount | Sample count for API request latencies |
gcp.composer.environment.api.request_latencies.sumsqdev | Sum of squared deviation for API request latencies |
gcp.composer.environment.dagbag_size | The current DAG bag size |
gcp.composer.environment.dag_processing.parse_error_count | Number of errors raised during parsing DAG files |
gcp.composer.environment.dag_processing.processes | Number of currently running DAG parsing processes |
gcp.composer.environment.dag_processing.total_parse_time | Number of seconds taken to scan and import all DAG files once |
gcp.composer.environment.database_health | Healthiness of Composer Airflow database |
gcp.composer.environment.database.cpu.reserved_cores | Number of cores reserved for the database instance |
gcp.composer.environment.database.cpu.usage_time | CPU usage time of the database instance, in seconds |
gcp.composer.environment.database.cpu.utilization | CPU utilization ratio (from 0.0 to 1.0) of the database instance |
gcp.composer.environment.database.disk.bytes_used | Used disk space on the database instance, in bytes |
gcp.composer.environment.database.disk.quota | Maximum data disk size of the database instance, in bytes |
gcp.composer.environment.database.disk.utilization | Disk quota usage ratio (from 0.0 to 1.0) of the database instance |
gcp.composer.environment.database.memory.bytes_used | Memory usage of the database instance in bytes |
gcp.composer.environment.database.memory.quota | Maximum RAM size of the database instance, in bytes |
gcp.composer.environment.database.memory.utilization | Memory utilization ratio (from 0.0 to 1.0) of the database instance |
gcp.composer.environment.executor.open_slots | Number of open slots on executor |
gcp.composer.environment.executor.running_tasks | Number of running tasks on executor |
gcp.composer.environment.finished_task_instance_count | Overall number of finished task instances |
gcp.composer.environment.healthy | Healthiness of Composer environment. |
gcp.composer.environment.num_celery_workers | Number of Celery workers. |
gcp.composer.environment.num_workflows | Number of workflows. |
gcp.composer.environment.scheduler_heartbeat_count | Scheduler heartbeats |
gcp.composer.environment.task_queue_length | Number of tasks in queue. |
gcp.composer.environment.web_server.cpu.reserved_cores | Number of cores reserved for the web server instance |
gcp.composer.environment.web_server.cpu.usage_time | CPU usage time of the web server instance, in seconds |
gcp.composer.environment.web_server.memory.bytes_used | Memory usage of the web server instance in bytes |
gcp.composer.environment.web_server.memory.quota | Maximum RAM size of the web server instance, in bytes |
gcp.composer.environment.worker.pod_eviction_count | Number of Airflow worker pods evictions |
gcp.composer.workflow.run_count | Number of workflow runs completed so far. |
gcp.composer.workflow.run_duration | Duration of workflow run completion. |
gcp.composer.workflow.task.run_count | Number of workflow tasks completed so far. |
gcp.composer.workflow.task.run_duration | Duration of task completion. |
1.18.3.4 Backup and restore
Data backup and restore
From IaC + GitLab for the application part
Service restore
Recovery from logs and actions from the Operations Team.
1.18.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA and non HA are provided by Google Cloud Platform depending on the design and service parameter configuration.
Recovery after a region loss is based on the design SOW; actions from the Operations teams are required.
1.18.4 Charging model
Work Unit |
Per instance GKE |
1.18.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/modify/delete instance GKE | 1 token |
Add node | 2 tokens |
Other changes | Estimation in tokens based on time spent |
1.19 Cloud Bigtable
1.19.1 Description
Bigtable is a NoSQL database service, a concept that, by moving away from traditional relational databases, adapts to the needs of the modern web. These databases can run on several machines simultaneously, which allows them to scale out and manage huge volumes of data. It is a system with horizontal scalability.
Bigtable is exposed to applications through multiple client libraries, including a supported extension to the Apache HBase library for Java. Bigtable therefore integrates with the existing Apache ecosystem of open-source big data software.
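The horizontal scalability described above rests on rows being kept in sorted order and split into contiguous key ranges served by different nodes. A simplified sketch of that routing idea (not the Bigtable client API):

```python
# Simplified view of tablet routing: sorted row keys are partitioned into
# contiguous ranges, each served by one node (not the Bigtable client API).
import bisect

def node_for_row(row_key, split_points):
    """Index of the node owning the key range that contains row_key."""
    return bisect.bisect_right(split_points, row_key)

splits = ["g", "p"]   # three nodes serving key ranges [.."g"), ["g".."p"), ["p"..)
```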
1.19.2 Build to run service included in the OTC
1.19.2.1 Build service pre-requisite
- Refer to generic description.
1.19.2.2 Build to run service
- Refer to generic description.
1.19.3 RUN services included in the MRC
1.19.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the service; this file can be executed with a CI/CD and the execution has been tested successfully.
1.19.3.2 Co-manage option
No by default: the IaC is fully managed by OBS, which remains master of the CI/CD up to the table (the customer can be granted access to modify the column families on a case-by-case basis, via a request for change ticket).
1.19.3.3 KPI & alerts
Monitoring
Yes, Insights, Metrics, logs, Key Visualizer
Orange Business Services monitors Cloud Bigtable using the graphs available in the Google Cloud Console, or programmatically using the Cloud Monitoring API.
Orange Business Services uses native tools for logs. Google Bigtable logs are collected with Google Cloud Logging and sent to Cloud Pub/Sub via an HTTP push forwarder.
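The programmatic route mentioned above can be sketched with the Cloud Monitoring API: the snippet below reads recent samples of the Bigtable cluster CPU load metric. The project ID and the 80% alerting threshold are illustrative assumptions.

```python
# Sketch: querying the gcp.bigtable cluster CPU load KPI via the Cloud
# Monitoring API instead of the Console graphs. Project ID and the 0.80
# threshold are placeholder assumptions.
import time


def breaches(points: list[float], threshold: float = 0.80) -> list[float]:
    """Return the samples that exceed the alerting threshold."""
    return [p for p in points if p > threshold]


def fetch_cpu_load(project_id: str, minutes: int = 10) -> list[float]:
    # Requires: pip install google-cloud-monitoring
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    now = time.time()
    interval = monitoring_v3.TimeInterval(
        {"end_time": {"seconds": int(now)},
         "start_time": {"seconds": int(now - minutes * 60)}}
    )
    series = client.list_time_series(
        name=f"projects/{project_id}",
        filter='metric.type = "bigtable.googleapis.com/cluster/cpu_load"',
        interval=interval,
        view=monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    )
    return [p.value.double_value for ts in series for p in ts.points]


if __name__ == "__main__":
    samples = fetch_cpu_load("my-project")
    print(f"{len(breaches(samples))} samples above threshold")
```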
KPI monitored
- Average CPU usage
- Storage usage
- Memory usage
- Read/write operations
- Read latency
Alerts observed
- CPU and memory utilization
- Disk utilization
gcp.bigtable.backup.bytes_used | Backup storage used. |
gcp.bigtable.autoscaling.max_node_count | Maximum number of nodes in an autoscaled cluster. |
gcp.bigtable.autoscaling.min_node_count | Minimum number of nodes in an autoscaled cluster. |
gcp.bigtable.autoscaling.recommended_node_count_for_cpu | Recommended number of nodes in an autoscaled cluster based on CPU usage. |
gcp.bigtable.autoscaling.recommended_node_count_for_storage | Recommended number of nodes in an autoscaled cluster based on storage usage. |
gcp.bigtable.cluster.cpu_load | CPU load of a cluster. |
gcp.bigtable.cluster.cpu_load_by_app_profile_by_method_by_table | CPU load of a cluster split by app profile, method, and table. |
gcp.bigtable.cluster.cpu_load_hottest_node | CPU load of the busiest node in a cluster. |
gcp.bigtable.cluster.disk_load | Utilization of HDD disks in a cluster. |
gcp.bigtable.cluster.node_count | Number of nodes in a cluster. |
gcp.bigtable.cluster.storage_utilization | Storage used as a fraction of total storage capacity. |
gcp.bigtable.disk.bytes_used | Amount of compressed data for tables stored in a cluster. |
gcp.bigtable.disk.storage_capacity | Capacity of compressed data for tables that can be stored in a cluster. |
gcp.bigtable.replication.latencies.avg | Distribution of replication request latencies for a table. |
gcp.bigtable.replication.latencies.samplecount | Sample count for replication request latencies. |
gcp.bigtable.replication.latencies.sumsqdev | Sum of squared deviation for replication request latencies. |
gcp.bigtable.replication.max_delay | Upper bound for replication delay between clusters of a table. |
gcp.bigtable.server.error_count | Number of server requests for a table that failed with an error. |
gcp.bigtable.server.latencies.avg | Distribution of server request latencies for a table. |
gcp.bigtable.server.latencies.samplecount | Sample count for server request latencies. |
gcp.bigtable.server.latencies.sumsqdev | Sum of squared deviation for server request latencies. |
gcp.bigtable.server.modified_rows_count | Number of rows modified by server requests for a table. |
gcp.bigtable.server.multi_cluster_failovers_count | Number of failovers during multi-cluster requests. |
gcp.bigtable.server.received_bytes_count | Number of uncompressed bytes of request data received by servers for a table. |
gcp.bigtable.server.request_count | Number of server requests for a table. |
gcp.bigtable.server.returned_rows_count | Number of rows returned by server requests for a table. |
gcp.bigtable.server.sent_bytes_count | Number of uncompressed bytes of response data sent by servers for a table. |
gcp.bigtable.table.bytes_used | Amount of compressed data stored in a table. |
1.19.3.4 Backup and restore
Data backup and restore
The backup is based on IaC plus a snapshot of the table, in the same zone and the same cluster.
Service restore
Recovery will be performed from another table.
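The "recover from another table" approach can be sketched with the Bigtable admin client: create a managed backup of the table, then restore it into a new table in the same instance. All IDs and the 7-day retention are illustrative assumptions, not contractual values.

```python
# Sketch: Bigtable table backup and restore into a new table.
# IDs and the retention period are placeholder assumptions.
from datetime import datetime, timedelta, timezone


def backup_expiry(days: float = 7) -> datetime:
    """Expiry timestamp for a backup kept `days` days (Bigtable allows 6h-90d)."""
    if not 0.25 <= days <= 90:
        raise ValueError("Bigtable backup retention must be between 6h and 90d")
    return datetime.now(timezone.utc) + timedelta(days=days)


def backup_and_restore(project, instance_id, cluster_id, table_id):
    # Requires: pip install google-cloud-bigtable
    from google.cloud import bigtable

    client = bigtable.Client(project=project, admin=True)
    table = client.instance(instance_id).table(table_id)
    backup = table.backup(f"{table_id}-bkp", cluster_id=cluster_id,
                          expire_time=backup_expiry())
    backup.create().result()                          # wait for the backup
    backup.restore(f"{table_id}-restored").result()   # new table, same instance


if __name__ == "__main__":
    backup_and_restore("my-project", "my-instance", "my-cluster", "my-table")
```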
1.19.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA by design.
Replication of tables to other regions is necessary for recovery after a region loss.
1.19.4 Charging model
Work Unit |
Per Instance |
1.19.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/modify/delete table; add/modify/update/delete user with policies; copy table | 1 token |
Strategy for making optimal insertion keys | 3 tokens |
Reclustering table | More than 1 day |
Other changes | Estimation in tokens based on time spent |
1.20 Cloud Datastore
1.20.1 Description
Datastore is a NoSQL database that offers great scalability for your applications. It automatically manages data sharding and replication so that you have a durable, highly available database that scales dynamically with the load of your applications. Datastore offers features such as ACID transactions, SQL-like queries, and indexes.
- Applications can use Datastore to execute SQL-like queries that support filtering and sorting.
- Datastore replicates data across multiple data centers, providing a high level of read/write availability.
- Datastore also provides automatic scalability, strong consistency for reads and ancestor queries, eventual consistency for all other queries, and atomic transactions. The service has no scheduled downtime.
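The filtering, sorting, and ancestor-query features above can be sketched with the Python client. The project ID, kind, and property names are illustrative assumptions; the entity-size helper reflects Datastore's roughly 1 MiB entity cap.

```python
# Sketch: a filtered, sorted ancestor query against Datastore.
# Project ID, kinds, and property names are placeholder assumptions.
def fits_entity_limit(payload: bytes, limit: int = 1_048_576) -> bool:
    """Datastore entities are capped at roughly 1 MiB; check before writing."""
    return len(payload) <= limit


def list_open_tasks(project_id: str) -> None:
    # Requires: pip install google-cloud-datastore
    from google.cloud import datastore

    client = datastore.Client(project=project_id)

    # Ancestor query: strongly consistent, per the description above.
    parent = client.key("TaskList", "default")
    query = client.query(kind="Task", ancestor=parent)
    query.add_filter("done", "=", False)
    query.order = ["-priority"]
    for task in query.fetch(limit=10):
        print(task.key.name, task.get("priority"))


if __name__ == "__main__":
    list_open_tasks("my-project")
```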
1.20.2 Build to run service included in the OTC
1.20.2.1 Build service pre-requisite
- Refer to generic description.
1.20.2.2 Build to run service
- Refer to generic description.
1.20.3 RUN services included in the MRC
1.20.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the service; this file can be executed with a CI/CD and the execution has been tested successfully.
1.20.3.2 Co-manage option
No by default: the IaC is fully managed by OBS, which remains master of the CI/CD up to the table (the customer can be granted access to modify the column families on a case-by-case basis, via a request for change ticket).
1.20.3.3 KPI & alerts
Monitoring
Yes, Insights, Metrics, logs
Orange Business Services collects metrics from Google Datastore to:
- Visualize the performance of your datastores
- Correlate the performance of your datastores with your applications
Orange Business Services uses native tools for logs. Cloud Datastore logs are collected with Google Cloud Logging and sent to Cloud Pub/Sub via an HTTP push forwarder.
gcp.datastore.api.request_count | Datastore API calls. |
gcp.datastore.index.write_count | Datastore index writes. |
gcp.datastore.entity.read_sizes.avg | Average of sizes of read entities. |
gcp.datastore.entity.read_sizes.samplecount | Sample Count for sizes of read entities. |
gcp.datastore.entity.read_sizes.sumsqdev | Sum of Squared Deviation for sizes of read entities. |
gcp.datastore.entity.write_sizes.avg | Average of sizes of written entities. |
gcp.datastore.entity.write_sizes.samplecount | Sample Count for sizes of written entities. |
gcp.datastore.entity.write_sizes.sumsqdev | Sum of Squared Deviation for sizes of written entities. |
1.20.3.4 Backup and restore
Data backup and restore
The backup is based on IaC plus a snapshot of the table, in the same zone and the same cluster.
Service restore
Recovery will be performed from another table.
1.20.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA by design.
Replication of tables to other regions is necessary for recovery after a region loss.
1.20.4 Charging model
Work Unit |
Per Instance |
1.20.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/modify/delete table; add/modify/update/delete user with policies; copy table | 1 token |
Strategy for making optimal insertion keys | 3 tokens |
Reclustering table | More than 1 day |
Other changes | Estimation in tokens based on time spent |
1.21 Memorystore
1.21.1 Description
The Cloud Memorystore service is an in-memory data storage service, fully managed by Google and compatible with Redis. Redis is a cache management system compatible with the main CMSs such as WordPress, Drupal, Magento and Prestashop; enabling a Redis service for these applications dramatically speeds up your users' browsing experience. With the Cloud Memorystore service you can easily achieve sub-millisecond latencies, and the service is sized to support loads consistent with the largest cache requirements.
The Cloud Memorystore service is completely isolated inside your VPC network, and only your virtual server instances have access to it. By using Cloud Memorystore you relieve your virtual server instances of redundant and unnecessary computations.
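The offloading of redundant computation described above is typically done with a cache-aside pattern against the Memorystore endpoint. The host IP and TTL below are illustrative assumptions; a Memorystore instance is only reachable from inside the VPC.

```python
# Sketch: cache-aside against a Memorystore for Redis endpoint.
# The 10.0.0.3 host and 300s TTL are placeholder assumptions.
import json
from typing import Callable


def cache_key(namespace: str, ident: str) -> str:
    """Namespaced key, e.g. 'product:42', to avoid collisions."""
    return f"{namespace}:{ident}"


def get_or_compute(r, namespace: str, ident: str,
                   compute: Callable[[], dict], ttl: int = 300) -> dict:
    """Return the cached value, or compute it, cache it with a TTL, and return it."""
    key = cache_key(namespace, ident)
    hit = r.get(key)
    if hit is not None:
        return json.loads(hit)
    value = compute()
    r.set(key, json.dumps(value), ex=ttl)
    return value


if __name__ == "__main__":
    import redis  # pip install redis
    r = redis.Redis(host="10.0.0.3", port=6379)  # Memorystore private IP
    print(get_or_compute(r, "product", "42", lambda: {"name": "demo"}))
```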
1.21.2 Build to run service included in the OTC
1.21.2.1 Build service pre-requisite
- Refer to generic description.
1.21.2.2 Build to run service
- Refer to generic description.
1.21.3 RUN services included in the MRC
1.21.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the service; this file can be executed with a CI/CD and the execution has been tested successfully.
1.21.3.2 Co-manage option
No by default: the IaC is fully managed by OBS, which remains master of the CI/CD up to the table (the customer can be granted access to modify the column families on a case-by-case basis, via a request for change ticket).
1.21.3.3 KPI & alerts
Monitoring
Yes, Metrics, Hit Logs
Orange Business Services collects metrics from Cloud Memorystore to:
- Visualize the performance of your datastores
- Correlate the performance of your datastores with your applications
Orange Business Services uses native tools for logs. Cloud Memorystore logs are collected with Google Cloud Logging and sent to Cloud Pub/Sub via an HTTP push forwarder.
gcp.redis.clients.blocked | Number of blocked clients |
gcp.redis.clients.connected | Number of client connections |
gcp.redis.commands.calls | Total number of calls for this command |
gcp.redis.commands.total_time | The amount of time in microseconds that this command took in the last second |
gcp.redis.commands.usec_per_call | Average time per call over 1 minute by command |
gcp.redis.keyspace.avg_ttl | Average TTL for keys in this database |
gcp.redis.keyspace.keys_with_expiration | Number of keys with an expiration in this database |
gcp.redis.keyspace.keys | Number of keys stored in this database |
gcp.redis.persistence.rdb.bgsave_in_progress | Flag indicating an RDB save is ongoing |
gcp.redis.replication.master.slaves.lag | The number of bytes that replica is behind. |
gcp.redis.replication.master.slaves.offset | The number of bytes that have been acknowledged by replicas. |
gcp.redis.replication.master_repl_offset | The number of bytes that master has produced and sent to replicas. To be compared with replication byte offset of replica. |
gcp.redis.replication.offset_diff | The number of bytes that have not been replicated to the replica. This is the difference between replication byte offset (master) and replication byte offset (replica). |
gcp.redis.replication.role | Returns a value indicating the node role. 1 indicates master and 0 indicates replica. |
gcp.redis.server.uptime | Uptime in seconds |
gcp.redis.stats.cache_hit_ratio | Cache Hit ratio as a fraction |
gcp.redis.stats.connections.total | Total number of connections accepted by the server |
gcp.redis.stats.cpu_utilization | CPU, in seconds of utilization, consumed by the Redis server broken down by System/User and Parent/Child relationship |
gcp.redis.stats.evicted_keys | Number of evicted keys due to max memory limit |
gcp.redis.stats.expired_keys | Total number of key expiration events |
gcp.redis.stats.keyspace_hits | Number of successful lookups of keys in the main dictionary |
gcp.redis.stats.keyspace_misses | Number of failed lookups of keys in the main dictionary |
gcp.redis.stats.memory.maxmemory | Maximum amount of memory Redis can consume |
gcp.redis.stats.memory.system_memory_usage_ratio | Memory usage as a ratio of maximum system memory |
gcp.redis.stats.memory.usage_ratio | Memory usage as a ratio of maximum memory |
gcp.redis.stats.memory.usage | Total number of bytes allocated by Redis |
gcp.redis.stats.network_traffic | Total number of bytes sent to/from redis (includes bytes from commands themselves, payload data, and delimiters) |
gcp.redis.stats.pubsub.channels | Global number of pub/sub channels with client subscriptions |
gcp.redis.stats.pubsub.patterns | Global number of pub/sub pattern with client subscriptions |
gcp.redis.stats.reject_connections_count | Number of connections rejected because of max clients limit |
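The cache-hit-ratio KPI in the table above (gcp.redis.stats.cache_hit_ratio) is simply hits / (hits + misses) over the keyspace counters; the 90% alert floor below is an illustrative assumption, not a contractual threshold.

```python
# Sketch: deriving the cache hit ratio KPI from keyspace hits/misses
# and flagging a breach. The 0.90 floor is a placeholder assumption.
def cache_hit_ratio(hits: int, misses: int) -> float:
    """hits / (hits + misses), the fraction the metric reports."""
    total = hits + misses
    return hits / total if total else 0.0


def should_alert(hits: int, misses: int, floor: float = 0.90) -> bool:
    """Raise an alert when the hit ratio drops below the agreed floor."""
    return cache_hit_ratio(hits, misses) < floor


if __name__ == "__main__":
    print(cache_hit_ratio(950, 50))   # 0.95
    print(should_alert(800, 200))     # True: ratio 0.8 is below 0.90
```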
1.21.3.4 Backup and restore
Data backup and restore
The backup is based on IaC plus a snapshot, in the same zone and the same cluster.
Service restore
Recovery will be performed from a dump of the database.
1.21.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA by design.
Replication to other regions is necessary for recovery after a region loss.
1.21.4 Charging model
Work Unit |
Per Instance |
1.21.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/modify/delete table; add/modify/update/delete user with policies; copy table | 1 token |
Optimisation of index | 4 tokens |
Other changes | Estimation in tokens based on time spent |
1.22 Cloud Firestore
1.22.1 Description
Cloud Firestore is a document-oriented NoSQL database that automatically manages data partitioning and replication to ensure reliability, while scaling automatically with application needs.
Google Cloud Firestore is also a flexible and scalable database for mobile, web and server development from Firebase and Google Cloud Platform.
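A document write and a filtered read can be sketched with the official Python client; the project ID, collection, and field names are illustrative assumptions, not values from this catalogue.

```python
# Sketch: upserting and querying a Firestore document.
# Collection and field names are placeholder assumptions.
def to_doc(user_id: str, name: str, active: bool = True) -> dict:
    """Shape the document payload; Firestore stores it as a map."""
    return {"user_id": user_id, "name": name, "active": active}


def upsert_and_list(project_id: str, user_id: str, name: str) -> None:
    # Requires: pip install google-cloud-firestore
    from google.cloud import firestore

    db = firestore.Client(project=project_id)
    # set() creates the document or overwrites it (an upsert).
    db.collection("users").document(user_id).set(to_doc(user_id, name))

    # Filtered read across the collection.
    for snap in db.collection("users").where("active", "==", True).stream():
        print(snap.id, snap.to_dict())


if __name__ == "__main__":
    upsert_and_list("my-project", "u42", "Ada")
```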
1.22.2 Build to run service included in the OTC
1.22.2.1 Build service pre-requisite
- Refer to generic description.
1.22.2.2 Build to run service
- Refer to generic description.
1.22.3 RUN services included in the MRC
1.22.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the service; this file can be executed with a CI/CD and the execution has been tested successfully.
1.22.3.2 Co-manage option
Yes if CI/CD shared with the customer (IaC Part)
1.22.3.3 KPI & alerts
Monitoring
Yes, Metrics, SlowQuery Log (FireStore)
Orange Business Services uses native tools for logs. Cloud Firestore logs are collected with Google Cloud Logging and sent to Cloud Pub/Sub via an HTTP push forwarder.
Metric
gcp.firestore.document.delete_count | The number of successful document deletes. |
gcp.firestore.document.read_count | The number of successful document reads from queries or lookups. |
gcp.firestore.document.write_count | The number of successful document writes. |
1.22.3.4 Backup and restore
Data backup and restore
The backup is based on regular exports.
Service restore
Recovery will be performed from Infrastructure as Code plus a backup of the data.
1.22.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA and non-HA setups are provided by Google Cloud Platform depending on the design and service parameter configuration.
Recovery after a region loss is based on the design SOW; the service can be built in multiple regions.
1.22.4 Charging model
Work Unit |
Per Instance |
1.22.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/update/delete instance; create/update/delete DB; run FireStore script | 1 token |
Index refactoring | 4 tokens |
Other changes | Estimation in tokens based on time spent |
1.23 Cloud Spanner
1.23.1 Description
Cloud Spanner is a fully managed relational database service that combines relational structure (schemas, SQL queries, ACID transactions) with horizontal, non-relational scalability. It provides synchronous replication and automatic, transparent failover, with up to 99.999% availability in multi-region configurations. Applications running on Compute Engine, App Engine, GKE and other Google Cloud services use Cloud Spanner when they need strong consistency at global scale.
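A strongly consistent SQL read against Spanner can be sketched with the official Python client; the instance, database, table, and column names below are illustrative assumptions.

```python
# Sketch: a parameterized, strongly consistent SQL read on Cloud Spanner.
# Instance, database, and schema names are placeholder assumptions.
def database_path(project: str, instance: str, database: str) -> str:
    """Fully qualified Spanner database resource name."""
    return f"projects/{project}/instances/{instance}/databases/{database}"


def list_active_singers(project_id: str, instance_id: str,
                        database_id: str) -> None:
    # Requires: pip install google-cloud-spanner
    from google.cloud import spanner

    client = spanner.Client(project=project_id)
    database = client.instance(instance_id).database(database_id)

    # snapshot() opens a read-only transaction with strong consistency.
    with database.snapshot() as snapshot:
        rows = snapshot.execute_sql(
            "SELECT SingerId, FirstName FROM Singers WHERE Active = @active",
            params={"active": True},
            param_types={"active": spanner.param_types.BOOL},
        )
        for singer_id, first_name in rows:
            print(singer_id, first_name)


if __name__ == "__main__":
    list_active_singers("my-project", "my-instance", "my-db")
```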
1.23.2 Build to run service included in the OTC
1.23.2.1 Build service pre-requisite
- Refer to generic description.
1.23.2.2 Build to run service
- Refer to generic description.
1.23.3 RUN services included in the MRC
1.23.3.1 Run service pre-requisite
- A referential file exists in the Git including the reference configuration of the service; this file can be executed with a CI/CD and the execution has been tested successfully.
1.23.3.2 Co-manage option
Yes if CI/CD shared with the customer (IaC Part)
1.23.3.3 KPI & alerts
Monitoring
Yes, Metrics
Orange Business Services uses native tools for logs. Cloud Spanner logs are collected with Google Cloud Logging and sent to Cloud Pub/Sub via an HTTP push forwarder.
Metrics
gcp.spanner.api.received_bytes_count | Uncompressed request bytes received by Cloud Spanner. |
gcp.spanner.api.sent_bytes_count | Uncompressed response bytes sent by Cloud Spanner. |
gcp.spanner.api.api_request_count | Cloud Spanner API requests. |
gcp.spanner.api.request_count | Rate of Cloud Spanner API requests. |
gcp.spanner.api.request_latencies.avg | Average server request latencies for a database. |
gcp.spanner.api.request_latencies.samplecount | Sample count of server request latencies for a database. |
gcp.spanner.api.request_latencies.sumsqdev | Sum of Squared Deviation of server request latencies for a database. |
gcp.spanner.api.request_latencies_by_transaction_type | Distribution of server request latencies by transaction types. |
gcp.spanner.instance.cpu.utilization | Utilization of provisioned CPU, between 0 and 1. |
gcp.spanner.instance.cpu.smoothed_utilization | 24-hour smoothed utilization of provisioned CPU between 0.0 and 1.0. |
gcp.spanner.instance.cpu.utilization_by_operation_type | Percent utilization of provisioned CPU, by operation type between 0.0 and 1.0. |
gcp.spanner.instance.cpu.utilization_by_priority | Percent utilization of provisioned CPU, by priority between 0.0 and 1.0. |
gcp.spanner.instance.node_count | Total number of nodes. |
gcp.spanner.instance.session_count | Number of sessions in use. |
gcp.spanner.instance.storage.used_bytes | Storage used in bytes. |
gcp.spanner.instance.storage.limit_bytes | Storage limit for instance in bytes |
gcp.spanner.instance.storage.limit_bytes_per_processing_unit | Storage limit per processing unit in bytes. |
gcp.spanner.instance.storage.utilization | Storage used as a fraction of storage limit. |
gcp.spanner.instance.backup.used_bytes | Backup storage used in bytes. |
gcp.spanner.instance.leader_percentage_by_region | Percentage of leaders by cloud region between 0.0 and 1.0. |
gcp.spanner.instance.processing_units | Total number of processing units. |
gcp.spanner.lock_stat.total.lock_wait_time | Total lock wait time for lock conflicts recorded for the entire database. |
gcp.spanner.query_count | Count of queries by database name, status, query type, and used optimizer version. |
gcp.spanner.query_stat.total.bytes_returned_count | Number of data bytes that the queries returned |
gcp.spanner.query_stat.total.cpu_time | Number of seconds of CPU time Cloud Spanner spent on operations to execute the queries. |
gcp.spanner.query_stat.total.execution_count | Number of times Cloud Spanner saw queries during the interval. |
gcp.spanner.query_stat.total.failed_execution_count | Number of times queries failed during the interval. |
gcp.spanner.query_stat.total.query_latencies | Distribution of total length of time, in seconds, for query executions within the database. |
gcp.spanner.query_stat.total.returned_rows_count | Number of rows that the queries returned. |
gcp.spanner.query_stat.total.scanned_rows_count | Number of rows that the queries scanned excluding deleted values. |
gcp.spanner.read_stat.total.bytes_returned_count | Total number of data bytes that the reads returned excluding transmission encoding overhead. |
gcp.spanner.read_stat.total.client_wait_time | Number of seconds spent waiting due to throttling. |
gcp.spanner.read_stat.total.cpu_time | Number of seconds of CPU time Cloud Spanner spent executing the reads, excluding prefetch CPU and other overhead. |
gcp.spanner.read_stat.total.execution_count | Number of times Cloud Spanner executed the read shapes during the interval. |
gcp.spanner.read_stat.total.leader_refresh_delay | Number of seconds spent coordinating reads across instances in multi-region configurations. |
gcp.spanner.read_stat.total.locking_delays | Distribution of total time in seconds spent waiting due to locking. |
gcp.spanner.read_stat.total.returned_rows_count | Number of rows that the read returned. |
gcp.spanner.row_deletion_policy.deleted_rows_count | Count of rows deleted by the policy since the last sample. |
gcp.spanner.row_deletion_policy.processed_watermark_age | Time between now and the read timestamp of the last successful execution. |
gcp.spanner.row_deletion_policy.undeletable_rows | Number of rows in all tables in the database that can’t be deleted. |
gcp.spanner.transaction_stat.total.bytes_written_count | Number of bytes written by transactions. |
gcp.spanner.transaction_stat.total.commit_attempt_count | Number of commit attempts for transactions. |
gcp.spanner.transaction_stat.total.commit_retry_count | Number of commit attempts that are retries from previously aborted transaction attempts. |
gcp.spanner.transaction_stat.total.participants | Distribution of total number of participants in each commit attempt. |
gcp.spanner.transaction_stat.total.transaction_latencies | Distribution of total seconds taken from the first operation of the transaction to commit or abort. |
1.23.3.4 Backup and restore
Data backup and restore
Yes, automatic backups are included.
Service restore
Recovery will be performed from Infrastructure as Code plus a backup of the data.
1.23.3.5 GCP SLA High Availability and Disaster Recovery inter-region
HA and multi-regional by design.
Recovery after a region loss: managed, serverless service; everything is handled by Google.
1.23.4 Charging model
Work Unit |
Per Instance |
1.23.5 Changes catalogue – in Tokens, per act
Changes examples | Effort |
Create/update/delete DB | 1 token |
Modification of the DB schema | 4 tokens |
Other changes | Estimation in tokens based on time spent |