Nowadays, due to market competition, customers want to see a product before contracting it. This requirement gave rise to the need to create a temporary environment in which to show our product; we tell you all about it in this article.
Introduction
What is the PoC environment?
The PoC (Proof of Concept) environment arises from the need to give customers, for a limited time, a working sample of our product into which they can feed real data from their infrastructure, so they can see in more detail what Datadope can provide.
What are the basic requirements for the creation of this environment?
Quick to deploy and delete
It must be possible to create and destroy it in a fast and agile way.
Multiclient
It must be possible to create several independent PoCs for different customers simultaneously.
Accessible from the customer’s infrastructure
It must be possible to send data to it, and access it, from the customer's infrastructure.
Adopted Solution
Infrastructure
For the infrastructure we chose Google Cloud Platform (GCP), since we already had experience with it and its costs were acceptable.
Deployment tools
Jenkins: We use Jenkins to launch both the creation of images and the creation or destruction of a PoC environment.
Terraform: To create the various elements in GCP.
Ansible: To launch the playbooks we already have for the different components of the product and leave them configured.
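To illustrate how Terraform and Ansible chain together in these Jobs, here is a minimal sketch (the inventory, playbook and file names are hypothetical, not our actual pipeline code):

# Bring up the infrastructure defined for the PoC
terraform init
terraform apply -auto-approve
# Export the Terraform outputs (e.g. IPs of the created machines) for Ansible
terraform output -json > tf_outputs.json
# Configure the components with the existing playbooks
ansible-playbook -i inventory/poc.ini site.yml --extra-vars "@tf_outputs.json"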
Architecture
High level architecture
In GCP we have the different components of IOMetrics, and in the client's CPD (data center) we deploy a virtual machine (VM), which we call the appliance, that communicates with the different endpoints we expose in GCP.
APPLIANCE
The appliance is a server that must be deployed on the client side; for this we deliver a VMware image. This image contains the following components:
AWX: We use it to run the client-side automations that will populate the CMDB and install the monitoring agents.
HashiCorp Vault: Securely stores the credentials used by AWX.
Redis and Logstash: Collect all agent metrics and send them to GCP.
Elastic APM: For performance data collection.
Synthetix agent: Launches synthetic probes on the customer's intranet.
GCP ARCHITECTURE
In GCP we deploy all of IOMetrics. For each PoC we first create a dedicated network, so that the PoCs of different clients are isolated from each other, and then we deploy all the components, which intercommunicate through this network. These components can be divided into two large blocks. On the one hand, Synthetix, which we deploy using the following GCP elements:
Cloud SQL: Google's managed database service, where we deploy the PostgreSQL instance used by Synthetix.
GKE: Google's managed Kubernetes service; we use it to deploy the Synthetix components.
Cloud Storage bucket: To store Synthetix logs and test results.
Cloud KMS (Key Management Service): A key service where we store the keys used to decrypt the Synthetix Vault.
We use two types of load balancers: external ones to expose services to the Internet, and internal ones to expose services to the VMs we have in GCP.
Firewall rules restrict access to the ports we expose on the external balancers to the client's IPs.
And, on the other hand, the IOMetrics Core, which uses the following GCP elements:
Compute to host the virtual machines containing the various components of IOMetrics
Balancers to expose the services to the Internet
Firewall to restrict access to the ports we expose on the balancers to the client's IPs.
IAP (Identity-Aware Proxy), a service that establishes a central authorization layer for applications accessed via HTTPS; we use it to expose services without restricting access by IP.
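As a hedged sketch of the per-PoC isolation described above (in practice this is created with Terraform; the network, range and port values below are illustrative):

# Dedicated network and subnet for one PoC, isolating it from other clients
gcloud compute networks create poc-acme-net --subnet-mode=custom
gcloud compute networks subnets create poc-acme-subnet \
    --network=poc-acme-net --range=10.10.0.0/24 --region=europe-west1
# Firewall rule allowing only the client's IPs to reach the exposed port
gcloud compute firewall-rules create poc-acme-allow-client \
    --network=poc-acme-net --direction=INGRESS --action=ALLOW \
    --rules=tcp:443 --source-ranges=203.0.113.0/24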
PoC Deployment
To deploy the PoC we use Jenkins, as indicated above. For this we have three Jobs in charge of the deployment in GCP:
1. Build of IOMetrics – Appliance
This Job builds images of the VMs with the IOMetrics software installed and configured, so that deploying a PoC is faster. Its steps (sketched after this list) are:
- Form: Entering variables
- Docker: Starting a container
- Credentials: Injecting the Jenkins credentials needed to access the required resources
- Git: Downloading the code from the indicated repository
- Terraform: Planning and bringing up the indicated infrastructure
- Ansible: Configuring the machines brought up by Terraform
- Image Creator: Creating an image of the configured machines
- Terraform Destroy: Destroying the previously built infrastructure
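A minimal sketch of the final steps, once Ansible has configured the build machine (the disk, image and zone names are hypothetical):

# Snapshot the configured machine as a reusable image
gcloud compute images create iometrics-appliance-v1 \
    --source-disk=appliance-build-disk \
    --source-disk-zone=europe-west1-b
# The temporary build infrastructure is no longer needed
terraform destroy -auto-approve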
Build-IOMetrics Variables
The variables used to configure the Job that builds the images of the machines to be brought up in the PoC are (an example of triggering the Job follows this list):
- Version: IOMetrics version the image is built for.
- Appliance: Whether to build the appliance image.
- Appliance cloud: Which environment the appliance image is for. Currently it can be GCP or OST.
- Terraform plan: Terraform makes an execution plan. Necessary for any action that needs Terraform.
- Terraform create: Terraform brings up the infrastructure described in the plan.
- Ansible: Option to skip configuring the created machines.
- Terraform destroy: Terraform destroys the infrastructure that is up and matches what the Terraform plan showed.
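Since this is a parameterized Jenkins Job, it can also be triggered remotely through Jenkins' standard buildWithParameters endpoint; a sketch (the URL, job and parameter names are assumptions based on the list above):

curl -X POST "https://jenkins.example.com/job/Build-IOMetrics/buildWithParameters" \
    --user "user:api_token" \
    --data-urlencode "VERSION=1.2.3" \
    --data-urlencode "APPLIANCE=true" \
    --data-urlencode "APPLIANCE_CLOUD=GCP"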
2. POC Create
This is the Job that deploys the PoC for a customer. Its steps are:
- Form: Entering variables
- Docker: Starting a container
- Credentials: Injecting the Jenkins credentials needed to access the required resources
- Git: Downloading the code from the indicated repository
- Terraform: Planning and bringing up the indicated infrastructure
- Check endpoints: The Job waits for the exposed endpoints to respond before the execution is considered successful (a minimal sketch follows this list)
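The endpoint check can be as simple as polling the exposed services until they answer; a sketch (the URL and timeout are illustrative):

# Wait up to 30 minutes for the exposed endpoint to answer
for i in $(seq 1 180); do
    if curl -sfk --max-time 5 "https://poc-acme.example.com/health" > /dev/null; then
        echo "Endpoint is up"; exit 0
    fi
    sleep 10
done
echo "Endpoint never came up" >&2; exit 1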
POC Create Variables
To configure a PoC for a client, the following variables must be defined when executing the Job:
- Client: Client name
- Version: version of IOMetrics to be used
- POC days: Estimated number of days the PoC will be up; it is used to calculate the size of the disks to allocate to each machine (a sizing sketch follows this list).
- GKE: If we want to deploy synthetix.
- Backend allow IPs: Client IPs that the firewall will allow to access the exposed services.
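As an illustration of how "POC days" could feed the disk sizing, a hedged sketch (the base size and daily growth figures below are invented for the example, not our real sizing formula):

POC_DAYS=30
BASE_GB=50         # space for the OS and the IOMetrics software
DAILY_GROWTH_GB=2  # assumed metrics/logs growth per day
DISK_GB=$((BASE_GB + POC_DAYS * DAILY_GROWTH_GB))
echo "Allocating ${DISK_GB} GB per machine"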
3. POC Destroy
“POC destroy” is a very simple Job in which we only have to indicate the name of the client; it automatically detects the infrastructure linked to that client and performs a terraform destroy.
Although in the PoC creation Job we indicate the number of days the environment will be up, this does not mean that the destroy Job runs automatically after that time: it is executed completely manually.
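A sketch of what the destroy boils down to, assuming one Terraform workspace (or state) per client; the workspace name is illustrative:

# Select the state belonging to this client and tear everything down
terraform workspace select acme
terraform destroy -auto-approve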
Appliance - GCP Connection
After the deployment of the GCP part, the appliance must be deployed in the customer's CPD. Datadope provides a URL from which a file containing the virtual disk of the appliance, in .vmdk format, can be downloaded. The minimum requirements for this appliance are:
- 4 x vCPU
- 8 GB RAM
- 40 GB disk
The virtual machine the client deploys has a user and password provided by Datadope. This user has root permissions configured and, after the first login, a password change is requested for additional security; from then on these credentials are no longer held by Datadope.
Appliance configuration
After deploying the appliance, the client has to launch the following command to finish configuring AWX:
$ sudo iometrics-provision -c poc_name -u linux_user -kr ssh_rsa.key -ke ssh_ecdsa.key -wu windows_user -wp windows_pass_file -n network -vv
Where:
poc_name (required): PoC name, provided by Datadope.
linux_user (required): User to access Linux servers.
ssh_rsa.key (required): File with the RSA SSH private key, configured for access via key exchange with Linux servers.
ssh_ecdsa.key (required): File with the ECDSA SSH private key, configured for access via key exchange with Linux servers.
network (required): Network to scan and monitor, in IP/mask format (for example: 192.168.0.1/24).
windows_user (optional): User to access Windows servers.
windows_pass_file (optional): Text file containing the Windows user password.
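An example invocation with illustrative values (every name and file below is a placeholder):

$ sudo iometrics-provision -c acme-poc -u svc_monitor -kr /opt/keys/id_rsa -ke /opt/keys/id_ecdsa -wu svc_monitor_win -wp /opt/keys/win_pass.txt -n 192.168.0.0/24 -vv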
Deployment
Once the client has configured the appliance, everything is ready to deploy agents in their network and send data. For this purpose, the AWX Jobs are executed manually or at their scheduled time; these Jobs are in charge of (see the example after this list):
- Discovery of networks and IPs.
- Server and software discovery.
- Installation and configuration of monitoring agents.
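These Jobs can also be launched by hand with the standard awx CLI; a sketch (the Job Template name is hypothetical):

# Launch the discovery Job Template and follow its output
awx job_templates launch "Discovery - Networks and IPs" --monitor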
With the connection established between the client's network and the one set up in GCP, all the data retrieved by the deployed agents reaches our systems (CMDB, Elasticsearch…).
Conclusion
Starting from the tools we usually use and the code we already had to deploy the components, we have managed to deploy, in the cloud and in an agile way, a test environment as close as possible to a production one, which helps us give our future customers a realistic view of the product.
Javi Martín
Ernesto Sánchez