Initial ansible proof of concept #3843

ventifus · 2024-09-16T16:49:17Z

What this PR does / why we need it:

This PR uses Ansible to automate cluster creation in predefined configurations. These configurations are intended to exercise ARO features in a repeatable manner. A formal proposal is under development; see:

https://docs.google.com/document/d/1AIcP1jMQS-9hsI_bS6Qhr3IoqtD97LqKfw6c9PNXD5g

We have been running smoke tests in regions manually, which is error-prone and needs to be automated. As a first step, here is a minimum viable Ansible playbook that creates a cluster, waits for it to become healthy via the Kubernetes API, runs az aro update, then deletes the cluster. Ansible runs from a container so that users' development hosts don't need Ansible installed; this should also make it easier to run the smoke tests in an automated fashion in the future.

This will allow us to evaluate the following questions:

Do we want an Ansible dependency in our codebase?
What smoke tests do we want? (see https://docs.google.com/document/d/1K9Qffl5CuRr_PpeAKI4ik6Rl4je1p-rdf6_xZXJZzVM)

Test plan for issue:

To test, make sure you've got a valid az login session. This is passed through to the container via your ~/.azure directory.
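A quick way to confirm the session before starting a long run is sketched below. This is not part of the PR: the function name check_az_login is hypothetical, and the sketch only assumes that az account show exits non-zero when there is no usable login.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of this PR.
# Assumption: `az account show` fails when no login session is active.
check_az_login() {
  if ! command -v az >/dev/null 2>&1; then
    echo "az CLI not found on PATH" >&2
    return 1
  fi
  if ! az account show >/dev/null 2>&1; then
    echo "no active az login session; run 'az login' first" >&2
    return 1
  fi
  echo "az session ok"
}
```

It could be chained in front of the target, for example check_az_login && make cluster.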

Then run

$ make cluster

This will first build the required aro-ansible container image if it doesn't exist, then it will execute ansible-playbook in the container.

There are several variables implemented to control aspects of the playbook execution. These can be combined with each other as needed.

To personalize the resulting resource groups, set CLUSTERPREFIX as desired. It defaults to your current shell's $USER.

$ make cluster CLUSTERPREFIX=ocpbugs35300

The default region used is eastus. Set the REGION parameter to choose a different region:

$ make cluster REGION=centraluseuap

To choose one or more cluster configurations, set the CLUSTERPATTERN parameter to a wildcard string that matches the cluster scenarios you wish to test:

$ make cluster CLUSTERPATTERN=udr

To clean up at the end of the run, set CLEANUP to True. This will delete the cluster, the resource group, then the Entra Service Principal and Application.

$ make cluster CLEANUP=True

Currently implemented cluster configurations are:

basic: Simplest cluster, nothing fancy
private: Simple cluster with apiserver and ingress visibility set to private
enc: Encryption-at-host enabled
udr: UserDefinedRouting with a blackhole Route Table
byok: Disk encryption using bring-your-own-key
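
To illustrate how a CLUSTERPATTERN value might narrow the list above, here is a minimal shell sketch. It assumes shell-style glob matching against the configuration names; the playbook's actual matching logic is not shown in this description and may differ. The function name match_scenarios is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the configuration names that match a glob pattern.
# Assumption: CLUSTERPATTERN behaves like a shell glob over the names listed
# above; the playbook may implement the match differently.
match_scenarios() {
  pattern=$1
  for name in basic private enc udr byok; do
    case $name in
      # $pattern is left unquoted so it is treated as a glob, not a literal.
      $pattern) echo "$name" ;;
    esac
  done
}

match_scenarios 'udr'   # prints: udr
match_scenarios 'b*'    # prints: basic, then byok
```

Under that assumption, make cluster CLUSTERPATTERN='b*' would select both basic and byok; quoting the value keeps the calling shell from expanding the wildcard first.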

Private cluster configurations such as private and udr cause a jumphost to be created for access to the cluster API. Your local SSH public key is passed to the jumphost, and Ansible then uses the corresponding private key to tunnel SSH through it. If your local SSH configuration differs from the defaults, the Makefile supports two variables to adjust it:

SSH_CONFIG_DIR := $(HOME)/.ssh/
SSH_KEY_BASENAME := id_rsa
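
Before overriding these, it can help to confirm that both halves of the key pair exist where the Makefile will look. The following is a sketch only: the function name check_ssh_keypair is hypothetical and not part of this PR, and it assumes the public key sits next to the private key with a .pub suffix.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of this PR.
# Assumption: the public key is "<basename>.pub" in the same directory.
check_ssh_keypair() {
  dir=${1:-$HOME/.ssh/}
  base=${2:-id_rsa}
  dir=${dir%/}   # drop one trailing slash so the paths join cleanly
  for f in "$dir/$base" "$dir/$base.pub"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f" >&2
      return 1
    fi
  done
  echo "ok: $dir/$base"
}
```

For a non-default key this might be used as check_ssh_keypair "$HOME/.ssh/" id_ed25519 && make cluster CLUSTERPATTERN=private SSH_KEY_BASENAME=id_ed25519, where id_ed25519 is only an example key name.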

Is there any documentation that needs to be updated for this PR?

Yes

How do you know this will function as expected in production?

lranjbar · 2024-09-26T15:52:33Z

This is really great! I'd love to move this into openshift/aro-e2e so we can use it more broadly.

But also if we turn it into an Ansible collection, we can set up ansible tests for it. Here is an example of the containerized ansible tests I set up for dev-scripts:

https://github.com/openshift-metal3/dev-scripts/blob/master/agent/tests/Dockerfile.agent-test
https://github.com/openshift-metal3/dev-scripts/blob/master/agent/agent_test_commands.sh
https://github.com/openshift-metal3/dev-scripts/blob/master/agent/agent_tests.sh

We can use the same pattern here. I'm really excited for this!

github-actions · 2024-10-10T19:20:42Z

Please rebase pull request.

ventifus mentioned this pull request Sep 16, 2024

Initial ansible proof of concept #3634

Closed

ventifus force-pushed the ventifus/ARO-7993-ansible-cluster-automation branch 3 times, most recently from 14e4e33 to 80d4f73 Compare September 16, 2024 19:05

ventifus force-pushed the ventifus/ARO-7993-ansible-cluster-automation branch 2 times, most recently from 894412a to 6761819 Compare September 24, 2024 22:05

Initial ansible proof of concept

eac902b

ventifus force-pushed the ventifus/ARO-7993-ansible-cluster-automation branch from 6761819 to eac902b Compare September 25, 2024 18:30

ventifus mentioned this pull request Sep 27, 2024

Ansible cluster installation openshift/aro-e2e#3

Open

github-actions bot added the needs-rebase branch needs a rebase label Oct 10, 2024
