Monday, September 15, 2025

Pytest Setup Cheat Sheet

In 2020, we checked out the Python Setup Cheat Sheet, building all code samples' unit tests with the unittest package's TestCase class. Since then, however, we have learned that pytest allows writing shorter, more readable tests with less boilerplate. Plus we would like to include mocks!

Let's check it out!

Frameworks
When developing code in Python, there are typically five top Python testing frameworks to consider:
 NAME      MONIKER  DESCRIPTION
 unittest  PyUnit   The default Python testing framework, built into the Python Standard Library
 pytest    Pytest   Popular testing framework known for simplicity, flexibility and powerful features
 nose2     Nose2    Enhanced unittest version offering additional plugins to support test execution
 doctest   DocTest  Python Standard Library module that generates tests from docstrings in source code
 Robot     Robot    Keyword-driven acceptance testing framework that simplifies test case automation

Here are some reasons why pytest currently seems to be the most popular Python unit test framework:
  1. Simple and Readable Syntax
     You write plain Python functions instead of creating large verbose classes
     Assertions use plain assert statements which provide more detailed output
  2. Rich Plugin Ecosystem
     Plugins like pytest-mock, pytest-asyncio, pytest-cov, and more
     Easy to extend pytest plugins or write your own custom plugins
  3. Powerful Fixtures
     Allows for clean and re-usable setup and teardown using fixtures
     Supports various test level scopes, autouse, and parametrization
  4. Test Discovery
     Automatically discovers tests in files named test_*.py
     No need to manually register tests or use loader classes
  5. Great Reporting
     Colored output, diffs for failing assertions, and optional verbosity
     Integrates easily with tools like coverage, tox, and CI/CD systems
  6. Supports Complex Testing Needs
     Parameterized tests (@pytest.mark.parametrize)
     parallel test execution (pytest-xdist) + hooks
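These advantages are easy to see in a minimal sketch (the module and function names here are illustrative):

```python
# test_math_utils.py - pytest discovers this file automatically (test_*.py)
# and collects every function whose name starts with test_.

def add(a: int, b: int) -> int:
    return a + b

def test_add():
    # Plain assert: on failure pytest reports both sides of the comparison.
    assert add(2, 3) == 5

def test_add_negative():
    assert add(-1, 1) == 0
```

No base class, no setUp/tearDown, no registration: save the file and run pytest.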

pytest
  pip install pytest

Setup
Depending on your stack here is some great documentation to setup pytest on PyCharm, VS Code or Poetry.

Configuration
In pytest, pytest.ini is the main configuration file used to customize and control pytest behavior across the unit test suite. It hosts pytest options, test paths, plugin settings and markers that can be attached to test functions to categorize, filter or modify their behavior. Here is a sample pytest.ini configuration file to use as a base:
  [pytest]  
  addopts = -ra -q
  testpaths = tests
  markers =
      slow: marks tests as slow (deselect with '-m "not slow"')
      db: marks tests requiring database
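With the markers registered above, a test can be tagged and later filtered on the command line; a sketch (the test name is hypothetical):

```python
import pytest

@pytest.mark.slow
def test_generate_full_report():
    # Deselect on the command line with: pytest -m "not slow"
    assert sum(range(1000)) == 499500
```

Running pytest -m slow selects only these tests, while pytest -m "not slow" skips them.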

Fixtures
Fixtures are functions in pytest that provide a fixed baseline for tests to run. Declared via the @pytest.fixture decorator, fixtures can be used to set up all preconditions for tests, provide data, or perform teardown after tests finish.

Scope
Fixtures have a scope: function, class, module or session, which defines how long the fixture remains available during the test run:
 SCOPE DESCRIPTION
 Function Fixture is created once per test function and destroyed at the end of the test function
 Class Fixture is created once per test class and destroyed at the end of the test class
 Module Fixture is created once per test module and destroyed at the end of the test module
 Session Fixture is created once per test session and destroyed at the end of the test session
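As a sketch, a module-scoped fixture is created once and shared by every test in its module (the dict here is a stand-in for a real resource such as a database connection):

```python
import pytest

def make_connection() -> dict:
    # Stand-in for an expensive resource, e.g. a real database connection.
    return {"connected": True}

@pytest.fixture(scope="module")
def db_connection():
    # Created once per module; every test in the module shares it.
    conn = make_connection()
    yield conn
    conn["connected"] = False  # teardown runs once, after the last test

def test_query(db_connection):
    assert db_connection["connected"]
```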

conftest
In pytest, the conftest.py file is used to share fixtures across multiple tests. All fixtures defined in conftest.py are automatically detected without needing to be imported. conftest.py is typically placed at the root of the test directory structure.

Dependencies
Dependency injection: fixtures can request other fixtures, although this can add complexity to tests!
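A sketch of one fixture requesting another (both fixture names are illustrative):

```python
import pytest

def build_client(base_url: str) -> dict:
    return {"url": base_url, "timeout": 5}

@pytest.fixture
def base_url():
    return "https://api.example.com"  # hypothetical endpoint

@pytest.fixture
def client(base_url):
    # This fixture requests base_url, so pytest resolves base_url
    # first and injects its return value here.
    return build_client(base_url)

def test_client_timeout(client):
    assert client["timeout"] == 5
```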

autouse
A simple trick to avoid requesting a fixture in each test: use the autouse=True flag to apply the fixture to all tests.

yield
When you use yield in a fixture function, setup code executes before the yield and teardown executes after it:
  import pytest  
  @pytest.fixture
  def my_fixture(): 
      # setup code
      yield "fixture value"
      # teardown code

Arguments
Use pytest fixtures with arguments to write re-usable fixtures that can easily be shared across tests, also known as parametrized fixtures, using the @pytest.fixture(params=[0, 1, 2]) syntax. Note: these fixtures should not be confused with the @pytest.mark.parametrize decorator, which can be used to specify inputs and expected outputs!
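Both styles side by side, as a sketch:

```python
import pytest

@pytest.fixture(params=[0, 1, 2])
def number(request):
    # Tests using this fixture run once per param; the current value
    # is exposed through the built-in request object.
    return request.param

def test_non_negative(number):
    assert number >= 0

# By contrast, @pytest.mark.parametrize pairs inputs with expected outputs:
@pytest.mark.parametrize("value,squared", [(2, 4), (3, 9)])
def test_square(value, squared):
    assert value ** 2 == squared
```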

Factories
Factories, in the context of pytest fixtures, are fixtures that return functions used to create test data or objects with a specific configuration in a re-usable manner:
 conftest.py
   import pytest

   @pytest.fixture
   def user_creds():
       def _user_creds(name: str, email: str):
           return {"name": name, "email": email}
       return _user_creds

 test_user_creds.py
   def test_user_creds(user_creds):
       assert user_creds("John", "x@abc.com") == {
           "name": "John",
           "email": "x@abc.com",
       }

Best practices for organizing tests include: organize tests by the testing pyramid, mirror the structure of the application code, group and organize fixtures, and keep tests outside the application code for scalability.

Mocking
Mocking is a technique that allows you to isolate the code being tested from its dependencies so the test can focus on the code under test in isolation. The unittest.mock package offers Mock and MagicMock objects:

Mock
A mock object simulates the behavior of the object it replaces by creating attributes and methods on-the-fly.

MagicMock
Subclass of Mock with default implementations for most magic methods (__len__, __getitem__, etc.). Useful when mocking objects that interact with Python's dunder methods that enable custom behaviors for common operations.
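For example, MagicMock supports len() out of the box while a plain Mock does not:

```python
from unittest.mock import MagicMock, Mock

magic = MagicMock()
magic.__len__.return_value = 3
assert len(magic) == 3        # MagicMock pre-configures dunder methods

plain = Mock()
try:
    len(plain)                # plain Mock does not implement __len__
except TypeError:
    pass                      # len() on a plain Mock raises TypeError
```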

Patching
Patching is a technique that temporarily replaces real objects in code with mock objects during test execution. Patching helps ensure external systems do not affect test outcomes, so tests are consistent and repeatable.

IMPORTANT - Mocks are NOT stubs!
When we combine the @patch decorator with return_value or side_effect, it effectively acts as a stub, but built from the mock package!
 METHOD DESCRIPTION
 return_value Specify the single value the Mock object returns every time the method is called
 side_effect Specify multiple values returned in turn, or an exception raised, when the method is called
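A sketch of both on a plain Mock (the fetch method name is illustrative):

```python
from unittest.mock import Mock

stub = Mock()

# return_value: the same value on every call.
stub.fetch.return_value = 42
assert stub.fetch() == 42
assert stub.fetch() == 42

# side_effect: one value per call, in order...
stub.fetch.side_effect = [1, 2, 3]
assert [stub.fetch() for _ in range(3)] == [1, 2, 3]

# ...or an exception class, to simulate a failure.
stub.fetch.side_effect = TimeoutError
try:
    stub.fetch()
except TimeoutError:
    pass
```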

Difference
In pytest, Mock and patch are both tools for simulating or replacing parts of your code during testing. Mock creates mock objects while patch temporarily replaces real objects with mocks during tests to isolate code:
 Mock
   from unittest.mock import Mock

   mock_obj = Mock()
   mock_obj.some_method.return_value = 42
   result = mock_obj.some_method()
   assert result == 42

 patch
   from unittest.mock import patch
   import module_name  # module that defines external_function

   @patch('module_name.external_function')
   def test_function(mock_external):
       mock_external.return_value = "Mock data"
       result = module_name.external_function()
       assert result == "Mock data"
IMPORTANT
When creating mocks, it is critical to ensure mock objects accurately reflect the objects they replace. Thus, it is best practice to use autospec=True so mock objects respect the function signatures being replaced!
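A sketch using create_autospec from unittest.mock, which applies the same signature checking (the send_email function is hypothetical):

```python
from unittest.mock import create_autospec

def send_email(to: str, subject: str) -> bool:
    raise NotImplementedError  # real implementation not needed for the test

mock_send = create_autospec(send_email, return_value=True)
assert mock_send("x@abc.com", "Hi") is True   # matches the real signature

try:
    mock_send("x@abc.com")                    # missing the subject argument
except TypeError:
    pass                                      # autospec caught the bad call
```

A plain Mock would happily accept the bad call; autospec turns it into an immediate TypeError.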

Assertions
For completeness, here is a list of assertion methods to verify that a method on a mock object was called during tests:
 METHOD DESCRIPTION
 assert_called verify the method on the mock object was called during the test
 assert_called_once verify the method on the mock object was called exactly one time
 assert_called_once_with verify the method on the mock object was called once with specific args
 assert_called_with verify the most recent call to the method on the mock object used specific args
 assert_not_called verify the method on the mock object was not called during the test
 assert_has_calls verify the method on the mock object was called with the given calls in order
 assert_any_call verify the method on the mock object was called with these args at least once
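A few of these in action on a plain Mock (service, save and delete are illustrative names):

```python
from unittest.mock import Mock, call

service = Mock()
service.save("a")
service.save("b")

service.save.assert_called()                    # called at least once
service.save.assert_called_with("b")            # checks the most recent call
service.save.assert_any_call("a")               # matches any earlier call
service.save.assert_has_calls([call("a"), call("b")])  # verifies order
service.delete.assert_not_called()
```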

Monkeypatch
Monkeypatching is a technique used to modify code behavior at runtime, especially where certain dependencies or settings, for example environment variables or system paths, make it challenging to isolate functionality:
  app.py
   import os

   def get_app_mode() -> str:
       app_mode = os.getenv("APP_MODE")
       return app_mode.lower()

  test_app.py
   from app import get_app_mode

   def test_get_app_mode(monkeypatch):
       """Test behavior when APP_MODE is set."""
       monkeypatch.setenv("APP_MODE", "Testing")
       assert get_app_mode() == "testing"

pytest-mock
pytest-mock is a pytest plugin built on top of unittest.mock that provides an easy-to-use mocker fixture for creating mock objects and patching functions. When you use the mocker.patch() method provided by pytest-mock, the default behavior is to replace the object with a MagicMock().
  pip install pytest-mock

  app.py
  import requests
  from http import HTTPStatus
  
  def get_user_name(user_id: int) -> str:
      response = requests.get(f"https://api.example.com/users/{user_id}")
      return response.json()['name'] if response.status_code == HTTPStatus.OK else None

  test_app.py
  from http import HTTPStatus
  from app import get_user_name
  
  def test_get_user_name(mocker):
      mock_response = mocker.Mock()
       mock_response.status_code = HTTPStatus.OK
      mock_response.json.return_value = {'name': 'Test'}
      mocker.patch('app.requests.get', return_value=mock_response)
      result = get_user_name(1)
      assert result == 'Test'

Legacy
In many legacy Python codebases you may detect references to Mock(), MagicMock() and the @patch decorator from unittest.mock used with pytest. Teams often keep the old style unless there is a compelling reason to refactor it.

Recommendation
However, here are some recommendations for future unit testing with pytest:
  1. Prefer pytest-mock and the mocker fixture
     Cleaner syntax than unittest.mock.patch
     Automatically cleaned up after each test
     Plays well with other pytest fixtures
     Centralizes all patching into one fixture (mocker)
  2. Use monkeypatch for patching env vars, system paths, etc.
     Prefer monkeypatch for clarity and idiomatic pytest style
     e.g. os.environ, system paths, or patching open()
  3. Avoid @patch decorators unless migrating old tests
     Can be harder to read or stack with multiple patches
     Better to use mocker.patch() inline as cleaner syntax
  4. Use autospec=True when mocking complex or external APIs
     Ensure mocks behave like the real objects (catch bad call signatures)
  5. Use fixtures to share mocks across tests
     When you have mock used by multiple tests then define it as a fixture
tl;dr
Prefer pytest-mock (mocker fixture) for readability and less boilerplate. Import tools like MagicMock, Mock, call, ANY from unittest.mock when needed. Avoid @patch unless needed — inline mocker.patch() is usually cleaner. Keep everything in one style within a test module for consistency.

pytest-asyncio
Concurrency allows a program to execute its tasks efficiently and asynchronously, i.e. executing tasks while other tasks are waiting. pytest-asyncio simplifies handling event loops and managing async fixtures throughout unit testing.
  pip install pytest-asyncio

  app.py
   import asyncio

   async def fetch_data():
       # Simulate I/O operation.
       await asyncio.sleep(1)
       return {"status": "OK", "data": [42]}

  test_app.py
   import pytest
   from app import fetch_data

   @pytest.mark.asyncio
   async def test_fetch_data():
       result = await fetch_data()
       assert result["status"] == "OK"
       assert result["data"] == [42]
In addition, AsyncMock from unittest.mock allows you to mock asynchronous functions and coroutines.
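A minimal AsyncMock sketch:

```python
import asyncio
from unittest.mock import AsyncMock

async def run_test():
    fetch = AsyncMock(return_value={"status": "OK"})
    result = await fetch(42)           # awaitable, just like a real coroutine
    fetch.assert_awaited_once_with(42) # await-specific assertion helpers
    return result

assert asyncio.run(run_test()) == {"status": "OK"}
```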

CI/CD
GitHub Actions is a feature-rich CI/CD platform that offers an easy and flexible way to automate your testing processes. GitHub Actions mainly consists of files called workflows. A workflow file contains one or several jobs, each consisting of a sequence of steps. Here is a sample YAML file that will trigger the workflow on git push:
  ~/.github/workflows/run_test.yml
  name: Run Unit Test via Pytest
  on: [push]
  jobs:
    build:
      runs-on: ubuntu-latest
      strategy:
        matrix:
          python-version: ["3.10"]
      steps:
        - uses: actions/checkout@v3
        - name: Set up Python ${{ matrix.python-version }}
          uses: actions/setup-python@v4
          with:
            python-version: ${{ matrix.python-version }}
        - name: Install dependencies
          run: |
            python -m pip install --upgrade pip
            if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
        - name: Lint with Ruff
          run: |
            pip install ruff
            ruff --format=github --target-version=py310 .
          continue-on-error: true
        - name: Test with pytest
          run: |
            pip install pytest coverage
            coverage run -m pytest -v -s
        - name: Generate Coverage Report
          run: |
            coverage report -m

Summary
To summarize, we have set up pytest for more robust unit testing with mocks and stubs via patching. Looking forward, there are additional ways to improve the unit test development experience with pytest as per the article:
  1. Use Markers To Prioritise Tests
     Organize tests in such a way that prioritizes key functionalities first
     Running tests with critical functionality first provide faster feedback
  2. Do More With Less (Parametrized Testing)
     Parametrized Testing allows you to test multiple scenarios in single test function
     Feed different parameters into same test logic covering more scenarios + less code
  3. Profiling Tests
     Identify the slow-running unit tests using the --durations=XXX flag
     Use the pytest-profiling plugin to generate tabular and heat graphs
  4. Run Tests In Parallel (Use pytest-xdist)
     Use the pytest-xdist plugin to distribute tests across multiple CPUs
     Tests run in parallel, use resources better, provide faster feedback!

Sunday, August 31, 2025

Cloud CI-CD Cheat Sheet II

In the previous post, we checked out the Cloud CI/CD Cheat Sheet to transition from the 1990s to modern day CI/CD. Now let's integrate GitLab with the GitFlow SDLC to demonstrate the Kubernetes CI/CD pipeline benefits.

Let's check it out!

GitLab CI/CD
   Create .gitlab-ci.yml at the root of the project
   This is the driver file that co-ordinates the stages:
   Build / Lint / Deploy

gitlab-ci.yml


Variables
   Generic variables used in all environments and environment-specific variables to build software
   Rules that can be used to automate deployments to "lower" environments vs. manual deployments
   YAML that builds the Docker image and pushes the image to the container registry of the developer's choice
   YAML with instructions on how to deploy the latest built Docker image to the Kubernetes cluster

 environments.yml  deployment-rules.yml

Artefacts
   YAML files that contain Helm chart artefacts used like Deployment and Service YAML
   YAML files that contain Values to be injected including environment specific variables

 deployment.yaml  service.yaml

NOTE: Hardcoded non-sensitive variables are stored in Values YAML files, including all environment variables:

Whereas sensitive information is stored in Kubernetes secret resources and injected at deployment time.

GitFlow SDLC
Development
   GitLab source code repo has main branch for all the Prod deployments
   GitLab source code repo has develop branch as the integration branch
   develop branch for feature development and deployment to DEV / UAT
   GitFlow: to keep the develop branch stable, cut feature branches off develop

Deployment
   Submit Pull Request | Merge to develop branch | Trigger build
   Auto-deploy to DEV | Manual deploy to UAT [when QA ready]

Testing
   Once the feature is complete on DEV and preliminary testing done on UAT: cut release branch off develop
   Deploy release branch to UAT - complete feature testing and regression testing
   Any bugs found on UAT in the release candidate: cut bugfix branch off the release branch
   Fix bug | Submit Pull Request | Merge to release branch | Re-deploy to UAT [manually]

Release
   Once the release candidate is stable / all bugs fixed: submit Pull Request from release branch to main
   This action will build the pipeline but NOT deploy!! Manually deploy to Prod when stakeholders agree!!


Alignment
   Finally, after deploying to Prod from main, submit PR from main to develop for alignment
   Hotfixes work similar to bugfixes | Cut hotfix branch from main, submit PR and deploy to Prod
   After the hotfix is merged to main and deployed to Prod, submit PR from main to develop for alignment


Kubernetes Management: Rancher
Q. What is Rancher?
Open source platform that simplifies the deployment, scaling and management of your Kubernetes clusters:
   Kubernetes: open source orchestration platform that automates management of containerized apps
   Rancher: open source container platform built on top of Kubernetes to simplify cluster management
   Download Kubernetes cluster configuration kubeconfig files from Rancher to connect to your clusters


Kubernetes kubeconfig
   The kubeconfig file is a YAML configuration defining the Kubernetes clusters, users and contexts used to connect
   Download DEV kubeconfig file from Rancher to localhost ~/.kube/dev-config
   Download UAT kubeconfig file from Rancher to localhost ~/.kube/uat-config

SETUP
  # Setup the global KUBECONFIG environment variable
  export KUBECONFIG=~/.kube/config:~/.kube/dev-config:~/.kube/uat-config
  # Flatten multiple kubeconfig files into one "master" kubeconfig file
  kubectl config view --flatten > one-config.yaml
  # Rename accordingly
  mv one-config.yaml ~/.kube/config
  # Confirm cluster configuration update
  kubectl config get-contexts


Deployment Verification
Monitor cluster - What is kubectl?
   Command line tool to run commands against Kubernetes clusters - communicates using the Kubernetes API
   Post-deployment, use kubectl commands to verify the health of the cluster ensuring all pods re-spawned


TEST Deployment
Finally, test endpoint(s) via curl or in Postman:
  # Test endpoint
  kubectl port-forward service/flask-api-service 8080:80
  curl http://localhost:8080/api/v1 --header "Content-Type: application/json"
  # RESPONSE
  {"message": "Hello World (Python)!"}


CI/CD Pipeline Benefits
Four benefits of CI/CD - a successful pipeline strategy helps your team deliver higher quality software faster!
   Increased speed of innovation + automation = deployments that are faster and more regular
   Code in Production adds immediate value instead of sitting in a deployment queue!
   Engineers become more productive instead of focusing on boring / mundane manual tasks
   Higher code quality due to continuous automated build, test, deploy rinse + repeat cycles

Summary
To summarize, we have now highlighted the back story transitioning from the 1990s to modern day CI/CD and outlined the integration process with GitFlow SDLC to demonstrate Kubernetes CI/CD pipeline benefits!

Thursday, July 31, 2025

Cloud CI-CD Cheat Sheet

In 2024, we checked out the GitLab Cheat Sheet to streamline collaborative workflow and then leverage CI/CD pipelines. However, it is interesting to tell the back story of how we got from the 1990s to modern day CI/CD.

Let's check it out!

Evolution of Software Deployment: Physical Servers to Container Orchestration

Era of Physical Servers: 1990s and Before
Back in the 1990s, software was predominantly deployed directly onto physical servers, often housed in on-premises data centers. Each server was typically dedicated to a specific application [or set of applications].

Challenges: Scalability, Isolation, Resource Utilization
  scaling involved procuring, setting up and deploying to additional physical servers = time consuming + expensive
  multiple apps could interfere with one another leading to system crashes or other performance issues
  some servers underutilized while others overwhelmed which meant inefficient resource distribution

Dawn of Virtualization: 2000s
The introduction of virtualization technologies, like those provided by VMware, allowed multiple Virtual Machines [VMs] to run on a single physical server, with each VM operating as though it were on its own dedicated hardware.

Benefits: Resource Efficiency, Isolation, Snapshot + Cloning
   multiple VMs could share resources of single server leading to better resource utilization
   VMs provide a new level of isolation between apps = failure of one VM did not affect other VMs
   VM state could be saved + cloned making it easier to replicate environments for scaling

Containerization: Rise of Docker
The next significant shift was containerization, with Docker at the forefront. Unlike VMs, containers share the host OS, running in isolated user space, which makes them lightweight, portable and able to start up / shut down more rapidly.

Advantages: Speed, Portability, Density
   containers start almost instantly i.e. applications launched and scaled in only a matter of seconds
   container images are consistent across environments = "it works on my machine" issues minimized
   lightweight nature = many containers run on host machine = better resource utilization than VMs

Container Orchestration: Enter Kubernetes
Increased container adoption prompted the need for container orchestration technologies like Kubernetes to manage, scale and monitor containerized applications, especially those hosted by managed Cloud providers.

Functions: Auto-scaling, Self-healing, Load Balancing, Service Discovery
   orchestration systems can automatically scale apps based on demand or sudden traffic spikes
   if a container or node fails then the orchestrator can restart or replace it = increased reliability!
   incoming requests are automatically distributed across containers ensuring optimal performance
   as containers move across nodes, services can be discovered without any manual intervention

Summary of Definitions
Docker
   Platform as a Service product that uses OS-level virtualization to deliver software in packages called containers
   Containers are isolated from one another and bundle their own software, libraries, and configurations
   All containers share a single OS kernel on the host thus use fewer resources than Virtual Machines

Kubernetes
   Open-source container orchestration system automating app deployment, scaling and management
   Runs containerized applications in a cluster of host machines, from containers typically built using Docker

Helm
   Kubernetes package manager simplifies managing and deploying applications to clusters via "Charts"
   Helm facilitates configuration separated out in Values files and scaled out across all environments

Summary of Technology
Docker
   Dockerfile  text file that contains all commands used to assemble a Docker image template
   Image       executable package that includes code, runtime, environment variables and config
   Container   running instance of a Docker image isolated from other processes running on host

Kubernetes
   Namespace   scope cluster resources and a way to isolate Kubernetes objects
   Workload    containerized application running within the Kubernetes cluster
   Pod         smallest deployable unit as created and managed in Kubernetes
   Node        workloads are placed in Containers on Pods to be run on Nodes
   Replicaset  maintains a stable set of replica pods available running any time
   Deployment  provides a declarative way to update all Pods and Replicasets
   Service     abstract way to expose an application running on a set of Pods

DEMO Hello World
   Execute code on localhost [IDE]
   Build Docker image locally
   Provision local Kubernetes cluster
   TEST after deployment:
     curl http://localhost:8080
     Hello World

Python Flask API application:

DEMO Docker Commands
  # Create KinD cluster
  kind create cluster --name flask-cluster
  # Create Dockerfile | Build Docker image
  docker build --pull --rm -f "Dockerfile" -t flask-api:latest "."
  # Execute Docker container
  docker run --rm -d -p 8080:8080/tcp flask-api:latest
  # Test endpoint
  curl http://localhost:8080

Dockerfile
KinD = Kubernetes in Docker, a tool for running local Kubernetes clusters using Docker container "nodes".

DEMO Kubernetes Commands
  # Load image into KinD cluster
  kind load docker-image flask-api:latest --name flask-cluster
  # Setup KinD cluster
  kubectl create ns test-ns
  kubectl config set-context --current --namespace=test-ns
  # Rollout Kubernetes Deployment and Service resources
  kubectl apply -f Kubernetes.yaml
  # Test endpoint
  curl http://localhost:8080

Kubernetes.yaml


LIMITATIONS
DEMO Hello World is sufficient to demonstrate the process on localhost but has many real world limitations!

Limitations
   Everything is on localhost - Cloud Computing typically requires Kubernetes cluster(s)
   Manually build Docker image from the Dockerfile
   Manually push Docker image to container registry
   Manually deploy running Docker container into Kubernetes cluster [Deployment exposed as Service]
   All Kubernetes resource values are hardcoded into declarative YAML file [Deployment and a Service]
   No facility to scale deployment across multiple environments: DEV, IQA, UAT, Prod
   Environment variables can be injected but it is a very brittle and cumbersome process
   No real immediate and secure way to inject secret information into deployment [secret password]

Solution
Next step is to integrate GitLab CI/CD pipeline to solve these issues and automate build deployment process!
This will be the topic of the next post.

Monday, June 2, 2025

Cloud Setup Cheat Sheet II

In the previous post, we checked out the Cloud Setup Cheat Sheet to explain the cluster provisioning process for managed cloud providers such as Azure AKS. Now we will resume provisioning clusters: Amazon EKS and Google GKE.
Let's check it out!

Pre-Requisites
This blog post assumes an Azure, AWS, GCP account is setup plus all the corresponding CLIs are configured!

AWS Login
Navigate to https://aws.amazon.com | Sign In | Sign in using root user email. Root user | Root user email address e.g. steven_boland@hotmail.com | Next | Enter password. Setup AWS Multi-Factor Authentication.

AWS Single Sign On
To access AWS clusters programmatically, it is recommended to set up and configure AWS SSO. Example config:
  sso_start_url = https://stevepro.awsapps.com/start
  sso_region = eu-west-1
  sso_account_id = 4xxxxxxxxxx8
  sso_role_name = AdministratorAccess
  region = eu-west-1
  output = json

eksctl
Command-line tool that abstracts the complexity involved in setting up AWS EKS clusters. Here is how to install it:

Linux
 curl --silent --location "https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_$(uname
 -s)_amd64.tar.gz" | tar xz -C /tmp
 sudo mv /tmp/eksctl /usr/local/bin

Mac OS/X
 brew tap eksctl-io/eksctl
 brew install eksctl

Windows
 Launch Powershell
 choco install eksctl


Master Key
Next, create a master SSH key for secure, automated and controlled access to your Kubernetes infrastructure:
 cd ~/.ssh
 ssh-keygen -t rsa -b 4096 -N '' -f master_ssh_key
 eval $(ssh-agent -s)
 ssh-add master_ssh_key


Amazon EKS
Amazon provides Elastic Kubernetes Service as a fully managed Kubernetes container orchestration service. Follow all instructions below in order to provision a Kubernetes cluster and test its functionality end-to-end. Download code sample here.

Pre-Requisites
  aws sso login

Check Resources
  aws ec2 describe-instances --query 'Reservations[*].Instances[*].InstanceId' --output table
  aws ec2 describe-addresses --query 'Addresses[*].PublicIp' --output table
  aws ec2 describe-key-pairs --query 'KeyPairs[*].KeyName' --output table
  aws ec2 describe-volumes --query 'Volumes[*].VolumeId' --output table
  aws ec2 describe-vpcs --query 'Vpcs[*].VpcId' --output table
  aws cloudformation list-stacks --query 'StackSummaries[*].StackName' --output table
  aws cloudwatch describe-alarms --query 'MetricAlarms[*].AlarmName' --output table
  aws ecr describe-repositories --query 'repositories[*].repositoryName' --output table
  aws ecs list-clusters --query 'clusterArns' --output table
  aws eks list-clusters --query 'clusters' --output table
  aws elasticbeanstalk describe-environments --query 'Environments[*].EnvironmentName' --output table
  aws elb describe-load-balancers --query 'LoadBalancerDescriptions[*].LoadBalancerName' --output table
  aws elbv2 describe-load-balancers --query 'LoadBalancers[*].LoadBalancerName' --output table
  aws iam list-roles --query 'Roles[*].RoleName' --output table
  aws iam list-users --query 'Users[*].UserName' --output table
  aws lambda list-functions --query 'Functions[*].FunctionName' --output table
  aws rds describe-db-instances --query 'DBInstances[*].DBInstanceIdentifier' --output table
  aws route53 list-hosted-zones --query 'HostedZones[*].Name' --output table
  aws s3 ls
  aws sns list-topics --query 'Topics[*].TopicArn' --output table
  aws sqs list-queues --query 'QueueUrls' --output table
  aws ssm describe-parameters --query 'Parameters[*].Name' --output table

Cluster YAML
  kind: ClusterConfig
  apiVersion: eksctl.io/v1alpha5
  
  metadata:
    name: stevepro-aws-eks
    region: eu-west-1
    version: "1.27"
    tags:
      createdBy: stevepro
  
  kubernetesNetworkConfig:
    ipFamily: IPv4
  
  iam:
    withOIDC: true
    serviceAccounts:
    - metadata:
        name: ebs-csi-controller-sa
        namespace: kube-system
      attachPolicyARNs:
      - "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
      roleOnly: true
      roleName: stevepro-aws-eks-AmazonEKS_EBS_CSI_DriverRole
  
  addons:
  - name: aws-ebs-csi-driver
    version: v1.38.1-eksbuild.2
    serviceAccountRoleARN: \
  	arn:aws:iam::4xxxxxxxxxx8:role/stevepro-aws-eks-AmazonEKS_EBS_CSI_DriverRole
  - name: vpc-cni
    version: v1.19.2-eksbuild.1
  - name: coredns
    version: v1.10.1-eksbuild.18
  - name: kube-proxy
    version: v1.27.16-eksbuild.14
  
  nodeGroups:
    - name: stevepro-aws-eks
      instanceType: m5.large
      desiredCapacity: 0
      minSize: 0
      maxSize: 3
      ssh:
        allow: true
        publicKeyPath: ~/.ssh/master_ssh_key.pub
      preBootstrapCommands:
        - "true"

Create Cluster
  eksctl create cluster -f ~/stevepro-awseks/cluster.yaml          \
     --kubeconfig ~/stevepro-awseks/kubeconfig                     \
     --verbose 5

Scale Nodegroup
  eksctl scale nodegroup                                           \
     --cluster=stevepro-aws-eks                                    \
     --name=stevepro-aws-eks                                       \
     --nodes=3                                                     \
     --nodes-min=0                                                 \
     --nodes-max=3                                                 \
     --verbose 5

Deploy Test
  kubectl create ns test-ns
  kubectl config set-context --current --namespace=test-ns
  kubectl apply -f Kubernetes.yaml
  kubectl port-forward service/flask-api-service 8080:80
  curl http://localhost:8080

Output
  Hello World (Python)!

Shell into Node
  kubectl get po -o wide
  cd ~/.ssh
  ssh -i master_ssh_key ec2-user@node-ip-address

Cleanup
  kubectl delete -f Kubernetes.yaml
  kubectl delete ns test-ns

Delete Cluster
  eksctl delete cluster                                            \
     --name=stevepro-aws-eks                                       \
     --region eu-west-1                                            \
     --force

ERRORS
Error: getting availability zones for region operation error EC2: DescribeAvailabilityZones, StatusCode: 403
Reference: Dashboard | IAM | Users | SteveProXNA | Permissions | Add Permission | AdministratorAccess:
  {
     "Version": "2012-10-17",
     "Statement": [
         {
             "Effect": "Allow",
             "Action": "*",
             "Resource": "*"
         }
     ]
  }

Error: unable to determine AMI from SSM Parameter Store: operation SSM: GetParameter, StatusCode: 400
AWS Dashboard | IAM | Users | SteveProXNA | Create new group | Permission | AdministratorAccess-Amplify
  {
     "Version": "2012-10-17",
     "Statement": [
         {
             "Effect": "Allow",
             "Action": ],
                 "ssm:GetParameter",
                 "ssm:GetParameters"
             ],
             "Resource": "arn:aws:ssm:*:*:parameter/aws/service/eks/optimized-ami/*"
         },
         {
             "Effect": "Allow",
             "Action": "ec2:DescribeImages",
             "Resource": "*"
         }
     ]
  }


Google GKE
Google provides the Google Kubernetes Engine as a fully managed Kubernetes container orchestration service. Follow all instructions below in order to provision a Kubernetes cluster and test its functionality end-to-end.
Download code sample here.

Pre-Requisites
  gcloud auth login
  gcloud auth application-default login
  gcloud auth configure-docker
  gcloud config set project SteveProProject

Check Resources
  gcloud compute instances list
  gcloud compute disks list
  gcloud compute forwarding-rules list
  gcloud compute firewall-rules list
  gcloud compute addresses list
  gcloud container clusters list

Create Cluster
   gcloud container clusters create stevepro-gcp-gke               \
     --project=steveproproject                                     \
     --zone europe-west1-b                                         \
     --machine-type=e2-standard-2                                  \
     --disk-type pd-standard                                       \
     --cluster-version=1.30.10-gke.1070000                         \
     --num-nodes 3                                                 \
     --network=default                                             \
     --create-subnetwork=name=stevepro-gcp-gke-subnet,range=/28    \
     --enable-ip-alias                                             \
     --enable-intra-node-visibility                                \
     --logging=NONE                                                \
     --monitoring=NONE                                             \
     --enable-network-policy                                       \
     --labels=prefix=stevepro-gcp-gke,created-by=${USER}           \
     --no-enable-managed-prometheus                                \
     --quiet --verbosity debug
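The `--create-subnetwork ...,range=/28` flag above requests a very small block; the standard `ipaddress` module shows just how small. The `10.0.0.0` base address below is an arbitrary illustration, since GKE chooses the actual prefix when only a range size is given.

```python
import ipaddress

# Quick check of how small the /28 range requested above actually is.
# The 10.0.0.0 base address is an arbitrary example; GKE picks the
# real prefix when --create-subnetwork only specifies range=/28.
subnet = ipaddress.ip_network("10.0.0.0/28")
print(subnet.num_addresses)       # 16 total addresses
print(subnet.num_addresses - 2)   # 14 usable hosts (minus network/broadcast)
```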

Get Credentials
  gcloud container clusters get-credentials stevepro-gcp-gke       \
     --zone=europe-west1-b                                         \
     --quiet --verbosity debug

IMPORTANT - if you do not have the gke-gcloud-auth-plugin installed then execute the following commands:
  gcloud components install gke-gcloud-auth-plugin
  gke-gcloud-auth-plugin --version

Deploy Test
  kubectl create ns test-ns
  kubectl config set-context --current --namespace=test-ns
  kubectl apply -f Kubernetes.yaml
  kubectl port-forward service/flask-api-service 8080:80
  curl http://localhost:8080

Output
  Hello World (Python)!

Shell into Node
  mkdir -p ~/GitHub/luksa
  cd ~/GitHub/luksa
  git clone https://github.com/luksa/kubectl-plugins.git
  cd kubectl-plugins
  chmod +x kubectl-ssh
  kubectl get nodes
  ./kubectl-ssh node gke-stevepro-gcp-gke-default-pool-0b4ca8ca-sjpj

Cleanup
  kubectl delete -f Kubernetes.yaml
  kubectl delete ns test-ns

Delete Cluster
  gcloud container clusters delete stevepro-gcp-gke                \
     --zone europe-west1-b                                         \
     --quiet --verbosity debug

Summary
To summarize, we have now set up and provisioned Azure AKS, Amazon EKS and Google GKE clusters with end-to-end tests. In the future we could explore provisioning AWS and GCP Kubeadm clusters using Terraform!

Monday, May 5, 2025

Cloud Setup Cheat Sheet

In 2024, we checked out the GitLab Cheat Sheet to streamline collaborative team workflows that leverage CI/CD pipelines. Now, we will explain the cluster provisioning process for managed cloud providers: Azure, AWS + GCP.
Let's check it out!

Pre-Requisites
This blog post assumes Azure, AWS and GCP accounts are set up. The following links document paid or free tiers:
 Azure [Microsoft]  AZ  PAID Tier Account  FREE Tier Account
 Amazon Web Services  AWS  PAID Tier Account  FREE Tier Account
 Google Cloud Platform  GCP  PAID Tier Account  FREE Tier Account

Azure CLI
The Azure Command Line Interface is a set of commands used to create and manage Azure resources. The CLI is available across Azure services and is designed to get you working with Azure quickly, with an emphasis on automation.

Linux
Install the Azure CLI on Linux | Choose an installation method e.g. apt (Ubuntu, Debian) | Launch Terminal
 curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

Mac OS/X
Install Azure CLI on Mac OS/X | Install with Homebrew | Install Homebrew manager if you haven't already!
 brew update && brew install azure-cli

Windows
Install Azure CLI on Windows | Microsoft Install (MSI) | Download the Latest MSI of the Azure CLI (64-bit)
 Download and install https://aka.ms/installazurecliwindowsx64

After installing the Azure CLI on Linux, Mac OS/X, Windows confirm the current working version of the CLI:
 az version


AWS CLI
The AWS Command Line Interface is a unified tool used to manage your AWS services. Use the AWS CLI tool to download, configure, and control AWS services from the command line and automate them through scripts.

Linux
Install the AWS CLI on Linux | Linux tab | Command line installer - Linux x86 (64-bit) | Launch the Terminal
 curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
 unzip awscliv2.zip
 sudo ./aws/install

Mac OS/X
Install the AWS CLI on MacOS/X | macOS tab | GUI installer | Download the macOS pkg file AWSCLIV2.pkg
 Download and install https://awscli.amazonaws.com/AWSCLIV2.pkg

Windows
Install the AWS CLI on Windows | Windows tab | Download MSI | Download Windows (64-bit) AWSCLIV2.msi
 Download and install https://awscli.amazonaws.com/AWSCLIV2.msi

After installing the AWS CLI on Linux, Mac OS/X, Windows confirm the current working version of the CLI:
 aws --version


GCP CLI
The GCP Command Line Interface is used to create and manage Google Cloud resources + services directly from the command line and to perform common platform tasks faster by controlling cloud resources at scale.

Linux
Install the gcloud CLI | Linux tab | Platform Linux 64-bit (x86_64) | Launch Terminal + execute commands:
 curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-linux-x86_64.tar.gz
 tar -xf google-cloud-cli-linux-x86_64.tar.gz
 cd google-cloud-sdk
 ./install.sh

Mac OS/X
Install the gcloud CLI | macOS tab | Platform macOS 64-bit (ARM64, Apple silicon) | Launch Terminal
 curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-darwin-arm.tar.gz
 tar -xf google-cloud-cli-darwin-arm.tar.gz
 cd google-cloud-sdk
 ./install.sh

Windows
Install the gcloud CLI | Windows tab | Download the Google Cloud CLI installer GoogleCloudSDKInstaller.exe
 Download and install https://dl.google.com/dl/cloudsdk/channels/rapid/GoogleCloudSDKInstaller.exe

After installing the gcloud CLI on Linux, Mac OS/X, Windows confirm the current working version of the CLI:
 gcloud init
 gcloud version


Master Key
Next, create a master SSH key for secure, automated and controlled access to your Kubernetes infrastructure:
 cd ~/.ssh
 ssh-keygen -t rsa -b 4096 -N '' -f master_ssh_key
 eval $(ssh-agent -s)
 ssh-add master_ssh_key


Azure AKS
Microsoft provides Azure Kubernetes Service as a fully managed Kubernetes container orchestration service. Follow all instructions below in order to provision a Kubernetes cluster and test its functionality end-to-end.
Download code sample here.

Pre-Requisites
  az login

Check Resources
  az account list --output table
  az group list --output table
  az resource list --output table
  az resource list --query "[?location=='northeurope']" --output table
  az vm list --output table
  az aks list --output table
  az container list --output table
  az storage account list --output table
  az network public-ip list --output table

Create Group
  az group create --name stevepro-azraks-rg --location northeurope --debug

Service Principal
  az ad sp create-for-rbac --name ${USER}-sp --skip-assignment

Output
  {
     "appId": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
     "displayName": "stevepro-sp",
     "password": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
     "tenant": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  }

Export
  export AZ_SP_ID=<value_from_appId>
  export AZ_SP_PASSWORD=<value_from_password>

Create Cluster
  az aks create --name stevepro-azraks                 \
     --resource-group stevepro-azraks-rg               \
     --dns-name-prefix stevepro-azraks                 \
     --node-count 3                                    \
     --node-vm-size Standard_D2s_v3                    \
     --kubernetes-version 1.31                         \
     --ssh-key-value ~/.ssh/master_ssh_key.pub         \
     --service-principal ${AZ_SP_ID}                   \
     --client-secret ${AZ_SP_PASSWORD}                 \
     --load-balancer-sku standard                      \
     --network-plugin azure --debug

Get Credentials
  export KUBECONFIG=~/.kube/config
  az aks get-credentials --name stevepro-azraks        \
     --resource-group stevepro-azraks-rg --file ~/.kube/config

Deploy Test
  kubectl create ns test-ns
  kubectl config set-context --current --namespace=test-ns
  kubectl apply -f Kubernetes.yaml
  kubectl port-forward service/flask-api-service 8080:80
  curl http://localhost:8080

Output
  Hello World (Python)!

Shell into Node
  mkdir -p ~/GitHub/luksa
  cd ~/GitHub/luksa
  git clone https://github.com/luksa/kubectl-plugins.git
  cd kubectl-plugins
  chmod +x kubectl-ssh
  kubectl get nodes
  ./kubectl-ssh node aks-nodepool1-20972701-vmss000000

Cleanup
  kubectl delete -f Kubernetes.yaml
  kubectl delete ns test-ns

Delete Cluster
  az aks delete --name stevepro-azraks                 \
     --resource-group stevepro-azraks-rg

Delete Group
  az group delete --name stevepro-azraks-rg --yes --no-wait
  az group delete --name NetworkWatcherRG --yes --no-wait

Summary
To summarize, we have set up CLIs for Azure, Amazon and Google and provisioned an Azure AKS Kubernetes cluster with end-to-end testing. Next, we will proceed to provision clusters for Amazon EKS and Google GKE. This will be the topic of the next post.

Wednesday, January 1, 2025

Retrospective XVI

Last year, I conducted a simple retrospective for 2023. Therefore, here is a retrospective for year 2024.

2024 Achievements
  • Transfer all Windows and Linux keyboard shortcuts and muscle memory to new MacBook Pro
  • Transfer all important Windows and Linux applications and navigation to M1-powered MacBooks
  • Build GitLab CI/CD pipelines extending DevOps skillset and streamlining collaborative workflows
  • Provision Kubernetes clusters for GitLab CI/CD pipelines e.g. Azure AKS, AWS EKS, GCP GKE
  • Configure Doom open source port for Windows and Linux to debug step thru the source code
  • Launch fulltime Python coding experience to learn AI focusing on Reinforcement Learning (RL)
  • Experiment with OpenAI Gym project for RL research and build available Atari environments
  • Investigate OpenAI Retro project for RL research on classic Sega 8-bit + 16-bit video games

Note: integrating OpenAI projects with classic Sega 8/16-bit video games is a big achievement!

2025 Objectives
  • Document DevOps managed cluster provisioning experience with AWS / Azure / GCP providers
  • Channel cloud computing knowledge toward software architecture or infrastructure certification
  • Harness Python's potential power invoking C/C++ [PyBind11] with code orders of magnitude faster
  • Extend OpenAI Gym and Retro projects for more Indie video games + Reinforcement Learning!

Artificial Intelligence
Artificial Intelligence refers to the capability of machines to imitate human intelligence. AI empowers machines to acquire knowledge, adapt and independently make decisions, like teaching a computer to act like a human.

Machine Learning
AI involves a crucial element known as Machine Learning. ML is akin to training computers to improve at tasks without providing detailed instructions. Machines utilize data to learn and enhance their performance without explicit programming; ML concentrates on creating algorithms that allow computers to learn from data and improve.
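To make "learning from data without explicit programming" concrete, here is a minimal, purely illustrative sketch: fitting the line y = 2x by gradient descent in plain Python. The rule y = 2x is never written into the program; the weight is recovered from the data alone.

```python
# Minimal sketch of "learning from data": fit y = w * x by gradient
# descent, with no explicit rule for w ever programmed in.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # generated by the hidden rule y = 2x

w = 0.0                      # initial guess
lr = 0.01                    # learning rate
for _ in range(1000):
    # gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

print(round(w, 3))  # converges to roughly 2.0
```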

Deep Learning
Deep Learning involves artificial neural networks inspired by, and mimicking, how the human brain works. DL excels at handling complex tasks and large datasets efficiently and achieves remarkable success in areas like natural language processing and computer vision, despite complexity and interpretation challenges.

Generative AI
Generative AI is the latest innovation in the AI field. Instead of just identifying patterns, GenAI goes one step further by actually attempting to produce new content that closely resembles what humans might create.

Outline
 Artificial Intelligence  Artificial Intelligence is the "big brain"
 Machine Learning  Machine Learning is its learning process
 Deep Learning  Deep Learning is its intricate wiring
 Generative AI  Generative AI is the creative spark


Gen AI and LLMs are revolutionizing our personal and professional lives. From supercharged digital assistants to seemingly omniscient chatbots, these technologies are driving a new era of convenience, productivity, and connectivity.

Traditional AI uses predictive models to classify data, recognize patterns, + predict outcomes within a specific context whereas Gen AI models generate entirely new outputs rather than simply making predictions based on prior experience.

This shift from prediction to creation opens up new realms of innovation: in healthcare a traditional predictive model can spot a suspicious lesion in a lung tissue MRI, whereas GenAI could also determine the likelihood that a patient will develop pneumonia or other lung diseases and offer treatment recommendations based on best practices gleaned from thousands of similar cases.

Example
GenAI-powered healthcare chatbots can assist patients, healthcare providers, and medical administrators:
 01. Symptom Checker  07. Mental Health Support
 02. Appointment Scheduling  08. Insurance and Billing Assistance
 03. Medication Reminders  09. Virtual Consultations and Telemedicine
 04. Health Tips and Preventive Care  10. Health Records Access
 05. Lab Results Interpretation  11. Emergency Triage
 06. Chronic Disease Management  

By leveraging conversational AI, healthcare chatbots can improve patient engagement, provide real-time support, and optimize workflows for healthcare providers. Finally, Reinforcement Learning from Human Feedback (RLHF) can be integrated to further improve model performance over the original pre-trained version!

Future
Artificial Intelligence is changing industries across the globe, from healthcare and finance to marketing and logistics. As we enter 2025, the demand for skilled AI professionals continues to soar. Start out by building strong foundations in Python and understanding key concepts such as machine learning and neural networks.

Therefore, whether an AI beginner or seasoned tech professional, here are the top 10 AI skills for success:
 No.  AI Skill  Key Tools
 01  Machine Learning (ML)  Scikit-learn, TensorFlow, PyTorch
 02  Deep Learning  Keras, PyTorch, Google Colab
 03  Natural Language Processing (NLP)  NLTK, SpaCy, GPT-based models (e.g., GPT-4)
 04  Data Science and Analytics  NumPy, Pandas, Jupyter Notebooks
 05  Computer Vision  OpenCV, YOLO (You Only Look Once), TensorFlow
 06  AI Ethics and Bias Mitigation  AI Ethics Courses, Fairness Indicators (Google)
 07  AI Infrastructure and Cloud Computing  Amazon Web Services, Microsoft Azure, Google Cloud AI
 08  Reinforcement Learning  OpenAI Gym, TensorFlow Agents, Stable Baselines3
 09  AI Operations (MLOps)  Docker, Kubernetes, Kubeflow, MLflow
 10  Generative AI  Generative Adversarial Networks, DALL-E, GPT models

Finally, the GenAI market is poised to explode, growing to $1.3 trillion over the next 10 years from a market size of just $40 billion in 2022. Therefore, it would be extraordinary to integrate GenAI to build content for OpenAI-based retro video games, only to then train Reinforcement Learning algorithms to beat them :)
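As a back-of-envelope check, growing from $40 billion to $1.3 trillion over a decade implies a compound annual growth rate of roughly 42%:

```python
# Back-of-envelope check of the growth claim above: $40B in 2022 to
# $1.3T roughly a decade later implies this compound annual growth rate.
start, end, years = 40e9, 1.3e12, 10
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # ~41.6% per year
```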