Monday, August 31, 2020

Python Setup Cheat Sheet II

In the previous post we checked out Python Setup Cheat Sheet to download and install Python and set up an integrated development environment for Python programming. We installed Anaconda, an open source Python distribution used for data science. Let's continue the setup to build artificial intelligence + machine learning apps.

Let's check it out!


Install VS Code
VS Code is an integrated development environment with useful plugins for Python and Data Science. Install VS Code from Anaconda navigator. Otherwise install VS Code for Windows, Mac OS/X or Linux directly.

Launch VS Code. Install the Python extension for Visual Studio Code. Install other plugins, for example Remote SSH to connect to a Linux VM, or Code Runner. Install VS Code Insiders to try the pre-release version of VS Code.

Update the global settings.json file to your particular preferences. Its location depends on the system, and some common example settings follow the table:
 SYSTEM  LOCATION
 Windows  %APPDATA%/Code/User/settings.json
 Mac OS/X  ~/Library/Application Support/Code/User/settings.json
 Linux  ~/.config/Code/User/settings.json
Note: for VS Code Insiders, replace Code/User/settings.json with Code - Insiders/User/settings.json in each location.
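The locations in the table can also be resolved from Python, which is handy in setup scripts. This is only a minimal sketch: the function name user_settings_path and its parameters are made up for illustration, and it assumes the default install locations listed above.

```python
import os
import platform
from pathlib import Path


def user_settings_path(insiders=False, system=None):
    """Return the expected location of the user settings.json file."""
    system = system or platform.system()
    product = "Code - Insiders" if insiders else "Code"
    if system == "Windows":
        # %APPDATA% normally points at the user's AppData/Roaming folder
        base = Path(os.environ.get("APPDATA", Path.home() / "AppData" / "Roaming"))
    elif system == "Darwin":  # Mac OS/X
        base = Path.home() / "Library" / "Application Support"
    else:  # treat everything else as Linux
        base = Path.home() / ".config"
    return base / product / "User" / "settings.json"


if __name__ == "__main__":
    print(user_settings_path())
    print(user_settings_path(insiders=True))
```

The path is only computed, never opened, so the script is safe to run even when VS Code is not installed.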

settings.json
 {
    "workbench.colorTheme": "Visual Studio Light",
    "window.zoomLevel": 0,
    "editor.trimAutoWhitespace": false,
    "editor.renderIndentGuides": false,
    "editor.roundedSelection": false,
    "editor.suggestSelection": "first",
    "python.dataScience.sendSelectionToInteractiveWindow": true,
    "python.jediEnabled": false,
    "code-runner.clearPreviousOutput": true,
    "code-runner.runInTerminal": true,
    "code-runner.showExecutionMessage": false,
    "code-runner.respectShebang": false,
    "code-runner.defaultLanguage": "python3",
 }

 Windows
 {
    "python.pythonPath": "%USERPROFILE%/Anaconda3/python.exe",
    "code-runner.executorMap": {
        "python": "python.exe",
        "python.pythonPath": "%USERPROFILE%/Anaconda3",
    },
 }

 Mac OS/X + Linux
 {
    "python.pythonPath": "/anaconda3/bin/python",
    "code-runner.executorMap": {
        "python": "python",
        "python.pythonPath": "/anaconda3/bin",
    },
 }
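If you are unsure which path belongs in python.pythonPath, one option is to ask the interpreter itself. The sketch below assumes you run it with the Anaconda interpreter you want VS Code to use; the helper name interpreter_summary is hypothetical.

```python
import sys


def interpreter_summary():
    """Describe the interpreter that is running this script."""
    return {
        "executable": sys.executable,  # candidate value for python.pythonPath
        "version": ".".join(map(str, sys.version_info[:3])),
        "prefix": sys.prefix,  # root folder of the installation or environment
    }


if __name__ == "__main__":
    for key, value in interpreter_summary().items():
        print(f"{key}: {value}")
```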


Hello VS Code
Create folder "HelloVScode". Launch VS Code. File | Open Folder | HelloVScode | Select Folder. Create simple "HelloWorld.py" file. Enter simple code print('Hello World'). Press F5 to debug Python script. To customize Run + Debug, click "create a launch.json file", then click Python File: Debug the currently active Python file. Press F5.

IMPORTANT
Press Ctrl + Shift + P | Python: Select Interpreter. If this creates a new workspace settings.json that overrides the Python interpreter from the user settings.json then do not check python.pythonPath into source control when deploying cross platform.

Virtual Environment
Unlike PyCharm, VS Code does not automagically create an isolated Python environment for new projects. Therefore, follow all instructions here to set up and activate a virtual environment for Python using VS Code.
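For reference, an isolated environment can also be created from Python with the standard library venv module. This is a rough sketch rather than the linked instructions: the folder name .venv and the helper create_environment are assumptions, and Anaconda users may prefer conda environments instead.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_environment(project_dir, name=".venv"):
    """Create an isolated environment inside project_dir and return its path."""
    env_dir = Path(project_dir) / name
    # with_pip=False keeps the sketch fast; pass with_pip=True to bootstrap pip too
    # symlinks mirrors the default of "python -m venv" (off on Windows only)
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as project:
        env = create_environment(project)
        print((env / "pyvenv.cfg").read_text())
```

After creating the environment in a real project folder, choose it via Python: Select Interpreter.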

CommandNotFoundError
When running Python script using Anaconda you may get error CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'. Terminal | Click "+" | Spawn new Terminal default "Conda".

Code Runner
Automate process by installing Code Runner plugin. Press F1 | Run Code first time to set Output to "Code". Ensure entries above in user settings.json for fast turnaround. Press Ctrl + Alt + N for each subsequent run!

IMPORTANT
On Mac OS/X if Ctrl + Alt + N does not work then you may have to swap Ctrl for Cmd in keybindings.json:
~/Library/Application Support/Code/User/keybindings.json
// Place your key bindings in this file to override the defaults
[
    {
        "key": "cmd+alt+n",
        "command": "code-runner.run"
    },
    {
        "key": "ctrl+alt+n",
        "command": "-code-runner.run"
    }
]


Code Sample
Let's test drive a simple code sample, developed as a Python module that could be deployed as a Python package.

IMPORTANT
A module is a single Python script file whereas a package is a collection of modules. A package is a directory of Python modules containing an additional __init__.py file to distinguish from a directory of Python scripts.
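To make the module versus package distinction concrete, here is a small sketch that writes a throwaway package to a temporary folder and imports a module from it. The names demo_pkg and build_demo_package are invented for this example only.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root, name="demo_pkg"):
    """Write a package folder holding __init__.py plus one module."""
    package_dir = Path(root) / name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")  # marks the folder as a package
    (package_dir / "module.py").write_text(
        "def add_one(number):\n    return number + 1\n"
    )
    return package_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_demo_package(root)
        sys.path.insert(0, root)  # make the parent folder importable
        module = importlib.import_module("demo_pkg.module")
        print(module.add_one(5))  # prints 6
        sys.path.remove(root)
```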

Create folder "PackageVScode". Launch VS Code. File | Open Folder | PackageVScode. Create New Folder "MyPackage". Create other top level folders, for example, Build + Docs etc. Create hidden .vscode folder.

Create sub folders src and tests beneath "MyPackage". Create requirements.txt file + setup.py beneath "MyPackage" also. Finally, create module.py and __init__.py under src and test_module.py under tests.

 test_module.py
 import unittest
 from MyPackage.src.module import add_one

 class TestSimple(unittest.TestCase):

    def test_add_one(self):
        result = add_one(5)
        self.assertEqual(result, 6)

 if __name__ == '__main__':
    unittest.main()

 module.py
 def add_one(number):
    return number + 1

Add launch.json to .vscode folder. Accept default to launch current Python file from integrated terminal:
launch.json
 {
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal"
        }
    ]
 }

Open test_module.py. Press F5 to debug. The output is ModuleNotFoundError: No module named 'MyPackage'. In VS Code you must correctly set PYTHONPATH in order to debug, step through and run local source code!
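When an import fails like this, it can help to look at what Python actually searches. The diagnostic below is a sketch with a hypothetical helper, can_import; it prints PYTHONPATH and sys.path and reports whether a dotted module name can be located.

```python
import importlib.util
import os
import sys


def can_import(dotted_name):
    """Return True when dotted_name can be located on the current sys.path."""
    try:
        # note: for dotted names find_spec imports the parent packages first
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # raised when a parent package (e.g. MyPackage) cannot be found
        return False


if __name__ == "__main__":
    print("PYTHONPATH =", os.environ.get("PYTHONPATH", "<not set>"))
    for entry in sys.path:
        print("  ", entry)
    print("MyPackage.src.module importable:", can_import("MyPackage.src.module"))
```

If the workspace folder is missing from the printed list, the import in test_module.py cannot succeed.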

Add settings.json to .vscode folder. Configure Python source folders by configuring the env PYTHONPATH:
settings.json
 {
    "terminal.integrated.env.windows": {
        "PYTHONPATH": "${env:PYTHONPATH};${workspaceFolder}"
    },
    "terminal.integrated.env.osx": {
        "PYTHONPATH": "${env:PYTHONPATH}:${workspaceFolder}"
    },
    "terminal.integrated.env.linux": {
        "PYTHONPATH": "${env:PYTHONPATH}:${workspaceFolder}"
    },
    "python.envFile": "${workspaceFolder}/.env"
 }

Finally, you must add a "hidden" .env file beneath "PackageVScode" that sets the workspace folder per platform:
 WORKSPACE_FOLDER=/Absolute/Path/to/PackageVScode/
 PYTHONPATH=${WORKSPACE_FOLDER}

IMPORTANT
The "hidden" .env file "lives" locally to the project and should NOT be checked into source code version control! It may also be necessary to close VS Code and re-open after configuring the .env file. On Windows, paths in the .env file must use "/" as the separator.
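To check what the two lines of the .env file resolve to, the variable substitution can be approximated in a few lines of Python. This is only a rough sketch of KEY=VALUE parsing with ${NAME} expansion; parse_env is a made-up helper, not the parser VS Code itself uses.

```python
import re


def parse_env(text, base=None):
    """Parse KEY=VALUE lines, expanding ${NAME} from earlier keys or base."""
    values = dict(base or {})
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and malformed lines
        key, _, value = line.partition("=")
        # replace ${NAME} with a known value, or an empty string if unknown
        value = re.sub(
            r"\$\{(\w+)\}", lambda m: values.get(m.group(1), ""), value.strip()
        )
        values[key.strip()] = value
    return values


if __name__ == "__main__":
    sample = (
        "WORKSPACE_FOLDER=/Absolute/Path/to/PackageVScode/\n"
        "PYTHONPATH=${WORKSPACE_FOLDER}\n"
    )
    print(parse_env(sample)["PYTHONPATH"])  # /Absolute/Path/to/PackageVScode/
```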

Now press F5 to debug test code or press Ctrl + Alt + N for Code Runner. Alternatively, Terminal command:

python -m unittest discover MyPackage
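The same discovery can be driven from Python with unittest.TestLoader. The sketch below builds a throwaway project in a temporary folder (all folder and helper names are invented) and suggests why an __init__.py matters: a sub folder without one is skipped, so no tests are found.

```python
import tempfile
import unittest
from pathlib import Path

TEST_SOURCE = (
    "import unittest\n"
    "\n"
    "class TestSimple(unittest.TestCase):\n"
    "    def test_truth(self):\n"
    "        self.assertEqual(5 + 1, 6)\n"
)


def count_discovered(folder_name, with_init):
    """Build a throwaway project and count the tests that discovery finds.

    Use a fresh folder_name per call: imported test packages stay cached.
    """
    with tempfile.TemporaryDirectory() as root:
        tests_dir = Path(root) / folder_name
        tests_dir.mkdir()
        if with_init:
            (tests_dir / "__init__.py").write_text("")
        (tests_dir / "test_sample.py").write_text(TEST_SOURCE)
        # same search that "python -m unittest discover <root>" performs
        suite = unittest.TestLoader().discover(start_dir=root)
        return suite.countTestCases()


if __name__ == "__main__":
    print(count_discovered("demo_tests_a", with_init=True))   # prints 1
    print(count_discovered("demo_tests_b", with_init=False))  # prints 0
```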

IMPORTANT
If the Terminal reports Ran 0 tests then ensure the package has __init__.py set up in every relevant sub folder. Finally, enter dependencies, e.g. numpy, in the requirements.txt file and setup.py. Install the package at the terminal:
 setup.py
 import setuptools

 with open("README.md", "r") as fh:
    LONG_DESCRIPTION = fh.read()

 setuptools.setup(
    name='MyPackage',
    version='0.1.2',
    description="My test package.",
    long_description=LONG_DESCRIPTION,
    long_description_content_type="text/markdown",
    packages=setuptools.find_packages(),
    install_requires=[
        "numpy>=1.17.1",
    ]
 )

 requirements.txt
 numpy>=1.17.1

pip install MyPackage/.

Finally, we could also replicate the unit test code directly on the REPL. Select Terminal tab. Enter commands:

 python
 >>> from MyPackage.src.module import add_one
 >>> result = add_one(5)
 >>> print(result)

Code Linting
Linting highlights syntactical and stylistic problems in Python source code which helps identify and correct subtle programming errors. In VS Code, navigate to the Python script via Terminal and run the linter, e.g. pylint test_module.py.

Alternatively, select Terminal and type a specific linter like flake8 to check Python source code against the PEP 8 coding style and for programming errors. If flake8 is not installed then simply type pip install flake8 at Terminal.
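To get a feel for what a style linter reports, here is a toy checker for two PEP 8 style problems. It is an illustration only and in no way a substitute for flake8 or pylint; the function check_style is hypothetical, and the codes E501, W291 and W293 are borrowed from pycodestyle so the output looks familiar.

```python
def check_style(source, max_length=79):
    """Return (line_number, code, message) tuples for simple style problems."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > max_length:
            message = f"line too long ({len(line)} > {max_length} characters)"
            problems.append((number, "E501", message))
        stripped = line.rstrip()
        if line != stripped:
            if stripped:
                problems.append((number, "W291", "trailing whitespace"))
            else:
                problems.append((number, "W293", "blank line contains whitespace"))
    return problems


if __name__ == "__main__":
    sample = "def add_one(number):  \n    return number + 1\n"
    for number, code, message in check_style(sample):
        print(f"{number}: {code} {message}")  # prints 1: W291 trailing whitespace
```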


Jupyter Notebooks
Notebooks are becoming the standard for prototyping and analysis for data scientists. Many cloud providers and Anaconda navigator offer machine learning and deep learning services in the form of Jupyter notebooks.

Launch Anaconda navigator | Choose Jupyter Notebook | Launch. After the browser launches create a New | Python 3 | Jupyter notebook. Follow tutorial. Change browser from Google Chrome to Firefox if any issues! Alternatively, create notebook in VS Code | Ctrl + Shift + P | Python: Create New Blank Jupyter Notebook.


Remote SSH
Machine learning and AI projects often require remote development, for example using VS Code on Windows to develop in a Linux-based environment. Install Remote SSH to run and debug Linux-based applications from Windows. Start | run | cmd. ssh username@linux_server. Enter passphrase. Enter verification code if set up for MFA.

SSH Keys
Use SSH key authentication and set up SSH keys to connect the local Windows host and the remote Linux VM server.
 Windows
 cd %USERPROFILE%
 cd .ssh
 ssh-keygen -C "username@emailaddress.com"

 Linux
 ssh username@linux_server
 cd ~/.ssh
 ssh-keygen -C "username@emailaddress.com"

Dump the contents of id_rsa.pub file from local Windows host to authorized_keys file on Linux VM server:
cd ~/.ssh
echo "contents_of_Windows_id_rsa.pub_file" >> authorized_keys
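Running the echo command twice adds the key twice. As an alternative, the append step could be sketched in Python so that it is safe to repeat. The helper authorize_key and its ssh_dir parameter are assumptions for illustration, and the permission handling reflects common OpenSSH setups with StrictModes enabled.

```python
import os
from pathlib import Path


def authorize_key(public_key, ssh_dir=None):
    """Append public_key to authorized_keys unless it is already listed."""
    ssh_dir = Path(ssh_dir) if ssh_dir else Path.home() / ".ssh"
    ssh_dir.mkdir(mode=0o700, exist_ok=True)
    target = ssh_dir / "authorized_keys"
    text = target.read_text() if target.exists() else ""
    key = public_key.strip()
    if key in text.splitlines():
        return False  # key already present, nothing to do
    with target.open("a") as handle:
        if text and not text.endswith("\n"):
            handle.write("\n")  # keep the new key on its own line
        handle.write(key + "\n")
    os.chmod(target, 0o600)  # keep the file private to its owner
    return True
```

Call authorize_key("contents_of_Windows_id_rsa.pub_file") on the Linux VM server; it returns False when the key was already there.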

Finally, configure %USERPROFILE%\.ssh\config file with Linux VM server information to alias connection.
 Host ENVIRONMENT
     HostName linux_server
     User username
     IdentityFile ~/.ssh/id_rsa

SSH Tunnel
Port forwarding via SSH Tunnel creates a secure connection between the local computer and remote machine through which services can be relayed. SSH Tunnel is useful for transmitting data over encrypted connection.
 ssh -N -L localhost:8787:LoadBalancer_External-IP:8000 username@linux_server

The same technique can be used to access Jupyter Notebook on the Linux VM server. Launch a terminal and start the notebook server on the Linux VM, then open the tunnel from a second local terminal:
 ssh username@linux_server
 cd ~/
 jupyter notebook --no-browser --port=8889
 ssh -N -L localhost:8889:localhost:8889 username@linux_server
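Before opening the browser, a quick check from the local machine can confirm that something is listening on the forwarded port. This is a minimal sketch using the standard socket module; the helper port_is_open is hypothetical and the default port 8889 simply matches the command above.

```python
import socket


def port_is_open(host="localhost", port=8889, timeout=2.0):
    """Return True when something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("tunnel up" if port_is_open() else "tunnel down")
```

A True result only shows that the local end of the tunnel accepts connections, not that the notebook server behind it is healthy.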

WinSCP
Install WinSCP, a popular SFTP client, to navigate + copy files between local Windows and the remote Linux VM. Launch WinSCP. Enter Host Name, Port number, User name, Password and verification code if set up for MFA.


Summary
To summarize, we have set up Python distribution Anaconda on Windows, Mac OS/X and Linux to now build artificial intelligence and machine learning apps. We are now set to develop machine learning models then deploy using Flask API. Apps can then be containerized using Docker and orchestrated using Kubernetes to significantly increase the efficiency of a Continuous Integration / Continuous Deployment infrastructure.