The CAF is a collection of proven tools and documentation, including best practices, reference architectures, and implementation guidance. This allows business and technology strategies to be aligned so that they accelerate cloud adoption in a controlled and governed manner. The focus for this content is the cloud architect, who is the conduit for discussions and activity between the business and operations teams, and acts as the thought leader for the organization.
The CAF provides various methodologies, as per the following diagram:
Figure 9.6 – Azure CAF methodologies
Let’s look at these in more detail:
Strategy: Define justification and outcomes.
Plan: Align business outcomes to actionable adoption plans.
Ready: Preparation of the cloud environment.
Migrate: Existing workloads move and are modernized.
Innovate: New workload development using cloud-native or hybrid solutions.
In this section, we looked at the CAF for Azure. The following section contains the hands-on exercises for this chapter, which will help you build on the skills you have learned.
Hands-on exercises
To support your learning with some practical skills, we will create some of the resources that were covered in this chapter.
The following exercises will be carried out:
Exercise 1 – assigning access with RBAC
Exercise 2 – creating a custom RBAC role
Exercise 3 – adding a resource lock to a resource group
Exercise 4 – enabling resource tagging with Azure Policy
Exercise 5 – limiting the resource creation location with Azure Policy
Getting started
To get started with these hands-on exercises, you will need an Azure subscription that has access to create and delete resources in the subscription; you can use an existing account that you have created as part of the exercises from any chapter in this book. Alternatively, you can create a free Azure account by going to https://azure.microsoft.com/free.
Task – removing the policy assignment from the previous exercise
Before starting this exercise, if you created the policy assignment for the previous exercise and have not deleted this yet, do so now by performing the following step.
From the Assignments blade, locate the assignment to delete from the list. Then, right-click and select Delete assignment from the pop-up menu.
Task – creating a policy assignment
In the search bar, type policy; click Policy from the results list.
From the Policy blade, click Assignments under Authoring on the left navigation menu.
Click Assign policy from the top toolbar.
On the Basics tab, in the Basics section, click the ellipsis button on the right-hand side of the Policy definition field.
From the Available Definitions page that appears, in the search box, enter allowed locations.
From the policy definition search results, click Allowed locations.
Click Select.
From the Parameters tab, select the allowed locations for resource creation.
Click Next: Review + create.
On the Review + create tab, review your settings; you may go back to the previous tabs and make any edits if required. Once you have confirmed your settings, click Create.
You will receive a notification that the policy assignment succeeded.
Task – testing the policy function
In the search bar, type virtual machines; click Virtual machines from the results list.
From the Virtual machines blade, click the + Create button via the top toolbar and select Virtual machine.
From the Basics tab, set the Project details as required.
In the Instance details section, select a region that is not in the policy's allowed locations; this lets us test that the policy blocks resource creation outside those locations.
You will receive a policy enforcement notification stating that, in this example, the selected region is not one of the locations in which the policy allows resources to be created:
Policy enforcement. Value does not meet requirements on resource: Microsoft.Compute/virtualMachines
The field ‘Location’ with the value ‘(Europe) West Europe’ is denied
To remediate this, in the Instance details section, select a region that is in the policy's allowed locations.
You will no longer receive the policy enforcement message and will be allowed to continue with resource creation in the policy allowed location.
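The behavior tested in the preceding steps can be sketched in a few lines of Python. This is only a simplified model of the decision the Allowed locations policy makes, not the Azure Policy engine itself; the class name, function name, and the example region values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRequest:
    """A simplified stand-in for a resource creation request."""
    resource_type: str
    location: str


def evaluate_allowed_locations(request: ResourceRequest, allowed: set[str]) -> str:
    """Return "deny" when the requested location is outside the allowed set."""
    normalized = {location.lower() for location in allowed}
    if request.location.lower() not in normalized:
        return "deny"
    return "allow"


# Hypothetical parameter values chosen on the Parameters tab.
allowed_locations = {"uksouth", "ukwest"}

denied = evaluate_allowed_locations(
    ResourceRequest("Microsoft.Compute/virtualMachines", "westeurope"),
    allowed_locations,
)
permitted = evaluate_allowed_locations(
    ResourceRequest("Microsoft.Compute/virtualMachines", "uksouth"),
    allowed_locations,
)
print(denied, permitted)
```

The first request models the denied West Europe deployment from the test; the second models the remediated deployment to an allowed region.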
The final task is to clean up and delete the assigned policy that was created in this exercise; this can be achieved by performing the following step.
From the Assignments blade, locate the assignment to delete from the list. Then, right-click and select Delete assignment from the pop-up menu.
In this exercise, we successfully limited the resource creation location with Azure Policy.
This section covered the hands-on exercises for this chapter. The following section provides a summary of this chapter.
As we learned in Chapter 1, Introduction to Cloud Computing, security is a shared responsibility model. This means that certain responsibilities transfer to the cloud provider in a cloud environment operating model, while other responsibilities are retained by the customer; you should understand when it is your responsibility to provide the appropriate level of security and control, and when it is not your responsibility but instead that of the cloud services provider to ensure that their platform is kept compliant and your data is kept private.
The following security model diagram visually sets out the division or separation of responsibilities between the consumer of the cloud resources and the cloud services provider itself:
Figure 10.1 – Shared responsibility model
The most critical responsibilities to be aware of are those that you, as the consumer of cloud services, always retain and must secure and protect.
Security, compliance, privacy, and transparency are fundamental for a trust model and are the core tenets of Microsoft Online Services; the following diagram represents Microsoft’s trusted cloud principles:
Figure 10.2 – Microsoft trusted cloud principles
The preceding diagram shows that while it is your data and your control, Microsoft is responsible for delivering and operating a cloud services platform that will provide the data residency an organization needs, as well as ensuring it will keep that data secure, private, and compliant with recognized compliance and regulatory standards. These, however, are not just principles, but contractual guarantees.
In this section, we looked at Microsoft’s trusted cloud principles. The following sections look at how Microsoft delivers on these core tenets.
Trust Center
The Trust Center is a publicly accessible web portal that acts as a single point of focus for an organization that needs resources and in-depth information regarding the Microsoft principles of security, privacy, and compliance. The Trust Center can be accessed from https://www.microsoft.com/trust-center:
Figure 10.3 – Microsoft Trust Center
The Trust Center is a centralized place for any organization that needs information or resources on security, privacy, and compliance regarding Microsoft Online Services, not just Azure. The following section looks at the Microsoft Privacy Statement.
Attackers can take many forms, such as criminal hackers, hacktivists, competitors, and foreign nations. Don’t forget either that attackers are not only external; they can be internal to an organization—for example, ex-employees—these often being the hardest to detect and prevent. For further reading, you should enter Sly Dog gang into your favorite search engine to read about a real-world insider espionage attack on one of the highest-profile manufacturers of electric vehicles.
You must put in measures so that you don’t become an easy target for opportunists as well as the crafted, pre-meditated, military-style operation of some sophisticated attacks; these measures are designed to raise the attacker’s costs significantly, so they divert their resources and activities to an easier attack target that has a higher return on their attack investment.
The approach that should be taken is to adopt a threat priority model; this can then aid in identifying your threat priorities and where security investments should be made to reduce your costs of security operations and increase your attacker’s kill-chain costs. The following diagram aims to visualize this approach:
Figure 7.1 – Threat priority model
Any security approach must start from an inward look at your current security position and secure score. A secure score can be thought of like a credit-rating score you receive to see how likely you are to be accepted for a finance agreement, but in security terms, it looks at where you are on the attack vulnerability scale of 1 to 10, as it were; this score will indicate your security posture.
A security posture is an organization's threat-protection and response capabilities; a strong posture ensures that systems, data, and identities remain recoverable and operational should an attack succeed. It is critical to understand that we cannot prevent or eliminate every threat and attack; an attacker only has to be successful once, while you must protect everything, all the time. A security posture's goal should therefore be to reduce exposure to threats by shrinking attack surface areas and vectors while building resilience to attacks.
A security strategy and security posture should use the guiding principles of Confidentiality, Integrity, and Availability, also referred to as the CIA triangle. There is no perfect threat prevention or security solution; there will always be a trade-off, and the CIA model is a way to think about that. The CIA model is a common industry model used by security professionals; it is not a Microsoft model. Let’s look at these guiding principles in more detail here:
Confidentiality—This is a requirement that sensitive data is kept protected and can only be accessed by those who should have access through the principle of least privilege (POLP). Confidentiality is about the confidence that the data cannot be accessed, read, or interpreted by anybody other than those intended to read and access this data; this can be achieved by encrypting the data. The encryption keys also need to be made confidential and available to those who need access to the data.
Integrity—This means that data transferred is the same as data received; the bytes sent are the same bytes received. Integrity is about the confidence that the data has not been altered from its original form or tampered with; this can be achieved by hashing the data. Malware can threaten the integrity of systems and data.
Availability—This means that data and systems are available to those that need them, including access to encryption keys, but in a secure and governed manner. Availability means a trade-off between the three sides of the triangle and a balance being made of being locked down for security but accessible for operational needs and productivity. A distributed-denial-of-service (DDoS) attack will threaten the availability of systems, data, and encryption keys.
The following diagram represents the CIA triangle model:
Figure 7.2 – Security posture CIA triangle
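The integrity principle described above can be illustrated with a minimal Python sketch that hashes data and compares digests. The sample byte strings and function names are assumptions made for illustration; SHA-256 is used here as one common choice of hash, not as a method prescribed by this chapter.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(received: bytes, expected_digest: str) -> bool:
    """Check that the received bytes hash to the digest recorded by the sender."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_digest(received), expected_digest)


original = b"quarterly-report-v1"          # hypothetical data that was sent
published_digest = sha256_digest(original)  # digest recorded at the source

intact = is_unaltered(b"quarterly-report-v1", published_digest)
tampered = is_unaltered(b"quarterly-report-v2", published_digest)
print(intact, tampered)
```

A plain hash only shows that the data no longer matches the recorded digest; if an attacker could replace both the data and the digest, a keyed hash or a digital signature would be needed instead.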
The aim of an attack may be specific to an organization and may be different based on the form of the attacker—such as a criminal hacker, a foreign nation, a hacktivist, an opportunist, and so on. The aim may be to steal data, deface a website, alter the integrity of an app or a service, extort money through ransom, and so on.
There are two motivations for attackers: money and mission. The motivation is clearer for money-driven attacks, where the attacker calculates their return on investment (ROI) before deciding whether to give up and move on to another target. For mission-driven attacks, however, the rationale may be more opaque and the gain less tangible; a mission-driven attack is often more about ethics, principles, politics, and control than money. Thus, the attacks may be more sustained and the attackers determined to succeed at any cost, because the reward may not have a price that can be attached. The following diagram aims to visualize this approach:
Figure 7.3 – Attack motivations
We have learned about the types of attackers and their motivations; the following are some of the most common threats to protect against:
Ransomware: This is malware that will encrypt files and folders in an attempt to extort money.
Data breach: This is the theft of passwords, bank details, or other sensitive information; common techniques include phishing, spear phishing, Structured Query Language (SQL) injection, and luring somebody to click a link or open a file.
Dictionary attack: This is an identity-theft attack and a form of brute-force attack; lists of known passwords are tried against an account to steal an identity.
Disruptive attack: This is a network and workload attack; a DDoS attack attempts to make a network or workload unavailable by flooding it with requests and attempting to exhaust its resources.
Attackers plan and structure their attacks; this is so they can live undetected on the network and in the user’s systems without the victim being alerted. As the adage says, there are two types of organizations: those who have been compromised and those who don’t know yet.
Attacks follow a sequence or chain of events; this is known as an attack chain or a kill chain. The following diagram shows a common chain:
Figure 7.4 – Attack chain
When a user account is compromised, the attacker can use it to access the network and then work to elevate privileges to an admin account; they can then move laterally within the network to access data and carry out activities such as stealing, deleting, corrupting, and encrypting it.
Through a Zero Trust and DiD approach to protecting assets, the goal is to prevent and disrupt this chain of events; we want to put multiple obstacles in the attacker’s way and increase their attack costs so that they will move on to launching an easier attack elsewhere that offers less resistance.
Security can often be seen as the anti-pattern of operations, availability, and productivity; you may have encountered overzealous security teams referred to as business prevention teams. Much as there have been silos and cultural divides between development and operations teams, there is often a divide between security and these teams.
Often, the feeling is that it’s the security team’s job to make things secure and protect code, data, systems—a not my job attitude, throwing it over the wall in an it’s the security team’s problem now culture.
Security must be in place before a single line of code is written, a system created, or data stored; a culture akin to Development-Operations (DevOps) of fostering trust between all teams and security teams must exist, and leaders must bring the notion and culture of Security-Development-Operations (SecDevOps) into an organization.
The bottom line is that security is not just somebody else’s problem, but everybody’s responsibility; and as they say… if you are not part of the solution, you are part of the problem.
From the search box at the top of the portal, enter Service Health and click on Service Health from the results.
On the Service Health page, there is a left-hand navigation menu. Here, you can view ACTIVE EVENTS, HISTORY, RESOURCE HEALTH, and ALERTS.
From Service issues, under the ACTIVE EVENTS section, you can launch a guided tour and view the current service issues or issues that have been resolved in the past 7 days that may be impacting your resources; the map will show color status dots for each region you have resources in. This can be seen in the following screenshot:
Figure 6.37 – Service Health
In the ACTIVE EVENTS section, you can also click through to see any Planned maintenance events that may impact your resources, as well as Health advisories and Security advisories; for each, there is a click-through link to see all past issues in the Health history section.
In the Health history section, you can click on a service issue to find out more about its impact and any updates, and to download a summary report and a Root Cause Analysis (RCA) report.
In the Resource health section, you can view the health of any individual resource you have created within your subscription(s) so that you know if everything is running as expected. If any problems are impacting its running, you will be told and actions you can take will be provided. For example, if a VPN gateway is not running normally, a hyperlink will be provided so that you can reset it.
In the Health alerts section, you can add a service health alert rule so that a notification is sent based on a set of service health criteria.
With that, we’ve covered the hands-on exercises for this chapter. Now, let’s summarize what we’ve learned.
In this section, you’ll learn how to install the Azure CLI.
The following steps must be carried out on the OS of a machine you have admin access to; this could be physical or virtual. We are using a Windows 10 device for this exercise – Windows 10 Pro 20H2 using PowerShell 7 for reference. Once installed, the CLI can be accessed via PowerShell or the Windows Command Prompt (CMD):
1. From your device, search for Windows PowerShell 7 and click Open.
The following screenshot shows the installation progress:
Figure 6.20 – Installing the Azure CLI via PowerShell
2. Close PowerShell and reopen it. Then, enter the following command to sign in to Azure from the CLI:
az login
3. When the pop-up Azure sign-in page appears, as shown in the following screenshot, sign in:
Figure 6.21 – Azure Sign in page
4. Enter the following command to check the version of the CLI that’s been installed:
az version
The output will look as follows:
Figure 6.22 – CLI version
5. Enter the following command to update to the latest version:
az upgrade
6. The output will look as follows:
Figure 6.23 – CLI upgrade
7. The command's output in the preceding screenshot shows that the latest updates have been installed and that no upgrade is available; this final command concludes this exercise.
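If you want to check the installed version from a script rather than by eye, the JSON that az version prints can be parsed. The following Python sketch assumes the output has already been captured as a string; the version numbers in the sample and the minimum version being compared against are hypothetical values chosen for illustration.

```python
import json

# Illustrative output in the shape printed by "az version"; the version
# numbers here are placeholders, not the values you will see.
sample_output = """
{
  "azure-cli": "2.30.0",
  "azure-cli-core": "2.30.0",
  "azure-cli-telemetry": "1.0.6",
  "extensions": {}
}
"""


def cli_version(az_version_json: str) -> tuple[int, ...]:
    """Extract the azure-cli version as a tuple of integers for comparison."""
    info = json.loads(az_version_json)
    return tuple(int(part) for part in info["azure-cli"].split("."))


installed = cli_version(sample_output)
minimum_required = (2, 20, 0)  # hypothetical minimum for your own scripts
meets_minimum = installed >= minimum_required
print(installed, meets_minimum)
```

Comparing tuples of integers avoids the pitfalls of comparing version strings directly, where "2.9.0" would sort after "2.30.0".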
We looked at installing the Azure CLI in this exercise. In the following exercise, we will create a resource group and a virtual machine using PowerShell from Cloud Shell.
This section introduces the core solutions available in Azure to protect and secure the network and applications running in Azure; this section also covers solutions that, while not part of the exam objectives, have been included with brief coverage as they should be considered required knowledge for a day-to-day Azure role.
NSGs
An NSG is a network security control and should be part of your DiD approach to protecting the network layer from network threats.
An NSG controls access, limits connections to virtual machines (VMs) in an Azure Virtual Network (VNet), and uses a deny-by-default policy; this means that all access is denied unless explicitly allowed. The following diagram shows a simplification of this:
Figure 7.7 – VM access
In the preceding diagram, Subnet 1 has no traffic filtering in place, so you would be able to connect to Windows VM1 using Remote Desktop Protocol (RDP) on port 3389, and so can an attacker; the most common attack is a brute-force attack to connect to a VM using unsecured management ports—that is, port 3389 for a Windows VM and port 22 for a Linux VM.
Windows VM2, however, has an NSG applied at the subnet level and so, by default, will filter traffic and deny access when attempting to connect using RDP on port 3389.
An NSG uses a collection of inbound and outbound rules to filter network traffic, in much the same way a traditional packet filter appliance firewall does; it evaluates five data points (referred to as the 5-tuple method) to determine whether access is allowed or denied.
An NSG will not encrypt inbound or outbound traffic; it is used for filtering traffic to and from a VM by setting the following five data points:
Source of the traffic
Source port used by the traffic
Destination of the traffic
Destination port used by the traffic
Protocol used by the traffic
The preceding data points will determine if a connection can be made; the NSG will provide an action to be taken by a rule (allow or deny) and apply a priority. Each rule is given a number—the lowest-number rules will be processed first.
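The rule processing described above can be sketched as a small Python model: rules are sorted by priority, and the first rule whose five data points match the traffic decides the action. This is a simplification made for illustration; it matches values exactly or by the "*" wildcard and ignores address ranges and service tags. The rule named allow-rdp-from-admin and all IP addresses are hypothetical, while DenyAllInBound with priority 65500 mirrors the default inbound deny rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int          # lower number is processed first
    source: str            # "*" matches anything; otherwise exact match
    source_port: str
    destination: str
    destination_port: str
    protocol: str
    action: str            # "Allow" or "Deny"


@dataclass(frozen=True)
class Packet:
    source: str
    source_port: str
    destination: str
    destination_port: str
    protocol: str


def field_matches(rule_value: str, packet_value: str) -> bool:
    """A rule field matches if it is the wildcard or equals the packet value."""
    return rule_value in ("*", packet_value)


def evaluate(rules: list[Rule], packet: Packet) -> str:
    """Return the action of the first matching rule, lowest priority number first."""
    for rule in sorted(rules, key=lambda r: r.priority):
        pairs = (
            (rule.source, packet.source),
            (rule.source_port, packet.source_port),
            (rule.destination, packet.destination),
            (rule.destination_port, packet.destination_port),
            (rule.protocol, packet.protocol),
        )
        if all(field_matches(rule_value, value) for rule_value, value in pairs):
            return rule.action
    return "Deny"  # nothing matched: fall back to deny


inbound_rules = [
    Rule("DenyAllInBound", 65500, "*", "*", "*", "*", "*", "Deny"),
    Rule("allow-rdp-from-admin", 100, "203.0.113.10", "*", "10.0.1.4", "3389", "TCP", "Allow"),
]

admin_result = evaluate(
    inbound_rules, Packet("203.0.113.10", "50123", "10.0.1.4", "3389", "TCP")
)
attacker_result = evaluate(
    inbound_rules, Packet("198.51.100.7", "50123", "10.0.1.4", "3389", "TCP")
)
print(admin_result, attacker_result)
```

Because the allow rule has the lower priority number, it is evaluated before the catch-all deny rule, so only the admin's address reaches port 3389.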
Any troubleshooting for not being able to access a VM should start by determining the ports and protocols required to establish the communication, then identifying if they are being filtered or if they are blocked.
I use an adage that says: 90% of the time, it’s a ports or protocol (or permissions) issue, and so is the other 10%.
There is a set of default rules for an NSG (which cannot be removed or disabled); these specify which source and destination will be able to access resources to and from the VNets and specify the port and protocol that will be allowed or denied. For a machine to be accessible from the internet on the chosen port, you must ensure that there are rules added to an NSG to allow these ports for communication; by default, all inbound traffic that doesn't meet one of the default rules will be blocked.
An NSG can be associated with a VM network interface controller (NIC) and a subnet (but not a VNet); the same NSG can be associated with multiple subnets and NICs, but each subnet or NIC can have only one NSG associated with it. The following diagram aims to visualize this:
Figure 7.8 – NSG association
An NSG can only be used to control access and filter traffic for resources in the same region and subscription as the NSG; if you wish to control access and filter traffic for resources across multiple regions or VNets, then Azure Firewall is required for this type of centralized security control of highly distributed networks.
This section looked at NSGs, which can be used as part of a DiD strategy. The following section looks at Azure Firewall, another network security control that can be implemented.
Azure Firewall is a cloud-based and Microsoft-managed network security service; it allows centralized (L3-L7) connectivity policies and control of network and application traffic across all VNets, across multiple regions and subscriptions. Being an Azure managed service, it has built-in high availability (HA).
It provides control of traffic through user-defined routing (UDR) and can create segmentation of networks when required for regulatory compliance and when adopting a DiD strategy, as well as implementing a Zero Trust framework. The following diagram provides a typical reference architecture for an Azure firewall to protect resources from attack and control traffic flow:
Figure 7.9 – Azure Firewall
The Azure Firewall service is applied at the VNet level and not the VM network interface or subnet level, as in the case of an NSG. It can filter and control all incoming traffic and connections to resources across all VNets that the Azure Firewall service is securing, as well as outgoing traffic to other VNets, services, internet, third-party service providers (SPs), and on-prem sites.
Azure Firewall provides inbound destination network address translation (DNAT) and outbound source NAT (SNAT); it can have multiple Public Internet Protocol (IP) addresses.
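The difference between DNAT and SNAT can be illustrated with a short Python sketch. It is a conceptual model only, not how Azure Firewall is configured or implemented; the rule table, the function names, and all IP addresses (taken from documentation ranges) are assumptions, and real SNAT typically rewrites the source port as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


FIREWALL_PUBLIC_IP = "203.0.113.50"  # hypothetical firewall public IP

# Hypothetical DNAT rule: public IP and port mapped to a private VM and port.
DNAT_RULES = {
    ("203.0.113.50", 3389): ("10.0.1.4", 3389),
}


def apply_dnat(flow: Flow) -> Flow:
    """Inbound: rewrite the destination to the private address, if a rule exists."""
    target = DNAT_RULES.get((flow.dst_ip, flow.dst_port))
    if target is None:
        return flow
    return Flow(flow.src_ip, flow.src_port, target[0], target[1])


def apply_snat(flow: Flow) -> Flow:
    """Outbound: rewrite the private source address to the firewall public IP."""
    return Flow(FIREWALL_PUBLIC_IP, flow.src_port, flow.dst_ip, flow.dst_port)


inbound = apply_dnat(Flow("198.51.100.7", 50000, "203.0.113.50", 3389))
outbound = apply_snat(Flow("10.0.1.4", 40000, "192.0.2.10", 443))
print(inbound.dst_ip, outbound.src_ip)
```

In both directions the private address of the VM stays hidden from the internet: inbound clients only ever address the firewall, and outbound destinations only ever see the firewall's public IP.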
Azure Firewall also has a premium stock-keeping unit (SKU) that includes next-generation capabilities such as Transport Layer Security (TLS) inspection, Intrusion Detection Systems (IDS), Intrusion Prevention Systems (IPS), Uniform Resource Locator (URL) filtering, and web categories; these are requirements of regulated and highly sensitive environments.
In contrast, an NSG can only be applied at the subnet or VM interface level; the NSG traffic control method can only be associated with a resource in the same region and subscription, so this becomes decentralized and hard to manage and troubleshoot.
It is also important to consider that the Azure Firewall service is not the only method of controlling and securing network and application traffic in Azure; you should also consider network virtual appliances (NVAs) from third-party vendors that are available through the Azure Marketplace, such as Barracuda, Fortinet, Palo Alto, WatchGuard, Cisco, SonicWall, and so on.
NVAs are VMs that you create in Azure that run a vendor's software image of their network appliance to perform a network function such as a firewall, IDS/IPS, virtual private network (VPN), software-defined wide-area network (SD-WAN), and so on.
This section looked at the Azure Firewall service for securing and controlling network traffic. The following section looks at the Azure DDoS protection service.