
GSLB for NetScaler Gateway across Azure Locations

In this post I’ll be going through how I have configured GSLB for NetScaler Gateway in Azure, and the various elements required for this type of configuration.

Firstly – I began by setting up the background infrastructure for this test. Namely, 2x Active Directory DCs within two Azure Locations (East US (EUS) and South Central US (SCUS)). These were on a Virtual Network set up for each location, joined with a Site to Site VPN utilizing Virtual Network Gateways. Next, I created a simple Citrix Environment that spanned the sites, ensuring I had resources in both sites to properly demonstrate the failover. An overview of the background infrastructure is shown below:

Next, I spun up two NetScaler BYOL versions:

These will form the bulk of the work where configuration will take place – in providing NetScaler Gateway via GSLB. I’m using Platinum Licensing, but Enterprise would also be fine (so that GSLB is available).

My NetScalers will each have multiple IP addresses assigned, for the SNIP, GSLB Site, and Gateway VIP:

Well worth a read at this point is CTP Gareth Carson’s awesome blog post around NetScaler deployment in Azure.

The next step for me was to set up 2x NetScaler Gateway vServers, which will be used for external access, and will be the vServers that GSLB provides:

East US NetScaler:

South Central US NetScaler:

Next – I set up Authoritative DNS Listeners on both NetScalers, making use of the Subnet IP for this service (East US NetScaler shown below):

So – next we can set up the GSLB Sites, to enable the synchronisation of GSLB information via Metric Exchange Protocol. It’s just a case of adding one local and one remote site on each NetScaler, and using the GSLB Site IPs. Once this is completed, the NetScalers will show as follows:

East US:

South Central US:

Once these are set up and showing as Active, we have communication between the NetScalers over Metric Exchange Protocol (MEP). Next we need to configure GSLB, but first I am going to create the Public IP addresses for each site, as this makes the GSLB setup easier! We will need two IP addresses (one for each site), and both TCP 443 (Gateway) and UDP 53 (ADNS) will need to be NAT’d through for this configuration to work. See the diagram below for an overview of this:

This is easy to configure – firstly we set up two Load Balancers, each with a Public IP (Static) assigned:

Ensure that Static is selected for the Public IP – otherwise this may change and then the solution will stop working:

Once this is created for both sites – we will have two Load Balancers ready to use for our NAT requirements:

At this point, I like to update the Network Security Groups on the NetScalers to ensure that the required inbound rules are in place. For both NetScalers we need to allow HTTPS and DNS inbound from anywhere (as these will be internet facing):

Once complete, we can then NAT through the required ports (TCP 443 and UDP 53) by creating inbound NAT rules. Remember, you’ll need to do this twice on each Load Balancer, so that both ports are forwarded to the NetScaler at that Load Balancer’s site.
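As a quick checklist, the NAT requirement above can be written out as data. This is only a minimal sketch of what needs to exist, not Azure tooling: the `NatRule` type and the `required_nat_rules` helper are hypothetical names of my own, and only the ports and site labels come from this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NatRule:
    site: str
    protocol: str
    port: int
    purpose: str


# Ports taken from the post: HTTPS for the Gateway vServer, DNS for the ADNS listener.
REQUIRED_PORTS = [("TCP", 443, "NetScaler Gateway"), ("UDP", 53, "ADNS")]


def required_nat_rules(sites):
    """Return one inbound NAT rule per required port for every site."""
    return [
        NatRule(site, protocol, port, purpose)
        for site in sites
        for protocol, port, purpose in REQUIRED_PORTS
    ]


rules = required_nat_rules(["EUS", "SCUS"])
for rule in rules:
    print(f"{rule.site}: forward {rule.protocol} {rule.port} to the NetScaler ({rule.purpose})")
```

Two sites with two ports each gives four rules in total, which is a handy number to check against the Load Balancer blades before moving on.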

Once these are all in place – we can test the NetScaler Gateway is up and accessible by visiting https:// and then the Load Balancer IP for each site. Before we configure GSLB on the NetScaler – we need to delegate the DNS Zone. This will vary depending on how your External DNS Servers are set up. Essentially – we need lookups for the Gateway URL to be handled by the NetScaler appliances. This means any lookups for the Gateway hostname need to be delegated to the NetScaler appliances, so that they can respond with the IP address of the Active GSLB Site.

I’m using Azure DNS – so I have a zone set up. My URL is desktop.jwnetworks.co.uk – so I will be delegating control of this to the NetScalers. To start – create two A Records, one for each NetScaler, each pointed at the NAT’d IP for that site. These will be used for the DNS Lookups. We then need to create a new NS Record for our Gateway Domain Name, with a 5 Second TTL, pointing at the A Records for the NetScalers we configured above:
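To make the delegation easier to picture, here is a rough sketch that builds the record set described above. The name server labels (`ns-eus`, `ns-scus`), the IP addresses (taken from the documentation-only ranges) and the one hour TTL on the A records are all assumptions for illustration; only the Gateway name and the 5 second NS TTL come from my setup.

```python
def delegation_records(gateway_fqdn, zone, nameservers, ns_ttl=5, a_ttl=3600):
    """Build the A and NS records that delegate gateway_fqdn to the NetScalers.

    nameservers maps a host label to the NAT'd public IP for that site.
    Each record is a (name, type, ttl, value) tuple.
    """
    records = []
    # One A record per NetScaler, pointing at the Load Balancer public IP.
    for label, public_ip in nameservers.items():
        records.append((f"{label}.{zone}.", "A", a_ttl, public_ip))
    # One NS record per NetScaler, delegating the Gateway name to it.
    for label in nameservers:
        records.append((f"{gateway_fqdn}.", "NS", ns_ttl, f"{label}.{zone}."))
    return records


records = delegation_records(
    "desktop.jwnetworks.co.uk",
    "jwnetworks.co.uk",
    {"ns-eus": "203.0.113.10", "ns-scus": "198.51.100.10"},  # placeholder IPs
)
for name, rtype, ttl, value in records:
    print(f"{name:<32}{ttl:>6} IN {rtype:<3} {value}")
```

The short TTL on the NS records matters because it limits how long resolvers cache the delegation.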

Now that this is in place – it’s time to build our GSLB configuration! This can be done from one of the NetScalers and then propagated to the other via Configuration Sync. I’ll therefore carry this out on the EUS NetScaler – as you can see below, we have only the sites set up for GSLB (as we did this earlier):

We start by clicking the “Configure GSLB” button, and then run through the Wizard – I am going to run with an Active/Passive site:

We then click OK, and we are presented with the GSLB Sites pane – but we have already configured this:

We can click continue, and we then need to set up the GSLB Services. These are the two NetScaler Gateways we are using to provide this service. The key here is that we need to make sure that the Public IP addresses are listed – EUS is shown below:

We then click Create, and then repeat this process for the SCUS Site – making sure that the Public IP address is entered again. Once this is done, we will have a Local and Remote Site Configured:

Next you will be prompted to create a GSLB Backup vServer – but I’m not going to create that as part of this Proof of Concept.

Next – we create the GSLB vServer. This is just the entry point for traffic being Load Balanced by GSLB. Note – ensure that you pick an appropriate load balancing method, and then click continue after filling out your details:

Next – click on Save, and the configuration is done! We can now sync the config across to the other NetScaler. This is done by clicking on “Auto Synchronisation GSLB” from the GSLB Dashboard:

Once this completes successfully – we can test our configuration! To start – we can do an nslookup, and set type=ns. This will tell us that the NameServers are correctly configured:

As you can see – the nslookup is returning all the expected information. Because we configured Active/Passive – the A record returned for a normal (A record) nslookup is that of the EUS NetScaler. Next – we can test that the Gateway Page is working and accessible:

Bingo – all good so far! Next – let’s try shutting down the EUS NetScaler and see if things are still working as expected. At this point the IP address returned should change from that of the EUS Load Balancer IP, to the SCUS Load Balancer IP:

Before EUS NetScaler Shutdown:

After EUS NetScaler Shutdown:

As you can see this works as expected, and after a page refresh, the NetScaler Gateway page is shown again:

This means that during a failure condition affecting the EUS NetScaler, requests for the Gateway URL will be directed (via DNS) to the SCUS Site. This provides Data Centre level failover for Gateway Services, making use of native Azure Load Balancers, and a single NetScaler on each site. This solution is suitable for pretty much any service accessed via a Web Browser – GSLB can be used in this way to fail NetScaler Gateway services over between Azure Sites or to distribute other traffic types as required.
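The Active/Passive behaviour can be summarised in a few lines of Python. This is a toy model of the decision only, not how NetScaler implements it: the function name, the site dictionaries and the placeholder IPs are assumptions, and real health comes from the monitors and MEP rather than a simple flag.

```python
def resolve_active_passive(primary, backup, health):
    """Return the public IP handed out for the Gateway name, or None if no site is up.

    primary and backup are dicts with "name" and "public_ip" keys;
    health maps a site name to True (up) or False (down).
    """
    # Prefer the primary site, and only fall back when it is marked down.
    for site in (primary, backup):
        if health.get(site["name"], False):
            return site["public_ip"]
    return None


eus = {"name": "EUS", "public_ip": "203.0.113.10"}    # placeholder IP
scus = {"name": "SCUS", "public_ip": "198.51.100.10"}  # placeholder IP

before = resolve_active_passive(eus, scus, {"EUS": True, "SCUS": True})
after = resolve_active_passive(eus, scus, {"EUS": False, "SCUS": True})
print("Before EUS shutdown:", before)
print("After EUS shutdown: ", after)
```

This mirrors the two nslookup results above: the EUS address while EUS is healthy, and the SCUS address once it is shut down.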


XenDesktop Site Failover – asking the community…

Recently I’ve been doing a lot of work on large deployments that require active/active or active/passive setups, whereby options to fail over to a DR site are either required as part of the design, or presented as a future enhancement to the customer. Most of these have been fairly open questions – “How can we achieve this?” for example. It’s a question that is almost completely subjective; it depends entirely on business needs, and what the available budget is.

Subjective elements aside, it is a much debated technical area, so I opened up a question on the MyCUGC forums to ask the community how they were going about this. I also tweeted the question out @jakewalsh90:

I based my question around the concept that is most common (certainly to me at least) – an active/active or active/passive design, with a primary site and a secondary (DR/Backup) site. This is without a doubt the most common environment type that I encounter, predominantly in small and medium enterprises up to around 5000 users.

The main purpose of this post is to summarize the elements (both technical and strategic) that could be considered, and the different options we can lean on to help achieve the desired results. I also want to highlight just how good the response from the Citrix Community was on this question!

Key Considerations

By far the most common point that came out of the discussion around this was – “it depends”. There are a great number of factors to consider for any solution like this, including:

  • Budget – what is affordable and achievable with our budget?
  • Connectivity – are we limited by latency/bandwidth/other traffic etc? Are we using Dark Fiber, MPLS, VPN etc?
  • DC Locations – if we are planning for a Secondary/DR site, is it likely this would ever be affected by an issue that took down our primary site? (Hurricanes, Floods, Earthquakes etc.)
  • Capacity – is this a full DR/Secondary solution or just a subset of applications and users?
  • Hardware – do we have the hardware to achieve this? Is it within our budget?
  • Software – can we do this within our current licensing or do we need an uplift?
  • Applications – are we replicating everything or just key applications? How will these applications perform in another DC? (Applications may have web/database dependencies based only in a single site).
  • User Data – are we replicating user data too? How are profiles going to be handled?
  • Failover method – are we utilizing a Citrix solution for this, or perhaps a product like VMware Site Recovery Manager? How is failover undertaken – automatic? manual?

Citrix Considerations

Aside from the many other factors affecting a question like this, our discussion focused on the Citrix technical options available for DR/Failover. I’ve highlighted the key points we discussed, and gathered a number of resources that I think are helpful in exploring these further:

 

GSLB via NetScaler for StoreFront (Access Layer) – this was a common theme throughout the discussions, and there seems to be a general consensus that utilising GSLB on NetScaler is a logical way forward. Creating an access layer that utilizes NetScaler GSLB and StoreFront, whilst spanning the DCs, will give a solution that is resilient and reliable, and won’t require complex replication/management. Dave Brett has written an excellent article on setting this up.

 

XenDesktop Site with Zones – Zones in XenDesktop are an awesome way to split geographically (or logically) separate resources, whilst maintaining the ease of management and reduced overhead of only having a single farm. Utilizing Zoning to form an active/active or active/passive solution is simple in configuration terms too. With Zones, users can be automatically redirected to a secondary zone VDA during the failure of their primary zone VDA.

 

Local Host Cache – as I am sure you are aware, Local Host Cache is now back in XenDesktop, and provides additional tolerance for database outages. LHC allows connection brokering operations to take place when the following issues occur:

  • The connection between a Delivery Controller and the Site database fails in an on-premises Citrix environment.
  • The WAN link between the Site and the Citrix control plane fails in a Citrix Cloud environment.

See https://docs.citrix.com/en-us/xenapp-and-xendesktop/7-12/manage-deployment/local-host-cache.html for further details on LHC.

You can check to see if LHC is on by running the following PowerShell: Get-BrokerSite. I’m running 7.15 in my lab so it is enabled by default:

 

SQL Options – SQL is a key component of the FlexCast Management Architecture (FMA), so any solution (with or without DR/Failover) needs a reliable platform for hosting Site Databases. Usually my go-to solution is to mirror any databases using SQL Mirroring. AlwaysOn Failover Cluster Instances and AlwaysOn Availability Groups are both possible alternatives, particularly given that Database Mirroring is being deprecated.

When DR is considered, this brings additional requirements for suitable hardware and SQL Server licensing at the second site.

See pages 101-102 of the Updated Citrix VDI Handbook for further information on SQL redundancy and replication options: http://docs.citrix.com/content/dam/docs/en-us/xenapp-xendesktop/7-15-ltsr/downloads/Citrix%20VDI%20Handbook%207.15%20LTSR.pdf

 

Using StoreFront to handle the Failover (Site/Delivery Controller Level) – From StoreFront 3.6 it has been possible to Load Balance Resources across controllers, allowing StoreFront to effectively handle failover between XenDesktop Farms. (See https://www.citrix.com/blogs/2016/09/07/storefront-multi-site-settings-part-2/ for more details on this)

This method allows us to have two XenDesktop Farms – and to publish identical resources which are then load balanced by the StoreFront server. Failover would only occur in the event that a Delivery Controller was unavailable in the primary site. This solution would still allow for a GSLB approach with StoreFront and NetScaler too.

The main disadvantage of this approach is the increased management overhead of the additional XenDesktop Farm, but this can be managed by having good practices in place.

This is configured in the Delivery Controller section of a StoreFront site – and requires both farms to publish the resources required for failover. See below – two Farms configured in the Delivery Controller section within a StoreFront site:

We also need to configure the “User Mapping and Multi-Site Aggregation Configuration”. Note that below I have configured all Delivery Controllers for “Everyone” – but this may need to be adjusted in a production environment:

You will also need to configure resource aggregation as below. For failover, do not tick “Load Balance resources across controllers”. However, “Controllers publish identical resources” will need to be ticked so that identically named published applications or desktops are de-duplicated:

With this set, any resources published in both farms will be launched from the Secondary Site in the event that the Delivery Controllers in the first site fail to respond.
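The effect of these settings can be sketched as follows. This is a simplified model of the behaviour described above, not StoreFront’s actual logic: the `launch_targets` function, the farm names and the resource names are all hypothetical.

```python
def launch_targets(farms):
    """Map each published resource to the farm it would launch from.

    farms is an ordered list of (farm_name, responding, resources) tuples,
    with the primary farm first. Identically named resources are
    de-duplicated, and the first responding farm in the list wins.
    """
    targets = {}
    for farm_name, responding, resources in farms:
        if not responding:
            continue  # skip farms whose Delivery Controllers fail to respond
        for resource in resources:
            targets.setdefault(resource, farm_name)
    return targets


published = ["Desktop", "Notepad"]
normal = launch_targets([("Primary", True, published), ("Secondary", True, published)])
failed = launch_targets([("Primary", False, published), ("Secondary", True, published)])
print("Normal operation:", normal)
print("Primary down:    ", failed)
```

Users see each resource once in both cases; only the farm behind it changes when the primary Delivery Controllers stop responding.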

 

Application Level Failover using Application Group Priorities – it is also possible to use application groups with priorities to control the failover of applications. When you configure an application group in XenDesktop 7.9+ you are able to configure this:

Gareth Carson has a great blog post on this which explains the functionality in more detail.

In Conclusion…

Hopefully this post has been helpful in highlighting some of the considerations for a DR/Second Site scenario, and in pointing out some of the Citrix technologies and great community resources out there to help make the process a little easier. It’s been useful for me to ask the question and compile a post like this because I’ve had to look into the various technologies and find out more about them in my own lab before writing this… until next time, cheers!


Citrix Workspace Environment Management – Process Management

After testing out the excellent CPU and Memory management features in Citrix Workspace Environment Management (WEM), I wanted to blog about how processes can be controlled using the software.

Before starting this test, I had a basic Citrix XenDesktop environment configured, a WEM environment configured, and the relevant group policies in place to support this.

To prevent processes from running, we browse to System Optimization, and then Process Management:

From here we can enable process management:

Next – we have two options, we can whitelist, or blacklist. If we whitelist – only those executables listed will be allowed to run, whereas a blacklist will block only those listed.

I’m going to test out a blacklist:

We can exclude local administrators, and also choose to exclude specified groups – for example perhaps a trusted subset of users or specific groups of users who need to run some of the applications we wish to block.

For this test I am going to add notepad.exe to the list:

Next I saved the WEM configuration, refreshed the cache, and then logged into a Desktop Session to test the blacklist. Upon firing up notepad I am greeted with the message:

Bingo – a simple and effective way to block processes from running. This would be very effective when combined with a list of known malicious executables for example, or known problematic software items.
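The whitelist/blacklist logic and the exclusions can be modelled in a few lines. This is my own sketch of the rules as described in this post, not WEM’s implementation: the function, its parameters and the group names are all hypothetical.

```python
def is_process_allowed(exe, mode, listed, user_groups=(), excluded_groups=(),
                       is_local_admin=False, exclude_local_admins=True):
    """Decide whether a process may run under a whitelist or blacklist.

    mode is "blacklist" (block only listed executables) or
    "whitelist" (allow only listed executables).
    """
    # Excluded users bypass process management entirely.
    if exclude_local_admins and is_local_admin:
        return True
    if set(user_groups) & set(excluded_groups):
        return True

    in_list = exe.lower() in {name.lower() for name in listed}
    if mode == "blacklist":
        return not in_list
    if mode == "whitelist":
        return in_list
    raise ValueError(f"unknown mode: {mode}")


blacklist = ["notepad.exe"]
print(is_process_allowed("notepad.exe", "blacklist", blacklist))
print(is_process_allowed("notepad.exe", "blacklist", blacklist,
                         user_groups=["Trusted Users"],
                         excluded_groups=["Trusted Users"]))
```

The first call models my test user being blocked from Notepad, and the second models a member of an excluded group who is allowed to run it.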

In a future release I’d love to see more granularity in this feature – for example blacklists, with the ability to whitelist processes for certain groups, rather than as a whole. This would enable control of applications on a much more granular level – for example, blocking “process.exe” for Domain Users, but allowing it for a trusted group of users.