Restoring Reliable ServiceNow Performance for WFH Users
The cloud-based ServiceNow solution is widely recognized as an essential part of an IT Service Management (ITSM) ecosystem and is used by many of today’s global enterprises.
This Use Case example focuses on how one IT Operations team used NETSCOUT Smart Edge Monitoring and our “Visibility Without Borders” approach to restore reliable ServiceNow performance to a VIP “power user” who had recently moved from headquarters to a work-from-home (WFH) environment.
This enterprise had recently transitioned staff members from corporate facilities back to WFH environments and smaller remote offices to assure continued employee safety during the extended COVID-19 pandemic.
Throughout these remote workforce transitions, the IT Operations team had worked to maintain consistent client edge visibility in order to assure a high-quality user experience on Software as a Service (SaaS) platforms such as ServiceNow, Microsoft Office 365, Cisco Webex, and Salesforce. However, when employees moved from headquarters to WFH and remote office environments during the latest hybrid workforce transition, IT Operations began experiencing intermittent “blind spots” along the client edge.
In a related instance, a Help Desk ticket had recently been escalated to IT Operations for rapid investigation. An enterprise IT “power user” located in the company’s Indianapolis business region had reported degraded ServiceNow performance, which had quickly impacted this employee’s essential ITSM workflow activities.
IT Operations teams rely on ServiceNow as an integral part of their ITSM ecosystem for everyday technology management and customer support of incidents, problems, and infrastructure changes, including Help Desk functionality. Delays, degradations, unavailability, and log-in failures with ServiceNow impact IT professionals’ ability to do their jobs. This reduces IT productivity, degrades the end-user experience with ServiceNow, and can delay ITSM-related projects.
What the organization needed was visibility to assure availability, responsive performance, and quality user experience with the ITSM platform.
IT Operations had implemented the NETSCOUT® Smart Edge Monitoring solution, which had expanded visibility into the enterprise client edges of the network supporting both WFH and remote office users dependent upon reliable ServiceNow access and performance.
To begin ServiceNow troubleshooting, IT Operations had access to Smart Edge Monitoring workflows that leveraged both NETSCOUT nGeniusONE® analytics and nGenius®PULSE nPoint Business Transaction Testing metrics to assess the quality of user experience along the client edge. Given the geographic location reported in the Help Desk ticket, IT Operations started with the nGeniusPULSE Sites Overview Dashboard. It showed the scope and impact of the ServiceNow performance degradations, which in this case involved numerous users in the Indianapolis region.
Figure 1: This nGeniusPULSE Business Services Overview Dashboard showed ServiceNow performance degradation impacting Indianapolis-area employees.
By contextually drilling into the Indianapolis “hot spot” in the Sites Overview Dashboard, IT Operations transitioned the workflow into the nGeniusPULSE Business Services Overview Dashboard exhibited in Figure 1, which presented a finer-grained view of ServiceNow performance across all of the company’s WFH, remote office, and corporate locations.
At this point, IT Operations derived valuable nGeniusPULSE analysis from nPoint synthetic testing that corresponded to the ServiceNow degradation details captured on the initial Help Desk ticket. Specifically, ServiceNow Web Test results showed that all four Indianapolis employees were based in WFH environments rather than remote offices, which excluded corporate locations from subsequent troubleshooting. The Web Test results also showed that only one of the four Indianapolis WFH users was experiencing degraded service, and the nGeniusPULSE dashboard displayed the name of the same IT employee who had opened the original ServiceNow Help Desk ticket.
Figure 2: The ServiceNow Web Test Log showed very high DNS response times, which prompted IT Operations to transition troubleshooting into nGeniusONE for deep-dive, packet-based analysis.
Figure 3: By hovering over DNS Session Analysis, IT Operations saw two DNS servers involved in a single transaction.
Using the nGenius nPoint sensor deployed on the employee’s laptop, IT Operations executed nGeniusPULSE synthetic Web Tests that assessed the WFH user experience on ServiceNow. The results showed a delay of more than 12 seconds. Given the ITSM project activities involving this employee, that performance was unacceptable and adversely impacted productivity.
By next viewing the Web Test log itself, IT Operations was presented with nGeniusPULSE end-user experience analytics that quickly showed high DNS response times as the root cause of the ServiceNow access delay. As exhibited in Figure 2, those same results provided evidence that excluded other service dependencies along the WFH client edge environment, including network, application, SSL, and server.
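To illustrate how a per-phase breakdown of a single web test can point to DNS, the following Python sketch picks out the phase that dominates total time. The phase names, the `dominant_phase` helper, and the sample numbers are all hypothetical, chosen for illustration; they are not nGeniusPULSE output or its data model.

```python
from typing import Dict, Tuple


def dominant_phase(timings: Dict[str, float]) -> Tuple[str, float]:
    """Return the slowest phase and its share of the total test time."""
    total = sum(timings.values())
    if total <= 0:
        raise ValueError("timings must sum to a positive value")
    name = max(timings, key=timings.get)
    return name, timings[name] / total


# Illustrative timings in seconds for one synthetic web test (assumed values).
sample = {
    "dns": 9.2,
    "tcp_connect": 0.05,
    "tls": 0.12,
    "server": 0.4,
    "transfer": 0.3,
}

phase, share = dominant_phase(sample)
print(f"{phase} accounts for {share:.0%} of the test time")
```

When one phase accounts for most of the elapsed time, as DNS does in this made-up sample, the other dependencies can reasonably be set aside for the next troubleshooting step.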
For this IT team, an essential element of the next-stage troubleshooting workflow involved direct access to the Smart Edge Monitoring solution’s integrated packet capture and analysis capabilities from both nPoint sensors and InfiniStreamNG® (ISNG) appliances with NETSCOUT Cloud Adaptors deployed across the company’s data center and remote office locations. By accessing NETSCOUT smart data generated from this network packet traffic by patented Adaptive Service Intelligence® (ASI) technology, IT then had a true end-to-end view into this user’s ServiceNow experience on a transactional basis.
In a critical stage of root cause analysis, the nGeniusONE Universal Monitor view gave IT Operations visibility into the collective nPoint transactions involved in a single ServiceNow Web Test, which enabled a Session Analysis view for deep-dive forensic analytics. Very quickly, IT Operations’ use of nGeniusONE Session Analysis showed that an initial DNS query had failed to get a response and timed out, which resulted in another DNS query being sent more than 4 seconds later. As exhibited in Figure 3, two DNS servers were involved in this transaction.
IT Operations’ analysis of the nGeniusONE Session Trace also showed an HTTP redirect, which triggered another failed DNS query, timeout, and second query. By the time the successful HTTP connection was made for the ServiceNow Web Test transaction, more than 9 seconds had elapsed due to DNS issues.
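The retry pattern described above can be sketched as a simple model: a resolver tries its configured DNS servers in order, and each unresponsive server costs a full timeout before the next one is queried. The `lookup_delay` function is a hypothetical helper, the 4-second timeout is taken as a lower bound from the trace description, and the 50 ms answer time is an assumed value.

```python
from typing import List


def lookup_delay(servers_ok: List[bool], timeout_s: float, answer_s: float) -> float:
    """Delay for one name lookup when DNS servers are tried in order.

    servers_ok[i] is True if server i answers; each server that does not
    answer adds one full timeout before the next server is tried.
    """
    delay = 0.0
    for ok in servers_ok:
        if ok:
            return delay + answer_s
        delay += timeout_s
    raise TimeoutError("no DNS server answered")


# Assumed values: first server never answers, second server answers in 50 ms.
per_lookup = lookup_delay([False, True], timeout_s=4.0, answer_s=0.05)

# Two lookups in the transaction: the original hostname and the redirect target.
total_dns_delay = 2 * per_lookup
print(f"per lookup: {per_lookup:.2f} s, total DNS delay: {total_dns_delay:.2f} s")
```

Under these assumptions, DNS alone contributes roughly 8 seconds, which is in the same range as the more-than-9-seconds figure observed in the trace. With the unresponsive server removed from the list, the same model gives a delay of only the answer time per lookup.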
IT Operations’ final-stage troubleshooting revealed the root cause: the employee’s PC was misconfigured with a bad DNS server, which caused most Web transactions to be delayed.
As a result of NETSCOUT Smart Edge Monitoring visibility and analytics, IT Operations had a straightforward remediation: configuring the IT user’s laptop with the correct primary DNS server.
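Before and after such a change, it can be useful to confirm that each configured DNS server answers promptly. The sketch below is one way to do that with only the Python standard library; it is not part of the NETSCOUT tooling. It sends a single A-record query over UDP and times the reply. The function names and the server addresses in the usage comment are hypothetical placeholders, and the probe assumes IPv4 servers listening on port 53.

```python
import secrets
import socket
import struct
import time
from typing import Optional


def build_query(hostname: str, query_id: int) -> bytes:
    """Build a minimal DNS query packet for an A record."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A), QCLASS = 1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def probe_dns(server: str, hostname: str, timeout: float = 2.0) -> Optional[float]:
    """Return the reply time in seconds, or None if no valid reply arrives."""
    query_id = secrets.randbelow(65536)
    packet = build_query(hostname, query_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        try:
            sock.sendto(packet, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes socket timeouts
            return None
        elapsed = time.perf_counter() - start
    # Accept the reply only if its ID matches the query we sent.
    if len(reply) < 2 or struct.unpack("!H", reply[:2])[0] != query_id:
        return None
    return elapsed


# Example usage (placeholder addresses; substitute the servers configured on the PC):
# for server in ["192.0.2.53", "192.0.2.54"]:
#     print(server, probe_dns(server, "example.com"))
```

A server that returns `None` here would be a candidate for the kind of timeout seen in the session trace, although a single UDP probe can also fail for unrelated reasons such as packet loss, so repeating the check is advisable.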
As IT Operations teams work to visualize and manage high-quality experience along the client edge, it is frequently unclear whether a service degradation in a WFH environment is limited to a particular user community or widespread. The ability to quickly focus on the affected users, and then evaluate the transaction path to identify the actual source of a degradation, is critical in reducing mean-time-to-knowledge (MTTK) and mean-time-to-repair (MTTR).
This Use Case shows the value of NETSCOUT’s “Visibility Without Borders” approach and the user-level, environment-specific precision of our Smart Edge Monitoring analytics.