The Traces Page

The Traces page lets you “slice and dice” your trace data to view traces for specific services, endpoints, tags, and durations. The Outlier Analyzer helps you quickly filter your traces down to those most likely to be contributing to a problem.

Viewing traces

To see all the most recent traces, hover over µAPM on the navigation bar and select Traces. (On smaller screens, you may need to scroll down to see all the information below the service map.)

../../_images/traces-page.png

You can also get to the Traces page from other locations. In each case, the page opens with a context based on where you started.

  • From the Services page, click a service icon in the map or expand a service in the table, then select View traces from the dropdown menu in the service map. The Traces page will be filtered by the specified service.
../../_images/service-in-table.png
  • From the Actions menu on a chart, click View traces from this time window. The Traces page will have the same time range as the chart.
../../_images/from-chart-actions-menu.png
  • When viewing an alert message, click View more details if necessary, then click View traces from this time window. The Traces page will have a time range that encompasses the time when the alert was triggered.
../../_images/from-alert-modal.png

Which traces are retained by µAPM

We retain metrics for every trace and span sent to SignalFx. However, the Smart Gateway determines which traces are shown in the Traces page. It does so using our NoSample™ Tail-Based Distributed Tracing mechanism, which prefers interesting or anomalous traces over “normal” traces. Three primary factors influence the likelihood of a trace being kept:

  • its duration (how slow it is compared to traces for the same execution path)
  • whether it contains errors
  • the frequency of execution

The amount of time allowed for a trace to complete (at which point the decision whether to retain the trace is made) is based on the recent history of durations of similar traces.

Filtering the Traces page

Because of how traces are generally used when troubleshooting a problem, filtering the Traces page works differently than it does in charts or dashboards. Specifying multiple options for Service or for Endpoint/Operation ANDs the terms together. For example, specifying the cart service narrows down the view to traces that contain the cart service, as expected.

../../_images/and-filter-01.png

Modifying the filter to specify both cart and checkout narrows down the view to traces that contain both the cart and checkout services. As you can see, fewer traces match these criteria.

../../_images/and-filter-02.png

The Endpoint/Operation filter works the same way.

However, specifying multiple tags in the Filter field ORs the terms together, similar to how filters operate in charts and dashboards. For example, the options shown below will include traces where the value of the amount tag is 13.95 or 14.99.

../../_images/or-filter.png
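
The difference between the two filter behaviors can be illustrated with a minimal Python sketch. This is not how µAPM implements filtering. The Span and Trace classes, the filter_traces function, and the choice to AND the service filter with the tag filter are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical, simplified span: one service, one operation, some tags.
    service: str
    operation: str
    tags: dict = field(default_factory=dict)


@dataclass
class Trace:
    trace_id: str
    spans: list


def matches(trace, services=(), tag_filters=()):
    """Return True if the trace passes both filters.

    services    -- every listed service must appear in the trace (AND).
    tag_filters -- (key, value) pairs; any single match is enough (OR).
    Combining the two filters with AND is an assumption of this sketch.
    """
    trace_services = {span.service for span in trace.spans}
    if not all(service in trace_services for service in services):
        return False
    if not tag_filters:
        return True
    return any(
        span.tags.get(key) == value
        for span in trace.spans
        for key, value in tag_filters
    )


def filter_traces(traces, services=(), tag_filters=()):
    """Keep only the traces that match the given filters."""
    return [t for t in traces if matches(t, services, tag_filters)]
```

With this sketch, filter_traces(traces, services=["cart", "checkout"]) keeps only traces that contain both services, while filter_traces(traces, tag_filters=[("amount", "13.95"), ("amount", "14.99")]) keeps traces in which either value occurs.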

Using the Latency Distribution chart

If you click on a bucket or click and drag across an area in the Latency Distribution chart, the page is filtered to display only traces with durations in the same time range as the selection. In this case, we clicked in the area shown below because we wanted to see more details about some of the traces that were above the P99 value.

../../_images/latency>99.png

The traces shown at right are narrowed down to include only those with the specified duration. We can now click one of the traces at right to see it in the View a trace page.

../../_images/latency-filter.png
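
Conceptually, selecting an area in the Latency Distribution chart is a range filter on trace duration. The sketch below shows the idea in Python under stated assumptions: the function names are hypothetical, durations are given in milliseconds, and the nearest-rank percentile used here may differ from how µAPM computes its P99 value.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def filter_by_duration(durations_ms, low_ms, high_ms):
    """Keep traces whose duration falls inside the selected range.

    durations_ms maps a trace ID to that trace's duration in milliseconds.
    """
    return {
        trace_id: duration
        for trace_id, duration in durations_ms.items()
        if low_ms <= duration <= high_ms
    }


def above_percentile(durations_ms, pct=99):
    """Keep traces whose duration is strictly above the given percentile."""
    threshold = percentile(list(durations_ms.values()), pct)
    return {
        trace_id: duration
        for trace_id, duration in durations_ms.items()
        if duration > threshold
    }
```

Calling above_percentile(durations_ms) approximates the selection made in the example above, where only traces slower than the P99 value remain in the list.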

Using the Errors/Requests chart

If you click on a bucket or click and drag across an area in the Errors/Requests chart, the time range becomes shorter and you can see more information on individual traces. In this case, we clicked in the area shown below because we wanted to see more details about the error shown.

../../_images/drill-down-1.png

The chart’s time range is narrowed down from the original 15 minutes to several seconds. Depending on the starting time range, you may have to click in the chart more than once to narrow the view down to a single trace.

../../_images/drill-down-2.png

Clicking on the trace error filters the page to that individual trace. We can now click on the error trace at right to see it in the View a trace page.

../../_images/drill-down-3.png
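
The repeated narrowing described in this section can be pictured as replacing the current time range with the range of the selected bucket. The Python sketch below is an illustration only. The function names and the bucket count of 30 are assumptions, and the actual chart may divide its time range differently.

```python
def bucket_bounds(start_s, end_s, bucket_count):
    """Split [start_s, end_s) into equal-width buckets; times in seconds."""
    width = (end_s - start_s) / bucket_count
    return [
        (start_s + i * width, start_s + (i + 1) * width)
        for i in range(bucket_count)
    ]


def drill_down(start_s, end_s, bucket_index, bucket_count=30):
    """Return the narrower time range covered by the clicked bucket.

    bucket_count=30 is a hypothetical value chosen for this sketch.
    """
    return bucket_bounds(start_s, end_s, bucket_count)[bucket_index]


# Start with a 15-minute window (900 seconds) and click twice.
first = drill_down(0, 900, bucket_index=2)      # a 30-second window
second = drill_down(*first, bucket_index=0)     # a 1-second window
```

Each click shrinks the window by a factor of bucket_count, which is why a wide starting range can require more than one click to isolate a single trace.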

Analyzing outliers


Available in SignalFx Enterprise Edition. Outlier Analyzer is currently in beta.


The Outlier Analyzer makes it easy to quickly filter your traces down to those that are most likely contributing to any problems. When you click Analyze Outliers, two additional sections are displayed above the list of traces on the right: Top Operations and Top Tags.

  • Top Operations aggregates the durations of operations across the longest traces in the data set (those with durations above the 90th percentile). The list is sorted by % of duration.
  • Top Tags displays the tags that are more common in the longest traces (those with durations above the 90th percentile) than in the trace set in general. The tags at the top of the list show the largest difference. Hovering over a tag shows where traces with that tag appear in the Latency Distribution chart.
../../_images/analyzer-hover.png
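
To make the two lists more concrete, here is a rough Python sketch of the kind of aggregation they describe. It is not the Outlier Analyzer's actual algorithm. The trace layout (a dict with duration_ms, operations, and tags), the nearest-rank percentile, and the tag score (share of long traces with the tag minus share of all traces with the tag) are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def long_traces(traces, pct=90):
    """Return traces whose duration is above the given percentile."""
    if not traces:
        return []
    ordered = sorted(t["duration_ms"] for t in traces)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    threshold = ordered[rank - 1]
    return [t for t in traces if t["duration_ms"] > threshold]


def top_operations(traces, pct=90):
    """Sum operation durations over the long traces, sorted by % of duration."""
    totals = defaultdict(float)
    for trace in long_traces(traces, pct):
        for operation, duration_ms in trace["operations"].items():
            totals[operation] += duration_ms
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    shares = [(op, 100 * ms / grand_total) for op, ms in totals.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


def top_tags(traces, pct=90):
    """Rank tags by how much more common they are in the long traces."""
    slow = long_traces(traces, pct)
    if not slow:
        return []
    all_counts = Counter(tag for t in traces for tag in t["tags"])
    slow_counts = Counter(tag for t in slow for tag in t["tags"])
    scores = {
        tag: slow_counts[tag] / len(slow) - all_counts[tag] / len(traces)
        for tag in slow_counts
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Here each trace is a dict such as {"duration_ms": 100, "operations": {"catalog: stock_db": 60}, "tags": {"version=2"}}. A tag that appears in every trace scores zero, while a tag found mostly in the slow traces rises to the top.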

As you can see in the illustration below, the catalog: stock_db operation accounts for more than 50% of the total duration.

../../_images/analyzer-01.png

Clicking on that operation filters the view by service and operation. In this case, at least one trace for that service and operation has an error.

../../_images/analyzer-02-filtered.png

Expand the view and scroll if necessary until you see the trace with the error. (In a different scenario, there may not be an error, but you could scroll through the traces to find the one with the longest duration.)

../../_images/analyzer-03-error.png

Click the trace to analyze it in the View a trace page.

../../_images/analyzer-04-view-trace.png