Guidelines for Performance Monitoring

This document covers how SDKs should add support for Performance Monitoring with Distributed Tracing.

This should give an overview of the APIs that SDKs need to implement, without mandating internal implementation details.

Reference implementations:

SDK Configuration

Tracing is enabled by setting either one of two new SDK config options, tracesSampleRate and tracesSampler. If not set, both default to undefined, making tracing opt-in.

tracesSampleRate

This should be a float/double between 0.0 and 1.0 (inclusive) and represents the percentage chance that any given transaction will be sent to Sentry. So, barring outside influence, 0.0 is a 0% chance (none will be sent) and 1.0 is a 100% chance (all will be sent). This rate applies equally to all transactions; in other words, each transaction should have the same random chance of ending up with sampled = true, equal to the tracesSampleRate.

See more about how sampling should be performed below.

tracesSampler

This should be a callback, called when a transaction is started, which will be given a samplingContext object and which should return a sample rate between 0.0 and 1.0 for the transaction in question. This sample rate should behave the same way as the tracesSampleRate above, with the difference that it only applies to the newly-created transaction, such that different transactions can be sampled at different rates. Returning 0.0 should force the transaction to be dropped (set to sampled = false) and returning 1.0 should force the transaction to be sent (set sampled = true).

Optionally, the tracesSampler callback can also return a boolean to force a sampling decision (with false equivalent to 0.0 and true equivalent to 1.0). If returning two different datatypes isn't an option in the implementing language, this possibility can safely be omitted.

See more about how sampling should be performed below.

maxSpans

Because transaction payloads have a maximum size enforced on the ingestion side, SDKs should limit the number of spans that are attached to a transaction. This is similar to how breadcrumbs and other arbitrarily sized lists are limited to prevent accidental misuse. If new spans are added once the maximum is reached, the SDK should drop the spans and ideally use the internal logging to help debugging.

The maxSpans should be implemented as an internal, non-configurable, constant that defaults to 1000. It may become configurable if there is justification for that in a given platform.

The maxSpans limit may also help avoiding transactions that never finish (in platforms that keep a transaction open for as long as spans are open), preventing OOM errors, and generally avoiding degraded application performance.

Event Changes

As of writing, transactions are implemented as an extension of the Event model.

The distinctive feature of a Transaction is type: "transaction".

Apart from that, the Event gets new fields: spans, contexts.TraceContext.

New Span and Transaction Classes

In memory, spans build up a conceptual tree of timed operations. We call the whole span tree a transaction. Sometimes we use the term "transaction" to refer to a span tree as a whole tree, sometimes to refer specifically to the root span of the tree.

Over the wire, transactions are serialized to JSON as an augmented Event, and sent as envelopes. The different envelope types are for optimizing ingestion (so we can route "transaction events" differently than other events, mostly "error events").

In the Sentry UI, you can use Discover to look at all events regardless of type, and the Issues and Performance sections to dive into errors and transactions, respectively. The user-facing tracing documentation explains more of the concepts on the product level.

The Span class stores each individual span in a trace.

The Transaction class is like a span, with a few key differences:

  • Transactions have name, spans don't.
  • Calling the finish method on spans record the span's end timestamp. For transactions, the finish method additionally sends an event to Sentry.

The Transaction class may inherit from Span, but that's an implementation detail. Semantically, transactions represent both the top-level span of a span tree as well as the unit of reporting to Sentry.

  • Span Interface

    • When a Span is created, set the startTimestamp to the current time
    • SpanContext is the attribute collection for a Span (Can be an implementation detail). When possible SpanContext should be immutable.
    • Span should have a method startChild which creates a new span with the current span's id as the new span's parentSpanId and the current span's sampled value copied over to the new span's sampled property
    • The startChild method should respect the maxSpans limit, and once the limit is reached the SDK should not create new child spans for the given transaction.
    • Span should have a method called toSentryTrace which returns a string that could be sent as a header called sentry-trace.
    • Span should have a method called iterHeaders (adapt to platform's naming conventions) that returns an iterable or map of header names and values. This is a thin wrapper containing return {"sentry-trace": toSentryTrace()} right now. See continueFromHeaders as to why this exists and should be preferred when writing integrations.
  • Transaction Interface

    • A Transaction internally holds a flat list of child Spans (not a tree structure)
    • Transaction has additionally a setName method that sets the name of the transaction
    • Transaction receives a TransactionContext on creation (new property vs. SpanContext is name)
    • Since a Transaction inherits a Span it has all functions available and can be interacted with like it was a Span
    • A transaction is either sampled (sampled = true) or unsampled (sampled = false), a decision which is either inherited or set once during the transaction's lifetime, and in either case is propagated to all children. Unsampled transactions should not be sent to Sentry.
    • TransactionContext should have a static/ctor method called fromSentryTrace which prefills a TransactionContext with data received from a sentry-trace header value
    • TransactionContext should have a static/ctor method called continueFromHeaders(headerMap) which is really just a thin wrapper around fromSentryTrace(headerMap.get("sentry-trace")) right now. This should be preferred by integration/framework-sdk authors over fromSentryTrace as it hides the exact header names used deeper in the core sdk, and leaves opportunity for using additional headers (from the W3C) in the future without changing all integrations.
  • Span.finish()

    • Accepts an optional endTimestamp to allow users to set a custom endTimestamp on the finished span
    • If an endTimestamp value is not provided, set endTimestamp to the current time (in payload timestamp)
  • Transaction.finish()

    • super.finish() (call finish on Span)
    • Send it to Sentry only if sampled == true
    • Like spans, can be given an optional endTimestamp value that should be passed into the span.finish() call
    • A Transaction needs to be wrapped in an Envelope and sent to the Envelope Endpoint
    • The Transport should use the same internal queue for Transactions / Events
    • The Transport should implement category-based rate limiting →
    • The Transport should deal with wrapping a Transaction in an Envelope internally

Sampling

Each transaction has a "sampling decision," that is, a boolean which dictates whether or not it should be sent to Sentry. This should be set exactly once during a transaction's lifetime, and should be stored in an internal sampled boolean.

There are multiple ways a transaction can end up with a sampling decision:

  • Random sampling according to a static sample rate set in tracesSampleRate
  • Random sampling according to a dynamic sample rate returned by tracesSampler
  • Absolute decision (100% chance or 0% chance) returned by tracesSampler
  • If the transaction has a parent, inheriting its parent's sampling decision
  • Absolute decision passed to startTransaction

When there's the potential for more than one of these to come into play, the following precedence rules should apply:

  1. If a sampling decision is passed to startTransaction (startTransaction({name: "my transaction", sampled: true})), that decision will be used, regardlesss of anything else
  2. If tracesSampler is defined, its decision will be used. It can choose to keep or ignore any parent sampling decision, or use the sampling context data to make its own decision or choose a sample rate for the transaction.
  3. If tracesSampler is not defined, but there's a parent sampling decision, the parent sampling decision will be used.
  4. If tracesSampler is not defined and there's no parent sampling decision, tracesSampleRate will be used.

Sampling Context

If defined, the tracesSampler callback should be passed a samplingContext object, which should include, at minimum:

  • The transactionContext with which the transaction was created
  • A boolean parentSampled which contains the sampling decision passed down from the parent, if any
  • Data from an optional customSamplingContext object passed to startTransaction when it is called manually

Depending on the platform, other default data may be included. (For example, for server frameworks, it makes sense to include the request object corresponding to the request the transaction is measuring.)

Propagation

A transaction's sampling decision should be passed to all of its children, including across service boundaries. This can be accomplished in the startChild method for same-service children and using the senry-trace header for children in a different service.

Header sentry-trace

The header is used for trace propagation. SDKs use the header to continue traces from upstream services (incoming HTTP requests), and to propagate tracing information to downstream services (outgoing HTTP requests).

sentry-trace = traceid-spanid-sampled

sampled is optional. So at a minimum, it's expected:

sentry-trace = traceid-spanid

To offer a minimal compatibility with the W3C traceparent header (without the version prefix) and Zipkin's b3 headers (which consider both 64 and 128 bits for traceId valid), the sentry-trace header should have a traceId of 128 bits encoded in 32 hex chars and a spanId of 64 bits encoded in 16 hex chars. To avoid confusion with the W3C traceparent header (to which our header is similar but not identical), we call it simply sentry-trace. No version is being defined in the header.

The sampled Value

To simplify processing, the value consists of a single (optional) character. The possible values are:

Copied
  - No value means defer

0 - Don't sample

1 - Sampled

Unlike with b3 headers, a sentry-trace header should never consist solely of a sampling decision, with no traceid or spanid values. There are good reasons to always include the traceid and spanid regardless of the sampling decision, and doing so also simplifies implementation.

Besides the usual reasons to use *defer,* in the case of Sentry, a reason would be if a downstream system captures an error event with Sentry. The decision could be done at that point to sample that trace in order to have tracing data available for the reported crash.

sentry-trace = sampled

Which in reality is useful for proxies to set it to 0 and opt out of tracing.

Static API Changes

The Sentry.startTransaction function should take two arguments - the transactionContext passed to the Transaction constructor and an optional customSamplingContext object containing data to be passed to tracesSampler (if defined).

It creates a Transaction bound to the current hub and returns the instance. Users interact with the instance for creating child spans and, thus, have to keep track of it themselves.

Hub Changes

  • Introduce a method called traceHeaders

    • This function returns a header (string) sentry-trace
    • The value should be the trace header string of the Span that is currently on the Scope
  • Introduce a method called startTransaction

    • Takes the same two arguments as Sentry.startTransaction
    • Creates a new Transaction instance
    • Should implement sampling as described in more detail in the 'Sampling' section of this document
  • Modify the method called captureEvent or captureTransaction

    • Don't set lastEventId for transactions

Scope Changes

The Scope holds a reference to the current Span or Transaction.

  • Scope Introduce setSpan
    • This can be used internally to pass a Span / Transaction around so that integrations can attach children to it
    • Setting the transaction property on the Scope (legacy) should overwrite the name of the Transaction stored in the Scope, if there is one. With that we give users the option to change the transaction name even if they don't have access to the instance of the Transaction directly.

Interaction with beforeSend and Event Processors

The beforeSend callback is a special Event Processor that we consider to be of most prominent use. Proper Event Processors are often considered internal.

Transactions should not go through beforeSend. However, they are still processed by Event Processors. This is a compromise between some flexibility in dealing with the current implementation of transactions as events, and leaving room for different lifetime hooks for transactions and spans.

Motivations:

  1. Future-proofing: if users rely on beforeSend for transactions, that would complicate eventually implementing individual span ingestion without breaking user code. As of writing, a transaction is sent as an event, but that is considered an implementation detail.

  2. API compatibility: users have their existing implementation of beforeSend that only ever had to deal with error events. We introduced transactions as a new type of event. As users upgrade to a new SDK version and start using tracing, their beforeSend would start seeing a new type that their code was not meant to handle. Before transactions, they didn't have to care about different event types at all. There are several possible consequences: breaking user apps; silently and unintentionally dropping transactions; transaction events modified in surprising ways.

  3. In terms of usability, beforeSend is not a perfect fit for dropping transactions like it is for dropping errors. Errors are a point-in-time event. When errors happen, users have full context in beforeSend and can modify/drop the event before it goes to Sentry. With transactions the flow is different. Transactions are created and then they are open for some time while child spans are created and appended to it. Meanwhile outgoing HTTP requests include the sampling decision of the current transaction with other services. After spans and the transaction are finished, dropping the transaction in a beforeSend-like hook would leave orphan transactions from other services in a trace. Similarly, modifying the sampling decision to "yes" at this late stage would also produce inconsistent traces.

You can edit this page on GitHub.