Feb 21st, 2021

📊 Instant Full-Stack Insights with NewRelic

When systems grow, we quickly lose the degree of oversight we'd be comfortable with. Our services interact with third-party tools, query databases, send webhook requests, and perform countless other operations.

Even for simple projects, analyzing how long certain operations take or tracing individual requests can become quite challenging.

For distributed systems with multiple chained services, messaging queues, and other moving parts, this becomes much harder.

Luckily, several products have emerged over the past couple of years, offering more insights and observability throughout your complete stack. In this post, I'll take a look at what NewRelic is offering in terms of browser and API monitoring, distributed tracing, uptime checks, and other neat features I found.

For our journey across the stack, let's start with the experience your customers are served: the frontend, and more specifically, the browser.

🌐 Browser-Monitoring, Plug & Play

NewRelic lets you set up browser applications by integrating a snippet that spins up a browser agent. The agent periodically sends requests containing information like events, interactions, and errors to be ingested into NewRelic.

Once you have set up your application, pasted the snippet, and used your application for a bit, head over to NewRelic again. On the Summary page, you're greeted with the most important metrics for performance, error rates, latency, and more.

https://static.brunoscheufler.com/posts/2021-02-20-newrelic/summary.png

On the Session traces page, you can observe all sessions the browser agent recorded. This lets you trace every interaction, request, or other event the users experience in your application, which makes tracing back the interactions that led to an error much easier.

https://static.brunoscheufler.com/posts/2021-02-20-newrelic/session.png

There are a lot more interesting pages to check out, and if you're curious, you can query all the data you're interested in. I think it is safe to say that adding NewRelic to your web applications can be a big help in identifying issues, investigating performance bottlenecks, and observing other critical metrics.

The best part is that you get all of this out of the box, without any additional configuration. And if you want to report errors, add custom attributes to identify specific transactions or users, or add some more data to enrich your metrics, you can use the Browser agent and SPA API.
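To make this a bit more concrete, here is a minimal sketch of how the browser API could be used. The snippet exposes a global `newrelic` object; the helper names (`identifyUser`, `reportCheckoutFailure`), attribute keys, and values are made up for illustration, while `setCustomAttribute`, `noticeError`, and `addPageAction` are the browser agent calls I'm relying on.

```javascript
// Hypothetical helpers around the global `newrelic` object created by the
// browser snippet. The agent is passed in so the helpers can be exercised
// without a real page.

// Attach user details to subsequent events of this page view.
function identifyUser(agent, user) {
  if (!agent) return false; // snippet blocked or not loaded yet
  agent.setCustomAttribute('userId', user.id);
  agent.setCustomAttribute('plan', user.plan);
  return true;
}

// Report a handled error together with some context.
function reportCheckoutFailure(agent, error, cartValue) {
  if (!agent) return false;
  agent.noticeError(error, { step: 'checkout', cartValue });
  // Also record a custom page action that can be queried later.
  agent.addPageAction('checkoutFailed', { cartValue });
  return true;
}

// In the browser, you would pass window.newrelic:
// identifyUser(window.newrelic, { id: 'user-123', plan: 'pro' });
```

Guarding against a missing agent is a deliberate choice here: ad blockers can prevent the snippet from loading, and monitoring code should not break the page when that happens.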

With a couple of clicks, you can set up alerting policies, for example, to get notified whenever the page error rate exceeds a configured threshold. You won't miss any critical events with this.

What's more, if you'd like to take your tracing to the next level, you can enable distributed tracing for specified hosts, so all requests sent to external services (like API requests) are modified to include a header that identifies your request. In your backend, you can then add this identifier to any further monitoring, to connect the stack from end to end.

With distributed tracing, errors that occur in one of your services, and are returned to the client, are automatically linked so you get the full picture without any manual investigation.

The only downside of enabling distributed tracing for cross-origin requests is that you have to set up your backend to allow the newrelic header in pre-flight requests, so CORS doesn't complain.
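As a rough sketch of what that backend change could look like without any framework: the handler below answers pre-flight requests and lists the tracing headers as allowed. The origin, the allowed methods, and the function names are assumptions for this example. `traceparent` and `tracestate` are only relevant if your setup also sends W3C trace context headers.

```javascript
// Headers the browser agent may add to traced requests. `newrelic` is the
// one mentioned in this post; the other two are an assumption about setups
// that also send W3C trace context headers.
const TRACING_HEADERS = ['newrelic', 'traceparent', 'tracestate'];

// Hypothetical allow-list of frontend origins.
const ALLOWED_ORIGINS = ['https://app.example.com'];

// Build the CORS response headers for a given request origin.
function corsHeaders(origin) {
  if (!ALLOWED_ORIGINS.includes(origin)) return {};
  return {
    'Access-Control-Allow-Origin': origin,
    'Access-Control-Allow-Methods': 'GET, POST, OPTIONS',
    'Access-Control-Allow-Headers': ['Content-Type', ...TRACING_HEADERS].join(', '),
    Vary: 'Origin',
  };
}

// Request listener compatible with Node's http.createServer.
function handleRequest(req, res) {
  const headers = corsHeaders(req.headers.origin);
  if (req.method === 'OPTIONS') {
    // Answer the pre-flight request without touching application logic.
    res.writeHead(204, headers);
    res.end();
    return;
  }
  res.writeHead(200, { ...headers, 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ ok: true }));
}

// Usage:
// require('http').createServer(handleRequest).listen(3000);
```

If you already use a CORS middleware, adding the header names to its list of allowed headers should achieve the same result.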

⏲️ Instant Backend Observability

Now that we collect insights on our frontend deployment, how about monitoring our backend services? NewRelic offers application performance management (APM) agents for a variety of languages. We'll focus on Node.js for now, as this is probably the most used setup at the moment.

Before we start, let's quickly go over the core data types NewRelic uses to organize the data it collects: metrics, events, logs, and traces.

  • Metrics: A numeric measurement of an application or system. Metrics come in a variety of flavours and can contain metadata like durations and other attributes.
  • Events: Put simply, everything that can happen can be turned into an event. An easy example is the Transaction event NewRelic's APM agent uses for activities like HTTP requests. You can create custom events, attach metric data to events, and turn events into metrics.
  • Logs: Another common data type is log data. Whether you're collecting access logs, infrastructure events, or other log-based data, this all fits into the category of logs.
  • Traces: For distributed tracing, NewRelic collects Span objects, including attributes to enrich transactions.

The Node.js agent is initialized before any other logic is run, so it can wrap all dependencies and special parts of the Node.js core modules. This is necessary to track requests to external services, database calls, and various other metrics.

To make sure nothing else is run before NewRelic is initialized, I require it as a preload script using the --require command-line argument.

node -r newrelic main.js

To configure NewRelic, you can either place a configuration file in your root directory or set environment variables. I prefer the latter.

To add your services to NewRelic, you can head over to APM and create an application. This is not strictly required, as you could just use your license key and choose an application name. If this is your first time setting it up, though, the walkthrough NewRelic offers is probably helpful.

Once your configuration is in place and NewRelic is either required in your application or preloaded via the command line, the agent will set everything up on application launch and start to monitor your service.

Without any additional configuration, you'll get the following benefits out of the box:

  • Popular web frameworks are detected, and web transactions are created, giving you fine-grained insights into how long specific parts of your requests take (parsing the request body, etc.)
  • The timing of database calls will be measured when using common database drivers, such as pg. You can also enable slow query tracing.
  • External services will be detected, shown in the service map, and interactions like requests will be measured and displayed in the External services section

Of course, you're not limited to the instrumentation that comes with the Node.js agent. You can easily add more details like:

  • custom attributes: To enrich your metrics, you might want to add attributes for details about the current user, request, or other metadata. This is done by adding custom attributes to the current transaction
  • errors: With noticeError, you can manually add errors to be registered in NewRelic. You can add custom attributes to errors as well, so you get all the context you need to investigate further.
  • transactions: Sometimes you might want to start a transaction manually, for example, in services that don't follow the typical request-response pattern of APIs. In other cases, you might want to add custom segments to your transactions, for example, to measure the time specific parts of your business logic take up. The custom instrumentation API makes all of this possible.
  • custom metrics and events: If you want to record metrics outside of transactions, you can use recordMetric for metrics with a single value (e.g. durations of specific operations, or business data like the value of a shopping cart). If you want to add even more details, you can call recordCustomEvent to create an event that receives attributes in addition to an event type.
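Here is a small sketch of how custom attributes, errors, metrics, and events from the list above could come together in one place. `chargeCart`, the `charge` callback, the cart shape, and the metric and event names are all made up. The agent is passed in as a parameter (in a real service it would be the object returned by `require('newrelic')`).

```javascript
// `agent` is the object returned by require('newrelic'); it is passed in so
// the function can also be run with a stub. All names below are examples.
async function chargeCart(agent, cart, charge) {
  // Attach context to the transaction that is currently running.
  agent.addCustomAttribute('userId', cart.userId);

  const startedAt = Date.now();
  try {
    await charge(cart);
  } catch (error) {
    // Register the error along with attributes for later investigation.
    agent.noticeError(error, { cartId: cart.id });
    throw error;
  }

  // Single-value metrics: a duration and a business value.
  agent.recordMetric('Custom/Cart/ChargeDuration', Date.now() - startedAt);
  agent.recordMetric('Custom/Cart/Value', cart.total);

  // A custom event: an event type plus arbitrary attributes.
  agent.recordCustomEvent('CartCharged', {
    cartId: cart.id,
    total: cart.total,
    itemCount: cart.items.length,
  });
}

// Usage inside a request handler:
// const newrelic = require('newrelic');
// await chargeCart(newrelic, cart, (c) => paymentProvider.charge(c));
```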

Using the Node.js API, you can customize nearly every aspect of collecting or enriching metrics. You can also instrument modules that might have been loaded before NewRelic, or add instrumentation to modules NewRelic does not detect by itself.
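As one example of the custom instrumentation API, the sketch below wraps a queue job in a manually started background transaction and times two parts of it as custom segments. The job shape, the `steps` object, and all names passed to the agent are hypothetical. I'm assuming `startBackgroundTransaction(name, group, handler)` and `startSegment(name, record, handler)` both return whatever the handler returns.

```javascript
// Hypothetical queue worker: processes one job inside its own transaction.
// `agent` is the object returned by require('newrelic').
function processJob(agent, job, steps) {
  // Background transactions are meant for work outside the request cycle.
  // When the handler returns a promise, the transaction ends once it settles.
  return agent.startBackgroundTransaction('process-job', 'queue', async () => {
    // Each startSegment call is timed as a separate part of the transaction.
    const payload = await agent.startSegment('load-payload', true, () =>
      steps.load(job.id)
    );
    return agent.startSegment('render-report', true, () =>
      steps.render(payload)
    );
  });
}

// Usage:
// const newrelic = require('newrelic');
// await processJob(newrelic, { id: 7 }, { load: loadPayload, render: renderReport });
```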

Once you're happy with the metrics to be collected, you can start up your service again and perform some actions.

https://static.brunoscheufler.com/posts/2021-02-20-newrelic/apm-overview.png

In the APM overview, you'll get metrics similar to what you saw in the frontend, for example, error rates, but also backend-specific metrics like throughput, the timing of web transactions, and an Apdex score, which rates how satisfied users are likely to be with your response times.
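NewRelic calculates the Apdex score for you, but it helps to know what the number means. Below is a sketch of the standard Apdex formula, not NewRelic's implementation; the sample response times and the threshold T of 500 ms are made-up values.

```javascript
// Apdex = (satisfied + tolerating / 2) / total samples, where
//   satisfied:  response time <= T
//   tolerating: T < response time <= 4T
//   frustrated: everything slower (contributes nothing)
function apdex(responseTimesMs, thresholdMs) {
  if (responseTimesMs.length === 0) return null; // no samples, no score
  let satisfied = 0;
  let tolerating = 0;
  for (const t of responseTimesMs) {
    if (t <= thresholdMs) satisfied += 1;
    else if (t <= 4 * thresholdMs) tolerating += 1;
  }
  return (satisfied + tolerating / 2) / responseTimesMs.length;
}

// Eight requests with T = 500 ms: 5 satisfied, 2 tolerating, 1 frustrated.
const score = apdex([100, 200, 300, 400, 500, 900, 1800, 2500], 500);
console.log(score); // (5 + 2 / 2) / 8 = 0.75
```

A score of 1 means every request was fast enough, while a score close to 0 means most users had to wait longer than four times the threshold.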

https://static.brunoscheufler.com/posts/2021-02-20-newrelic/trace.png

If you enabled distributed tracing in your agent, you'll see all transaction spans, which can be expanded to show subsequent spans, external service requests, database calls, and other internal processes.
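For transports the agent does not connect on its own, such as a custom message queue, the trace context can be passed along manually. The sketch below assumes the transaction handle returned by `getTransaction()` offers `insertDistributedTraceHeaders` and `acceptDistributedTraceHeaders`, which is how I understand the Node.js agent API. The message shape, the `publish` callback, and the function names are hypothetical.

```javascript
// `agent` is the object returned by require('newrelic'). The message shape
// and the `publish` callback are stand-ins for a real queue client.

// Producer: attach trace headers to the outgoing message.
function publishWithTrace(agent, publish, body) {
  const headers = {};
  // Adds the trace context of the current transaction to `headers`.
  agent.getTransaction().insertDistributedTraceHeaders(headers);
  return publish({ headers, body });
}

// Consumer: continue the trace inside a background transaction.
function consumeWithTrace(agent, message, handle) {
  return agent.startBackgroundTransaction('consume-message', () => {
    // 'Queue' describes the transport the headers arrived through.
    agent.getTransaction().acceptDistributedTraceHeaders('Queue', message.headers);
    return handle(message.body);
  });
}
```

With both sides in place, the consumer's spans should show up as part of the same trace as the producer's transaction.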

🚨 Uptime Checks with Synthetics

While NewRelic offers a great range of insights already, you can also monitor service availability with Synthetics, a set of configurable uptime monitors. Synthetics includes four different kinds of monitors:

  • ✅ availability: A simple ping using curl
  • ⌛ page load performance: Full page load of a set URL, with insights on resource breakdowns and timelines
  • 🙂 user flow / functionality: A scripted browser test that validates your web application works as expected. Similar to end-to-end tests with Cypress or other UI testing libraries, this offers a quick way to continuously ensure your user experience matches your expectations
  • 📄 endpoint availability: Combining the scripted approach you saw in user flow monitors with requests to your backend services, the monitor calls your API endpoints so you can check that they respond as expected
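To give an idea of what a scripted endpoint check could look like, here is a minimal sketch. It assumes the Synthetics scripting runtime provides a request-style `$http` global and that a thrown error fails the monitor run. The URL and the expected response body are made up.

```javascript
// Sketch of a scripted API monitor. Inside the Synthetics runtime, `$http`
// is expected to be available as a global; URL and body shape are examples.
const HEALTH_URL = 'https://api.example.com/health';

// Throws when the response is not what we expect, failing the check.
function verifyHealth(err, response, body) {
  if (err) throw err;
  if (response.statusCode !== 200) {
    throw new Error('expected HTTP 200, got ' + response.statusCode);
  }
  const parsed = typeof body === 'string' ? JSON.parse(body) : body;
  if (parsed.status !== 'ok') {
    throw new Error('expected status "ok", got ' + parsed.status);
  }
}

// `$http` only exists inside Synthetics, so guard the call for local runs.
if (typeof $http !== 'undefined') {
  $http.get(HEALTH_URL, verifyHealth);
}
```

Keeping the verification in a named function makes it possible to try the checks locally with a fake response before pasting the script into a monitor.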

📟 Alerts in All Channels

And we're still not done yet. While dashboards are nice to explore your metrics, sometimes you need to be notified about incidents when they happen.

If you're running low on resources, see a spike in requests, or experience elevated error rates, you might want to be paged. NewRelic allows creating notification channels that integrate with services including Slack, PagerDuty, and OpsGenie, but also webhooks and regular emails.

Policies describe events and conditions under which an incident is created and routed to a notification channel of choice.

https://static.brunoscheufler.com/posts/2021-02-20-newrelic/policy.png


This concludes the high-level overview of features NewRelic offers. With infrastructure and logs, there are still areas we haven't covered yet. I'm also planning to create a writeup on how to automate dashboards and alerts for multiple environments, so you don't have to manually copy over these resources.

I hope you enjoyed this post! If you have any feedback or questions, don't hesitate to reach out on Twitter or by mail!