As systems grow, we quickly lose the degree of oversight we'd be comfortable with. Our services interact with third-party tools, query databases, send webhook requests, and perform countless other operations.
Even for simple projects, analyzing how long certain operations take or tracing individual requests can become quite challenging.
For distributed systems with multiple chained services, messaging queues, and other moving parts, this becomes much harder.
Luckily, several products have emerged over the past couple of years, offering more insights and observability throughout your complete stack. In this post, I'll take a look at what NewRelic is offering in terms of browser and API monitoring, distributed tracing, uptime checks, and other neat features I found.
For our journey across the stack, let's start with the experience your customers are served: the frontend, or more specifically, the browser.
NewRelic lets you set up browser applications by integrating a snippet that spins up a browser agent. The agent sends periodic requests containing information like events, interactions, and errors to be ingested into NewRelic.
Once you have set up your application, pasted the snippet, and used your application for a bit, head over to NewRelic again. On the Summary page, you're greeted with the most important metrics for performance, error rates, latency, and more.
On the Session traces page, you can observe all sessions the browser agent recorded. You can follow every interaction, request, or other event users experience in your application, which makes tracing back the interactions that led to an error much easier.
There are a lot more interesting pages to check out, and if you're curious, you can query all the data you're interested in. I think it is safe to say that adding NewRelic to your web applications can be a big help in identifying issues, investigating performance bottlenecks, and observing other critical metrics.
The best part of this is, you get it all out of the box, without any additional configuration. And if you want to report errors, add custom attributes to identify specific transactions or users, or add some more data to enrich your metrics, you can use the Browser agent and SPA API.
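As a rough sketch of what using that API can look like: the snippet exposes a global newrelic object in the browser, and setCustomAttribute and noticeError are part of its API. The helper name reportCheckoutError, the attribute names, and passing the agent in as a parameter are assumptions made for illustration.

```javascript
// Minimal sketch: attach a user attribute and report a handled error.
// `agent` is expected to be the global `newrelic` object created by the
// browser snippet. It is passed in so the helper does nothing when the
// snippet was blocked or has not loaded.
function reportCheckoutError(agent, userId, error) {
  if (!agent) return false; // snippet not available, nothing to report

  // Adds the attribute to events reported for this page view
  // ('userId' is a hypothetical attribute name).
  agent.setCustomAttribute('userId', userId);

  // Reports the handled error with custom attributes for context.
  agent.noticeError(error, { component: 'checkout' });
  return true;
}

// Usage in the browser (assumes the snippet is part of the page and
// `submitOrder` is your own function):
// try { submitOrder(); } catch (err) { reportCheckoutError(window.newrelic, 'user-123', err); }
```

Passing the agent in keeps the helper safe to call when an ad blocker prevents the snippet from loading.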
With a couple of clicks, you can set up alerting policies, for example, to get notified whenever the page error rate exceeds a configured threshold. You won't miss any critical events with this.
What's more, if you'd like to take your tracing to the next level, you can enable distributed tracing for specified hosts, so all requests sent to external services (like API requests) are modified to include a header that identifies your request. In your backend, you can then add this identifier to any further monitoring, to connect the stack from end to end.
With distributed tracing, errors that occur in one of your services, and are returned to the client, are automatically linked so you get the full picture without any manual investigation.
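To illustrate carrying the trace identifier into your own backend monitoring, here is a small sketch that adds it to a log line. It assumes the Node.js APM agent (covered later in this post) with distributed tracing enabled, and uses its getTraceMetadata call. The helper name logWithTrace and the log format are made up for this example.

```javascript
// Minimal sketch: add the current trace and span IDs to a structured log line.
// `agent` is expected to be the object returned by require('newrelic').
// `log` defaults to console.log and can be swapped for your own logger.
function logWithTrace(agent, message, log = console.log) {
  // Returns the IDs of the current trace and span. Outside of a
  // transaction the fields are missing and simply left out of the JSON.
  const { traceId, spanId } = agent.getTraceMetadata();

  const entry = { message, traceId, spanId };
  log(JSON.stringify(entry));
  return entry;
}

// Usage inside a request handler:
// const newrelic = require('newrelic');
// logWithTrace(newrelic, 'order created');
```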
The only downside of enabling distributed tracing for cross-origin requests is setting up your backend to allow the newrelic header in pre-flight requests, so CORS doesn't complain.
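A minimal sketch of such a pre-flight setup for a plain Node.js HTTP server could look like this. The function name handleCors, the allowed origin, and the list of methods are assumptions. The traceparent and tracestate entries are the W3C Trace Context headers, which you only need if your browser agent is configured to send them.

```javascript
// Request headers the backend accepts on cross-origin requests.
// 'newrelic' is the header mentioned above; 'traceparent' and 'tracestate'
// are only needed when W3C Trace Context headers are sent as well.
const ALLOWED_HEADERS = ['content-type', 'newrelic', 'traceparent', 'tracestate'];

// Sets the CORS response headers and answers pre-flight requests.
// Returns true when the request was a pre-flight and has been completed.
function handleCors(req, res, allowedOrigin) {
  res.setHeader('Access-Control-Allow-Origin', allowedOrigin);
  res.setHeader('Access-Control-Allow-Headers', ALLOWED_HEADERS.join(', '));

  if (req.method === 'OPTIONS') {
    res.setHeader('Access-Control-Allow-Methods', 'GET, POST, OPTIONS');
    res.statusCode = 204; // no body for pre-flight responses
    res.end();
    return true;
  }
  return false;
}

// Usage with Node's built-in http module (origin is a placeholder):
// const http = require('node:http');
// http.createServer((req, res) => {
//   if (handleCors(req, res, 'https://app.example.com')) return;
//   res.end('ok');
// }).listen(3000);
```

If you already use a CORS middleware, adding the header names to its list of allowed headers achieves the same result.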
Now that we collect insights on our frontend deployment, how about monitoring our backend services? NewRelic offers application performance monitoring (APM) agents for a variety of languages. We'll focus on Node.js for now, as this is probably the most used setup at the moment.
Before we start, let's quickly go over the core data types NewRelic distinguishes between: metrics, events, logs, and traces.
Events include the Transaction event NewRelic's APM agent uses for activities like HTTP requests. You can create custom events, attach metric data to events, and turn events into metrics.
Traces are made up of Span objects, including attributes to enrich transactions.
The Node.js agent is initialized before any other logic is run, so it can wrap all dependencies and special parts of the Node.js core modules. This is necessary to track requests to external services, database calls, and various other metrics.
To make sure nothing else is run before NewRelic is initialized, I require it as a preload script using the --require command-line argument.
node -r newrelic main.js
To configure NewRelic, you can either place a configuration file in your root directory or set environment variables. I prefer the latter.
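For reference, a minimal configuration file might look like the sketch below. The application name and license key are placeholders, and the selection of settings is just an example, not a recommended baseline.

```javascript
'use strict';

// newrelic.js, placed in the root directory of the service.
// All values below are placeholders for illustration.
exports.config = {
  app_name: ['my-service'],        // hypothetical application name
  license_key: 'YOUR_LICENSE_KEY', // placeholder, do not commit real keys
  distributed_tracing: {
    enabled: true,
  },
  logging: {
    level: 'info',
  },
};
```

The same settings can be expressed as environment variables instead: NEW_RELIC_APP_NAME, NEW_RELIC_LICENSE_KEY, NEW_RELIC_DISTRIBUTED_TRACING_ENABLED, and NEW_RELIC_LOG_LEVEL. That keeps the license key out of the repository.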
To add your services to NewRelic, you can head over to APM and create an application. This is not strictly required, as you could just use your license key and pick an application name. The walkthrough NewRelic offers is probably helpful if this is your first time setting it up, though.
Once your configuration is in place and NewRelic is loaded, either by requiring it in your application or by using the command-line argument, the agent will set everything up on application launch and start to monitor your service.
Without any additional configuration, you'll get the following benefits out of the box:
Of course, you're not limited to the instrumentation that comes with the Node.js agent. You can easily add more details:
With noticeError, you can manually add errors to be registered in NewRelic. You can add custom attributes to errors as well, so you get all the context you need to investigate further.
Use recordMetric for metrics with a single value (e.g. durations of specific operations, or business data like the value of a shopping cart). If you want to add even more details, you can call recordCustomEvent to create an event that receives attributes in addition to an event type.
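The three calls above could be combined roughly as follows. This is a sketch under assumptions: the function recordCheckout, the cart shape, the work callback, and the metric and event names are all made up for illustration, and the agent is passed in as a parameter instead of being required directly.

```javascript
// Minimal sketch: wrap a piece of business logic with custom instrumentation.
// `agent` is expected to be the object returned by require('newrelic').
// `work` is a hypothetical synchronous callback that processes the cart.
function recordCheckout(agent, cart, work) {
  const startedAt = Date.now();
  try {
    work(cart);
    // Custom event: an event type plus attributes describing what happened.
    agent.recordCustomEvent('CheckoutCompleted', { cartId: cart.id, total: cart.total });
    return true;
  } catch (err) {
    // Register the handled error together with custom attributes.
    agent.noticeError(err, { cartId: cart.id });
    return false;
  } finally {
    // Single-value metrics: a duration and a business value.
    agent.recordMetric('Custom/Checkout/DurationMs', Date.now() - startedAt);
    agent.recordMetric('Custom/Checkout/CartValue', cart.total);
  }
}

// Usage (chargeCustomer is your own function):
// const newrelic = require('newrelic');
// recordCheckout(newrelic, { id: 'cart-1', total: 42 }, (cart) => chargeCustomer(cart));
```

Prefixing custom metric names with Custom/ keeps them apart from the metrics the agent reports on its own.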
Using the Node.js API, you can customize nearly every aspect of collecting or enriching metrics. You can also instrument modules that might have been loaded before NewRelic, or add instrumentation to modules NewRelic does not detect by itself.
Once you're happy with the metrics to be collected, you can start up your service again and perform some actions.
In the APM overview, you'll get similar metrics to what you saw in the frontend, for example, error rates, but also backend-specific metrics like throughput, the timing of web transactions, and an Apdex score that rates how satisfied users are with the response times they experience.
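To make the Apdex score a bit more tangible, here is a small sketch of the standard Apdex formula: requests at or below a threshold T count as satisfied, requests up to 4T count as tolerating at half weight, and everything slower counts as frustrated. The function name and sample values are illustrative, and this is not how NewRelic computes the score internally.

```javascript
// Minimal sketch of the standard Apdex formula:
//   apdex = (satisfied + tolerating / 2) / total
// `responseTimes` and `threshold` must use the same unit (seconds here).
function apdex(responseTimes, threshold) {
  if (responseTimes.length === 0) return null; // no samples, no score

  let satisfied = 0;
  let tolerating = 0;
  for (const time of responseTimes) {
    if (time <= threshold) satisfied += 1;
    else if (time <= 4 * threshold) tolerating += 1;
    // anything slower counts as frustrated and adds nothing
  }
  return (satisfied + tolerating / 2) / responseTimes.length;
}

// Two satisfied, two tolerating, one frustrated request with T = 0.5s:
console.log(apdex([0.1, 0.3, 0.7, 1.5, 3.0], 0.5)); // 0.6
```

A score of 1 means every request was fast enough, while a score close to 0 means most users had to wait longer than four times the threshold.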
If you enabled distributed tracing in your agent, you'll see all transaction spans, which can be expanded to show subsequent spans, external service requests, database calls, and other internal processes.
On top of the insights covered so far, you can monitor service availability with Synthetics, a group of configurable uptime monitors. Synthetics include four different kinds of monitors.
And we're still not done yet. While dashboards are nice to explore your metrics, sometimes you need to be notified about incidents when they happen.
If you're running low on resources, see a spike in requests, or experience elevated error rates, you might want to be paged. NewRelic allows creating notification channels that integrate with services including Slack, PagerDuty, and OpsGenie, as well as webhooks and regular emails.
Policies describe events and conditions under which an incident is created and routed to a notification channel of choice.
This concludes the high-level overview of features NewRelic offers. With infrastructure and logs, there are still areas we haven't covered yet. I'm also planning a writeup on how to automate dashboards and alerts for multiple environments, so you don't have to manually copy over these resources.