Application Logging and its importance

Shubham Gupta
Ula Engineering
Published in
7 min readOct 11, 2020

--

The usefulness of logs is often underestimated. We all try to put efforts to build perfect systems and thinks that the code we wrote is perfectly good but we all know that eventually, the application will crash, a server will go down, users may complain about a bug that “randomly” appears, or even worse, the client could realize that some data is missing, and there is no clue when or why this data got lost. This is real life, and we must be ready for these problems. A large part of software developers’ lives is monitoring, troubleshooting and debugging. Logging makes this a much easier and smoother process.

Generally, developers don’t write any code that logs meaningful and relevant messages for the feature they are writing. Writing code to log things may even be considered not essential and a waste of time by many developers.

Most of the time developers don’t even realize the importance of logging till they get involved in production debugging and face bugs.

Apart from debugging, logging is also important from multiple other perspectives, like event tracing, request tracing, security, and Business intelligence can also be directly fueled by data produced by logs.

Debugging is an art and believe me that it comes after a lot of experience and if you are considering logging in your codes properly then that makes it easy to debug and helps to find the problems and issues faster.

When we face any pressing issue and are unable to find the actual pain point, at that time we realize the importance of logging. Considering the log level is also very important and why there are multiple levels in the logging. So questions arise that does logging impact the performance of my application? What can I do to optimize the space? should I log everything? To find out everything lets deep dive into it and let’s improve our codes in production.

Why and what to log?

It is not enough to log errors in order to use them for troubleshooting purposes. It is also useful to log successful requests so we can have a clear idea of how users work with the application.

But this goes beyond that. Logs are also useful to detect common mistakes users make, as well as for security purposes. Writing good logs about a user’s activity can alert us about malicious activity.

It is important that logs can provide accurate context about what the user was doing when a specific error happened. In this way, having at least one log entry per request/result in every layer of the application is mandatory. This is not trivial, and we have to take performance issues into account.

In order to address them, Logging Frameworks allow to specify log levels so that the application can be easily configured to just log messages that are marked as the specified log level or with the lower level. The main cons about this is that when a problem comes up, you are likely to be short of contextual data in your logs that can be analyzed.

Does logging impose a huge deal of performance?

From a performance point, it’s not a huge cost as most of the logging frameworks use asynchronous logging. Async decouples the logging overhead from the thread executing your code. An async logger has consistently lower latency than a synchronous logger and high throughput of logging messages at 6–68 times the rate of asynchronous logger.

If there is not much performance cost then should I log everything?

As logging also comes with a cost so is it feasible to log everything? Obviously not. Ideally, we should log what matters. So how do we identify what we should log and at what level we should log? To answer that we need to understand the logging levels properly.

Logging Levels: Why Do We Do It?

When it comes to logging, you have two essential and opposing forces, if you will. On the one hand, you want to capture every last detail you can because this might prove useful during troubleshooting or auditing your system. But on the other hand, all of that logging consumes resources. You can eat up disk space, overload people reading the logs, and even start to slow down your production code if you go overboard.

So logging requires either a balance or a way to get both the proverbial signal and the noise. And logging levels look to help with this by making your situation configurable on the fly.

Common Logging Levels

So what are these logging levels? Here are some common ones that you’ll see, listed in order from most severe to least severe, and followed up by some that are meta-considerations. When logging from your application, follow these guidelines:

FATAL

Fatal represents truly catastrophic situations, as far as your application is concerned. Your application is about to abort to prevent some kind of corruption or serious problem, if possible. This entry in the log should probably result in someone getting a 3 AM phone call.

ERROR

An error is a serious issue and represents the failure of something important going on in your application. Unlike FATAL, the application itself isn’t going down the tubes. Here you’ve got something like dropped database connections or the inability to access a file or service. This will require someone’s attention probably sooner than later, but the application can limp along.

WARN

Now we’re getting into the greyer area of hypotheticals. You use the WARN log level to indicate that you might have a problem and that you’ve detected an unusual situation. Maybe you were trying to invoke a service and it failed a couple of times before connecting on an automatic retry. It’s unexpected and unusual, but no real harm done, and it’s not known whether the issue will persist or recur. Someone should investigate warnings.

INFO

Finally, we can dial down the stress level. INFO messages correspond to normal application behavior and milestones. You probably won’t care too much about these entries during normal operations, but they provide the skeleton of what happened. A service started or stopped. You added a new user to the database. That sort of thing.

DEBUG

With DEBUG, you start to include more granular, diagnostic information. Here, you’re probably getting into “noisy” territory and furnishing more information than you’d want in normal production situations. You’re providing detailed diagnostic information for fellow developers, sysadmins, etc.

TRACE

This is really fine-grained information-finer even than DEBUG. When you’re at this level, you’re basically looking to capture every detail you possibly can about the application’s behavior. This is likely to swamp your resources in production and is seriously diagnostic.

ALL

This is just what it sounds like. Log absolutely everything, including any custom logging levels that someone has defined.

OFF

This is also just what it sounds like. Don’t log anything at all.

Here are a few diagrams that can help you to understand it better

So why do we have so many levels of logging?

Simply we don’t want to log everything in production so we usually keep our logging levels in production as ERROR or INFO modes so that we don’t pile up all the information but have the relevant information to debug things.

But in the development environment, you might want to keep it at INFO or debug so that you can identify potential issues and can debug things properly during development. You can also enable debug/trace levels in your local environments when you are debugging issues.

Also is some issue you are not able to identify in the production then you might change the logging level temporarily and can debug the issues. That’s the reason we always keep the configuration of level separately in the config files so that we can change it without modifying the complete code.

Common mistakes

Many a time I have seen people logging very generic information but not logging the data that is actually needed. Let’s consider the following example. (Consider the following code just as an example)

public Boolean login(@NotNull String userId, @NotNull String password){
User user = repository.findOne(userId);
if (Objects.isNull(user)){
log
.error("user not found");
// throw new Exception();
}
if (user.getPassword().equals(password)){
log
.error("Wrong password");
// throw new Exception;
}
if (!UserStatus.ACTIVE.equals(user.getStatus())){
log
.error("User is not active");
// throw new Exception;
}
log.info("login validation successful for user");
return true;
}

In the above code block we can clearly see that in case a user is not found then we are simply logging it with error mode, but the problem here is that we are not considering logging the relevant information that can be used to trace this log. How will you know if a user will complain that he is not able to login or not able to access his account? There can be thousands of users which have this kinda problem. So how to make it better? Let’s check out the following code snippet

public Boolean login(@NotNull String userId, @NotNull String password){
User user = repository.findOne(userId);
if (Objects.isNull(user)){
log
.error("user not found with id: {}", userId);
// throw new Exception();
}
if (user.getPassword().equals(password)){
log
.error("Wrong password for user id: {}", userId);
// throw new Exception;
}
if (!UserStatus.ACTIVE.equals(user.getStatus())){
log
.error("User with id {} is not active", userId);
// throw new Exception;
}
log.info("login validation successful for user {}", userId);
return true;
}

In the above code, we are properly logging the userId which is being passed in the logs. This will help us to find the actual issue with the userId and we can tell the actual reason why the user is not able to access his account.

Conclusion

Proactive and correct logging can save us big time and can also lead to improved business decisions and drives better business performance. Application logs contain a wide range of information that is not only useful for debugging or finding bugs but also elevate the overall security of an application and fuel business.

--

--