Tag Archives: Sensitive Data

Insecure logging could be a burden to your security team

If you are part of a security team, it is very likely that your team has been feverishly remediating the vulnerabilities caused by log4j in the past two months.  It is really frustrating and struggling as the potential damage of this vulnerability could be catastrophic if exploited. However, the slimy bright side of it , at least, means that  your developer team is trying to implement logging functions in the product for monitoring or debugging purposes.

However, logging itself sometimes could be another security issue that is often overlooked as many developers are treating logging as an internal debugging and monitoring functions where security enforcement is often missing. I have observed many cases where improper logging functions turn out to be security incidents and add many burdens to its security team to overturn the damage.

Apply logging functions with security controls

Before we dive deep into the details, let us look at the following  piece of code from an old internal project that I created a while back. If you are using NodeJs Express framework, you could pinpoint that this piece of code is acting as a middleware to log every single HTTP request with the request body into a log file

The above piece of code definitely composes  security issues if you start to review it from security perspective. First of all, sensitive information in the HTTP request could be added to logs files and it could potentially cause a data leakage if the logs files are accessed by unauthorized users. The internal project is a web application with a login and registration function. As consequence of the above logging functions, the registration verification token and the username and password in the HTTP requestes could be leaked into the log files.

Potential Risks Caused by Logging

Risk 1: Sensitive Data are logged in log files

Logging sensitive data without proper masking or filtering methods is a common security ignorance from startup to enterprise due to many reasons. Couple of years back, twitter sent out an announcement for its users and urged them to change passwords due to unmasked/unfiltered passwords being logged into an internal log file.  

Reason 1: Security is not baked into entire SDLC

Many development organizations are involved their security teams at the test phase of the software/service development cycles. Without consulting the security team at design and development phase, many developers are not aware which data should be masked or filtered before implementing the log functions.

One tricky and representative example  that I have experienced was that the development teams got a list of blacklist data entries, like IP, password and token by referring to a document created from the security team a while back. They applied the filtering method into the log function without consulting with the Security team. However, the ‘referer’ header containing sensitive API tokens from customers was logged into the log files as it was not included in the predefined black list. This implementation mistake was discovered after the feature has been shipped into production environment and it took a while to purge the sensitive data from the log systems.

Reason 2:  Lack of standard logging functions in a complex environment

With more and more companies adopting the micro service architecture and making the development environment complex, lack of standard log functions could be another reason where sensitive data is logged and exposed into log files.

The following diagram is a typical workflow of micro service architecture, where the API gateway is exposed to the public to handle API requests and many micro servers are deployed in its private VPC to process the API requests.  Some developers are probably aware that sensitive data must be filtered out at the API gateway level before sending it to the S3 log system. However, when the requests passed are handled in the internal micro service (for example, micro service B), the developers might forget to perform the filtering as they believe the service is residing in the internal VPC and there is no need to filter sensitive data before writing to the log files. As a result of that, potential sensitive data could be logged in to the log file by some internal micro services.

Reason 3: Insufficient QA and Security Testing

It is common some QA are only performing blackbox testing whereas Security Team are only employing some automation scanning tools to scan the applications to find the potential flaw in the codes. Then it is very difficult for the QA team and security team to figure out the security issues caused by logging without manual code reviewing.

Risk 2: Malicious Data are process and logged without validation

Another risk when implementing logging functions and writing data to log is that maliciou data is processed and executed without any validation. You might be curious about why I should perform certain validation when processing the log data and storing it in a log file as the entire purpose of logging is to capture the raw data and use it for analysis. 

The reason is that you might be at risk of Deserialization exploitation when validation is absent from your logging functions. I have seen many developers dumping the entire object into the log file in some cases. When this happens, they are very likely to use some serialization functions to serialize the object and write it to the log file. After that, they may deserialize the logged data for analyzing purposes.  In this case, it is possible that you are at risk of deserialization exploitation.

Take the Log4j as an example, except the Log4jShell vulnerability, it has been suffering from a couple of deserialization vulnerabilities where untrusted log data could lead to remote code execuations.  

Some Best Practices

To ensure your logging function is not becoming a burden for your security and even turned against you when it leads to a security incident. Some best practices could be followed.

Involve Security at every phase of SDLC when implement log function

Security applies at every phase of the software development life cycle (SDLC). If you don’t have a security review procedure set up in your organization,  it is time for you to define it now.  By designing  a secure log function or log management feature by collaborating with your security team would save your organization much more time and effort. For example, the security team might ask you to avoid using GET instead of POST if your logging function is going to log all the requests in the log. They could also ask you to mask certain sensitive data as soon as the data is processed before it gets logged in to the log files.

Implement a standard and centralized logging function

When the organization is getting larger and larger, your platform and service is getting more sophisticated. Without a standard or centralized logging function, each team is forced to choose its own way to implement logging functions in the service they are in charge of. This could add many security uncertainties in the log functions as you could not foreseen how the logging function is implemented 

Consistent monitoring and scan your log data

Sometimes, unexpected data could still be logged into the log file even though you have set up strict logging functions to mask or filter all the sensitive data.  For example, your clients might not follow your API usage guidance and send sensitive data when calling your API endpoints. Your log function might log these sensitive data into the log files as the usage of the API is not intended as it was designed.  Under this case, you need to have a monitoring tool to scan your log data to check whether there is unexpected sensitive data logging into the log files.

Understand that data you are logging

If you could not measure it, you could not manage it”. It also applies to security. If you don’t know which kind of data you are logging, you could not really secure it. For example, if you are going to dump an entire object into your log file by calling some serialization functions without validation the object data, you are likely to log some malicious data and could lead to an exploitation.