Develop Secure Codes and Service with Github Advanced Security

Back to Nov, 2020, I got the chance to evaluate Github Code Scanning, mainly CodeQL, as part of our effort to improve the security posture of our source code right after Github announced Code scanning became available in Sep 2020. I selected 4 different services written with Java, Python, C++ and JavaScript respectively and ran CodeQL scanning against them. Though there were some great advantages when it comes to ease of use and collaboration, the overall CodeQL code scanning results were average compared to other traditional commercial SAST tools. The result was not good enough for our team to replace the existing SAST tools.

Recently, our team started to assess Github Advance Security (GHAS) again to understand whether we could use Github Advanced Security Feature as a unified platform to secure our source code by evaluating the three main features Code Scanning, Secret Scanning and Dependency vulnerability in the GHAS. The overall evaluation totally surpassed my expectation as I saw a significant improvement of Github Advanced Security features by comparing with the results of evaluation conducted one and half years ago. 

In this post, I would like to share and highlight  some valuable findings and features the GHAS surprised me with and how they could help your organization to secure its codes and build secure services.

How did we start the re-evaluation?

Before we got involved with Github Advanced Security, we were clear that what we really wanted was a unified platform that could perform code scanning (SAST), secret detection,  software composition analysis (3rd party dependency vulnerability) and it should be easily integrated into our current CI pipelines.

With the previous evaluation and experience with Github Code Scanning, we figured out that GHAS could be a one stop solution to meet all the requirements. However, due to the previous evaluation, I was really concerned about the Code Scanning performance before we started evaluation. Well, it turns out that the concern was over worried.

In order to conduct a thorough evaluation, we selected 15 services/repo to cover all languages supported by Github to evaluate the GHAS , diagnose the findings and compare them with the existing tools that we have deployed. 

How did GHAS outperform others 

With the completion of the GHAS evaluation, the following are some highlights we think that the GHAS are outperforming others tools.

1.Code Scanning: Excellent Auto Build with Flexible Configuration

Nevertheless to say, Code scanning is a resource consuming task. Some of the repos that I am evaluating are monolithic repos, so building and scanning them are time and resource consuming. When scanning one of this monolithic repo with a popular open source tool we were evaluating, my personal laptop got completely frozen after running the scan for 10 minutes as the scanning task was consuming more than 9G memories. 

However, with Github Code Scanning (we only enabled CodeQL scanning by default), we found that this is not an issue because it provides an excellent auto build and scanning process in Github-hosted runners deployed in the Github network. 

If you could use Github-hosted runner to build and scan the service, you don’t have to bother your IT team to set up a self-hosted runner either with your own laptop or a remote server in your network. We were able to use the Github-hosted runner to build and launch the scan against 14 repos out of 15 selected. That means more than 93% percent of the scans could be completed with Github-hosted runners. That is a significant advantage as a high successful rate of using Github-Hosted runners means less resources required from our  organization to build and maintain a self-hosted server to run the scans.

  • Flexible Configuration to add manual build commands 

Some of the codes in the select repos have a non-standard build process. We could NOT simply run the default  maven build or cmake commands provided by the Auto-Build function. Under this situation, the flexibility to add the manual build commands is really necessary and powerful to ensure a successful build process . For example, we were able to build our Java service by adding some customized configuration for the maven setting with the help of defining manual build commands in the yaml configuration file.

  • Github Secret to keep your build information safe

As mentioned above, we have to set up some environment variables in the build process.  These environment variables are very sensitive and they should not be  exposed in the CodeQL yaml configuration files directly. The Github secrets function lets us hide the sensitive secrets in the yaml configuration file.

2. Code Scanning: Less False Positive with high true positive rate

One of the biggest challenges in SAST tool/Code scanning is that it likely yields  a high number of False Positives, which costs tons of time and effort for the engineering team to validate these false findings. The main reason for the high number of false positives is that static code analysis is largely based on assumptions and modeling methods after it builds the call stack (from source to sink), which is different from the DAST tool where the test payloads are actually executed by application codes.

As weeding out false positives is time and resource intensive,  low False Positive rate and high true positive rate are the key factors for the entire evaluation.  During the evaluation, we went through all the critical, high, medium vulnerabilities reported byCodeQL. Here are some key findings

  • Less False Positives than we expected

For the projects writing in Java, C++ and C#, the false positives are really low. The best one is with the Java languages, we saw a false positive rate at 0% with 2 valid findings. We double check it with another popular open source tool, the performance is equivalent where 2 valid findings were reported. Overall, for the compiled languages, most of the SAST tools we compared have a low false positive rate.  

CodeQL stands out when it comes to the Script languages, for example, JavaScript and Ruby. In general, Github CodeQL has less False Positive rate reported in scripting language. For example, CodeQL has false positive rates at 44% compared to 62% false positives rates when scanning a repo written in script language.

  • Relatively high True Positive rates

A good false positive rate does not guarantee the tool is a good one. A SAST tool could produce a 0% false positive rate with zero vulnerability detection. When analyzing the CodeQL scanning results, we calculated that the True Positive detection rate is higher compared to other tools for most of the repos. For example, CodeQL scan reported 25 valid findings against 22  in one repo, and 5 versus 3 findings in another repo when comparing the results generated from one popular SAST tool.

Note: Some reported vulnerabilities are vulnerable but not really exploitable or reachable, under these scenarios, most of these types of vulnerabilities are categorized as  False Positives. 

3. Code Scanning: some vulnerability detections are intelligent

Most of the Code analysis SAST tools are using a set of rules to detect potential vulnerability when scanning the code.  Github CodeQL is NOT an exception. It is utilizing a set of predefined rules to detect the vulnerabilities.  Due to that, many security engineers and developers thinks SAST is just a dumb tool to perform a matching between the code and the rule set in order to detect a vulnerability. That argument is kind of true to a large extent.

However, we found that some vulnerability reported by CodeQL seems to be intelligent and these detections were only reported by CodeQL scanning. Here are a couple of examples based on some real detection we found

  • Inefficient regular expression detection

This detection is to check whether your regex pattern is potentially vulnerable to ReDOS attack.  For example, CodeQL is reporting the Inefficient regular expression  vulnerability against the following code in an open source library.

This is a true security issue, which has been ignored by other tools. An attacker could dramatically slow down the performance of a server with  a malicious string less than 100 characters. You could find a detailed post in another post.

  • Incomplete URL substring sanitization detection

Here is an example where Incomplete URL substring sanitization vulnerability is detected 

These detection looks simple but also intelligent from my perspective. There are many more other smart detections we found with CodeQL, I chose these two on purpose because I found these kinds of vulnerabilities are prevailing in many public Github repos when performing a brief code scanning against a tiny portion of open source repos.

4. Code Scanning: Easy to track the origin of the vulnerable code

This unique advantage makes the entire triaging process much easier and quicker as we could simply use the git blame function to track down which engineers committed the vulnerable code, what the vulnerable codes are supposed to accomplish and the corresponding Jira ticket to these changes.

After collecting all the related information of the vulnerable findings, we could tell the potential impact of the vulnerability, the potential remediation method and how quickly we could fix it. 

5. Code Scanning: multiple languages support in one scan 

Coverage is another factor when evaluating a code analysis SAST tool. Many SAST tools ask you to predefine the language before running the scans as it could only scan one language at a time. Whereas, CodeQL allows you to specify multiple languages and scan them in one scan without requiring some predefined settings.

Even for compiled languages, you could specify multiple builds for different languages in one scan. It is a really useful feature if you have some monolithic repos written with different languages and you want to cover all the code in the repo.

6. Secret Scanning: a powerful feature worthy a try

Secret Scanning is another feature we evaluate partly and we think it is a feature worthy of mention as there are some unique and true values when using it properly.

  • Scan your entire Git history on all presented branches

Github Secret scanning will scan your entire Git history on all branches present in your Github repos to find potential secrets exposed in your code bases. That is a huge difference compared with other tools, where the scan is performed against the main remote branch or local branch when scan is performed locally

  • Empower users to define its own secret patterns

If your secrets or token could not be detected by the Github defaults patterns. You could define custom patterns to identify secrets. 

  • Block Push containing suspected secrets.

Some developers might accidently add secrets into the code when pushing the changes to the remote branch. This could be prevented by enabling push protection, which allows Github to reject the push when the secret scanning finds any suspected secrets.  

7. Code Scanning: clear-text logging of sensitive data detection, a hidden gem 

Insecure logging could cause a security breach or incident in many cases. I  shared some thoughts in one of my blog posts. When analyzing all the code scanning results, It was refreshing to realize  that CodeQL has a detection function to check whether sensitive data is added into log files. 

I believe this detection has great values which are mostly underestimated by many SAST tools. From my experience as a security engineer and a penetration test, I found it is so common for engineers to add sensitive data into the log file for debug purposes, but they eventually forget to remove it before the changs deployed into production. As a consequence, they are collecting some sensitive data from customers by accident.  

With the help of this detection method, many logging issues could be detected before the code is pushed into a production environment.

Limitations in Github Advanced Security (GHAS)

Definitely, I could list more bright sides of how CodeQL is outperforming other tools. But I think it is important to remind people that GHAS, as a newly emerged and growing security tool, has some limitations as many other security tools do.

Here are some limitations that we could summarize from our evaluation.

Limitation 1: Insufficient disk space in Github-hosted runner to build large projects

We were able to use Github-hosted runner to build 14 services out of 15 select repositories. We had issues building one large project using Github-hosted runner as the build was hitting `not enough space on the disk` error all the time no matter how we customize the build commands.   After some analysis, we found the disk space allocated for the Github-hosted runner is really limited for the Windows runners. 

Suggestion: At this moment, there are two types of windows runner supported by Github, windows-2022 and windows-2019. If the Github team could assign specific roles for these two types of windows runner, it might be helpful to resolve the issue. For example, window-2022 should ONLY be used to build Dotnet projects with only Dotnet environment setup in this VM, whereas,window-2019 could be used for other types of build environments. 

Limitation 2: Certain frameworks are not supported in CodeQ

Though the code analysis tool CodeQL supports a large range of frameworks, certain frameworks are not well supported at the moment of the evaluation. For example, Ruby on Rails framework was not supported currently in the CodeQL and we saw some false negatives due to the lack of support for this framework. 

Limitation 3: Some False Positives could be filtered out

Some vulnerabilities reported by CodeQL are vulnerable, but the vulnerable piece of code will not be executed in any case because there are multiple validation or whitelisting methods applied before a user supplied input value to reach the vulnerable code. I believe this kind of filtering could be filtered out by tuning the detection method.

Limitation 4:  Current dependency vulnerability detection is too loose

In my opinion, the software composition analysis (3rd party dependency vulnerability) feature in GHAS is too loose because it mainly scans the package management files, like, pom.xml, package.json files to 1)extract the package name and version. 2)Identify the vulnerability based on the version number.  It means, Github Dependabot  will flag a vulnerability in your code even if you are not calling the vulnerable functions in the vulnerable dependency library.

It seems that the Github team is implementing some changes to check whether your codes are actually calling the vulnerable function rather than based on version number. Once this is fully rolled out, I believe that it will bring the dependency vulnerability detection to a totally new level.

Conclusion

Though Github Advance Security is a relatively new player in the security market, I could say that the code analysis tool CodeQL could compete with any other SAST tools in the market that I have evaluated so far.  With two separate evaluation experiences against GHAS, I observed such a huge improvement of scanning quality and new features adopted in it just in one and half years. That really surprises me and makes me believe the GHAS will be adopted by more and more organizations with the quality of detection and the speed or renovation in the tools.

Github Advance Security (GHAS) is not a silver bullet to catch all the issues by scanning in the code base as it has its own limitations, but this tool is clearly the best of the SAST tools that I have evaluated.

Leave a Reply

Your email address will not be published. Required fields are marked *