Still Using an Open Source Code Scanner to Identify Your Open Source? It’s 2019. You Can Do Much Better.

January 15, 2018 Ron Rymon

Open Source Scanning Software

“Open source is free.” Why the quotation marks? Most organizations understand by now that open source is free to use, modify, and redistribute, but it comes with a condition: you must abide by the license under which it was granted.

Many organizations have become acquainted with the legal requirements of open source licenses through open source audits. These audits usually arise during an M&A, an IPO or a new investment round, when external parties require open source inventory reports for review. In other cases, the requirement is raised internally by legal counsel, security auditors or compliance officers.

But gathering this kind of information manually can become very time-consuming, especially if your organization uses a lot of open source in its products. After all, open source nowadays makes up 60-80% of the total code base, which underlines the value of overseeing and managing your open source inventory from the get-go.

Up until three years ago, most enterprises tried to ensure that their open source usage was compliant by running periodic audits with open source scanning tools. In recent years, however, the market has visibly shifted: most open source code scanners have either changed their approach or lost their entire customer base.

How It All Began

To help organizations with their open source audits, a startup named Black Duck Software introduced the first open source scanning solution back in 2002. It identified the open source components included in a product, along with their underlying licenses. Soon after, several additional vendors joined the party, including Protecode, Palamida and Open Logic, offering open source code scanners to overcome the open source discovery challenge. In general, these scanners would scan the code and identify pieces of code (also known as “snippets”) that resembled code appearing in open source components. Users were alerted to the similarity and then had to review each alert individually.

Initially, open source scanners seemed to have revolutionized the way organizations monitored and managed their open source inventory. However, it didn’t take long for users to realize that open source scanners were born with serious flaws, and that scanning a code base would not be as easy and automated as they had initially believed. It was actually the other way around.

The 3 Pitfalls of Scanner-Based Open Source License and Security Management Solutions

Instead of making the process of open source management easier, open source scanners may have brought more challenges. We’ve highlighted the 3 main pitfalls of the solution below:

Pitfall #1: The Never-ending Tale of False Positives

One of the main challenges that arise when using an open source scanner is the number of “false positive” alerts it produces. These are alerts for snippets that appear to match open source code but, on a closer look, turn out not to be part of any open source component. The scanning solution flags them anyway, and the development team then has to rule them out one by one.

You may think, “A few false positives here and there can happen, as long as the majority is correct.” The truth of the matter is that these false positives are usually only manageable if the total number of open source components used in your products is limited. However, over the years, the number of open source components has been growing exponentially – in our database alone, we’re talking today about more than 155 million open source components (both source and binary) in languages such as Java, Ruby, and Python, and another 11 billion open source files in languages such as C/C++, JavaScript, PHP, and Objective-C. With so many open source components available to your development teams, you are guaranteed to have a certain number of coincidental snippet matches. And when we say a certain number, we’re referring to thousands (we’re not exaggerating!).
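To make the idea of a coincidental match concrete, here is a minimal sketch of snippet matching. It is a toy illustration under stated assumptions, not how any particular vendor’s scanner works: the function names, the three-line window and the two sample strings are all hypothetical.

```python
import hashlib


def snippet_fingerprints(source, window=3):
    """Hash every run of `window` consecutive non-blank lines (toy snippet model)."""
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    prints = set()
    for start in range(len(lines) - window + 1):
        block = "\n".join(lines[start:start + window])
        prints.add(hashlib.sha1(block.encode("utf-8")).hexdigest())
    return prints


def shared_snippets(a, b, window=3):
    """Count the snippets that two pieces of code have in common."""
    return len(snippet_fingerprints(a, window) & snippet_fingerprints(b, window))


# Hypothetical in-house code and a hypothetical open source file that were
# written independently but both contain the same everyday summing loop.
PROPRIETARY = """
total = 0
for item in items:
    total += item
return total
"""

OPEN_SOURCE = """
count = len(items)
total = 0
for item in items:
    total += item
print(count)
"""

matches = shared_snippets(PROPRIETARY, OPEN_SOURCE)  # 1
```

The single match here is the ordinary summing loop, which nobody copied from anywhere. Multiply that kind of everyday idiom by billions of indexed files and the volume of alerts to review grows accordingly.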

Bottom line, sifting through these false positives can be very tedious and time-consuming work, and time is especially scarce in the days leading up to a release or during an M&A due diligence process.

Pitfall #2: Agile SDLC Process? Not With An Open Source Scanner

Pitfall #1 clearly underlined the time-consuming aspect of scanning for open source components. With this, pitfall #2 steps into the spotlight – scanning for open source components just cannot be done on a continuous basis. Software development teams today are trying harder than ever to increase the agility of their SDLC by releasing new versions more frequently and under tighter deadlines. Open source scanners bring this accelerated pace to an unwanted halt. Automated open source scanning processes can take weeks to complete, and are then followed by pitfall #1 (very (very) lengthy reviews of alerts). Even if your organization is using the waterfall development model, this can still introduce significant delays to your release schedule.

Let’s move one step further: you’ve had the patience to wait for the scan of your code to complete, and it turns out it’s right before a release. What if you find that an open source component was used under a license that is not in line with your organization’s policy? Or a component with a severe security vulnerability? These delayed results can hurt your agility, since you now need to focus on removing the problematic component, a process that can often take even more time. Bottom line, with this way of working, your agility and speed are at risk.

Pitfall #3: Time Is of the Essence With Security Vulnerabilities

Last, but certainly not least, is Pitfall #3 – Security Vulnerabilities. We’ve all come to learn, one way or another, that when a security vulnerability becomes known, it is critical to fix it as soon as possible because that is when potential attackers are best positioned to exploit it. This holds true for both proprietary and open source code. Unfortunately, in a scanner-based paradigm, you will only learn about your open source vulnerabilities the next time you perform a scan, which, as we already explained, could turn out to be months later. Even worse, if your solution is deployed on-premises, you will also not be aware of your exposure until its vulnerability database has been updated. Bottom line – if your solution is not continuous (and if you’re using open source scanners, this applies to you!), your customers will remain vulnerable much, much longer.

If Not Open Source Scanners… Then What?

Open source scanners may have been a good first solution when they were introduced, but with the ever-increasing adoption of open source components in applications, as well as the agility and pace organizations are working at nowadays, open source scanners just can’t make the cut.

Properly managing your open source components does not need to be such a hassle. Newer solutions have long since been introduced that offer an effective, time-saving and, most importantly, continuous approach to managing your open source usage. The first agile open source management solution, introduced by WhiteSource back in 2011, integrates fully into your SDLC. Its plugins calculate digital signatures for all the open source components in your repositories and builds, without ever needing to scan your code line by line. These signatures are then cross-referenced with WhiteSource’s comprehensive database to identify all your open source components, including all dependencies, and provide insightful and actionable information relating to licenses, security vulnerabilities, newer versions and quality issues.
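As a rough illustration of the signature-based idea, the sketch below hashes whole component files and looks each digest up in an index. It is only a sketch under stated assumptions: the use of SHA-1, the dictionary standing in for the vendor database, and the names `file_signature` and `identify` are all hypothetical and do not describe WhiteSource’s actual plugins or API.

```python
import hashlib


def file_signature(path):
    """Return a SHA-1 hex digest of a file's bytes (hash choice is an assumption)."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large binaries do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify(paths, index):
    """Split paths into known components and unknown files.

    `index` is a hypothetical stand-in for a component database:
    a dict mapping digest -> metadata such as name, version and license.
    """
    matched, unknown = {}, []
    for path in paths:
        info = index.get(file_signature(path))
        if info is None:
            unknown.append(str(path))
        else:
            matched[str(path)] = info
    return matched, unknown
```

Because a whole-file digest either matches a known component exactly or does not match at all, this kind of lookup does not produce the partial, coincidental matches that snippet comparison does. The trade-off in this simplified form is that a modified copy of a component would not be recognized.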

Soon after WhiteSource pioneered this new approach, a number of other companies followed suit, realizing the vast array of benefits behind this technology. Gone are the days of thousands of time-consuming false positives, long scanning periods, and risky exposure windows to security vulnerabilities. We’ve summed up the benefits of the solution in the image below:

It’s Time To Move Forward To Agile Open Source Management

It’s time to acknowledge that open source scanners need to be respectfully laid to rest, and to move on to the newer generation of Software Composition Analysis tools that were developed to shift open source management left.

In today’s agile development environment, you simply cannot continue to rely on open source code scanners. And for those of you who are still following a waterfall model – the new tools require a lot less effort from your development and DevOps teams, provide a lot more functionality and consume a fraction of the time. As a saying often attributed to Seneca goes, “Every new beginning comes from some other beginning’s end.”
