SoS Musings #72 - Making the Move to Memory-Safe Programming Languages
Software is ubiquitous in modern society, and with this comes the implicit trust that it will behave as intended and will not be hacked. Developers may subject code to extensive testing to ensure it can handle unexpected conditions, but memory safety issues are still a common source of exploitable software vulnerabilities. According to a presentation prepared by Microsoft in 2019, about 70 percent of software vulnerabilities stemmed from memory safety issues. Google also found that more than 70 percent of critical security vulnerabilities found in Chrome were memory safety problems. Software robustness and vulnerability prevention depend critically on memory management. By exploiting inadequate memory management, a malicious cyber actor may be able to crash the software or alter the instructions of the running program to do whatever they choose. Memory safety is an attribute of some programming languages that prevents programmers from introducing bugs associated with the way memory is used. Because memory safety bugs are a common source of security problems, languages that are memory-safe offer a higher level of protection than languages that are not. C and C++ have been the software industry's primary programming languages for decades. However, they lack the memory protections of other programming languages, such as C#, Go, Java, Python, Ruby, Rust, and Swift. According to the State of Software Security Vol. 11 report released by the application security company Veracode, 59 percent of C++ applications contain high-severity or critical-severity vulnerabilities.
Memory safety is a characteristic of programming languages that prevents memory safety bugs, such as out-of-bounds reads and writes and use-after-free bugs. To explain memory safety bugs, Prossimo, an Internet Security Research Group (ISRG) project, provides an example of an application that maintains to-do lists for users. If there is a ten-item to-do list and a request is made for the 11th item, there should be some type of error. There should also be an error if a request is made for the negative first item. A non-memory-safe language may allow a programmer to read whatever memory contents exist before or after the valid list contents, which is known as an out-of-bounds read. The memory preceding the first item on a list may be the last item on another person's list, while the memory following the final item on a list may be the first item on someone else's list. Access to this memory is a major security weakness. According to Prossimo, programmers can prevent out-of-bounds reads by comparing the requested item's index to the list's length. However, programmers make mistakes, so it is advantageous to use a memory-safe programming language that protects the programmer and their users from this type of bug by default. With an out-of-bounds write vulnerability in the same example, an attempt to modify the 11th or negative first item on a to-do list would alter someone else's to-do list. A use-after-free bug may involve accessing an item from a to-do list that has been deleted. For example, deleting a to-do list and then requesting the first item from that list should generate an error, as it should not be possible to fetch items from a deleted list. Non-memory-safe programming languages allow programs to fetch memory that they have declared themselves finished with but that may since have been reused for something else. The memory location may now contain somebody else's to-do list.
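The to-do list scenario above can be sketched in Python, one of the memory-safe languages named earlier. This is only an illustration under stated assumptions: the TodoStore class, its method names, and the sample data are hypothetical and do not come from Prossimo. The sketch compares the requested index to the list's length, as Prossimo suggests, and refuses to fetch items from a deleted list. Negative indices are rejected explicitly because Python would otherwise count them from the end of the same list (it still would not read neighboring memory).

```python
class TodoStore:
    """Hypothetical in-memory store of per-user to-do lists (illustrative only)."""

    def __init__(self):
        self._lists = {}

    def create(self, owner, items):
        self._lists[owner] = list(items)

    def delete(self, owner):
        del self._lists[owner]

    def get_item(self, owner, index):
        # Fetching from a deleted (or never-created) list is an error,
        # which is the outcome a use-after-free bug fails to provide.
        if owner not in self._lists:
            raise LookupError(f"no to-do list for {owner!r}")
        items = self._lists[owner]
        # Compare the requested index to the list's length. Negative
        # indices are rejected explicitly because Python would otherwise
        # count them from the end of the same list.
        if not 0 <= index < len(items):
            raise IndexError(f"index {index} outside 0..{len(items) - 1}")
        return items[index]


store = TodoStore()
store.create("alice", [f"task {n}" for n in range(1, 11)])  # ten items

print(store.get_item("alice", 0))  # the first item

for bad_index in (10, -1):  # the "11th" and "negative first" items
    try:
        store.get_item("alice", bad_index)
    except IndexError as err:
        print("rejected:", err)

store.delete("alice")
try:
    store.get_item("alice", 0)  # the list no longer exists
except LookupError as err:
    print("rejected:", err)
```

In a language without these protections, each of the rejected requests could instead return whatever happens to occupy the corresponding memory.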
These flaws could lead to unauthorized access to private data, data corruption, or even the execution of unauthorized code. An out-of-bounds read could result in the reading of adjacent memory blocks containing sensitive data. An out-of-bounds write can overwrite sensitive data in memory and lead to hijacking the program's control flow and executing privileged or malicious code. If memory-safe programming languages are used, these bugs are discovered during compile time or runtime. At compile time, they are flagged as errors, which the programmers can then fix. When detected at runtime, they cause crashes rather than allowing unchecked memory access, thereby limiting potential damage and averting security vulnerabilities.
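The difference between unchecked access and a runtime check can be modeled with a small simulation. The sketch below is an assumption-laden toy, not how any particular language is implemented: a single bytearray stands in for raw memory, two hypothetical users each own an 8-byte buffer placed side by side, and the function names are invented for illustration. One write function skips the bounds comparison, as a non-memory-safe program might; the other performs the comparison and fails instead of writing.

```python
BUF_SIZE = 8  # each hypothetical user owns an 8-byte buffer


def new_memory():
    # Alice's buffer occupies bytes 0..7, Bob's occupies bytes 8..15.
    return bytearray(b"A" * BUF_SIZE + b"B" * BUF_SIZE)


def unchecked_write(memory, base, offset, value):
    # Models a non-memory-safe write: the offset is never compared
    # to BUF_SIZE, so it can land in the neighboring buffer.
    memory[base + offset] = value


def checked_write(memory, base, offset, value):
    # Models a runtime bounds check: fail loudly instead of writing.
    if not 0 <= offset < BUF_SIZE:
        raise IndexError(f"offset {offset} outside buffer of {BUF_SIZE} bytes")
    memory[base + offset] = value


memory = new_memory()
unchecked_write(memory, base=0, offset=8, value=ord("X"))
print(memory)  # Bob's first byte has been overwritten

memory = new_memory()
try:
    checked_write(memory, base=0, offset=8, value=ord("X"))
except IndexError as err:
    print("stopped:", err)
print(memory)  # both buffers are unchanged
```

The checked version ends the operation with an error, which is the "crash rather than unchecked access" outcome described above.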
In November 2022, the National Security Agency (NSA) released a Cybersecurity Information Sheet titled "Software Memory Safety" to help software developers and operators prevent and mitigate software memory safety issues. NSA pointed out that the exploitation of memory issues allows malicious actors, who are not constrained by normal software usage expectations, to enter unusual data into the program, causing memory to be accessed, written, allocated, or deallocated in unexpected ways. In some instances, a malicious actor may exploit memory management errors to gain access to sensitive data, execute unauthorized code, or cause other damaging effects. Since it may require a great deal of experimentation with unusual inputs to find one that causes an unexpected response, actors may use a technique called "fuzzing" to randomly or intelligently generate a large number of input values to the program until they find one that causes it to crash. In recent years, advancements in fuzzing tools and techniques have made it easier for malicious actors to identify problematic inputs. When a threat actor discovers that a specific input can cause the program to crash, they analyze the code to determine what a specially crafted input could do. Such an input could, in the worst-case scenario, enable the actor to seize control of the system on which the program is running.
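The random variety of fuzzing described above can be sketched in a few lines of Python. Everything here is hypothetical: parse_record is an invented parser with a deliberately planted bug, and fuzz is a toy loop rather than a real fuzzing tool, which would typically use coverage feedback and smarter input generation. The parser trusts a length byte supplied in its input, and the loop feeds it random byte strings and collects the ones that make it fail.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical parser: the first byte declares the payload length."""
    if not data:
        return b""
    length = data[0]
    # Deliberate bug: the declared length is trusted without comparing
    # it to the actual input size, so this lookup can go out of bounds.
    _last = data[length]
    return data[1:1 + length]


def fuzz(target, trials=200, max_len=8, seed=0):
    """Feed random byte strings to target; return the inputs that crash it."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    crashes = []
    for _ in range(trials):
        size = rng.randint(0, max_len)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except Exception:
            crashes.append(data)
    return crashes


crashing_inputs = fuzz(parse_record)
print(f"{len(crashing_inputs)} of 200 random inputs crashed the parser")
```

Because Python checks bounds at runtime, each bad input here surfaces as an IndexError. The same parsing mistake in a language without bounds checks could instead read past the end of the input, which is the kind of behavior an attacker would go on to analyze.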
Experts have encouraged programmers to use memory-safe languages, such as C#, Go, Java, Ruby, Rust, and Swift, to prevent the introduction of certain types of memory-related issues, but it is important to consider the obstacles to widespread adoption of code written in these languages and the ways to overcome them. These obstacles are associated with education, trust, and personal preferences, as pointed out by Consumer Reports (CR) in a report titled "Future of Memory Safety." The CR report is based on an event held on October 27, 2022, in which participants shared memory safety-related resources, discussed opportunities and barriers in the security ecosystem, and brainstormed potential solutions to memory access vulnerabilities that exist in commercially available products. According to the report, some computer science courses require students to perform much of their systems-level work in the notoriously memory-unsafe programming language C. Professors should explain the dangers of C and similar programming languages, and possibly increase the weight given in grading to memory safety errors, which are prevalent in code written both inside and outside the classroom. Another option would be to switch between languages during different parts of these courses. Many also think some memory-safe languages, such as Rust, are more difficult to learn and to use with hardware, which may discourage people from learning them. However, it must be noted that most other memory-safe programming languages, such as Go, accomplish temporal memory safety through garbage collection, thus simplifying many programming aspects and making the languages much easier to learn. It is essential to acknowledge that some programmers may find memory-safe programming languages more challenging or be resistant to switching to them.
To mitigate this issue, it is important to explain that memory-safe programming languages force programmers to think more critically through concepts, which ultimately enhances their code's safety and performance. In some instances, executive-level concerns exist within an organization. In addition to distrusting new languages, management may be concerned that the associated tools will not function correctly, or that they are functional but less reliable and user-friendly than their C/C++ equivalents. Through joint partnerships, it may be useful to convey that changing languages now, rather than delaying the process, will result in lower costs and higher productivity.
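The earlier point about garbage collection providing temporal memory safety can be illustrated briefly. The sketch below uses Python rather than Go, and the TodoList class is hypothetical. It assumes the CPython interpreter, which combines reference counting with a cycle collector; other garbage-collected runtimes differ in when memory is reclaimed, but the guarantee shown is the same in spirit: an object is never freed while the program can still reach it, so there is nothing for a use-after-free bug to act on.

```python
import gc
import weakref


class TodoList:
    """Hypothetical to-do list object (illustrative only)."""

    def __init__(self, items):
        self.items = list(items)


original = TodoList(["buy milk", "file taxes"])
alias = original               # a second reference to the same object
probe = weakref.ref(original)  # observes the object without keeping it alive

del original                   # drop one reference; nothing is freed yet
print(alias.items[0])          # still valid: the object is reachable via alias

del alias                      # the last strong reference is gone
gc.collect()                   # ask the collector to run
print(probe() is None)         # the object has now been reclaimed
```

The programmer never calls a "free" operation, so the program cannot free memory too early and then keep using it.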
Despite the identified challenges and pushback associated with the adoption of memory-safe languages, the shift to such languages has produced significant improvements in memory safety. Data indicates that the growing prevalence of Java, C#, Rust, and other memory-safe programming languages has resulted in a decline of the entire class of vulnerabilities. Google revealed a significant drop in memory safety vulnerabilities on Android, and an associated drop in the severity of its vulnerabilities, following the shift in programming language usage away from memory-unsafe languages. The annual number of memory safety vulnerabilities decreased from 223 to 85 between 2019 and 2022. Android 13 is the first Android release in which most of the newly added code is written in a memory-safe programming language. As the amount of new memory-unsafe code entering Android has decreased, so has the number of memory safety vulnerabilities, which fell from 76 percent to 35 percent of Android's total vulnerabilities. 2022 marked the first year in which memory safety vulnerabilities no longer represented the majority of Android's vulnerabilities. Although correlation does not necessarily imply causation, the percentage of vulnerabilities caused by memory safety issues appears to correlate closely with the language used for new code.
Memory issues account for a significant share of exploitable software vulnerabilities. Experts encourage organizations to switch from programming languages with little or no inherent memory protection to memory-safe languages. As the NSA has pointed out, languages such as C#, Go, Java, Ruby, and Swift offer varying degrees of memory usage protection. Therefore, existing code hardening defenses, such as compiler settings, tool analysis, and operating system configurations, should still be used alongside them. Many memory vulnerabilities can be avoided, reduced, or made difficult to exploit by adopting memory-safe languages and code hardening protections.