C++ Code Safety using static Analysis Principles

C++ has traditionally prioritized performance, but the demand for strong safety guarantees is growing. Explore how static analysis can improve code safety, bridging the gap between Rust and C++. Discover the importance of correctness and the benefits of leveraging static analysis techniques for early defect detection and resolution. Dive into the challenges, learnings from Rust, and the actionable steps to enhance safety in C++ through static analysis principles.

  • C++
  • Code Safety
  • Static Analysis
  • Rust
  • Software Development

Uploaded on Feb 16, 2025 | 0 Views


Presentation Transcript


  1. C++ Code Safety using static analysis principles Sunny Chatterjee

  2. This talk: Why code safety matters; Static Analysis and first learnings from Rust; Bridging the safety gap: challenge and big buckets; Data from Microsoft production software; Where we are today; What's next for Microsoft C++ Static Analysis; Helpful links; Contributors

  3. Why code safety matters. For decades C++ has focused on performance. Customers and security researchers have been asking for strong safety guarantees in the language. It is not just performance and safety. Correctness is equally important! It's not just us. There are languages like Rust empowering everyone to write safe and reliable software. Challenge: Can we leverage the power of static analysis to make C++ a safe language?

  4. Static Analysis What? Reads in C/C++ code, applies some clever techniques to look for defects, and reports the defects found at compile time. Our focus is on local static analysis, which considers each function in isolation. Why? Drive quality upstream: find and help fix coding defects at the earliest point in the development cycle. Save $$$. It's been an effective technique for enforcing C++ Core Guidelines rules to empower developers to write modern C++. Proven success in finding a class of safety and reliability issues ranging from uninitialized memory to concurrency errors.

  5. First learnings from Rust. In Rust, safety features are built into the language. If broken, the resulting code will not compile. Equivalent features in C++ can be turned off. There isn't a 1:1 mapping between every Rust safety feature and the C++ Core Guidelines. Several Rust rules are too restrictive for existing C++ coding practices.

  6. Bridging the safety gap: The challenge. Identify actionable safety differences between Rust and C++ that: Can be statically checked. Can be powered by the same engine that builds existing static analysis rules. Are built on the existing type system in C++, language extensions in the Guidelines Support Library (GSL), and the C++ Core Guidelines. Exercise: Go over the big buckets of safety and correctness in Rust. Find corresponding rules in the C++ standard and/or Core Guidelines. Implement missing checks using static analysis principles.

  7. The big buckets 1. Casting (Type Safety) 2. Switch statements (Correctness) 3. Smarter loops (Bounds safety) 4. Smarter copying (Performance) 5. Lifetimes (Reliability) 6. Mutability

  8. Rust Does not allow implicit casting among primitive types. fn takeInt(a: i32) -> u32 { a // Error, attempted conversion of i32 --> u32 } fn floatToInt() { let float: f32 = 3.2; takeInt(float); // Error, attempted conversion from f32 --> i32 }

  9. C-style casting allowed for some types using the `as` keyword. fn cstyleCast() { let a: i32 = 128; let b: i8 = a as i8; // -128 } No conversion allowed from integer to bool. use std::convert::TryFrom; fn inttoBool() { let a: i32 = 32; let b = a as bool; // error: cannot cast as `bool` let c = bool::try_from(a).ok(); // error: the trait `std::convert::From<i32>` is not implemented if (a) {} // error: expected `bool`, found `i32` if (a != 0) {} // much better... }

  10. Safe casting can be achieved using `TryFrom` and `TryInto` traits use std::convert::TryFrom; fn foo(a: i32) { // Using a default value let b = match i8::try_from(a) { Ok(i) => i, // value fit, success! Err(_) => 42 // use default value }; // Using Option (std::optional) let c = i8::try_from(a).ok(); match c { Some(i) => println!("success! {}", i), None => println!("failed!") } }

  11. C++ 1. Don't use reinterpret_cast (type.1) 2. Don't use static_cast downcasts (type.2) 3. Don't use const_cast to cast away const or volatile (type.3) 4. Don't use C-style casts (type.4) 5. Don't cast between pointer types when conversion could be implicit (type.1) 6. Do not use function-style C casts (es.49)
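
The Rust `TryFrom` pattern from the earlier slides has no direct standard C++ counterpart, but a checked integer conversion can be sketched in a few lines. The helper below is hypothetical (`try_narrow` is not part of the standard library or the GSL); it assumes C++17 and integer types only, and it uses a round-trip comparison plus a sign check, which is one common way to detect lossy narrowing.

```cpp
#include <cstdint>
#include <optional>
#include <type_traits>

// Hypothetical helper: returns the converted value only when the conversion
// is lossless, and std::nullopt otherwise (similar in spirit to Rust's
// `i8::try_from(a).ok()`).
template <typename To, typename From>
std::optional<To> try_narrow(From value) {
    static_assert(std::is_integral_v<To> && std::is_integral_v<From>,
                  "this sketch only handles integer types");
    const To converted = static_cast<To>(value);
    // The value must convert back to exactly what we started with.
    if (static_cast<From>(converted) != value) {
        return std::nullopt;
    }
    // A sign flip (for example -1 becoming a large unsigned value) also fails.
    if ((converted < To{}) != (value < From{})) {
        return std::nullopt;
    }
    return converted;
}
```

A caller can then pick a default with `try_narrow<std::int8_t>(a).value_or(42)`, mirroring the `match` with a default arm in the Rust example.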

  12. The big buckets 1. Casting (Type Safety) 2. Switch statements (Correctness) 3. Smarter loops (Bounds safety) 4. Smarter copying (Performance) 5. Lifetimes (Reliability) 6. Mutability

  13. Rust Has a pattern matching construct that covers similar functionality to C++ switch (and more) match my_i32() { 1 => do_smt(), 2 => do_smt(), _ => do_smt() // won't compile without this line (or exhaustively checking from MIN to MAX) } enum T { A, B, C } ... match my_T() { A => do_smt(), B => do_smt(), C => do_smt() // compiles because all cases have been covered }

  14. C++ 1. Switch statements over a non-enum type should have a default (es.79, es.70) 2. Switch statements over an enum type should either have a default or cover all cases (es.79, enum.2, es.70) 3. Check implicit fall-through (es.78) Rust does not allow fall-through in matchers.
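
As a minimal sketch of the three switch rules above (all names here are made up for illustration, C++17 assumed): the first function covers every enumerator without a `default`, so a compiler or analyzer that reports unhandled enumerators can flag it when the enum grows; the second switches over an `int`, so it has a `default` and marks its one intentional fall-through explicitly.

```cpp
#include <string_view>

enum class Shape { Circle, Square, Triangle };

// Enum switch: all enumerators are covered, so no `default` is needed.
constexpr std::string_view name_of(Shape s) {
    switch (s) {
        case Shape::Circle:   return "circle";
        case Shape::Square:   return "square";
        case Shape::Triangle: return "triangle";
    }
    return "unknown"; // only reached for values outside the enumerators
}

// Non-enum switch: a `default` is present, and the deliberate fall-through
// is spelled out with the C++17 [[fallthrough]] attribute.
constexpr int score(int level) {
    int total = 0;
    switch (level) {
        case 2:
            total += 10;
            [[fallthrough]]; // level 2 also earns the level 1 bonus
        case 1:
            total += 1;
            break;
        default:
            break;
    }
    return total;
}
```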

  15. Demo

  16. The big buckets 1. Casting (Type Safety) 2. Switch statements (Correctness) 3. Smarter loops (Bounds Safety) 4. Smarter copying (Performance) 5. Lifetimes (Reliability) 6. Mutability

  17. Rust Does not have C-style for loops. Manually controlling each loop element is complicated and error prone. Forces the use of the range-checked iterator pattern. Scopes loop variables to the loop itself. // Example from the book let a = [10, 20, 30, 40, 50]; for element in a.iter() { println!("the value is: {}", element); } // Range-checked reverse iteration with step for number in (1..11).rev().step_by(2) { println!("{}", number); // 10 8 6 4 2 }

  18. C++ 1. Flag loop index variables declared outside of the loop (es.74) // should be `for (int i = 0; i < 10; ++i)` instead int i; for (i = 0; i < 10; ++i) { ... } // use `if (int x = func_call())` instead int x = func_call(); if (x) { ... }

  19. 2. Flag reuse of loop index variables (es.76) // Same loop index variable is used within two loops. // The variable is reset/set within the second loop. // The variable is not read in the gap between the two loops. void use() { int i; for (i = 0; i < 20; ++i) { /* ... */ } for (i = 0; i < 200; ++i) { /* ... */ } // bad: i recycled }

  20. 3. Convert C-style for loops into range-for loops (es.71) // 1: A typical C-style for loop for (int i = 0; i < vec.size(); ++i) { foo(vec[i]); } // 2: A more predictable pattern for (auto it = vec.begin(); it != vec.end(); ++it) { foo(*it); } // 3: Recommended for (const auto& item: vec) { foo(item); } Pattern matched: 1 loop index variable. Condition: i < vec.size() or i != vec.size(). Incrementor: ++. `i` only used for indexing: no pointer arithmetic, no vec[i+1], no assignment to it. No side effects on the container (example: push_back). Warning XXXX: Don't use C-style for loop. Use for-range instead.
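
The conversion described on this slide can be sketched as follows. The function names are illustrative only. The first two functions compute the same result, and the second is the form the check suggests; the third shows a loop that does not fit the pattern because its body grows the container, so it keeps an index loop over a fixed bound.

```cpp
#include <cstddef>
#include <vector>

// Index-based loop: `i` is only used to index `vec`.
int sum_indexed(const std::vector<int>& vec) {
    int total = 0;
    for (std::size_t i = 0; i < vec.size(); ++i) {
        total += vec[i];
    }
    return total;
}

// Equivalent range-for loop: no index to get wrong.
int sum_ranged(const std::vector<int>& vec) {
    int total = 0;
    for (const int item : vec) {
        total += item;
    }
    return total;
}

// Not a candidate: push_back may reallocate, which would invalidate the
// iterators a range-for relies on.
void duplicate_all(std::vector<int>& vec) {
    const std::size_t original_size = vec.size();
    for (std::size_t i = 0; i < original_size; ++i) {
        const int copy = vec[i]; // copy first, then grow the container
        vec.push_back(copy);
    }
}
```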

  21. The big buckets 1. Casting (Type Safety) 2. Switch statements (Correctness) 3. Smarter loops (Bounds Safety) 4. Smarter copying (Performance) 5. Lifetimes (Reliability) 6. Mutability

  22. Rust Move-by-default. Copying semantics must be explicit. Big part of the overarching approach to memory management: lifetimes, ownership, borrowing. Allows Rust to avoid memory leaks and dangling references and provide compile time safety in many contexts. let a = ... big structure ...; let b = a; // Move, rather than a copy // `a` can no longer be used. Compiler error! // Instead, must copy explicitly let b = a.clone();

  23. C++ 1. Flag range-for loops that do unnecessary copying (es.71) Identify expensive copy in range-for and suggest using a reference. // Consider the following code vector<ComplexType> v = ...; for (ComplexType x: v) { // Copy occurs on each iteration // `x` never mutated ... } // This should now become for (const ComplexType& x: v) { // `x` never mutated ... } Expensive: - Object being copied is larger than 2x the platform-dependent pointer size. - Object is not a view (gsl::span, gsl::string_span, std::string_view). - Object is not a smart pointer. Loop variable is not mutated inside the loop body.
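
To make the cost visible, here is a small sketch with a hypothetical `Tracked` type that counts its own copies (C++17 assumed for the inline static member). Iterating by value copies every element once; iterating by `const` reference copies none.

```cpp
#include <vector>

// Hypothetical element type, larger than two pointers, that counts copies.
struct Tracked {
    static inline int copies = 0;
    int value = 1;
    char padding[64] = {};

    Tracked() = default;
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};

// Copies each element into `x` on every iteration.
int sum_by_value(const std::vector<Tracked>& v) {
    int total = 0;
    for (Tracked x : v) {
        total += x.value;
    }
    return total;
}

// Binds `x` to the element in place; no copy is made.
int sum_by_const_ref(const std::vector<Tracked>& v) {
    int total = 0;
    for (const Tracked& x : v) {
        total += x.value;
    }
    return total;
}
```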

  24. Demo

  25. 2. Suggest using `const auto &` instead of `auto` when assigning from a reference (P.9) Rust example // Rust's copy semantics must be explicit. // `auto` defaults to a reference when assigned with a reference. let a = fnThatReturnsAReference(); // `a` is a reference, no copying C++ `auto` takes on a value for references auto a = fnThatReturnsAReference(); // If fn returns `T&`, then the type of `a` is `T` auto b = fnThatReturnsAPointer(); // If fn returns T*, b is of type T* Ramifications: Confusing to new C++ users. Arguably inconsistent with pointers. Very easy to make an expensive copy by forgetting the `&` from `auto`.
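
The deduction rule on this slide can be observed directly. In the sketch below, `Config` and `global_config` are hypothetical names; the copy counter shows that plain `auto` strips the reference and copies, while `const auto&` binds to the existing object.

```cpp
#include <cstddef>
#include <string>

// Hypothetical type that counts how many times it is copied.
struct Config {
    static inline int copies = 0;
    std::string name = "default";

    Config() = default;
    Config(const Config& other) : name(other.name) { ++copies; }
};

// Hypothetical accessor returning a reference to a long-lived object.
const Config& global_config() {
    static const Config instance;
    return instance;
}

std::size_t name_length_with_auto() {
    auto cfg = global_config();        // deduced as Config: one copy
    return cfg.name.size();
}

std::size_t name_length_with_const_ref() {
    const auto& cfg = global_config(); // deduced as const Config&: no copy
    return cfg.name.size();
}
```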

  26. Demo

  27. The big buckets 1. Casting (Type Safety) 2. Switch statements (Correctness) 3. Smarter loops (Bounds Safety) 4. Smarter copying (Performance) 5. Lifetimes (Reliability) 6. Mutability

  28. Rust Rust's memory model makes it hard to accidentally leak memory. C++ C++ Core Check supports the lifetime profile. Statically checks against invalidated iterators. No checks for "never return a pointer to a local" (F.43). struct A { int* a; }; void foo(std::vector<A>& v) { int x = 42; v.back().a = &x; // Uh-oh }
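
One way to avoid the dangling pointer in the slide's `foo` is to let the struct own its data, and one way to avoid an invalidated iterator is to continue from the iterator that `erase` returns. The sketch below is illustrative only (`Owner`, `set_last`, and `remove_negatives` are made-up names) and is not a description of what the analyzer itself does.

```cpp
#include <memory>
#include <vector>

// The struct owns the integer, so the pointee lives as long as the struct.
struct Owner {
    std::unique_ptr<int> value;
};

// Counterpart of the slide's foo(): stores an owned heap value instead of
// the address of a local variable.
void set_last(std::vector<Owner>& v) {
    if (v.empty()) {
        return;
    }
    v.back().value = std::make_unique<int>(42);
}

// Erasing while iterating: `erase` invalidates `it`, so continue from the
// iterator it returns instead of incrementing the old one.
void remove_negatives(std::vector<int>& v) {
    for (auto it = v.begin(); it != v.end();) {
        if (*it < 0) {
            it = v.erase(it);
        } else {
            ++it;
        }
    }
}
```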

  29. Demo

  30. The big buckets 1. Casting 2. Switch statements 3. Smarter loops 4. Smarter copying 5. Lifetimes 6. Mutability

  31. Rust Immutable by default. let a = 3; a += 2; // Error! let mut b = 2; Forbids having a mutable and immutable reference to the same object in the same scope. Helps make guarantees around data race safety. let mut var = 3; let const_ref = &var; let mut_ref = &mut var; *mut_ref += 1; foo(const_ref); // Borrowing an immutable version while it's mutated above is invalid!

  32. C++ Marking immutable data as `const` viewed as a good programming practice. Core Guidelines has const-correctness rules. Example: Variable is assigned only once, mark it as `const` (con.4) Mutable and immutable reference in same scope. void foo() { int var = 3; int& mut_ref = var; const int& const_ref = var; ++mut_ref; std::cout << const_ref << std::endl; } Is 4 *really* expected?
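
The aliasing question on this slide can be checked with a small sketch. The first function mirrors the slide and returns what the `const` reference sees after the mutation; the second takes a `const` copy first, which is one way to get a value that really cannot change. Function names are illustrative.

```cpp
// Mirrors the slide: `const_ref` is a read-only view, not a frozen value,
// so it observes the increment made through `mut_ref`.
int read_through_const_alias() {
    int var = 3;
    int& mut_ref = var;
    const int& const_ref = var;
    ++mut_ref;
    return const_ref;
}

// A `const` copy is independent of later changes to `var`.
int read_from_const_snapshot() {
    int var = 3;
    const int snapshot = var;
    int& mut_ref = var;
    ++mut_ref;
    return snapshot;
}
```

Here `read_through_const_alias()` returns 4 and `read_from_const_snapshot()` returns 3, which is the distinction Rust's borrow rules enforce at compile time.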

  33. Bridging the gap: Summary. Identified actionable safety differences between Rust and C++. Covered the big buckets of safety and correctness in Rust. The C++ Core Guidelines have rules covering many of the big-ticket items. Implemented missing checks in Visual Studio 2019. More work needed in the toolchain: loops, lifetimes, borrowing. A strict mode in the compiler enforcing safety rules at compile time would be ideal.

  34. Data from Microsoft production software Security vulnerabilities Performance problem Logical errors Interesting patterns

  35. Security vulnerability from Microsoft production software HRESULT CMessage::ReceiveMemory( PVOID pData, ULONG cbData ) { PBYTE pMemData; ULONG cbMemData; HRESULT hr = this->m_Msg->GetMemory( &pMemData, &cbMemData ); ASSERT( SUCCEEDED( hr )); if ( this->m_CurrentMemPos + cbData > cbMemData ) return E_UNEXPECTED; CopyMemory( pData, pMemData + this->m_CurrentMemPos, cbData ); this->m_CurrentMemPos += cbData; return hr; } Warning XXXXX: Potential read overflow using pMemData. Buffer is apparently unbounded by buffer size. A missing bounds check before the call to CopyMemory resulted in a vulnerability.

  36. Security vulnerability mitigation HRESULT CMessage::ReceiveMemory( gsl::span<BYTE> Data, ULONG cbData ) { gsl::span<BYTE> MemData; // MemData is passed by reference HRESULT hr = this->m_Msg->GetMemory( MemData ); ASSERT( SUCCEEDED( hr )); if ( this->m_CurrentMemPos + cbData > (ULONG)MemData.size() ) return E_UNEXPECTED; gsl::copy(MemData.subspan(this->m_CurrentMemPos, cbData), Data); this->m_CurrentMemPos += cbData; return hr; } Both Data and MemData are protected here: DoS instead of RCE.
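
The same idea can be sketched with only the standard library. `checked_read` below is a hypothetical stand-in for the slide's `ReceiveMemory`, with `std::vector` playing the role of the bounded buffers. One detail is an assumption on my part rather than something the slide states: a check written as `position + count > size` can wrap around for very large unsigned values, so this sketch compares against `size - position` instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies `count` bytes from `source`, starting at `position`, into `dest`.
// Returns false and leaves everything untouched if either buffer is too
// small. On success, `position` advances by `count`.
bool checked_read(const std::vector<std::uint8_t>& source,
                  std::size_t& position,
                  std::vector<std::uint8_t>& dest,
                  std::size_t count) {
    // Overflow-safe form of "position + count > source.size()".
    if (position > source.size() || count > source.size() - position) {
        return false;
    }
    // The destination is bounds-checked as well.
    if (count > dest.size()) {
        return false;
    }
    std::copy_n(source.begin() + static_cast<std::ptrdiff_t>(position),
                count, dest.begin());
    position += count;
    return true;
}
```

A failed check turns a potential out-of-bounds read into a reported error, which is the "DoS instead of RCE" trade-off the slide describes.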

  37. Performance: Expensive Copy for (auto item : CommonClassTable) { if (String::Equals(item.ClassName, key, StringComparison::OrdinalIgnoreCase)) { value = item.ClassLocation; // If the class is a driver and a UP driver is // specified, then put the driver in the UP // subdirectory. // // Do the same for retail. We assume the -u switch is passed // only when actually needed. Logging::DebugMessage(L"PlaceTheFile", L"After checking for classes via ClassTablePointer. ClassMatch: true\n"); return value; } } Warning: Mark `item` as `const auto &`

  38. Where we are today. Static analysis tools have been in use at Microsoft to validate millions of lines of production C++ code for nearly two decades. All checks you saw today are available in Visual Studio 2019. Tools are run by: Visual Studio IDE (in the background or as an explicit task); Build system. Simple checks are built upon an abstract syntax tree (AST) layer: Provides a plugin model for writing checks that consume program data from the C/C++ compiler frontend. Many checks require dataflow and path-sensitive analysis: Built on top of a control flow graph (CFG) layer on top of ASTs. Product of 10+ years of research.

  39. What's next for Microsoft C++ Static Analysis. C++ Code Safety Checkers: More safety rules around bounds, type, and lifetime safety. Meet the tooling needs of the community as the language evolves: gsl::span is compliant with std::span (C++20). Rolled out lifetime-related rules when using coroutines (C++20). Continue to invest in the Microsoft GSL library. Improved diagnostics: Easy warning descriptions. Improved source locations. Path highlights. Suppress warnings from external includes.

  40. What's next for Microsoft C++ Static Analysis. Tighter integration with developer workflows: Inner loop (VS design time experience, Compiler Explorer). Outer loop (GitHub Security). Investigate opening the Microsoft Code Analysis architecture to 3rd party plugins: Built on top of modern AST APIs. Make the platform accessible to the C++ community. Please reach out to us and provide feedback.

  41. Join the fun! Join us: give feedback and suggestions! Contribute checks, bug reports, fixes, tests, ideas. Resources: C++ Team Blog: https://devblogs.microsoft.com/cppblog Code Analysis docs: https://docs.microsoft.com/en-us/cpp/code-quality/code-analysis-for-c-cpp-overview Clang-Tidy: http://clang.llvm.org/extra/clang-tidy Core Guidelines: https://github.com/isocpp/CppCoreGuidelines GSL: https://github.com/Microsoft/GSL SARIF: https://github.com/sarif-standard

  42. Contributors: C++ Core Guideline Editors, Daniel Winsor, Dmitry Kobets, Gabor Horvath, Hwi-sung Im, Jordan Maples, Phil Christensen

  43. Questions?
