<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Open Projects</title>
  <link type="text/css" rel="stylesheet" href="menu.css">
  <link type="text/css" rel="stylesheet" href="content.css">
  <script type="text/javascript" src="scripts/menu.js"></script>
</head>
<body>

<div id="page">
<!--#include virtual="menu.html.incl"-->
<div id="content">

<h1>Open Projects</h1>

<p>This page lists several projects that would improve the analyzer's usability
and power. Most of the projects listed here are infrastructure-related, so this
list complements the <a href="potential_checkers.html">potential checkers
list</a>. If you are interested in tackling one of these, please send an email
to the <a href="http://lists.llvm.org/mailman/listinfo/cfe-dev">cfe-dev
mailing list</a> to notify other members of the community.</p>

<ul>
  <li>Core Analyzer Infrastructure
  <ul>
    <li>Explicitly model standard library functions with <tt>BodyFarm</tt>.
    <p><tt><a href="http://clang.llvm.org/doxygen/classclang_1_1BodyFarm.html">BodyFarm</a></tt>
    allows the analyzer to explicitly model functions whose definitions are
    not available during analysis. Modeling more of the widely used functions
    (such as the members of <tt>std::string</tt>) will improve the precision of
    the analysis.
    <i>(Difficulty: Easy, ongoing)</i></p>
    </li>

    <li>Handle floating-point values.
    <p>Currently, the analyzer treats all floating-point values as unknown.
    However, we already have most of the infrastructure we need to handle
    floats: RangeConstraintManager. This would involve adding a new SVal kind
    for constant floats, generalizing the constraint manager to handle floats
    and integers equally, and auditing existing code to make sure it doesn't
    make unwarranted assumptions.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Implement generalized loop execution modeling.
    <p>Currently, the analyzer simply unrolls each loop <tt>N</tt> times. This
    means that it will not execute any code after the loop if the loop is
    guaranteed to execute more than <tt>N</tt> times. This results in lost
    basic block coverage. We could continue exploring the path if we could
    model a generic <tt>i</tt>-th iteration of a loop.
    <i>(Difficulty: Hard)</i></p>
    </li>

    <li>Enhance CFG to model C++ temporaries properly.
    <p>There is an existing implementation of this, but it is incomplete and
    is disabled in the analyzer.
    <i>(Difficulty: Medium; current contact: Alex McCarthy)</i></p>
    </li>

    <li>Enhance CFG to model exception-handling properly.
    <p>Currently, exceptions are treated as "black holes", and exception-handling
    control structures are poorly modeled (to be conservative). This could be
    much improved for both C++ and Objective-C exceptions.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Enhance CFG to model C++ <code>new</code> more precisely.
    <p>The current representation of <code>new</code> does not provide an easy
    way for the analyzer to model the call to a memory allocation function
    (<code>operator new</code>) and then the initialization of the result with
    a constructor call. The problem is discussed at length in
    <a href="http://llvm.org/bugs/show_bug.cgi?id=12014">PR12014</a>.
    <i>(Difficulty: Easy; current contact: Karthik Bhat)</i></p>
    </li>

    <li>Enhance CFG to model C++ <code>delete</code> more precisely.
    <p>Similarly, the representation of <code>delete</code> does not include
    the call to the destructor, followed by the call to the deallocation
    function (<code>operator delete</code>). One particular issue
    (<tt>noreturn</tt> destructors) is discussed in
    <a href="http://llvm.org/bugs/show_bug.cgi?id=15599">PR15599</a>.
    <i>(Difficulty: Easy; current contact: Karthik Bhat)</i></p>
    </li>

    <li>Implement a BitwiseConstraintManager to handle <a href="http://llvm.org/bugs/show_bug.cgi?id=3098">PR3098</a>.
    <p>Constraints on the bits of an integer are not easily representable as
    ranges. A bitwise constraint manager would model constraints such as "bit 32
    is known to be 1". This would help with code that makes use of bitmasks.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Track type info through casts more precisely.
    <p>The DynamicTypePropagation checker is in charge of inferring a region's
    dynamic type based on what operations the code is performing. Casts are a
    rich source of type information that the analyzer currently ignores. They
    are tricky to get right, but might have very useful consequences.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Design and implement alpha-renaming.
    <p>Unify two symbolic values along a path once a comparison determines
    them to be equal. This would allow us to reduce the number of false
    positives and would be a step toward more advanced analyses, such as
    summary-based interprocedural and cross-translation-unit analysis.
    <i>(Difficulty: Hard)</i></p>
    </li>
  </ul>
  </li>

  <li>Bug Reporting
  <ul>
    <li>Add support for displaying cross-file diagnostic paths in HTML output
    (used by <tt>scan-build</tt>).
    <p>Currently <tt>scan-build</tt> output does not display reports that span
    multiple files. The main problem is that we do not have a good format to
    display such paths in HTML output. <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Refactor path diagnostic generation in <a href="http://clang.llvm.org/doxygen/BugReporter_8cpp_source.html">BugReporter.cpp</a>.
    <p>It would be great to have more code reuse between "Minimal" and
    "Extensive" PathDiagnostic generation algorithms.
    One idea is to create an
    IR for representing path diagnostics, which would later be used to
    generate minimal or extensive report output. <i>(Difficulty: Medium)</i></p>
    </li>
  </ul>
  </li>

  <li>Other Infrastructure
  <ul>
    <li>Rewrite <tt>scan-build</tt> (in Python).
    <p><i>(Difficulty: Easy)</i></p>
    </li>

    <li>Do a better job interposing on a compilation.
    <p>Currently, <tt>scan-build</tt> just sets the <tt>CC</tt> and <tt>CXX</tt>
    environment variables to its wrapper scripts, which then call into an
    underlying platform compiler. This is problematic for any project that
    doesn't exclusively use <tt>CC</tt> and <tt>CXX</tt> to control its
    compilers.
    <i>(Difficulty: Medium-Hard)</i></p>
    </li>

    <li>Create an <tt>analyzer_annotate</tt> attribute for the analyzer
    annotations.
    <p>We would like to put all analyzer attributes behind a fence so that we
    could add or remove them without worrying that compiler (not analyzer) users
    depend on them. Design and implement such a generic analyzer attribute in
    the compiler. <i>(Difficulty: Medium)</i></p>
    </li>
  </ul>
  </li>

  <li>Enhanced Checks
  <ul>
    <li>Implement a production-ready StreamChecker.
    <p>A SimpleStreamChecker was presented in the "Building a Checker in 24
    Hours" talk
    (<a href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a>,
    <a href="http://llvm.org/devmtg/2012-11/videos/Zaks-Rose-Checker24Hours.mp4">video</a>).
    We need to implement a production version of the checker with a richer set
    of APIs and evaluate it by running it on real codebases.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Extend the Malloc checker with reasoning about custom allocator,
    deallocator, and ownership-transfer functions.
    <p>This would require extending the MallocPessimistic checker to reason
    about annotated functions. Ideally, this would rely on
    the <tt>analyzer_annotate</tt> attribute, as described above.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Implement a BitwiseMaskingChecker to handle <a href="http://llvm.org/bugs/show_bug.cgi?id=16615">PR16615</a>.
    <p>Symbolic expressions of the form <code>$sym &amp; CONSTANT</code> can
    range from 0 to <code>CONSTANT</code> if <code>CONSTANT</code> is
    <code>2^n-1</code>, e.g. 0xFF (0b11111111), 0x7F (0b01111111), 0x3 (0b0011),
    0xFFFF, etc. Even without handling general bitwise operations on symbols, we
    can at least bound the value of the resulting expression. Bonus points for
    handling masks followed by shifts, e.g.
    <code>($sym &amp; 0b1100) &gt;&gt; 2</code>.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Implement an iterator invalidation checker.
    <p><i>(Difficulty: Easy)</i></p>
    </li>

    <li>Write checkers that catch copy-and-paste errors.
    <p>Take a look at the
    <a href="http://pages.cs.wisc.edu/~shanlu/paper/TSE-CPMiner.pdf">CP-Miner</a>
    paper for inspiration.
    <i>(Difficulty: Medium-Hard; current contacts: Daniel Marjamäki and Daniel Fahlgren)</i></p>
    </li>
  </ul>
  </li>
</ul>

</div>
</div>
</body>
</html>