<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Open Projects</title>
  <link type="text/css" rel="stylesheet" href="menu.css">
  <link type="text/css" rel="stylesheet" href="content.css">
  <script type="text/javascript" src="scripts/menu.js"></script>
</head>
<body>

<div id="page">
<!--#include virtual="menu.html.incl"-->
<div id="content">

<h1>Open Projects</h1>

<p>This page lists several projects that would boost the analyzer's usability
and power. Most of the projects listed here are infrastructure-related, so this
list complements the <a href="potential_checkers.html">potential checkers
list</a>. If you are interested in tackling one of these, please send an email
to the <a href="http://lists.llvm.org/mailman/listinfo/cfe-dev">cfe-dev
mailing list</a> to notify other members of the community.</p>

<ul>
  <li>Core Analyzer Infrastructure
  <ul>
    <li>Explicitly model standard library functions with <tt>BodyFarm</tt>.
    <p><tt><a href="http://clang.llvm.org/doxygen/classclang_1_1BodyFarm.html">BodyFarm</a></tt>
    allows the analyzer to explicitly model functions whose definitions are
    not available during analysis. Modeling more of the widely used functions
    (such as the members of <tt>std::string</tt>) will improve the precision of
    the analysis.
    <i>(Difficulty: Easy, ongoing)</i></p>
    </li>

    <li>Handle floating-point values.
    <p>Currently, the analyzer treats all floating-point values as unknown.
    However, we already have most of the infrastructure we need to handle
    floats: RangeConstraintManager. This would involve adding a new SVal kind
    for constant floats, generalizing the constraint manager to handle floats
    and integers equally, and auditing existing code to make sure it doesn't
    make untoward assumptions.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Implement generalized loop execution modeling.
    <p>Currently, the analyzer simply unrolls each loop <tt>N</tt> times. This
    means that it will not execute any code after the loop if the loop is
    guaranteed to execute more than <tt>N</tt> times. This results in lost
    basic block coverage. We could continue exploring the path if we could
    model a generic <tt>i</tt>-th iteration of a loop.
    <i>(Difficulty: Hard)</i></p>
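    <p>A minimal sketch of the kind of code affected, assuming the unroll limit
    <tt>N</tt> is smaller than 100. The function name
    <tt>fill_and_read</tt> is hypothetical and chosen only for illustration:</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: the loop below always runs exactly 100 times, which is
// assumed to exceed the analyzer's unroll limit N.
int fill_and_read(int *out) {
  int buf[100];
  for (std::size_t i = 0; i < 100; ++i)
    buf[i] = static_cast<int>(i);

  // Everything from here on is reachable only after 100 iterations. An
  // analysis that stops after N iterations never explores these statements,
  // so a defect here (for example, a missing null check on `out`) would go
  // unreported.
  if (out)
    *out = buf[99];
  return buf[0];
}
```

    <p>Modeling a generic <tt>i</tt>-th iteration would let the analysis reach
    the statements after the loop without unrolling all 100 iterations.</p>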
    </li>

    <li>Enhance CFG to model C++ temporaries properly.
    <p>There is an existing implementation of this, but it is incomplete and
    is disabled in the analyzer.
    <i>(Difficulty: Medium; current contact: Alex McCarthy)</i></p>

    <li>Enhance CFG to model exception handling properly.
    <p>Currently, exceptions are treated as "black holes", and exception-handling
    control structures are poorly modeled (to be conservative). This could be
    much improved for both C++ and Objective-C exceptions.
    <i>(Difficulty: Medium)</i></p>

    <li>Enhance CFG to model C++ <code>new</code> more precisely.
    <p>The current representation of <code>new</code> does not provide an easy
    way for the analyzer to model the call to a memory allocation function
    (<code>operator new</code>) followed by the initialization of the result
    with a constructor call. The problem is discussed at length in
    <a href="http://llvm.org/bugs/show_bug.cgi?id=12014">PR12014</a>.
    <i>(Difficulty: Easy; current contact: Karthik Bhat)</i></p>
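    <p>To make the two steps concrete, the sketch below writes out by hand
    roughly what a <code>new</code> expression does for a type whose
    constructor does not throw. The names <tt>Point</tt>,
    <tt>make_point</tt>, and <tt>destroy_point</tt> are hypothetical; the
    cleanup that a real <code>new</code> expression performs when the
    constructor throws is deliberately left out:</p>

```cpp
#include <cassert>
#include <new>

// Hypothetical type used only for illustration.
struct Point {
  int x, y;
  Point(int x_, int y_) : x(x_), y(y_) {}
};

// Roughly what `new Point(x, y)` does, written as two explicit steps.
Point *make_point(int x, int y) {
  void *raw = ::operator new(sizeof(Point)); // step 1: allocation function
  return ::new (raw) Point(x, y);            // step 2: constructor call on the raw storage
}

// The mirror image for `delete p`: destructor first, then deallocation.
void destroy_point(Point *p) {
  if (!p)
    return;
  p->~Point();          // step 1: destructor call
  ::operator delete(p); // step 2: deallocation function
}
```

    <p>A CFG that exposes these steps as separate elements would let the
    analyzer model the allocation and the constructor call individually.</p>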

    <li>Enhance CFG to model C++ <code>delete</code> more precisely.
    <p>Similarly, the representation of <code>delete</code> does not include
    the call to the destructor, followed by the call to the deallocation
    function (<code>operator delete</code>). One particular issue
    (<tt>noreturn</tt> destructors) is discussed in
    <a href="http://llvm.org/bugs/show_bug.cgi?id=15599">PR15599</a>.
    <i>(Difficulty: Easy; current contact: Karthik Bhat)</i></p>

    <li>Implement a BitwiseConstraintManager to handle <a href="http://llvm.org/bugs/show_bug.cgi?id=3098">PR3098</a>.
    <p>Constraints on the bits of an integer are not easily representable as
    ranges. A bitwise constraint manager would model constraints such as "bit 32
    is known to be 1". This would help with code that makes use of bitmasks.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Track type info through casts more precisely.
    <p>The DynamicTypePropagation checker is in charge of inferring a region's
    dynamic type based on what operations the code is performing. Casts are a
    rich source of type information that the analyzer currently ignores. They
    are tricky to get right, but might have very useful consequences.
    <i>(Difficulty: Medium)</i></p>

    <li>Design and implement alpha-renaming.
    <p>Implement unifying two symbolic values along a path after they are
    determined to be equal via comparison. This would allow us to reduce the
    number of false positives and would be a stepping stone toward more advanced
    analyses, such as summary-based interprocedural and cross-translation-unit
    analysis.
    <i>(Difficulty: Hard)</i></p>
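    <p>A small sketch of the situation this targets. The function
    <tt>divide</tt> is hypothetical, and no claim is made about what any
    particular analyzer version reports for it. It only shows a path on which
    a fact known about one symbol should carry over to another once the two
    are found to be equal:</p>

```cpp
#include <cassert>

// Hypothetical example: on the path that reaches the division, `a` is known
// to be non-zero and `a == b` has been established by comparison.
int divide(int n, int a, int b) {
  if (a == 0)
    return 0; // past this point: a != 0
  if (a != b)
    return 0; // past this point: a == b
  // The division is safe because b == a and a != 0. An analysis that tracks
  // `a` and `b` as unrelated symbols cannot conclude b != 0 here and may
  // raise a false division-by-zero warning; unifying the two symbols after
  // the comparison removes that imprecision.
  return n / b;
}
```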
    </li>
  </ul>
  </li>

  <li>Bug Reporting
  <ul>
    <li>Add support for displaying cross-file diagnostic paths in HTML output
    (used by <tt>scan-build</tt>).
    <p>Currently <tt>scan-build</tt> output does not display reports that span
    multiple files. The main problem is that we do not have a good format to
    display such paths in HTML output. <i>(Difficulty: Medium)</i> </p>
    </li>

    <li>Refactor path diagnostic generation in <a href="http://clang.llvm.org/doxygen/BugReporter_8cpp_source.html">BugReporter.cpp</a>.
    <p>It would be great to have more code reuse between the "Minimal" and
    "Extensive" PathDiagnostic generation algorithms. One idea is to create an
    IR for representing path diagnostics, which would later be used to
    generate minimal or extensive report output. <i>(Difficulty: Medium)</i></p>
    </li>
  </ul>
  </li>

  <li>Other Infrastructure
  <ul>
    <li>Rewrite <tt>scan-build</tt> (in Python).
    <p><i>(Difficulty: Easy)</i></p>
    </li>

    <li>Do a better job interposing on a compilation.
    <p>Currently, <tt>scan-build</tt> just sets the <tt>CC</tt> and <tt>CXX</tt>
    environment variables to its wrapper scripts, which then call into an
    underlying platform compiler. This is problematic for any project that
    doesn't exclusively use <tt>CC</tt> and <tt>CXX</tt> to control its
    compilers.
    <i>(Difficulty: Medium-Hard)</i></p>
    </li>

    <li>Create an <tt>analyzer_annotate</tt> attribute for the analyzer
    annotations.
    <p>We would like to put all analyzer attributes behind a fence so that we
    could add/remove them without worrying that compiler (not analyzer) users
    depend on them. Design and implement such a generic analyzer attribute in
    the compiler. <i>(Difficulty: Medium)</i></p>
    </li>
  </ul>
  </li>

  <li>Enhanced Checks
  <ul>
    <li>Implement a production-ready StreamChecker.
    <p>A SimpleStreamChecker was presented in the "Building a Checker in 24
    Hours" talk
    (<a href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a>,
    <a href="http://llvm.org/devmtg/2012-11/videos/Zaks-Rose-Checker24Hours.mp4">video</a>).
    We need to implement a production version of the checker with a richer set
    of APIs and evaluate it by running it on real codebases.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Extend Malloc checker with reasoning about custom allocator,
    deallocator, and ownership-transfer functions.
    <p>This would require extending the MallocPessimistic checker to reason
    about annotated functions. It is strongly desired that one would rely on
    the <tt>analyzer_annotate</tt> attribute, as described above.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Implement a BitwiseMaskingChecker to handle <a href="http://llvm.org/bugs/show_bug.cgi?id=16615">PR16615</a>.
    <p>Symbolic expressions of the form <code>$sym &amp; CONSTANT</code> can
    range from 0 to <code>CONSTANT</code> (inclusive) if <code>CONSTANT</code>
    is <code>2^n-1</code>, e.g. 0xFF (0b11111111), 0x7F (0b01111111), 0x3
    (0b0011), 0xFFFF, etc. Even without handling general bitwise operations on
    symbols, we can at least bound the value of the resulting expression. Bonus
    points for handling masks followed by shifts, e.g.
    <code>($sym &amp; 0b1100) &gt;&gt; 2</code>.
    <i>(Difficulty: Easy)</i></p>
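    <p>The arithmetic behind such a bound can be sketched in a few lines. This
    is not analyzer code. The helper names <tt>is_low_bit_mask</tt> and
    <tt>masked_upper_bound</tt> are hypothetical, and the sketch assumes
    unsigned 32-bit values and a shift amount below 32. It takes the largest
    possible result of a mask to be the mask itself, since masking can only
    clear bits:</p>

```cpp
#include <cassert>
#include <cstdint>

// True if `mask` has the form 2^n - 1, i.e. its set bits are exactly the
// n lowest bits (n may be 0 or 32).
constexpr bool is_low_bit_mask(std::uint32_t mask) {
  return (mask & (mask + 1u)) == 0u;
}

// Largest value that `(sym & mask) >> shift` can take for any `sym`.
// `sym & mask` never exceeds `mask`, and a right shift preserves ordering,
// so the bound is `mask >> shift`. Assumes shift < 32.
constexpr std::uint32_t masked_upper_bound(std::uint32_t mask, unsigned shift) {
  return mask >> shift;
}
```

    <p>For the shifted example above, <code>masked_upper_bound(0xC, 2)</code>
    bounds <code>($sym &amp; 0b1100) &gt;&gt; 2</code>. For masks of the form
    <code>2^n-1</code> every value in the range is attainable; for other masks
    the range is still a valid over-approximation.</p>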
    </li>

    <li>Implement an iterator invalidation checker.
    <p><i>(Difficulty: Easy)</i></p>
    </li>

    <li>Write checkers that catch copy-and-paste errors.
    <p>Take a look at the
    <a href="http://pages.cs.wisc.edu/~shanlu/paper/TSE-CPMiner.pdf">CP-Miner</a>
    paper for inspiration.
    <i>(Difficulty: Medium-Hard; current contacts: Daniel Marjam&auml;ki and Daniel Fahlgren)</i></p>
    </li>
  </ul>
  </li>
</ul>

</div>
</div>
</body>
</html>