<html>
<head>
<title>pcre2_compile specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcre2_compile man page</h1>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>
<p>
This page is part of the PCRE2 HTML documentation. It was generated
automatically from the original man page. If there is any nonsense in it,
please consult the man page, in case the conversion went wrong.
<br>
<br><b>
SYNOPSIS
</b><br>
<P>
<b>#include &#60;pcre2.h&#62;</b>
</P>
<P>
<b>pcre2_code *pcre2_compile(PCRE2_SPTR <i>pattern</i>, PCRE2_SIZE <i>length</i>,</b>
<b>  uint32_t <i>options</i>, int *<i>errorcode</i>, PCRE2_SIZE *<i>erroroffset</i>,</b>
<b>  pcre2_compile_context *<i>ccontext</i>);</b>
</P>
<br><b>
DESCRIPTION
</b><br>
<P>
This function compiles a regular expression pattern into an internal form. Its
arguments are:
<pre>
  <i>pattern</i>       A string containing the expression to be compiled
  <i>length</i>        The length of the string or PCRE2_ZERO_TERMINATED
  <i>options</i>       Primary option bits
  <i>errorcode</i>     Where to put an error code
  <i>erroroffset</i>   Where to put an error offset
  <i>ccontext</i>      Pointer to a compile context or NULL
</pre>
The length of the pattern and any error offset that is returned are in code
units, not characters. A NULL pattern with zero length is treated as an empty
string. A compile context is needed only if you want to provide custom memory
allocation functions, or to provide an external function for system stack size
checking (see <b>pcre2_set_compile_recursion_guard()</b>), or to change one or
more of these parameters:
<pre>
  What \R matches (Unicode newlines, or CR, LF, CRLF only);
  PCRE2's character tables;
  The newline character sequence;
  The compile-time nested parentheses limit;
  The maximum pattern length (in code units) that is allowed;
  The additional option bits.
</pre>
The primary option bits are:
<pre>
  PCRE2_ANCHORED           Force pattern anchoring
  PCRE2_ALLOW_EMPTY_CLASS  Allow empty classes
  PCRE2_ALT_BSUX           Alternative handling of \u, \U, and \x
  PCRE2_ALT_CIRCUMFLEX     Alternative handling of ^ in multiline mode
  PCRE2_ALT_VERBNAMES      Process backslashes in verb names
  PCRE2_AUTO_CALLOUT       Compile automatic callouts
  PCRE2_CASELESS           Do caseless matching
  PCRE2_DOLLAR_ENDONLY     $ not to match newline at end
  PCRE2_DOTALL             . matches anything including NL
  PCRE2_DUPNAMES           Allow duplicate names for subpatterns
  PCRE2_ENDANCHORED        Pattern can match only at end of subject
  PCRE2_EXTENDED           Ignore white space and # comments
  PCRE2_FIRSTLINE          Force matching to be before newline
  PCRE2_LITERAL            Pattern characters are all literal
  PCRE2_MATCH_INVALID_UTF  Enable support for matching invalid UTF
  PCRE2_MATCH_UNSET_BACKREF  Match unset backreferences
  PCRE2_MULTILINE          ^ and $ match newlines within data
  PCRE2_NEVER_BACKSLASH_C  Lock out the use of \C in patterns
  PCRE2_NEVER_UCP          Lock out PCRE2_UCP, e.g. via (*UCP)
  PCRE2_NEVER_UTF          Lock out PCRE2_UTF, e.g. via (*UTF)
  PCRE2_NO_AUTO_CAPTURE    Disable numbered capturing parentheses
                             (named ones available)
  PCRE2_NO_AUTO_POSSESS    Disable auto-possessification
  PCRE2_NO_DOTSTAR_ANCHOR  Disable automatic anchoring for .*
  PCRE2_NO_START_OPTIMIZE  Disable match-time start optimizations
  PCRE2_NO_UTF_CHECK       Do not check the pattern for UTF validity
                             (only relevant if PCRE2_UTF is set)
  PCRE2_UCP                Use Unicode properties for \d, \w, etc.
  PCRE2_UNGREEDY           Invert greediness of quantifiers
  PCRE2_USE_OFFSET_LIMIT   Enable offset limit for unanchored matching
  PCRE2_UTF                Treat pattern and subjects as UTF strings
</pre>
PCRE2 must be built with Unicode support (the default) in order to use
PCRE2_UTF, PCRE2_UCP and related options.
</P>
<P>
Additional options may be set in the compile context via the
<a href="pcre2_set_compile_extra_options.html"><b>pcre2_set_compile_extra_options</b></a>
function.
</P>
<P>
If either of <i>errorcode</i> or <i>erroroffset</i> is NULL, the function returns
NULL immediately. Otherwise, the yield of this function is a pointer to a
private data structure that contains the compiled pattern, or NULL if an error
was detected. In the error case, a text error message can be obtained by
passing the value returned via the <i>errorcode</i> argument to the
<b>pcre2_get_error_message()</b> function. The offset (in code units) where the
error was encountered is returned via the <i>erroroffset</i> argument.
</P>
<P>
If there is no error, the value returned via <i>errorcode</i> yields the message
"no error" when passed to <b>pcre2_get_error_message()</b>, and the value returned
via <i>erroroffset</i> is zero.
</P>
<P>
There is a complete description of the PCRE2 native API, with more detail on
each option, in the
<a href="pcre2api.html"><b>pcre2api</b></a>
page, and a description of the POSIX API in the
<a href="pcre2posix.html"><b>pcre2posix</b></a>
page.
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>