<!--
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
-->
<html><head><title>SAX2DTM Design Notes</title></head><body>
<center><h1>SAX2DTM Design Notes</h1></center>
<p>The current implementation is subject to change, and this class
should be accessed only through its published interface methods.  The
following information is provided for debugging purposes only, as an aid to
understanding how the class currently works.</p>
<p>This implementation stores information about each node in a series of arrays.  Conceptually,
the arrays can be thought of as either <code>String</code> Vectors or <code>int</code>
Vectors, although they are implemented using internal classes. The <code>m_chars</code>
array is conceptually a Vector of <code>char</code>s.  The chief arrays of
interest are shown in the following table:</p>

<table border="1"
summary="Key arrays used: the first cell contains the array name, the second contains the
 conceptual type, and the third contains the description of the contents">
<tr>
<th>Array Name</th>
<th>Array Type</th>
<th>Contents</th>
</tr>

<tr>
<td rowspan="1" colspan="1"><code>m_exptype</code></td>
<td rowspan="1" colspan="1">int</td>
<td rowspan="1" colspan="1">An integer representing a unique value for a Node.  The first
(most significant) 6 bits represent the Node type, as shown below.  The next 10 bits represent
an index into <code>m_namespaceNames</code>.  The remaining 16 bits represent an index into
<code>m_locNamesPool</code>.
<b>Start here.</b>  This Vector represents the list of Nodes.</td>
</tr>

<tr>
<td rowspan="1" colspan="1"><code>m_locNamesPool</code></td>
<td rowspan="1" colspan="1">String</td>
<td rowspan="1" colspan="1">Local (prefixed) names.  Field of <code>m_expandedNameTable</code>.</td>
</tr>

<tr>
<td rowspan="1" colspan="1"><code>m_namespaceNames</code></td>
<td rowspan="1" colspan="1">String</td>
<td rowspan="1" colspan="1">Namespace URIs.  Field of <code>m_expandedNameTable</code>.</td>
</tr>

<tr>
<td rowspan="1" colspan="1"><code>m_dataOrQName</code></td>
<td rowspan="1" colspan="1">int</td>
<td rowspan="1" colspan="1">An index into either <code>m_data</code> or
<code>m_valuesOrPrefixes</code>, as explained in the next table.</td>
</tr>

<tr>
<td rowspan="1" colspan="1"><code>m_valuesOrPrefixes</code></td>
<td rowspan="1" colspan="1">String</td>
<td rowspan="1" colspan="1">Values and prefixes.</td>
</tr>

<tr>
<td rowspan="1" colspan="1"><code>m_data</code></td>
<td rowspan="1" colspan="1">int</td>
<td rowspan="1" colspan="1">Entries here occur in pairs.  The use of this array is explained
in the next table.</td>
</tr>

<tr>
<td rowspan="1" colspan="1"><code>m_chars</code></td>
<td rowspan="1" colspan="1">char</td>
<td rowspan="1" colspan="1">Characters used to form Strings as explained in the next table.</td>
</tr>
</table>
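
<p>The bit layout described for <code>m_exptype</code> can be sketched as follows.  This is
only an illustration of the 6/10/16-bit split stated above, not the code used by the class itself;
the class name <code>ExpTypeSketch</code> and its method names are hypothetical, and the actual
implementation may change.</p>

```java
// Illustrative sketch only: ExpTypeSketch and its methods are hypothetical names,
// not part of SAX2DTM. It assumes the 6/10/16-bit layout described in the table.
public class ExpTypeSketch {
    // Top 6 bits: the Node type.
    static int nodeType(int expType) {
        return expType >>> 26;
    }

    // Middle 10 bits: index into m_namespaceNames.
    static int namespaceIndex(int expType) {
        return (expType >>> 16) & 0x3FF;
    }

    // Low 16 bits: index into m_locNamesPool.
    static int localNameIndex(int expType) {
        return expType & 0xFFFF;
    }

    public static void main(String[] args) {
        int attr = 0x08010003; // made-up value: type 2, namespace index 1, local name index 3
        System.out.println(nodeType(attr) + " "
                + namespaceIndex(attr) + " "
                + localNameIndex(attr)); // prints "2 1 3"
    }
}
```

<p>Under this reading, a leading byte of 08 yields Node type 2, which matches the value of
<code>org.w3c.dom.Node.ATTRIBUTE_NODE</code>.</p>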

<p>The following table shows how the array values are used for each type of Node supported by
this implementation.  An <i>n</i> represents an index into <code>m_namespaceNames</code>
for the namespace URI associated with the attribute or element.  The index actually occupies
10 bits, including the rightmost two bits of the leftmost byte, which is why the leftmost byte
is shown as a range (for example, 08 to 0b for Attr).  The <i>eeee</i>
represents an index into <code>m_locNamesPool</code> for the value indicated in the table.</p>

<table border="1"
summary="Node table">
<tr>
<th>NodeType</th>
<th>m_exptype</th>
<th>m_dataOrQName</th>
<th>m_data</th>
</tr>

<tr>
<td rowspan="1" colspan="1">Attr</td>
<td rowspan="1" colspan="1">&nbsp;08<i>neeee</i><br>-0b<i>neeee</i><br>
<i>eeee</i> is local name of attribute.</td>
<td rowspan="1" colspan="1"><b>No namespace</b>: an index into
<code>m_valuesOrPrefixes</code> pointing to the attribute value.
<br><b>Namespace</b>: a negative number, the absolute value of which is an index
into <code>m_data</code>.</td>
<td rowspan="1" colspan="1"><b>index</b>: an int containing the index into
<code>m_valuesOrPrefixes</code> for the Attr QName.
<br><b>index+1</b>: an int
containing the index into <code>m_valuesOrPrefixes</code> for the attribute value.</td>
</tr>

<tr>
<td rowspan="1" colspan="1">Comment</td>
<td rowspan="1" colspan="1">&nbsp;20000000</td>
<td rowspan="1" colspan="1">index into <code>m_valuesOrPrefixes</code>
for comment text.</td>
<td rowspan="1" colspan="1">unused</td>
</tr>

<tr>
<td rowspan="1" colspan="1">Document</td>
<td rowspan="1" colspan="1">&nbsp;24000000</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">unused</td>
</tr>

<tr>
<td rowspan="1" colspan="1">Element</td>
<td rowspan="1" colspan="1">&nbsp;04<i>neeee</i><br>-07<i>neeee</i><br>
<i>eeee</i> is local name of element.</td>
<td rowspan="1" colspan="1"><b>No namespace</b>: 0.
<br><b>Namespace</b>: an index into
<code>m_valuesOrPrefixes</code> pointing to the QName.</td>
<td rowspan="1" colspan="1">unused</td>
</tr>

<tr>
<td rowspan="1" colspan="1">Text</td>
<td rowspan="1" colspan="1">&nbsp;0C000000</td>
<td rowspan="1" colspan="1">an index into <code>m_data</code>.</td>
<td rowspan="1" colspan="1"><b>index</b>: an int containing the starting subscript in
<code>m_chars</code> for the text.
<br><b>index+1</b>: an int
containing the length of the text.</td>
</tr>

<tr>
<td rowspan="1" colspan="1">ProcessingInstruction</td>
<td rowspan="1" colspan="1">&nbsp;1C0<i>eeee</i>
<br><i>eeee</i> is the target name.</td>
<td rowspan="1" colspan="1">index into <code>m_valuesOrPrefixes</code>
for PI data.</td>
<td rowspan="1" colspan="1">unused</td>
</tr>

<tr>
<td rowspan="1" colspan="1">Namespace</td>
<td rowspan="1" colspan="1">&nbsp;34<i>neeee</i><br>
<i>eeee</i> is namespace prefix.</td>
<td rowspan="1" colspan="1">index into
<code>m_valuesOrPrefixes</code> pointing to the namespace URI.</td>
<td rowspan="1" colspan="1">unused</td>
</tr>

</table>
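
<p>The lookups described in the Text and Attr rows can be sketched as follows.  Plain Java
arrays stand in for the internal Vector-like classes, and the class name
<code>NodeLookupSketch</code>, its method names, and the sample data are all hypothetical;
this is a reading of the table above, not the actual implementation.</p>

```java
// Illustrative sketch only: plain arrays stand in for m_data, m_chars and
// m_valuesOrPrefixes. All names here are hypothetical, not SAX2DTM API.
public class NodeLookupSketch {
    // Text node: m_dataOrQName is an index into m_data, where the pair
    // (start subscript in m_chars, length) is stored.
    static String textValue(int dataOrQName, int[] data, char[] chars) {
        int start = data[dataOrQName];
        int length = data[dataOrQName + 1];
        return new String(chars, start, length);
    }

    // Attr node: a non-negative m_dataOrQName indexes the value directly;
    // a negative one points (by absolute value) at a (QName, value) pair in m_data.
    static String attrValue(int dataOrQName, int[] data, String[] valuesOrPrefixes) {
        if (dataOrQName >= 0) {
            return valuesOrPrefixes[dataOrQName];
        }
        int index = -dataOrQName;
        return valuesOrPrefixes[data[index + 1]];
    }

    // Attr node in a namespace: the first entry of the pair is the QName index.
    static String attrQName(int dataOrQName, int[] data, String[] valuesOrPrefixes) {
        int index = -dataOrQName;
        return valuesOrPrefixes[data[index]];
    }

    public static void main(String[] args) {
        // Made-up sample contents for the three arrays.
        char[] chars = "hello world".toCharArray();
        String[] values = {"plain", "x:id", "42"};
        int[] data = {0, 0, 6, 5, 1, 2};

        System.out.println(textValue(2, data, chars));   // text stored at m_data[2..3]
        System.out.println(attrValue(0, data, values));  // attribute without a namespace
        System.out.println(attrQName(-4, data, values)); // namespaced attribute, QName
        System.out.println(attrValue(-4, data, values)); // namespaced attribute, value
    }
}
```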
</body>
</html>