==========================
KnitPack repository format
==========================

.. contents::

Using KnitPack repositories
===========================

Preparation
-----------

A small percentage of existing repositories may have some inconsistent
data within them. It's a good idea to check the integrity of your
repositories before migrating them to knitpack format. To do this, run::

  bzr check

If that reports a problem, run this command::

  bzr reconcile

Note that this can take many hours for repositories with deep history,
so be sure to set aside some time for this if it is required.

Creating a new knitpack branch
------------------------------

If you're starting a project from scratch, it's easy to make it a
``knitpack`` one. Here's how::

  cd my-stuff
  bzr init --knitpack-experimental
  bzr add
  bzr commit -m "initial import"

In other words, use the normal sequence of commands but add the
``--knitpack-experimental`` option to the ``init`` command.

Creating a new knitpack repository
----------------------------------

If you're starting a project from scratch and wish to use a shared repository
for branches, you can make it a ``knitpack`` repository like this::

  cd my-repo
  bzr init-repo --knitpack-experimental .
  cd my-stuff
  bzr init
  bzr add
  bzr commit -m "initial import"

In other words, use the normal sequence of commands but add the
``--knitpack-experimental`` option to the ``init-repo`` command.

Upgrading an existing branch or repository to knitpack format
-------------------------------------------------------------

If you have an existing branch or repository and wish to migrate it to
a ``knitpack`` format, use the ``upgrade`` command like this::

  bzr upgrade path-to-my-repo

If ``path-to-my-repo`` is not provided, the current repository - and
all branches within it - will be upgraded.

Starting a new knitpack branch from one in an older format
----------------------------------------------------------

This can be done in one of several ways:

1. Create a new branch and pull into it
2. Create a standalone branch and upgrade its format
3. Create a knitpack shared repository and branch into it

Here are the commands for using the ``pull`` approach::

  bzr init --knitpack-experimental my-new-branch
  cd my-new-branch
  bzr pull my-source-branch

Here are the commands for using the ``upgrade`` approach::

  bzr branch my-source-branch my-new-branch
  cd my-new-branch
  bzr upgrade --knitpack-experimental

Here are the commands for the shared repository approach::

  cd my-repo
  bzr init-repo --knitpack-experimental .
  bzr branch my-source-branch my-new-branch
  cd my-new-branch

As a reminder, any of the above approaches can fail if the source branch
has inconsistent data within it and hasn't been reconciled yet. Please
be sure to check that before reporting problems.

Testing the knitpack-subtree-experimental format
------------------------------------------------

If you are using ``bzr-svn`` or are testing the prototype subtree support,
you can still use and assist in testing KnitPacks. The commands to use
are identical to the ones given above except that the name of the format
to use is ``knitpack-subtree-experimental``.

WARNING: Note that the subtree formats, ``dirstate-subtree`` and
``knitpack-subtree-experimental``, are **not** production strength yet and
may cause unexpected problems. They are required for the bzr-svn
plug-in but should otherwise only be used by people happy to live on the
bleeding edge. If you are using bzr-svn, you're on the bleeding edge anyway.
:-)

Reporting problems
------------------

If you need any help or encounter any problems, please contact the developers
via the usual ways, i.e. chat to us on IRC or send a message to our mailing
list. See http://bazaar-vcs.org/BzrSupport for contact details.


Technical notes
===============

Bazaar 0.92 adds a new format (experimental at first) implemented in
``bzrlib.repofmt.pack_repo``.

This format provides a knit-like interface which is quite compatible
with knit format repositories: you can get a VersionedFile for a
particular file-id, or for revisions, or for the inventory, even though
these do not correspond to single files on disk.

The on-disk format is that the repository directory contains these
files and subdirectories:

==================== =============================================
packs/               completed readonly packs
indices/             indices for completed packs
upload/              temporary files for packs currently being
                     written
obsolete_packs/      packs that have been repacked and are no
                     longer normally needed
pack-names           index of all live packs
lock/                lockdir
==================== =============================================

Note that for consistency we always write "indices" not "indexes".

This is implemented on top of pack files, which are written once from
start to end, then left alone. A pack consists of a body file, plus
several index files. There are four index files for each pack, which
have the same basename and an extension indicating the purpose of the
index:

======== ========== ======================== ==========================
extn     Purpose    Key                      References
======== ========== ======================== ==========================
``.tix`` File texts ``file_id, revision_id`` per-file parents,
                                             compression base
``.six`` Signatures ``revision_id,``         -
``.rix`` Revisions  ``revision_id,``         revision parents
``.iix`` Inventory  ``revision_id,``         revision parents,
                                             compression base
======== ========== ======================== ==========================

Indices are accessed through the ``bzrlib.index.GraphIndex`` class.
Indices are stored as sorted files on disk. Each line is one record,
and contains:

* key fields
* a value string - for all these indices, this is an ASCII decimal pair
  of "offset length" giving the position of the referenced data within
  the pack body file
* a list of zero or more reference lists

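To make the value string concrete, here is a minimal sketch of how an
"offset length" pair could be parsed and used to read a record out of a
pack body. The function names and the in-memory body are hypothetical
and are not part of ``bzrlib``; the real on-disk record layout is not
shown here.

```python
import io


def parse_index_value(value):
    """Split an ASCII 'offset length' value string into two integers."""
    offset_text, length_text = value.split(" ")
    return int(offset_text), int(length_text)


def read_record(body_file, value):
    """Read the bytes that an index value points at within a pack body."""
    offset, length = parse_index_value(value)
    body_file.seek(offset)
    return body_file.read(length)


# Hypothetical pack body held in memory for demonstration only.
body = io.BytesIO(b"headerRECORD-ONErecord-two")
record = read_record(body, "6 10")
```

Here ``record`` holds the ten bytes starting at offset 6 of the
made-up body.
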
The reference lists let a graph be stored within the index. Each
reference list entry points to another entry in the same index. The
references are represented as a byte offset for the target within the
index file.

When a compression base is given, it indicates that the body of the text
or inventory is a forward delta from the referenced revision. The
compression base list must have length 0 or 1.

Like packs, indices are written only once and then left unmodified. A
GraphIndex builder is a mutable in-memory graph that can be sorted,
cross-referenced and written out when the write group completes.

There can also be index entries with a value of 'a' for absent. These
records exist just to be pointed to in a graph. This is used, for
example, to give the revision-parent pointer when the parent revision is
in a previous pack.

The data content for each record is a knit data chunk. The knits are
always unannotated - the annotations must be generated when needed.
(We'd like to cache/memoize the annotations.) The data hunks can be
moved between packs without needing to recompress them.

It is not possible to regenerate an index from the body file, because
the index contains information that is not in the body. (In particular,
the per-file graph is only stored in the index.) We would like to change
this in a future format.

The lock is a regular LockDir lock. The lock is held only for a much
reduced scope, while updating the ``pack-names`` file. The bulk of the
insertion can be done without the repository locked. This is an
implementation detail; the repository user should still call
``repository.lock_write`` at the regular time but be aware this does not
correspond to a physical mutex.

Read locks control caching but do not affect writers.

The newly-added repository write group concept is very important to
KnitPack repositories. When ``start_write_group`` is called, a new
temporary pack is created and all modifications to the repository will
go into it until either ``commit_write_group`` or ``abort_write_group``
is called, at which time it is either finished and moved into place or
discarded respectively. Write groups cannot be nested: only one can be
underway at a time on a Repository instance, and they must occur within
a write lock.

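The write group rules can be illustrated with a toy model. This is only
a sketch: ``ToyRepository`` and ``add_record`` are hypothetical names,
the "packs" are plain tuples in memory, and nothing here reflects how
``bzrlib`` actually implements these methods. It only mirrors the rules
stated above (no nesting, must be inside a write lock, commit keeps the
temporary pack, abort discards it).

```python
class ToyRepository:
    """Toy model of write-group rules; not the bzrlib implementation."""

    def __init__(self):
        self.write_locked = False
        self.pending = None  # records for the temporary pack, if any
        self.packs = []      # finished packs, each a tuple of records

    def lock_write(self):
        self.write_locked = True

    def unlock(self):
        self.write_locked = False

    def start_write_group(self):
        if not self.write_locked:
            raise RuntimeError("write groups must occur within a write lock")
        if self.pending is not None:
            raise RuntimeError("write groups cannot be nested")
        self.pending = []

    def add_record(self, record):
        if self.pending is None:
            raise RuntimeError("no write group in progress")
        self.pending.append(record)

    def commit_write_group(self):
        if self.pending is None:
            raise RuntimeError("no write group in progress")
        # The temporary pack is finished and moved into place.
        self.packs.append(tuple(self.pending))
        self.pending = None

    def abort_write_group(self):
        # The temporary pack is discarded.
        self.pending = None


repo = ToyRepository()
repo.lock_write()
repo.start_write_group()
repo.add_record("rev-1")
repo.commit_write_group()
repo.start_write_group()
repo.add_record("rev-2")
repo.abort_write_group()
repo.unlock()
```

After this sequence only the committed group survives as a pack; the
aborted one leaves no trace.
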
Normally the data for each revision will be entirely within a single
pack but this is not required.

When a pack is finished, it gets a final name based on the md5 of all
the data written into the pack body file.

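A content-derived name of this kind can be computed incrementally as
data is written. The sketch below assumes the name is simply the
hexadecimal MD5 digest; the text above only says the name is "based on"
the md5, so the exact form is an assumption, and ``final_pack_name`` is
a hypothetical helper rather than a ``bzrlib`` function.

```python
import hashlib


def final_pack_name(chunks):
    """Return the MD5 hex digest of all bytes written to a pack body."""
    digest = hashlib.md5()
    for chunk in chunks:
        # Feed each written chunk in order, as a writer would.
        digest.update(chunk)
    return digest.hexdigest()


name = final_pack_name([b"first record", b"second record"])
```

Because the digest depends only on the bytes written, two packs with
identical bodies would get the same name under this assumption.
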
The ``pack-names`` file gives the list of all finished non-obsolete
packs. (This should always be the same as the list of files in the
``packs/`` directory, but the file is needed for readonly http clients
that can't easily list directories, and it includes other information.)
The constraint on the ``pack-names`` list is that every file mentioned
must exist in the ``packs/`` directory.

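The constraint is one-directional, which a small check makes clear. This
is a hedged sketch: ``missing_packs`` is a hypothetical helper, and the
``.pack`` suffix used to map a listed name to a file name is an
assumption not stated above.

```python
def missing_packs(listed_names, pack_dir_entries, suffix=".pack"):
    """Return listed pack names that have no file in the packs directory.

    Extra files in the directory that are not listed are ignored,
    because the constraint only runs from the list to the directory.
    """
    present = set(pack_dir_entries)
    return [name for name in listed_names if name + suffix not in present]


# Hypothetical names for demonstration only.
problems = missing_packs(["aaa", "bbb"], ["aaa.pack", "ccc.pack"])
```

An empty result means the constraint holds for the given listing.
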
In rare cases, when a writer is interrupted, about-to-be-removed packs
may still be present in the directory but removed from the list.

As well as the list of names, the pack-names file also contains the
size, in bytes, of each of the four indices. This is used to bootstrap
bisection search within the indices.

In normal use, one pack will be created for each commit to a repository.
This would build up to an inefficient number of files over time, so a
``repack`` operation is available to recombine them, by producing larger
files containing data on multiple revisions. This can be done manually
by running ``bzr pack``, and it also may happen automatically when a
write group is committed.

The repacking strategy used at the moment tries to balance not doing too
much work during commit with not having too many small files left in the
repository. The algorithm is roughly this: the total number of
revisions in the repository is expressed as a decimal number, e.g.
"532". Then we'll repack until we have five packs containing a hundred
revisions each, three packs containing ten revisions each, and two packs
with single revisions. This means that each revision will normally
initially be created in a single-revision pack, then moved to a
ten-revision pack, then to a hundred-revision pack, and so on.

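The digit-based target described above can be written down directly. The
following is a rough sketch of that arithmetic only, using a
hypothetical ``pack_distribution`` function; it does not model which
existing packs get combined or when repacking is triggered.

```python
def pack_distribution(revision_count):
    """Return target pack sizes, largest first, for a revision count.

    Each decimal digit d at the 10**k place contributes d packs of
    10**k revisions each.
    """
    sizes = []
    digits = str(revision_count)
    for position, digit in enumerate(digits):
        pack_size = 10 ** (len(digits) - position - 1)
        sizes.extend([pack_size] * int(digit))
    return sizes


distribution = pack_distribution(532)
```

For 532 revisions this gives five packs of 100, three of 10 and two of
1, so the number of packs equals the sum of the decimal digits.
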
|
As with other repositories, in normal use data is only inserted.
However, in some circumstances we may want to garbage-collect or prune
existing data, or reconcile indices.

vim: tw=72 ft=rest expandtab