bzr branch
http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar
| 
2513.1.1
by Martin Pool
 (in progress) analysis of commit  | 
1  | 
.. This document describes _how_ to do use case analyses and what we want  | 
2  | 
.. to get out of them; for the specific cases see the files referenced by  | 
|
3  | 
.. performance-roadmap.txt  | 
|
4  | 
||
| 
2481.1.5
by Robert Collins
 Incremental push-pull performance anlysis draft.  | 
5  | 
Analysing a specific use case  | 
| 
2506.1.1
by Alexander Belchenko
 sanitize developers docs  | 
6  | 
=============================  | 
| 
2481.1.5
by Robert Collins
 Incremental push-pull performance anlysis draft.  | 
7  | 
|
8  | 
The analysis of a use case needs to provide as outputs:  | 
|
9  | 
* The functional requirements that the use case has to satisfy.  | 
|
10  | 
* The file level operations and access patterns that will give the best  | 
|
11  | 
performance.  | 
|
12  | 
* A low friction API which will allow the use case to be implemented.  | 
|
13  | 
* The release of bzr (and thus the supported features) for which the analysis  | 
|
14  | 
was performed. The feature set of bzr defines the access patterns and data  | 
|
15  | 
required to implement any use case. So when we add features, their design  | 
|
16  | 
changes the requirements for the parts of the system they alter, so we need  | 
|
17  | 
to re-analyse use cases when bzr's feature set changes. If future plans are  | 
|
18  | 
considered in the analysis with the intention of avoiding rework, these  | 
|
19  | 
should also be mentioned.  | 
|
20  | 
||
21  | 
Performing the analysis  | 
|
| 
2506.1.1
by Alexander Belchenko
 sanitize developers docs  | 
22  | 
=======================  | 
| 
2481.1.5
by Robert Collins
 Incremental push-pull performance anlysis draft.  | 
23  | 
|
24  | 
The analysis needs to be able to define the characteristics of the  | 
|
25  | 
involved disk storage and APIs. That means we need to examine the data  | 
|
26  | 
required for the operation, in what order it is required, on both the  | 
|
27  | 
read and write sides, and how that needs to be presented to be  | 
|
28  | 
consistent with our layering.  | 
|
29  | 
||
30  | 
As a quick example: 'annotation of a file requires the file id looked up  | 
|
31  | 
from the tree, the basis revision id from the tree, and then the text of  | 
|
32  | 
that fileid-revisionid pair along with the creating revision id  | 
|
33  | 
allocated to each line, and the dotted revision number of each of those  | 
|
34  | 
revision ids.' All three of our key domain objects are involved here,  | 
|
35  | 
but we haven't defined any characteristics of the api or disk facilities  | 
|
36  | 
yet. We could then do that by saying something like 'the file-id lookup  | 
|
37  | 
should degrade gracefully as trees become huge. The tree basis id should  | 
|
38  | 
be constant time. Retrieval of the annotated text should be roughly  | 
|
39  | 
constant for any text of the same size regardless of the number of  | 
|
40  | 
revisions contributing to its content. Mapping of the revision ids to  | 
|
41  | 
dotted revnos could be done as the text is retrieved, but its completely  | 
|
42  | 
fine to post-process the annotated text to obtain dotted-revnos.'  | 
|
43  | 
||
44  | 
What factors should be considered?  | 
|
| 
2506.1.1
by Alexander Belchenko
 sanitize developers docs  | 
45  | 
==================================  | 
| 
2481.1.5
by Robert Collins
 Incremental push-pull performance anlysis draft.  | 
46  | 
|
47  | 
Obviously, those that will make for an extremely fast system :). There  | 
|
48  | 
are many possible factors, but the ones I think are most interesting to  | 
|
49  | 
design with are:  | 
|
50  | 
||
51  | 
- baseline overhead:  | 
|
52  | 
||
53  | 
- The time to get bzr ready to begin the use case.  | 
|
54  | 
||
55  | 
- scaling: how does performance change when any of the follow aspects  | 
|
| 
2513.1.1
by Martin Pool
 (in progress) analysis of commit  | 
56  | 
of the system are ratcheted massively up or down:  | 
| 
2481.1.5
by Robert Collins
 Incremental push-pull performance anlysis draft.  | 
57  | 
|
58  | 
- number of files/dirs/symlinks/subtrees in a tree (both working and  | 
|
59  | 
revision trees)  | 
|
60  | 
- size of any particular file  | 
|
61  | 
- number of elements within a single directory  | 
|
62  | 
- length of symlinks  | 
|
63  | 
- number of changes to any file over time  | 
|
64  | 
(subordinately also the number of merges of the file)  | 
|
65  | 
- number of commits in the ancestry of a branch  | 
|
66  | 
(subordinately also the number of merges)  | 
|
67  | 
- number of revisions in a repository  | 
|
68  | 
- number of fileids in a repository  | 
|
69  | 
- number of ghosts in a given graph (revision or per-file)  | 
|
70  | 
- number of branches in a repository  | 
|
71  | 
- number of concurrent readers for a tree/branch/repository  | 
|
72  | 
- number of concurrent writers for objects that support that.  | 
|
73  | 
- latency to perform file operations (e.g. slow disks, network file systems,  | 
|
74  | 
our VFS layer and FTP/SFTP/etc)  | 
|
75  | 
- bandwidth to the disk storage  | 
|
76  | 
- latency to perform semantic operations (hpss specific)  | 
|
77  | 
- bandwidth when performing semantic operations.  | 
|
| 
2506.1.1
by Alexander Belchenko
 sanitize developers docs  | 
78  | 
|
| 
2481.1.5
by Robert Collins
 Incremental push-pull performance anlysis draft.  | 
79  | 
- locality of reference: If an operation requires data that is located  | 
| 
2506.1.1
by Alexander Belchenko
 sanitize developers docs  | 
80  | 
within a small region at any point, we often get better performance  | 
81  | 
than with an implementation of the same operation that requires the  | 
|
82  | 
same amount of data but with a lower locality of reference. Its  | 
|
83  | 
fairly tricky to add locality of reference after the fact, so I think  | 
|
84  | 
its worth considering up front.  | 
|
| 
2481.1.5
by Robert Collins
 Incremental push-pull performance anlysis draft.  | 
85  | 
|
86  | 
Using these factors, to the annotate example we can add that its  | 
|
87  | 
reasonable to do two 'semantic' round trips to the local tree, one to  | 
|
88  | 
the branch object, and two to the repository. In file-operation level  | 
|
89  | 
measurements, in an ideal world there would be no more than one round  | 
|
90  | 
trip for each semantic operation. What there must not be is one round  | 
|
91  | 
trip per revision involved in the revisionid->dotted number mapping, nor  | 
|
92  | 
per each revision id attributed to a line in the text.  | 
|
93  | 
||
94  | 
Not all the items mentioned above are created equal. The analysis should  | 
|
95  | 
include the parameters considered and the common case values for each - the  | 
|
96  | 
optimisation should be around the common cases not around the exceptions.  | 
|
97  | 
||
98  | 
For instance, we have a smart server now; file level operations are relatively  | 
|
99  | 
low latency and we should use that as the common case. At this point we intend  | 
|
100  | 
to preserve the performance of the dumb protocol networking, but focus on  | 
|
101  | 
improving network performance via the smart server and thus escape the  | 
|
102  | 
file-level operation latency considerations.  | 
|
103  | 
||
104  | 
Many performance problems only become visible when changing the scaling knobs  | 
|
105  | 
upwards to large trees. On small trees its our baseline performance that drives  | 
|
106  | 
incremental improvements; on large trees its the amount of processing per item  | 
|
| 
2513.1.1
by Martin Pool
 (in progress) analysis of commit  | 
107  | 
that drives performance. A significant goal therefore is to keep the amount of  | 
| 
2481.1.5
by Robert Collins
 Incremental push-pull performance anlysis draft.  | 
108  | 
data to be processed under control. Ideally we can scale in a sublinear fashion  | 
109  | 
for all operations, but we MUST NOT scale even linearly for operations that  | 
|
110  | 
invoke a latency multiplier. For example, reading a file on disk requires  | 
|
111  | 
finding the inode for the file, then the block with the data and returning the  | 
|
112  | 
contents. Due to directory grouping logic we pay a massive price to read files  | 
|
113  | 
if we do not group the reads of files within the same directory.  |