# Copyright (C) 2005, 2006, 2007 Canonical Ltd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

"""Knit versionedfile implementation.

A knit is a versioned file implementation that supports efficient append only
updates.

Knit file layout:
lifeless: the data file is made up of "delta records".  each delta record has
a delta header that contains; (1) a version id, (2) the size of the delta (in
lines), and (3) the digest of the -expanded data- (ie, the delta applied to
the parent).  the delta also ends with an end-marker; simply "end VERSION"

delta can be line or full contents.
... the 8's there are the index number of the annotation.
version robertc@robertcollins.net-20051003014215-ee2990904cc4c7ad 7 c7d23b2a5bd6ca00e8e266cec0ec228158ee9f9e
59,59,3
8
8         if ie.executable:
8             e.set('executable', 'yes')
130,130,2
8         if elt.get('executable') == 'yes':
8             ie.executable = True
end robertc@robertcollins.net-20051003014215-ee2990904cc4c7ad


whats in an index:
09:33 < jrydberg> lifeless: each index is made up of a tuple of; version id, options, position, size, parents
09:33 < jrydberg> lifeless: the parents are currently dictionary compressed
09:33 < jrydberg> lifeless: (meaning it currently does not support ghosts)
09:33 < lifeless> right
09:33 < jrydberg> lifeless: the position and size is the range in the data file


so the index sequence is the dictionary compressed sequence number used
in the deltas to provide line annotation

"""
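The delta records described above can be exercised in isolation. The sketch below is not part of bzrlib; `apply_line_delta` is a hypothetical helper showing how the `start,end,count` hunks of a line-delta patch a parent text:

```python
def apply_line_delta(lines, delta):
    # Illustrative sketch, not part of bzrlib.  Each hunk replaces
    # lines[start:end] of the *parent* text with new_lines; offset tracks
    # how far the patched text has drifted from parent coordinates.
    lines = list(lines)
    offset = 0
    for start, end, count, new_lines in delta:
        lines[offset + start:offset + end] = new_lines
        offset = offset + (start - end) + count
    return lines

parent = ['a\n', 'b\n', 'c\n']
# A "59,59,3"-style header means: replace parent lines [59, 59) with 3 new
# lines (an insertion).  Here we replace parent line 1 with two lines.
delta = [(1, 2, 2, ['B1\n', 'B2\n'])]
patched = apply_line_delta(parent, delta)
# patched == ['a\n', 'B1\n', 'B2\n', 'c\n']
```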

# TODOS:
# 10:16 < lifeless> make partial index writes safe
# 10:16 < lifeless> implement 'knit.check()' like weave.check()
# 10:17 < lifeless> record known ghosts so we can detect when they are filled in rather than the current 'reweave
#                    always' approach.
# move sha1 out of the content so that join is faster at verifying parents
# record content length ?


from cStringIO import StringIO
from itertools import izip, chain
import operator
import os

from bzrlib.lazy_import import lazy_import
lazy_import(globals(), """
from bzrlib import (
    annotate,
    debug,
    diff,
    graph as _mod_graph,
    index as _mod_index,
    lru_cache,
    pack,
    progress,
    trace,
    tsort,
    tuned_gzip,
    )
""")
from bzrlib import (
    errors,
    osutils,
    patiencediff,
    )
from bzrlib.errors import (
    FileExists,
    NoSuchFile,
    KnitError,
    InvalidRevisionId,
    KnitCorrupt,
    KnitHeaderError,
    RevisionNotPresent,
    RevisionAlreadyPresent,
    )
from bzrlib.osutils import (
    contains_whitespace,
    contains_linebreaks,
    sha_string,
    sha_strings,
    split_lines,
    )
from bzrlib.versionedfile import (
    AbsentContentFactory,
    adapter_registry,
    ConstantMapper,
    ContentFactory,
    FulltextContentFactory,
    VersionedFile,
    VersionedFiles,
    )


# TODO: Split out code specific to this format into an associated object.

# TODO: Can we put in some kind of value to check that the index and data
# files belong together?

# TODO: accommodate binaries, perhaps by storing a byte count

# TODO: function to check whole file

# TODO: atomically append data, then measure backwards from the cursor
# position after writing to work out where it was located.  we may need to
# bypass python file buffering.

DATA_SUFFIX = '.knit'
INDEX_SUFFIX = '.kndx'
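A knit lives on disk as a pair of files sharing a base name: the append-only data file and its index. A minimal illustration (`knit_file_names` is a hypothetical helper, not part of bzrlib):

```python
DATA_SUFFIX = '.knit'
INDEX_SUFFIX = '.kndx'

def knit_file_names(name):
    # A knit 'foo' is stored as foo.knit (the delta records) plus
    # foo.kndx (the index mapping version ids to byte ranges in the
    # data file).
    return (name + DATA_SUFFIX, name + INDEX_SUFFIX)

# knit_file_names('inventory') -> ('inventory.knit', 'inventory.kndx')
```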


class KnitAdapter(object):
    """Base class for knit record adaption."""

    def __init__(self, basis_vf):
        """Create an adapter which accesses full texts from basis_vf.

        :param basis_vf: A versioned file to access basis texts of deltas from.
            May be None for adapters that do not need to access basis texts.
        """
        self._data = KnitVersionedFiles(None, None)
        self._annotate_factory = KnitAnnotateFactory()
        self._plain_factory = KnitPlainFactory()
        self._basis_vf = basis_vf


class FTAnnotatedToUnannotated(KnitAdapter):
    """An adapter from FT annotated knits to unannotated ones."""

    def get_bytes(self, factory, annotated_compressed_bytes):
        rec, contents = \
            self._data._parse_record_unchecked(annotated_compressed_bytes)
        content = self._annotate_factory.parse_fulltext(contents, rec[1])
        size, bytes = self._data._record_to_data((rec[1],), rec[3], content.text())
        return bytes


class DeltaAnnotatedToUnannotated(KnitAdapter):
    """An adapter for deltas from annotated knits to unannotated ones."""

    def get_bytes(self, factory, annotated_compressed_bytes):
        rec, contents = \
            self._data._parse_record_unchecked(annotated_compressed_bytes)
        delta = self._annotate_factory.parse_line_delta(contents, rec[1],
            plain=True)
        contents = self._plain_factory.lower_line_delta(delta)
        size, bytes = self._data._record_to_data((rec[1],), rec[3], contents)
        return bytes


class FTAnnotatedToFullText(KnitAdapter):
    """An adapter from FT annotated knits to full texts."""

    def get_bytes(self, factory, annotated_compressed_bytes):
        rec, contents = \
            self._data._parse_record_unchecked(annotated_compressed_bytes)
        content, delta = self._annotate_factory.parse_record(factory.key[-1],
            contents, factory._build_details, None)
        return ''.join(content.text())


class DeltaAnnotatedToFullText(KnitAdapter):
    """An adapter for deltas from annotated knits to full texts."""

    def get_bytes(self, factory, annotated_compressed_bytes):
        rec, contents = \
            self._data._parse_record_unchecked(annotated_compressed_bytes)
        delta = self._annotate_factory.parse_line_delta(contents, rec[1],
            plain=True)
        compression_parent = factory.parents[0]
        basis_entry = self._basis_vf.get_record_stream(
            [compression_parent], 'unordered', True).next()
        if basis_entry.storage_kind == 'absent':
            raise errors.RevisionNotPresent(compression_parent, self._basis_vf)
        basis_lines = split_lines(basis_entry.get_bytes_as('fulltext'))
        # Manually apply the delta because we have one annotated content and
        # one plain.
        basis_content = PlainKnitContent(basis_lines, compression_parent)
        basis_content.apply_delta(delta, rec[1])
        basis_content._should_strip_eol = factory._build_details[1]
        return ''.join(basis_content.text())


class FTPlainToFullText(KnitAdapter):
    """An adapter from FT plain knits to full texts."""

    def get_bytes(self, factory, compressed_bytes):
        rec, contents = \
            self._data._parse_record_unchecked(compressed_bytes)
        content, delta = self._plain_factory.parse_record(factory.key[-1],
            contents, factory._build_details, None)
        return ''.join(content.text())


class DeltaPlainToFullText(KnitAdapter):
    """An adapter for deltas from plain knits to full texts."""

    def get_bytes(self, factory, compressed_bytes):
        rec, contents = \
            self._data._parse_record_unchecked(compressed_bytes)
        delta = self._plain_factory.parse_line_delta(contents, rec[1])
        compression_parent = factory.parents[0]
        # XXX: string splitting overhead.
        basis_entry = self._basis_vf.get_record_stream(
            [compression_parent], 'unordered', True).next()
        if basis_entry.storage_kind == 'absent':
            raise errors.RevisionNotPresent(compression_parent, self._basis_vf)
        basis_lines = split_lines(basis_entry.get_bytes_as('fulltext'))
        basis_content = PlainKnitContent(basis_lines, compression_parent)
        # Manually apply the delta because we have one annotated content and
        # one plain.
        content, _ = self._plain_factory.parse_record(rec[1], contents,
            factory._build_details, basis_content)
        return ''.join(content.text())


class KnitContentFactory(ContentFactory):
    """Content factory for streaming from knits.

    :seealso ContentFactory:
    """

    def __init__(self, key, parents, build_details, sha1, raw_record,
        annotated, knit=None):
        """Create a KnitContentFactory for key.

        :param key: The key.
        :param parents: The parents.
        :param build_details: The build details as returned from
            get_build_details.
        :param sha1: The sha1 expected from the full text of this object.
        :param raw_record: The bytes of the knit data from disk.
        :param annotated: True if the raw data is annotated.
        """
        ContentFactory.__init__(self)
        self.sha1 = sha1
        self.key = key
        self.parents = parents
        if build_details[0] == 'line-delta':
            kind = 'delta'
        else:
            kind = 'ft'
        if annotated:
            annotated_kind = 'annotated-'
        else:
            annotated_kind = ''
        self.storage_kind = 'knit-%s%s-gz' % (annotated_kind, kind)
        self._raw_record = raw_record
        self._build_details = build_details
        self._knit = knit

    def get_bytes_as(self, storage_kind):
        if storage_kind == self.storage_kind:
            return self._raw_record
        if storage_kind == 'fulltext' and self._knit is not None:
            return self._knit.get_text(self.key[0])
        else:
            raise errors.UnavailableRepresentation(self.key, storage_kind,
                self.storage_kind)
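The `storage_kind` computed in `__init__` above takes one of four values depending on how the record is stored. A standalone restatement (`knit_storage_kind` is a hypothetical helper mirroring that branching, not a bzrlib API):

```python
def knit_storage_kind(build_method, annotated):
    # 'line-delta' records stream as deltas; anything else streams as a
    # full text ('ft'); annotated records get an 'annotated-' prefix.
    kind = 'delta' if build_method == 'line-delta' else 'ft'
    annotated_kind = 'annotated-' if annotated else ''
    return 'knit-%s%s-gz' % (annotated_kind, kind)

# knit_storage_kind('line-delta', True) -> 'knit-annotated-delta-gz'
# knit_storage_kind('fulltext', False)  -> 'knit-ft-gz'
```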


class KnitContent(object):
    """Content of a knit version to which deltas can be applied.

    This is always stored in memory as a list of lines with \n at the end,
    plus a flag saying if the final ending is really there or not, because that
    corresponds to the on-disk knit representation.
    """

    def __init__(self):
        self._should_strip_eol = False

    def apply_delta(self, delta, new_version_id):
        """Apply delta to this object to become new_version_id."""
        raise NotImplementedError(self.apply_delta)

    def line_delta_iter(self, new_lines):
        """Generate line-based delta from this content to new_lines."""
        new_texts = new_lines.text()
        old_texts = self.text()
        s = patiencediff.PatienceSequenceMatcher(None, old_texts, new_texts)
        for tag, i1, i2, j1, j2 in s.get_opcodes():
            if tag == 'equal':
                continue
            # ofrom, oto, length, data
            yield i1, i2, j2 - j1, new_lines._lines[j1:j2]

    def line_delta(self, new_lines):
        return list(self.line_delta_iter(new_lines))

    @staticmethod
    def get_line_delta_blocks(knit_delta, source, target):
        """Extract SequenceMatcher.get_matching_blocks() from a knit delta"""
        target_len = len(target)
        s_pos = 0
        t_pos = 0
        for s_begin, s_end, t_len, new_text in knit_delta:
            true_n = s_begin - s_pos
            n = true_n
            if n > 0:
                # knit deltas do not provide reliable info about whether the
                # last line of a file matches, due to eol handling.
                if source[s_pos + n - 1] != target[t_pos + n - 1]:
                    n -= 1
                if n > 0:
                    yield s_pos, t_pos, n
            t_pos += t_len + true_n
            s_pos = s_end
        n = target_len - t_pos
        if n > 0:
            if source[s_pos + n - 1] != target[t_pos + n - 1]:
                n -= 1
            if n > 0:
                yield s_pos, t_pos, n
        yield s_pos + (target_len - t_pos), target_len, 0
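For experimenting with the two methods above outside bzrlib, they can be restated over plain line lists: `make_line_delta` mirrors `line_delta_iter` (using stdlib `difflib` in place of bzrlib's `patiencediff`, so hunk boundaries may differ on some inputs), and `line_delta_blocks` mirrors `get_line_delta_blocks`. Both helper names are hypothetical:

```python
import difflib

def make_line_delta(old_lines, new_lines):
    # Every non-equal opcode becomes one (start, end, count, lines) hunk:
    # replace old_lines[start:end] with `lines`, where count == len(lines).
    delta = []
    s = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in s.get_opcodes():
        if tag == 'equal':
            continue
        delta.append((i1, i2, j2 - j1, new_lines[j1:j2]))
    return delta

def line_delta_blocks(knit_delta, source, target):
    # List-returning restatement of get_line_delta_blocks above.
    target_len = len(target)
    s_pos = 0
    t_pos = 0
    blocks = []
    for s_begin, s_end, t_len, new_text in knit_delta:
        true_n = s_begin - s_pos
        n = true_n
        if n > 0:
            # The last line of a matching run may differ only in its
            # trailing newline, so shorten the run when in doubt.
            if source[s_pos + n - 1] != target[t_pos + n - 1]:
                n -= 1
            if n > 0:
                blocks.append((s_pos, t_pos, n))
        t_pos += t_len + true_n
        s_pos = s_end
    n = target_len - t_pos
    if n > 0:
        if source[s_pos + n - 1] != target[t_pos + n - 1]:
            n -= 1
        if n > 0:
            blocks.append((s_pos, t_pos, n))
    blocks.append((s_pos + (target_len - t_pos), target_len, 0))
    return blocks

old = ['a\n', 'b\n', 'c\n']
new = ['a\n', 'x\n', 'c\n']
delta = make_line_delta(old, new)
# delta == [(1, 2, 1, ['x\n'])]; the recovered blocks match
# difflib.SequenceMatcher(None, old, new).get_matching_blocks().
```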


class AnnotatedKnitContent(KnitContent):
    """Annotated content."""

    def __init__(self, lines):
        KnitContent.__init__(self)
        self._lines = lines

    def annotate(self):
        """Return a list of (origin, text) for each content line."""
        lines = self._lines[:]
        if self._should_strip_eol:
            origin, last_line = lines[-1]
            lines[-1] = (origin, last_line.rstrip('\n'))
        return lines

    def apply_delta(self, delta, new_version_id):
        """Apply delta to this object to become new_version_id."""
        offset = 0
        lines = self._lines
        for start, end, count, delta_lines in delta:
            lines[offset+start:offset+end] = delta_lines
            offset = offset + (start - end) + count

    def text(self):
        try:
            lines = [text for origin, text in self._lines]
        except ValueError, e:
            # most commonly (only?) caused by the internal form of the knit
            # missing annotation information because of a bug - see thread
            # around 20071015
            raise KnitCorrupt(self,
                "line in annotated knit missing annotation information: %s"
                % (e,))
        if self._should_strip_eol:
            lines[-1] = lines[-1].rstrip('\n')
        return lines

    def copy(self):
        return AnnotatedKnitContent(self._lines[:])
381  | 
class PlainKnitContent(KnitContent):  | 
|
| 
2794.1.3
by Robert Collins
    """Unannotated content.

    When annotate[_iter] is called on this content, the same version is
    reported for all lines. Generally, annotate[_iter] is not useful on
    PlainKnitContent objects.
    """

    def __init__(self, lines, version_id):
        KnitContent.__init__(self)
        self._lines = lines
        self._version_id = version_id

    def annotate(self):
        """Return a list of (origin, text) for each content line."""
        return [(self._version_id, line) for line in self._lines]

    def apply_delta(self, delta, new_version_id):
        """Apply delta to this object to become new_version_id."""
        offset = 0
        lines = self._lines
        for start, end, count, delta_lines in delta:
            lines[offset+start:offset+end] = delta_lines
            offset = offset + (start - end) + count
        self._version_id = new_version_id

    def copy(self):
        return PlainKnitContent(self._lines[:], self._version_id)

    def text(self):
        lines = self._lines
        if self._should_strip_eol:
            lines = lines[:]
            lines[-1] = lines[-1].rstrip('\n')
        return lines

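The offset bookkeeping in `apply_delta` above is easy to get wrong, so here is a minimal standalone sketch (plain Python 3, no bzrlib imports; the `apply_delta` function below is a hypothetical free-standing copy of the method's logic) showing how earlier hunks shift the positions of later ones:

```python
def apply_delta(lines, delta):
    # Each (start, end, count, new_lines) hunk replaces lines[start:end]
    # of the *original* text with `count` new lines; `offset` tracks how
    # earlier hunks have shifted later positions.
    offset = 0
    for start, end, count, delta_lines in delta:
        lines[offset + start:offset + end] = delta_lines
        offset = offset + (start - end) + count
    return lines

base = ['a\n', 'b\n', 'c\n', 'd\n']
# Replace original line 1 ('b') with two lines, then delete line 3 ('d').
delta = [(1, 2, 2, ['B1\n', 'B2\n']), (3, 4, 0, [])]
assert apply_delta(base, delta) == ['a\n', 'B1\n', 'B2\n', 'c\n']
```

Note that the second hunk's indices still refer to the original four-line text; the accumulated offset translates them into positions in the partially patched list.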
class _KnitFactory(object):
    """Base class for common Factory functions."""

    def parse_record(self, version_id, record, record_details,
                     base_content, copy_base_content=True):
        """Parse a record into a full content object.

        :param version_id: The official version id for this content
        :param record: The data returned by read_records_iter()
        :param record_details: Details about the record returned by
            get_build_details
        :param base_content: If get_build_details returns a compression_parent,
            you must provide a base_content here, else use None
        :param copy_base_content: When building from the base_content, decide
            whether to copy it and return a new object, or modify it in
            place.
        :return: (content, delta) A Content object and possibly a line-delta,
            delta may be None
        """
        method, noeol = record_details
        if method == 'line-delta':
            if copy_base_content:
                content = base_content.copy()
            else:
                content = base_content
            delta = self.parse_line_delta(record, version_id)
            content.apply_delta(delta, version_id)
        else:
            content = self.parse_fulltext(record, version_id)
            delta = None
        content._should_strip_eol = noeol
        return (content, delta)

class KnitAnnotateFactory(_KnitFactory):
    """Factory for creating annotated Content objects."""

    annotated = True

    def make(self, lines, version_id):
        num_lines = len(lines)
        return AnnotatedKnitContent(zip([version_id] * num_lines, lines))

    def parse_fulltext(self, content, version_id):
        """Convert fulltext to internal representation.

        fulltext content is of the format
        revid(utf8) plaintext\n
        internal representation is of the format:
        (revid, plaintext)
        """
        # TODO: jam 20070209 The tests expect this to be returned as tuples,
        #       but the code itself doesn't really depend on that.
        #       Figure out a way to not require the overhead of turning the
        #       list back into tuples.
        lines = [tuple(line.split(' ', 1)) for line in content]
        return AnnotatedKnitContent(lines)

    def parse_line_delta_iter(self, lines):
        return iter(self.parse_line_delta(lines))

    def parse_line_delta(self, lines, version_id, plain=False):
        """Convert a line based delta into internal representation.

        line delta is in the form of:
        intstart intend intcount
        1..count lines:
        revid(utf8) newline\n

        internal representation is
        (start, end, count, [1..count tuples (revid, newline)])

        :param plain: If True, the lines are returned as a plain list
            without annotations, not as a list of (origin, content) tuples,
            i.e. (start, end, count, [1..count newline])
        """
        result = []
        lines = iter(lines)
        next = lines.next

        cache = {}
        def cache_and_return(line):
            origin, text = line.split(' ', 1)
            return cache.setdefault(origin, origin), text

        # walk through the lines parsing.
        # Note that the plain test is explicitly pulled out of the
        # loop to minimise any performance impact
        if plain:
            for header in lines:
                start, end, count = [int(n) for n in header.split(',')]
                contents = [next().split(' ', 1)[1] for i in xrange(count)]
                result.append((start, end, count, contents))
        else:
            for header in lines:
                start, end, count = [int(n) for n in header.split(',')]
                contents = [tuple(next().split(' ', 1)) for i in xrange(count)]
                result.append((start, end, count, contents))
        return result

    def get_fulltext_content(self, lines):
        """Extract just the content lines from a fulltext."""
        return (line.split(' ', 1)[1] for line in lines)

    def get_linedelta_content(self, lines):
        """Extract just the content from a line delta.

        This doesn't return all of the extra information stored in a delta.
        Only the actual content lines.
        """
        lines = iter(lines)
        next = lines.next
        for header in lines:
            header = header.split(',')
            count = int(header[2])
            for i in xrange(count):
                origin, text = next().split(' ', 1)
                yield text

    def lower_fulltext(self, content):
        """Convert a fulltext content record into a serializable form.

        See parse_fulltext which this inverts.
        """
        # TODO: jam 20070209 We only do the caching thing to make sure that
        #       the origin is a valid utf-8 line, eventually we could remove it
        return ['%s %s' % (o, t) for o, t in content._lines]

    def lower_line_delta(self, delta):
        """Convert a delta into a serializable form.

        See parse_line_delta which this inverts.
        """
        # TODO: jam 20070209 We only do the caching thing to make sure that
        #       the origin is a valid utf-8 line, eventually we could remove it
        out = []
        for start, end, c, lines in delta:
            out.append('%d,%d,%d\n' % (start, end, c))
            out.extend(origin + ' ' + text
                       for origin, text in lines)
        return out

    def annotate(self, knit, key):
        content = knit._get_content(key)
        # adjust for the fact that serialised annotations are only key
        # suffixes for this factory.
        if type(key) == tuple:
            prefix = key[:-1]
            origins = content.annotate()
            result = []
            for origin, line in origins:
                result.append((prefix + (origin,), line))
            return result
        else:
            # XXX: This smells a bit.  Why would key ever be a non-tuple here?
            # Aren't keys defined to be tuples?  -- spiv 20080618
            return content.annotate()

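The annotated line-delta wire format described in `parse_line_delta` above — a `start,end,count` header followed by `count` payload lines of the form `revid text` — can be sketched in isolation. This is a hypothetical Python 3 port for illustration only (the real method is Python 2 and uses `lines.next`/`xrange`):

```python
def parse_annotated_line_delta(lines):
    # Each hunk: a 'start,end,count' header, then `count` lines whose
    # first space-separated field is the origin revision id.
    result = []
    it = iter(lines)
    for header in it:
        start, end, count = [int(n) for n in header.split(',')]
        contents = [tuple(next(it).split(' ', 1)) for _ in range(count)]
        result.append((start, end, count, contents))
    return result

delta_lines = [
    '1,2,2\n',        # replace lines 1..2 with the 2 lines that follow
    'rev-2 B1\n',     # each payload line carries its origin revid
    'rev-2 B2\n',
]
assert parse_annotated_line_delta(delta_lines) == \
    [(1, 2, 2, [('rev-2', 'B1\n'), ('rev-2', 'B2\n')])]
```

Splitting on the first space only (`split(' ', 1)`) is what keeps the rest of the line, including its trailing newline, intact as the content.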
class KnitPlainFactory(_KnitFactory):
    """Factory for creating plain Content objects."""

    annotated = False

    def make(self, lines, version_id):
        return PlainKnitContent(lines, version_id)

    def parse_fulltext(self, content, version_id):
        """This parses an unannotated fulltext.

        Note that this is not a noop - the internal representation
        has (versionid, line) - it's just a constant versionid.
        """
        return self.make(content, version_id)

    def parse_line_delta_iter(self, lines, version_id):
        cur = 0
        num_lines = len(lines)
        while cur < num_lines:
            header = lines[cur]
            cur += 1
            start, end, c = [int(n) for n in header.split(',')]
            yield start, end, c, lines[cur:cur+c]
            cur += c

    def parse_line_delta(self, lines, version_id):
        return list(self.parse_line_delta_iter(lines, version_id))

    def get_fulltext_content(self, lines):
        """Extract just the content lines from a fulltext."""
        return iter(lines)

    def get_linedelta_content(self, lines):
        """Extract just the content from a line delta.

        This doesn't return all of the extra information stored in a delta.
        Only the actual content lines.
        """
        lines = iter(lines)
        next = lines.next
        for header in lines:
            header = header.split(',')
            count = int(header[2])
            for i in xrange(count):
                yield next()

    def lower_fulltext(self, content):
        return content.text()

    def lower_line_delta(self, delta):
        out = []
        for start, end, c, lines in delta:
            out.append('%d,%d,%d\n' % (start, end, c))
            out.extend(lines)
        return out

    def annotate(self, knit, key):
        annotator = _KnitAnnotator(knit)
        return annotator.annotate(key)

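`lower_line_delta` and `parse_line_delta_iter` in `KnitPlainFactory` are inverses: one serialises hunks to header-plus-payload lines, the other parses them back. A small self-contained round-trip sketch in Python 3 (free-standing copies of the two methods' logic, for illustration only):

```python
def lower_line_delta(delta):
    # Serialise: a 'start,end,count' header, then the raw replacement
    # lines with no annotations.
    out = []
    for start, end, c, lines in delta:
        out.append('%d,%d,%d\n' % (start, end, c))
        out.extend(lines)
    return out

def parse_line_delta(lines):
    # Parse back, walking the list by header + payload-count.
    cur = 0
    while cur < len(lines):
        start, end, c = [int(n) for n in lines[cur].split(',')]
        cur += 1
        yield start, end, c, lines[cur:cur + c]
        cur += c

delta = [(1, 2, 2, ['B1\n', 'B2\n'])]
serialised = lower_line_delta(delta)
assert serialised == ['1,2,2\n', 'B1\n', 'B2\n']
assert list(parse_line_delta(serialised)) == delta
```

The header's trailing `\n` survives `split(',')` inside the count field, but `int()` tolerates the surrounding whitespace, which is why the real parser gets away without stripping it.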
def make_file_factory(annotated, mapper):
    """Create a factory for creating a file based KnitVersionedFiles.

    This is only functional enough to run interface tests, it doesn't try to
    provide a full pack environment.

    :param annotated: knit annotations are wanted.
    :param mapper: The mapper from keys to paths.
    """
    def factory(transport):
        index = _KndxIndex(transport, mapper, lambda: None, lambda: True,
            lambda: True)
        access = _KnitKeyAccess(transport, mapper)
        return KnitVersionedFiles(index, access, annotated=annotated)
    return factory

def make_pack_factory(graph, delta, keylength):
    """Create a factory for creating a pack based VersionedFiles.

    This is only functional enough to run interface tests, it doesn't try to
    provide a full pack environment.

    :param graph: Store a graph.
    :param delta: Delta compress contents.
    :param keylength: How long should keys be.
    """
    def factory(transport):
        parents = graph or delta
        ref_length = 0
        if graph:
            ref_length += 1
        if delta:
            ref_length += 1
            max_delta_chain = 200
        else:
            max_delta_chain = 0
        graph_index = _mod_index.InMemoryGraphIndex(reference_lists=ref_length,
            key_elements=keylength)
        stream = transport.open_write_stream('newpack')
        writer = pack.ContainerWriter(stream.write)
        writer.begin()
        index = _KnitGraphIndex(graph_index, lambda: True, parents=parents,
            deltas=delta, add_callback=graph_index.add_nodes)
        access = _DirectPackAccess({})
        access.set_writer(writer, graph_index, (transport, 'newpack'))
        result = KnitVersionedFiles(index, access,
            max_delta_chain=max_delta_chain)
        result.stream = stream
        result.writer = writer
        return result
    return factory

def cleanup_pack_knit(versioned_files):
    versioned_files.stream.close()
    versioned_files.writer.end()

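Knit records always store the last line with a trailing newline; when the original text lacks one, `KnitVersionedFiles._add` (below) records a `no-eol` option and the content object's `text()` strips the newline back off on extraction. A self-contained round-trip sketch of that convention (the helper names here are hypothetical, not bzrlib API):

```python
def store_lines(lines):
    # Mirrors the no-eol handling in _add: add a newline to the last line
    # if it is missing one, and note the fact in the options list.
    options = []
    if lines and lines[-1][-1] != '\n':
        lines = lines[:]              # copy; don't mutate the caller's list
        options.append('no-eol')
        lines[-1] = lines[-1] + '\n'
    return options, lines

def extract_text(options, lines):
    # Mirrors PlainKnitContent.text() when _should_strip_eol is set.
    if 'no-eol' in options:
        lines = lines[:]
        lines[-1] = lines[-1].rstrip('\n')
    return lines

original = ['a\n', 'b']               # last line has no trailing newline
options, stored = store_lines(original)
assert stored == ['a\n', 'b\n'] and options == ['no-eol']
assert extract_text(options, stored) == original
```

Copying the list before appending the newline matters: the caller's `lines` must not be mutated as a side effect of storage.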
class KnitVersionedFiles(VersionedFiles):  | 
|
698  | 
"""Storage for many versioned files using knit compression.  | 
|
699  | 
||
700  | 
    Backend storage is managed by indices and data objects.
 | 
|
| 
3582.1.14
by Martin Pool
 Clearer comments about KnitVersionedFile stacking  | 
701  | 
|
702  | 
    :ivar _index: A _KnitGraphIndex or similar that can describe the 
 | 
|
703  | 
        parents, graph, compression and data location of entries in this 
 | 
|
704  | 
        KnitVersionedFiles.  Note that this is only the index for 
 | 
|
| 
3582.1.16
by Martin Pool
 Review feedback and news entry  | 
705  | 
        *this* vfs; if there are fallbacks they must be queried separately.
 | 
| 
3350.6.4
by Robert Collins
 First cut at pluralised VersionedFiles. Some rather massive API incompatabilities, primarily because of the difficulty of coherence among competing stores.  | 
706  | 
    """
 | 
707  | 
||
708  | 
def __init__(self, index, data_access, max_delta_chain=200,  | 
|
709  | 
annotated=False):  | 
|
710  | 
"""Create a KnitVersionedFiles with index and data_access.  | 
|
711  | 
||
712  | 
        :param index: The index for the knit data.
 | 
|
713  | 
        :param data_access: The access object to store and retrieve knit
 | 
|
714  | 
            records.
 | 
|
715  | 
        :param max_delta_chain: The maximum number of deltas to permit during
 | 
|
716  | 
            insertion. Set to 0 to prohibit the use of deltas.
 | 
|
717  | 
        :param annotated: Set to True to cause annotations to be calculated and
 | 
|
718  | 
            stored during insertion.
 | 
|
| 
1563.2.25
by Robert Collins
 Merge in upstream.  | 
719  | 
        """
 | 
| 
3316.2.3
by Robert Collins
 Remove manual notification of transaction finishing on versioned files.  | 
720  | 
self._index = index  | 
| 
3350.6.4
by Robert Collins
 First cut at pluralised VersionedFiles. Some rather massive API incompatabilities, primarily because of the difficulty of coherence among competing stores.  | 
721  | 
self._access = data_access  | 
722  | 
self._max_delta_chain = max_delta_chain  | 
|
723  | 
if annotated:  | 
|
724  | 
self._factory = KnitAnnotateFactory()  | 
|
725  | 
else:  | 
|
726  | 
self._factory = KnitPlainFactory()  | 
|
| 
3350.8.1
by Robert Collins
 KnitVersionedFiles.add_fallback_versioned_files exists.  | 
727  | 
self._fallback_vfs = []  | 
728  | 
||
| 
3702.1.1
by Martin Pool
 Add repr for KnitVersionedFiles  | 
729  | 
def __repr__(self):  | 
730  | 
return "%s(%r, %r)" % (  | 
|
731  | 
self.__class__.__name__,  | 
|
732  | 
self._index,  | 
|
733  | 
self._access)  | 
|
734  | 
||
| 
3350.8.1
by Robert Collins
 KnitVersionedFiles.add_fallback_versioned_files exists.  | 
735  | 
def add_fallback_versioned_files(self, a_versioned_files):  | 
736  | 
"""Add a source of texts for texts not present in this knit.  | 
|
737  | 
||
738  | 
        :param a_versioned_files: A VersionedFiles object.
 | 
|
739  | 
        """
 | 
|
740  | 
self._fallback_vfs.append(a_versioned_files)  | 
|
| 
3350.6.4
by Robert Collins
 First cut at pluralised VersionedFiles. Some rather massive API incompatabilities, primarily because of the difficulty of coherence among competing stores.  | 
741  | 
|
742  | 
def add_lines(self, key, parents, lines, parent_texts=None,  | 
|
743  | 
left_matching_blocks=None, nostore_sha=None, random_id=False,  | 
|
744  | 
check_content=True):  | 
|
745  | 
"""See VersionedFiles.add_lines()."""  | 
|
746  | 
self._index._check_write_ok()  | 
|
747  | 
self._check_add(key, lines, random_id, check_content)  | 
|
748  | 
if parents is None:  | 
|
| 
3350.6.11
by Martin Pool
 Review cleanups and documentation from Robert's mail on 2080618  | 
749  | 
            # The caller might pass None if there is no graph data, but kndx
 | 
750  | 
            # indexes can't directly store that, so we give them
 | 
|
751  | 
            # an empty tuple instead.
 | 
|
| 
3350.6.4
by Robert Collins
 First cut at pluralised VersionedFiles. Some rather massive API incompatabilities, primarily because of the difficulty of coherence among competing stores.  | 
752  | 
parents = ()  | 
753  | 
return self._add(key, lines, parents,  | 
|
754  | 
parent_texts, left_matching_blocks, nostore_sha, random_id)  | 
|
755  | 
||
756  | 
def _add(self, key, lines, parents, parent_texts,  | 
|
757  | 
left_matching_blocks, nostore_sha, random_id):  | 
|
758  | 
"""Add a set of lines on top of version specified by parents.  | 
|
759  | 
||
760  | 
        Any versions not present will be converted into ghosts.
 | 
|
761  | 
        """
 | 
|
762  | 
        # first thing, if the content is something we don't need to store, find
 | 
|
763  | 
        # that out.
 | 
|
764  | 
line_bytes = ''.join(lines)  | 
|
765  | 
digest = sha_string(line_bytes)  | 
|
766  | 
if nostore_sha == digest:  | 
|
767  | 
raise errors.ExistingContent  | 
|
768  | 
||
769  | 
present_parents = []  | 
|
770  | 
if parent_texts is None:  | 
|
771  | 
parent_texts = {}  | 
|
772  | 
        # Do a single query to ascertain parent presence.
 | 
|
773  | 
present_parent_map = self.get_parent_map(parents)  | 
|
774  | 
for parent in parents:  | 
|
775  | 
if parent in present_parent_map:  | 
|
776  | 
present_parents.append(parent)  | 
|
777  | 
||
778  | 
        # Currently we can only compress against the left most present parent.
 | 
|
779  | 
if (len(present_parents) == 0 or  | 
|
780  | 
present_parents[0] != parents[0]):  | 
|
781  | 
delta = False  | 
|
782  | 
else:  | 
|
783  | 
            # To speed the extract of texts the delta chain is limited
 | 
|
784  | 
            # to a fixed number of deltas.  This should minimize both
 | 
|
785  | 
            # I/O and the time spend applying deltas.
 | 
|
786  | 
delta = self._check_should_delta(present_parents[0])  | 
|
787  | 
||
788  | 
text_length = len(line_bytes)  | 
|
789  | 
options = []  | 
|
790  | 
if lines:  | 
|
791  | 
if lines[-1][-1] != '\n':  | 
|
792  | 
                # copy the contents of lines.
 | 
|
793  | 
lines = lines[:]  | 
|
794  | 
options.append('no-eol')  | 
|
795  | 
lines[-1] = lines[-1] + '\n'  | 
|
796  | 
line_bytes += '\n'  | 
|
797  | 
||
798  | 
for element in key:  | 
|
799  | 
if type(element) != str:  | 
|
800  | 
raise TypeError("key contains non-strings: %r" % (key,))  | 
|
801  | 
        # Knit hunks are still last-element only
 | 
|
802  | 
version_id = key[-1]  | 
|
803  | 
content = self._factory.make(lines, version_id)  | 
|
804  | 
if 'no-eol' in options:  | 
|
805  | 
            # Hint to the content object that its text() call should strip the
 | 
|
806  | 
            # EOL.
 | 
|
807  | 
content._should_strip_eol = True  | 
|
808  | 
if delta or (self._factory.annotated and len(present_parents) > 0):  | 
|
809  | 
            # Merge annotations from parent texts if needed.
 | 
|
810  | 
delta_hunks = self._merge_annotations(content, present_parents,  | 
|
811  | 
parent_texts, delta, self._factory.annotated,  | 
|
812  | 
left_matching_blocks)  | 
|
813  | 
||
814  | 
if delta:  | 
|
815  | 
options.append('line-delta')  | 
|
816  | 
store_lines = self._factory.lower_line_delta(delta_hunks)  | 
|
817  | 
size, bytes = self._record_to_data(key, digest,  | 
|
818  | 
store_lines)  | 
|
819  | 
else:  | 
|
820  | 
options.append('fulltext')  | 
|
821  | 
            # isinstance is slower and we have no hierarchy.
 | 
|
822  | 
if self._factory.__class__ == KnitPlainFactory:  | 
|
823  | 
                # Use the already joined bytes saving iteration time in
 | 
|
824  | 
                # _record_to_data.
 | 
|
825  | 
size, bytes = self._record_to_data(key, digest,  | 
|
826  | 
lines, [line_bytes])  | 
|
827  | 
else:  | 
|
828  | 
                # get mixed annotation + content and feed it into the
 | 
|
829  | 
                # serialiser.
 | 
|
830  | 
store_lines = self._factory.lower_fulltext(content)  | 
|
831  | 
size, bytes = self._record_to_data(key, digest,  | 
|
832  | 
store_lines)  | 
|
833  | 
||
834  | 
access_memo = self._access.add_raw_records([(key, size)], bytes)[0]  | 
|
835  | 
self._index.add_records(  | 
|
836  | 
((key, options, access_memo, parents),),  | 
|
837  | 
random_id=random_id)  | 
|
838  | 
return digest, text_length, content  | 
|
839  | 
||
840  | 
def annotate(self, key):  | 
|
841  | 
"""See VersionedFiles.annotate."""  | 
|
842  | 
return self._factory.annotate(self, key)  | 
|
843  | 
||
844  | 
def check(self, progress_bar=None):  | 
|
845  | 
"""See VersionedFiles.check()."""  | 
|
846  | 
        # This doesn't actually test extraction of everything, but that will
 | 
|
847  | 
        # impact 'bzr check' substantially, and needs to be integrated with
 | 
|
848  | 
        # care. However, it does check for the obvious problem of a delta with
 | 
|
849  | 
        # no basis.
 | 
|
| 
3517.4.14
by Martin Pool
 KnitVersionedFiles.check should just check its own keys then recurse into fallbacks  | 
850  | 
keys = self._index.keys()  | 
| 
3350.6.4
by Robert Collins
 First cut at pluralised VersionedFiles. Some rather massive API incompatabilities, primarily because of the difficulty of coherence among competing stores.  | 
851  | 
parent_map = self.get_parent_map(keys)  | 
852  | 
for key in keys:  | 
|
853  | 
if self._index.get_method(key) != 'fulltext':  | 
|
854  | 
compression_parent = parent_map[key][0]  | 
|
855  | 
if compression_parent not in parent_map:  | 
|
856  | 
raise errors.KnitCorrupt(self,  | 
|
857  | 
"Missing basis parent %s for %s" % (  | 
|
858  | 
compression_parent, key))  | 
|
| 
3517.4.14
by Martin Pool
 KnitVersionedFiles.check should just check its own keys then recurse into fallbacks  | 
859  | 
for fallback_vfs in self._fallback_vfs:  | 
860  | 
fallback_vfs.check()  | 

    def _check_add(self, key, lines, random_id, check_content):
        """check that version_id and lines are safe to add."""
        version_id = key[-1]
        if contains_whitespace(version_id):
            raise InvalidRevisionId(version_id, self)
        self.check_not_reserved_id(version_id)
        # TODO: If random_id==False and the key is already present, we should
        # probably check that the existing content is identical to what is
        # being inserted, and otherwise raise an exception.  This would make
        # the bundle code simpler.
        if check_content:
            self._check_lines_not_unicode(lines)
            self._check_lines_are_lines(lines)

    def _check_header(self, key, line):
        rec = self._split_header(line)
        self._check_header_version(rec, key[-1])
        return rec

    def _check_header_version(self, rec, version_id):
        """Checks the header version on original format knit records.

        These have the last component of the key embedded in the record.
        """
        if rec[1] != version_id:
            raise KnitCorrupt(self,
                'unexpected version, wanted %r, got %r' % (version_id, rec[1]))

    def _check_should_delta(self, parent):
        """Iterate back through the parent listing, looking for a fulltext.

        This is used when we want to decide whether to add a delta or a new
        fulltext. It searches for _max_delta_chain parents. When it finds a
        fulltext parent, it sees if the total size of the deltas leading up to
        it is large enough to indicate that we want a new full text anyway.

        Return True if we should create a new delta, False if we should use a
        full text.
        """
        delta_size = 0
        fulltext_size = None
        for count in xrange(self._max_delta_chain):
            # XXX: Collapse these two queries:
            try:
                # Note that this only looks in the index of this particular
                # KnitVersionedFiles, not in the fallbacks.  This ensures that
                # we won't store a delta spanning physical repository
                # boundaries.
                method = self._index.get_method(parent)
            except RevisionNotPresent:
                # Some basis is not locally present: always delta
                return False
            index, pos, size = self._index.get_position(parent)
            if method == 'fulltext':
                fulltext_size = size
                break
            delta_size += size
            # We don't explicitly check for presence because this is in an
            # inner loop, and if it's missing it'll fail anyhow.
            # TODO: This should be asking for compression parent, not graph
            # parent.
            parent = self._index.get_parent_map([parent])[parent][0]
        else:
            # We couldn't find a fulltext, so we must create a new one
            return False
        # Simple heuristic - if the total I/O would be greater as a delta than
        # the originally installed fulltext, we create a new fulltext.
        return fulltext_size > delta_size
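The chain-walk above can be sketched standalone. This is a minimal Python 3 sketch of the heuristic, assuming a toy `{version: (method, size, parent)}` mapping in place of the real knit index API; `should_delta`, `MAX_DELTA_CHAIN` and the `index` shape are illustrative names, not bzrlib API:

```python
# Sketch of the _check_should_delta heuristic: walk back through the
# compression parents, summing delta sizes, until a fulltext is found.
# `index` is a hypothetical {version: (method, size, parent)} mapping.

MAX_DELTA_CHAIN = 200

def should_delta(index, parent):
    delta_size = 0
    fulltext_size = None
    for _ in range(MAX_DELTA_CHAIN):
        try:
            method, size, next_parent = index[parent]
        except KeyError:
            # Some basis is not locally present: always delta.
            return False
        if method == 'fulltext':
            fulltext_size = size
            break
        delta_size += size
        parent = next_parent
    else:
        # No fulltext found within the chain limit: store a new fulltext.
        return False
    # Storing a delta only pays off while the accumulated chain is smaller
    # than the fulltext it would replace.
    return fulltext_size > delta_size

index = {
    'r1': ('fulltext', 100, None),
    'r2': ('line-delta', 30, 'r1'),
    'r3': ('line-delta', 40, 'r2'),
}
print(should_delta(index, 'r3'))   # chain 40+30=70 < 100 -> True
```

A chain whose cumulative delta size exceeds the basis fulltext (or a missing basis) makes `should_delta` return False, which is exactly the point where the knit installs a fresh fulltext.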

    def _build_details_to_components(self, build_details):
        """Convert a build_details tuple to a position tuple."""
        # record_details, access_memo, compression_parent
        return build_details[3], build_details[0], build_details[1]

    def _get_components_positions(self, keys, allow_missing=False):
        """Produce a map of position data for the components of keys.

        This data is intended to be used for retrieving the knit records.

        A dict of key to (record_details, index_memo, next, parents) is
        returned.
        method is the way referenced data should be applied.
        index_memo is the handle to pass to the data access to actually get
            the data
        next is the build-parent of the version, or None for fulltexts.
        parents is the version_ids of the parents of this version

        :param allow_missing: If True do not raise an error on a missing
            component, just ignore it.
        """
        component_data = {}
        pending_components = keys
        while pending_components:
            build_details = self._index.get_build_details(pending_components)
            current_components = set(pending_components)
            pending_components = set()
            for key, details in build_details.iteritems():
                (index_memo, compression_parent, parents,
                 record_details) = details
                method = record_details[0]
                if compression_parent is not None:
                    pending_components.add(compression_parent)
                component_data[key] = self._build_details_to_components(details)
            missing = current_components.difference(build_details)
            if missing and not allow_missing:
                raise errors.RevisionNotPresent(missing.pop(), self)
        return component_data
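The loop above computes the transitive closure over compression parents, one batched index query per generation. A minimal Python 3 sketch under the assumption of a toy `{key: (index_memo, compression_parent)}` index (`components_positions` and the `index` shape are illustrative, not bzrlib API):

```python
# Sketch of the _get_components_positions loop: starting from the requested
# keys, repeatedly batch-query a (hypothetical) index and queue each
# compression parent until the closure is complete.

def components_positions(index, keys, allow_missing=False):
    component_data = {}
    pending = set(keys)
    while pending:
        details = {k: index[k] for k in pending if k in index}
        missing = pending - set(details)
        if missing and not allow_missing:
            raise KeyError(missing.pop())
        pending = set()
        for key, (memo, parent) in details.items():
            if parent is not None and parent not in component_data:
                pending.add(parent)
            component_data[key] = (memo, parent)
    return component_data

index = {
    'r1': ('memo1', None),    # fulltext
    'r2': ('memo2', 'r1'),    # delta against r1
    'r3': ('memo3', 'r2'),    # delta against r2
}
# Asking for r3 alone pulls in its whole compression chain.
print(sorted(components_positions(index, ['r3'])))  # ['r1', 'r2', 'r3']
```

Batching per generation matters for the real index, where each `get_build_details` call may translate into I/O; the sketch keeps the same shape with plain dict lookups.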

    def _get_content(self, key, parent_texts={}):
        """Returns a content object that makes up the specified
        version."""
        cached_version = parent_texts.get(key, None)
        if cached_version is not None:
            # Ensure the cache dict is valid.
            if not self.get_parent_map([key]):
                raise RevisionNotPresent(key, self)
            return cached_version
        text_map, contents_map = self._get_content_maps([key])
        return contents_map[key]

    def _get_content_maps(self, keys, nonlocal_keys=None):
        """Produce maps of text and KnitContents

        :param keys: The keys to produce content maps for.
        :param nonlocal_keys: An iterable of keys (possibly intersecting keys)
            which are known to not be in this knit, but rather in one of the
            fallback knits.
        :return: (text_map, content_map) where text_map contains the texts for
            the requested versions and content_map contains the KnitContents.
        """
        # FUTURE: This function could be improved for the 'extract many' case
        # by tracking each component and only doing the copy when the number of
        # children that need to apply delta's to it is > 1 or it is part of the
        # final output.
        keys = list(keys)
        multiple_versions = len(keys) != 1
        record_map = self._get_record_map(keys, allow_missing=True)

        text_map = {}
        content_map = {}
        final_content = {}
        if nonlocal_keys is None:
            nonlocal_keys = set()
        else:
            nonlocal_keys = frozenset(nonlocal_keys)
        missing_keys = set(nonlocal_keys)
        for source in self._fallback_vfs:
            if not missing_keys:
                break
            for record in source.get_record_stream(missing_keys,
                'unordered', True):
                if record.storage_kind == 'absent':
                    continue
                missing_keys.remove(record.key)
                lines = split_lines(record.get_bytes_as('fulltext'))
                text_map[record.key] = lines
                content_map[record.key] = PlainKnitContent(lines, record.key)
                if record.key in keys:
                    final_content[record.key] = content_map[record.key]
        for key in keys:
            if key in nonlocal_keys:
                # already handled
                continue
            components = []
            cursor = key
            while cursor is not None:
                try:
                    record, record_details, digest, next = record_map[cursor]
                except KeyError:
                    raise RevisionNotPresent(cursor, self)
                components.append((cursor, record, record_details, digest))
                cursor = next
                if cursor in content_map:
                    # no need to plan further back
                    components.append((cursor, None, None, None))
                    break

            content = None
            for (component_id, record, record_details,
                 digest) in reversed(components):
                if component_id in content_map:
                    content = content_map[component_id]
                else:
                    content, delta = self._factory.parse_record(key[-1],
                        record, record_details, content,
                        copy_base_content=multiple_versions)
                    if multiple_versions:
                        content_map[component_id] = content

            final_content[key] = content

            # digest here is the digest from the last applied component.
            text = content.text()
            actual_sha = sha_strings(text)
            if actual_sha != digest:
                raise KnitCorrupt(self,
                    '\n sha-1 %s'
                    '\n of reconstructed text does not match'
                    '\n expected %s'
                    '\n for version %s' %
                    (actual_sha, digest, key))
            text_map[key] = text
        return text_map, final_content

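The per-key loop above walks from the requested version back to a fulltext, then replays the chain oldest-first. A minimal Python 3 sketch of that replay, assuming toy `(start, end, lines)` line-replacement deltas in place of the real knit line-delta format (`reconstruct` and the `record_map` shape are illustrative):

```python
# Sketch of the reconstruction in _get_content_maps: collect components from
# the requested key back to a fulltext, then apply them in reverse order.
# record_map maps key -> (content_or_delta, build_parent_or_None); deltas are
# simple (start, end, new_lines) line replacements.

def reconstruct(record_map, key):
    components = []
    cursor = key
    while cursor is not None:
        data, build_parent = record_map[cursor]
        components.append(data)
        cursor = build_parent
    lines = None
    for data in reversed(components):
        if lines is None:
            lines = list(data)            # the fulltext basis (copied)
        else:
            start, end, new_lines = data  # apply one delta
            lines[start:end] = new_lines
    return lines

record_map = {
    'r1': (['a\n', 'b\n', 'c\n'], None),  # fulltext
    'r2': ((1, 2, ['B\n']), 'r1'),        # replace line 1
    'r3': ((2, 3, []), 'r2'),             # delete line 2
}
print(reconstruct(record_map, 'r3'))  # ['a\n', 'B\n']
```

The real code additionally caches intermediate contents in `content_map` when extracting many versions, and verifies the final text against the stored SHA-1 digest, as the `KnitCorrupt` check above shows.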
    def get_parent_map(self, keys):
        """Get a map of the graph parents of keys.

        :param keys: The keys to look up parents for.
        :return: A mapping from keys to parents. Absent keys are absent from
            the mapping.
        """
        return self._get_parent_map_with_sources(keys)[0]

    def _get_parent_map_with_sources(self, keys):
        """Get a map of the parents of keys.

        :param keys: The keys to look up parents for.
        :return: A tuple. The first element is a mapping from keys to parents.
            Absent keys are absent from the mapping. The second element is a
            list with the locations each key was found in. The first element
            is the in-this-knit parents, the second the first fallback source,
            and so on.
        """
        result = {}
        sources = [self._index] + self._fallback_vfs
        source_results = []
        missing = set(keys)
        for source in sources:
            if not missing:
                break
            new_result = source.get_parent_map(missing)
            source_results.append(new_result)
            result.update(new_result)
            missing.difference_update(set(new_result))
        return result, source_results
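The stacked lookup above queries each source in order, forwarding only the still-missing keys and recording which source answered what. A minimal Python 3 sketch, with plain dicts standing in for the local index and fallback versionedfiles (`parent_map_with_sources` is an illustrative name):

```python
# Sketch of _get_parent_map_with_sources: earlier sources win, and each
# source only sees the keys no earlier source could answer.

def parent_map_with_sources(sources, keys):
    result = {}
    source_results = []
    missing = set(keys)
    for source in sources:
        if not missing:
            break
        found = {k: source[k] for k in missing if k in source}
        source_results.append(found)
        result.update(found)
        missing.difference_update(found)
    return result, source_results

local = {'r2': ('r1',)}
fallback = {'r1': (), 'r2': ('SHOULD-NOT-BE-USED',)}
combined, per_source = parent_map_with_sources([local, fallback], ['r1', 'r2'])
print(combined)    # {'r2': ('r1',), 'r1': ()}
```

Because `r2` is answered locally, the fallback's stale entry for it is never consulted; the per-source list is what lets `get_record_stream` later group keys by the source that holds them.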

    def _get_record_map(self, keys, allow_missing=False):
        """Produce a dictionary of knit records.

        :return: {key:(record, record_details, digest, next)}
            record
                data returned from read_records
            record_details
                opaque information to pass to parse_record
            digest
                SHA1 digest of the full text after all steps are done
            next
                build-parent of the version, i.e. the leftmost ancestor.
                Will be None if the record is not a delta.
        :param keys: The keys to build a map for
        :param allow_missing: If some records are missing, rather than
            error, just return the data that could be generated.
        """
        position_map = self._get_components_positions(keys,
            allow_missing=allow_missing)
        # key = component_id, r = record_details, i_m = index_memo, n = next
        records = [(key, i_m) for key, (r, i_m, n)
                             in position_map.iteritems()]
        record_map = {}
        for key, record, digest in \
                self._read_records_iter(records):
            (record_details, index_memo, next) = position_map[key]
            record_map[key] = record, record_details, digest, next
        return record_map

    def get_record_stream(self, keys, ordering, include_delta_closure):
        """Get a stream of records for keys.

        :param keys: The keys to include.
        :param ordering: Either 'unordered' or 'topological'. A topologically
            sorted stream has compression parents strictly before their
            children.
        :param include_delta_closure: If True then the closure across any
            compression parents will be included (in the opaque data).
        :return: An iterator of ContentFactory objects, each of which is only
            valid until the iterator is advanced.
        """
        # keys might be a generator
        keys = set(keys)
        if not keys:
            return
        if not self._index.has_graph:
            # Cannot topologically order when no graph has been stored.
            ordering = 'unordered'
        if include_delta_closure:
            positions = self._get_components_positions(keys, allow_missing=True)
        else:
            build_details = self._index.get_build_details(keys)
            # map from key to
            # (record_details, access_memo, compression_parent_key)
            positions = dict((key, self._build_details_to_components(details))
                for key, details in build_details.iteritems())
        absent_keys = keys.difference(set(positions))
        # There may be more absent keys: if we're missing the basis component
        # and are trying to include the delta closure.
        if include_delta_closure:
            needed_from_fallback = set()
            # Build up reconstructable_keys dict.  key:True in this dict means
            # the key can be reconstructed.
            reconstructable_keys = {}
            for key in keys:
                # the delta chain
                try:
                    chain = [key, positions[key][2]]
                except KeyError:
                    needed_from_fallback.add(key)
                    continue
                result = True
                while chain[-1] is not None:
                    if chain[-1] in reconstructable_keys:
                        result = reconstructable_keys[chain[-1]]
                        break
                    else:
                        try:
                            chain.append(positions[chain[-1]][2])
                        except KeyError:
                            # missing basis component
                            needed_from_fallback.add(chain[-1])
                            result = True
                            break
                for chain_key in chain[:-1]:
                    reconstructable_keys[chain_key] = result
                if not result:
                    needed_from_fallback.add(key)
        # Double index lookups here: need a unified api?
        global_map, parent_maps = self._get_parent_map_with_sources(keys)
        if ordering == 'topological':
            # Global topological sort
            present_keys = tsort.topo_sort(global_map)
            # Now group by source:
            source_keys = []
            current_source = None
            for key in present_keys:
                for parent_map in parent_maps:
                    if key in parent_map:
                        key_source = parent_map
                        break
                if current_source is not key_source:
                    source_keys.append((key_source, []))
                    current_source = key_source
                source_keys[-1][1].append(key)
        else:
            if ordering != 'unordered':
                raise AssertionError('valid values for ordering are:'
                    ' "unordered" or "topological" not: %r'
                    % (ordering,))
            # Just group by source; remote sources first.
            present_keys = []
            source_keys = []
            for parent_map in reversed(parent_maps):
                source_keys.append((parent_map, []))
                for key in parent_map:
                    present_keys.append(key)
                    source_keys[-1][1].append(key)
        absent_keys = keys - set(global_map)
        for key in absent_keys:
            yield AbsentContentFactory(key)
        # restrict our view to the keys we can answer.
        # XXX: Memory: TODO: batch data here to cap buffered data at (say) 1MB.
        # XXX: At that point we need to consider the impact of double reads by
        # utilising components multiple times.
        if include_delta_closure:
            # XXX: get_content_maps performs its own index queries; allow state
            # to be passed in.
            text_map, _ = self._get_content_maps(present_keys,
                needed_from_fallback - absent_keys)
            for key in present_keys:
                yield FulltextContentFactory(key, global_map[key], None,
                    ''.join(text_map[key]))
        else:
            for source, keys in source_keys:
                if source is parent_maps[0]:
                    # this KnitVersionedFiles
                    records = [(key, positions[key][1]) for key in keys]
                    for key, raw_data, sha1 in self._read_records_iter_raw(records):
                        (record_details, index_memo, _) = positions[key]
                        yield KnitContentFactory(key, global_map[key],
                            record_details, sha1, raw_data, self._factory.annotated, None)
                else:
                    vf = self._fallback_vfs[parent_maps.index(source) - 1]
                    for record in vf.get_record_stream(keys, ordering,
                        include_delta_closure):
                        yield record

1246  | 
def get_sha1s(self, keys):  | 
|
1247  | 
"""See VersionedFiles.get_sha1s()."""  | 
|
| 
3350.8.3
by Robert Collins
 VF.get_sha1s needed changing to be stackable.  | 
1248  | 
missing = set(keys)  | 
1249  | 
record_map = self._get_record_map(missing, allow_missing=True)  | 
|
1250  | 
result = {}  | 
|
1251  | 
for key, details in record_map.iteritems():  | 
|
1252  | 
if key not in missing:  | 
|
1253  | 
                continue
 | 
|
1254  | 
            # record entry 2 is the 'digest'.
 | 
|
1255  | 
result[key] = details[2]  | 
|
1256  | 
missing.difference_update(set(result))  | 
|
1257  | 
for source in self._fallback_vfs:  | 
|
1258  | 
if not missing:  | 
|
1259  | 
                break
 | 
|
1260  | 
new_result = source.get_sha1s(missing)  | 
|
1261  | 
result.update(new_result)  | 
|
1262  | 
missing.difference_update(set(new_result))  | 
|
1263  | 
return result  | 
|

    def insert_record_stream(self, stream):
        """Insert a record stream into this container.

        :param stream: A stream of records to insert.
        :return: None
        :seealso VersionedFiles.get_record_stream:
        """
        def get_adapter(adapter_key):
            try:
                return adapters[adapter_key]
            except KeyError:
                adapter_factory = adapter_registry.get(adapter_key)
                adapter = adapter_factory(self)
                adapters[adapter_key] = adapter
                return adapter
        if self._factory.annotated:
            # self is annotated, we need annotated knits to use directly.
            annotated = "annotated-"
            convertibles = []
        else:
            # self is not annotated, but we can strip annotations cheaply.
            annotated = ""
            convertibles = set(["knit-annotated-ft-gz"])
            if self._max_delta_chain:
                convertibles.add("knit-annotated-delta-gz")
        # The set of types we can cheaply adapt without needing basis texts.
        native_types = set()
        if self._max_delta_chain:
            native_types.add("knit-%sdelta-gz" % annotated)
        native_types.add("knit-%sft-gz" % annotated)
        knit_types = native_types.union(convertibles)
        adapters = {}
        # Buffer all index entries that we can't add immediately because their
        # basis parent is missing. We don't buffer all because generating
        # annotations may require access to some of the new records. However we
        # can't generate annotations from new deltas until their basis parent
        # is present anyway, so we get away with not needing an index that
        # includes the new keys.
        # key = basis_parent, value = index entry to add
        buffered_index_entries = {}
        for record in stream:
            parents = record.parents
            # Raise an error when a record is missing.
            if record.storage_kind == 'absent':
                raise RevisionNotPresent([record.key], self)
            if record.storage_kind in knit_types:
                if record.storage_kind not in native_types:
                    try:
                        adapter_key = (record.storage_kind, "knit-delta-gz")
                        adapter = get_adapter(adapter_key)
                    except KeyError:
                        adapter_key = (record.storage_kind, "knit-ft-gz")
                        adapter = get_adapter(adapter_key)
                    bytes = adapter.get_bytes(
                        record, record.get_bytes_as(record.storage_kind))
                else:
                    bytes = record.get_bytes_as(record.storage_kind)
                options = [record._build_details[0]]
                if record._build_details[1]:
                    options.append('no-eol')
                # Just blat it across.
                # Note: This does end up adding data on duplicate keys. As
                # modern repositories use atomic insertions this should not
                # lead to excessive growth in the event of interrupted fetches.
                # 'knit' repositories may suffer excessive growth, but as a
                # deprecated format this is tolerable. It can be fixed if
                # needed by making the kndx index support raise on a duplicate
                # add with identical parents and options.
                access_memo = self._access.add_raw_records(
                    [(record.key, len(bytes))], bytes)[0]
                index_entry = (record.key, options, access_memo, parents)
                buffered = False
                if 'fulltext' not in options:
                    basis_parent = parents[0]
                    # Note that pack backed knits don't need to buffer here
                    # because they buffer all writes to the transaction level,
                    # but we don't expose that difference at the index level. If
                    # the query here has sufficient cost to show up in
                    # profiling we should do that.
                    if basis_parent not in self.get_parent_map([basis_parent]):
                        pending = buffered_index_entries.setdefault(
                            basis_parent, [])
                        pending.append(index_entry)
                        buffered = True
                if not buffered:
                    self._index.add_records([index_entry])
            elif record.storage_kind == 'fulltext':
                self.add_lines(record.key, parents,
                    split_lines(record.get_bytes_as('fulltext')))
            else:
                adapter_key = record.storage_kind, 'fulltext'
                adapter = get_adapter(adapter_key)
                lines = split_lines(adapter.get_bytes(
                    record, record.get_bytes_as(record.storage_kind)))
                try:
                    self.add_lines(record.key, parents, lines)
                except errors.RevisionAlreadyPresent:
                    pass
            # Add any records whose basis parent is now available.
            added_keys = [record.key]
            while added_keys:
                key = added_keys.pop(0)
                if key in buffered_index_entries:
                    index_entries = buffered_index_entries[key]
                    self._index.add_records(index_entries)
                    added_keys.extend(
                        [index_entry[0] for index_entry in index_entries])
                    del buffered_index_entries[key]
        # If there were any deltas which had a missing basis parent, error.
        if buffered_index_entries:
            raise errors.RevisionNotPresent(buffered_index_entries.keys()[0],
                self)
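The buffering scheme in insert_record_stream — park delta index entries under their missing basis parent, then release them transitively once that parent arrives — can be sketched in isolation. A Python 3 sketch (`buffered` maps a basis-parent key to a list of `(key, entry)` tuples, mirroring buffered_index_entries; the function name is illustrative only):

```python
def flush_buffered(added_keys, buffered):
    # Release every buffered entry whose basis parent has now been added.
    # A released entry's own key may in turn release further entries, so
    # process released keys as a work queue rather than a single pass.
    flushed = []
    queue = list(added_keys)
    while queue:
        key = queue.pop(0)
        if key in buffered:
            entries = buffered.pop(key)
            flushed.extend(entries)
            queue.extend(entry[0] for entry in entries)
    return flushed
```

Anything still left in `buffered` after the stream ends corresponds to deltas whose basis never arrived, which is exactly the condition the method turns into RevisionNotPresent.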

    def iter_lines_added_or_present_in_keys(self, keys, pb=None):
        """Iterate over the lines in the versioned files from keys.

        This may return lines from other keys. Each item the returned
        iterator yields is a tuple of a line and a text version that that line
        is present in (not introduced in).

        Ordering of results is in whatever order is most suitable for the
        underlying storage format.

        If a progress bar is supplied, it may be used to indicate progress.
        The caller is responsible for cleaning up progress bars (because this
        is an iterator).

        NOTES:
         * Lines are normalised by the underlying store: they will all have \n
           terminators.
         * Lines are returned in arbitrary order.

        :return: An iterator over (line, key).
        """
        if pb is None:
            pb = progress.DummyProgress()
        keys = set(keys)
        total = len(keys)
        # we don't care about inclusions, the caller cares.
        # but we need to setup a list of records to visit.
        # we need key, position, length
        key_records = []
        build_details = self._index.get_build_details(keys)
        for key, details in build_details.iteritems():
            if key in keys:
                key_records.append((key, details[0]))
                keys.remove(key)
        records_iter = enumerate(self._read_records_iter(key_records))
        for (key_idx, (key, data, sha_value)) in records_iter:
            pb.update('Walking content.', key_idx, total)
            compression_parent = build_details[key][1]
            if compression_parent is None:
                # fulltext
                line_iterator = self._factory.get_fulltext_content(data)
            else:
                # Delta
                line_iterator = self._factory.get_linedelta_content(data)
            # XXX: It might be more efficient to yield (key,
            # line_iterator) in the future. However for now, this is a simpler
            # change to integrate into the rest of the codebase. RBC 20071110
            for line in line_iterator:
                yield line, key
        for source in self._fallback_vfs:
            if not keys:
                break
            source_keys = set()
            for line, key in source.iter_lines_added_or_present_in_keys(keys):
                source_keys.add(key)
                yield line, key
            keys.difference_update(source_keys)
        if keys:
            raise RevisionNotPresent(keys, self.filename)
        pb.update('Walking content.', total, total)

    def _make_line_delta(self, delta_seq, new_content):
        """Generate a line delta from delta_seq and new_content."""
        diff_hunks = []
        for op in delta_seq.get_opcodes():
            if op[0] == 'equal':
                continue
            diff_hunks.append((op[1], op[2], op[4]-op[3],
                new_content._lines[op[3]:op[4]]))
        return diff_hunks
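_make_line_delta keeps only the non-equal opcodes of a sequence match, recording `(start, end, replacement-line-count, replacement-lines)` hunks. The same shape can be reproduced with the stdlib matcher — patiencediff exposes the same `get_opcodes()` interface — using plain line lists in place of the content objects (a Python 3 sketch; the function name is illustrative):

```python
import difflib

def make_line_delta(old_lines, new_lines):
    # Keep only the opcodes that change something; each hunk says
    # "replace old_lines[i1:i2] with these j2 - j1 new lines".
    seq = difflib.SequenceMatcher(None, old_lines, new_lines)
    hunks = []
    for tag, i1, i2, j1, j2 in seq.get_opcodes():
        if tag == 'equal':
            continue
        hunks.append((i1, i2, j2 - j1, new_lines[j1:j2]))
    return hunks
```

Replacing one middle line of three yields a single hunk covering just that line, which is what keeps line deltas compact for small edits.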

    def _merge_annotations(self, content, parents, parent_texts={},
                           delta=None, annotated=None,
                           left_matching_blocks=None):
        """Merge annotations for content and generate deltas.

        This is done by comparing the annotations based on changes to the text
        and generating a delta on the resulting full texts. If annotations are
        not being created then a simple delta is created.
        """
| 
2520.4.146
by Aaron Bentley
 Avoid get_matching_blocks for un-annotated text  | 
1457  | 
if left_matching_blocks is not None:  | 
1458  | 
delta_seq = diff._PrematchedMatcher(left_matching_blocks)  | 
|
1459  | 
else:  | 
|
1460  | 
delta_seq = None  | 
|
| 
1596.2.34
by Robert Collins
 Optimise knit add to only diff once per parent, not once per parent + once for the delta generation.  | 
1461  | 
if annotated:  | 
| 
3350.6.4
by Robert Collins
 First cut at pluralised VersionedFiles. Some rather massive API incompatabilities, primarily because of the difficulty of coherence among competing stores.  | 
1462  | 
for parent_key in parents:  | 
1463  | 
merge_content = self._get_content(parent_key, parent_texts)  | 
|
1464  | 
if (parent_key == parents[0] and delta_seq is not None):  | 
|
| 
2520.4.146
by Aaron Bentley
 Avoid get_matching_blocks for un-annotated text  | 
1465  | 
seq = delta_seq  | 
| 
2520.4.140
by Aaron Bentley
 Use matching blocks from mpdiff for knit delta creation  | 
1466  | 
else:  | 
1467  | 
seq = patiencediff.PatienceSequenceMatcher(  | 
|
1468  | 
None, merge_content.text(), content.text())  | 
|
| 
1596.2.34
by Robert Collins
 Optimise knit add to only diff once per parent, not once per parent + once for the delta generation.  | 
1469  | 
for i, j, n in seq.get_matching_blocks():  | 
1470  | 
if n == 0:  | 
|
1471  | 
                        continue
 | 
|
| 
3460.2.1
by Robert Collins
 * Inserting a bundle which changes the contents of a file with no trailing  | 
1472  | 
                    # this copies (origin, text) pairs across to the new
 | 
1473  | 
                    # content for any line that matches the last-checked
 | 
|
| 
2520.4.146
by Aaron Bentley
 Avoid get_matching_blocks for un-annotated text  | 
1474  | 
                    # parent.
 | 
| 
1596.2.34
by Robert Collins
 Optimise knit add to only diff once per parent, not once per parent + once for the delta generation.  | 
1475  | 
content._lines[j:j+n] = merge_content._lines[i:i+n]  | 
            # XXX: Robert says the following block is a workaround for a
            # now-fixed bug and it can probably be deleted. -- mbp 20080618
            if content._lines and content._lines[-1][1][-1] != '\n':
                # The copied annotation was from a line without a trailing EOL,
                # reinstate one for the content object, to ensure correct
                # serialization.
                line = content._lines[-1][1] + '\n'
                content._lines[-1] = (content._lines[-1][0], line)
        if delta:
            if delta_seq is None:
                reference_content = self._get_content(parents[0], parent_texts)
                new_texts = content.text()
                old_texts = reference_content.text()
                delta_seq = patiencediff.PatienceSequenceMatcher(
                    None, old_texts, new_texts)
            return self._make_line_delta(delta_seq, content)

    def _parse_record(self, version_id, data):
        """Parse an original format knit record.

        These have the last element of the key only present in the stored data.
        """
        rec, record_contents = self._parse_record_unchecked(data)
        self._check_header_version(rec, version_id)
        return record_contents, rec[3]

    def _parse_record_header(self, key, raw_data):
        """Parse a record header for consistency.

        :return: the header and the decompressor stream,
                 as (stream, header_record)
        """
        df = tuned_gzip.GzipFile(mode='rb', fileobj=StringIO(raw_data))
        try:
            # Current serialise
            rec = self._check_header(key, df.readline())
        except Exception, e:
            raise KnitCorrupt(self,
                              "While reading {%s} got %s(%s)"
                              % (key, e.__class__.__name__, str(e)))
        return df, rec

    def _parse_record_unchecked(self, data):
        # profiling notes:
        # 4168 calls in 2880 217 internal
        # 4168 calls to _parse_record_header in 2121
        # 4168 calls to readlines in 330
        df = tuned_gzip.GzipFile(mode='rb', fileobj=StringIO(data))
        try:
            record_contents = df.readlines()
        except Exception, e:
            raise KnitCorrupt(self, "Corrupt compressed record %r, got %s(%s)" %
                (data, e.__class__.__name__, str(e)))
        header = record_contents.pop(0)
        rec = self._split_header(header)
        last_line = record_contents.pop()
        if len(record_contents) != int(rec[2]):
            raise KnitCorrupt(self,
                'incorrect number of lines %s != %s'
                ' for version {%s} %s'
                % (len(record_contents), int(rec[2]),
                   rec[1], record_contents))
        if last_line != 'end %s\n' % rec[1]:
            raise KnitCorrupt(self,
                'unexpected version end line %r, wanted %r'
                % (last_line, rec[1]))
        df.close()
        return rec, record_contents

    def _read_records_iter(self, records):
        """Read text records from data file and yield result.

        The result will be returned in whatever is the fastest to read.
        Not by the order requested. Also, multiple requests for the same
        record will only yield 1 response.

        :param records: A list of (key, access_memo) entries
        :return: Yields (key, contents, digest) in the order
                 read, not the order requested
        """
        if not records:
            return

        # XXX: This smells wrong, IO may not be getting ordered right.
        needed_records = sorted(set(records), key=operator.itemgetter(1))
        if not needed_records:
            return

        # The transport optimizes the fetching as well
        # (ie, reads continuous ranges.)
        raw_data = self._access.get_raw_records(
            [index_memo for key, index_memo in needed_records])

        for (key, index_memo), data in \
                izip(iter(needed_records), raw_data):
            content, digest = self._parse_record(key[-1], data)
            yield key, content, digest

    def _read_records_iter_raw(self, records):
        """Read text records from data file and yield raw data.

        This unpacks enough of the text record to validate the id is
        as expected but that's all.

        Each item the iterator yields is (key, bytes, sha1_of_full_text).
        """
        # setup an iterator of the external records:
        # uses readv so nice and fast we hope.
        if len(records):
            # grab the disk data needed.
            needed_offsets = [index_memo for key, index_memo
                                in records]
            raw_records = self._access.get_raw_records(needed_offsets)

        for key, index_memo in records:
            data = raw_records.next()
            # validate the header (note that we can only use the suffix in
            # current knit records).
            df, rec = self._parse_record_header(key, data)
            df.close()
            yield key, data, rec[3]

    def _record_to_data(self, key, digest, lines, dense_lines=None):
        """Convert key, digest, lines into a raw data block.

        :param key: The key of the record. Currently keys are always serialised
            using just the trailing component.
        :param dense_lines: The bytes of lines but in a denser form. For
            instance, if lines is a list of 1000 bytestrings each ending in \n,
            dense_lines may be a list with one line in it, containing all the
            1000's lines and their \n's. Using dense_lines if it is already
            known is a win because the string join to create bytes in this
            function spends less time resizing the final string.
        :return: (len, a string containing the gzipped raw data ready to read.)
        """
        # Note: using a string copy here increases memory pressure with e.g.
        # ISO's, but it is about 3 seconds faster on a 1.2Ghz intel machine
        # when doing the initial commit of a mozilla tree. RBC 20070921
        bytes = ''.join(chain(
            ["version %s %d %s\n" % (key[-1],
                                     len(lines),
                                     digest)],
            dense_lines or lines,
            ["end %s\n" % key[-1]]))
        if type(bytes) != str:
            raise AssertionError(
                'data must be plain bytes was %s' % type(bytes))
        if lines and lines[-1][-1] != '\n':
            raise ValueError('corrupt lines value %r' % lines)
        compressed_bytes = tuned_gzip.bytes_to_gzip(bytes)
        return len(compressed_bytes), compressed_bytes

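The record layout produced by _record_to_data and consumed by _parse_record_unchecked — a gzipped block containing a `version <id> <line-count> <digest>` header, the payload lines, and an `end <id>` trailer — can be exercised in isolation. A sketch in modern Python 3 for illustration (the real code uses tuned_gzip and Python 2 byte strings; these standalone function names are hypothetical):

```python
import gzip
import hashlib
import io

def record_to_data(version_id, lines):
    # Serialise: header line, payload lines, end marker, then gzip the lot.
    digest = hashlib.sha1(b"".join(lines)).hexdigest().encode("ascii")
    header = b"version %s %d %s\n" % (version_id, len(lines), digest)
    footer = b"end %s\n" % version_id
    return gzip.compress(header + b"".join(lines) + footer)

def parse_record(version_id, data):
    # Inverse: decompress, peel off header and end marker, check line count.
    contents = io.BytesIO(gzip.decompress(data)).readlines()
    rec = contents.pop(0).split()
    if len(rec) != 4 or rec[0] != b"version":
        raise ValueError("unexpected number of elements in record header")
    last_line = contents.pop()
    if len(contents) != int(rec[2]):
        raise ValueError("incorrect number of lines")
    if last_line != b"end %s\n" % version_id:
        raise ValueError("unexpected version end line %r" % last_line)
    return contents, rec[3]
```

Because the line count and digest are stored in the header and the end marker repeats the id, a truncated or mis-addressed record fails validation before its contents are trusted.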
    def _split_header(self, line):
        rec = line.split()
        if len(rec) != 4:
            raise KnitCorrupt(self,
                              'unexpected number of elements in record header')
        return rec

    def keys(self):
        """See VersionedFiles.keys."""
        if 'evil' in debug.debug_flags:
            trace.mutter_callsite(2, "keys scales with size of history")
        sources = [self._index] + self._fallback_vfs
        result = set()
        for source in sources:
            result.update(source.keys())
        return result


class _KndxIndex(object):
    """Manages knit index files

    The index is kept in memory and read on startup, to enable
    fast lookups of revision information.  The cursor of the index
    file is always pointing to the end, making it easy to append
    entries.

    _cache is a cache for fast mapping from version id to a Index
    object.

    _history is a cache for fast mapping from indexes to version ids.

    The index data format is dictionary compressed when it comes to
    parent references; an index entry may only have parents with a
    lower index number.  As a result, the index is topologically sorted.

    Duplicate entries may be written to the index for a single version id;
    if this is done then the latter one completely replaces the former:
    this allows updates to correct version and parent information.
    Note that the two entries may share the delta, and that successive
    annotations and references MUST point to the first entry.

    The index file on disc contains a header, followed by one line per knit
    record. The same revision can be present in an index file more than once.
    The first occurrence gets assigned a sequence number starting from 0.

    The format of a single line is
    REVISION_ID FLAGS BYTE_OFFSET LENGTH( PARENT_ID|PARENT_SEQUENCE_ID)* :\n
    REVISION_ID is a utf8-encoded revision id
    FLAGS is a comma separated list of flags about the record. Values include
        no-eol, line-delta, fulltext.
    BYTE_OFFSET is the ascii representation of the byte offset in the data file
        that the compressed data starts at.
    LENGTH is the ascii representation of the length of the data file.
    PARENT_ID a utf-8 revision id prefixed by a '.' that is a parent of
 | 
|
1682  | 
        REVISION_ID.
 | 
|
1683  | 
    PARENT_SEQUENCE_ID the ascii representation of the sequence number of a
 | 
|
1684  | 
        revision id already in the knit that is a parent of REVISION_ID.
 | 
|
1685  | 
    The ' :' marker is the end of record marker.
 | 
|
1686  | 
    
 | 
|
1687  | 
    partial writes:
 | 
|
| 
2158.3.1
by Dmitry Vasiliev
 KnitIndex tests/fixes/optimizations  | 
1688  | 
    when a write is interrupted to the index file, it will result in a line
 | 
1689  | 
    that does not end in ' :'. If the ' :' is not present at the end of a line,
 | 
|
1690  | 
    or at the end of the file, then the record that is missing it will be
 | 
|
1691  | 
    ignored by the parser.
 | 
|
| 
1641.1.2
by Robert Collins
 Change knit index files to be robust in the presence of partial writes.  | 
1692  | 
|
| 
1759.2.1
by Jelmer Vernooij
 Fix some types (found using aspell).  | 
1693  | 
    When writing new records to the index file, the data is preceded by '\n'
 | 
| 
1641.1.2
by Robert Collins
 Change knit index files to be robust in the presence of partial writes.  | 
1694  | 
    to ensure that records always start on new lines even if the last write was
 | 
1695  | 
    interrupted. As a result its normal for the last line in the index to be
 | 
|
1696  | 
    missing a trailing newline. One can be added with no harmful effects.
 | 
|
| 
3350.6.11
by Martin Pool
 Review cleanups and documentation from Robert's mail on 2080618  | 
1697  | 
|
1698  | 
    :ivar _kndx_cache: dict from prefix to the old state of KnitIndex objects,
 | 
|
1699  | 
        where prefix is e.g. the (fileid,) for .texts instances or () for
 | 
|
1700  | 
        constant-mapped things like .revisions, and the old state is
 | 
|
1701  | 
        tuple(history_vector, cache_dict).  This is used to prevent having an
 | 
|
1702  | 
        ABI change with the C extension that reads .kndx files.
 | 
|
| 
1563.2.4
by Robert Collins
 First cut at including the knit implementation of versioned_file.  | 
1703  | 
    """
 | 
1704  | 
||
| 
1666.1.6
by Robert Collins
 Make knit the default format.  | 
1705  | 
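The record line format described in the docstring above can be exercised with a small standalone sketch (plain Python, independent of bzrlib; the revision ids and the helper name are made up). Lines that were interrupted mid-write lack the trailing ' :' and are skipped, matching the partial-write behaviour documented above:

```python
def parse_kndx_line(line):
    """Parse one kndx record line into (revision_id, flags, offset, length, parents)."""
    line = line.rstrip('\n')
    if not line.endswith(' :'):
        return None  # interrupted partial write; the parser ignores it
    fields = line[:-2].split()
    revision_id, flags, offset, length = fields[:4]
    parents = []
    for ref in fields[4:]:
        if ref.startswith('.'):
            parents.append(ref[1:])   # literal parent revision id
        else:
            parents.append(int(ref))  # sequence number of an earlier record
    return revision_id, flags.split(','), int(offset), int(length), parents

rec = parse_kndx_line('rev-2 fulltext,no-eol 83 137 .rev-0 1 :\n')
assert rec == ('rev-2', ['fulltext', 'no-eol'], 83, 137, ['rev-0', 1])
assert parse_kndx_line('rev-3 line-delta 220 40 0') is None  # interrupted write
```

The sequence-number references (like the `1` above) must be resolved against earlier records in the same file, which is why the index is necessarily topologically sorted.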

    HEADER = "# bzr knit index 8\n"

    def __init__(self, transport, mapper, get_scope, allow_writes, is_locked):
        """Create a _KndxIndex on transport using mapper."""
        self._transport = transport
        self._mapper = mapper
        self._get_scope = get_scope
        self._allow_writes = allow_writes
        self._is_locked = is_locked
        self._reset_cache()
        self.has_graph = True

    def add_records(self, records, random_id=False):
        """Add multiple records to the index.

        :param records: a list of tuples:
                         (key, options, access_memo, parents).
        :param random_id: If True the ids being added were randomly generated
            and no check for existence will be performed.
        """
        paths = {}
        for record in records:
            key = record[0]
            prefix = key[:-1]
            path = self._mapper.map(key) + '.kndx'
            path_keys = paths.setdefault(path, (prefix, []))
            path_keys[1].append(record)
        for path in sorted(paths):
            prefix, path_keys = paths[path]
            self._load_prefixes([prefix])
            lines = []
            orig_history = self._kndx_cache[prefix][1][:]
            orig_cache = self._kndx_cache[prefix][0].copy()

            try:
                for key, options, (_, pos, size), parents in path_keys:
                    if parents is None:
                        # kndx indices cannot be parentless.
                        parents = ()
                    line = "\n%s %s %s %s %s :" % (
                        key[-1], ','.join(options), pos, size,
                        self._dictionary_compress(parents))
                    if type(line) != str:
                        raise AssertionError(
                            'data must be utf8 was %s' % type(line))
                    lines.append(line)
                    self._cache_key(key, options, pos, size, parents)
                if len(orig_history):
                    self._transport.append_bytes(path, ''.join(lines))
                else:
                    self._init_index(path, lines)
            except:
                # If any problems happen, restore the original values and
                # re-raise.
                self._kndx_cache[prefix] = (orig_cache, orig_history)
                raise

    def _cache_key(self, key, options, pos, size, parent_keys):
        """Cache a version record in the history array and index cache.

        This is inlined into _load_data for performance. KEEP IN SYNC.
        (It saves 60ms, 25% of the __init__ overhead on local 4000 record
         indexes).
        """
        prefix = key[:-1]
        version_id = key[-1]
        # last-element only for compatibility with the C load_data.
        parents = tuple(parent[-1] for parent in parent_keys)
        for parent in parent_keys:
            if parent[:-1] != prefix:
                raise ValueError("mismatched prefixes for %r, %r" % (
                    key, parent_keys))
        cache, history = self._kndx_cache[prefix]
        # only want the _history index to reference the 1st index entry
        # for version_id
        if version_id not in cache:
            index = len(history)
            history.append(version_id)
        else:
            index = cache[version_id][5]
        cache[version_id] = (version_id,
                             options,
                             pos,
                             size,
                             parents,
                             index)
    def check_header(self, fp):
        line = fp.readline()
        if line == '':
            # An empty file can actually be treated as though the file doesn't
            # exist yet.
            raise errors.NoSuchFile(self)
        if line != self.HEADER:
            raise KnitHeaderError(badline=line, filename=self)

    def _check_read(self):
        if not self._is_locked():
            raise errors.ObjectNotLocked(self)
        if self._get_scope() != self._scope:
            self._reset_cache()

    def _check_write_ok(self):
        """Raise an error if writes are not permitted."""
        if not self._is_locked():
            raise errors.ObjectNotLocked(self)
        if self._get_scope() != self._scope:
            self._reset_cache()
        if self._mode != 'w':
            raise errors.ReadOnlyObjectDirtiedError(self)
    def get_build_details(self, keys):
        """Get the method, index_memo and compression parent for keys.

        Ghosts are omitted from the result.

        :param keys: An iterable of keys.
        :return: A dict of key:(index_memo, compression_parent, parents,
            record_details).
            index_memo
                opaque structure to pass to read_records to extract the raw
                data
            compression_parent
                Content that this record is built upon, may be None
            parents
                Logical parents of this node
            record_details
                extra information about the content which needs to be passed to
                Factory.parse_record
        """
        prefixes = self._partition_keys(keys)
        parent_map = self.get_parent_map(keys)
        result = {}
        for key in keys:
            if key not in parent_map:
                continue # Ghost
            method = self.get_method(key)
            parents = parent_map[key]
            if method == 'fulltext':
                compression_parent = None
            else:
                compression_parent = parents[0]
            noeol = 'no-eol' in self.get_options(key)
            index_memo = self.get_position(key)
            result[key] = (index_memo, compression_parent,
                parents, (method, noeol))
        return result
    def get_method(self, key):
        """Return the compression method of the specified key."""
        options = self.get_options(key)
        if 'fulltext' in options:
            return 'fulltext'
        elif 'line-delta' in options:
            return 'line-delta'
        else:
            raise errors.KnitIndexUnknownMethod(self, options)

    def get_options(self, key):
        """Return a list representing options.

        e.g. ['foo', 'bar']
        """
        prefix, suffix = self._split_key(key)
        self._load_prefixes([prefix])
        try:
            return self._kndx_cache[prefix][0][suffix][1]
        except KeyError:
            raise RevisionNotPresent(key, self)

    def get_parent_map(self, keys):
        """Get a map of the parents of keys.

        :param keys: The keys to look up parents for.
        :return: A mapping from keys to parents. Absent keys are absent from
            the mapping.
        """
        # Parse what we need to up front, this potentially trades off I/O
        # locality (.kndx and .knit in the same block group for the same file
        # id) for less checking in inner loops.
        prefixes = set(key[:-1] for key in keys)
        self._load_prefixes(prefixes)
        result = {}
        for key in keys:
            prefix = key[:-1]
            try:
                suffix_parents = self._kndx_cache[prefix][0][key[-1]][4]
            except KeyError:
                pass
            else:
                result[key] = tuple(prefix + (suffix,) for
                    suffix in suffix_parents)
        return result

    def get_position(self, key):
        """Return details needed to access the version.

        :return: a tuple (key, data position, size) to hand to the access
            logic to get the record.
        """
        prefix, suffix = self._split_key(key)
        self._load_prefixes([prefix])
        entry = self._kndx_cache[prefix][0][suffix]
        return key, entry[2], entry[3]
    def _init_index(self, path, extra_lines=[]):
        """Initialize an index."""
        sio = StringIO()
        sio.write(self.HEADER)
        sio.writelines(extra_lines)
        sio.seek(0)
        self._transport.put_file_non_atomic(path, sio,
                            create_parent_dir=True)
                           # self._create_parent_dir)
                           # mode=self._file_mode,
                           # dir_mode=self._dir_mode)

    def keys(self):
        """Get all the keys in the collection.

        The keys are not ordered.
        """
        result = set()
        # Identify all key prefixes.
        # XXX: A bit hacky, needs polish.
        if type(self._mapper) == ConstantMapper:
            prefixes = [()]
        else:
            relpaths = set()
            for quoted_relpath in self._transport.iter_files_recursive():
                path, ext = os.path.splitext(quoted_relpath)
                relpaths.add(path)
            prefixes = [self._mapper.unmap(path) for path in relpaths]
        self._load_prefixes(prefixes)
        for prefix in prefixes:
            for suffix in self._kndx_cache[prefix][1]:
                result.add(prefix + (suffix,))
        return result
    def _load_prefixes(self, prefixes):
        """Load the indices for prefixes."""
        self._check_read()
        for prefix in prefixes:
            if prefix not in self._kndx_cache:
                # the load_data interface writes to these variables.
                self._cache = {}
                self._history = []
                self._filename = prefix
                try:
                    path = self._mapper.map(prefix) + '.kndx'
                    fp = self._transport.get(path)
                    try:
                        # _load_data may raise NoSuchFile if the target knit is
                        # completely empty.
                        _load_data(self, fp)
                    finally:
                        fp.close()
                    self._kndx_cache[prefix] = (self._cache, self._history)
                    del self._cache
                    del self._filename
                    del self._history
                except NoSuchFile:
                    self._kndx_cache[prefix] = ({}, [])
                    if type(self._mapper) == ConstantMapper:
                        # preserve behaviour for revisions.kndx etc.
                        self._init_index(path)
                    del self._cache
                    del self._filename
                    del self._history

    def _partition_keys(self, keys):
        """Turn keys into a dict of prefix:suffix_list."""
        result = {}
        for key in keys:
            prefix_keys = result.setdefault(key[:-1], [])
            prefix_keys.append(key[-1])
        return result
    def _dictionary_compress(self, keys):
        """Dictionary compress keys.

        :param keys: The keys to generate references to.
        :return: A string representation of keys. Keys which are present are
            dictionary compressed, and others are emitted as fulltext with a
            '.' prefix.
        """
        if not keys:
            return ''
        result_list = []
        prefix = keys[0][:-1]
        cache = self._kndx_cache[prefix][0]
        for key in keys:
            if key[:-1] != prefix:
                # kndx indices cannot refer across partitioned storage.
                raise ValueError("mismatched prefixes for %r" % keys)
            if key[-1] in cache:
                # -- inlined lookup() --
                result_list.append(str(cache[key[-1]][5]))
                # -- end lookup () --
            else:
                result_list.append('.' + key[-1])
        return ' '.join(result_list)
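A hypothetical standalone sketch of the dictionary compression performed above (helper name and ids are made up): parents already present in the cache are emitted as the sequence number of their first index entry, while absent (e.g. ghost) parents are emitted literally with a '.' prefix:

```python
def dictionary_compress(parent_ids, cache):
    """cache maps a revision id to the sequence number of its first entry."""
    refs = []
    for parent in parent_ids:
        if parent in cache:
            refs.append(str(cache[parent]))  # compressed: sequence number
        else:
            refs.append('.' + parent)        # fulltext with '.' prefix
    return ' '.join(refs)

cache = {'rev-0': 0, 'rev-1': 1}
assert dictionary_compress(['rev-1', 'ghost-rev'], cache) == '1 .ghost-rev'
```

This is why records referencing many known parents stay short: each reference costs only the digits of its sequence number rather than a full revision id.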
    def _reset_cache(self):
        # Possibly this should be a LRU cache. A dictionary from key_prefix to
        # (cache_dict, history_vector) for parsed kndx files.
        self._kndx_cache = {}
        self._scope = self._get_scope()
        allow_writes = self._allow_writes()
        if allow_writes:
            self._mode = 'w'
        else:
            self._mode = 'r'

    def _split_key(self, key):
        """Split key into a prefix and suffix."""
        return key[:-1], key[-1]
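The prefix/suffix convention used by `_split_key` and throughout this class can be illustrated with made-up keys (the ids below are hypothetical):

```python
# A .texts-style key is a (file_id, revision_id) tuple.
key = ('file-id-1', 'rev-id-7')
prefix, suffix = key[:-1], key[-1]
assert prefix == ('file-id-1',) and suffix == 'rev-id-7'

# Constant-mapped indexes (e.g. .revisions) use 1-tuple keys, so the
# prefix is the empty tuple ().
assert ('rev-id-7',)[:-1] == ()
```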
class _KnitGraphIndex(object):
    """A KnitVersionedFiles index layered on GraphIndex."""

    def __init__(self, graph_index, is_locked, deltas=False, parents=True,
        add_callback=None):
        """Construct a KnitGraphIndex on a graph_index.

        :param graph_index: An implementation of bzrlib.index.GraphIndex.
        :param is_locked: A callback that returns True if the index is locked
            and thus usable for answering queries.
        :param deltas: Allow delta-compressed records.
        :param parents: If True, record knit parents; if not, do not record
            parents.
        :param add_callback: If not None, allow additions to the index and call
            this callback with a list of added GraphIndex nodes:
            [(node, value, node_refs), ...]
        """
        self._add_callback = add_callback
        self._graph_index = graph_index
        self._deltas = deltas
        self._parents = parents
        if deltas and not parents:
            # XXX: TODO: Delta tree and parent graph should be conceptually
            # separate.
            raise KnitCorrupt(self, "Cannot do delta compression without "
                "parent tracking.")
        self.has_graph = parents
        self._is_locked = is_locked

    def __repr__(self):
        return "%s(%r)" % (self.__class__.__name__, self._graph_index)

    def add_records(self, records, random_id=False):
        """Add multiple records to the index.

        This function does not insert data into the Immutable GraphIndex
        backing the KnitGraphIndex; instead it prepares data for insertion by
        the caller, checks that it is safe to insert, then calls
        self._add_callback with the prepared GraphIndex nodes.

        :param records: a list of tuples:
                         (key, options, access_memo, parents).
        :param random_id: If True the ids being added were randomly generated
            and no check for existence will be performed.
        """
        if not self._add_callback:
            raise errors.ReadOnlyError(self)
        # we hope there are no repositories with inconsistent parentage
        # anymore.

        keys = {}
        for (key, options, access_memo, parents) in records:
            if self._parents:
                parents = tuple(parents)
            index, pos, size = access_memo
            if 'no-eol' in options:
                value = 'N'
            else:
                value = ' '
            value += "%d %d" % (pos, size)
            if not self._deltas:
                if 'line-delta' in options:
                    raise KnitCorrupt(self,
                        "attempt to add line-delta in non-delta knit")
            if self._parents:
                if self._deltas:
                    if 'line-delta' in options:
                        node_refs = (parents, (parents[0],))
                    else:
                        node_refs = (parents, ())
                else:
                    node_refs = (parents, )
            else:
                if parents:
                    raise KnitCorrupt(self, "attempt to add node with parents "
                        "in parentless index.")
                node_refs = ()
keys[key] = (value, node_refs)  | 
| 
3350.6.4
by Robert Collins
 First cut at pluralised VersionedFiles. Some rather massive API incompatabilities, primarily because of the difficulty of coherence among competing stores.  | 
2102  | 
        # check for dups
 | 
| 
2841.2.1
by Robert Collins
 * Commit no longer checks for new text keys during insertion when the  | 
2103  | 
if not random_id:  | 
2104  | 
present_nodes = self._get_entries(keys)  | 
|
2105  | 
for (index, key, value, node_refs) in present_nodes:  | 
|
| 
3350.6.4
by Robert Collins
 First cut at pluralised VersionedFiles. Some rather massive API incompatabilities, primarily because of the difficulty of coherence among competing stores.  | 
2106  | 
if (value[0] != keys[key][0][0] or  | 
2107  | 
node_refs != keys[key][1]):  | 
|
2108  | 
raise KnitCorrupt(self, "inconsistent details in add_records"  | 
|
| 
2841.2.1
by Robert Collins
 * Commit no longer checks for new text keys during insertion when the  | 
2109  | 
": %s %s" % ((value, node_refs), keys[key]))  | 
2110  | 
del keys[key]  | 
|
| 
2592.3.17
by Robert Collins
 Add add_version(s) to KnitGraphIndex, completing the required api for KnitVersionedFile.  | 
2111  | 
result = []  | 
| 
2592.3.34
by Robert Collins
 Rough unfactored support for parentless KnitGraphIndexs.  | 
2112  | 
if self._parents:  | 
2113  | 
for key, (value, node_refs) in keys.iteritems():  | 
|
2114  | 
result.append((key, value, node_refs))  | 
|
2115  | 
else:  | 
|
2116  | 
for key, (value, node_refs) in keys.iteritems():  | 
|
2117  | 
result.append((key, value))  | 
|
| 
2592.3.19
by Robert Collins
 Change KnitGraphIndex from returning data to performing a callback on insertions.  | 
2118  | 
self._add_callback(result)  | 
| 
2592.3.17
by Robert Collins
 Add add_version(s) to KnitGraphIndex, completing the required api for KnitVersionedFile.  | 
2119  | 
|
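The index value built above packs a no-eol flag plus the byte offset and length of the record into a single string, which `_node_to_position` later parses back. A minimal round-trip sketch of that encoding (the helper names here are illustrative, not part of bzrlib):

```python
def encode_value(pos, size, no_eol):
    # First byte: 'N' when the text lacks a final newline, ' ' otherwise,
    # followed by "offset length" as decimal integers.
    flag = 'N' if no_eol else ' '
    return flag + "%d %d" % (pos, size)

def decode_value(value):
    # Mirrors _node_to_position: skip the flag byte, split offset/length.
    bits = value[1:].split(' ')
    return int(bits[0]), int(bits[1]), value[0] == 'N'
```

For example, `encode_value(0, 5, False)` yields `' 0 5'`, and decoding recovers the original triple.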

    def _check_read(self):
        """raise if reads are not permitted."""
        if not self._is_locked():
            raise errors.ObjectNotLocked(self)

    def _check_write_ok(self):
        """Assert if writes are not permitted."""
        if not self._is_locked():
            raise errors.ObjectNotLocked(self)

    def _compression_parent(self, an_entry):
        # return the key that an_entry is compressed against, or None
        # Grab the second parent list (as deltas implies parents currently)
        compression_parents = an_entry[3][1]
        if not compression_parents:
            return None
        if len(compression_parents) != 1:
            raise AssertionError(
                "Too many compression parents: %r" % compression_parents)
        return compression_parents[0]

    def get_build_details(self, keys):
        """Get the method, index_memo and compression parent for version_ids.

        Ghosts are omitted from the result.

        :param keys: An iterable of keys.
        :return: A dict of key:
            (index_memo, compression_parent, parents, record_details).
            index_memo
                opaque structure to pass to read_records to extract the raw
                data
            compression_parent
                Content that this record is built upon, may be None
            parents
                Logical parents of this node
            record_details
                extra information about the content which needs to be passed to
                Factory.parse_record
        """
        self._check_read()
        result = {}
        entries = self._get_entries(keys, False)
        for entry in entries:
            key = entry[1]
            if not self._parents:
                parents = ()
            else:
                parents = entry[3][0]
            if not self._deltas:
                compression_parent_key = None
            else:
                compression_parent_key = self._compression_parent(entry)
            noeol = (entry[2][0] == 'N')
            if compression_parent_key:
                method = 'line-delta'
            else:
                method = 'fulltext'
            result[key] = (self._node_to_position(entry),
                compression_parent_key, parents,
                (method, noeol))
        return result
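Each entry examined above is a GraphIndex node shaped like `(index, key, value, reference_lists)`: the value's first byte carries the no-eol flag, and when deltas are enabled the second reference list holds the compression parent. A standalone sketch of that classification step (the function name and sample tuples are ours, for illustration only):

```python
def classify_entry(entry, deltas=True):
    # entry mimics a GraphIndex node: (index, key, value, reference_lists)
    index, key, value, refs = entry
    noeol = value[0] == 'N'
    # Second reference list = compression parents, only meaningful with deltas.
    compression_parents = refs[1] if deltas else ()
    if compression_parents:
        return ('line-delta', noeol, compression_parents[0])
    return ('fulltext', noeol, None)
```

A record with a compression parent is rebuilt as a line-delta against it; without one it is stored (and read back) as a fulltext.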

    def _get_entries(self, keys, check_present=False):
        """Get the entries for keys.

        :param keys: An iterable of index key tuples.
        """
        keys = set(keys)
        found_keys = set()
        if self._parents:
            for node in self._graph_index.iter_entries(keys):
                yield node
                found_keys.add(node[1])
        else:
            # adapt parentless index to the rest of the code.
            for node in self._graph_index.iter_entries(keys):
                yield node[0], node[1], node[2], ()
                found_keys.add(node[1])
        if check_present:
            missing_keys = keys.difference(found_keys)
            if missing_keys:
                raise RevisionNotPresent(missing_keys.pop(), self)

    def get_method(self, key):
        """Return compression method of specified key."""
        return self._get_method(self._get_node(key))

    def _get_method(self, node):
        if not self._deltas:
            return 'fulltext'
        if self._compression_parent(node):
            return 'line-delta'
        else:
            return 'fulltext'

    def _get_node(self, key):
        try:
            return list(self._get_entries([key]))[0]
        except IndexError:
            raise RevisionNotPresent(key, self)

    def get_options(self, key):
        """Return a list representing options.

        e.g. ['foo', 'bar']
        """
        node = self._get_node(key)
        options = [self._get_method(node)]
        if node[2][0] == 'N':
            options.append('no-eol')
        return options

    def get_parent_map(self, keys):
        """Get a map of the parents of keys.

        :param keys: The keys to look up parents for.
        :return: A mapping from keys to parents. Absent keys are absent from
            the mapping.
        """
        self._check_read()
        nodes = self._get_entries(keys)
        result = {}
        if self._parents:
            for node in nodes:
                result[node[1]] = node[3][0]
        else:
            for node in nodes:
                result[node[1]] = None
        return result

    def get_position(self, key):
        """Return details needed to access the version.

        :return: a tuple (index, data position, size) to hand to the access
            logic to get the record.
        """
        node = self._get_node(key)
        return self._node_to_position(node)

    def keys(self):
        """Get all the keys in the collection.

        The keys are not ordered.
        """
        self._check_read()
        return [node[1] for node in self._graph_index.iter_all_entries()]

    def _node_to_position(self, node):
        """Convert an index value to position details."""
        bits = node[2][1:].split(' ')
        return node[0], int(bits[0]), int(bits[1])


class _KnitKeyAccess(object):
    """Access to records in .knit files."""

    def __init__(self, transport, mapper):
        """Create a _KnitKeyAccess with transport and mapper.

        :param transport: The transport the access object is rooted at.
        :param mapper: The mapper used to map keys to .knit files.
        """
        self._transport = transport
        self._mapper = mapper

    def add_raw_records(self, key_sizes, raw_data):
        """Add raw knit bytes to a storage area.

        The data is appended to the .knit file that each key maps to, one
        append per raw data item.

        :param key_sizes: An iterable of tuples containing the key and size of
            each raw data segment.
        :param raw_data: A bytestring containing the data.
        :return: A list of memos to retrieve the record later. Each memo is an
            opaque index memo. For _KnitKeyAccess the memo is (key, pos,
            length), where the key is the record key.
        """
        if type(raw_data) != str:
            raise AssertionError(
                'data must be plain bytes was %s' % type(raw_data))
        result = []
        offset = 0
        # TODO: This can be tuned for writing to sftp and other servers where
        # append() is relatively expensive by grouping the writes to each key
        # prefix.
        for key, size in key_sizes:
            path = self._mapper.map(key)
            try:
                base = self._transport.append_bytes(path + '.knit',
                    raw_data[offset:offset+size])
            except errors.NoSuchFile:
                self._transport.mkdir(osutils.dirname(path))
                base = self._transport.append_bytes(path + '.knit',
                    raw_data[offset:offset+size])
            # if base == 0:
            # chmod.
            offset += size
            result.append((key, base, size))
        return result

    def get_raw_records(self, memos_for_retrieval):
        """Get the raw bytes for records.

        :param memos_for_retrieval: An iterable containing the access memo for
            retrieving the bytes.
        :return: An iterator over the bytes of the records.
        """
        # first pass, group into same-prefix requests to minimise readv's issued.
        request_lists = []
        current_prefix = None
        for (key, offset, length) in memos_for_retrieval:
            if current_prefix == key[:-1]:
                current_list.append((offset, length))
            else:
                if current_prefix is not None:
                    request_lists.append((current_prefix, current_list))
                current_prefix = key[:-1]
                current_list = [(offset, length)]
        # handle the last entry
        if current_prefix is not None:
            request_lists.append((current_prefix, current_list))
        for prefix, read_vector in request_lists:
            path = self._mapper.map(prefix) + '.knit'
            for pos, data in self._transport.readv(path, read_vector):
                yield data
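The grouping pass above coalesces consecutive memos that share a key prefix so that each backing file needs only a single readv() call. The same idea as a standalone sketch (function name and sample memos are illustrative, not part of bzrlib):

```python
def group_by_prefix(memos):
    # Coalesce (key, offset, length) memos into one read vector per key
    # prefix, preserving order; runs of memos with the same prefix become
    # a single (prefix, [(offset, length), ...]) request.
    request_lists = []
    current_prefix = object()  # sentinel that never equals a real prefix
    current_list = []
    for key, offset, length in memos:
        if key[:-1] == current_prefix:
            current_list.append((offset, length))
        else:
            if current_list:
                request_lists.append((current_prefix, current_list))
            current_prefix = key[:-1]
            current_list = [(offset, length)]
    if current_list:
        request_lists.append((current_prefix, current_list))
    return request_lists
```

Two memos for keys `('f', 'a')` and `('f', 'b')` share the prefix `('f',)` and are merged into one request, while a following `('g', 'a')` memo starts a new one.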


class _DirectPackAccess(object):
    """Access to data in one or more packs with less translation."""

    def __init__(self, index_to_packs):
        """Create a _DirectPackAccess object.

        :param index_to_packs: A dict mapping index objects to the transport
            and file names for obtaining data.
        """
        self._container_writer = None
        self._write_index = None
        self._indices = index_to_packs

    def add_raw_records(self, key_sizes, raw_data):
        """Add raw knit bytes to a storage area.

        The data is spooled to the container writer in one bytes-record per
        raw data item.

        :param key_sizes: An iterable of tuples containing the key and size of
            each raw data segment.
        :param raw_data: A bytestring containing the data.
        :return: A list of memos to retrieve the record later. Each memo is an
            opaque index memo. For _DirectPackAccess the memo is (index, pos,
            length), where the index field is the write_index object supplied
            to the PackAccess object.
        """
        if type(raw_data) != str:
            raise AssertionError(
                'data must be plain bytes was %s' % type(raw_data))
        result = []
        offset = 0
        for key, size in key_sizes:
            p_offset, p_length = self._container_writer.add_bytes_record(
                raw_data[offset:offset+size], [])
            offset += size
            result.append((self._write_index, p_offset, p_length))
        return result

    def get_raw_records(self, memos_for_retrieval):
        """Get the raw bytes for records.

        :param memos_for_retrieval: An iterable containing the (index, pos,
            length) memo for retrieving the bytes. The Pack access method
            looks up the pack to use for a given record in its index_to_pack
            map.
        :return: An iterator over the bytes of the records.
        """
        # first pass, group into same-index requests
        request_lists = []
        current_index = None
        for (index, offset, length) in memos_for_retrieval:
            if current_index == index:
                current_list.append((offset, length))
            else:
                if current_index is not None:
                    request_lists.append((current_index, current_list))
                current_index = index
                current_list = [(offset, length)]
        # handle the last entry
        if current_index is not None:
            request_lists.append((current_index, current_list))
        for index, offsets in request_lists:
            transport, path = self._indices[index]
            reader = pack.make_readv_reader(transport, path, offsets)
            for names, read_func in reader.iter_records():
                yield read_func(None)

    def set_writer(self, writer, index, transport_packname):
        """Set a writer to use for adding data."""
        if index is not None:
            self._indices[index] = transport_packname
        self._container_writer = writer
        self._write_index = index


# Deprecated, use PatienceSequenceMatcher instead
KnitSequenceMatcher = patiencediff.PatienceSequenceMatcher
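`PatienceSequenceMatcher` follows the stdlib `difflib.SequenceMatcher` interface, which is why the alias above is a drop-in deprecation shim. A small sketch of the shared API, using `difflib` as a stand-in so it runs without bzrlib installed:

```python
import difflib

a = ['one\n', 'two\n', 'three\n']
b = ['one\n', 'two!\n', 'three\n']
# PatienceSequenceMatcher exposes the same constructor and methods.
matcher = difflib.SequenceMatcher(None, a, b)
# Matching blocks are (a_start, b_start, length) triples ending with a
# zero-length sentinel; the knit code uses such blocks as
# left_matching_blocks hints when annotating.
blocks = matcher.get_matching_blocks()
```

The patience variant produces the same block shape but tends to pick more human-readable matches on reordered code.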


def annotate_knit(knit, revision_id):
    """Annotate a knit with no cached annotations.

    This implementation is for knits with no cached annotations.
    It will work for knits with cached annotations, but this is not
    recommended.
    """
    annotator = _KnitAnnotator(knit)
    return iter(annotator.annotate(revision_id))


class _KnitAnnotator(object):
    """Build up the annotations for a text."""

    def __init__(self, knit):
        self._knit = knit

        # Content objects, differs from fulltexts because of how final newlines
        # are treated by knits. the content objects here will always have a
        # final newline
        self._fulltext_contents = {}

        # Annotated lines of specific revisions
        self._annotated_lines = {}

        # Track the raw data for nodes that we could not process yet.
        # This maps the revision_id of the base to a list of children that will
        # be annotated from it.
        self._pending_children = {}

        # Nodes which cannot be extracted
        self._ghosts = set()

        # Track how many children this node has, so we know if we need to keep
        # it
        self._annotate_children = {}
        self._compression_children = {}

        self._all_build_details = {}
        # The children => parent revision_id graph
        self._revision_id_graph = {}

        self._heads_provider = None

        self._nodes_to_keep_annotations = set()
        self._generations_until_keep = 100

    def set_generations_until_keep(self, value):
        """Set the number of generations before caching a node.

        Setting this to -1 will cache every merge node, setting this higher
        will cache fewer nodes.
        """
        self._generations_until_keep = value

    def _add_fulltext_content(self, revision_id, content_obj):
        self._fulltext_contents[revision_id] = content_obj
        # TODO: jam 20080305 It might be good to check the sha1digest here
        return content_obj.text()

    def _check_parents(self, child, nodes_to_annotate):
        """Check if all parents have been processed.

        :param child: A tuple of (rev_id, parents, raw_content)
        :param nodes_to_annotate: If child is ready, add it to
            nodes_to_annotate, otherwise put it back in self._pending_children
        """
        for parent_id in child[1]:
            if (parent_id not in self._annotated_lines):
                # This parent is not yet annotated; queue the child until it is
                self._pending_children.setdefault(parent_id,
                                                  []).append(child)
                break
        else:
            # This one is ready to be processed
            nodes_to_annotate.append(child)

    def _add_annotation(self, revision_id, fulltext, parent_ids,
                        left_matching_blocks=None):
        """Add an annotation entry.

        All parents should already have been annotated.
        :return: A list of children that now have their parents satisfied.
        """
        a = self._annotated_lines
        annotated_parent_lines = [a[p] for p in parent_ids]
        annotated_lines = list(annotate.reannotate(annotated_parent_lines,
            fulltext, revision_id, left_matching_blocks,
            heads_provider=self._get_heads_provider()))
        self._annotated_lines[revision_id] = annotated_lines
        for p in parent_ids:
            ann_children = self._annotate_children[p]
            ann_children.remove(revision_id)
            if (not ann_children
                and p not in self._nodes_to_keep_annotations):
                del self._annotated_lines[p]
                del self._all_build_details[p]
                if p in self._fulltext_contents:
                    del self._fulltext_contents[p]
        # Now that we've added this one, see if there are any pending
        # deltas to be done, certainly this parent is finished
        nodes_to_annotate = []
        for child in self._pending_children.pop(revision_id, []):
            self._check_parents(child, nodes_to_annotate)
        return nodes_to_annotate
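Taken together, `_check_parents` and `_add_annotation` form a parent-first scheduler: a node blocked on an unannotated parent is parked in `_pending_children`, and finishing that parent releases it. A minimal standalone sketch of the same scheduling pattern (the function name and graph encoding are ours, for illustration only):

```python
def annotate_order(parents_by_node):
    # Emit each node only after all of its parents, parking blocked nodes
    # on one missing parent at a time (the _pending_children pattern).
    done, order = set(), []
    pending = {}  # parent -> children waiting on it

    def try_process(node):
        if node in done:
            return
        missing = [p for p in parents_by_node[node] if p not in done]
        if missing:
            # Park the node on the first missing parent, as _check_parents
            # does with self._pending_children.
            pending.setdefault(missing[0], []).append(node)
            return
        done.add(node)
        order.append(node)
        # Completing this node may unblock children, as in _add_annotation.
        for child in pending.pop(node, []):
            try_process(child)

    for node in parents_by_node:
        try_process(node)
    return order
```

For a small merge graph this yields a topological order regardless of the order nodes are first visited in.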
    def _get_build_graph(self, key):
        """Get the graphs for building texts and annotations.

        The data you need for creating a full text may be different than the
        data you need to annotate that text. (At a minimum, you need both
        parents to create an annotation, but only need 1 parent to generate the
        fulltext.)

        :return: A list of (key, index_memo) records, suitable for
            passing to read_records_iter to start reading in the raw data from
            the pack file.
        """
        if key in self._annotated_lines:
            # Nothing to do
            return []
        pending = set([key])
        records = []
        generation = 0
        kept_generation = 0
        while pending:
            # get all pending nodes
            generation += 1
            this_iteration = pending
            build_details = self._knit._index.get_build_details(this_iteration)
            self._all_build_details.update(build_details)
            # new_nodes = self._knit._index._get_entries(this_iteration)
            pending = set()
            for key, details in build_details.iteritems():
                (index_memo, compression_parent, parents,
                 record_details) = details
                self._revision_id_graph[key] = parents
                records.append((key, index_memo))
                # Do we actually need to check _annotated_lines?
                pending.update(p for p in parents
                               if p not in self._all_build_details)
                if compression_parent:
                    self._compression_children.setdefault(compression_parent,
                        []).append(key)
                if parents:
                    for parent in parents:
                        self._annotate_children.setdefault(parent,
                            []).append(key)
                num_gens = generation - kept_generation
                if ((num_gens >= self._generations_until_keep)
                    and len(parents) > 1):
                    kept_generation = generation
                    self._nodes_to_keep_annotations.add(key)

            missing_versions = this_iteration.difference(build_details.keys())
            self._ghosts.update(missing_versions)
            for missing_version in missing_versions:
                # add a key, no parents
                self._revision_id_graph[missing_version] = ()
                pending.discard(missing_version) # don't look for it
        if self._ghosts.intersection(self._compression_children):
            raise KnitCorrupt(
                "We cannot have nodes which have a ghost compression parent:\n"
                "ghosts: %r\n"
                "compression children: %r"
                % (self._ghosts, self._compression_children))
        # Cleanout anything that depends on a ghost so that we don't wait for
        # the ghost to show up
        for node in self._ghosts:
            if node in self._annotate_children:
                # We won't be building this node
                del self._annotate_children[node]
        # Generally we will want to read the records in reverse order, because
        # we find the parent nodes after the children
        records.reverse()
        return records

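The generation walk in `_get_build_graph` is easier to see in miniature. The sketch below keeps only its skeleton — breadth-first over a plain dict parent graph, collecting records child-first and reversing so parents are read before children, with ghosts (keys absent from the graph) silently skipped. `build_order` and `parent_graph` are illustrative names under those simplifications, not bzrlib API:

```python
def build_order(parent_graph, key):
    """Toy sketch: walk the ancestry of `key` breadth-first, then reverse
    so parents come before children (the order deltas must be applied)."""
    pending = set([key])
    seen = set()
    records = []
    while pending:
        this_iteration = pending
        pending = set()
        for k in sorted(this_iteration):  # sorted only for determinism
            if k in seen or k not in parent_graph:
                continue  # already handled, or a ghost with no record
            seen.add(k)
            records.append(k)
            pending.update(p for p in parent_graph[k] if p not in seen)
    records.reverse()
    return records
```

For example, `build_order({'C': ('B',), 'B': ('A',), 'A': ()}, 'C')` yields `['A', 'B', 'C']`: the walk finds C, then B, then A, and the final reverse puts each parent ahead of its children.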
    def _annotate_records(self, records):
        """Build the annotations for the listed records."""
        # We iterate in the order read, rather than a strict order requested
        # However, process what we can, and put off to the side things that
        # still need parents, cleaning them up when those parents are
        # processed.
        for (rev_id, record,
             digest) in self._knit._read_records_iter(records):
            if rev_id in self._annotated_lines:
                continue
            parent_ids = self._revision_id_graph[rev_id]
            parent_ids = [p for p in parent_ids if p not in self._ghosts]
            details = self._all_build_details[rev_id]
            (index_memo, compression_parent, parents,
             record_details) = details
            nodes_to_annotate = []
            # TODO: Remove the punning between compression parents, and
            #       parent_ids, we should be able to do this without assuming
            #       the build order
            if len(parent_ids) == 0:
                # There are no parents for this node, so just add it
                # TODO: This probably needs to be decoupled
                fulltext_content, delta = self._knit._factory.parse_record(
                    rev_id, record, record_details, None)
                fulltext = self._add_fulltext_content(rev_id, fulltext_content)
                nodes_to_annotate.extend(self._add_annotation(rev_id, fulltext,
                    parent_ids, left_matching_blocks=None))
            else:
                child = (rev_id, parent_ids, record)
                # Check if all the parents are present
                self._check_parents(child, nodes_to_annotate)
            while nodes_to_annotate:
                # Should we use a queue here instead of a stack?
                (rev_id, parent_ids, record) = nodes_to_annotate.pop()
                (index_memo, compression_parent, parents,
                 record_details) = self._all_build_details[rev_id]
                if compression_parent is not None:
                    comp_children = self._compression_children[compression_parent]
                    if rev_id not in comp_children:
                        raise AssertionError("%r not in compression children %r"
                            % (rev_id, comp_children))
                    # If there is only 1 child, it is safe to reuse this
                    # content
                    reuse_content = (len(comp_children) == 1
                        and compression_parent not in
                            self._nodes_to_keep_annotations)
                    if reuse_content:
                        # Remove it from the cache since it will be changing
                        parent_fulltext_content = self._fulltext_contents.pop(compression_parent)
                        # Make sure to copy the fulltext since it might be
                        # modified
                        parent_fulltext = list(parent_fulltext_content.text())
                    else:
                        parent_fulltext_content = self._fulltext_contents[compression_parent]
                        parent_fulltext = parent_fulltext_content.text()
                    comp_children.remove(rev_id)
                    fulltext_content, delta = self._knit._factory.parse_record(
                        rev_id, record, record_details,
                        parent_fulltext_content,
                        copy_base_content=(not reuse_content))
                    fulltext = self._add_fulltext_content(rev_id,
                                                          fulltext_content)
                    blocks = KnitContent.get_line_delta_blocks(delta,
                            parent_fulltext, fulltext)
                else:
                    fulltext_content = self._knit._factory.parse_fulltext(
                        record, rev_id)
                    fulltext = self._add_fulltext_content(rev_id,
                        fulltext_content)
                    blocks = None
                nodes_to_annotate.extend(
                    self._add_annotation(rev_id, fulltext, parent_ids,
                        left_matching_blocks=blocks))

    def _get_heads_provider(self):
        """Create a heads provider for resolving ancestry issues."""
        if self._heads_provider is not None:
            return self._heads_provider
        parent_provider = _mod_graph.DictParentsProvider(
            self._revision_id_graph)
        graph_obj = _mod_graph.Graph(parent_provider)
        head_cache = _mod_graph.FrozenHeadsCache(graph_obj)
        self._heads_provider = head_cache
        return head_cache

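The reason `_get_heads_provider` wraps the graph in a `FrozenHeadsCache` is that reannotation asks the same heads() question about the same small sets of parents over and over. A minimal sketch of that memoization idea, under the assumption that heads answers never change for a frozen graph (the class name and signature here are hypothetical, not bzrlib's):

```python
class CachingHeads(object):
    """Toy sketch: memoize heads() answers, keyed on the frozenset of
    query keys, so repeated queries hit the cache (hypothetical class)."""

    def __init__(self, heads_func):
        self._heads_func = heads_func  # the expensive graph query
        self._cache = {}

    def heads(self, keys):
        frozen = frozenset(keys)
        try:
            return self._cache[frozen]
        except KeyError:
            result = self._cache[frozen] = self._heads_func(frozen)
            return result
```

Because the key is a frozenset, `heads(['a', 'b'])` and `heads(['b', 'a'])` share one cache entry, which is exactly the access pattern annotation produces.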
    def annotate(self, key):
        """Return the annotated fulltext at the given key.

        :param key: The key to annotate.
        """
        if True or len(self._knit._fallback_vfs) > 0:
            # stacked knits can't use the fast path at present.
            return self._simple_annotate(key)
        records = self._get_build_graph(key)
        if key in self._ghosts:
            raise errors.RevisionNotPresent(key, self._knit)
        self._annotate_records(records)
        return self._annotated_lines[key]

    def _simple_annotate(self, key):
        """Return annotated fulltext, rediffing from the full texts.

        This is slow but makes no assumptions about the repository
        being able to produce line deltas.
        """
        # TODO: this code generates a parent map of present ancestors; it
        # could be split out into a separate method, and probably should use
        # iter_ancestry instead. -- mbp and robertc 20080704
        graph = _mod_graph.Graph(self._knit)
        head_cache = _mod_graph.FrozenHeadsCache(graph)
        search = graph._make_breadth_first_searcher([key])
        keys = set()
        while True:
            try:
                present, ghosts = search.next_with_ghosts()
            except StopIteration:
                break
            keys.update(present)
        parent_map = self._knit.get_parent_map(keys)
        parent_cache = {}
        reannotate = annotate.reannotate
        for record in self._knit.get_record_stream(keys, 'topological', True):
            key = record.key
            fulltext = split_lines(record.get_bytes_as('fulltext'))
            parents = parent_map[key]
            if parents is not None:
                parent_lines = [parent_cache[parent] for parent in parent_map[key]]
            else:
                parent_lines = []
            parent_cache[key] = list(
                reannotate(parent_lines, fulltext, key, None, head_cache))
        try:
            return parent_cache[key]
        except KeyError, e:
            raise errors.RevisionNotPresent(key, self._knit)

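The heavy lifting in `_simple_annotate` happens inside `annotate.reannotate`: lines that match a parent's annotated text keep the parent's attribution, and everything else is attributed to the current revision. A toy single-parent version of that idea, using `difflib` matching in place of knit deltas (`reannotate_one` is a hypothetical name, not the bzrlib function):

```python
import difflib

def reannotate_one(parent_annotated, fulltext, revision_id):
    """Toy single-parent reannotate: lines matching the parent keep the
    parent's (origin, line) pair; new or changed lines get revision_id."""
    parent_lines = [line for (origin, line) in parent_annotated]
    # Start by blaming every line on the new revision ...
    result = [(revision_id, line) for line in fulltext]
    # ... then copy annotations across for every matching run of lines.
    matcher = difflib.SequenceMatcher(None, parent_lines, fulltext)
    for i, j, n in matcher.get_matching_blocks():
        for offset in range(n):
            result[j + offset] = parent_annotated[i + offset]
    return result
```

So inserting `'c\n'` between two unchanged lines of a parent annotated as `r1` blames only the new line on `r2`; the real function additionally takes multiple parents and a heads provider to break attribution ties.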
try:
    from bzrlib._knit_load_data_c import _load_data_c as _load_data
except ImportError:
    from bzrlib._knit_load_data_py import _load_data_py as _load_data