# Copyright (C) 2006-2010 Canonical Ltd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

"""Reconcilers are able to fix some potential data errors in a branch."""

# The docstring must precede the __future__ import to be picked up as the
# module docstring.
from __future__ import absolute_import


__all__ = [
    'KnitReconciler',
    'PackReconciler',
    'reconcile',
    'Reconciler',
    'RepoReconciler',
    ]


from bzrlib import (
    cleanup,
    errors,
    revision as _mod_revision,
    ui,
    )
from bzrlib.trace import mutter
from bzrlib.tsort import topo_sort
from bzrlib.versionedfile import AdapterFactory, FulltextContentFactory
from bzrlib.i18n import gettext


def reconcile(dir, canonicalize_chks=False):
    """Reconcile the data in dir.

    Currently this is limited to an inventory 'reweave'.

    This is a convenience function that wraps a Reconciler object.

    Directly using Reconciler is recommended for library users that
    desire fine-grained control or analysis of the found issues.

    :param canonicalize_chks: Make sure CHKs are in canonical form.
    """
    reconciler = Reconciler(dir, canonicalize_chks=canonicalize_chks)
    reconciler.reconcile()


class Reconciler(object):
    """Reconcilers are used to reconcile existing data."""

    def __init__(self, dir, other=None, canonicalize_chks=False):
        """Create a Reconciler."""
        self.bzrdir = dir
        self.canonicalize_chks = canonicalize_chks

    def reconcile(self):
        """Perform reconciliation.

        After reconciliation the following attributes document found issues:

        * `inconsistent_parents`: The number of revisions in the repository
          whose ancestry was being reported incorrectly.
        * `garbage_inventories`: The number of inventory objects without
          revisions that were garbage collected.
        * `fixed_branch_history`: None if there was no branch, False if the
          branch history was correct, True if the branch history needed to be
          re-normalized.
        """
        self.pb = ui.ui_factory.nested_progress_bar()
        try:
            self._reconcile()
        finally:
            self.pb.finished()

    def _reconcile(self):
        """Helper function for performing reconciliation."""
        self._reconcile_branch()
        self._reconcile_repository()

    def _reconcile_branch(self):
        try:
            self.branch = self.bzrdir.open_branch()
        except errors.NotBranchError:
            # Nothing to check here
            self.fixed_branch_history = None
            return
        ui.ui_factory.note(gettext('Reconciling branch %s') % self.branch.base)
        branch_reconciler = self.branch.reconcile(thorough=True)
        self.fixed_branch_history = branch_reconciler.fixed_history

    def _reconcile_repository(self):
        self.repo = self.bzrdir.find_repository()
        ui.ui_factory.note(gettext('Reconciling repository %s') %
            self.repo.user_url)
        self.pb.update(gettext("Reconciling repository"), 0, 1)
        if self.canonicalize_chks:
            # Probe for the optional method so that unsupported repository
            # formats produce a clean error instead of a traceback.
            try:
                self.repo.reconcile_canonicalize_chks
            except AttributeError:
                raise errors.BzrError(
                    gettext("%s cannot canonicalize CHKs.") % (self.repo,))
            repo_reconciler = self.repo.reconcile_canonicalize_chks()
        else:
            repo_reconciler = self.repo.reconcile(thorough=True)
        self.inconsistent_parents = repo_reconciler.inconsistent_parents
        self.garbage_inventories = repo_reconciler.garbage_inventories
        if repo_reconciler.aborted:
            ui.ui_factory.note(gettext(
                'Reconcile aborted: revision index has inconsistent parents.'))
            ui.ui_factory.note(gettext(
                'Run "bzr check" for more details.'))
        else:
            ui.ui_factory.note(gettext('Reconciliation complete.'))


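# A minimal standalone sketch of the dispatch used in _reconcile_repository
# above: probe the repository for an optional method and fail with a clear
# error when it is missing. FakeRepo, UnsupportedOperation and
# choose_reconcile are hypothetical names used only for illustration; they
# are not part of bzrlib.

```python
class UnsupportedOperation(Exception):
    """Hypothetical stand-in for errors.BzrError."""


class FakeRepo(object):
    """Hypothetical repository that only supports a plain reconcile."""

    def reconcile(self, thorough=False):
        return 'reconciled (thorough=%s)' % thorough


def choose_reconcile(repo, canonicalize_chks=False):
    """Return a zero-argument callable that runs the requested reconcile."""
    if canonicalize_chks:
        # getattr with a default avoids a bare AttributeError traceback.
        method = getattr(repo, 'reconcile_canonicalize_chks', None)
        if method is None:
            raise UnsupportedOperation(
                '%r cannot canonicalize CHKs.' % (repo,))
        return method
    return lambda: repo.reconcile(thorough=True)
```

# Calling choose_reconcile(FakeRepo())() runs the thorough reconcile, while
# passing canonicalize_chks=True raises UnsupportedOperation for this stub.
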
class BranchReconciler(object):
    """Reconciler that works on a branch."""

    def __init__(self, a_branch, thorough=False):
        self.fixed_history = None
        self.thorough = thorough
        self.branch = a_branch

    def reconcile(self):
        operation = cleanup.OperationWithCleanups(self._reconcile)
        self.add_cleanup = operation.add_cleanup
        operation.run_simple()

    def _reconcile(self):
        self.branch.lock_write()
        self.add_cleanup(self.branch.unlock)
        self.pb = ui.ui_factory.nested_progress_bar()
        self.add_cleanup(self.pb.finished)
        self._reconcile_steps()

    def _reconcile_steps(self):
        self._reconcile_revision_history()

    def _reconcile_revision_history(self):
        last_revno, last_revision_id = self.branch.last_revision_info()
        real_history = []
        graph = self.branch.repository.get_graph()
        try:
            for revid in graph.iter_lefthand_ancestry(
                    last_revision_id, (_mod_revision.NULL_REVISION,)):
                real_history.append(revid)
        except errors.RevisionNotPresent:
            pass  # Hit a ghost left-hand parent
        real_history.reverse()
        if last_revno != len(real_history):
            self.fixed_history = True
            # Technically for Branch5 formats, it is more efficient to use
            # set_revision_history, as this will regenerate it again.
            # Not really worth a whole BranchReconciler class just for this,
            # though.
            ui.ui_factory.note(gettext('Fixing last revision info {0} '
                                       ' => {1}').format(
                                       last_revno, len(real_history)))
            self.branch.set_last_revision_info(len(real_history),
                                               last_revision_id)
        else:
            self.fixed_history = False
            ui.ui_factory.note(gettext('revision_history ok.'))


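# A minimal standalone sketch of the check in _reconcile_revision_history
# above: walk first (left-hand) parents from the tip, stop at the null
# revision or at a ghost, and compare the length of that walk with the
# recorded revno. The parent_map dictionary, the NULL_REVISION value and both
# function names are assumptions made for this illustration; the real code
# obtains the ancestry from the repository graph.

```python
NULL_REVISION = 'null:'  # assumed sentinel for "no revision"


def lefthand_history(parent_map, tip):
    """Return the mainline from oldest to newest, stopping at ghosts."""
    history = []
    current = tip
    while current != NULL_REVISION:
        if current not in parent_map:
            # A ghost: referenced as a parent but not present.
            break
        history.append(current)
        parents = parent_map[current]
        current = parents[0] if parents else NULL_REVISION
    history.reverse()
    return history


def revno_needs_fix(recorded_revno, parent_map, tip):
    """True when the recorded revno disagrees with the mainline length."""
    return recorded_revno != len(lefthand_history(parent_map, tip))
```

# Only the left-hand parent is followed, so merged revisions (second and
# later parents) never contribute to the revno.
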
class RepoReconciler(object):
    """Reconciler that reconciles a repository.

    The goal of repository reconciliation is to make any derived data
    consistent with the core data committed by a user. This can involve
    reindexing, or removing unreferenced data if that can interfere with
    queries in a given repository.

    Currently this consists of an inventory reweave with revision cross-checks.
    """

    def __init__(self, repo, other=None, thorough=False):
        """Construct a RepoReconciler.

        :param thorough: perform a thorough check which may take longer but
                         will correct non-data loss issues such as incorrect
                         cached data.
        """
        self.garbage_inventories = 0
        self.inconsistent_parents = 0
        self.aborted = False
        self.repo = repo
        self.thorough = thorough

    def reconcile(self):
        """Perform reconciliation.

        After reconciliation the following attributes document found issues:

        * `inconsistent_parents`: The number of revisions in the repository
          whose ancestry was being reported incorrectly.
        * `garbage_inventories`: The number of inventory objects without
          revisions that were garbage collected.
        """
        operation = cleanup.OperationWithCleanups(self._reconcile)
        self.add_cleanup = operation.add_cleanup
        operation.run_simple()

    def _reconcile(self):
        self.repo.lock_write()
        self.add_cleanup(self.repo.unlock)
        self.pb = ui.ui_factory.nested_progress_bar()
        self.add_cleanup(self.pb.finished)
        self._reconcile_steps()

    def _reconcile_steps(self):
        """Perform the steps to reconcile this repository."""
        self._reweave_inventory()

    def _reweave_inventory(self):
        """Regenerate the inventory weave for the repository from scratch.

        This is a smart function: it will only do the reweave if doing it
        will correct data issues. The self.thorough flag controls whether
        only data-loss causing issues (not self.thorough) or all issues
        (self.thorough) are treated as requiring the reweave.
        """
        transaction = self.repo.get_transaction()
        self.pb.update(gettext('Reading inventory data'))
        self.inventory = self.repo.inventories
        self.revisions = self.repo.revisions
        # the total set of revisions to process
        self.pending = set([key[-1] for key in self.revisions.keys()])

        # mapping from revision_id to parents
        self._rev_graph = {}
        # errors that we detect
        self.inconsistent_parents = 0
        # we need the revision id of each revision and its available parents list
        self._setup_steps(len(self.pending))
        for rev_id in self.pending:
            # put a revision into the graph.
            self._graph_revision(rev_id)
        self._check_garbage_inventories()
        # if there are no inconsistent_parents and
        # (no garbage inventories or we are not doing a thorough check)
        if (not self.inconsistent_parents and
            (not self.garbage_inventories or not self.thorough)):
            ui.ui_factory.note(gettext('Inventory ok.'))
            return
        self.pb.update(gettext('Backing up inventory'), 0, 0)
        self.repo._backup_inventory()
        ui.ui_factory.note(gettext('Backup inventory created.'))
        new_inventories = self.repo._temp_inventories()

        # we have topological order of revisions and non ghost parents ready.
        self._setup_steps(len(self._rev_graph))
        revision_keys = [(rev_id,) for rev_id in topo_sort(self._rev_graph)]
        stream = self._change_inv_parents(
            self.inventory.get_record_stream(revision_keys, 'unordered', True),
            self._new_inv_parents,
            set(revision_keys))
        new_inventories.insert_record_stream(stream)
        # if this worked, the set of new_inventories.keys should equal
        # self.pending
        if not (set(new_inventories.keys()) ==
            set([(revid,) for revid in self.pending])):
            raise AssertionError()
        self.pb.update(gettext('Writing weave'))
        self.repo._activate_new_inventory()
        self.inventory = None
        ui.ui_factory.note(gettext('Inventory regenerated.'))

    def _new_inv_parents(self, revision_key):
        """Lookup ghost-filtered parents for revision_key."""
        # Use the filtered ghostless parents list:
        return tuple([(revid,) for revid in self._rev_graph[revision_key[-1]]])

    def _change_inv_parents(self, stream, get_parents, all_revision_keys):
        """Adapt a record stream to reconcile the parents."""
        for record in stream:
            wanted_parents = get_parents(record.key)
            if wanted_parents and wanted_parents[0] not in all_revision_keys:
                # The check for the left most parent only handles knit
                # compressors, but this code only applies to knit and weave
                # repositories anyway.
                bytes = record.get_bytes_as('fulltext')
                yield FulltextContentFactory(
                    record.key, wanted_parents, record.sha1, bytes)
            else:
                adapted_record = AdapterFactory(record.key, wanted_parents, record)
                yield adapted_record
            self._reweave_step('adding inventories')

    def _setup_steps(self, new_total):
        """Setup the markers we need to control the progress bar."""
        self.total = new_total
        self.count = 0

    def _graph_revision(self, rev_id):
        """Load a revision into the revision graph."""
        # pick a random revision
        # analyse revision id rev_id and put it in the stack.
        self._reweave_step('loading revisions')
        rev = self.repo.get_revision_reconcile(rev_id)
        parents = []
        for parent in rev.parent_ids:
            if self._parent_is_available(parent):
                parents.append(parent)
            else:
                mutter('found ghost %s', parent)
        self._rev_graph[rev_id] = parents

    def _check_garbage_inventories(self):
        """Check for garbage inventories which we cannot trust.

        We can't trust them because their prerequisite file data may not
        be present - all we know is that their revision was not installed.
        """
        if not self.thorough:
            return
        inventories = set(self.inventory.keys())
        revisions = set(self.revisions.keys())
        garbage = inventories.difference(revisions)
        self.garbage_inventories = len(garbage)
        for revision_key in garbage:
            mutter('Garbage inventory {%s} found.', revision_key[-1])

    def _parent_is_available(self, parent):
        """True if parent is a fully available revision.

        A fully available revision has an inventory and a revision object in
        the repository.
        """
        if parent in self._rev_graph:
            return True
        inv_present = (1 == len(self.inventory.get_parent_map([(parent,)])))
        return (inv_present and self.repo.has_revision(parent))

    def _reweave_step(self, message):
        """Mark a single step of regeneration complete."""
        self.pb.update(message, self.count, self.total)
        self.count += 1


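# A minimal standalone sketch (Python 3.9+) of the bookkeeping that
# RepoReconciler performs above: drop ghost parents, find inventories that
# have no revision, decide whether a reweave is needed, and order revisions
# so that parents come before children. Plain dictionaries and sets stand in
# for the repository's revision and inventory stores, and graphlib is swapped
# in for bzrlib.tsort.topo_sort; every function name here is hypothetical.

```python
from graphlib import TopologicalSorter


def build_ghost_free_graph(revision_parents, inventories):
    """Map each revision to the parents that are fully present.

    A parent counts as present only when it has both a revision entry and
    an inventory; anything else is treated as a ghost and dropped.
    """
    graph = {}
    for rev_id, parents in revision_parents.items():
        graph[rev_id] = [
            p for p in parents
            if p in revision_parents and p in inventories]
    return graph


def find_garbage_inventories(revision_parents, inventories):
    """Inventories with no matching revision are garbage."""
    return set(inventories) - set(revision_parents)


def needs_reweave(inconsistent_parents, garbage, thorough):
    """Inverse of the early-return test in _reweave_inventory."""
    return bool(inconsistent_parents or (garbage and thorough))


def parents_first(graph):
    """Return revision ids ordered so that parents precede children."""
    # TopologicalSorter expects a mapping of node -> predecessors, which is
    # exactly the revision -> parents shape built above.
    return list(TopologicalSorter(graph).static_order())
```

# Garbage inventories alone only trigger a reweave in thorough mode, whereas
# inconsistent parents always do.
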
351
class KnitReconciler(RepoReconciler):
352
    """Reconciler that reconciles a knit format repository.
353
2592.3.80 by Robert Collins
Make reconcile work, and pass tests.
354
    This will detect garbage inventories and remove them in thorough mode.
1594.2.7 by Robert Collins
Add versionedfile.fix_parents api for correcting data post hoc.
355
    """
356
357
    def _reconcile_steps(self):
358
        """Perform the steps to reconcile this repository."""
1692.1.1 by Robert Collins
* Repository.reconcile now takes a thorough keyword parameter to allow
359
        if self.thorough:
2819.2.5 by Andrew Bennetts
Make reconcile abort gracefully if the revision index has bad parents.
360
            try:
361
                self._load_indexes()
362
            except errors.BzrCheckError:
363
                self.aborted = True
364
                return
1692.1.1 by Robert Collins
* Repository.reconcile now takes a thorough keyword parameter to allow
365
            # knits never suffer this
366
            self._gc_inventory()
2745.6.13 by Aaron Bentley
Misc cleanup
367
            self._fix_text_parents()
1594.2.7 by Robert Collins
Add versionedfile.fix_parents api for correcting data post hoc.
368
    def _load_indexes(self):
        """Load indexes for the reconciliation."""
        self.transaction = self.repo.get_transaction()
        self.pb.update(gettext('Reading indexes'), 0, 2)
        self.inventory = self.repo.inventories
        self.pb.update(gettext('Reading indexes'), 1, 2)
        self.repo._check_for_inconsistent_revision_parents()
        self.revisions = self.repo.revisions
        self.pb.update(gettext('Reading indexes'), 2, 2)

    def _gc_inventory(self):
        """Remove inventories that are not referenced from the revision store."""
        self.pb.update(gettext('Checking unused inventories'), 0, 1)
        self._check_garbage_inventories()
        self.pb.update(gettext('Checking unused inventories'), 1, 3)
        if not self.garbage_inventories:
            ui.ui_factory.note(gettext('Inventory ok.'))
            return
        self.pb.update(gettext('Backing up inventory'), 0, 0)
        self.repo._backup_inventory()
        ui.ui_factory.note(gettext('Backup Inventory created'))
        # asking for '' should never return a non-empty weave
        new_inventories = self.repo._temp_inventories()
        # we have topological order of revisions and non ghost parents ready.
        graph = self.revisions.get_parent_map(self.revisions.keys())
        revision_keys = topo_sort(graph)
        revision_ids = [key[-1] for key in revision_keys]
        self._setup_steps(len(revision_keys))
        stream = self._change_inv_parents(
            self.inventory.get_record_stream(revision_keys, 'unordered', True),
            graph.__getitem__,
            set(revision_keys))
        new_inventories.insert_record_stream(stream)
        # if this worked, the set of new_inventories.keys() should equal
        # the set of revision keys
        if set(new_inventories.keys()) != set(revision_keys):
            raise AssertionError()
        self.pb.update(gettext('Writing weave'))
        self.repo._activate_new_inventory()
        self.inventory = None
        ui.ui_factory.note(gettext('Inventory regenerated.'))

    def _fix_text_parents(self):
        """Fix bad versionedfile parent entries.

        It is possible for the parents entry in a versionedfile entry to be
        inconsistent with the values in the revision and inventory.

        This method finds entries with such inconsistencies, corrects their
        parent lists, and replaces the versionedfile with a corrected version.
        """
        transaction = self.repo.get_transaction()
        versions = [key[-1] for key in self.revisions.keys()]
        mutter('Prepopulating revision text cache with %d revisions',
                len(versions))
        vf_checker = self.repo._get_versioned_file_checker()
        bad_parents, unused_versions = vf_checker.check_file_version_parents(
            self.repo.texts, self.pb)
        text_index = vf_checker.text_index
        per_id_bad_parents = {}
        for key in unused_versions:
            # Ensure that every file with unused versions gets rewritten.
            # NB: This is really not needed, reconcile != pack.
            per_id_bad_parents[key[0]] = {}
        # Generate per-knit/weave data.
        for key, details in bad_parents.iteritems():
            file_id = key[0]
            rev_id = key[1]
            knit_parents = tuple([parent[-1] for parent in details[0]])
            correct_parents = tuple([parent[-1] for parent in details[1]])
            file_details = per_id_bad_parents.setdefault(file_id, {})
            file_details[rev_id] = (knit_parents, correct_parents)
        file_id_versions = {}
        for text_key in text_index:
            versions_list = file_id_versions.setdefault(text_key[0], [])
            versions_list.append(text_key[1])
        # Do the reconcile of individual weaves.
        for num, file_id in enumerate(per_id_bad_parents):
            self.pb.update(gettext('Fixing text parents'), num,
                           len(per_id_bad_parents))
            versions_with_bad_parents = per_id_bad_parents[file_id]
            id_unused_versions = set(key[-1] for key in unused_versions
                if key[0] == file_id)
            if file_id in file_id_versions:
                file_versions = file_id_versions[file_id]
            else:
                # This id was present in the disk store but is not referenced
                # by any revision at all.
                file_versions = []
            self._fix_text_parent(file_id, versions_with_bad_parents,
                 id_unused_versions, file_versions)

    def _fix_text_parent(self, file_id, versions_with_bad_parents,
            unused_versions, all_versions):
        """Fix bad versionedfile entries in a single versioned file."""
        mutter('fixing text parent: %r (%d versions)', file_id,
                len(versions_with_bad_parents))
        mutter('(%d are unused)', len(unused_versions))
        new_file_id = 'temp:%s' % file_id
        new_parents = {}
        needed_keys = set()
        for version in all_versions:
            if version in unused_versions:
                continue
            elif version in versions_with_bad_parents:
                parents = versions_with_bad_parents[version][1]
            else:
                pmap = self.repo.texts.get_parent_map([(file_id, version)])
                parents = [key[-1] for key in pmap[(file_id, version)]]
            new_parents[(new_file_id, version)] = [
                (new_file_id, parent) for parent in parents]
            needed_keys.add((file_id, version))
        def fix_parents(stream):
            for record in stream:
                bytes = record.get_bytes_as('fulltext')
                new_key = (new_file_id, record.key[-1])
                parents = new_parents[new_key]
                yield FulltextContentFactory(new_key, parents, record.sha1, bytes)
        stream = self.repo.texts.get_record_stream(needed_keys, 'topological', True)
        self.repo._remove_file_id(new_file_id)
        self.repo.texts.insert_record_stream(fix_parents(stream))
        self.repo._remove_file_id(file_id)
        if len(new_parents):
            self.repo._move_file_id(new_file_id, file_id)
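
# Illustration only, not part of bzrlib: a minimal sketch of how
# _fix_text_parent chooses the parent list for each version, assuming plain
# dicts in place of the texts store. corrected_parent_map and stored_parents
# are hypothetical names; the (stored, correct) pair layout mirrors the
# per_id_bad_parents entries built in _fix_text_parents.

```python
def corrected_parent_map(file_id, all_versions, stored_parents,
                         versions_with_bad_parents, unused_versions):
    """Return {(file_id, version): [(file_id, parent), ...]} after fixing."""
    new_parents = {}
    for version in all_versions:
        if version in unused_versions:
            # Unused versions are left out of the rewritten file.
            continue
        if version in versions_with_bad_parents:
            # Each entry is (parents_as_stored, correct_parents).
            parents = versions_with_bad_parents[version][1]
        else:
            # Parents were already consistent; keep what is stored.
            parents = stored_parents[version]
        new_parents[(file_id, version)] = [(file_id, p) for p in parents]
    return new_parents


fixed = corrected_parent_map(
    'file-a',
    all_versions=['rev-1', 'rev-2', 'rev-3', 'rev-x'],
    stored_parents={'rev-1': (), 'rev-2': ('rev-1',), 'rev-3': ('rev-1',)},
    # rev-3 is stored with parent rev-1 but should have parent rev-2.
    versions_with_bad_parents={'rev-3': (('rev-1',), ('rev-2',))},
    unused_versions={'rev-x'},
)
```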


class PackReconciler(RepoReconciler):
    """Reconciler that reconciles a pack based repository.

    Garbage inventories do not affect ancestry queries, and removal is
    considerably more expensive as there is no separate versioned file for
    them, so they are not cleaned. In short it is currently a no-op.

    In future this may be a good place to hook in annotation cache checking,
    index recreation etc.
    """

    # XXX: The index corruption repair that _fix_text_parents performs is
    # needed for packs, but not yet implemented. The basic approach is to:
    #  - lock the names list
    #  - perform a customised pack() that regenerates data as needed
    #  - unlock the names list
    # https://bugs.launchpad.net/bzr/+bug/154173

    def __init__(self, repo, other=None, thorough=False,
            canonicalize_chks=False):
        super(PackReconciler, self).__init__(repo, other=other,
            thorough=thorough)
        self.canonicalize_chks = canonicalize_chks

    def _reconcile_steps(self):
        """Perform the steps to reconcile this repository."""
        if not self.thorough:
            return
        collection = self.repo._pack_collection
        collection.ensure_loaded()
        collection.lock_names()
        self.add_cleanup(collection._unlock_names)
        packs = collection.all_packs()
        all_revisions = self.repo.all_revision_ids()
        total_inventories = len(list(
            collection.inventory_index.combined_index.iter_all_entries()))
        if len(all_revisions):
            if self.canonicalize_chks:
                reconcile_meth = self.repo._canonicalize_chks_pack
            else:
                reconcile_meth = self.repo._reconcile_pack
            new_pack = reconcile_meth(collection, packs, ".reconcile",
                all_revisions, self.pb)
            if new_pack is not None:
                self._discard_and_save(packs)
        else:
            # only make a new pack when there is data to copy.
            self._discard_and_save(packs)
        self.garbage_inventories = total_inventories - len(list(
            collection.inventory_index.combined_index.iter_all_entries()))

    def _discard_and_save(self, packs):
        """Discard some packs from the repository.

        This removes them from the memory index, saves the in-memory index
        which makes the newly reconciled pack visible and hides the packs to be
        discarded, and finally renames the packs being discarded into the
        obsolete packs directory.

        :param packs: The packs to discard.
        """
        for pack in packs:
            self.repo._pack_collection._remove_pack_from_memory(pack)
        self.repo._pack_collection._save_pack_names()
        self.repo._pack_collection._obsolete_packs(packs)
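
# Illustration only, not part of bzrlib: a minimal sketch of the ordering
# described in _discard_and_save, assuming a toy in-memory collection.
# ToyPackCollection and its attributes are hypothetical and do not model the
# real pack collection; only the three-step order is taken from the method.

```python
class ToyPackCollection(object):
    """Hypothetical stand-in that tracks pack names in plain lists."""

    def __init__(self, names):
        self.in_memory = list(names)   # packs the running process knows
        self.saved_names = list(names) # the persisted names list
        self.obsolete = []             # packs moved aside, not deleted

    def remove_pack_from_memory(self, name):
        self.in_memory.remove(name)

    def save_pack_names(self):
        self.saved_names = list(self.in_memory)

    def obsolete_packs(self, names):
        self.obsolete.extend(names)


def discard_and_save(collection, packs):
    # 1. Forget the old packs in memory.
    for name in packs:
        collection.remove_pack_from_memory(name)
    # 2. Persist the names list, which now lists only the new pack.
    collection.save_pack_names()
    # 3. Only then move the old packs out of the way.
    collection.obsolete_packs(packs)


collection = ToyPackCollection(['old-1', 'old-2'])
old_packs = list(collection.in_memory)
collection.in_memory.append('reconciled')  # the newly written pack
discard_and_save(collection, old_packs)
```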