To get this branch, use:
bzr branch http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar
# Copyright (C) 2005, 2006 Canonical Ltd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

# TODO: Check ancestries are correct for every revision: includes
# every revision committed so far, and in a reasonable order.

# TODO: Also check non-mainline revisions mentioned as parents.

# TODO: Check for extra files in the control directory.

# TODO: Check revision, inventory and entry objects have all
# required fields.

# TODO: Get every revision in the revision-store even if they're not
# referenced by history and make sure they're all valid.

# TODO: Perhaps have a way to record errors other than by raising exceptions;
# would perhaps be enough to accumulate exception objects in a list without
# raising them.  If there's more than one exception it'd be good to see them
# all.

from bzrlib import errors, osutils
from bzrlib import repository as _mod_repository
from bzrlib import revision
from bzrlib.branch import Branch
from bzrlib.errors import BzrCheckError
from bzrlib.repository import Repository
import bzrlib.ui
from bzrlib.trace import log_error, note
from bzrlib.workingtree import WorkingTree

class Check(object):
    """Check a repository"""

    # The Check object interacts with InventoryEntry.check, etc.

    def __init__(self, repository):
        self.repository = repository
        self.checked_text_cnt = 0
        self.checked_rev_cnt = 0
        self.ghosts = []
        self.repeated_text_cnt = 0
        self.missing_parent_links = {}
        self.missing_inventory_sha_cnt = 0
        self.missing_revision_cnt = 0
        # maps (file-id, version) -> sha1; used by InventoryFile._check
        self.checked_texts = {}
        self.checked_weaves = {}
        self.unreferenced_versions = set()
        self.inconsistent_parents = []

    def check(self):
        self.repository.lock_read()
        self.progress = bzrlib.ui.ui_factory.nested_progress_bar()
        try:
            self.progress.update('retrieving inventory', 0, 2)
            # do not put in init, as it should be done with progress,
            # and inside the lock.
            self.inventory_weave = self.repository.get_inventory_weave()
            self.progress.update('checking revision graph', 1)
            self.check_revision_graph()
            self.plan_revisions()
            revno = 0
            # check_one_rev may append to planned_revisions, so re-read its
            # length on every pass.
            while revno < len(self.planned_revisions):
                rev_id = self.planned_revisions[revno]
                self.progress.update('checking revision', revno,
                                     len(self.planned_revisions))
                revno += 1
                self.check_one_rev(rev_id)
            # check_weaves is done after the revision scan so that
            # revision index is known to be valid.
            self.check_weaves()
        finally:
            self.progress.finished()
            self.repository.unlock()

    def check_revision_graph(self):
        if not self.repository.revision_graph_can_have_wrong_parents():
            # This check is not necessary.
            self.revs_with_bad_parents_in_index = None
            return
        bad_revisions = self.repository._find_inconsistent_revision_parents()
        self.revs_with_bad_parents_in_index = list(bad_revisions)

    def plan_revisions(self):
        repository = self.repository
        self.planned_revisions = repository.all_revision_ids()
        self.progress.clear()
        inventoried = set(self.inventory_weave.versions())
        awol = set(self.planned_revisions) - inventoried
        if awol:
            raise BzrCheckError('Stored revisions missing from inventory '
                '{%s}' % ','.join(awol))

    def report_results(self, verbose):
        note('checked repository %s format %s',
             self.repository.bzrdir.root_transport,
             self.repository._format)
        note('%6d revisions', self.checked_rev_cnt)
        note('%6d file-ids', len(self.checked_weaves))
        note('%6d unique file texts', self.checked_text_cnt)
        note('%6d repeated file texts', self.repeated_text_cnt)
        note('%6d unreferenced text versions',
             len(self.unreferenced_versions))
        if self.missing_inventory_sha_cnt:
            note('%6d revisions are missing inventory_sha1',
                 self.missing_inventory_sha_cnt)
        if self.missing_revision_cnt:
            note('%6d revisions are mentioned but not present',
                 self.missing_revision_cnt)
        if len(self.ghosts):
            note('%6d ghost revisions', len(self.ghosts))
            if verbose:
                for ghost in self.ghosts:
                    note('      %s', ghost)
        if len(self.missing_parent_links):
            note('%6d revisions missing parents in ancestry',
                 len(self.missing_parent_links))
            if verbose:
                for link, linkers in self.missing_parent_links.items():
                    note('      %s should be in the ancestry for:', link)
                    for linker in linkers:
                        note('       * %s', linker)
            if verbose:
                for file_id, revision_id in self.unreferenced_versions:
                    log_error('unreferenced version: {%s} in %s', revision_id,
                        file_id)
        if len(self.inconsistent_parents):
            note('%6d inconsistent parents', len(self.inconsistent_parents))
            if verbose:
                for info in self.inconsistent_parents:
                    revision_id, file_id, found_parents, correct_parents = info
                    note('      * %s version %s has parents %r '
                         'but should have %r'
                         % (file_id, revision_id, found_parents,
                             correct_parents))
        if self.revs_with_bad_parents_in_index:
            note('%6d revisions have incorrect parents in the revision index',
                 len(self.revs_with_bad_parents_in_index))
            if verbose:
                for item in self.revs_with_bad_parents_in_index:
                    revision_id, index_parents, actual_parents = item
                    note(
                        '       %s has wrong parents in index: '
                        '%r should be %r',
                        revision_id, index_parents, actual_parents)

    def check_one_rev(self, rev_id):
        """Check one revision.

        rev_id - the one to check
        """
        rev = self.repository.get_revision(rev_id)

        if rev.revision_id != rev_id:
            raise BzrCheckError('wrong internal revision id in revision {%s}'
                                % rev_id)

        for parent in rev.parent_ids:
            if parent not in self.planned_revisions:
                missing_links = self.missing_parent_links.get(parent, [])
                missing_links.append(rev_id)
                self.missing_parent_links[parent] = missing_links
                # list based so somewhat slow,
                # TODO have a planned_revisions list and set.
                if self.repository.has_revision(parent):
                    missing_ancestry = self.repository.get_ancestry(parent)
                    for missing in missing_ancestry:
                        if (missing is not None
                            and missing not in self.planned_revisions):
                            self.planned_revisions.append(missing)
                else:
                    self.ghosts.append(rev_id)

        if rev.inventory_sha1:
            inv_sha1 = self.repository.get_inventory_sha1(rev_id)
            if inv_sha1 != rev.inventory_sha1:
                raise BzrCheckError("Inventory sha1 hash doesn't match"
                    ' value in revision {%s}' % rev_id)
        self._check_revision_tree(rev_id)
        self.checked_rev_cnt += 1

    def check_weaves(self):
        """Check all the weaves we can get our hands on.
        """
        n_weaves = 1
        weave_ids = []
        if self.repository.weave_store.listable():
            weave_ids = list(self.repository.weave_store)
            n_weaves = len(weave_ids) + 1
        self.progress.update('checking versionedfile', 0, n_weaves)
        self.inventory_weave.check(progress_bar=self.progress)
        files_in_revisions = {}
        revisions_of_files = {}
        weave_checker = self.repository._get_versioned_file_checker()
        for i, weave_id in enumerate(weave_ids):
            self.progress.update('checking versionedfile', i, n_weaves)
            w = self.repository.weave_store.get_weave(weave_id,
                    self.repository.get_transaction())
            # No progress here, because it looks ugly.
            w.check()
            result = weave_checker.check_file_version_parents(w, weave_id)
            bad_parents, unused_versions = result
            bad_parents = bad_parents.items()
            for revision_id, (weave_parents, correct_parents) in bad_parents:
                self.inconsistent_parents.append(
                    (revision_id, weave_id, weave_parents, correct_parents))
            for revision_id in unused_versions:
                self.unreferenced_versions.add((weave_id, revision_id))
            self.checked_weaves[weave_id] = True

    def _check_revision_tree(self, rev_id):
        tree = self.repository.revision_tree(rev_id)
        inv = tree.inventory
        seen_ids = {}
        for file_id in inv:
            if file_id in seen_ids:
                raise BzrCheckError('duplicated file_id {%s} '
                                    'in inventory for revision {%s}'
                                    % (file_id, rev_id))
            seen_ids[file_id] = True
        for file_id in inv:
            ie = inv[file_id]
            ie.check(self, rev_id, inv, tree)
        seen_names = {}
        for path, ie in inv.iter_entries():
            if path in seen_names:
                raise BzrCheckError('duplicated path %s '
                                    'in inventory for revision {%s}'
                                    % (path, rev_id))
            seen_names[path] = True


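# Illustrative sketch only, not part of bzrlib: one possible way to address
# the "have a planned_revisions list and set" TODO in Check.check_one_rev.
# The class name _OrderedRevisionPlan and its methods are hypothetical.
# A list keeps the order in which revisions are to be checked, while a set
# answers membership tests without scanning the list.  It supports len(),
# indexing and append(), which is all that Check.check and
# Check.check_one_rev use on planned_revisions.
class _OrderedRevisionPlan(object):

    def __init__(self, revision_ids=()):
        self._order = []
        self._seen = set()
        for revision_id in revision_ids:
            self.append(revision_id)

    def append(self, revision_id):
        """Plan revision_id unless already planned; return True if added."""
        if revision_id in self._seen:
            return False
        self._seen.add(revision_id)
        self._order.append(revision_id)
        return True

    def __contains__(self, revision_id):
        return revision_id in self._seen

    def __len__(self):
        return len(self._order)

    def __getitem__(self, index):
        return self._order[index]

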
def check_branch(branch, verbose):
    """Run consistency checks on a branch.

    Results are reported through logging.

    :raise BzrCheckError: if there's a consistency error.
    """
    branch.lock_read()
    try:
        branch_result = branch.check()
    finally:
        branch.unlock()
    branch_result.report_results(verbose)


def _check_working_tree(tree):
    # bit hacky, check the tree parent is accurate
    tree.lock_read()
    try:
        tree_basis = tree.basis_tree()
        tree_basis.lock_read()
        try:
            repo_basis = tree.branch.repository.revision_tree(
                tree.last_revision())
            if len(list(repo_basis._iter_changes(tree_basis))):
                raise errors.BzrCheckError(
                    "Mismatched basis inventory content.")
            tree._validate()
        finally:
            tree_basis.unlock()
    finally:
        tree.unlock()

def _get_elements(path):
    """Return (tree, repo, branch) for path; each may be None if absent.

    :raise NotVersionedError: if opening the working tree or the
        repository fails with NotBranchError.
    """
    try:
        tree = WorkingTree.open(path)
    except (errors.NoWorkingTree, errors.NotLocalUrl):
        tree = None
    except errors.NotBranchError:
        raise errors.NotVersionedError(path)

    try:
        repo = Repository.open(path)
    except errors.NoRepositoryPresent:
        repo = None
    except errors.NotBranchError:
        raise errors.NotVersionedError(path)

    try:
        branch = Branch.open_containing(path)[0]
    except errors.NotBranchError:
        branch = None

    return tree, repo, branch


def check_dwim(path, verbose, do_branch=True, do_repo=True, do_tree=True):
    tree, repo, branch = _get_elements(path)

    if tree is not None:
        note("Checking working tree at '%s'."
             % (tree.bzrdir.root_transport.base,))
        _check_working_tree(tree)

    if branch is not None:
        # We have a branch
        if repo is None:
            # The branch is in a shared repository
            repo = branch.repository
        branches = [branch]
    else:
        if repo is not None:
            branches = repo.find_branches(using=True)

    if repo is not None:
        repo.lock_read()
        try:
            note("Checking repository at '%s'."
                 % (repo.bzrdir.root_transport.base,))
            result = repo.check()
            result.report_results(verbose)
            for branch in branches:
                note("Checking branch at '%s'."
                     % (branch.bzrdir.root_transport.base,))
                check_branch(branch, verbose)
        finally:
            repo.unlock()
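

# Illustrative sketch only, not part of bzrlib: one possible shape for the
# TODO at the top of this module about recording errors other than by
# raising exceptions.  The names _ErrorCollector and run_check are
# hypothetical.  Each check is called through run_check, which stores any
# exception in a list instead of letting it propagate, so that several
# failures can be reported together at the end.
import sys


class _ErrorCollector(object):

    def __init__(self):
        # Exception objects, in the order they were caught.
        self.errors = []

    def run_check(self, check_callable, *args, **kwargs):
        """Call check_callable and record any Exception it raises.

        Returns True if the call completed, False if it raised.
        """
        try:
            check_callable(*args, **kwargs)
        except Exception:
            # sys.exc_info() is used rather than "except ... as e" so the
            # sketch parses on old and new Python versions alike.
            self.errors.append(sys.exc_info()[1])
            return False
        return True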