/brz/remove-bazaar

To get this branch, use:
bzr branch http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar
3246.2.2 by Martin Pool
Fix up import of tsort
1
# Copyright (C) 2005, 2006, 2008 Canonical Ltd
1185.16.113 by mbp at sourcefrog
Add topo_sort utility function
2
#
3
# This program is free software; you can redistribute it and/or modify
4
# it under the terms of the GNU General Public License as published by
5
# the Free Software Foundation; either version 2 of the License, or
6
# (at your option) any later version.
7
#
8
# This program is distributed in the hope that it will be useful,
9
# but WITHOUT ANY WARRANTY; without even the implied warranty of
10
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
11
# GNU General Public License for more details.
12
#
13
# You should have received a copy of the GNU General Public License
14
# along with this program; if not, write to the Free Software
4183.7.1 by Sabin Iacob
update FSF mailing address
15
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1185.16.113 by mbp at sourcefrog
Add topo_sort utility function
16
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
17
18
"""Topological sorting routines."""
19
20
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
21
from bzrlib import errors
3246.2.2 by Martin Pool
Fix up import of tsort
22
import bzrlib.revision as _mod_revision
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
23
1185.16.114 by mbp at sourcefrog
Improved topological sort
24
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
25
__all__ = ["topo_sort", "TopoSorter", "merge_sort", "MergeSorter"]
26
27
1185.16.114 by mbp at sourcefrog
Improved topological sort
28
def topo_sort(graph):
1185.16.113 by mbp at sourcefrog
Add topo_sort utility function
29
    """Topological sort a graph.
30
1185.16.114 by mbp at sourcefrog
Improved topological sort
31
    graph -- sequence of pairs of node->parents_list.
32
33
    The result is a list of node names, such that all parents come before
34
    their children.
35
1185.16.113 by mbp at sourcefrog
Add topo_sort utility function
36
    node identifiers can be any hashable object, and are typically strings.
4577.2.3 by Maarten Bosmans
tsort.topo_sort: new algorithm, improving performance
37
38
    This function has the same purpose as the TopoSorter class, but uses a
39
    different algorithm to sort the graph. That means that while both return a list
40
    with parents before their child nodes, the exact ordering can be different.
41
42
    topo_sort is faster when the whole list is needed, while when iterating over a
43
    part of the list, TopoSorter.iter_topo_order should be used.
1185.16.113 by mbp at sourcefrog
Add topo_sort utility function
44
    """
4577.2.3 by Maarten Bosmans
tsort.topo_sort: new algorithm, improving performance
45
    # store a dict of the graph.
46
    graph = dict(graph)
47
    # this is the stack storing on which the sorted nodes are pushed.
48
    node_name_stack = []
49
50
    # count the number of children for every node in the graph
51
    nchildren = dict.fromkeys(graph.iterkeys(), 0)
52
    for parents in graph.itervalues():
53
        for parent in parents:
54
            if parent in nchildren:
55
                nchildren[parent] += 1
56
    # keep track of nodes without children in a separate list
57
    nochildnodes = [node for (node, n) in nchildren.iteritems() if n == 0]
58
59
    while nochildnodes:
60
        # pick a node without a child and add it to the stack.
61
        node_name = nochildnodes.pop()
62
        node_name_stack.append(node_name)
63
64
        # the parents of the node lose it as a child; if it was the last
65
        # child, add the parent to the list of childless nodes.
66
        parents = graph.pop(node_name)
67
        for parent in parents:
68
            if parent in nchildren:
69
                nchildren[parent] -= 1
70
                if nchildren[parent] == 0:
71
                    nochildnodes.append(parent)
72
73
    # if there are still nodes left in the graph,
74
    # that means that there is a cycle
75
    if graph:
76
        raise errors.GraphCycleError(graph)
77
78
    # the nodes where pushed on the stack child first, so this list needs to be
79
    # reversed before returning it.
80
    node_name_stack.reverse()
81
    return node_name_stack
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
82
83
84
class TopoSorter(object):
85
86
    def __init__(self, graph):
87
        """Topological sorting of a graph.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
88
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
89
        :param graph: sequence of pairs of node_name->parent_names_list.
90
                      i.e. [('C', ['B']), ('B', ['A']), ('A', [])]
91
                      For this input the output from the sort or
92
                      iter_topo_order routines will be:
93
                      'A', 'B', 'C'
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
94
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
95
        node identifiers can be any hashable object, and are typically strings.
96
1587.1.2 by Robert Collins
Review comments for reconcile.
97
        If you have a graph like [('a', ['b']), ('a', ['c'])] this will only use
98
        one of the two values for 'a'.
99
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
100
        The graph is sorted lazily: until you iterate or sort the input is
101
        not processed other than to create an internal representation.
102
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
103
        iteration or sorting may raise GraphCycleError if a cycle is present
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
104
        in the graph.
105
        """
4577.2.2 by Maarten Bosmans
tsort.TopoSorter: performance improvements
106
        # store a dict of the graph.
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
107
        self._graph = dict(graph)
108
109
    def sorted(self):
110
        """Sort the graph and return as a list.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
111
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
112
        After calling this the sorter is empty and you must create a new one.
113
        """
114
        return list(self.iter_topo_order())
115
116
###        Useful if fiddling with this code.
117
###        # cross check
118
###        sorted_names = list(self.iter_topo_order())
119
###        for index in range(len(sorted_names)):
120
###            rev = sorted_names[index]
121
###            for left_index in range(index):
122
###                if rev in self.original_graph[sorted_names[left_index]]:
123
###                    print "revision in parent list of earlier revision"
124
###                    import pdb;pdb.set_trace()
125
126
    def iter_topo_order(self):
1587.1.3 by Robert Collins
Typos for reconcile - docstring in tsort.py was out of sync with code.
127
        """Yield the nodes of the graph in a topological order.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
128
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
129
        After finishing iteration the sorter is empty and you cannot continue
130
        iteration.
131
        """
4577.2.2 by Maarten Bosmans
tsort.TopoSorter: performance improvements
132
        graph = self._graph
133
        visitable = set(graph)
134
135
        # this is a stack storing the depth first search into the graph.
136
        pending_node_stack = []
137
        # at each level of 'recursion' we have to check each parent. This
138
        # stack stores the parents we have not yet checked for the node at the
139
        # matching depth in pending_node_stack
140
        pending_parents_stack = []
141
142
        # this is a set of the completed nodes for fast checking whether a
143
        # parent in a node we are processing on the stack has already been
144
        # emitted and thus can be skipped.
145
        completed_node_names = set()
146
147
        while graph:
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
148
            # now pick a random node in the source graph, and transfer it to the
4577.2.2 by Maarten Bosmans
tsort.TopoSorter: performance improvements
149
            # top of the depth first search stack of pending nodes.
150
            node_name, parents = graph.popitem()
151
            pending_node_stack.append(node_name)
152
            pending_parents_stack.append(list(parents))
153
154
            # loop until pending_node_stack is empty
155
            while pending_node_stack:
156
                parents_to_visit = pending_parents_stack[-1]
157
                # if there are no parents left, the revision is done
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
158
                if not parents_to_visit:
159
                    # append the revision to the topo sorted list
4577.2.2 by Maarten Bosmans
tsort.TopoSorter: performance improvements
160
                    # all the nodes parents have been added to the output,
161
                    # now we can add it to the output.
162
                    popped_node = pending_node_stack.pop()
163
                    pending_parents_stack.pop()
164
                    completed_node_names.add(popped_node)
165
                    yield popped_node
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
166
                else:
4577.2.2 by Maarten Bosmans
tsort.TopoSorter: performance improvements
167
                    # recurse depth first into a single parent
168
                    next_node_name = parents_to_visit.pop()
169
170
                    if next_node_name in completed_node_names:
171
                        # parent was already completed by a child, skip it.
172
                        continue
173
                    if next_node_name not in visitable:
174
                        # parent is not a node in the original graph, skip it.
175
                        continue
176
177
                    # transfer it along with its parents from the source graph
178
                    # into the top of the current depth first search stack.
179
                    try:
180
                        parents = graph.pop(next_node_name)
181
                    except KeyError:
182
                        # if the next node is not in the source graph it has
183
                        # already been popped from it and placed into the
184
                        # current search stack (but not completed or we would
185
                        # have hit the continue 6 lines up).  this indicates a
186
                        # cycle.
187
                        raise errors.GraphCycleError(pending_node_stack)
188
                    pending_node_stack.append(next_node_name)
189
                    pending_parents_stack.append(list(parents))
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
190
191
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
192
def merge_sort(graph, branch_tip, mainline_revisions=None, generate_revno=False):
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
193
    """Topological sort a graph which groups merges.
194
195
    :param graph: sequence of pairs of node->parents_list.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
196
    :param branch_tip: the tip of the branch to graph. Revisions not
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
197
                       reachable from branch_tip are not included in the
198
                       output.
1624.1.3 by Robert Collins
Convert log to use the new tsort.merge_sort routine.
199
    :param mainline_revisions: If not None this forces a mainline to be
200
                               used rather than synthesised from the graph.
201
                               This must be a valid path through some part
202
                               of the graph. If the mainline does not cover all
203
                               the revisions, output stops at the start of the
204
                               old revision listed in the mainline revisions
205
                               list.
206
                               The order for this parameter is oldest-first.
1988.4.4 by Robert Collins
Tidy up the patch.
207
    :param generate_revno: Optional parameter controlling the generation of
208
        revision number sequences in the output. See the output description of
209
        the MergeSorter docstring for details.
210
    :result: See the MergeSorter docstring for details.
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
211
    node identifiers can be any hashable object, and are typically strings.
212
    """
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
213
    return MergeSorter(graph, branch_tip, mainline_revisions,
214
        generate_revno).sorted()
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
215
216
217
class MergeSorter(object):
218
2425.4.1 by John Arbash Meinel
Use __slots__ for MergeSorter
219
    __slots__ = ['_node_name_stack',
220
                 '_node_merge_depth_stack',
221
                 '_pending_parents_stack',
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
222
                 '_first_child_stack',
2425.4.1 by John Arbash Meinel
Use __slots__ for MergeSorter
223
                 '_left_subtree_pushed_stack',
224
                 '_generate_revno',
225
                 '_graph',
226
                 '_mainline_revisions',
227
                 '_stop_revision',
228
                 '_original_graph',
229
                 '_revnos',
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
230
                 '_revno_to_branch_count',
2425.4.1 by John Arbash Meinel
Use __slots__ for MergeSorter
231
                 '_completed_node_names',
232
                 '_scheduled_nodes',
233
                ]
234
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
235
    def __init__(self, graph, branch_tip, mainline_revisions=None,
236
        generate_revno=False):
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
237
        """Merge-aware topological sorting of a graph.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
238
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
239
        :param graph: sequence of pairs of node_name->parent_names_list.
240
                      i.e. [('C', ['B']), ('B', ['A']), ('A', [])]
241
                      For this input the output from the sort or
242
                      iter_topo_order routines will be:
243
                      'A', 'B', 'C'
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
244
        :param branch_tip: the tip of the branch to graph. Revisions not
1624.1.3 by Robert Collins
Convert log to use the new tsort.merge_sort routine.
245
                       reachable from branch_tip are not included in the
246
                       output.
247
        :param mainline_revisions: If not None this forces a mainline to be
248
                               used rather than synthesised from the graph.
249
                               This must be a valid path through some part
250
                               of the graph. If the mainline does not cover all
251
                               the revisions, output stops at the start of the
252
                               old revision listed in the mainline revisions
253
                               list.
254
                               The order for this parameter is oldest-first.
1988.4.4 by Robert Collins
Tidy up the patch.
255
        :param generate_revno: Optional parameter controlling the generation of
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
256
            revision number sequences in the output. See the output description
257
            for more details.
258
1988.4.4 by Robert Collins
Tidy up the patch.
259
        The result is a list sorted so that all parents come before
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
260
        their children. Each element of the list is a tuple containing:
261
        (sequence_number, node_name, merge_depth, end_of_merge)
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
262
         * sequence_number: The sequence of this row in the output. Useful for
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
263
           GUIs.
1988.4.4 by Robert Collins
Tidy up the patch.
264
         * node_name: The node name: opaque text to the merge routine.
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
265
         * merge_depth: How many levels of merging deep this node has been
266
           found.
267
         * revno_sequence: When requested this field provides a sequence of
268
             revision numbers for all revisions. The format is:
3170.3.7 by John Arbash Meinel
update MergeSorter.__init__() docstring
269
             (REVNO, BRANCHNUM, BRANCHREVNO). BRANCHNUM is the number of the
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
270
             branch that the revno is on. From left to right the REVNO numbers
271
             are the sequence numbers within that branch of the revision.
272
             For instance, the graph {A:[], B:['A'], C:['A', 'B']} will get
273
             the following revno_sequences assigned: A:(1,), B:(1,1,1), C:(2,).
274
             This should be read as 'A is the first commit in the trunk',
275
             'B is the first commit on the first branch made from A', 'C is the
276
             second commit in the trunk'.
277
         * end_of_merge: When True the next node is part of a different merge.
1624.1.3 by Robert Collins
Convert log to use the new tsort.merge_sort routine.
278
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
279
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
280
        node identifiers can be any hashable object, and are typically strings.
281
282
        If you have a graph like [('a', ['b']), ('a', ['c'])] this will only use
283
        one of the two values for 'a'.
284
285
        The graph is sorted lazily: until you iterate or sort the input is
286
        not processed other than to create an internal representation.
287
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
288
        iteration or sorting may raise GraphCycleError if a cycle is present
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
289
        in the graph.
290
291
        Background information on the design:
292
        -------------------------------------
293
        definition: the end of any cluster or 'merge' occurs when:
294
            1 - the next revision has a lower merge depth than we do.
295
              i.e.
296
              A 0
297
              B  1
298
              C   2
299
              D  1
300
              E 0
301
              C, D are the ends of clusters, E might be but we need more data.
302
            2 - or the next revision at our merge depth is not our left most
303
              ancestor.
304
              This is required to handle multiple-merges in one commit.
305
              i.e.
306
              A 0    [F, B, E]
307
              B  1   [D, C]
308
              C   2  [D]
309
              D  1   [F]
310
              E  1   [F]
311
              F 0
312
              C is the end of a cluster due to rule 1.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
313
              D is not the end of a cluster from rule 1, but is from rule 2: E
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
314
                is not its left most ancestor
315
              E is the end of a cluster due to rule 1
316
              F might be but we need more data.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
317
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
318
        we show connecting lines to a parent when:
319
         - The parent is the start of a merge within this cluster.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
320
           That is, the merge was not done to the mainline before this cluster
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
321
           was merged to the mainline.
322
           This can be detected thus:
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
323
            * The parent has a higher merge depth and is the next revision in
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
324
              the list.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
325
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
326
          The next revision in the list constraint is needed for this case:
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
327
          A 0   [D, B]
328
          B  1  [C, F]   # we do not want to show a line to F which is depth 2
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
329
                           but not a merge
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
330
          C  1  [H]      # note that this is a long line to show back to the
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
331
                           ancestor - see the end of merge rules.
332
          D 0   [G, E]
333
          E  1  [G, F]
334
          F   2 [G]
335
          G  1  [H]
336
          H 0
337
         - Part of this merges 'branch':
338
          The parent has the same merge depth and is our left most parent and we
339
           are not the end of the cluster.
340
          A 0   [C, B] lines: [B, C]
341
          B  1  [E, C] lines: [C]
342
          C 0   [D]    lines: [D]
343
          D 0   [F, E] lines: [E, F]
344
          E  1  [F]    lines: [F]
345
          F 0
346
         - The end of this merge/cluster:
347
          we can ONLY have multiple parents at the end of a cluster if this
348
          branch was previously merged into the 'mainline'.
349
          - if we have one and only one parent, show it
350
            Note that this may be to a greater merge depth - for instance if
351
            this branch continued from a deeply nested branch to add something
352
            to it.
353
          - if we have more than one parent - show the second oldest (older ==
354
            further down the list) parent with
355
            an equal or lower merge depth
356
             XXXX revisit when awake. ddaa asks about the relevance of each one
357
             - maybe more than one parent is relevant
358
        """
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
359
        self._generate_revno = generate_revno
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
360
        # a dict of the graph.
361
        self._graph = dict(graph)
1624.1.3 by Robert Collins
Convert log to use the new tsort.merge_sort routine.
362
        # if there is an explicit mainline, alter the graph to match. This is
363
        # easier than checking at every merge whether we are on the mainline and
364
        # if so which path to take.
365
        if mainline_revisions is None:
366
            self._mainline_revisions = []
367
            self._stop_revision = None
368
        else:
369
            self._mainline_revisions = list(mainline_revisions)
370
            self._stop_revision = self._mainline_revisions[0]
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
371
        # skip the first revision, its what we reach and its parents are
1624.1.3 by Robert Collins
Convert log to use the new tsort.merge_sort routine.
372
        # therefore irrelevant
373
        for index, revision in enumerate(self._mainline_revisions[1:]):
374
            # NB: index 0 means self._mainline_revisions[1]
375
            # if the mainline matches the graph, nothing to do.
376
            parent = self._mainline_revisions[index]
377
            if parent is None:
378
                # end of mainline_revisions history
379
                continue
3533.2.1 by John Arbash Meinel
(bug #243536) tsort.merge_sorted() should work even with a ghost in mainline.
380
            graph_parent_ids = self._graph[revision]
381
            if not graph_parent_ids:
382
                # We ran into a ghost, skip over it, this is a workaround for
383
                # bug #243536, the _graph has had ghosts stripped, but the
384
                # mainline_revisions have not
385
                continue
386
            if graph_parent_ids[0] == parent:
1624.1.3 by Robert Collins
Convert log to use the new tsort.merge_sort routine.
387
                continue
388
            # remove it from its prior spot
389
            self._graph[revision].remove(parent)
390
            # insert it into the start of the mainline
391
            self._graph[revision].insert(0, parent)
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
392
        # we need to do a check late in the process to detect end-of-merges
393
        # which requires the parents to be accessible: its easier for now
394
        # to just keep the original graph around.
1624.1.3 by Robert Collins
Convert log to use the new tsort.merge_sort routine.
395
        self._original_graph = dict(self._graph.items())
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
396
        # we need to know the revision numbers of revisions to determine
397
        # the revision numbers of their descendants
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
398
        # this is a graph from node to [revno_tuple, first_child]
399
        # where first_child is True if no other children have seen this node
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
400
        # and revno_tuple is the tuple that was assigned to the node.
401
        # we dont know revnos to start with, so we start it seeded with
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
402
        # [None, True]
403
        self._revnos = dict((revision, [None, True])
404
                            for revision in self._graph)
405
        # Each mainline revision counts how many child branches have spawned from it.
406
        self._revno_to_branch_count = {}
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
407
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
408
        # this is a stack storing the depth first search into the graph.
409
        self._node_name_stack = []
410
        # at each level of recursion we need the merge depth this node is at:
411
        self._node_merge_depth_stack = []
412
        # at each level of 'recursion' we have to check each parent. This
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
413
        # stack stores the parents we have not yet checked for the node at the
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
414
        # matching depth in _node_name_stack
415
        self._pending_parents_stack = []
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
416
        # When we first look at a node we assign it a seqence number from its
417
        # leftmost parent.
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
418
        self._first_child_stack = []
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
419
        # this is a set of the nodes who have been completely analysed for fast
420
        # membership checking
421
        self._completed_node_names = set()
422
        # this is the scheduling of nodes list.
423
        # Nodes are scheduled
424
        # from the bottom left of the tree: in the tree
425
        # A 0  [D, B]
426
        # B  1 [C]
427
        # C  1 [D]
428
        # D 0  [F, E]
429
        # E  1 [F]
430
        # F 0
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
431
        # the scheduling order is: F, E, D, C, B, A
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
432
        # that is - 'left subtree, right subtree, node'
433
        # which would mean that when we schedule A we can emit the entire tree.
434
        self._scheduled_nodes = []
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
435
        # This records for each node when we have processed its left most
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
436
        # unmerged subtree. After this subtree is scheduled, all other subtrees
437
        # have their merge depth increased by one from this nodes merge depth.
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
438
        # it contains tuples - name, merge_depth
439
        self._left_subtree_pushed_stack = []
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
440
441
        # seed the search with the tip of the branch
3236.2.1 by Michael Hudson
test and fix
442
        if (branch_tip is not None and
3246.2.1 by Martin Pool
Merge tsort NULL_REVISION patch from mwhudson, with John's import fix
443
            branch_tip != _mod_revision.NULL_REVISION):
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
444
            parents = self._graph.pop(branch_tip)
445
            self._push_node(branch_tip, 0, parents)
446
447
    def sorted(self):
448
        """Sort the graph and return as a list.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
449
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
450
        After calling this the sorter is empty and you must create a new one.
451
        """
452
        return list(self.iter_topo_order())
453
454
    def iter_topo_order(self):
455
        """Yield the nodes of the graph in a topological order.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
456
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
457
        After finishing iteration the sorter is empty and you cannot continue
458
        iteration.
459
        """
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
460
        # These are safe to offload to local variables, because they are used
461
        # as a stack and modified in place, never assigned to.
462
        node_name_stack = self._node_name_stack
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
463
        node_merge_depth_stack = self._node_merge_depth_stack
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
464
        pending_parents_stack = self._pending_parents_stack
465
        left_subtree_pushed_stack = self._left_subtree_pushed_stack
466
        completed_node_names = self._completed_node_names
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
467
        scheduled_nodes = self._scheduled_nodes
468
469
        graph_pop = self._graph.pop
470
471
        def push_node(node_name, merge_depth, parents,
472
                      node_name_stack_append=node_name_stack.append,
473
                      node_merge_depth_stack_append=node_merge_depth_stack.append,
474
                      left_subtree_pushed_stack_append=left_subtree_pushed_stack.append,
475
                      pending_parents_stack_append=pending_parents_stack.append,
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
476
                      first_child_stack_append=self._first_child_stack.append,
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
477
                      revnos=self._revnos,
478
                      ):
479
            """Add node_name to the pending node stack.
480
481
            Names in this stack will get emitted into the output as they are popped
482
            off the stack.
483
484
            This inlines a lot of self._variable.append functions as local
485
            variables.
486
            """
487
            node_name_stack_append(node_name)
488
            node_merge_depth_stack_append(merge_depth)
489
            left_subtree_pushed_stack_append(False)
490
            pending_parents_stack_append(list(parents))
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
491
            # as we push it, check if it is the first child
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
492
            if parents:
493
                # node has parents, assign from the left most parent.
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
494
                parent_info = revnos[parents[0]]
495
                first_child = parent_info[1]
496
                parent_info[1] = False
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
497
            else:
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
498
                # We don't use the same algorithm here, but we need to keep the
499
                # stack in line
500
                first_child = None
501
            first_child_stack_append(first_child)
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
502
503
        def pop_node(node_name_stack_pop=node_name_stack.pop,
504
                     node_merge_depth_stack_pop=node_merge_depth_stack.pop,
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
505
                     first_child_stack_pop=self._first_child_stack.pop,
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
506
                     left_subtree_pushed_stack_pop=left_subtree_pushed_stack.pop,
507
                     pending_parents_stack_pop=pending_parents_stack.pop,
508
                     original_graph=self._original_graph,
509
                     revnos=self._revnos,
510
                     completed_node_names_add=self._completed_node_names.add,
511
                     scheduled_nodes_append=scheduled_nodes.append,
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
512
                     revno_to_branch_count=self._revno_to_branch_count,
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
513
                    ):
514
            """Pop the top node off the stack
515
516
            The node is appended to the sorted output.
517
            """
518
            # we are returning from the flattened call frame:
519
            # pop off the local variables
520
            node_name = node_name_stack_pop()
521
            merge_depth = node_merge_depth_stack_pop()
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
522
            first_child = first_child_stack_pop()
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
523
            # remove this node from the pending lists:
524
            left_subtree_pushed_stack_pop()
525
            pending_parents_stack_pop()
526
527
            parents = original_graph[node_name]
528
            if parents:
529
                # node has parents, assign from the left most parent.
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
530
                parent_revno = revnos[parents[0]][0]
531
                if not first_child:
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
532
                    # not the first child, make a new branch
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
533
                    base_revno = parent_revno[0]
534
                    branch_count = revno_to_branch_count.get(base_revno, 0)
535
                    branch_count += 1
536
                    revno_to_branch_count[base_revno] = branch_count
537
                    revno = (parent_revno[0], branch_count, 1)
538
                    # revno = (parent_revno[0], branch_count, parent_revno[-1]+1)
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
539
                else:
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
540
                    # as the first child, we just increase the final revision
541
                    # number
542
                    revno = parent_revno[:-1] + (parent_revno[-1] + 1,)
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
543
            else:
544
                # no parents, use the root sequence
3613.1.1 by John Arbash Meinel
Fix the merge_sort code so that it properly increments.
545
                root_count = revno_to_branch_count.get(0, -1)
546
                root_count += 1
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
547
                if root_count:
548
                    revno = (0, root_count, 1)
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
549
                else:
550
                    revno = (1,)
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
551
                revno_to_branch_count[0] = root_count
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
552
553
            # store the revno for this node for future reference
554
            revnos[node_name][0] = revno
555
            completed_node_names_add(node_name)
556
            scheduled_nodes_append((node_name, merge_depth, revno))
557
            return node_name
558
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
559
560
        while node_name_stack:
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
561
            # loop until this call completes.
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
562
            parents_to_visit = pending_parents_stack[-1]
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
563
            # if all parents are done, the revision is done
564
            if not parents_to_visit:
565
                # append the revision to the topo sorted scheduled list:
566
                # all the nodes parents have been scheduled added, now
567
                # we can add it to the output.
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
568
                pop_node()
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
569
            else:
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
570
                while pending_parents_stack[-1]:
571
                    if not left_subtree_pushed_stack[-1]:
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
572
                        # recurse depth first into the primary parent
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
573
                        next_node_name = pending_parents_stack[-1].pop(0)
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
574
                    else:
575
                        # place any merges in right-to-left order for scheduling
576
                        # which gives us left-to-right order after we reverse
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
577
                        # the scheduled queue. XXX: This has the effect of
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
578
                        # allocating common-new revisions to the right-most
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
579
                        # subtree rather than the left most, which will
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
580
                        # display nicely (you get smaller trees at the top
581
                        # of the combined merge).
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
582
                        next_node_name = pending_parents_stack[-1].pop()
583
                    if next_node_name in completed_node_names:
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
584
                        # this parent was completed by a child on the
585
                        # call stack. skip it.
586
                        continue
587
                    # otherwise transfer it from the source graph into the
588
                    # top of the current depth first search stack.
589
                    try:
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
590
                        parents = graph_pop(next_node_name)
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
591
                    except KeyError:
592
                        # if the next node is not in the source graph it has
593
                        # already been popped from it and placed into the
594
                        # current search stack (but not completed or we would
595
                        # have hit the continue 4 lines up.
596
                        # this indicates a cycle.
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
597
                        raise errors.GraphCycleError(node_name_stack)
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
598
                    next_merge_depth = 0
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
599
                    if left_subtree_pushed_stack[-1]:
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
600
                        # a new child branch from name_stack[-1]
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
601
                        next_merge_depth = 1
602
                    else:
603
                        next_merge_depth = 0
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
604
                        left_subtree_pushed_stack[-1] = True
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
605
                    next_merge_depth = (
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
606
                        node_merge_depth_stack[-1] + next_merge_depth)
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
607
                    push_node(
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
608
                        next_node_name,
609
                        next_merge_depth,
610
                        parents)
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
611
                    # and do not continue processing parents until this 'call'
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
612
                    # has recursed.
613
                    break
2425.4.3 by John Arbash Meinel
Inline self._pop_node and self._push_node
614
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
615
        # We have scheduled the graph. Now deliver the ordered output:
616
        sequence_number = 0
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
617
        stop_revision = self._stop_revision
618
        generate_revno = self._generate_revno
619
        original_graph = self._original_graph
620
621
        while scheduled_nodes:
622
            node_name, merge_depth, revno = scheduled_nodes.pop()
623
            if node_name == stop_revision:
1624.1.3 by Robert Collins
Convert log to use the new tsort.merge_sort routine.
624
                return
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
625
            if not len(scheduled_nodes):
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
626
                # last revision is the end of a merge
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
627
                end_of_merge = True
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
628
            elif scheduled_nodes[-1][1] < merge_depth:
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
629
                # the next node is to our left
630
                end_of_merge = True
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
631
            elif (scheduled_nodes[-1][1] == merge_depth and
632
                  (scheduled_nodes[-1][0] not in
633
                   original_graph[node_name])):
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
634
                # the next node was part of a multiple-merge.
635
                end_of_merge = True
636
            else:
637
                end_of_merge = False
2425.4.2 by John Arbash Meinel
Change valid self._foo variables into local variables.
638
            if generate_revno:
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
639
                yield (sequence_number, node_name, merge_depth, revno, end_of_merge)
640
            else:
641
                yield (sequence_number, node_name, merge_depth, end_of_merge)
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
642
            sequence_number += 1
643
644
    def _push_node(self, node_name, merge_depth, parents):
645
        """Add node_name to the pending node stack.
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
646
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
647
        Names in this stack will get emitted into the output as they are popped
648
        off the stack.
649
        """
650
        self._node_name_stack.append(node_name)
651
        self._node_merge_depth_stack.append(merge_depth)
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
652
        self._left_subtree_pushed_stack.append(False)
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
653
        self._pending_parents_stack.append(list(parents))
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
654
        # as we push it, figure out if this is the first child
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
655
        parents = self._original_graph[node_name]
656
        if parents:
657
            # node has parents, assign from the left most parent.
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
658
            parent_info = self._revnos[parents[0]]
659
            first_child = parent_info[1]
660
            parent_info[1] = False
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
661
        else:
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
662
            # We don't use the same algorithm here, but we need to keep the
663
            # stack in line
664
            first_child = None
665
        self._first_child_stack.append(first_child)
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
666
667
    def _pop_node(self):
3943.8.1 by Marius Kruger
remove all trailing whitespace from bzr source
668
        """Pop the top node off the stack
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
669
670
        The node is appended to the sorted output.
671
        """
672
        # we are returning from the flattened call frame:
673
        # pop off the local variables
674
        node_name = self._node_name_stack.pop()
675
        merge_depth = self._node_merge_depth_stack.pop()
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
676
        first_child = self._first_child_stack.pop()
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
677
        # remove this node from the pending lists:
678
        self._left_subtree_pushed_stack.pop()
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
679
        self._pending_parents_stack.pop()
680
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
681
        parents = self._original_graph[node_name]
682
        if parents:
683
            # node has parents, assign from the left most parent.
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
684
            parent_revno = self._revnos[parents[0]][0]
685
            if not first_child:
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
686
                # not the first child, make a new branch
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
687
                base_revno = parent_revno[0]
688
                branch_count = self._revno_to_branch_count.get(base_revno, 0)
689
                branch_count += 1
690
                self._revno_to_branch_count[base_revno] = branch_count
691
                revno = (parent_revno[0], branch_count, 1)
692
                # revno = (parent_revno[0], branch_count, parent_revno[-1]+1)
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
693
            else:
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
694
                # as the first child, we just increase the final revision
695
                # number
696
                revno = parent_revno[:-1] + (parent_revno[-1] + 1,)
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
697
        else:
698
            # no parents, use the root sequence
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
699
            root_count = self._revno_to_branch_count.get(0, 0)
700
            if root_count:
701
                revno = (0, root_count, 1)
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
702
            else:
703
                revno = (1,)
3170.3.2 by John Arbash Meinel
Implement the new dotted revision numbering.
704
            root_count += 1
705
            self._revno_to_branch_count[0] = root_count
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
706
707
        # store the revno for this node for future reference
708
        self._revnos[node_name][0] = revno
1624.1.2 by Robert Collins
Add MergeSort facility to bzrlib.tsort.
709
        self._completed_node_names.add(node_name)
1988.4.1 by Robert Collins
bzrlib.tsort.merge_sorted now accepts 'generate_revnos'. This parameter
710
        self._scheduled_nodes.append((node_name, merge_depth, self._revnos[node_name][0]))
1570.1.7 by Robert Collins
Replace the slow topo_sort routine with a much faster one for non trivial datasets.
711
        return node_name