/brz/remove-bazaar

To get this branch, use:
bzr branch http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar
0.1.1 by Martin Pool
Check in old existing knit code.
1
#! /usr/bin/python
2
3
# Copyright (C) 2005 Canonical Ltd
4
0.1.33 by Martin Pool
add gpl text
5
# This program is free software; you can redistribute it and/or modify
6
# it under the terms of the GNU General Public License as published by
7
# the Free Software Foundation; either version 2 of the License, or
8
# (at your option) any later version.
9
10
# This program is distributed in the hope that it will be useful,
11
# but WITHOUT ANY WARRANTY; without even the implied warranty of
12
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13
# GNU General Public License for more details.
14
15
# You should have received a copy of the GNU General Public License
16
# along with this program; if not, write to the Free Software
17
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
0.1.1 by Martin Pool
Check in old existing knit code.
18
19
# Author: Martin Pool <mbp@canonical.com>
20
21
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
22
"""Weave - storage of related text file versions"""
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
23
0.1.61 by Martin Pool
doc
24
# TODO: Perhaps have copy method for Weave instances?
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
25
0.1.58 by Martin Pool
doc
26
# XXX: If we do weaves this way, will a merge still behave the same
27
# way if it's done in a different order?  That's a pretty desirable
28
# property.
29
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
30
# TODO: Nothing here so far assumes the lines are really \n newlines,
31
# rather than being split up in some other way.  We could accomodate
32
# binaries, perhaps by naively splitting on \n or perhaps using
33
# something like a rolling checksum.
34
35
# TODO: Track version names as well as indexes. 
36
0.1.85 by Martin Pool
doc
37
# TODO: End marker for each version so we can stop reading?
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
38
39
# TODO: Check that no insertion occurs inside a deletion that was
40
# active in the version of the insertion.
41
912 by Martin Pool
- update todos for weave
42
# TODO: In addition to the SHA-1 check, perhaps have some code that
43
# checks structural constraints of the weave: ie that insertions are
44
# properly nested, that there is no text outside of an insertion, that
45
# insertions or deletions are not repeated, etc.
0.1.85 by Martin Pool
doc
46
918 by Martin Pool
- start doing new weave-merge algorithm
47
# TODO: Make the info command just show info, not extract everything:
48
# it can be much faster.
49
50
# TODO: Perhaps use long integers as sets instead of set objects; may
51
# be faster.
52
53
# TODO: Parallel-extract that passes back each line along with a
54
# description of which revisions include it.  Nice for checking all
55
# shas in parallel.
56
57
0.1.85 by Martin Pool
doc
58
926 by Martin Pool
- update more weave code to use intsets
59
from bzrlib.intset import IntSet
0.1.66 by Martin Pool
Cope without set/frozenset classes
60
61
924 by Martin Pool
- Add IntSet class
62
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
63
class WeaveError(Exception):
64
    """Exception in processing weave"""
65
66
67
class WeaveFormatError(WeaveError):
68
    """Weave invariant violated"""
69
    
70
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
71
class Weave(object):
72
    """weave - versioned text file storage.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
73
    
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
74
    A Weave manages versions of line-based text files, keeping track
75
    of the originating version for each line.
76
77
    To clients the "lines" of the file are represented as a list of strings.
78
    These strings  will typically have terminal newline characters, but
79
    this is not required.  In particular files commonly do not have a newline
80
    at the end of the file.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
81
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
82
    Texts can be identified in either of two ways:
83
84
    * a nonnegative index number.
85
86
    * a version-id string.
87
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
88
    Typically the index number will be valid only inside this weave and
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
89
    the version-id is used to reference it in the larger world.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
90
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
91
    The weave is represented as a list mixing edit instructions and
92
    literal text.  Each entry in _l can be either a string (or
93
    unicode), or a tuple.  If a string, it means that the given line
94
    should be output in the currently active revisions.
95
96
    If a tuple, it gives a processing instruction saying in which
97
    revisions the enclosed lines are active.  The tuple has the form
98
    (instruction, version).
99
100
    The instruction can be '{' or '}' for an insertion block, and '['
101
    and ']' for a deletion block respectively.  The version is the
0.1.45 by Martin Pool
doc
102
    integer version index.  There is no replace operator, only deletes
103
    and inserts.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
104
0.1.41 by Martin Pool
Doc
105
    Constraints/notes:
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
106
107
    * A later version can delete lines that were introduced by any
108
      number of ancestor versions; this implies that deletion
109
      instructions can span insertion blocks without regard to the
110
      insertion block's nesting.
111
0.1.41 by Martin Pool
Doc
112
    * Similarly, deletions need not be properly nested with regard to
113
      each other, because they might have been generated by
114
      independent revisions.
115
0.1.45 by Martin Pool
doc
116
    * Insertions are always made by inserting a new bracketed block
117
      into a single point in the previous weave.  This implies they
118
      can nest but not overlap, and the nesting must always have later
119
      insertions on the inside.
120
0.1.41 by Martin Pool
Doc
121
    * It doesn't seem very useful to have an active insertion
122
      inside an inactive insertion, but it might happen.
0.1.45 by Martin Pool
doc
123
      
0.1.41 by Martin Pool
Doc
124
    * Therefore, all instructions are always"considered"; that
125
      is passed onto and off the stack.  An outer inactive block
126
      doesn't disable an inner block.
127
128
    * Lines are enabled if the most recent enclosing insertion is
129
      active and none of the enclosing deletions are active.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
130
0.1.49 by Martin Pool
Add another constraint: revisions should not delete text that they
131
    * There is no point having a deletion directly inside its own
132
      insertion; you might as well just not write it.  And there
133
      should be no way to get an earlier version deleting a later
134
      version.
135
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
136
    _l
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
137
        Text of the weave.
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
138
139
    _v
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
140
        List of parents, indexed by version number.
141
        It is only necessary to store the minimal set of parents for
142
        each version; the parent's parents are implied.
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
143
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
144
    _sha1s
145
        List of hex SHA-1 of each version, or None if not recorded.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
146
    """
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
147
    def __init__(self):
148
        self._l = []
149
        self._v = []
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
150
        self._sha1s = []
0.1.60 by Martin Pool
Weave eq and ne methods
151
152
153
    def __eq__(self, other):
154
        if not isinstance(other, Weave):
155
            return False
156
        return self._v == other._v \
157
               and self._l == other._l
158
    
159
160
    def __ne__(self, other):
161
        return not self.__eq__(other)
162
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
163
        
0.1.26 by Martin Pool
Refactor parameters to add command
164
    def add(self, parents, text):
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
165
        """Add a single text on top of the weave.
0.1.36 by Martin Pool
doc
166
  
0.1.26 by Martin Pool
Refactor parameters to add command
167
        Returns the index number of the newly added version.
168
169
        parents
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
170
            List or set of direct parent version numbers.
171
            
0.1.26 by Martin Pool
Refactor parameters to add command
172
        text
173
            Sequence of lines to be added in the new version."""
0.1.82 by Martin Pool
Small weave optimizations
174
        ## self._check_versions(parents)
175
        ## self._check_lines(text)
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
176
        idx = len(self._v)
0.1.5 by Martin Pool
Add test for storing two text versions.
177
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
178
        import sha
179
        s = sha.new()
180
        for l in text:
181
            s.update(l)
182
        sha1 = s.hexdigest()
183
        del s
184
918 by Martin Pool
- start doing new weave-merge algorithm
185
        # TODO: It'd probably be faster to append things on to a new
186
        # list rather than modifying the existing one, which is likely
187
        # to cause a lot of copying.
188
0.1.26 by Martin Pool
Refactor parameters to add command
189
        if parents:
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
190
            ancestors = self.inclusions(parents)
191
            delta = self._delta(ancestors, text)
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
192
0.1.31 by Martin Pool
Fix insertion of multiple regions, calculating the right line offset as we go.
193
            # offset gives the number of lines that have been inserted
194
            # into the weave up to the current point; if the original edit instruction
195
            # says to change line A then we actually change (A+offset)
196
            offset = 0
197
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
198
            for i1, i2, newlines in delta:
0.1.29 by Martin Pool
Better internal error
199
                assert 0 <= i1
200
                assert i1 <= i2
201
                assert i2 <= len(self._l)
0.1.56 by Martin Pool
Handle deletion of lines by marking the region with a deletion
202
203
                # the deletion and insertion are handled separately.
204
                # first delete the region.
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
205
                if i1 != i2:
0.1.56 by Martin Pool
Handle deletion of lines by marking the region with a deletion
206
                    self._l.insert(i1+offset, ('[', idx))
207
                    self._l.insert(i2+offset+1, (']', idx))
208
                    offset += 2
209
                    # is this OK???
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
210
0.1.56 by Martin Pool
Handle deletion of lines by marking the region with a deletion
211
                if newlines:
0.1.57 by Martin Pool
Fix bug in an update edit that both deletes and inserts -- previously
212
                    # there may have been a deletion spanning up to
213
                    # i2; we want to insert after this region to make sure
214
                    # we don't destroy ourselves
215
                    i = i2 + offset
0.1.56 by Martin Pool
Handle deletion of lines by marking the region with a deletion
216
                    self._l[i:i] = [('{', idx)] \
217
                                   + newlines \
218
                                   + [('}', idx)]
219
                    offset += 2 + len(newlines)
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
220
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
221
            self._addversion(parents)
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
222
        else:
0.1.26 by Martin Pool
Refactor parameters to add command
223
            # special case; adding with no parents revision; can do this
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
224
            # more quickly by just appending unconditionally
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
225
            self._l.append(('{', idx))
226
            self._l += text
227
            self._l.append(('}', idx))
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
228
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
229
            self._addversion(None)
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
230
231
        self._sha1s.append(sha1)
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
232
            
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
233
        return idx
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
234
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
235
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
236
    def inclusions(self, versions):
893 by Martin Pool
- Refactor weave calculation of inclusions
237
        """Return set of all ancestors of given version(s)."""
924 by Martin Pool
- Add IntSet class
238
        i = IntSet(versions)
893 by Martin Pool
- Refactor weave calculation of inclusions
239
        v = max(versions)
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
240
        try:
893 by Martin Pool
- Refactor weave calculation of inclusions
241
            while v >= 0:
242
                if v in i:
243
                    # include all its parents
244
                    i.update(self._v[v])
245
                v -= 1
246
            return i
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
247
        except IndexError:
248
            raise ValueError("version %d not present in weave" % v)
0.1.77 by Martin Pool
New Weave.get_included() does transitive expansion
249
250
890 by Martin Pool
- weave info should show minimal expression of parents
251
    def minimal_parents(self, version):
252
        """Find the minimal set of parents for the version."""
253
        included = self._v[version]
254
        if not included:
255
            return []
256
        
257
        li = list(included)
893 by Martin Pool
- Refactor weave calculation of inclusions
258
        li.sort(reverse=True)
890 by Martin Pool
- weave info should show minimal expression of parents
259
260
        mininc = []
924 by Martin Pool
- Add IntSet class
261
        gotit = IntSet()
890 by Martin Pool
- weave info should show minimal expression of parents
262
263
        for pv in li:
264
            if pv not in gotit:
265
                mininc.append(pv)
893 by Martin Pool
- Refactor weave calculation of inclusions
266
                gotit.update(self.inclusions(pv))
890 by Martin Pool
- weave info should show minimal expression of parents
267
268
        assert mininc[0] >= 0
269
        assert mininc[-1] < version
270
        return mininc
271
272
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
273
    def _addversion(self, parents):
274
        if parents:
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
275
            self._v.append(parents)
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
276
        else:
926 by Martin Pool
- update more weave code to use intsets
277
            self._v.append(IntSet())
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
278
279
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
280
    def _check_lines(self, text):
281
        if not isinstance(text, list):
282
            raise ValueError("text should be a list, not %s" % type(text))
283
284
        for l in text:
285
            if not isinstance(l, basestring):
869 by Martin Pool
- more weave.py command line options
286
                raise ValueError("text line should be a string or unicode, not %s"
287
                                 % type(l))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
288
        
289
290
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
291
    def _check_versions(self, indexes):
292
        """Check everything in the sequence of indexes is valid"""
293
        for i in indexes:
294
            try:
295
                self._v[i]
296
            except IndexError:
297
                raise IndexError("invalid version number %r" % i)
298
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
299
    
0.1.7 by Martin Pool
Add trivial annotate text
300
    def annotate(self, index):
301
        return list(self.annotate_iter(index))
302
303
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
304
    def annotate_iter(self, version):
0.1.7 by Martin Pool
Add trivial annotate text
305
        """Yield list of (index-id, line) pairs for the specified version.
306
307
        The index indicates when the line originated in the weave."""
893 by Martin Pool
- Refactor weave calculation of inclusions
308
        for origin, lineno, text in self._extract([version]):
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
309
            yield origin, text
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
310
311
918 by Martin Pool
- start doing new weave-merge algorithm
312
    def _walk(self):
313
        """Walk the weave.
314
315
        Yields sequence of
316
        (lineno, insert, deletes, text)
317
        for each literal line.
318
        """
319
        
320
        istack = []
926 by Martin Pool
- update more weave code to use intsets
321
        dset = IntSet()
918 by Martin Pool
- start doing new weave-merge algorithm
322
323
        lineno = 0         # line of weave, 0-based
324
325
        for l in self._l:
326
            if isinstance(l, tuple):
327
                c, v = l
328
                isactive = None
329
                if c == '{':
330
                    istack.append(v)
331
                elif c == '}':
332
                    oldv = istack.pop()
333
                elif c == '[':
926 by Martin Pool
- update more weave code to use intsets
334
                    assert v not in dset
335
                    dset.add(v)
918 by Martin Pool
- start doing new weave-merge algorithm
336
                elif c == ']':
926 by Martin Pool
- update more weave code to use intsets
337
                    dset.remove(v)
918 by Martin Pool
- start doing new weave-merge algorithm
338
                else:
339
                    raise WeaveFormatError('unexpected instruction %r'
340
                                           % v)
341
            else:
342
                assert isinstance(l, basestring)
343
                assert istack
344
                yield lineno, istack[-1], dset, l
345
            lineno += 1
346
347
348
893 by Martin Pool
- Refactor weave calculation of inclusions
349
    def _extract(self, versions):
0.1.20 by Martin Pool
Factor out Knit.extract() method
350
        """Yield annotation of lines in included set.
351
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
352
        Yields a sequence of tuples (origin, lineno, text), where
353
        origin is the origin version, lineno the index in the weave,
354
        and text the text of the line.
355
0.1.20 by Martin Pool
Factor out Knit.extract() method
356
        The set typically but not necessarily corresponds to a version.
357
        """
893 by Martin Pool
- Refactor weave calculation of inclusions
358
        included = self.inclusions(versions)
881 by Martin Pool
- faster weave extraction
359
360
        istack = []
926 by Martin Pool
- update more weave code to use intsets
361
        dset = IntSet()
0.1.48 by Martin Pool
Basic parsing of delete instructions.
362
363
        lineno = 0         # line of weave, 0-based
891 by Martin Pool
- fix up refactoring of weave
364
894 by Martin Pool
- small optimization for weave extract
365
        isactive = None
0.1.85 by Martin Pool
doc
366
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
367
        WFE = WeaveFormatError
0.1.95 by Martin Pool
- preliminary merge conflict detection
368
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
369
        for l in self._l:
370
            if isinstance(l, tuple):
371
                c, v = l
894 by Martin Pool
- small optimization for weave extract
372
                isactive = None
891 by Martin Pool
- fix up refactoring of weave
373
                if c == '{':
374
                    assert v not in istack
375
                    istack.append(v)
376
                elif c == '}':
377
                    oldv = istack.pop()
378
                    assert oldv == v
379
                elif c == '[':
380
                    if v in included:
881 by Martin Pool
- faster weave extraction
381
                        assert v not in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
382
                        dset.add(v)
891 by Martin Pool
- fix up refactoring of weave
383
                else:
384
                    assert c == ']'
385
                    if v in included:
881 by Martin Pool
- faster weave extraction
386
                        assert v in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
387
                        dset.remove(v)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
388
            else:
389
                assert isinstance(l, basestring)
894 by Martin Pool
- small optimization for weave extract
390
                if isactive is None:
391
                    isactive = (not dset) and istack and (istack[-1] in included)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
392
                if isactive:
888 by Martin Pool
- fix refactoring breakage
393
                    yield istack[-1], lineno, l
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
394
            lineno += 1
0.1.7 by Martin Pool
Add trivial annotate text
395
0.1.46 by Martin Pool
More constraints on structure of weave, and checks that they work
396
        if istack:
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
397
            raise WFE("unclosed insertion blocks at end of weave",
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
398
                                   istack)
0.1.48 by Martin Pool
Basic parsing of delete instructions.
399
        if dset:
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
400
            raise WFE("unclosed deletion blocks at end of weave",
0.1.48 by Martin Pool
Basic parsing of delete instructions.
401
                                   dset)
0.1.40 by Martin Pool
Add test for extracting from weave with nested insertions
402
0.1.7 by Martin Pool
Add trivial annotate text
403
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
404
    def get_iter(self, version):
0.1.5 by Martin Pool
Add test for storing two text versions.
405
        """Yield lines for the specified version."""
893 by Martin Pool
- Refactor weave calculation of inclusions
406
        for origin, lineno, line in self._extract([version]):
0.1.8 by Martin Pool
Unify get/annotate code
407
            yield line
0.1.5 by Martin Pool
Add test for storing two text versions.
408
409
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
410
    def get(self, index):
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
411
        return list(self.get_iter(index))
0.1.1 by Martin Pool
Check in old existing knit code.
412
413
0.1.95 by Martin Pool
- preliminary merge conflict detection
414
    def mash_iter(self, included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
415
        """Return composed version of multiple included versions."""
893 by Martin Pool
- Refactor weave calculation of inclusions
416
        for origin, lineno, text in self._extract(included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
417
            yield text
418
419
0.1.11 by Martin Pool
Add Knit.dump method
420
    def dump(self, to_file):
421
        from pprint import pprint
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
422
        print >>to_file, "Weave._l = ",
0.1.11 by Martin Pool
Add Knit.dump method
423
        pprint(self._l, to_file)
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
424
        print >>to_file, "Weave._v = ",
0.1.18 by Martin Pool
Better Knit.dump method
425
        pprint(self._v, to_file)
0.1.11 by Martin Pool
Add Knit.dump method
426
427
0.1.91 by Martin Pool
Update Weave.check
428
429
    def numversions(self):
430
        l = len(self._v)
431
        assert l == len(self._sha1s)
432
        return l
433
434
894 by Martin Pool
- small optimization for weave extract
435
    def check(self, progress_bar=None):
0.1.91 by Martin Pool
Update Weave.check
436
        # check no circular inclusions
437
        for version in range(self.numversions()):
438
            inclusions = list(self._v[version])
439
            if inclusions:
440
                inclusions.sort()
441
                if inclusions[-1] >= version:
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
442
                    raise WeaveFormatError("invalid included version %d for index %d"
0.1.91 by Martin Pool
Update Weave.check
443
                                           % (inclusions[-1], version))
444
445
        # try extracting all versions; this is a bit slow and parallel
446
        # extraction could be used
447
        import sha
894 by Martin Pool
- small optimization for weave extract
448
        nv = self.numversions()
449
        for version in range(nv):
450
            if progress_bar:
451
                progress_bar.update('checking text', version, nv)
0.1.91 by Martin Pool
Update Weave.check
452
            s = sha.new()
453
            for l in self.get_iter(version):
454
                s.update(l)
455
            hd = s.hexdigest()
456
            expected = self._sha1s[version]
457
            if hd != expected:
458
                raise WeaveError("mismatched sha1 for version %d; "
459
                                 "got %s, expected %s"
460
                                 % (version, hd, expected))
0.1.18 by Martin Pool
Better Knit.dump method
461
881 by Martin Pool
- faster weave extraction
462
        # TODO: check insertions are properly nested, that there are
463
        # no lines outside of insertion blocks, that deletions are
464
        # properly paired, etc.
465
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
466
467
0.1.95 by Martin Pool
- preliminary merge conflict detection
468
    def merge(self, merge_versions):
469
        """Automerge and mark conflicts between versions.
470
471
        This returns a sequence, each entry describing alternatives
472
        for a chunk of the file.  Each of the alternatives is given as
473
        a list of lines.
474
475
        If there is a chunk of the file where there's no diagreement,
476
        only one alternative is given.
477
        """
478
479
        # approach: find the included versions common to all the
480
        # merged versions
481
        raise NotImplementedError()
482
483
484
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
485
    def _delta(self, included, lines):
486
        """Return changes from basis to new revision.
487
488
        The old text for comparison is the union of included revisions.
489
490
        This is used in inserting a new text.
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
491
0.1.55 by Martin Pool
doc
492
        Delta is returned as a sequence of
493
        (weave1, weave2, newlines).
494
495
        This indicates that weave1:weave2 of the old weave should be
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
496
        replaced by the sequence of lines in newlines.  Note that
497
        these line numbers are positions in the total weave and don't
498
        correspond to the lines in any extracted version, or even the
499
        extracted union of included versions.
500
501
        If line1=line2, this is a pure insert; if newlines=[] this is a
502
        pure delete.  (Similar to difflib.)
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
503
        """
0.1.54 by Martin Pool
Fix weave line calculation when making deltas
504
        # basis a list of (origin, lineno, line)
0.1.84 by Martin Pool
Refactor Weave._delta to calculate less unused information
505
        basis_lineno = []
0.1.83 by Martin Pool
Better delta basis calculation
506
        basis_lines = []
893 by Martin Pool
- Refactor weave calculation of inclusions
507
        for origin, lineno, line in self._extract(included):
0.1.84 by Martin Pool
Refactor Weave._delta to calculate less unused information
508
            basis_lineno.append(lineno)
509
            basis_lines.append(line)
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
510
511
        # add a sentinal, because we can also match against the final line
0.1.84 by Martin Pool
Refactor Weave._delta to calculate less unused information
512
        basis_lineno.append(len(self._l))
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
513
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
514
        # XXX: which line of the weave should we really consider
515
        # matches the end of the file?  the current code says it's the
516
        # last line of the weave?
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
517
518
        from difflib import SequenceMatcher
519
        s = SequenceMatcher(None, basis_lines, lines)
520
0.1.55 by Martin Pool
doc
521
        # TODO: Perhaps return line numbers from composed weave as well?
522
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
523
        for tag, i1, i2, j1, j2 in s.get_opcodes():
0.1.23 by Martin Pool
tidy up
524
            ##print tag, i1, i2, j1, j2
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
525
526
            if tag == 'equal':
527
                continue
528
529
            # i1,i2 are given in offsets within basis_lines; we need to map them
530
            # back to offsets within the entire weave
0.1.84 by Martin Pool
Refactor Weave._delta to calculate less unused information
531
            real_i1 = basis_lineno[i1]
532
            real_i2 = basis_lineno[i2]
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
533
0.1.35 by Martin Pool
Clean up Knit._delta method
534
            assert 0 <= j1
535
            assert j1 <= j2
536
            assert j2 <= len(lines)
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
537
0.1.35 by Martin Pool
Clean up Knit._delta method
538
            yield real_i1, real_i2, lines[j1:j2]
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
539
0.1.1 by Martin Pool
Check in old existing knit code.
540
918 by Martin Pool
- start doing new weave-merge algorithm
541
            
542
    def plan_merge(self, ver_a, ver_b):
543
        """Return pseudo-annotation indicating how the two versions merge.
544
545
        This is computed between versions a and b and their common
546
        base.
547
548
        Weave lines present in none of them are skipped entirely.
549
        """
926 by Martin Pool
- update more weave code to use intsets
550
        inc_a = self.inclusions([ver_a])
551
        inc_b = self.inclusions([ver_b])
918 by Martin Pool
- start doing new weave-merge algorithm
552
        inc_c = inc_a & inc_b
553
554
        for lineno, insert, deleteset, line in self._walk():
555
            if deleteset & inc_c:
556
                # killed in parent; can't be in either a or b
557
                # not relevant to our work
558
                yield 'killed-base', line
926 by Martin Pool
- update more weave code to use intsets
559
            elif insert in inc_c:
918 by Martin Pool
- start doing new weave-merge algorithm
560
                # was inserted in base
561
                killed_a = bool(deleteset & inc_a)
562
                killed_b = bool(deleteset & inc_b)
563
                if killed_a and killed_b:
564
                    yield 'killed-both', line
565
                elif killed_a:
566
                    yield 'killed-a', line
567
                elif killed_b:
568
                    yield 'killed-b', line
569
                else:
570
                    yield 'unchanged', line
926 by Martin Pool
- update more weave code to use intsets
571
            elif insert in inc_a:
918 by Martin Pool
- start doing new weave-merge algorithm
572
                if deleteset & inc_a:
573
                    yield 'ghost-a', line
574
                else:
575
                    # new in A; not in B
576
                    yield 'new-a', line
926 by Martin Pool
- update more weave code to use intsets
577
            elif insert in inc_b:
918 by Martin Pool
- start doing new weave-merge algorithm
578
                if deleteset & inc_b:
579
                    yield 'ghost-b', line
580
                else:
581
                    yield 'new-b', line
582
            else:
583
                # not in either revision
584
                yield 'irrelevant', line
585
919 by Martin Pool
- more development of weave-merge
586
        yield 'unchanged', ''           # terminator
587
588
589
590
    def weave_merge(self, plan):
591
        lines_a = []
592
        lines_b = []
593
        ch_a = ch_b = False
594
595
        for state, line in plan:
596
            if state == 'unchanged' or state == 'killed-both':
597
                # resync and flush queued conflicts changes if any
598
                if not lines_a and not lines_b:
599
                    pass
600
                elif ch_a and not ch_b:
601
                    # one-sided change:                    
602
                    for l in lines_a: yield l
603
                elif ch_b and not ch_a:
604
                    for l in lines_b: yield l
605
                elif lines_a == lines_b:
606
                    for l in lines_a: yield l
607
                else:
608
                    yield '<<<<\n'
609
                    for l in lines_a: yield l
610
                    yield '====\n'
611
                    for l in lines_b: yield l
612
                    yield '>>>>\n'
613
614
                del lines_a[:]
615
                del lines_b[:]
616
                ch_a = ch_b = False
617
                
618
            if state == 'unchanged':
619
                if line:
620
                    yield line
621
            elif state == 'killed-a':
622
                ch_a = True
623
                lines_b.append(line)
624
            elif state == 'killed-b':
625
                ch_b = True
626
                lines_a.append(line)
627
            elif state == 'new-a':
628
                ch_a = True
629
                lines_a.append(line)
630
            elif state == 'new-b':
631
                ch_b = True
632
                lines_b.append(line)
633
            else:
920 by Martin Pool
- add more test cases for weave_merge
634
                assert state in ('irrelevant', 'ghost-a', 'ghost-b', 'killed-base',
635
                                 'killed-both'), \
919 by Martin Pool
- more development of weave-merge
636
                       state
637
638
                
639
640
918 by Martin Pool
- start doing new weave-merge algorithm
641
642
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
643
0.1.88 by Martin Pool
Add weave info command.
644
def weave_info(filename, out):
645
    """Show some text information about the weave."""
646
    from weavefile import read_weave
647
    wf = file(filename, 'rb')
648
    w = read_weave(wf)
649
    # FIXME: doesn't work on pipes
650
    weave_size = wf.tell()
651
    print >>out, "weave file size %d bytes" % weave_size
652
    print >>out, "weave contains %d versions" % len(w._v)
653
654
    total = 0
870 by Martin Pool
- better weave info display
655
    print '%6s %6s %8s %40s %20s' % ('ver', 'lines', 'bytes', 'sha1', 'parents')
656
    for i in (6, 6, 8, 40, 20):
657
        print '-' * i,
658
    print
0.1.88 by Martin Pool
Add weave info command.
659
    for i in range(len(w._v)):
660
        text = w.get(i)
661
        lines = len(text)
662
        bytes = sum((len(a) for a in text))
0.1.91 by Martin Pool
Update Weave.check
663
        sha1 = w._sha1s[i]
870 by Martin Pool
- better weave info display
664
        print '%6d %6d %8d %40s' % (i, lines, bytes, sha1),
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
665
        for pv in w._v[i]:
666
            print pv,
667
        print
0.1.88 by Martin Pool
Add weave info command.
668
        total += bytes
669
670
    print >>out, "versions total %d bytes" % total
671
    print >>out, "compression ratio %.3f" % (float(total)/float(weave_size))
869 by Martin Pool
- more weave.py command line options
672
673
674
def usage():
871 by Martin Pool
- add command for merge-based weave
675
    print """bzr weave tool
676
677
Experimental tool for weave algorithm.
678
869 by Martin Pool
- more weave.py command line options
679
usage:
680
    weave init WEAVEFILE
681
        Create an empty weave file
682
    weave get WEAVEFILE VERSION
683
        Write out specified version.
684
    weave check WEAVEFILE
685
        Check consistency of all versions.
686
    weave info WEAVEFILE
687
        Display table of contents.
688
    weave add WEAVEFILE [BASE...] < NEWTEXT
689
        Add NEWTEXT, with specified parent versions.
690
    weave annotate WEAVEFILE VERSION
691
        Display origin of each line.
692
    weave mash WEAVEFILE VERSION...
693
        Display composite of all selected versions.
694
    weave merge WEAVEFILE VERSION1 VERSION2 > OUT
695
        Auto-merge two versions and display conflicts.
871 by Martin Pool
- add command for merge-based weave
696
697
example:
698
699
    % weave init foo.weave
700
    % vi foo.txt
701
    % weave add foo.weave < foo.txt
702
    added version 0
703
704
    (create updated version)
705
    % vi foo.txt
706
    % weave get foo.weave 0 | diff -u - foo.txt
707
    % weave add foo.weave 0 < foo.txt
708
    added version 1
709
710
    % weave get foo.weave 0 > foo.txt       (create forked version)
711
    % vi foo.txt
712
    % weave add foo.weave 0 < foo.txt
713
    added version 2
714
715
    % weave merge foo.weave 1 2 > foo.txt   (merge them)
716
    % vi foo.txt                            (resolve conflicts)
717
    % weave add foo.weave 1 2 < foo.txt     (commit merged version)     
718
    
869 by Martin Pool
- more weave.py command line options
719
"""
0.1.88 by Martin Pool
Add weave info command.
720
    
721
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
722
723
def main(argv):
724
    import sys
725
    import os
869 by Martin Pool
- more weave.py command line options
726
    from weavefile import write_weave, read_weave
894 by Martin Pool
- small optimization for weave extract
727
    from bzrlib.progress import ProgressBar
728
729
    #import psyco
730
    #psyco.full()
731
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
732
    cmd = argv[1]
869 by Martin Pool
- more weave.py command line options
733
734
    def readit():
735
        return read_weave(file(argv[2], 'rb'))
736
    
737
    if cmd == 'help':
738
        usage()
739
    elif cmd == 'add':
740
        w = readit()
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
741
        # at the moment, based on everything in the file
869 by Martin Pool
- more weave.py command line options
742
        parents = map(int, argv[3:])
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
743
        lines = sys.stdin.readlines()
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
744
        ver = w.add(parents, lines)
869 by Martin Pool
- more weave.py command line options
745
        write_weave(w, file(argv[2], 'wb'))
746
        print 'added version %d' % ver
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
747
    elif cmd == 'init':
748
        fn = argv[2]
749
        if os.path.exists(fn):
750
            raise IOError("file exists")
751
        w = Weave()
869 by Martin Pool
- more weave.py command line options
752
        write_weave(w, file(fn, 'wb'))
753
    elif cmd == 'get': # get one version
754
        w = readit()
0.1.94 by Martin Pool
Fix get_iter call
755
        sys.stdout.writelines(w.get_iter(int(argv[3])))
869 by Martin Pool
- more weave.py command line options
756
        
757
    elif cmd == 'mash': # get composite
758
        w = readit()
759
        sys.stdout.writelines(w.mash_iter(map(int, argv[3:])))
760
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
761
    elif cmd == 'annotate':
869 by Martin Pool
- more weave.py command line options
762
        w = readit()
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
763
        # newline is added to all lines regardless; too hard to get
764
        # reasonable formatting otherwise
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
765
        lasto = None
766
        for origin, text in w.annotate(int(argv[3])):
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
767
            text = text.rstrip('\r\n')
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
768
            if origin == lasto:
769
                print '      | %s' % (text)
770
            else:
771
                print '%5d | %s' % (origin, text)
772
                lasto = origin
871 by Martin Pool
- add command for merge-based weave
773
                
0.1.88 by Martin Pool
Add weave info command.
774
    elif cmd == 'info':
775
        weave_info(argv[2], sys.stdout)
871 by Martin Pool
- add command for merge-based weave
776
        
0.1.91 by Martin Pool
Update Weave.check
777
    elif cmd == 'check':
869 by Martin Pool
- more weave.py command line options
778
        w = readit()
894 by Martin Pool
- small optimization for weave extract
779
        pb = ProgressBar()
780
        w.check(pb)
781
        pb.clear()
871 by Martin Pool
- add command for merge-based weave
782
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
783
    elif cmd == 'inclusions':
784
        w = readit()
785
        print ' '.join(map(str, w.inclusions([int(argv[3])])))
786
787
    elif cmd == 'parents':
788
        w = readit()
789
        print ' '.join(map(str, w._v[int(argv[3])]))
790
918 by Martin Pool
- start doing new weave-merge algorithm
791
    elif cmd == 'plan-merge':
792
        w = readit()
793
        for state, line in w.plan_merge(int(argv[3]), int(argv[4])):
919 by Martin Pool
- more development of weave-merge
794
            if line:
795
                print '%14s | %s' % (state, line),
918 by Martin Pool
- start doing new weave-merge algorithm
796
871 by Martin Pool
- add command for merge-based weave
797
    elif cmd == 'merge':
919 by Martin Pool
- more development of weave-merge
798
        w = readit()
799
        p = w.plan_merge(int(argv[3]), int(argv[4]))
800
        sys.stdout.writelines(w.weave_merge(p))
801
            
802
    elif cmd == 'mash-merge':
871 by Martin Pool
- add command for merge-based weave
803
        if len(argv) != 5:
804
            usage()
805
            return 1
806
807
        w = readit()
808
        v1, v2 = map(int, argv[3:5])
809
810
        basis = w.inclusions([v1]).intersection(w.inclusions([v2]))
811
812
        base_lines = list(w.mash_iter(basis))
813
        a_lines = list(w.get(v1))
814
        b_lines = list(w.get(v2))
815
816
        from bzrlib.merge3 import Merge3
817
        m3 = Merge3(base_lines, a_lines, b_lines)
818
819
        name_a = 'version %d' % v1
820
        name_b = 'version %d' % v2
821
        sys.stdout.writelines(m3.merge_lines(name_a=name_a, name_b=name_b))
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
822
    else:
823
        raise ValueError('unknown command %r' % cmd)
824
    
825
826
if __name__ == '__main__':
827
    import sys
828
    sys.exit(main(sys.argv))