/brz/remove-bazaar

To get this branch, use:
bzr branch http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar
0.1.1 by Martin Pool
Check in old existing knit code.
1
#! /usr/bin/python
2
3
# Copyright (C) 2005 Canonical Ltd
4
0.1.33 by Martin Pool
add gpl text
5
# This program is free software; you can redistribute it and/or modify
6
# it under the terms of the GNU General Public License as published by
7
# the Free Software Foundation; either version 2 of the License, or
8
# (at your option) any later version.
9
10
# This program is distributed in the hope that it will be useful,
11
# but WITHOUT ANY WARRANTY; without even the implied warranty of
12
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13
# GNU General Public License for more details.
14
15
# You should have received a copy of the GNU General Public License
16
# along with this program; if not, write to the Free Software
17
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
0.1.1 by Martin Pool
Check in old existing knit code.
18
19
# Author: Martin Pool <mbp@canonical.com>
20
21
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
22
"""Weave - storage of related text file versions"""
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
23
0.1.61 by Martin Pool
doc
24
# TODO: Perhaps have copy method for Weave instances?
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
25
0.1.58 by Martin Pool
doc
26
# XXX: If we do weaves this way, will a merge still behave the same
27
# way if it's done in a different order?  That's a pretty desirable
28
# property.
29
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
30
# TODO: Nothing here so far assumes the lines are really \n newlines,
31
# rather than being split up in some other way.  We could accomodate
32
# binaries, perhaps by naively splitting on \n or perhaps using
33
# something like a rolling checksum.
34
35
# TODO: Track version names as well as indexes. 
36
0.1.85 by Martin Pool
doc
37
# TODO: End marker for each version so we can stop reading?
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
38
39
# TODO: Check that no insertion occurs inside a deletion that was
40
# active in the version of the insertion.
41
912 by Martin Pool
- update todos for weave
42
# TODO: In addition to the SHA-1 check, perhaps have some code that
43
# checks structural constraints of the weave: ie that insertions are
44
# properly nested, that there is no text outside of an insertion, that
45
# insertions or deletions are not repeated, etc.
0.1.85 by Martin Pool
doc
46
918 by Martin Pool
- start doing new weave-merge algorithm
47
# TODO: Make the info command just show info, not extract everything:
48
# it can be much faster.
49
50
# TODO: Perhaps use long integers as sets instead of set objects; may
51
# be faster.
52
53
# TODO: Parallel-extract that passes back each line along with a
54
# description of which revisions include it.  Nice for checking all
55
# shas in parallel.
56
57
0.1.85 by Martin Pool
doc
58
0.1.34 by Martin Pool
remove dead code
59
0.1.66 by Martin Pool
Cope without set/frozenset classes
60
try:
61
    set
62
    frozenset
63
except NameError:
64
    from sets import Set, ImmutableSet
65
    set = Set
66
    frozenset = ImmutableSet
0.1.67 by Martin Pool
More fixes to try to run on python2.3
67
    del Set, ImmutableSet
0.1.66 by Martin Pool
Cope without set/frozenset classes
68
69
924 by Martin Pool
- Add IntSet class
70
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
71
class WeaveError(Exception):
72
    """Exception in processing weave"""
73
74
75
class WeaveFormatError(WeaveError):
76
    """Weave invariant violated"""
77
    
78
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
79
class Weave(object):
80
    """weave - versioned text file storage.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
81
    
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
82
    A Weave manages versions of line-based text files, keeping track
83
    of the originating version for each line.
84
85
    To clients the "lines" of the file are represented as a list of strings.
86
    These strings  will typically have terminal newline characters, but
87
    this is not required.  In particular files commonly do not have a newline
88
    at the end of the file.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
89
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
90
    Texts can be identified in either of two ways:
91
92
    * a nonnegative index number.
93
94
    * a version-id string.
95
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
96
    Typically the index number will be valid only inside this weave and
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
97
    the version-id is used to reference it in the larger world.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
98
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
99
    The weave is represented as a list mixing edit instructions and
100
    literal text.  Each entry in _l can be either a string (or
101
    unicode), or a tuple.  If a string, it means that the given line
102
    should be output in the currently active revisions.
103
104
    If a tuple, it gives a processing instruction saying in which
105
    revisions the enclosed lines are active.  The tuple has the form
106
    (instruction, version).
107
108
    The instruction can be '{' or '}' for an insertion block, and '['
109
    and ']' for a deletion block respectively.  The version is the
0.1.45 by Martin Pool
doc
110
    integer version index.  There is no replace operator, only deletes
111
    and inserts.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
112
0.1.41 by Martin Pool
Doc
113
    Constraints/notes:
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
114
115
    * A later version can delete lines that were introduced by any
116
      number of ancestor versions; this implies that deletion
117
      instructions can span insertion blocks without regard to the
118
      insertion block's nesting.
119
0.1.41 by Martin Pool
Doc
120
    * Similarly, deletions need not be properly nested with regard to
121
      each other, because they might have been generated by
122
      independent revisions.
123
0.1.45 by Martin Pool
doc
124
    * Insertions are always made by inserting a new bracketed block
125
      into a single point in the previous weave.  This implies they
126
      can nest but not overlap, and the nesting must always have later
127
      insertions on the inside.
128
0.1.41 by Martin Pool
Doc
129
    * It doesn't seem very useful to have an active insertion
130
      inside an inactive insertion, but it might happen.
0.1.45 by Martin Pool
doc
131
      
0.1.41 by Martin Pool
Doc
132
    * Therefore, all instructions are always"considered"; that
133
      is passed onto and off the stack.  An outer inactive block
134
      doesn't disable an inner block.
135
136
    * Lines are enabled if the most recent enclosing insertion is
137
      active and none of the enclosing deletions are active.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
138
0.1.49 by Martin Pool
Add another constraint: revisions should not delete text that they
139
    * There is no point having a deletion directly inside its own
140
      insertion; you might as well just not write it.  And there
141
      should be no way to get an earlier version deleting a later
142
      version.
143
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
144
    _l
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
145
        Text of the weave.
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
146
147
    _v
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
148
        List of parents, indexed by version number.
149
        It is only necessary to store the minimal set of parents for
150
        each version; the parent's parents are implied.
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
151
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
152
    _sha1s
153
        List of hex SHA-1 of each version, or None if not recorded.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
154
    """
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
155
    def __init__(self):
156
        self._l = []
157
        self._v = []
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
158
        self._sha1s = []
0.1.60 by Martin Pool
Weave eq and ne methods
159
160
161
    def __eq__(self, other):
162
        if not isinstance(other, Weave):
163
            return False
164
        return self._v == other._v \
165
               and self._l == other._l
166
    
167
168
    def __ne__(self, other):
169
        return not self.__eq__(other)
170
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
171
        
0.1.26 by Martin Pool
Refactor parameters to add command
172
    def add(self, parents, text):
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
173
        """Add a single text on top of the weave.
0.1.36 by Martin Pool
doc
174
  
0.1.26 by Martin Pool
Refactor parameters to add command
175
        Returns the index number of the newly added version.
176
177
        parents
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
178
            List or set of direct parent version numbers.
179
            
0.1.26 by Martin Pool
Refactor parameters to add command
180
        text
181
            Sequence of lines to be added in the new version."""
0.1.82 by Martin Pool
Small weave optimizations
182
        ## self._check_versions(parents)
183
        ## self._check_lines(text)
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
184
        idx = len(self._v)
0.1.5 by Martin Pool
Add test for storing two text versions.
185
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
186
        import sha
187
        s = sha.new()
188
        for l in text:
189
            s.update(l)
190
        sha1 = s.hexdigest()
191
        del s
192
918 by Martin Pool
- start doing new weave-merge algorithm
193
        # TODO: It'd probably be faster to append things on to a new
194
        # list rather than modifying the existing one, which is likely
195
        # to cause a lot of copying.
196
0.1.26 by Martin Pool
Refactor parameters to add command
197
        if parents:
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
198
            ancestors = self.inclusions(parents)
199
            delta = self._delta(ancestors, text)
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
200
0.1.31 by Martin Pool
Fix insertion of multiple regions, calculating the right line offset as we go.
201
            # offset gives the number of lines that have been inserted
202
            # into the weave up to the current point; if the original edit instruction
203
            # says to change line A then we actually change (A+offset)
204
            offset = 0
205
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
206
            for i1, i2, newlines in delta:
0.1.29 by Martin Pool
Better internal error
207
                assert 0 <= i1
208
                assert i1 <= i2
209
                assert i2 <= len(self._l)
0.1.56 by Martin Pool
Handle deletion of lines by marking the region with a deletion
210
211
                # the deletion and insertion are handled separately.
212
                # first delete the region.
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
213
                if i1 != i2:
0.1.56 by Martin Pool
Handle deletion of lines by marking the region with a deletion
214
                    self._l.insert(i1+offset, ('[', idx))
215
                    self._l.insert(i2+offset+1, (']', idx))
216
                    offset += 2
217
                    # is this OK???
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
218
0.1.56 by Martin Pool
Handle deletion of lines by marking the region with a deletion
219
                if newlines:
0.1.57 by Martin Pool
Fix bug in an update edit that both deletes and inserts -- previously
220
                    # there may have been a deletion spanning up to
221
                    # i2; we want to insert after this region to make sure
222
                    # we don't destroy ourselves
223
                    i = i2 + offset
0.1.56 by Martin Pool
Handle deletion of lines by marking the region with a deletion
224
                    self._l[i:i] = [('{', idx)] \
225
                                   + newlines \
226
                                   + [('}', idx)]
227
                    offset += 2 + len(newlines)
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
228
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
229
            self._addversion(parents)
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
230
        else:
0.1.26 by Martin Pool
Refactor parameters to add command
231
            # special case; adding with no parents revision; can do this
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
232
            # more quickly by just appending unconditionally
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
233
            self._l.append(('{', idx))
234
            self._l += text
235
            self._l.append(('}', idx))
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
236
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
237
            self._addversion(None)
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
238
239
        self._sha1s.append(sha1)
0.1.25 by Martin Pool
Handle insertion of new weave layers that insert text on top of the basis
240
            
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
241
        return idx
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
242
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
243
918 by Martin Pool
- start doing new weave-merge algorithm
244
    def inclusions_bitset(self, versions):
245
        i = 0
246
        for v in versions:
247
            i |= (1L << v)
248
        v = max(versions)
249
        while v >= 0:
250
            if i & (1L << v):
251
                # if v is included, include all its parents
252
                for pv in self._v[v]:
253
                    i |= (1L << pv)
254
            v -= 1
255
        return i
256
257
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
258
    def inclusions(self, versions):
893 by Martin Pool
- Refactor weave calculation of inclusions
259
        """Return set of all ancestors of given version(s)."""
924 by Martin Pool
- Add IntSet class
260
        from bzrlib.intset import IntSet
261
        
262
        i = IntSet(versions)
893 by Martin Pool
- Refactor weave calculation of inclusions
263
        v = max(versions)
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
264
        try:
893 by Martin Pool
- Refactor weave calculation of inclusions
265
            while v >= 0:
266
                if v in i:
267
                    # include all its parents
268
                    i.update(self._v[v])
269
                v -= 1
270
            return i
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
271
        except IndexError:
272
            raise ValueError("version %d not present in weave" % v)
0.1.77 by Martin Pool
New Weave.get_included() does transitive expansion
273
274
890 by Martin Pool
- weave info should show minimal expression of parents
275
    def minimal_parents(self, version):
276
        """Find the minimal set of parents for the version."""
924 by Martin Pool
- Add IntSet class
277
        from bzrlib.intset import IntSet
278
        
890 by Martin Pool
- weave info should show minimal expression of parents
279
        included = self._v[version]
280
        if not included:
281
            return []
282
        
283
        li = list(included)
893 by Martin Pool
- Refactor weave calculation of inclusions
284
        li.sort(reverse=True)
890 by Martin Pool
- weave info should show minimal expression of parents
285
286
        mininc = []
924 by Martin Pool
- Add IntSet class
287
        gotit = IntSet()
890 by Martin Pool
- weave info should show minimal expression of parents
288
289
        for pv in li:
290
            if pv not in gotit:
291
                mininc.append(pv)
893 by Martin Pool
- Refactor weave calculation of inclusions
292
                gotit.update(self.inclusions(pv))
890 by Martin Pool
- weave info should show minimal expression of parents
293
294
        assert mininc[0] >= 0
295
        assert mininc[-1] < version
296
        return mininc
297
298
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
299
    def _addversion(self, parents):
300
        if parents:
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
301
            self._v.append(parents)
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
302
        else:
303
            self._v.append(frozenset())
304
305
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
306
    def _check_lines(self, text):
307
        if not isinstance(text, list):
308
            raise ValueError("text should be a list, not %s" % type(text))
309
310
        for l in text:
311
            if not isinstance(l, basestring):
869 by Martin Pool
- more weave.py command line options
312
                raise ValueError("text line should be a string or unicode, not %s"
313
                                 % type(l))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
314
        
315
316
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
317
    def _check_versions(self, indexes):
318
        """Check everything in the sequence of indexes is valid"""
319
        for i in indexes:
320
            try:
321
                self._v[i]
322
            except IndexError:
323
                raise IndexError("invalid version number %r" % i)
324
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
325
    
0.1.7 by Martin Pool
Add trivial annotate text
326
    def annotate(self, index):
327
        return list(self.annotate_iter(index))
328
329
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
330
    def annotate_iter(self, version):
0.1.7 by Martin Pool
Add trivial annotate text
331
        """Yield list of (index-id, line) pairs for the specified version.
332
333
        The index indicates when the line originated in the weave."""
893 by Martin Pool
- Refactor weave calculation of inclusions
334
        for origin, lineno, text in self._extract([version]):
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
335
            yield origin, text
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
336
337
918 by Martin Pool
- start doing new weave-merge algorithm
338
    def _walk(self):
339
        """Walk the weave.
340
341
        Yields sequence of
342
        (lineno, insert, deletes, text)
343
        for each literal line.
344
        """
345
        
346
        istack = []
347
        dset = 0L
348
349
        lineno = 0         # line of weave, 0-based
350
351
        for l in self._l:
352
            if isinstance(l, tuple):
353
                c, v = l
354
                isactive = None
355
                if c == '{':
356
                    istack.append(v)
357
                elif c == '}':
358
                    oldv = istack.pop()
359
                elif c == '[':
360
                    vs = (1L << v)
361
                    assert not (dset & vs)
362
                    dset |= vs
363
                elif c == ']':
364
                    vs = (1L << v)
365
                    assert dset & vs
366
                    dset ^= vs
367
                else:
368
                    raise WeaveFormatError('unexpected instruction %r'
369
                                           % v)
370
            else:
371
                assert isinstance(l, basestring)
372
                assert istack
373
                yield lineno, istack[-1], dset, l
374
            lineno += 1
375
376
377
893 by Martin Pool
- Refactor weave calculation of inclusions
378
    def _extract(self, versions):
0.1.20 by Martin Pool
Factor out Knit.extract() method
379
        """Yield annotation of lines in included set.
380
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
381
        Yields a sequence of tuples (origin, lineno, text), where
382
        origin is the origin version, lineno the index in the weave,
383
        and text the text of the line.
384
0.1.20 by Martin Pool
Factor out Knit.extract() method
385
        The set typically but not necessarily corresponds to a version.
386
        """
893 by Martin Pool
- Refactor weave calculation of inclusions
387
        included = self.inclusions(versions)
881 by Martin Pool
- faster weave extraction
388
389
        istack = []
390
        dset = set()
0.1.48 by Martin Pool
Basic parsing of delete instructions.
391
392
        lineno = 0         # line of weave, 0-based
891 by Martin Pool
- fix up refactoring of weave
393
894 by Martin Pool
- small optimization for weave extract
394
        isactive = None
0.1.85 by Martin Pool
doc
395
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
396
        WFE = WeaveFormatError
0.1.95 by Martin Pool
- preliminary merge conflict detection
397
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
398
        for l in self._l:
399
            if isinstance(l, tuple):
400
                c, v = l
894 by Martin Pool
- small optimization for weave extract
401
                isactive = None
891 by Martin Pool
- fix up refactoring of weave
402
                if c == '{':
403
                    assert v not in istack
404
                    istack.append(v)
405
                elif c == '}':
406
                    oldv = istack.pop()
407
                    assert oldv == v
408
                elif c == '[':
409
                    if v in included:
881 by Martin Pool
- faster weave extraction
410
                        assert v not in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
411
                        dset.add(v)
891 by Martin Pool
- fix up refactoring of weave
412
                else:
413
                    assert c == ']'
414
                    if v in included:
881 by Martin Pool
- faster weave extraction
415
                        assert v in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
416
                        dset.remove(v)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
417
            else:
418
                assert isinstance(l, basestring)
894 by Martin Pool
- small optimization for weave extract
419
                if isactive is None:
420
                    isactive = (not dset) and istack and (istack[-1] in included)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
421
                if isactive:
888 by Martin Pool
- fix refactoring breakage
422
                    yield istack[-1], lineno, l
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
423
            lineno += 1
0.1.7 by Martin Pool
Add trivial annotate text
424
0.1.46 by Martin Pool
More constraints on structure of weave, and checks that they work
425
        if istack:
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
426
            raise WFE("unclosed insertion blocks at end of weave",
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
427
                                   istack)
0.1.48 by Martin Pool
Basic parsing of delete instructions.
428
        if dset:
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
429
            raise WFE("unclosed deletion blocks at end of weave",
0.1.48 by Martin Pool
Basic parsing of delete instructions.
430
                                   dset)
0.1.40 by Martin Pool
Add test for extracting from weave with nested insertions
431
0.1.7 by Martin Pool
Add trivial annotate text
432
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
433
    def get_iter(self, version):
0.1.5 by Martin Pool
Add test for storing two text versions.
434
        """Yield lines for the specified version."""
893 by Martin Pool
- Refactor weave calculation of inclusions
435
        for origin, lineno, line in self._extract([version]):
0.1.8 by Martin Pool
Unify get/annotate code
436
            yield line
0.1.5 by Martin Pool
Add test for storing two text versions.
437
438
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
439
    def get(self, index):
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
440
        return list(self.get_iter(index))
0.1.1 by Martin Pool
Check in old existing knit code.
441
442
0.1.95 by Martin Pool
- preliminary merge conflict detection
443
    def mash_iter(self, included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
444
        """Return composed version of multiple included versions."""
445
        included = frozenset(included)
893 by Martin Pool
- Refactor weave calculation of inclusions
446
        for origin, lineno, text in self._extract(included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
447
            yield text
448
449
0.1.11 by Martin Pool
Add Knit.dump method
450
    def dump(self, to_file):
451
        from pprint import pprint
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
452
        print >>to_file, "Weave._l = ",
0.1.11 by Martin Pool
Add Knit.dump method
453
        pprint(self._l, to_file)
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
454
        print >>to_file, "Weave._v = ",
0.1.18 by Martin Pool
Better Knit.dump method
455
        pprint(self._v, to_file)
0.1.11 by Martin Pool
Add Knit.dump method
456
457
0.1.91 by Martin Pool
Update Weave.check
458
459
    def numversions(self):
460
        l = len(self._v)
461
        assert l == len(self._sha1s)
462
        return l
463
464
894 by Martin Pool
- small optimization for weave extract
465
    def check(self, progress_bar=None):
0.1.91 by Martin Pool
Update Weave.check
466
        # check no circular inclusions
467
        for version in range(self.numversions()):
468
            inclusions = list(self._v[version])
469
            if inclusions:
470
                inclusions.sort()
471
                if inclusions[-1] >= version:
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
472
                    raise WeaveFormatError("invalid included version %d for index %d"
0.1.91 by Martin Pool
Update Weave.check
473
                                           % (inclusions[-1], version))
474
475
        # try extracting all versions; this is a bit slow and parallel
476
        # extraction could be used
477
        import sha
894 by Martin Pool
- small optimization for weave extract
478
        nv = self.numversions()
479
        for version in range(nv):
480
            if progress_bar:
481
                progress_bar.update('checking text', version, nv)
0.1.91 by Martin Pool
Update Weave.check
482
            s = sha.new()
483
            for l in self.get_iter(version):
484
                s.update(l)
485
            hd = s.hexdigest()
486
            expected = self._sha1s[version]
487
            if hd != expected:
488
                raise WeaveError("mismatched sha1 for version %d; "
489
                                 "got %s, expected %s"
490
                                 % (version, hd, expected))
0.1.18 by Martin Pool
Better Knit.dump method
491
881 by Martin Pool
- faster weave extraction
492
        # TODO: check insertions are properly nested, that there are
493
        # no lines outside of insertion blocks, that deletions are
494
        # properly paired, etc.
495
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
496
497
0.1.95 by Martin Pool
- preliminary merge conflict detection
498
    def merge(self, merge_versions):
499
        """Automerge and mark conflicts between versions.
500
501
        This returns a sequence, each entry describing alternatives
502
        for a chunk of the file.  Each of the alternatives is given as
503
        a list of lines.
504
505
        If there is a chunk of the file where there's no diagreement,
506
        only one alternative is given.
507
        """
508
509
        # approach: find the included versions common to all the
510
        # merged versions
511
        raise NotImplementedError()
512
513
514
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
515
    def _delta(self, included, lines):
516
        """Return changes from basis to new revision.
517
518
        The old text for comparison is the union of included revisions.
519
520
        This is used in inserting a new text.
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
521
0.1.55 by Martin Pool
doc
522
        Delta is returned as a sequence of
523
        (weave1, weave2, newlines).
524
525
        This indicates that weave1:weave2 of the old weave should be
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
526
        replaced by the sequence of lines in newlines.  Note that
527
        these line numbers are positions in the total weave and don't
528
        correspond to the lines in any extracted version, or even the
529
        extracted union of included versions.
530
531
        If line1=line2, this is a pure insert; if newlines=[] this is a
532
        pure delete.  (Similar to difflib.)
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
533
        """
0.1.54 by Martin Pool
Fix weave line calculation when making deltas
534
        # basis a list of (origin, lineno, line)
0.1.84 by Martin Pool
Refactor Weave._delta to calculate less unused information
535
        basis_lineno = []
0.1.83 by Martin Pool
Better delta basis calculation
536
        basis_lines = []
893 by Martin Pool
- Refactor weave calculation of inclusions
537
        for origin, lineno, line in self._extract(included):
0.1.84 by Martin Pool
Refactor Weave._delta to calculate less unused information
538
            basis_lineno.append(lineno)
539
            basis_lines.append(line)
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
540
541
        # add a sentinal, because we can also match against the final line
0.1.84 by Martin Pool
Refactor Weave._delta to calculate less unused information
542
        basis_lineno.append(len(self._l))
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
543
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
544
        # XXX: which line of the weave should we really consider
545
        # matches the end of the file?  the current code says it's the
546
        # last line of the weave?
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
547
548
        from difflib import SequenceMatcher
549
        s = SequenceMatcher(None, basis_lines, lines)
550
0.1.55 by Martin Pool
doc
551
        # TODO: Perhaps return line numbers from composed weave as well?
552
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
553
        for tag, i1, i2, j1, j2 in s.get_opcodes():
0.1.23 by Martin Pool
tidy up
554
            ##print tag, i1, i2, j1, j2
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
555
556
            if tag == 'equal':
557
                continue
558
559
            # i1,i2 are given in offsets within basis_lines; we need to map them
560
            # back to offsets within the entire weave
0.1.84 by Martin Pool
Refactor Weave._delta to calculate less unused information
561
            real_i1 = basis_lineno[i1]
562
            real_i2 = basis_lineno[i2]
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
563
0.1.35 by Martin Pool
Clean up Knit._delta method
564
            assert 0 <= j1
565
            assert j1 <= j2
566
            assert j2 <= len(lines)
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
567
0.1.35 by Martin Pool
Clean up Knit._delta method
568
            yield real_i1, real_i2, lines[j1:j2]
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
569
0.1.1 by Martin Pool
Check in old existing knit code.
570
918 by Martin Pool
- start doing new weave-merge algorithm
571
            
572
    def plan_merge(self, ver_a, ver_b):
573
        """Return pseudo-annotation indicating how the two versions merge.
574
575
        This is computed between versions a and b and their common
576
        base.
577
578
        Weave lines present in none of them are skipped entirely.
579
        """
580
        inc_a = self.inclusions_bitset([ver_a])
581
        inc_b = self.inclusions_bitset([ver_b])
582
        inc_c = inc_a & inc_b
583
584
        for lineno, insert, deleteset, line in self._walk():
585
            insertset = (1L << insert)
586
            if deleteset & inc_c:
587
                # killed in parent; can't be in either a or b
588
                # not relevant to our work
589
                yield 'killed-base', line
590
            elif insertset & inc_c:
591
                # was inserted in base
592
                killed_a = bool(deleteset & inc_a)
593
                killed_b = bool(deleteset & inc_b)
594
                if killed_a and killed_b:
595
                    yield 'killed-both', line
596
                elif killed_a:
597
                    yield 'killed-a', line
598
                elif killed_b:
599
                    yield 'killed-b', line
600
                else:
601
                    yield 'unchanged', line
602
            elif insertset & inc_a:
603
                if deleteset & inc_a:
604
                    yield 'ghost-a', line
605
                else:
606
                    # new in A; not in B
607
                    yield 'new-a', line
608
            elif insertset & inc_b:
609
                if deleteset & inc_b:
610
                    yield 'ghost-b', line
611
                else:
612
                    yield 'new-b', line
613
            else:
614
                # not in either revision
615
                yield 'irrelevant', line
616
919 by Martin Pool
- more development of weave-merge
617
        yield 'unchanged', ''           # terminator
618
619
620
621
    def weave_merge(self, plan):
622
        lines_a = []
623
        lines_b = []
624
        ch_a = ch_b = False
625
626
        for state, line in plan:
627
            if state == 'unchanged' or state == 'killed-both':
628
                # resync and flush queued conflicts changes if any
629
                if not lines_a and not lines_b:
630
                    pass
631
                elif ch_a and not ch_b:
632
                    # one-sided change:                    
633
                    for l in lines_a: yield l
634
                elif ch_b and not ch_a:
635
                    for l in lines_b: yield l
636
                elif lines_a == lines_b:
637
                    for l in lines_a: yield l
638
                else:
639
                    yield '<<<<\n'
640
                    for l in lines_a: yield l
641
                    yield '====\n'
642
                    for l in lines_b: yield l
643
                    yield '>>>>\n'
644
645
                del lines_a[:]
646
                del lines_b[:]
647
                ch_a = ch_b = False
648
                
649
            if state == 'unchanged':
650
                if line:
651
                    yield line
652
            elif state == 'killed-a':
653
                ch_a = True
654
                lines_b.append(line)
655
            elif state == 'killed-b':
656
                ch_b = True
657
                lines_a.append(line)
658
            elif state == 'new-a':
659
                ch_a = True
660
                lines_a.append(line)
661
            elif state == 'new-b':
662
                ch_b = True
663
                lines_b.append(line)
664
            else:
920 by Martin Pool
- add more test cases for weave_merge
665
                assert state in ('irrelevant', 'ghost-a', 'ghost-b', 'killed-base',
666
                                 'killed-both'), \
919 by Martin Pool
- more development of weave-merge
667
                       state
668
669
                
670
671
918 by Martin Pool
- start doing new weave-merge algorithm
672
673
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
674
0.1.88 by Martin Pool
Add weave info command.
675
def weave_info(filename, out):
676
    """Show some text information about the weave."""
677
    from weavefile import read_weave
678
    wf = file(filename, 'rb')
679
    w = read_weave(wf)
680
    # FIXME: doesn't work on pipes
681
    weave_size = wf.tell()
682
    print >>out, "weave file size %d bytes" % weave_size
683
    print >>out, "weave contains %d versions" % len(w._v)
684
685
    total = 0
870 by Martin Pool
- better weave info display
686
    print '%6s %6s %8s %40s %20s' % ('ver', 'lines', 'bytes', 'sha1', 'parents')
687
    for i in (6, 6, 8, 40, 20):
688
        print '-' * i,
689
    print
0.1.88 by Martin Pool
Add weave info command.
690
    for i in range(len(w._v)):
691
        text = w.get(i)
692
        lines = len(text)
693
        bytes = sum((len(a) for a in text))
0.1.91 by Martin Pool
Update Weave.check
694
        sha1 = w._sha1s[i]
870 by Martin Pool
- better weave info display
695
        print '%6d %6d %8d %40s' % (i, lines, bytes, sha1),
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
696
        for pv in w._v[i]:
697
            print pv,
698
        print
0.1.88 by Martin Pool
Add weave info command.
699
        total += bytes
700
701
    print >>out, "versions total %d bytes" % total
702
    print >>out, "compression ratio %.3f" % (float(total)/float(weave_size))
869 by Martin Pool
- more weave.py command line options
703
704
705
def usage():
871 by Martin Pool
- add command for merge-based weave
706
    print """bzr weave tool
707
708
Experimental tool for weave algorithm.
709
869 by Martin Pool
- more weave.py command line options
710
usage:
711
    weave init WEAVEFILE
712
        Create an empty weave file
713
    weave get WEAVEFILE VERSION
714
        Write out specified version.
715
    weave check WEAVEFILE
716
        Check consistency of all versions.
717
    weave info WEAVEFILE
718
        Display table of contents.
719
    weave add WEAVEFILE [BASE...] < NEWTEXT
720
        Add NEWTEXT, with specified parent versions.
721
    weave annotate WEAVEFILE VERSION
722
        Display origin of each line.
723
    weave mash WEAVEFILE VERSION...
724
        Display composite of all selected versions.
725
    weave merge WEAVEFILE VERSION1 VERSION2 > OUT
726
        Auto-merge two versions and display conflicts.
871 by Martin Pool
- add command for merge-based weave
727
728
example:
729
730
    % weave init foo.weave
731
    % vi foo.txt
732
    % weave add foo.weave < foo.txt
733
    added version 0
734
735
    (create updated version)
736
    % vi foo.txt
737
    % weave get foo.weave 0 | diff -u - foo.txt
738
    % weave add foo.weave 0 < foo.txt
739
    added version 1
740
741
    % weave get foo.weave 0 > foo.txt       (create forked version)
742
    % vi foo.txt
743
    % weave add foo.weave 0 < foo.txt
744
    added version 2
745
746
    % weave merge foo.weave 1 2 > foo.txt   (merge them)
747
    % vi foo.txt                            (resolve conflicts)
748
    % weave add foo.weave 1 2 < foo.txt     (commit merged version)     
749
    
869 by Martin Pool
- more weave.py command line options
750
"""
0.1.88 by Martin Pool
Add weave info command.
751
    
752
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
753
754
def main(argv):
755
    import sys
756
    import os
869 by Martin Pool
- more weave.py command line options
757
    from weavefile import write_weave, read_weave
894 by Martin Pool
- small optimization for weave extract
758
    from bzrlib.progress import ProgressBar
759
760
    #import psyco
761
    #psyco.full()
762
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
763
    cmd = argv[1]
869 by Martin Pool
- more weave.py command line options
764
765
    def readit():
766
        return read_weave(file(argv[2], 'rb'))
767
    
768
    if cmd == 'help':
769
        usage()
770
    elif cmd == 'add':
771
        w = readit()
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
772
        # at the moment, based on everything in the file
869 by Martin Pool
- more weave.py command line options
773
        parents = map(int, argv[3:])
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
774
        lines = sys.stdin.readlines()
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
775
        ver = w.add(parents, lines)
869 by Martin Pool
- more weave.py command line options
776
        write_weave(w, file(argv[2], 'wb'))
777
        print 'added version %d' % ver
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
778
    elif cmd == 'init':
779
        fn = argv[2]
780
        if os.path.exists(fn):
781
            raise IOError("file exists")
782
        w = Weave()
869 by Martin Pool
- more weave.py command line options
783
        write_weave(w, file(fn, 'wb'))
784
    elif cmd == 'get': # get one version
785
        w = readit()
0.1.94 by Martin Pool
Fix get_iter call
786
        sys.stdout.writelines(w.get_iter(int(argv[3])))
869 by Martin Pool
- more weave.py command line options
787
        
788
    elif cmd == 'mash': # get composite
789
        w = readit()
790
        sys.stdout.writelines(w.mash_iter(map(int, argv[3:])))
791
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
792
    elif cmd == 'annotate':
869 by Martin Pool
- more weave.py command line options
793
        w = readit()
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
794
        # newline is added to all lines regardless; too hard to get
795
        # reasonable formatting otherwise
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
796
        lasto = None
797
        for origin, text in w.annotate(int(argv[3])):
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
798
            text = text.rstrip('\r\n')
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
799
            if origin == lasto:
800
                print '      | %s' % (text)
801
            else:
802
                print '%5d | %s' % (origin, text)
803
                lasto = origin
871 by Martin Pool
- add command for merge-based weave
804
                
0.1.88 by Martin Pool
Add weave info command.
805
    elif cmd == 'info':
806
        weave_info(argv[2], sys.stdout)
871 by Martin Pool
- add command for merge-based weave
807
        
0.1.91 by Martin Pool
Update Weave.check
808
    elif cmd == 'check':
869 by Martin Pool
- more weave.py command line options
809
        w = readit()
894 by Martin Pool
- small optimization for weave extract
810
        pb = ProgressBar()
811
        w.check(pb)
812
        pb.clear()
871 by Martin Pool
- add command for merge-based weave
813
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
814
    elif cmd == 'inclusions':
815
        w = readit()
816
        print ' '.join(map(str, w.inclusions([int(argv[3])])))
817
818
    elif cmd == 'parents':
819
        w = readit()
820
        print ' '.join(map(str, w._v[int(argv[3])]))
821
918 by Martin Pool
- start doing new weave-merge algorithm
822
    elif cmd == 'plan-merge':
823
        w = readit()
824
        for state, line in w.plan_merge(int(argv[3]), int(argv[4])):
919 by Martin Pool
- more development of weave-merge
825
            if line:
826
                print '%14s | %s' % (state, line),
918 by Martin Pool
- start doing new weave-merge algorithm
827
871 by Martin Pool
- add command for merge-based weave
828
    elif cmd == 'merge':
919 by Martin Pool
- more development of weave-merge
829
        w = readit()
830
        p = w.plan_merge(int(argv[3]), int(argv[4]))
831
        sys.stdout.writelines(w.weave_merge(p))
832
            
833
    elif cmd == 'mash-merge':
871 by Martin Pool
- add command for merge-based weave
834
        if len(argv) != 5:
835
            usage()
836
            return 1
837
838
        w = readit()
839
        v1, v2 = map(int, argv[3:5])
840
841
        basis = w.inclusions([v1]).intersection(w.inclusions([v2]))
842
843
        base_lines = list(w.mash_iter(basis))
844
        a_lines = list(w.get(v1))
845
        b_lines = list(w.get(v2))
846
847
        from bzrlib.merge3 import Merge3
848
        m3 = Merge3(base_lines, a_lines, b_lines)
849
850
        name_a = 'version %d' % v1
851
        name_b = 'version %d' % v2
852
        sys.stdout.writelines(m3.merge_lines(name_a=name_a, name_b=name_b))
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
853
    else:
854
        raise ValueError('unknown command %r' % cmd)
855
    
856
857
if __name__ == '__main__':
858
    import sys
859
    sys.exit(main(sys.argv))