/brz/remove-bazaar

To get this branch, use:
bzr branch http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar
0.1.1 by Martin Pool
Check in old existing knit code.
1
#! /usr/bin/python
2
3
# Copyright (C) 2005 Canonical Ltd
4
0.1.33 by Martin Pool
add gpl text
5
# This program is free software; you can redistribute it and/or modify
6
# it under the terms of the GNU General Public License as published by
7
# the Free Software Foundation; either version 2 of the License, or
8
# (at your option) any later version.
9
10
# This program is distributed in the hope that it will be useful,
11
# but WITHOUT ANY WARRANTY; without even the implied warranty of
12
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13
# GNU General Public License for more details.
14
15
# You should have received a copy of the GNU General Public License
16
# along with this program; if not, write to the Free Software
17
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
0.1.1 by Martin Pool
Check in old existing knit code.
18
19
# Author: Martin Pool <mbp@canonical.com>
20
21
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
22
"""Weave - storage of related text file versions"""
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
23
928 by Martin Pool
- go back to using plain builtin set()
24
# before intset (r923) 2000 versions in 41.5s
25
# with intset (r926) 2000 versions in 93s !!!
26
# better to just use plain sets.
27
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
28
# making _extract build and return a list, rather than being a generator
29
# takes 37.94s
30
938 by Martin Pool
- various optimizations to weave add code
31
# with python -O, r923 does 2000 versions in 36.87s
32
33
# with optimizations to avoid mutating lists - 35.75!  I guess copying
34
# all the elements every time costs more than the small manipulations.
35
# a surprisingly small change.
36
37
# r931, which avoids using a generator for extract, does 36.98s
38
39
# with memoized inclusions, takes 41.49s; not very good
40
41
# with slots, takes 37.35s; without takes 39.16, a bit surprising
42
43
# with the delta calculation mixed in with the add method, rather than
44
# separated, takes 36.78s
45
46
# with delta folded in and mutation of the list, 36.13s
47
48
# with all this and simplification of add code, 33s 
49
50
0.1.61 by Martin Pool
doc
51
# TODO: Perhaps have copy method for Weave instances?
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
52
0.1.58 by Martin Pool
doc
53
# XXX: If we do weaves this way, will a merge still behave the same
54
# way if it's done in a different order?  That's a pretty desirable
55
# property.
56
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
57
# TODO: Nothing here so far assumes the lines are really \n newlines,
58
# rather than being split up in some other way.  We could accomodate
59
# binaries, perhaps by naively splitting on \n or perhaps using
60
# something like a rolling checksum.
61
62
# TODO: Track version names as well as indexes. 
63
0.1.85 by Martin Pool
doc
64
# TODO: End marker for each version so we can stop reading?
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
65
66
# TODO: Check that no insertion occurs inside a deletion that was
67
# active in the version of the insertion.
68
912 by Martin Pool
- update todos for weave
69
# TODO: In addition to the SHA-1 check, perhaps have some code that
70
# checks structural constraints of the weave: ie that insertions are
71
# properly nested, that there is no text outside of an insertion, that
72
# insertions or deletions are not repeated, etc.
0.1.85 by Martin Pool
doc
73
918 by Martin Pool
- start doing new weave-merge algorithm
74
# TODO: Make the info command just show info, not extract everything:
75
# it can be much faster.
76
77
# TODO: Perhaps use long integers as sets instead of set objects; may
78
# be faster.
79
80
# TODO: Parallel-extract that passes back each line along with a
81
# description of which revisions include it.  Nice for checking all
82
# shas in parallel.
83
84
0.1.85 by Martin Pool
doc
85
924 by Martin Pool
- Add IntSet class
86
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
87
class WeaveError(Exception):
88
    """Exception in processing weave"""
89
90
91
class WeaveFormatError(WeaveError):
92
    """Weave invariant violated"""
93
    
94
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
95
class Weave(object):
96
    """weave - versioned text file storage.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
97
    
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
98
    A Weave manages versions of line-based text files, keeping track
99
    of the originating version for each line.
100
101
    To clients the "lines" of the file are represented as a list of strings.
102
    These strings  will typically have terminal newline characters, but
103
    this is not required.  In particular files commonly do not have a newline
104
    at the end of the file.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
105
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
106
    Texts can be identified in either of two ways:
107
108
    * a nonnegative index number.
109
110
    * a version-id string.
111
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
112
    Typically the index number will be valid only inside this weave and
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
113
    the version-id is used to reference it in the larger world.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
114
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
115
    The weave is represented as a list mixing edit instructions and
116
    literal text.  Each entry in _l can be either a string (or
117
    unicode), or a tuple.  If a string, it means that the given line
118
    should be output in the currently active revisions.
119
120
    If a tuple, it gives a processing instruction saying in which
121
    revisions the enclosed lines are active.  The tuple has the form
122
    (instruction, version).
123
124
    The instruction can be '{' or '}' for an insertion block, and '['
125
    and ']' for a deletion block respectively.  The version is the
0.1.45 by Martin Pool
doc
126
    integer version index.  There is no replace operator, only deletes
127
    and inserts.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
128
0.1.41 by Martin Pool
Doc
129
    Constraints/notes:
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
130
131
    * A later version can delete lines that were introduced by any
132
      number of ancestor versions; this implies that deletion
133
      instructions can span insertion blocks without regard to the
134
      insertion block's nesting.
135
0.1.41 by Martin Pool
Doc
136
    * Similarly, deletions need not be properly nested with regard to
137
      each other, because they might have been generated by
138
      independent revisions.
139
0.1.45 by Martin Pool
doc
140
    * Insertions are always made by inserting a new bracketed block
141
      into a single point in the previous weave.  This implies they
142
      can nest but not overlap, and the nesting must always have later
143
      insertions on the inside.
144
0.1.41 by Martin Pool
Doc
145
    * It doesn't seem very useful to have an active insertion
146
      inside an inactive insertion, but it might happen.
0.1.45 by Martin Pool
doc
147
      
0.1.41 by Martin Pool
Doc
148
    * Therefore, all instructions are always"considered"; that
149
      is passed onto and off the stack.  An outer inactive block
150
      doesn't disable an inner block.
151
152
    * Lines are enabled if the most recent enclosing insertion is
153
      active and none of the enclosing deletions are active.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
154
0.1.49 by Martin Pool
Add another constraint: revisions should not delete text that they
155
    * There is no point having a deletion directly inside its own
156
      insertion; you might as well just not write it.  And there
157
      should be no way to get an earlier version deleting a later
158
      version.
159
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
160
    _l
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
161
        Text of the weave.
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
162
163
    _v
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
164
        List of parents, indexed by version number.
165
        It is only necessary to store the minimal set of parents for
166
        each version; the parent's parents are implied.
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
167
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
168
    _sha1s
169
        List of hex SHA-1 of each version, or None if not recorded.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
170
    """
938 by Martin Pool
- various optimizations to weave add code
171
172
    ## __slots__ = ['_l', '_v', '_sha1s']
173
    
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
174
    def __init__(self):
175
        self._l = []
176
        self._v = []
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
177
        self._sha1s = []
0.1.60 by Martin Pool
Weave eq and ne methods
178
179
180
    def __eq__(self, other):
181
        if not isinstance(other, Weave):
182
            return False
183
        return self._v == other._v \
184
               and self._l == other._l
185
    
186
187
    def __ne__(self, other):
188
        return not self.__eq__(other)
189
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
190
        
0.1.26 by Martin Pool
Refactor parameters to add command
191
    def add(self, parents, text):
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
192
        """Add a single text on top of the weave.
0.1.36 by Martin Pool
doc
193
  
0.1.26 by Martin Pool
Refactor parameters to add command
194
        Returns the index number of the newly added version.
195
196
        parents
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
197
            List or set of direct parent version numbers.
198
            
0.1.26 by Martin Pool
Refactor parameters to add command
199
        text
200
            Sequence of lines to be added in the new version."""
938 by Martin Pool
- various optimizations to weave add code
201
202
        self._check_versions(parents)
0.1.82 by Martin Pool
Small weave optimizations
203
        ## self._check_lines(text)
938 by Martin Pool
- various optimizations to weave add code
204
        new_version = len(self._v)
0.1.5 by Martin Pool
Add test for storing two text versions.
205
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
206
        import sha
207
        s = sha.new()
938 by Martin Pool
- various optimizations to weave add code
208
        map(s.update, text)
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
209
        sha1 = s.hexdigest()
210
        del s
211
938 by Martin Pool
- various optimizations to weave add code
212
        # if we abort after here the weave will be corrupt
213
        self._addversion(parents)
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
214
        self._sha1s.append(sha1)
938 by Martin Pool
- various optimizations to weave add code
215
216
            
217
        if not parents:
218
            # special case; adding with no parents revision; can do
219
            # this more quickly by just appending unconditionally.
220
            # even more specially, if we're adding an empty text we
221
            # need do nothing at all.
222
            if text:
223
                self._l.append(('{', new_version))
224
                self._l.extend(text)
225
                self._l.append(('}', new_version))
226
        
227
            return new_version
228
229
        if len(parents) == 1 and sha1 == self._sha1s[parents[0]]:
230
            # special case: same as the single parent
231
            return new_version
232
            
233
234
        ancestors = self.inclusions(parents)
235
236
        l = self._l
237
238
        # basis a list of (origin, lineno, line)
239
        basis_lineno = []
240
        basis_lines = []
241
        for origin, lineno, line in self._extract(ancestors):
242
            basis_lineno.append(lineno)
243
            basis_lines.append(line)
244
245
        # another small special case: a merge, producing the same text as auto-merge
246
        if text == basis_lines:
247
            return new_version            
248
249
        # add a sentinal, because we can also match against the final line
250
        basis_lineno.append(len(self._l))
251
252
        # XXX: which line of the weave should we really consider
253
        # matches the end of the file?  the current code says it's the
254
        # last line of the weave?
255
256
        #print 'basis_lines:', basis_lines
257
        #print 'new_lines:  ', lines
258
259
        from difflib import SequenceMatcher
260
        s = SequenceMatcher(None, basis_lines, text)
261
262
        # offset gives the number of lines that have been inserted
263
        # into the weave up to the current point; if the original edit instruction
264
        # says to change line A then we actually change (A+offset)
265
        offset = 0
266
267
        for tag, i1, i2, j1, j2 in s.get_opcodes():
268
            # i1,i2 are given in offsets within basis_lines; we need to map them
269
            # back to offsets within the entire weave
270
            #print 'raw match', tag, i1, i2, j1, j2
271
            if tag == 'equal':
272
                continue
273
274
            i1 = basis_lineno[i1]
275
            i2 = basis_lineno[i2]
276
277
            assert 0 <= i1 <= i2 <= len(old_l)
278
            assert 0 <= j1 <= j2 <= len(text)
279
280
            #print tag, i1, i2, j1, j2
281
282
            # the deletion and insertion are handled separately.
283
            # first delete the region.
284
            if i1 != i2:
285
                self._l.insert(i1+offset, ('[', new_version))
286
                self._l.insert(i2+offset+1, (']', new_version))
287
                offset += 2
288
289
            if j1 != j2:
290
                # there may have been a deletion spanning up to
291
                # i2; we want to insert after this region to make sure
292
                # we don't destroy ourselves
293
                i = i2 + offset
294
                self._l[i:i] = ([('{', new_version)] 
295
                                + text[j1:j2] 
296
                                + [('}', new_version)])
297
                offset += 2 + (j2 - j1)
298
299
        return new_version
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
300
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
301
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
302
    def inclusions(self, versions):
893 by Martin Pool
- Refactor weave calculation of inclusions
303
        """Return set of all ancestors of given version(s)."""
928 by Martin Pool
- go back to using plain builtin set()
304
        i = set(versions)
893 by Martin Pool
- Refactor weave calculation of inclusions
305
        v = max(versions)
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
306
        try:
893 by Martin Pool
- Refactor weave calculation of inclusions
307
            while v >= 0:
308
                if v in i:
309
                    # include all its parents
310
                    i.update(self._v[v])
311
                v -= 1
312
            return i
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
313
        except IndexError:
314
            raise ValueError("version %d not present in weave" % v)
0.1.77 by Martin Pool
New Weave.get_included() does transitive expansion
315
316
890 by Martin Pool
- weave info should show minimal expression of parents
317
    def minimal_parents(self, version):
318
        """Find the minimal set of parents for the version."""
319
        included = self._v[version]
320
        if not included:
321
            return []
322
        
323
        li = list(included)
893 by Martin Pool
- Refactor weave calculation of inclusions
324
        li.sort(reverse=True)
890 by Martin Pool
- weave info should show minimal expression of parents
325
326
        mininc = []
928 by Martin Pool
- go back to using plain builtin set()
327
        gotit = set()
890 by Martin Pool
- weave info should show minimal expression of parents
328
329
        for pv in li:
330
            if pv not in gotit:
331
                mininc.append(pv)
893 by Martin Pool
- Refactor weave calculation of inclusions
332
                gotit.update(self.inclusions(pv))
890 by Martin Pool
- weave info should show minimal expression of parents
333
334
        assert mininc[0] >= 0
335
        assert mininc[-1] < version
336
        return mininc
337
338
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
339
    def _addversion(self, parents):
340
        if parents:
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
341
            self._v.append(parents)
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
342
        else:
928 by Martin Pool
- go back to using plain builtin set()
343
            self._v.append(set())
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
344
345
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
346
    def _check_lines(self, text):
347
        if not isinstance(text, list):
348
            raise ValueError("text should be a list, not %s" % type(text))
349
350
        for l in text:
351
            if not isinstance(l, basestring):
869 by Martin Pool
- more weave.py command line options
352
                raise ValueError("text line should be a string or unicode, not %s"
353
                                 % type(l))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
354
        
355
356
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
357
    def _check_versions(self, indexes):
358
        """Check everything in the sequence of indexes is valid"""
359
        for i in indexes:
360
            try:
361
                self._v[i]
362
            except IndexError:
363
                raise IndexError("invalid version number %r" % i)
364
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
365
    
0.1.7 by Martin Pool
Add trivial annotate text
366
    def annotate(self, index):
367
        return list(self.annotate_iter(index))
368
369
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
370
    def annotate_iter(self, version):
0.1.7 by Martin Pool
Add trivial annotate text
371
        """Yield list of (index-id, line) pairs for the specified version.
372
373
        The index indicates when the line originated in the weave."""
893 by Martin Pool
- Refactor weave calculation of inclusions
374
        for origin, lineno, text in self._extract([version]):
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
375
            yield origin, text
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
376
377
918 by Martin Pool
- start doing new weave-merge algorithm
378
    def _walk(self):
379
        """Walk the weave.
380
381
        Yields sequence of
382
        (lineno, insert, deletes, text)
383
        for each literal line.
384
        """
385
        
386
        istack = []
928 by Martin Pool
- go back to using plain builtin set()
387
        dset = set()
918 by Martin Pool
- start doing new weave-merge algorithm
388
389
        lineno = 0         # line of weave, 0-based
390
391
        for l in self._l:
392
            if isinstance(l, tuple):
393
                c, v = l
394
                isactive = None
395
                if c == '{':
396
                    istack.append(v)
397
                elif c == '}':
398
                    oldv = istack.pop()
399
                elif c == '[':
926 by Martin Pool
- update more weave code to use intsets
400
                    assert v not in dset
401
                    dset.add(v)
918 by Martin Pool
- start doing new weave-merge algorithm
402
                elif c == ']':
926 by Martin Pool
- update more weave code to use intsets
403
                    dset.remove(v)
918 by Martin Pool
- start doing new weave-merge algorithm
404
                else:
405
                    raise WeaveFormatError('unexpected instruction %r'
406
                                           % v)
407
            else:
408
                assert isinstance(l, basestring)
409
                assert istack
410
                yield lineno, istack[-1], dset, l
411
            lineno += 1
412
413
414
893 by Martin Pool
- Refactor weave calculation of inclusions
415
    def _extract(self, versions):
0.1.20 by Martin Pool
Factor out Knit.extract() method
416
        """Yield annotation of lines in included set.
417
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
418
        Yields a sequence of tuples (origin, lineno, text), where
419
        origin is the origin version, lineno the index in the weave,
420
        and text the text of the line.
421
0.1.20 by Martin Pool
Factor out Knit.extract() method
422
        The set typically but not necessarily corresponds to a version.
423
        """
893 by Martin Pool
- Refactor weave calculation of inclusions
424
        included = self.inclusions(versions)
881 by Martin Pool
- faster weave extraction
425
426
        istack = []
928 by Martin Pool
- go back to using plain builtin set()
427
        dset = set()
0.1.48 by Martin Pool
Basic parsing of delete instructions.
428
429
        lineno = 0         # line of weave, 0-based
891 by Martin Pool
- fix up refactoring of weave
430
894 by Martin Pool
- small optimization for weave extract
431
        isactive = None
0.1.85 by Martin Pool
doc
432
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
433
        result = []
434
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
435
        WFE = WeaveFormatError
0.1.95 by Martin Pool
- preliminary merge conflict detection
436
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
437
        for l in self._l:
438
            if isinstance(l, tuple):
439
                c, v = l
894 by Martin Pool
- small optimization for weave extract
440
                isactive = None
891 by Martin Pool
- fix up refactoring of weave
441
                if c == '{':
442
                    assert v not in istack
443
                    istack.append(v)
444
                elif c == '}':
445
                    oldv = istack.pop()
446
                    assert oldv == v
447
                elif c == '[':
448
                    if v in included:
881 by Martin Pool
- faster weave extraction
449
                        assert v not in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
450
                        dset.add(v)
891 by Martin Pool
- fix up refactoring of weave
451
                else:
452
                    assert c == ']'
453
                    if v in included:
881 by Martin Pool
- faster weave extraction
454
                        assert v in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
455
                        dset.remove(v)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
456
            else:
457
                assert isinstance(l, basestring)
894 by Martin Pool
- small optimization for weave extract
458
                if isactive is None:
459
                    isactive = (not dset) and istack and (istack[-1] in included)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
460
                if isactive:
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
461
                    result.append((istack[-1], lineno, l))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
462
            lineno += 1
0.1.7 by Martin Pool
Add trivial annotate text
463
0.1.46 by Martin Pool
More constraints on structure of weave, and checks that they work
464
        if istack:
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
465
            raise WFE("unclosed insertion blocks at end of weave",
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
466
                                   istack)
0.1.48 by Martin Pool
Basic parsing of delete instructions.
467
        if dset:
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
468
            raise WFE("unclosed deletion blocks at end of weave",
0.1.48 by Martin Pool
Basic parsing of delete instructions.
469
                                   dset)
0.1.40 by Martin Pool
Add test for extracting from weave with nested insertions
470
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
471
        return result
472
    
473
0.1.7 by Martin Pool
Add trivial annotate text
474
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
475
    def get_iter(self, version):
0.1.5 by Martin Pool
Add test for storing two text versions.
476
        """Yield lines for the specified version."""
893 by Martin Pool
- Refactor weave calculation of inclusions
477
        for origin, lineno, line in self._extract([version]):
0.1.8 by Martin Pool
Unify get/annotate code
478
            yield line
0.1.5 by Martin Pool
Add test for storing two text versions.
479
480
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
481
    def get(self, index):
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
482
        return list(self.get_iter(index))
0.1.1 by Martin Pool
Check in old existing knit code.
483
484
0.1.95 by Martin Pool
- preliminary merge conflict detection
485
    def mash_iter(self, included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
486
        """Return composed version of multiple included versions."""
893 by Martin Pool
- Refactor weave calculation of inclusions
487
        for origin, lineno, text in self._extract(included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
488
            yield text
489
490
0.1.11 by Martin Pool
Add Knit.dump method
491
    def dump(self, to_file):
492
        from pprint import pprint
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
493
        print >>to_file, "Weave._l = ",
0.1.11 by Martin Pool
Add Knit.dump method
494
        pprint(self._l, to_file)
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
495
        print >>to_file, "Weave._v = ",
0.1.18 by Martin Pool
Better Knit.dump method
496
        pprint(self._v, to_file)
0.1.11 by Martin Pool
Add Knit.dump method
497
498
0.1.91 by Martin Pool
Update Weave.check
499
500
    def numversions(self):
501
        l = len(self._v)
502
        assert l == len(self._sha1s)
503
        return l
504
505
894 by Martin Pool
- small optimization for weave extract
506
    def check(self, progress_bar=None):
0.1.91 by Martin Pool
Update Weave.check
507
        # check no circular inclusions
508
        for version in range(self.numversions()):
509
            inclusions = list(self._v[version])
510
            if inclusions:
511
                inclusions.sort()
512
                if inclusions[-1] >= version:
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
513
                    raise WeaveFormatError("invalid included version %d for index %d"
0.1.91 by Martin Pool
Update Weave.check
514
                                           % (inclusions[-1], version))
515
516
        # try extracting all versions; this is a bit slow and parallel
517
        # extraction could be used
518
        import sha
894 by Martin Pool
- small optimization for weave extract
519
        nv = self.numversions()
520
        for version in range(nv):
521
            if progress_bar:
522
                progress_bar.update('checking text', version, nv)
0.1.91 by Martin Pool
Update Weave.check
523
            s = sha.new()
524
            for l in self.get_iter(version):
525
                s.update(l)
526
            hd = s.hexdigest()
527
            expected = self._sha1s[version]
528
            if hd != expected:
529
                raise WeaveError("mismatched sha1 for version %d; "
530
                                 "got %s, expected %s"
531
                                 % (version, hd, expected))
0.1.18 by Martin Pool
Better Knit.dump method
532
881 by Martin Pool
- faster weave extraction
533
        # TODO: check insertions are properly nested, that there are
534
        # no lines outside of insertion blocks, that deletions are
535
        # properly paired, etc.
536
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
537
538
0.1.95 by Martin Pool
- preliminary merge conflict detection
539
    def merge(self, merge_versions):
540
        """Automerge and mark conflicts between versions.
541
542
        This returns a sequence, each entry describing alternatives
543
        for a chunk of the file.  Each of the alternatives is given as
544
        a list of lines.
545
546
        If there is a chunk of the file where there's no diagreement,
547
        only one alternative is given.
548
        """
549
550
        # approach: find the included versions common to all the
551
        # merged versions
552
        raise NotImplementedError()
553
554
555
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
556
    def _delta(self, included, lines):
557
        """Return changes from basis to new revision.
558
559
        The old text for comparison is the union of included revisions.
560
561
        This is used in inserting a new text.
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
562
0.1.55 by Martin Pool
doc
563
        Delta is returned as a sequence of
564
        (weave1, weave2, newlines).
565
566
        This indicates that weave1:weave2 of the old weave should be
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
567
        replaced by the sequence of lines in newlines.  Note that
568
        these line numbers are positions in the total weave and don't
569
        correspond to the lines in any extracted version, or even the
570
        extracted union of included versions.
571
572
        If line1=line2, this is a pure insert; if newlines=[] this is a
573
        pure delete.  (Similar to difflib.)
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
574
        """
575
0.1.1 by Martin Pool
Check in old existing knit code.
576
918 by Martin Pool
- start doing new weave-merge algorithm
577
            
578
    def plan_merge(self, ver_a, ver_b):
579
        """Return pseudo-annotation indicating how the two versions merge.
580
581
        This is computed between versions a and b and their common
582
        base.
583
584
        Weave lines present in none of them are skipped entirely.
585
        """
926 by Martin Pool
- update more weave code to use intsets
586
        inc_a = self.inclusions([ver_a])
587
        inc_b = self.inclusions([ver_b])
918 by Martin Pool
- start doing new weave-merge algorithm
588
        inc_c = inc_a & inc_b
589
590
        for lineno, insert, deleteset, line in self._walk():
591
            if deleteset & inc_c:
592
                # killed in parent; can't be in either a or b
593
                # not relevant to our work
594
                yield 'killed-base', line
926 by Martin Pool
- update more weave code to use intsets
595
            elif insert in inc_c:
918 by Martin Pool
- start doing new weave-merge algorithm
596
                # was inserted in base
597
                killed_a = bool(deleteset & inc_a)
598
                killed_b = bool(deleteset & inc_b)
599
                if killed_a and killed_b:
600
                    yield 'killed-both', line
601
                elif killed_a:
602
                    yield 'killed-a', line
603
                elif killed_b:
604
                    yield 'killed-b', line
605
                else:
606
                    yield 'unchanged', line
926 by Martin Pool
- update more weave code to use intsets
607
            elif insert in inc_a:
918 by Martin Pool
- start doing new weave-merge algorithm
608
                if deleteset & inc_a:
609
                    yield 'ghost-a', line
610
                else:
611
                    # new in A; not in B
612
                    yield 'new-a', line
926 by Martin Pool
- update more weave code to use intsets
613
            elif insert in inc_b:
918 by Martin Pool
- start doing new weave-merge algorithm
614
                if deleteset & inc_b:
615
                    yield 'ghost-b', line
616
                else:
617
                    yield 'new-b', line
618
            else:
619
                # not in either revision
620
                yield 'irrelevant', line
621
919 by Martin Pool
- more development of weave-merge
622
        yield 'unchanged', ''           # terminator
623
624
625
626
    def weave_merge(self, plan):
627
        lines_a = []
628
        lines_b = []
629
        ch_a = ch_b = False
630
631
        for state, line in plan:
632
            if state == 'unchanged' or state == 'killed-both':
633
                # resync and flush queued conflicts changes if any
634
                if not lines_a and not lines_b:
635
                    pass
636
                elif ch_a and not ch_b:
637
                    # one-sided change:                    
638
                    for l in lines_a: yield l
639
                elif ch_b and not ch_a:
640
                    for l in lines_b: yield l
641
                elif lines_a == lines_b:
642
                    for l in lines_a: yield l
643
                else:
644
                    yield '<<<<\n'
645
                    for l in lines_a: yield l
646
                    yield '====\n'
647
                    for l in lines_b: yield l
648
                    yield '>>>>\n'
649
650
                del lines_a[:]
651
                del lines_b[:]
652
                ch_a = ch_b = False
653
                
654
            if state == 'unchanged':
655
                if line:
656
                    yield line
657
            elif state == 'killed-a':
658
                ch_a = True
659
                lines_b.append(line)
660
            elif state == 'killed-b':
661
                ch_b = True
662
                lines_a.append(line)
663
            elif state == 'new-a':
664
                ch_a = True
665
                lines_a.append(line)
666
            elif state == 'new-b':
667
                ch_b = True
668
                lines_b.append(line)
669
            else:
920 by Martin Pool
- add more test cases for weave_merge
670
                assert state in ('irrelevant', 'ghost-a', 'ghost-b', 'killed-base',
671
                                 'killed-both'), \
919 by Martin Pool
- more development of weave-merge
672
                       state
673
674
                
675
676
918 by Martin Pool
- start doing new weave-merge algorithm
677
678
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
679
0.1.88 by Martin Pool
Add weave info command.
680
def weave_info(filename, out):
681
    """Show some text information about the weave."""
682
    from weavefile import read_weave
683
    wf = file(filename, 'rb')
684
    w = read_weave(wf)
685
    # FIXME: doesn't work on pipes
686
    weave_size = wf.tell()
687
    print >>out, "weave file size %d bytes" % weave_size
688
    print >>out, "weave contains %d versions" % len(w._v)
689
690
    total = 0
870 by Martin Pool
- better weave info display
691
    print '%6s %6s %8s %40s %20s' % ('ver', 'lines', 'bytes', 'sha1', 'parents')
692
    for i in (6, 6, 8, 40, 20):
693
        print '-' * i,
694
    print
0.1.88 by Martin Pool
Add weave info command.
695
    for i in range(len(w._v)):
696
        text = w.get(i)
697
        lines = len(text)
698
        bytes = sum((len(a) for a in text))
0.1.91 by Martin Pool
Update Weave.check
699
        sha1 = w._sha1s[i]
870 by Martin Pool
- better weave info display
700
        print '%6d %6d %8d %40s' % (i, lines, bytes, sha1),
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
701
        for pv in w._v[i]:
702
            print pv,
703
        print
0.1.88 by Martin Pool
Add weave info command.
704
        total += bytes
705
706
    print >>out, "versions total %d bytes" % total
707
    print >>out, "compression ratio %.3f" % (float(total)/float(weave_size))
869 by Martin Pool
- more weave.py command line options
708
709
710
def usage():
871 by Martin Pool
- add command for merge-based weave
711
    print """bzr weave tool
712
713
Experimental tool for weave algorithm.
714
869 by Martin Pool
- more weave.py command line options
715
usage:
716
    weave init WEAVEFILE
717
        Create an empty weave file
718
    weave get WEAVEFILE VERSION
719
        Write out specified version.
720
    weave check WEAVEFILE
721
        Check consistency of all versions.
722
    weave info WEAVEFILE
723
        Display table of contents.
724
    weave add WEAVEFILE [BASE...] < NEWTEXT
725
        Add NEWTEXT, with specified parent versions.
726
    weave annotate WEAVEFILE VERSION
727
        Display origin of each line.
728
    weave mash WEAVEFILE VERSION...
729
        Display composite of all selected versions.
730
    weave merge WEAVEFILE VERSION1 VERSION2 > OUT
731
        Auto-merge two versions and display conflicts.
871 by Martin Pool
- add command for merge-based weave
732
733
example:
734
735
    % weave init foo.weave
736
    % vi foo.txt
737
    % weave add foo.weave < foo.txt
738
    added version 0
739
740
    (create updated version)
741
    % vi foo.txt
742
    % weave get foo.weave 0 | diff -u - foo.txt
743
    % weave add foo.weave 0 < foo.txt
744
    added version 1
745
746
    % weave get foo.weave 0 > foo.txt       (create forked version)
747
    % vi foo.txt
748
    % weave add foo.weave 0 < foo.txt
749
    added version 2
750
751
    % weave merge foo.weave 1 2 > foo.txt   (merge them)
752
    % vi foo.txt                            (resolve conflicts)
753
    % weave add foo.weave 1 2 < foo.txt     (commit merged version)     
754
    
869 by Martin Pool
- more weave.py command line options
755
"""
0.1.88 by Martin Pool
Add weave info command.
756
    
757
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
758
759
def main(argv):
760
    import sys
761
    import os
869 by Martin Pool
- more weave.py command line options
762
    from weavefile import write_weave, read_weave
894 by Martin Pool
- small optimization for weave extract
763
    from bzrlib.progress import ProgressBar
764
765
    #import psyco
766
    #psyco.full()
767
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
768
    cmd = argv[1]
869 by Martin Pool
- more weave.py command line options
769
770
    def readit():
771
        return read_weave(file(argv[2], 'rb'))
772
    
773
    if cmd == 'help':
774
        usage()
775
    elif cmd == 'add':
776
        w = readit()
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
777
        # at the moment, based on everything in the file
869 by Martin Pool
- more weave.py command line options
778
        parents = map(int, argv[3:])
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
779
        lines = sys.stdin.readlines()
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
780
        ver = w.add(parents, lines)
869 by Martin Pool
- more weave.py command line options
781
        write_weave(w, file(argv[2], 'wb'))
782
        print 'added version %d' % ver
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
783
    elif cmd == 'init':
784
        fn = argv[2]
785
        if os.path.exists(fn):
786
            raise IOError("file exists")
787
        w = Weave()
869 by Martin Pool
- more weave.py command line options
788
        write_weave(w, file(fn, 'wb'))
789
    elif cmd == 'get': # get one version
790
        w = readit()
0.1.94 by Martin Pool
Fix get_iter call
791
        sys.stdout.writelines(w.get_iter(int(argv[3])))
869 by Martin Pool
- more weave.py command line options
792
        
793
    elif cmd == 'mash': # get composite
794
        w = readit()
795
        sys.stdout.writelines(w.mash_iter(map(int, argv[3:])))
796
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
797
    elif cmd == 'annotate':
869 by Martin Pool
- more weave.py command line options
798
        w = readit()
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
799
        # newline is added to all lines regardless; too hard to get
800
        # reasonable formatting otherwise
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
801
        lasto = None
802
        for origin, text in w.annotate(int(argv[3])):
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
803
            text = text.rstrip('\r\n')
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
804
            if origin == lasto:
805
                print '      | %s' % (text)
806
            else:
807
                print '%5d | %s' % (origin, text)
808
                lasto = origin
871 by Martin Pool
- add command for merge-based weave
809
                
0.1.88 by Martin Pool
Add weave info command.
810
    elif cmd == 'info':
811
        weave_info(argv[2], sys.stdout)
871 by Martin Pool
- add command for merge-based weave
812
        
0.1.91 by Martin Pool
Update Weave.check
813
    elif cmd == 'check':
869 by Martin Pool
- more weave.py command line options
814
        w = readit()
894 by Martin Pool
- small optimization for weave extract
815
        pb = ProgressBar()
816
        w.check(pb)
817
        pb.clear()
938 by Martin Pool
- various optimizations to weave add code
818
        print '%d versions ok' % w.numversions()
871 by Martin Pool
- add command for merge-based weave
819
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
820
    elif cmd == 'inclusions':
821
        w = readit()
822
        print ' '.join(map(str, w.inclusions([int(argv[3])])))
823
824
    elif cmd == 'parents':
825
        w = readit()
826
        print ' '.join(map(str, w._v[int(argv[3])]))
827
918 by Martin Pool
- start doing new weave-merge algorithm
828
    elif cmd == 'plan-merge':
829
        w = readit()
830
        for state, line in w.plan_merge(int(argv[3]), int(argv[4])):
919 by Martin Pool
- more development of weave-merge
831
            if line:
832
                print '%14s | %s' % (state, line),
918 by Martin Pool
- start doing new weave-merge algorithm
833
871 by Martin Pool
- add command for merge-based weave
834
    elif cmd == 'merge':
919 by Martin Pool
- more development of weave-merge
835
        w = readit()
836
        p = w.plan_merge(int(argv[3]), int(argv[4]))
837
        sys.stdout.writelines(w.weave_merge(p))
838
            
839
    elif cmd == 'mash-merge':
871 by Martin Pool
- add command for merge-based weave
840
        if len(argv) != 5:
841
            usage()
842
            return 1
843
844
        w = readit()
845
        v1, v2 = map(int, argv[3:5])
846
847
        basis = w.inclusions([v1]).intersection(w.inclusions([v2]))
848
849
        base_lines = list(w.mash_iter(basis))
850
        a_lines = list(w.get(v1))
851
        b_lines = list(w.get(v2))
852
853
        from bzrlib.merge3 import Merge3
854
        m3 = Merge3(base_lines, a_lines, b_lines)
855
856
        name_a = 'version %d' % v1
857
        name_b = 'version %d' % v2
858
        sys.stdout.writelines(m3.merge_lines(name_a=name_a, name_b=name_b))
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
859
    else:
860
        raise ValueError('unknown command %r' % cmd)
861
    
862
863
if __name__ == '__main__':
864
    import sys
865
    sys.exit(main(sys.argv))