/brz/remove-bazaar

To get this branch, use:
bzr branch http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar
0.1.1 by Martin Pool
Check in old existing knit code.
1
#! /usr/bin/python
2
3
# Copyright (C) 2005 Canonical Ltd
4
0.1.33 by Martin Pool
add gpl text
5
# This program is free software; you can redistribute it and/or modify
6
# it under the terms of the GNU General Public License as published by
7
# the Free Software Foundation; either version 2 of the License, or
8
# (at your option) any later version.
9
10
# This program is distributed in the hope that it will be useful,
11
# but WITHOUT ANY WARRANTY; without even the implied warranty of
12
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13
# GNU General Public License for more details.
14
15
# You should have received a copy of the GNU General Public License
16
# along with this program; if not, write to the Free Software
17
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
0.1.1 by Martin Pool
Check in old existing knit code.
18
19
# Author: Martin Pool <mbp@canonical.com>
20
21
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
22
"""Weave - storage of related text file versions"""
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
23
928 by Martin Pool
- go back to using plain builtin set()
24
# before intset (r923) 2000 versions in 41.5s
25
# with intset (r926) 2000 versions in 93s !!!
26
# better to just use plain sets.
27
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
28
# making _extract build and return a list, rather than being a generator
29
# takes 37.94s
30
938 by Martin Pool
- various optimizations to weave add code
31
# with python -O, r923 does 2000 versions in 36.87s
32
33
# with optimizations to avoid mutating lists - 35.75!  I guess copying
34
# all the elements every time costs more than the small manipulations.
35
# a surprisingly small change.
36
37
# r931, which avoids using a generator for extract, does 36.98s
38
39
# with memoized inclusions, takes 41.49s; not very good
40
41
# with slots, takes 37.35s; without takes 39.16, a bit surprising
42
43
# with the delta calculation mixed in with the add method, rather than
44
# separated, takes 36.78s
45
46
# with delta folded in and mutation of the list, 36.13s
47
1079 by Martin Pool
- weavefile can just use lists for read-in ancestry, not frozensets
48
# with all this and simplification of add code, 33s
49
50
51
52
938 by Martin Pool
- various optimizations to weave add code
53
54
0.1.61 by Martin Pool
doc
55
# TODO: Perhaps have copy method for Weave instances?
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
56
0.1.58 by Martin Pool
doc
57
# XXX: If we do weaves this way, will a merge still behave the same
58
# way if it's done in a different order?  That's a pretty desirable
59
# property.
60
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
61
# TODO: Nothing here so far assumes the lines are really \n newlines,
62
# rather than being split up in some other way.  We could accomodate
63
# binaries, perhaps by naively splitting on \n or perhaps using
64
# something like a rolling checksum.
65
66
# TODO: Track version names as well as indexes. 
67
0.1.85 by Martin Pool
doc
68
# TODO: End marker for each version so we can stop reading?
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
69
70
# TODO: Check that no insertion occurs inside a deletion that was
71
# active in the version of the insertion.
72
912 by Martin Pool
- update todos for weave
73
# TODO: In addition to the SHA-1 check, perhaps have some code that
74
# checks structural constraints of the weave: ie that insertions are
75
# properly nested, that there is no text outside of an insertion, that
76
# insertions or deletions are not repeated, etc.
0.1.85 by Martin Pool
doc
77
918 by Martin Pool
- start doing new weave-merge algorithm
78
# TODO: Parallel-extract that passes back each line along with a
79
# description of which revisions include it.  Nice for checking all
80
# shas in parallel.
81
82
0.1.85 by Martin Pool
doc
83
924 by Martin Pool
- Add IntSet class
84
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
85
class WeaveError(Exception):
86
    """Exception in processing weave"""
87
88
89
class WeaveFormatError(WeaveError):
90
    """Weave invariant violated"""
91
    
92
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
93
class Weave(object):
94
    """weave - versioned text file storage.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
95
    
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
96
    A Weave manages versions of line-based text files, keeping track
97
    of the originating version for each line.
98
99
    To clients the "lines" of the file are represented as a list of strings.
100
    These strings  will typically have terminal newline characters, but
101
    this is not required.  In particular files commonly do not have a newline
102
    at the end of the file.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
103
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
104
    Texts can be identified in either of two ways:
105
106
    * a nonnegative index number.
107
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
108
    * a version-id string. (not implemented yet)
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
109
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
110
    Typically the index number will be valid only inside this weave and
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
111
    the version-id is used to reference it in the larger world.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
112
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
113
    The weave is represented as a list mixing edit instructions and
944 by Martin Pool
- refactor member names in Weave code
114
    literal text.  Each entry in _weave can be either a string (or
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
115
    unicode), or a tuple.  If a string, it means that the given line
116
    should be output in the currently active revisions.
117
118
    If a tuple, it gives a processing instruction saying in which
119
    revisions the enclosed lines are active.  The tuple has the form
120
    (instruction, version).
121
122
    The instruction can be '{' or '}' for an insertion block, and '['
123
    and ']' for a deletion block respectively.  The version is the
0.1.45 by Martin Pool
doc
124
    integer version index.  There is no replace operator, only deletes
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
125
    and inserts.  For '}', the end of an insertion, there is no
126
    version parameter because it always closes the most recently
127
    opened insertion.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
128
0.1.41 by Martin Pool
Doc
129
    Constraints/notes:
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
130
131
    * A later version can delete lines that were introduced by any
132
      number of ancestor versions; this implies that deletion
133
      instructions can span insertion blocks without regard to the
134
      insertion block's nesting.
135
0.1.41 by Martin Pool
Doc
136
    * Similarly, deletions need not be properly nested with regard to
137
      each other, because they might have been generated by
138
      independent revisions.
139
0.1.45 by Martin Pool
doc
140
    * Insertions are always made by inserting a new bracketed block
141
      into a single point in the previous weave.  This implies they
142
      can nest but not overlap, and the nesting must always have later
143
      insertions on the inside.
144
0.1.41 by Martin Pool
Doc
145
    * It doesn't seem very useful to have an active insertion
146
      inside an inactive insertion, but it might happen.
0.1.45 by Martin Pool
doc
147
      
0.1.41 by Martin Pool
Doc
148
    * Therefore, all instructions are always"considered"; that
149
      is passed onto and off the stack.  An outer inactive block
150
      doesn't disable an inner block.
151
152
    * Lines are enabled if the most recent enclosing insertion is
153
      active and none of the enclosing deletions are active.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
154
0.1.49 by Martin Pool
Add another constraint: revisions should not delete text that they
155
    * There is no point having a deletion directly inside its own
156
      insertion; you might as well just not write it.  And there
157
      should be no way to get an earlier version deleting a later
158
      version.
159
944 by Martin Pool
- refactor member names in Weave code
160
    _weave
161
        Text of the weave; list of control instruction tuples and strings.
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
162
944 by Martin Pool
- refactor member names in Weave code
163
    _parents
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
164
        List of parents, indexed by version number.
165
        It is only necessary to store the minimal set of parents for
166
        each version; the parent's parents are implied.
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
167
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
168
    _sha1s
169
        List of hex SHA-1 of each version, or None if not recorded.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
170
    """
938 by Martin Pool
- various optimizations to weave add code
171
944 by Martin Pool
- refactor member names in Weave code
172
    __slots__ = ['_weave', '_parents', '_sha1s']
938 by Martin Pool
- various optimizations to weave add code
173
    
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
174
    def __init__(self):
944 by Martin Pool
- refactor member names in Weave code
175
        self._weave = []
176
        self._parents = []
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
177
        self._sha1s = []
0.1.60 by Martin Pool
Weave eq and ne methods
178
179
180
    def __eq__(self, other):
181
        if not isinstance(other, Weave):
182
            return False
944 by Martin Pool
- refactor member names in Weave code
183
        return self._parents == other._parents \
184
               and self._weave == other._weave
0.1.60 by Martin Pool
Weave eq and ne methods
185
    
186
187
    def __ne__(self, other):
188
        return not self.__eq__(other)
189
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
190
        
0.1.26 by Martin Pool
Refactor parameters to add command
191
    def add(self, parents, text):
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
192
        """Add a single text on top of the weave.
0.1.36 by Martin Pool
doc
193
  
0.1.26 by Martin Pool
Refactor parameters to add command
194
        Returns the index number of the newly added version.
195
196
        parents
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
197
            List or set of direct parent version numbers.
198
            
0.1.26 by Martin Pool
Refactor parameters to add command
199
        text
200
            Sequence of lines to be added in the new version."""
938 by Martin Pool
- various optimizations to weave add code
201
202
        self._check_versions(parents)
0.1.82 by Martin Pool
Small weave optimizations
203
        ## self._check_lines(text)
944 by Martin Pool
- refactor member names in Weave code
204
        new_version = len(self._parents)
0.1.5 by Martin Pool
Add test for storing two text versions.
205
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
206
        import sha
207
        s = sha.new()
938 by Martin Pool
- various optimizations to weave add code
208
        map(s.update, text)
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
209
        sha1 = s.hexdigest()
210
        del s
211
938 by Martin Pool
- various optimizations to weave add code
212
        # if we abort after here the weave will be corrupt
944 by Martin Pool
- refactor member names in Weave code
213
        self._parents.append(frozenset(parents))
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
214
        self._sha1s.append(sha1)
938 by Martin Pool
- various optimizations to weave add code
215
216
            
217
        if not parents:
218
            # special case; adding with no parents revision; can do
219
            # this more quickly by just appending unconditionally.
220
            # even more specially, if we're adding an empty text we
221
            # need do nothing at all.
222
            if text:
944 by Martin Pool
- refactor member names in Weave code
223
                self._weave.append(('{', new_version))
224
                self._weave.extend(text)
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
225
                self._weave.append(('}', None))
938 by Martin Pool
- various optimizations to weave add code
226
        
227
            return new_version
228
941 by Martin Pool
- allow for parents specified to Weave.add to be a set
229
        if len(parents) == 1:
230
            pv = list(parents)[0]
231
            if sha1 == self._sha1s[pv]:
232
                # special case: same as the single parent
233
                return new_version
938 by Martin Pool
- various optimizations to weave add code
234
            
235
236
        ancestors = self.inclusions(parents)
237
944 by Martin Pool
- refactor member names in Weave code
238
        l = self._weave
938 by Martin Pool
- various optimizations to weave add code
239
240
        # basis a list of (origin, lineno, line)
241
        basis_lineno = []
242
        basis_lines = []
243
        for origin, lineno, line in self._extract(ancestors):
244
            basis_lineno.append(lineno)
245
            basis_lines.append(line)
246
1042 by Martin Pool
- more statistics output from 'weave stats' command
247
        # another small special case: a merge, producing the same text
248
        # as auto-merge
938 by Martin Pool
- various optimizations to weave add code
249
        if text == basis_lines:
250
            return new_version            
251
252
        # add a sentinal, because we can also match against the final line
944 by Martin Pool
- refactor member names in Weave code
253
        basis_lineno.append(len(self._weave))
938 by Martin Pool
- various optimizations to weave add code
254
255
        # XXX: which line of the weave should we really consider
256
        # matches the end of the file?  the current code says it's the
257
        # last line of the weave?
258
259
        #print 'basis_lines:', basis_lines
260
        #print 'new_lines:  ', lines
261
262
        from difflib import SequenceMatcher
263
        s = SequenceMatcher(None, basis_lines, text)
264
265
        # offset gives the number of lines that have been inserted
266
        # into the weave up to the current point; if the original edit instruction
267
        # says to change line A then we actually change (A+offset)
268
        offset = 0
269
270
        for tag, i1, i2, j1, j2 in s.get_opcodes():
271
            # i1,i2 are given in offsets within basis_lines; we need to map them
272
            # back to offsets within the entire weave
273
            #print 'raw match', tag, i1, i2, j1, j2
274
            if tag == 'equal':
275
                continue
276
277
            i1 = basis_lineno[i1]
278
            i2 = basis_lineno[i2]
279
280
            assert 0 <= j1 <= j2 <= len(text)
281
282
            #print tag, i1, i2, j1, j2
283
284
            # the deletion and insertion are handled separately.
285
            # first delete the region.
286
            if i1 != i2:
944 by Martin Pool
- refactor member names in Weave code
287
                self._weave.insert(i1+offset, ('[', new_version))
288
                self._weave.insert(i2+offset+1, (']', new_version))
938 by Martin Pool
- various optimizations to weave add code
289
                offset += 2
290
291
            if j1 != j2:
292
                # there may have been a deletion spanning up to
293
                # i2; we want to insert after this region to make sure
294
                # we don't destroy ourselves
295
                i = i2 + offset
944 by Martin Pool
- refactor member names in Weave code
296
                self._weave[i:i] = ([('{', new_version)] 
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
297
                                    + text[j1:j2] 
298
                                    + [('}', None)])
938 by Martin Pool
- various optimizations to weave add code
299
                offset += 2 + (j2 - j1)
300
301
        return new_version
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
302
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
303
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
304
    def inclusions(self, versions):
893 by Martin Pool
- Refactor weave calculation of inclusions
305
        """Return set of all ancestors of given version(s)."""
928 by Martin Pool
- go back to using plain builtin set()
306
        i = set(versions)
893 by Martin Pool
- Refactor weave calculation of inclusions
307
        v = max(versions)
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
308
        try:
893 by Martin Pool
- Refactor weave calculation of inclusions
309
            while v >= 0:
310
                if v in i:
311
                    # include all its parents
944 by Martin Pool
- refactor member names in Weave code
312
                    i.update(self._parents[v])
893 by Martin Pool
- Refactor weave calculation of inclusions
313
                v -= 1
314
            return i
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
315
        except IndexError:
316
            raise ValueError("version %d not present in weave" % v)
0.1.77 by Martin Pool
New Weave.get_included() does transitive expansion
317
318
890 by Martin Pool
- weave info should show minimal expression of parents
319
    def minimal_parents(self, version):
320
        """Find the minimal set of parents for the version."""
944 by Martin Pool
- refactor member names in Weave code
321
        included = self._parents[version]
890 by Martin Pool
- weave info should show minimal expression of parents
322
        if not included:
323
            return []
324
        
325
        li = list(included)
893 by Martin Pool
- Refactor weave calculation of inclusions
326
        li.sort(reverse=True)
890 by Martin Pool
- weave info should show minimal expression of parents
327
328
        mininc = []
928 by Martin Pool
- go back to using plain builtin set()
329
        gotit = set()
890 by Martin Pool
- weave info should show minimal expression of parents
330
331
        for pv in li:
332
            if pv not in gotit:
333
                mininc.append(pv)
893 by Martin Pool
- Refactor weave calculation of inclusions
334
                gotit.update(self.inclusions(pv))
890 by Martin Pool
- weave info should show minimal expression of parents
335
336
        assert mininc[0] >= 0
337
        assert mininc[-1] < version
338
        return mininc
339
340
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
341
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
342
    def _check_lines(self, text):
343
        if not isinstance(text, list):
344
            raise ValueError("text should be a list, not %s" % type(text))
345
346
        for l in text:
347
            if not isinstance(l, basestring):
869 by Martin Pool
- more weave.py command line options
348
                raise ValueError("text line should be a string or unicode, not %s"
349
                                 % type(l))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
350
        
351
352
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
353
    def _check_versions(self, indexes):
354
        """Check everything in the sequence of indexes is valid"""
355
        for i in indexes:
356
            try:
944 by Martin Pool
- refactor member names in Weave code
357
                self._parents[i]
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
358
            except IndexError:
359
                raise IndexError("invalid version number %r" % i)
360
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
361
    
0.1.7 by Martin Pool
Add trivial annotate text
362
    def annotate(self, index):
363
        return list(self.annotate_iter(index))
364
365
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
366
    def annotate_iter(self, version):
0.1.7 by Martin Pool
Add trivial annotate text
367
        """Yield list of (index-id, line) pairs for the specified version.
368
369
        The index indicates when the line originated in the weave."""
893 by Martin Pool
- Refactor weave calculation of inclusions
370
        for origin, lineno, text in self._extract([version]):
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
371
            yield origin, text
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
372
373
918 by Martin Pool
- start doing new weave-merge algorithm
374
    def _walk(self):
375
        """Walk the weave.
376
377
        Yields sequence of
378
        (lineno, insert, deletes, text)
379
        for each literal line.
380
        """
381
        
382
        istack = []
928 by Martin Pool
- go back to using plain builtin set()
383
        dset = set()
918 by Martin Pool
- start doing new weave-merge algorithm
384
385
        lineno = 0         # line of weave, 0-based
386
944 by Martin Pool
- refactor member names in Weave code
387
        for l in self._weave:
918 by Martin Pool
- start doing new weave-merge algorithm
388
            if isinstance(l, tuple):
389
                c, v = l
390
                isactive = None
391
                if c == '{':
392
                    istack.append(v)
393
                elif c == '}':
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
394
                    istack.pop()
918 by Martin Pool
- start doing new weave-merge algorithm
395
                elif c == '[':
926 by Martin Pool
- update more weave code to use intsets
396
                    assert v not in dset
397
                    dset.add(v)
918 by Martin Pool
- start doing new weave-merge algorithm
398
                elif c == ']':
926 by Martin Pool
- update more weave code to use intsets
399
                    dset.remove(v)
918 by Martin Pool
- start doing new weave-merge algorithm
400
                else:
401
                    raise WeaveFormatError('unexpected instruction %r'
402
                                           % v)
403
            else:
404
                assert isinstance(l, basestring)
405
                assert istack
406
                yield lineno, istack[-1], dset, l
407
            lineno += 1
408
409
410
893 by Martin Pool
- Refactor weave calculation of inclusions
411
    def _extract(self, versions):
0.1.20 by Martin Pool
Factor out Knit.extract() method
412
        """Yield annotation of lines in included set.
413
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
414
        Yields a sequence of tuples (origin, lineno, text), where
415
        origin is the origin version, lineno the index in the weave,
416
        and text the text of the line.
417
0.1.20 by Martin Pool
Factor out Knit.extract() method
418
        The set typically but not necessarily corresponds to a version.
419
        """
893 by Martin Pool
- Refactor weave calculation of inclusions
420
        included = self.inclusions(versions)
881 by Martin Pool
- faster weave extraction
421
422
        istack = []
928 by Martin Pool
- go back to using plain builtin set()
423
        dset = set()
0.1.48 by Martin Pool
Basic parsing of delete instructions.
424
425
        lineno = 0         # line of weave, 0-based
891 by Martin Pool
- fix up refactoring of weave
426
894 by Martin Pool
- small optimization for weave extract
427
        isactive = None
0.1.85 by Martin Pool
doc
428
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
429
        result = []
430
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
431
        WFE = WeaveFormatError
0.1.95 by Martin Pool
- preliminary merge conflict detection
432
944 by Martin Pool
- refactor member names in Weave code
433
        for l in self._weave:
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
434
            if isinstance(l, tuple):
435
                c, v = l
894 by Martin Pool
- small optimization for weave extract
436
                isactive = None
891 by Martin Pool
- fix up refactoring of weave
437
                if c == '{':
438
                    assert v not in istack
439
                    istack.append(v)
440
                elif c == '}':
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
441
                    istack.pop()
891 by Martin Pool
- fix up refactoring of weave
442
                elif c == '[':
443
                    if v in included:
881 by Martin Pool
- faster weave extraction
444
                        assert v not in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
445
                        dset.add(v)
891 by Martin Pool
- fix up refactoring of weave
446
                else:
447
                    assert c == ']'
448
                    if v in included:
881 by Martin Pool
- faster weave extraction
449
                        assert v in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
450
                        dset.remove(v)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
451
            else:
452
                assert isinstance(l, basestring)
894 by Martin Pool
- small optimization for weave extract
453
                if isactive is None:
454
                    isactive = (not dset) and istack and (istack[-1] in included)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
455
                if isactive:
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
456
                    result.append((istack[-1], lineno, l))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
457
            lineno += 1
0.1.7 by Martin Pool
Add trivial annotate text
458
0.1.46 by Martin Pool
More constraints on structure of weave, and checks that they work
459
        if istack:
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
460
            raise WFE("unclosed insertion blocks at end of weave",
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
461
                                   istack)
0.1.48 by Martin Pool
Basic parsing of delete instructions.
462
        if dset:
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
463
            raise WFE("unclosed deletion blocks at end of weave",
0.1.48 by Martin Pool
Basic parsing of delete instructions.
464
                                   dset)
0.1.40 by Martin Pool
Add test for extracting from weave with nested insertions
465
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
466
        return result
467
    
468
0.1.7 by Martin Pool
Add trivial annotate text
469
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
470
    def get_iter(self, version):
0.1.5 by Martin Pool
Add test for storing two text versions.
471
        """Yield lines for the specified version."""
893 by Martin Pool
- Refactor weave calculation of inclusions
472
        for origin, lineno, line in self._extract([version]):
0.1.8 by Martin Pool
Unify get/annotate code
473
            yield line
0.1.5 by Martin Pool
Add test for storing two text versions.
474
475
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
476
    def get(self, index):
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
477
        return list(self.get_iter(index))
0.1.1 by Martin Pool
Check in old existing knit code.
478
479
0.1.95 by Martin Pool
- preliminary merge conflict detection
480
    def mash_iter(self, included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
481
        """Return composed version of multiple included versions."""
893 by Martin Pool
- Refactor weave calculation of inclusions
482
        for origin, lineno, text in self._extract(included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
483
            yield text
484
485
0.1.11 by Martin Pool
Add Knit.dump method
486
    def dump(self, to_file):
487
        from pprint import pprint
944 by Martin Pool
- refactor member names in Weave code
488
        print >>to_file, "Weave._weave = ",
489
        pprint(self._weave, to_file)
490
        print >>to_file, "Weave._parents = ",
491
        pprint(self._parents, to_file)
0.1.11 by Martin Pool
Add Knit.dump method
492
493
0.1.91 by Martin Pool
Update Weave.check
494
495
    def numversions(self):
944 by Martin Pool
- refactor member names in Weave code
496
        l = len(self._parents)
0.1.91 by Martin Pool
Update Weave.check
497
        assert l == len(self._sha1s)
498
        return l
499
500
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
501
    def __len__(self):
502
        return self.numversions()
503
504
894 by Martin Pool
- small optimization for weave extract
505
    def check(self, progress_bar=None):
0.1.91 by Martin Pool
Update Weave.check
506
        # check no circular inclusions
507
        for version in range(self.numversions()):
944 by Martin Pool
- refactor member names in Weave code
508
            inclusions = list(self._parents[version])
0.1.91 by Martin Pool
Update Weave.check
509
            if inclusions:
510
                inclusions.sort()
511
                if inclusions[-1] >= version:
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
512
                    raise WeaveFormatError("invalid included version %d for index %d"
0.1.91 by Martin Pool
Update Weave.check
513
                                           % (inclusions[-1], version))
514
515
        # try extracting all versions; this is a bit slow and parallel
516
        # extraction could be used
517
        import sha
894 by Martin Pool
- small optimization for weave extract
518
        nv = self.numversions()
519
        for version in range(nv):
520
            if progress_bar:
521
                progress_bar.update('checking text', version, nv)
0.1.91 by Martin Pool
Update Weave.check
522
            s = sha.new()
523
            for l in self.get_iter(version):
524
                s.update(l)
525
            hd = s.hexdigest()
526
            expected = self._sha1s[version]
527
            if hd != expected:
528
                raise WeaveError("mismatched sha1 for version %d; "
529
                                 "got %s, expected %s"
530
                                 % (version, hd, expected))
0.1.18 by Martin Pool
Better Knit.dump method
531
881 by Martin Pool
- faster weave extraction
532
        # TODO: check insertions are properly nested, that there are
533
        # no lines outside of insertion blocks, that deletions are
534
        # properly paired, etc.
535
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
536
537
0.1.95 by Martin Pool
- preliminary merge conflict detection
538
    def merge(self, merge_versions):
539
        """Automerge and mark conflicts between versions.
540
541
        This returns a sequence, each entry describing alternatives
542
        for a chunk of the file.  Each of the alternatives is given as
543
        a list of lines.
544
545
        If there is a chunk of the file where there's no diagreement,
546
        only one alternative is given.
547
        """
548
549
        # approach: find the included versions common to all the
550
        # merged versions
551
        raise NotImplementedError()
552
553
554
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
555
    def _delta(self, included, lines):
556
        """Return changes from basis to new revision.
557
558
        The old text for comparison is the union of included revisions.
559
560
        This is used in inserting a new text.
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
561
0.1.55 by Martin Pool
doc
562
        Delta is returned as a sequence of
563
        (weave1, weave2, newlines).
564
565
        This indicates that weave1:weave2 of the old weave should be
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
566
        replaced by the sequence of lines in newlines.  Note that
567
        these line numbers are positions in the total weave and don't
568
        correspond to the lines in any extracted version, or even the
569
        extracted union of included versions.
570
571
        If line1=line2, this is a pure insert; if newlines=[] this is a
572
        pure delete.  (Similar to difflib.)
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
573
        """
574
0.1.1 by Martin Pool
Check in old existing knit code.
575
918 by Martin Pool
- start doing new weave-merge algorithm
576
            
577
    def plan_merge(self, ver_a, ver_b):
578
        """Return pseudo-annotation indicating how the two versions merge.
579
580
        This is computed between versions a and b and their common
581
        base.
582
583
        Weave lines present in none of them are skipped entirely.
584
        """
926 by Martin Pool
- update more weave code to use intsets
585
        inc_a = self.inclusions([ver_a])
586
        inc_b = self.inclusions([ver_b])
918 by Martin Pool
- start doing new weave-merge algorithm
587
        inc_c = inc_a & inc_b
588
589
        for lineno, insert, deleteset, line in self._walk():
590
            if deleteset & inc_c:
591
                # killed in parent; can't be in either a or b
592
                # not relevant to our work
593
                yield 'killed-base', line
926 by Martin Pool
- update more weave code to use intsets
594
            elif insert in inc_c:
918 by Martin Pool
- start doing new weave-merge algorithm
595
                # was inserted in base
596
                killed_a = bool(deleteset & inc_a)
597
                killed_b = bool(deleteset & inc_b)
598
                if killed_a and killed_b:
599
                    yield 'killed-both', line
600
                elif killed_a:
601
                    yield 'killed-a', line
602
                elif killed_b:
603
                    yield 'killed-b', line
604
                else:
605
                    yield 'unchanged', line
926 by Martin Pool
- update more weave code to use intsets
606
            elif insert in inc_a:
918 by Martin Pool
- start doing new weave-merge algorithm
607
                if deleteset & inc_a:
608
                    yield 'ghost-a', line
609
                else:
610
                    # new in A; not in B
611
                    yield 'new-a', line
926 by Martin Pool
- update more weave code to use intsets
612
            elif insert in inc_b:
918 by Martin Pool
- start doing new weave-merge algorithm
613
                if deleteset & inc_b:
614
                    yield 'ghost-b', line
615
                else:
616
                    yield 'new-b', line
617
            else:
618
                # not in either revision
619
                yield 'irrelevant', line
620
919 by Martin Pool
- more development of weave-merge
621
        yield 'unchanged', ''           # terminator
622
623
624
625
    def weave_merge(self, plan):
626
        lines_a = []
627
        lines_b = []
628
        ch_a = ch_b = False
629
630
        for state, line in plan:
631
            if state == 'unchanged' or state == 'killed-both':
632
                # resync and flush queued conflicts changes if any
633
                if not lines_a and not lines_b:
634
                    pass
635
                elif ch_a and not ch_b:
636
                    # one-sided change:                    
637
                    for l in lines_a: yield l
638
                elif ch_b and not ch_a:
639
                    for l in lines_b: yield l
640
                elif lines_a == lines_b:
641
                    for l in lines_a: yield l
642
                else:
643
                    yield '<<<<\n'
644
                    for l in lines_a: yield l
645
                    yield '====\n'
646
                    for l in lines_b: yield l
647
                    yield '>>>>\n'
648
649
                del lines_a[:]
650
                del lines_b[:]
651
                ch_a = ch_b = False
652
                
653
            if state == 'unchanged':
654
                if line:
655
                    yield line
656
            elif state == 'killed-a':
657
                ch_a = True
658
                lines_b.append(line)
659
            elif state == 'killed-b':
660
                ch_b = True
661
                lines_a.append(line)
662
            elif state == 'new-a':
663
                ch_a = True
664
                lines_a.append(line)
665
            elif state == 'new-b':
666
                ch_b = True
667
                lines_b.append(line)
668
            else:
920 by Martin Pool
- add more test cases for weave_merge
669
                assert state in ('irrelevant', 'ghost-a', 'ghost-b', 'killed-base',
670
                                 'killed-both'), \
919 by Martin Pool
- more development of weave-merge
671
                       state
672
673
                
674
675
918 by Martin Pool
- start doing new weave-merge algorithm
676
677
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
678
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
679
def weave_info(w):
0.1.88 by Martin Pool
Add weave info command.
680
    """Show some text information about the weave."""
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
681
    print '%6s %40s %20s' % ('ver', 'sha1', 'parents')
682
    for i in (6, 40, 20):
870 by Martin Pool
- better weave info display
683
        print '-' * i,
684
    print
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
685
    for i in range(w.numversions()):
0.1.91 by Martin Pool
Update Weave.check
686
        sha1 = w._sha1s[i]
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
687
        print '%6d %40s %s' % (i, sha1, ' '.join(map(str, w._parents[i])))
0.1.88 by Martin Pool
Add weave info command.
688
869 by Martin Pool
- more weave.py command line options
689
690
947 by Martin Pool
- new 'weave stats' command
691
def weave_stats(weave_file):
692
    from bzrlib.progress import ProgressBar
693
    from bzrlib.weavefile import read_weave
694
695
    pb = ProgressBar()
696
697
    wf = file(weave_file, 'rb')
698
    w = read_weave(wf)
699
    # FIXME: doesn't work on pipes
700
    weave_size = wf.tell()
701
702
    total = 0
703
    vers = len(w)
704
    for i in range(vers):
705
        pb.update('checking sizes', i, vers)
706
        for line in w.get_iter(i):
707
            total += len(line)
708
709
    pb.clear()
710
711
    print 'versions          %9d' % vers
712
    print 'weave file        %9d bytes' % weave_size
713
    print 'total contents    %9d bytes' % total
714
    print 'compression ratio %9.2fx' % (float(total) / float(weave_size))
1042 by Martin Pool
- more statistics output from 'weave stats' command
715
    if vers:
716
        avg = total/vers
717
        print 'average size      %9d bytes' % avg
718
        print 'relative size     %9.2fx' % (float(weave_size) / float(avg))
947 by Martin Pool
- new 'weave stats' command
719
720
869 by Martin Pool
- more weave.py command line options
721
def usage():
871 by Martin Pool
- add command for merge-based weave
722
    print """bzr weave tool
723
724
Experimental tool for weave algorithm.
725
869 by Martin Pool
- more weave.py command line options
726
usage:
727
    weave init WEAVEFILE
728
        Create an empty weave file
729
    weave get WEAVEFILE VERSION
730
        Write out specified version.
731
    weave check WEAVEFILE
732
        Check consistency of all versions.
733
    weave info WEAVEFILE
734
        Display table of contents.
735
    weave add WEAVEFILE [BASE...] < NEWTEXT
736
        Add NEWTEXT, with specified parent versions.
737
    weave annotate WEAVEFILE VERSION
738
        Display origin of each line.
739
    weave mash WEAVEFILE VERSION...
740
        Display composite of all selected versions.
741
    weave merge WEAVEFILE VERSION1 VERSION2 > OUT
742
        Auto-merge two versions and display conflicts.
871 by Martin Pool
- add command for merge-based weave
743
744
example:
745
746
    % weave init foo.weave
747
    % vi foo.txt
748
    % weave add foo.weave < foo.txt
749
    added version 0
750
751
    (create updated version)
752
    % vi foo.txt
753
    % weave get foo.weave 0 | diff -u - foo.txt
754
    % weave add foo.weave 0 < foo.txt
755
    added version 1
756
757
    % weave get foo.weave 0 > foo.txt       (create forked version)
758
    % vi foo.txt
759
    % weave add foo.weave 0 < foo.txt
760
    added version 2
761
762
    % weave merge foo.weave 1 2 > foo.txt   (merge them)
763
    % vi foo.txt                            (resolve conflicts)
764
    % weave add foo.weave 1 2 < foo.txt     (commit merged version)     
765
    
869 by Martin Pool
- more weave.py command line options
766
"""
0.1.88 by Martin Pool
Add weave info command.
767
    
768
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
769
770
def main(argv):
771
    import sys
772
    import os
869 by Martin Pool
- more weave.py command line options
773
    from weavefile import write_weave, read_weave
894 by Martin Pool
- small optimization for weave extract
774
    from bzrlib.progress import ProgressBar
775
1078 by Martin Pool
- use psyco for weave if possible
776
    try:
777
        import psyco
778
        psyco.full()
779
    except ImportError:
780
        pass
894 by Martin Pool
- small optimization for weave extract
781
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
782
    cmd = argv[1]
869 by Martin Pool
- more weave.py command line options
783
784
    def readit():
785
        return read_weave(file(argv[2], 'rb'))
786
    
787
    if cmd == 'help':
788
        usage()
789
    elif cmd == 'add':
790
        w = readit()
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
791
        # at the moment, based on everything in the file
869 by Martin Pool
- more weave.py command line options
792
        parents = map(int, argv[3:])
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
793
        lines = sys.stdin.readlines()
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
794
        ver = w.add(parents, lines)
869 by Martin Pool
- more weave.py command line options
795
        write_weave(w, file(argv[2], 'wb'))
796
        print 'added version %d' % ver
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
797
    elif cmd == 'init':
798
        fn = argv[2]
799
        if os.path.exists(fn):
800
            raise IOError("file exists")
801
        w = Weave()
869 by Martin Pool
- more weave.py command line options
802
        write_weave(w, file(fn, 'wb'))
803
    elif cmd == 'get': # get one version
804
        w = readit()
0.1.94 by Martin Pool
Fix get_iter call
805
        sys.stdout.writelines(w.get_iter(int(argv[3])))
869 by Martin Pool
- more weave.py command line options
806
        
807
    elif cmd == 'mash': # get composite
808
        w = readit()
809
        sys.stdout.writelines(w.mash_iter(map(int, argv[3:])))
810
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
811
    elif cmd == 'annotate':
869 by Martin Pool
- more weave.py command line options
812
        w = readit()
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
813
        # newline is added to all lines regardless; too hard to get
814
        # reasonable formatting otherwise
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
815
        lasto = None
816
        for origin, text in w.annotate(int(argv[3])):
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
817
            text = text.rstrip('\r\n')
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
818
            if origin == lasto:
819
                print '      | %s' % (text)
820
            else:
821
                print '%5d | %s' % (origin, text)
822
                lasto = origin
871 by Martin Pool
- add command for merge-based weave
823
                
0.1.88 by Martin Pool
Add weave info command.
824
    elif cmd == 'info':
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
825
        weave_info(readit())
947 by Martin Pool
- new 'weave stats' command
826
827
    elif cmd == 'stats':
828
        weave_stats(argv[2])
871 by Martin Pool
- add command for merge-based weave
829
        
0.1.91 by Martin Pool
Update Weave.check
830
    elif cmd == 'check':
869 by Martin Pool
- more weave.py command line options
831
        w = readit()
894 by Martin Pool
- small optimization for weave extract
832
        pb = ProgressBar()
833
        w.check(pb)
834
        pb.clear()
938 by Martin Pool
- various optimizations to weave add code
835
        print '%d versions ok' % w.numversions()
871 by Martin Pool
- add command for merge-based weave
836
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
837
    elif cmd == 'inclusions':
838
        w = readit()
839
        print ' '.join(map(str, w.inclusions([int(argv[3])])))
840
841
    elif cmd == 'parents':
842
        w = readit()
944 by Martin Pool
- refactor member names in Weave code
843
        print ' '.join(map(str, w._parents[int(argv[3])]))
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
844
918 by Martin Pool
- start doing new weave-merge algorithm
845
    elif cmd == 'plan-merge':
846
        w = readit()
847
        for state, line in w.plan_merge(int(argv[3]), int(argv[4])):
919 by Martin Pool
- more development of weave-merge
848
            if line:
849
                print '%14s | %s' % (state, line),
918 by Martin Pool
- start doing new weave-merge algorithm
850
871 by Martin Pool
- add command for merge-based weave
851
    elif cmd == 'merge':
919 by Martin Pool
- more development of weave-merge
852
        w = readit()
853
        p = w.plan_merge(int(argv[3]), int(argv[4]))
854
        sys.stdout.writelines(w.weave_merge(p))
855
            
856
    elif cmd == 'mash-merge':
871 by Martin Pool
- add command for merge-based weave
857
        if len(argv) != 5:
858
            usage()
859
            return 1
860
861
        w = readit()
862
        v1, v2 = map(int, argv[3:5])
863
864
        basis = w.inclusions([v1]).intersection(w.inclusions([v2]))
865
866
        base_lines = list(w.mash_iter(basis))
867
        a_lines = list(w.get(v1))
868
        b_lines = list(w.get(v2))
869
870
        from bzrlib.merge3 import Merge3
871
        m3 = Merge3(base_lines, a_lines, b_lines)
872
873
        name_a = 'version %d' % v1
874
        name_b = 'version %d' % v2
875
        sys.stdout.writelines(m3.merge_lines(name_a=name_a, name_b=name_b))
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
876
    else:
877
        raise ValueError('unknown command %r' % cmd)
878
    
879
1076 by Martin Pool
- add code to run weave utility under profiler
880
881
def profile_main(argv): 
882
    import tempfile, hotshot, hotshot.stats
883
884
    prof_f = tempfile.NamedTemporaryFile()
885
886
    prof = hotshot.Profile(prof_f.name)
887
888
    ret = prof.runcall(main, argv)
889
    prof.close()
890
891
    stats = hotshot.stats.load(prof_f.name)
892
    #stats.strip_dirs()
1079 by Martin Pool
- weavefile can just use lists for read-in ancestry, not frozensets
893
    stats.sort_stats('cumulative')
1076 by Martin Pool
- add code to run weave utility under profiler
894
    ## XXX: Might like to write to stderr or the trace file instead but
895
    ## print_stats seems hardcoded to stdout
896
    stats.print_stats(20)
897
            
898
    return ret
899
900
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
901
if __name__ == '__main__':
902
    import sys
1078 by Martin Pool
- use psyco for weave if possible
903
    if sys.argv[1] == '--profile':
904
        args = sys.argv[:]
905
        del args[1]
906
        sys.exit(profile_main(args))
907
    else:
908
        sys.exit(main(sys.argv))
1076 by Martin Pool
- add code to run weave utility under profiler
909