/brz/remove-bazaar

To get this branch, use:
bzr branch http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar
0.1.1 by Martin Pool
Check in old existing knit code.
1
#! /usr/bin/python
2
3
# Copyright (C) 2005 Canonical Ltd
4
0.1.33 by Martin Pool
add gpl text
5
# This program is free software; you can redistribute it and/or modify
6
# it under the terms of the GNU General Public License as published by
7
# the Free Software Foundation; either version 2 of the License, or
8
# (at your option) any later version.
9
10
# This program is distributed in the hope that it will be useful,
11
# but WITHOUT ANY WARRANTY; without even the implied warranty of
12
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13
# GNU General Public License for more details.
14
15
# You should have received a copy of the GNU General Public License
16
# along with this program; if not, write to the Free Software
17
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
0.1.1 by Martin Pool
Check in old existing knit code.
18
19
# Author: Martin Pool <mbp@canonical.com>
20
21
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
22
"""Weave - storage of related text file versions"""
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
23
928 by Martin Pool
- go back to using plain builtin set()
24
# before intset (r923) 2000 versions in 41.5s
25
# with intset (r926) 2000 versions in 93s !!!
26
# better to just use plain sets.
27
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
28
# making _extract build and return a list, rather than being a generator
29
# takes 37.94s
30
938 by Martin Pool
- various optimizations to weave add code
31
# with python -O, r923 does 2000 versions in 36.87s
32
33
# with optimizations to avoid mutating lists - 35.75!  I guess copying
34
# all the elements every time costs more than the small manipulations.
35
# a surprisingly small change.
36
37
# r931, which avoids using a generator for extract, does 36.98s
38
39
# with memoized inclusions, takes 41.49s; not very good
40
41
# with slots, takes 37.35s; without takes 39.16, a bit surprising
42
43
# with the delta calculation mixed in with the add method, rather than
44
# separated, takes 36.78s
45
46
# with delta folded in and mutation of the list, 36.13s
47
48
# with all this and simplification of add code, 33s 
49
50
0.1.61 by Martin Pool
doc
51
# TODO: Perhaps have copy method for Weave instances?
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
52
0.1.58 by Martin Pool
doc
53
# XXX: If we do weaves this way, will a merge still behave the same
54
# way if it's done in a different order?  That's a pretty desirable
55
# property.
56
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
57
# TODO: Nothing here so far assumes the lines are really \n newlines,
58
# rather than being split up in some other way.  We could accomodate
59
# binaries, perhaps by naively splitting on \n or perhaps using
60
# something like a rolling checksum.
61
62
# TODO: Track version names as well as indexes. 
63
0.1.85 by Martin Pool
doc
64
# TODO: End marker for each version so we can stop reading?
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
65
66
# TODO: Check that no insertion occurs inside a deletion that was
67
# active in the version of the insertion.
68
912 by Martin Pool
- update todos for weave
69
# TODO: In addition to the SHA-1 check, perhaps have some code that
70
# checks structural constraints of the weave: ie that insertions are
71
# properly nested, that there is no text outside of an insertion, that
72
# insertions or deletions are not repeated, etc.
0.1.85 by Martin Pool
doc
73
918 by Martin Pool
- start doing new weave-merge algorithm
74
# TODO: Parallel-extract that passes back each line along with a
75
# description of which revisions include it.  Nice for checking all
76
# shas in parallel.
77
78
0.1.85 by Martin Pool
doc
79
924 by Martin Pool
- Add IntSet class
80
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
81
class WeaveError(Exception):
82
    """Exception in processing weave"""
83
84
85
class WeaveFormatError(WeaveError):
86
    """Weave invariant violated"""
87
    
88
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
89
class Weave(object):
90
    """weave - versioned text file storage.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
91
    
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
92
    A Weave manages versions of line-based text files, keeping track
93
    of the originating version for each line.
94
95
    To clients the "lines" of the file are represented as a list of strings.
96
    These strings  will typically have terminal newline characters, but
97
    this is not required.  In particular files commonly do not have a newline
98
    at the end of the file.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
99
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
100
    Texts can be identified in either of two ways:
101
102
    * a nonnegative index number.
103
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
104
    * a version-id string. (not implemented yet)
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
105
0.1.38 by Martin Pool
Rename knit to weave. (I don't think there's an existing module called weave.)
106
    Typically the index number will be valid only inside this weave and
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
107
    the version-id is used to reference it in the larger world.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
108
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
109
    The weave is represented as a list mixing edit instructions and
944 by Martin Pool
- refactor member names in Weave code
110
    literal text.  Each entry in _weave can be either a string (or
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
111
    unicode), or a tuple.  If a string, it means that the given line
112
    should be output in the currently active revisions.
113
114
    If a tuple, it gives a processing instruction saying in which
115
    revisions the enclosed lines are active.  The tuple has the form
116
    (instruction, version).
117
118
    The instruction can be '{' or '}' for an insertion block, and '['
119
    and ']' for a deletion block respectively.  The version is the
0.1.45 by Martin Pool
doc
120
    integer version index.  There is no replace operator, only deletes
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
121
    and inserts.  For '}', the end of an insertion, there is no
122
    version parameter because it always closes the most recently
123
    opened insertion.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
124
0.1.41 by Martin Pool
Doc
125
    Constraints/notes:
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
126
127
    * A later version can delete lines that were introduced by any
128
      number of ancestor versions; this implies that deletion
129
      instructions can span insertion blocks without regard to the
130
      insertion block's nesting.
131
0.1.41 by Martin Pool
Doc
132
    * Similarly, deletions need not be properly nested with regard to
133
      each other, because they might have been generated by
134
      independent revisions.
135
0.1.45 by Martin Pool
doc
136
    * Insertions are always made by inserting a new bracketed block
137
      into a single point in the previous weave.  This implies they
138
      can nest but not overlap, and the nesting must always have later
139
      insertions on the inside.
140
0.1.41 by Martin Pool
Doc
141
    * It doesn't seem very useful to have an active insertion
142
      inside an inactive insertion, but it might happen.
0.1.45 by Martin Pool
doc
143
      
0.1.41 by Martin Pool
Doc
144
    * Therefore, all instructions are always"considered"; that
145
      is passed onto and off the stack.  An outer inactive block
146
      doesn't disable an inner block.
147
148
    * Lines are enabled if the most recent enclosing insertion is
149
      active and none of the enclosing deletions are active.
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
150
0.1.49 by Martin Pool
Add another constraint: revisions should not delete text that they
151
    * There is no point having a deletion directly inside its own
152
      insertion; you might as well just not write it.  And there
153
      should be no way to get an earlier version deleting a later
154
      version.
155
944 by Martin Pool
- refactor member names in Weave code
156
    _weave
157
        Text of the weave; list of control instruction tuples and strings.
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
158
944 by Martin Pool
- refactor member names in Weave code
159
    _parents
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
160
        List of parents, indexed by version number.
161
        It is only necessary to store the minimal set of parents for
162
        each version; the parent's parents are implied.
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
163
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
164
    _sha1s
165
        List of hex SHA-1 of each version, or None if not recorded.
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
166
    """
938 by Martin Pool
- various optimizations to weave add code
167
944 by Martin Pool
- refactor member names in Weave code
168
    __slots__ = ['_weave', '_parents', '_sha1s']
938 by Martin Pool
- various optimizations to weave add code
169
    
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
170
    def __init__(self):
944 by Martin Pool
- refactor member names in Weave code
171
        self._weave = []
172
        self._parents = []
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
173
        self._sha1s = []
0.1.60 by Martin Pool
Weave eq and ne methods
174
175
176
    def __eq__(self, other):
177
        if not isinstance(other, Weave):
178
            return False
944 by Martin Pool
- refactor member names in Weave code
179
        return self._parents == other._parents \
180
               and self._weave == other._weave
0.1.60 by Martin Pool
Weave eq and ne methods
181
    
182
183
    def __ne__(self, other):
184
        return not self.__eq__(other)
185
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
186
        
0.1.26 by Martin Pool
Refactor parameters to add command
187
    def add(self, parents, text):
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
188
        """Add a single text on top of the weave.
0.1.36 by Martin Pool
doc
189
  
0.1.26 by Martin Pool
Refactor parameters to add command
190
        Returns the index number of the newly added version.
191
192
        parents
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
193
            List or set of direct parent version numbers.
194
            
0.1.26 by Martin Pool
Refactor parameters to add command
195
        text
196
            Sequence of lines to be added in the new version."""
938 by Martin Pool
- various optimizations to weave add code
197
198
        self._check_versions(parents)
0.1.82 by Martin Pool
Small weave optimizations
199
        ## self._check_lines(text)
944 by Martin Pool
- refactor member names in Weave code
200
        new_version = len(self._parents)
0.1.5 by Martin Pool
Add test for storing two text versions.
201
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
202
        import sha
203
        s = sha.new()
938 by Martin Pool
- various optimizations to weave add code
204
        map(s.update, text)
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
205
        sha1 = s.hexdigest()
206
        del s
207
938 by Martin Pool
- various optimizations to weave add code
208
        # if we abort after here the weave will be corrupt
944 by Martin Pool
- refactor member names in Weave code
209
        self._parents.append(frozenset(parents))
0.1.89 by Martin Pool
Store SHA1 in weave file for later verification
210
        self._sha1s.append(sha1)
938 by Martin Pool
- various optimizations to weave add code
211
212
            
213
        if not parents:
214
            # special case; adding with no parents revision; can do
215
            # this more quickly by just appending unconditionally.
216
            # even more specially, if we're adding an empty text we
217
            # need do nothing at all.
218
            if text:
944 by Martin Pool
- refactor member names in Weave code
219
                self._weave.append(('{', new_version))
220
                self._weave.extend(text)
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
221
                self._weave.append(('}', None))
938 by Martin Pool
- various optimizations to weave add code
222
        
223
            return new_version
224
941 by Martin Pool
- allow for parents specified to Weave.add to be a set
225
        if len(parents) == 1:
226
            pv = list(parents)[0]
227
            if sha1 == self._sha1s[pv]:
228
                # special case: same as the single parent
229
                return new_version
938 by Martin Pool
- various optimizations to weave add code
230
            
231
232
        ancestors = self.inclusions(parents)
233
944 by Martin Pool
- refactor member names in Weave code
234
        l = self._weave
938 by Martin Pool
- various optimizations to weave add code
235
236
        # basis a list of (origin, lineno, line)
237
        basis_lineno = []
238
        basis_lines = []
239
        for origin, lineno, line in self._extract(ancestors):
240
            basis_lineno.append(lineno)
241
            basis_lines.append(line)
242
1042 by Martin Pool
- more statistics output from 'weave stats' command
243
        # another small special case: a merge, producing the same text
244
        # as auto-merge
938 by Martin Pool
- various optimizations to weave add code
245
        if text == basis_lines:
246
            return new_version            
247
248
        # add a sentinal, because we can also match against the final line
944 by Martin Pool
- refactor member names in Weave code
249
        basis_lineno.append(len(self._weave))
938 by Martin Pool
- various optimizations to weave add code
250
251
        # XXX: which line of the weave should we really consider
252
        # matches the end of the file?  the current code says it's the
253
        # last line of the weave?
254
255
        #print 'basis_lines:', basis_lines
256
        #print 'new_lines:  ', lines
257
258
        from difflib import SequenceMatcher
259
        s = SequenceMatcher(None, basis_lines, text)
260
261
        # offset gives the number of lines that have been inserted
262
        # into the weave up to the current point; if the original edit instruction
263
        # says to change line A then we actually change (A+offset)
264
        offset = 0
265
266
        for tag, i1, i2, j1, j2 in s.get_opcodes():
267
            # i1,i2 are given in offsets within basis_lines; we need to map them
268
            # back to offsets within the entire weave
269
            #print 'raw match', tag, i1, i2, j1, j2
270
            if tag == 'equal':
271
                continue
272
273
            i1 = basis_lineno[i1]
274
            i2 = basis_lineno[i2]
275
276
            assert 0 <= j1 <= j2 <= len(text)
277
278
            #print tag, i1, i2, j1, j2
279
280
            # the deletion and insertion are handled separately.
281
            # first delete the region.
282
            if i1 != i2:
944 by Martin Pool
- refactor member names in Weave code
283
                self._weave.insert(i1+offset, ('[', new_version))
284
                self._weave.insert(i2+offset+1, (']', new_version))
938 by Martin Pool
- various optimizations to weave add code
285
                offset += 2
286
287
            if j1 != j2:
288
                # there may have been a deletion spanning up to
289
                # i2; we want to insert after this region to make sure
290
                # we don't destroy ourselves
291
                i = i2 + offset
944 by Martin Pool
- refactor member names in Weave code
292
                self._weave[i:i] = ([('{', new_version)] 
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
293
                                    + text[j1:j2] 
294
                                    + [('}', None)])
938 by Martin Pool
- various optimizations to weave add code
295
                offset += 2 + (j2 - j1)
296
297
        return new_version
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
298
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
299
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
300
    def inclusions(self, versions):
893 by Martin Pool
- Refactor weave calculation of inclusions
301
        """Return set of all ancestors of given version(s)."""
928 by Martin Pool
- go back to using plain builtin set()
302
        i = set(versions)
893 by Martin Pool
- Refactor weave calculation of inclusions
303
        v = max(versions)
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
304
        try:
893 by Martin Pool
- Refactor weave calculation of inclusions
305
            while v >= 0:
306
                if v in i:
307
                    # include all its parents
944 by Martin Pool
- refactor member names in Weave code
308
                    i.update(self._parents[v])
893 by Martin Pool
- Refactor weave calculation of inclusions
309
                v -= 1
310
            return i
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
311
        except IndexError:
312
            raise ValueError("version %d not present in weave" % v)
0.1.77 by Martin Pool
New Weave.get_included() does transitive expansion
313
314
890 by Martin Pool
- weave info should show minimal expression of parents
315
    def minimal_parents(self, version):
316
        """Find the minimal set of parents for the version."""
944 by Martin Pool
- refactor member names in Weave code
317
        included = self._parents[version]
890 by Martin Pool
- weave info should show minimal expression of parents
318
        if not included:
319
            return []
320
        
321
        li = list(included)
893 by Martin Pool
- Refactor weave calculation of inclusions
322
        li.sort(reverse=True)
890 by Martin Pool
- weave info should show minimal expression of parents
323
324
        mininc = []
928 by Martin Pool
- go back to using plain builtin set()
325
        gotit = set()
890 by Martin Pool
- weave info should show minimal expression of parents
326
327
        for pv in li:
328
            if pv not in gotit:
329
                mininc.append(pv)
893 by Martin Pool
- Refactor weave calculation of inclusions
330
                gotit.update(self.inclusions(pv))
890 by Martin Pool
- weave info should show minimal expression of parents
331
332
        assert mininc[0] >= 0
333
        assert mininc[-1] < version
334
        return mininc
335
336
0.1.75 by Martin Pool
Remove VerInfo class; just store sets directly in the list of
337
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
338
    def _check_lines(self, text):
339
        if not isinstance(text, list):
340
            raise ValueError("text should be a list, not %s" % type(text))
341
342
        for l in text:
343
            if not isinstance(l, basestring):
869 by Martin Pool
- more weave.py command line options
344
                raise ValueError("text line should be a string or unicode, not %s"
345
                                 % type(l))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
346
        
347
348
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
349
    def _check_versions(self, indexes):
350
        """Check everything in the sequence of indexes is valid"""
351
        for i in indexes:
352
            try:
944 by Martin Pool
- refactor member names in Weave code
353
                self._parents[i]
0.1.27 by Martin Pool
Check that version numbers passed in are reasonable
354
            except IndexError:
355
                raise IndexError("invalid version number %r" % i)
356
0.1.2 by Martin Pool
Import testsweet module adapted from bzr.
357
    
0.1.7 by Martin Pool
Add trivial annotate text
358
    def annotate(self, index):
359
        return list(self.annotate_iter(index))
360
361
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
362
    def annotate_iter(self, version):
0.1.7 by Martin Pool
Add trivial annotate text
363
        """Yield list of (index-id, line) pairs for the specified version.
364
365
        The index indicates when the line originated in the weave."""
893 by Martin Pool
- Refactor weave calculation of inclusions
366
        for origin, lineno, text in self._extract([version]):
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
367
            yield origin, text
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
368
369
918 by Martin Pool
- start doing new weave-merge algorithm
370
    def _walk(self):
371
        """Walk the weave.
372
373
        Yields sequence of
374
        (lineno, insert, deletes, text)
375
        for each literal line.
376
        """
377
        
378
        istack = []
928 by Martin Pool
- go back to using plain builtin set()
379
        dset = set()
918 by Martin Pool
- start doing new weave-merge algorithm
380
381
        lineno = 0         # line of weave, 0-based
382
944 by Martin Pool
- refactor member names in Weave code
383
        for l in self._weave:
918 by Martin Pool
- start doing new weave-merge algorithm
384
            if isinstance(l, tuple):
385
                c, v = l
386
                isactive = None
387
                if c == '{':
388
                    istack.append(v)
389
                elif c == '}':
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
390
                    istack.pop()
918 by Martin Pool
- start doing new weave-merge algorithm
391
                elif c == '[':
926 by Martin Pool
- update more weave code to use intsets
392
                    assert v not in dset
393
                    dset.add(v)
918 by Martin Pool
- start doing new weave-merge algorithm
394
                elif c == ']':
926 by Martin Pool
- update more weave code to use intsets
395
                    dset.remove(v)
918 by Martin Pool
- start doing new weave-merge algorithm
396
                else:
397
                    raise WeaveFormatError('unexpected instruction %r'
398
                                           % v)
399
            else:
400
                assert isinstance(l, basestring)
401
                assert istack
402
                yield lineno, istack[-1], dset, l
403
            lineno += 1
404
405
406
893 by Martin Pool
- Refactor weave calculation of inclusions
407
    def _extract(self, versions):
0.1.20 by Martin Pool
Factor out Knit.extract() method
408
        """Yield annotation of lines in included set.
409
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
410
        Yields a sequence of tuples (origin, lineno, text), where
411
        origin is the origin version, lineno the index in the weave,
412
        and text the text of the line.
413
0.1.20 by Martin Pool
Factor out Knit.extract() method
414
        The set typically but not necessarily corresponds to a version.
415
        """
893 by Martin Pool
- Refactor weave calculation of inclusions
416
        included = self.inclusions(versions)
881 by Martin Pool
- faster weave extraction
417
418
        istack = []
928 by Martin Pool
- go back to using plain builtin set()
419
        dset = set()
0.1.48 by Martin Pool
Basic parsing of delete instructions.
420
421
        lineno = 0         # line of weave, 0-based
891 by Martin Pool
- fix up refactoring of weave
422
894 by Martin Pool
- small optimization for weave extract
423
        isactive = None
0.1.85 by Martin Pool
doc
424
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
425
        result = []
426
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
427
        WFE = WeaveFormatError
0.1.95 by Martin Pool
- preliminary merge conflict detection
428
944 by Martin Pool
- refactor member names in Weave code
429
        for l in self._weave:
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
430
            if isinstance(l, tuple):
431
                c, v = l
894 by Martin Pool
- small optimization for weave extract
432
                isactive = None
891 by Martin Pool
- fix up refactoring of weave
433
                if c == '{':
434
                    assert v not in istack
435
                    istack.append(v)
436
                elif c == '}':
1075 by Martin Pool
- don't store redundant version number at end of insert blocks
437
                    istack.pop()
891 by Martin Pool
- fix up refactoring of weave
438
                elif c == '[':
439
                    if v in included:
881 by Martin Pool
- faster weave extraction
440
                        assert v not in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
441
                        dset.add(v)
891 by Martin Pool
- fix up refactoring of weave
442
                else:
443
                    assert c == ']'
444
                    if v in included:
881 by Martin Pool
- faster weave extraction
445
                        assert v in dset
0.1.48 by Martin Pool
Basic parsing of delete instructions.
446
                        dset.remove(v)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
447
            else:
448
                assert isinstance(l, basestring)
894 by Martin Pool
- small optimization for weave extract
449
                if isactive is None:
450
                    isactive = (not dset) and istack and (istack[-1] in included)
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
451
                if isactive:
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
452
                    result.append((istack[-1], lineno, l))
0.1.39 by Martin Pool
Change to a more realistic weave structure which can represent insertions and
453
            lineno += 1
0.1.7 by Martin Pool
Add trivial annotate text
454
0.1.46 by Martin Pool
More constraints on structure of weave, and checks that they work
455
        if istack:
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
456
            raise WFE("unclosed insertion blocks at end of weave",
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
457
                                   istack)
0.1.48 by Martin Pool
Basic parsing of delete instructions.
458
        if dset:
0.1.63 by Martin Pool
Abbreviate WeaveFormatError in some code
459
            raise WFE("unclosed deletion blocks at end of weave",
0.1.48 by Martin Pool
Basic parsing of delete instructions.
460
                                   dset)
0.1.40 by Martin Pool
Add test for extracting from weave with nested insertions
461
931 by Martin Pool
- experiment with making Weave._extract() return a list, not a generator - slightly faster
462
        return result
463
    
464
0.1.7 by Martin Pool
Add trivial annotate text
465
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
466
    def get_iter(self, version):
0.1.5 by Martin Pool
Add test for storing two text versions.
467
        """Yield lines for the specified version."""
893 by Martin Pool
- Refactor weave calculation of inclusions
468
        for origin, lineno, line in self._extract([version]):
0.1.8 by Martin Pool
Unify get/annotate code
469
            yield line
0.1.5 by Martin Pool
Add test for storing two text versions.
470
471
0.1.4 by Martin Pool
Start indexing knits by both integer and version string.
472
    def get(self, index):
0.1.78 by Martin Pool
Rename Weave.get_included to inclusions and getiter to get_iter
473
        return list(self.get_iter(index))
0.1.1 by Martin Pool
Check in old existing knit code.
474
475
0.1.95 by Martin Pool
- preliminary merge conflict detection
476
    def mash_iter(self, included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
477
        """Return composed version of multiple included versions."""
893 by Martin Pool
- Refactor weave calculation of inclusions
478
        for origin, lineno, text in self._extract(included):
0.1.65 by Martin Pool
Add Weave.merge_iter to get automerged lines
479
            yield text
480
481
0.1.11 by Martin Pool
Add Knit.dump method
482
    def dump(self, to_file):
483
        from pprint import pprint
944 by Martin Pool
- refactor member names in Weave code
484
        print >>to_file, "Weave._weave = ",
485
        pprint(self._weave, to_file)
486
        print >>to_file, "Weave._parents = ",
487
        pprint(self._parents, to_file)
0.1.11 by Martin Pool
Add Knit.dump method
488
489
0.1.91 by Martin Pool
Update Weave.check
490
491
    def numversions(self):
944 by Martin Pool
- refactor member names in Weave code
492
        l = len(self._parents)
0.1.91 by Martin Pool
Update Weave.check
493
        assert l == len(self._sha1s)
494
        return l
495
496
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
497
    def __len__(self):
498
        return self.numversions()
499
500
894 by Martin Pool
- small optimization for weave extract
501
    def check(self, progress_bar=None):
0.1.91 by Martin Pool
Update Weave.check
502
        # check no circular inclusions
503
        for version in range(self.numversions()):
944 by Martin Pool
- refactor member names in Weave code
504
            inclusions = list(self._parents[version])
0.1.91 by Martin Pool
Update Weave.check
505
            if inclusions:
506
                inclusions.sort()
507
                if inclusions[-1] >= version:
0.1.47 by Martin Pool
New WeaveError and WeaveFormatError rather than assertions.
508
                    raise WeaveFormatError("invalid included version %d for index %d"
0.1.91 by Martin Pool
Update Weave.check
509
                                           % (inclusions[-1], version))
510
511
        # try extracting all versions; this is a bit slow and parallel
512
        # extraction could be used
513
        import sha
894 by Martin Pool
- small optimization for weave extract
514
        nv = self.numversions()
515
        for version in range(nv):
516
            if progress_bar:
517
                progress_bar.update('checking text', version, nv)
0.1.91 by Martin Pool
Update Weave.check
518
            s = sha.new()
519
            for l in self.get_iter(version):
520
                s.update(l)
521
            hd = s.hexdigest()
522
            expected = self._sha1s[version]
523
            if hd != expected:
524
                raise WeaveError("mismatched sha1 for version %d; "
525
                                 "got %s, expected %s"
526
                                 % (version, hd, expected))
0.1.18 by Martin Pool
Better Knit.dump method
527
881 by Martin Pool
- faster weave extraction
528
        # TODO: check insertions are properly nested, that there are
529
        # no lines outside of insertion blocks, that deletions are
530
        # properly paired, etc.
531
0.1.13 by Martin Pool
Knit structure now allows for versions to include the lines present in other
532
533
0.1.95 by Martin Pool
- preliminary merge conflict detection
534
    def merge(self, merge_versions):
535
        """Automerge and mark conflicts between versions.
536
537
        This returns a sequence, each entry describing alternatives
538
        for a chunk of the file.  Each of the alternatives is given as
539
        a list of lines.
540
541
        If there is a chunk of the file where there's no diagreement,
542
        only one alternative is given.
543
        """
544
545
        # approach: find the included versions common to all the
546
        # merged versions
547
        raise NotImplementedError()
548
549
550
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
551
    def _delta(self, included, lines):
552
        """Return changes from basis to new revision.
553
554
        The old text for comparison is the union of included revisions.
555
556
        This is used in inserting a new text.
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
557
0.1.55 by Martin Pool
doc
558
        Delta is returned as a sequence of
559
        (weave1, weave2, newlines).
560
561
        This indicates that weave1:weave2 of the old weave should be
0.1.22 by Martin Pool
Calculate delta for new versions relative to a set of parent versions.
562
        replaced by the sequence of lines in newlines.  Note that
563
        these line numbers are positions in the total weave and don't
564
        correspond to the lines in any extracted version, or even the
565
        extracted union of included versions.
566
567
        If line1=line2, this is a pure insert; if newlines=[] this is a
568
        pure delete.  (Similar to difflib.)
0.1.21 by Martin Pool
Start computing a delta to insert a new revision
569
        """
570
0.1.1 by Martin Pool
Check in old existing knit code.
571
918 by Martin Pool
- start doing new weave-merge algorithm
572
            
573
    def plan_merge(self, ver_a, ver_b):
574
        """Return pseudo-annotation indicating how the two versions merge.
575
576
        This is computed between versions a and b and their common
577
        base.
578
579
        Weave lines present in none of them are skipped entirely.
580
        """
926 by Martin Pool
- update more weave code to use intsets
581
        inc_a = self.inclusions([ver_a])
582
        inc_b = self.inclusions([ver_b])
918 by Martin Pool
- start doing new weave-merge algorithm
583
        inc_c = inc_a & inc_b
584
585
        for lineno, insert, deleteset, line in self._walk():
586
            if deleteset & inc_c:
587
                # killed in parent; can't be in either a or b
588
                # not relevant to our work
589
                yield 'killed-base', line
926 by Martin Pool
- update more weave code to use intsets
590
            elif insert in inc_c:
918 by Martin Pool
- start doing new weave-merge algorithm
591
                # was inserted in base
592
                killed_a = bool(deleteset & inc_a)
593
                killed_b = bool(deleteset & inc_b)
594
                if killed_a and killed_b:
595
                    yield 'killed-both', line
596
                elif killed_a:
597
                    yield 'killed-a', line
598
                elif killed_b:
599
                    yield 'killed-b', line
600
                else:
601
                    yield 'unchanged', line
926 by Martin Pool
- update more weave code to use intsets
602
            elif insert in inc_a:
918 by Martin Pool
- start doing new weave-merge algorithm
603
                if deleteset & inc_a:
604
                    yield 'ghost-a', line
605
                else:
606
                    # new in A; not in B
607
                    yield 'new-a', line
926 by Martin Pool
- update more weave code to use intsets
608
            elif insert in inc_b:
918 by Martin Pool
- start doing new weave-merge algorithm
609
                if deleteset & inc_b:
610
                    yield 'ghost-b', line
611
                else:
612
                    yield 'new-b', line
613
            else:
614
                # not in either revision
615
                yield 'irrelevant', line
616
919 by Martin Pool
- more development of weave-merge
617
        yield 'unchanged', ''           # terminator
618
619
620
621
    def weave_merge(self, plan):
622
        lines_a = []
623
        lines_b = []
624
        ch_a = ch_b = False
625
626
        for state, line in plan:
627
            if state == 'unchanged' or state == 'killed-both':
628
                # resync and flush queued conflicts changes if any
629
                if not lines_a and not lines_b:
630
                    pass
631
                elif ch_a and not ch_b:
632
                    # one-sided change:                    
633
                    for l in lines_a: yield l
634
                elif ch_b and not ch_a:
635
                    for l in lines_b: yield l
636
                elif lines_a == lines_b:
637
                    for l in lines_a: yield l
638
                else:
639
                    yield '<<<<\n'
640
                    for l in lines_a: yield l
641
                    yield '====\n'
642
                    for l in lines_b: yield l
643
                    yield '>>>>\n'
644
645
                del lines_a[:]
646
                del lines_b[:]
647
                ch_a = ch_b = False
648
                
649
            if state == 'unchanged':
650
                if line:
651
                    yield line
652
            elif state == 'killed-a':
653
                ch_a = True
654
                lines_b.append(line)
655
            elif state == 'killed-b':
656
                ch_b = True
657
                lines_a.append(line)
658
            elif state == 'new-a':
659
                ch_a = True
660
                lines_a.append(line)
661
            elif state == 'new-b':
662
                ch_b = True
663
                lines_b.append(line)
664
            else:
920 by Martin Pool
- add more test cases for weave_merge
665
                assert state in ('irrelevant', 'ghost-a', 'ghost-b', 'killed-base',
666
                                 'killed-both'), \
919 by Martin Pool
- more development of weave-merge
667
                       state
668
669
                
670
671
918 by Martin Pool
- start doing new weave-merge algorithm
672
673
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
674
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
675
def weave_info(w):
0.1.88 by Martin Pool
Add weave info command.
676
    """Show some text information about the weave."""
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
677
    print '%6s %40s %20s' % ('ver', 'sha1', 'parents')
678
    for i in (6, 40, 20):
870 by Martin Pool
- better weave info display
679
        print '-' * i,
680
    print
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
681
    for i in range(w.numversions()):
0.1.91 by Martin Pool
Update Weave.check
682
        sha1 = w._sha1s[i]
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
683
        print '%6d %40s %s' % (i, sha1, ' '.join(map(str, w._parents[i])))
0.1.88 by Martin Pool
Add weave info command.
684
869 by Martin Pool
- more weave.py command line options
685
686
947 by Martin Pool
- new 'weave stats' command
687
def weave_stats(weave_file):
688
    from bzrlib.progress import ProgressBar
689
    from bzrlib.weavefile import read_weave
690
691
    pb = ProgressBar()
692
693
    wf = file(weave_file, 'rb')
694
    w = read_weave(wf)
695
    # FIXME: doesn't work on pipes
696
    weave_size = wf.tell()
697
698
    total = 0
699
    vers = len(w)
700
    for i in range(vers):
701
        pb.update('checking sizes', i, vers)
702
        for line in w.get_iter(i):
703
            total += len(line)
704
705
    pb.clear()
706
707
    print 'versions          %9d' % vers
708
    print 'weave file        %9d bytes' % weave_size
709
    print 'total contents    %9d bytes' % total
710
    print 'compression ratio %9.2fx' % (float(total) / float(weave_size))
1042 by Martin Pool
- more statistics output from 'weave stats' command
711
    if vers:
712
        avg = total/vers
713
        print 'average size      %9d bytes' % avg
714
        print 'relative size     %9.2fx' % (float(weave_size) / float(avg))
947 by Martin Pool
- new 'weave stats' command
715
716
869 by Martin Pool
- more weave.py command line options
717
def usage():
871 by Martin Pool
- add command for merge-based weave
718
    print """bzr weave tool
719
720
Experimental tool for weave algorithm.
721
869 by Martin Pool
- more weave.py command line options
722
usage:
723
    weave init WEAVEFILE
724
        Create an empty weave file
725
    weave get WEAVEFILE VERSION
726
        Write out specified version.
727
    weave check WEAVEFILE
728
        Check consistency of all versions.
729
    weave info WEAVEFILE
730
        Display table of contents.
731
    weave add WEAVEFILE [BASE...] < NEWTEXT
732
        Add NEWTEXT, with specified parent versions.
733
    weave annotate WEAVEFILE VERSION
734
        Display origin of each line.
735
    weave mash WEAVEFILE VERSION...
736
        Display composite of all selected versions.
737
    weave merge WEAVEFILE VERSION1 VERSION2 > OUT
738
        Auto-merge two versions and display conflicts.
871 by Martin Pool
- add command for merge-based weave
739
740
example:
741
742
    % weave init foo.weave
743
    % vi foo.txt
744
    % weave add foo.weave < foo.txt
745
    added version 0
746
747
    (create updated version)
748
    % vi foo.txt
749
    % weave get foo.weave 0 | diff -u - foo.txt
750
    % weave add foo.weave 0 < foo.txt
751
    added version 1
752
753
    % weave get foo.weave 0 > foo.txt       (create forked version)
754
    % vi foo.txt
755
    % weave add foo.weave 0 < foo.txt
756
    added version 2
757
758
    % weave merge foo.weave 1 2 > foo.txt   (merge them)
759
    % vi foo.txt                            (resolve conflicts)
760
    % weave add foo.weave 1 2 < foo.txt     (commit merged version)     
761
    
869 by Martin Pool
- more weave.py command line options
762
"""
0.1.88 by Martin Pool
Add weave info command.
763
    
764
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
765
766
def main(argv):
767
    import sys
768
    import os
869 by Martin Pool
- more weave.py command line options
769
    from weavefile import write_weave, read_weave
894 by Martin Pool
- small optimization for weave extract
770
    from bzrlib.progress import ProgressBar
771
772
    #import psyco
773
    #psyco.full()
774
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
775
    cmd = argv[1]
869 by Martin Pool
- more weave.py command line options
776
777
    def readit():
778
        return read_weave(file(argv[2], 'rb'))
779
    
780
    if cmd == 'help':
781
        usage()
782
    elif cmd == 'add':
783
        w = readit()
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
784
        # at the moment, based on everything in the file
869 by Martin Pool
- more weave.py command line options
785
        parents = map(int, argv[3:])
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
786
        lines = sys.stdin.readlines()
0.1.69 by Martin Pool
Simple text-based format for storing weaves, cleaner than
787
        ver = w.add(parents, lines)
869 by Martin Pool
- more weave.py command line options
788
        write_weave(w, file(argv[2], 'wb'))
789
        print 'added version %d' % ver
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
790
    elif cmd == 'init':
791
        fn = argv[2]
792
        if os.path.exists(fn):
793
            raise IOError("file exists")
794
        w = Weave()
869 by Martin Pool
- more weave.py command line options
795
        write_weave(w, file(fn, 'wb'))
796
    elif cmd == 'get': # get one version
797
        w = readit()
0.1.94 by Martin Pool
Fix get_iter call
798
        sys.stdout.writelines(w.get_iter(int(argv[3])))
869 by Martin Pool
- more weave.py command line options
799
        
800
    elif cmd == 'mash': # get composite
801
        w = readit()
802
        sys.stdout.writelines(w.mash_iter(map(int, argv[3:])))
803
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
804
    elif cmd == 'annotate':
869 by Martin Pool
- more weave.py command line options
805
        w = readit()
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
806
        # newline is added to all lines regardless; too hard to get
807
        # reasonable formatting otherwise
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
808
        lasto = None
809
        for origin, text in w.annotate(int(argv[3])):
0.1.72 by Martin Pool
Go back to weave lines normally having newlines at the end.
810
            text = text.rstrip('\r\n')
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
811
            if origin == lasto:
812
                print '      | %s' % (text)
813
            else:
814
                print '%5d | %s' % (origin, text)
815
                lasto = origin
871 by Martin Pool
- add command for merge-based weave
816
                
0.1.88 by Martin Pool
Add weave info command.
817
    elif cmd == 'info':
946 by Martin Pool
- weave info only shows the weave headers, doesn't extract every version:
818
        weave_info(readit())
947 by Martin Pool
- new 'weave stats' command
819
820
    elif cmd == 'stats':
821
        weave_stats(argv[2])
871 by Martin Pool
- add command for merge-based weave
822
        
0.1.91 by Martin Pool
Update Weave.check
823
    elif cmd == 'check':
869 by Martin Pool
- more weave.py command line options
824
        w = readit()
894 by Martin Pool
- small optimization for weave extract
825
        pb = ProgressBar()
826
        w.check(pb)
827
        pb.clear()
938 by Martin Pool
- various optimizations to weave add code
828
        print '%d versions ok' % w.numversions()
871 by Martin Pool
- add command for merge-based weave
829
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
830
    elif cmd == 'inclusions':
831
        w = readit()
832
        print ' '.join(map(str, w.inclusions([int(argv[3])])))
833
834
    elif cmd == 'parents':
835
        w = readit()
944 by Martin Pool
- refactor member names in Weave code
836
        print ' '.join(map(str, w._parents[int(argv[3])]))
892 by Martin Pool
- weave stores only direct parents, and calculates and memoizes expansion as needed
837
918 by Martin Pool
- start doing new weave-merge algorithm
838
    elif cmd == 'plan-merge':
839
        w = readit()
840
        for state, line in w.plan_merge(int(argv[3]), int(argv[4])):
919 by Martin Pool
- more development of weave-merge
841
            if line:
842
                print '%14s | %s' % (state, line),
918 by Martin Pool
- start doing new weave-merge algorithm
843
871 by Martin Pool
- add command for merge-based weave
844
    elif cmd == 'merge':
919 by Martin Pool
- more development of weave-merge
845
        w = readit()
846
        p = w.plan_merge(int(argv[3]), int(argv[4]))
847
        sys.stdout.writelines(w.weave_merge(p))
848
            
849
    elif cmd == 'mash-merge':
871 by Martin Pool
- add command for merge-based weave
850
        if len(argv) != 5:
851
            usage()
852
            return 1
853
854
        w = readit()
855
        v1, v2 = map(int, argv[3:5])
856
857
        basis = w.inclusions([v1]).intersection(w.inclusions([v2]))
858
859
        base_lines = list(w.mash_iter(basis))
860
        a_lines = list(w.get(v1))
861
        b_lines = list(w.get(v2))
862
863
        from bzrlib.merge3 import Merge3
864
        m3 = Merge3(base_lines, a_lines, b_lines)
865
866
        name_a = 'version %d' % v1
867
        name_b = 'version %d' % v2
868
        sys.stdout.writelines(m3.merge_lines(name_a=name_a, name_b=name_b))
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
869
    else:
870
        raise ValueError('unknown command %r' % cmd)
871
    
872
1076 by Martin Pool
- add code to run weave utility under profiler
873
874
def profile_main(argv): 
875
    import tempfile, hotshot, hotshot.stats
876
877
    prof_f = tempfile.NamedTemporaryFile()
878
879
    prof = hotshot.Profile(prof_f.name)
880
881
    ret = prof.runcall(main, argv)
882
    prof.close()
883
884
    stats = hotshot.stats.load(prof_f.name)
885
    #stats.strip_dirs()
886
    stats.sort_stats('time')
887
    ## XXX: Might like to write to stderr or the trace file instead but
888
    ## print_stats seems hardcoded to stdout
889
    stats.print_stats(20)
890
            
891
    return ret
892
893
0.1.62 by Martin Pool
Lame command-line client for reading and writing weaves.
894
if __name__ == '__main__':
895
    import sys
896
    sys.exit(main(sys.argv))
1076 by Martin Pool
- add code to run weave utility under profiler
897