lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.

::

    MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
    WHOLE_NUMBER = {digit}, digit;
    REVISION_ID = a non-empty utf8 string;

    dirstate format = header line, full checksum, row count, parent details,
        ghost_details, entries;
    header line = "#bazaar dirstate flat format 3", NL;
    full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
    row count = "num_entries: ", WHOLE_NUMBER, NL;
    parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
    ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
    entries = {entry};
    entry = entry_key, current_entry_details, {parent_entry_details};
    entry_key = dirname, basename, fileid;
    current_entry_details = common_entry_details, working_entry_details;
    parent_entry_details = common_entry_details, history_entry_details;
    common_entry_details = MINIKIND, fingerprint, size, executable;
    working_entry_details = packed_stat;
    history_entry_details = REVISION_ID;
    fingerprint = a nonempty utf8 sequence with meaning defined by minikind.

Given this definition, the following is useful to know::

    entry (aka row) - all the data for a given key.
    entry[0]: The key (dirname, basename, fileid)
    entry[1]: The tree(s) data for this path and id combination.
    entry[1][0]: The current tree
    entry[1][1]: The second tree

For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::

    entry[1][0][0]: minikind
    entry[1][0][1]: fingerprint
    entry[1][0][2]: size
    entry[1][0][3]: executable
    entry[1][0][4]: packed_stat

or, for a parent tree::

    entry[1][1][4]: revision_id
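To make the indexing above concrete, here is a sketch of one in-memory entry; the file id, sha1 and packed-stat values are invented for illustration and only the tuple shape follows the description above.

```python
# A hypothetical entry for a file at the tree root; every concrete value
# below is made up - only the structure mirrors the docstring above.
entry = (
    (b'', b'hello.txt', b'file-id-1234'),  # entry[0]: the key
    [
        # entry[1][0]: the current tree -
        # (minikind, fingerprint, size, executable, packed_stat)
        (b'f', b'sha1-of-canonical-form', 12, False, b'xxxxxxxx'),
        # entry[1][1]: the second (parent) tree - the last field is the
        # revision_id instead of a packed_stat
        (b'f', b'sha1-of-canonical-form', 12, False, b'revid-1'),
    ],
)

assert entry[1][0][0] == b'f'        # minikind
assert entry[1][0][3] is False       # executable
assert entry[1][1][4] == b'revid-1'  # revision_id in the parent tree
```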
There may be multiple rows at the root, one per id present in the root, so the
in memory root row is now::

    self._dirblocks[0] -> ('', [entry ...]),

and the entries in there are::

    entries[0][2]: file_id
    entries[1][0]: The tree data for the current tree for this fileid at /

The minikind values are::

    b'r' is a relocated entry: This path is not present in this tree with this
        id, but the id can be found at another location. The fingerprint is
        used to point to the target location.
    b'a' is an absent entry: In that tree the id is not present at this path.
    b'd' is a directory entry: This path in this tree is a directory with the
        current file id. There is no fingerprint for directories.
    b'f' is a file entry: As for directory, but it's a file. The fingerprint is
        the sha1 value of the file's canonical form, i.e. after any read
        filters have been applied to the convenience form stored in the
        working tree.
    b'l' is a symlink entry: As for directory, but a symlink. The fingerprint
        is the link target.
    b't' is a reference to a nested subtree; the fingerprint is the referenced
        revision.
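The minikind letters can be summarised in a small mapping. This table is a hypothetical restatement of the descriptions above, not the class's own data structure (the real DirState keeps an equivalent `_minikind_to_kind` dict for the kinds that exist on disk).

```python
# Hypothetical summary of the minikind codes described above.
MINIKIND_MEANING = {
    b'f': 'file',
    b'd': 'directory',
    b'l': 'symlink',
    b't': 'tree-reference',
    b'a': 'absent',      # the id is not present at this path in this tree
    b'r': 'relocated',   # the fingerprint points at the real location
}

assert MINIKIND_MEANING[b'r'] == 'relocated'
```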
The entries on disk and in memory are ordered according to the following keys::

    directory, as a list of components

--- Format 1 had the following different definition: ---

::

    rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
        WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
        {PARENT ROW}
    PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
        basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
        SHA1
PARENT ROWs are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
each row will have a PARENT ROW for foo and baz, but not for bar.
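The ghost rule above can be sketched in a couple of lines (the parent names are the illustrative ones from the text):

```python
# Sketch of the rule above: PARENT ROWs are written only for parents that
# are not listed on the ghosts line.
parents = [b'foo', b'bar', b'baz']
ghosts = {b'bar'}
parent_rows = [p for p in parents if p not in ghosts]
assert parent_rows == [b'foo', b'baz']
```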
ERROR_DIRECTORY = 267


class DirstateCorrupt(errors.BzrError):

    _fmt = "The dirstate file (%(state)s) appears to be corrupt: %(msg)s"

    def __init__(self, state, msg):
        errors.BzrError.__init__(self)
        self.state = state
        self.msg = msg
if not getattr(struct, '_compile', None):
    # Cannot pre-compile the dirstate pack_stat
    def pack_stat(st, _encode=binascii.b2a_base64, _pack=struct.pack):
        """Convert stat values into a packed representation."""
        return _encode(_pack('>LLLLLL', st.st_size, int(st.st_mtime),
            int(st.st_ctime), st.st_dev, st.st_ino & 0xFFFFFFFF,
            st.st_mode))[:-1]
else:
    # compile the struct compiler we need, so as to only do it once
    from _struct import Struct
    _compiled_pack = Struct('>LLLLLL').pack

    def pack_stat(st, _encode=binascii.b2a_base64, _pack=_compiled_pack):
        """Convert stat values into a packed representation."""
        # jam 20060614 it isn't really worth removing more entries if we
        # are going to leave it in packed form.
        # With only st_mtime and st_mode filesize is 5.5M and read time is 275ms
        # With all entries, filesize is 5.9M and read time is maybe 280ms
        # well within the noise margin
        # base64 encoding always adds a final newline, so strip it off
        # The current version
        return _encode(_pack(st.st_size, int(st.st_mtime), int(st.st_ctime),
            st.st_dev, st.st_ino & 0xFFFFFFFF, st.st_mode))[:-1]
        # This is 0.060s / 1.520s faster by not encoding as much information
        # return _encode(_pack('>LL', int(st.st_mtime), st.st_mode))[:-1]
        # This is not strictly faster than _encode(_pack())[:-1]
        # return '%X.%X.%X.%X.%X.%X' % (
        #     st.st_size, int(st.st_mtime), int(st.st_ctime),
        #     st.st_dev, st.st_ino, st.st_mode)
        # Similar to the _encode(_pack('>LL'))
        # return '%X.%X' % (int(st.st_mtime), st.st_mode)
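A self-contained sketch of the same packing scheme follows. Unlike the code above it masks every field to 32 bits (not just `st_ino`), so the sketch cannot overflow `'>L'` on platforms with wide `st_dev` or `st_size`; the function name is invented to avoid clashing with the real `pack_stat`.

```python
import binascii
import os
import struct

# Six big-endian 32-bit fields, as in the '>LLLLLL' format above.
_pack = struct.Struct('>LLLLLL').pack

def pack_stat_sketch(st):
    """Pack six stat fields and base64 them, dropping the trailing newline."""
    return binascii.b2a_base64(_pack(
        st.st_size & 0xFFFFFFFF, int(st.st_mtime) & 0xFFFFFFFF,
        int(st.st_ctime) & 0xFFFFFFFF, st.st_dev & 0xFFFFFFFF,
        st.st_ino & 0xFFFFFFFF, st.st_mode & 0xFFFFFFFF))[:-1]

packed = pack_stat_sketch(os.stat('.'))
# 24 packed bytes always encode to exactly 32 base64 characters
assert len(packed) == 32
```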
class SHA1Provider(object):
    NOT_IN_MEMORY = 0
    IN_MEMORY_UNMODIFIED = 1
    IN_MEMORY_MODIFIED = 2
    IN_MEMORY_HASH_MODIFIED = 3  # Only hash-cache updates

    # A pack_stat (the x's) that is just noise and will never match the output
    # of base64 encode.
    NULLSTAT = b'x' * 32
    NULL_PARENT_DETAILS = static_tuple.StaticTuple(b'a', b'', 0, False, b'')

    HEADER_FORMAT_2 = b'#bazaar dirstate flat format 2\n'
    HEADER_FORMAT_3 = b'#bazaar dirstate flat format 3\n'

    def __init__(self, path, sha1_provider, worth_saving_limit=0,
                 use_filesystem_for_exec=True):
        """Create a DirState object.

        :param path: The path at which the dirstate file on disk should live.
        :param sha1_provider: an object meeting the SHA1Provider interface.
        :param worth_saving_limit: when the exact number of hash changed
            entries is known, only bother saving the dirstate if more than
            this count of entries have changed.
            -1 means never save hash changes, 0 means always save hash changes.
        :param use_filesystem_for_exec: Whether to trust the filesystem
            for executable bit information
        """
        # _header_state and _dirblock_state represent the current state
        # of the dirstate metadata and the per-row data respectively.
        self._last_block_index = None
        self._last_entry_index = None
        # The set of known hash changes
        self._known_hash_changes = set()
        # How many hash changed entries can we have without saving
        self._worth_saving_limit = worth_saving_limit
        self._config_stack = config.LocationStack(urlutils.local_path_to_url(
            path))
        self._use_filesystem_for_exec = use_filesystem_for_exec

    def __repr__(self):
        return "%s(%r)" % \
            (self.__class__.__name__, self._filename)
    def _mark_modified(self, hash_changed_entries=None, header_modified=False):
        """Mark this dirstate as modified.

        :param hash_changed_entries: if non-None, mark just these entries as
            having their hash modified.
        :param header_modified: mark the header modified as well, not just the
            dirblocks.
        """
        #trace.mutter_callsite(3, "modified hash entries: %s", hash_changed_entries)
        if hash_changed_entries:
            self._known_hash_changes.update(
                [e[0] for e in hash_changed_entries])
            if self._dirblock_state in (DirState.NOT_IN_MEMORY,
                                        DirState.IN_MEMORY_UNMODIFIED):
                # If the dirstate is already marked as IN_MEMORY_MODIFIED, then
                # that takes precedence.
                self._dirblock_state = DirState.IN_MEMORY_HASH_MODIFIED
        else:
            # TODO: Since we now have an IN_MEMORY_HASH_MODIFIED state, we
            #       should fail noisily if someone tries to set
            #       IN_MEMORY_MODIFIED but we don't have a write-lock!
            # We don't know exactly what changed so disable smart saving
            self._dirblock_state = DirState.IN_MEMORY_MODIFIED
        if header_modified:
            self._header_state = DirState.IN_MEMORY_MODIFIED
    def _mark_unmodified(self):
        """Mark this dirstate as unmodified."""
        self._header_state = DirState.IN_MEMORY_UNMODIFIED
        self._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
        self._known_hash_changes = set()
    def add(self, path, file_id, kind, stat, fingerprint):
        """Add a path to be tracked.

        :param path: The path within the dirstate - b'' is the root, 'foo'
            is the path foo within the root, 'foo/bar' is the path bar
            within foo within the root.
        :param file_id: The file id of the path being added.
        """
        utf8path = (dirname + '/' + basename).strip('/').encode('utf8')
        dirname, basename = osutils.split(utf8path)
        # uses __class__ for speed; the check is needed for safety
        if file_id.__class__ is not bytes:
            raise AssertionError(
                "must be a utf8 file_id not %s" % (type(file_id), ))
        # Make sure the file_id does not exist in this tree
        rename_from = None
        file_id_entry = self._get_entry(
            0, fileid_utf8=file_id, include_deleted=True)
        if file_id_entry != (None, None):
            if file_id_entry[1][0][0] == b'a':
                if file_id_entry[0] != (dirname, basename, file_id):
                    # set the old name's current operation to rename
                    self.update_minimal(file_id_entry[0],
                        b'r',
                        path_utf8=b'',
                        packed_stat=b'',
                        fingerprint=utf8path
                    )
                    rename_from = file_id_entry[0][0:2]
            else:
                path = osutils.pathjoin(
                    file_id_entry[0][0], file_id_entry[0][1])
                kind = DirState._minikind_to_kind[file_id_entry[1][0][0]]
                info = '%s:%s' % (kind, path)
                raise errors.DuplicateFileId(file_id, info)
        first_key = (dirname, basename, b'')
        block_index, present = self._find_block_index_from_key(first_key)
        if present:
            # check the path is not in the tree
            block = self._dirblocks[block_index][1]
            entry_index, _ = self._find_entry_index(first_key, block)
            while (entry_index < len(block) and
                   block[entry_index][0][0:2] == first_key[0:2]):
                if block[entry_index][1][0][0] not in (b'a', b'r'):
                    # this path is in the dirstate in the current tree.
                    raise Exception("adding already added path!")
                entry_index += 1
        else:
            # The block where we want to put the file is not present. But it
                                     fingerprint, new_child_path)
        self._check_delta_ids_absent(new_ids, delta, 0)
        try:
            self._apply_removals(viewitems(removals))
            self._apply_insertions(viewvalues(insertions))
            # Validate parents
            self._after_delta_check_parents(parents, 0)
        except errors.BzrError as e:
            self._changes_aborted = True
            if 'integrity error' not in str(e):
                raise
            # _get_entry raises BzrError when a request is inconsistent; we
            # want such errors to be shown as InconsistentDelta - and that
            # fits the behaviour we trigger.
            raise errors.InconsistentDeltaDelta(delta,
                "error from _get_entry. %s" % (e,))
    def _apply_removals(self, removals):
        for file_id, path in sorted(removals, reverse=True,
                                    key=operator.itemgetter(1)):
            dirname, basename = osutils.split(path)
            block_i, entry_i, d_present, f_present = \
                self._get_block_entry_index(dirname, basename, 0)
            try:
                entry = self._dirblocks[block_i][1][entry_i]
            except IndexError:
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if not f_present or entry[1][0][0] in (b'a', b'r'):
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if file_id != entry[0][2]:
                self._raise_invalid(path, file_id,
                    "Attempt to remove path has wrong id - found %r."
                    % entry[0][2])
            self._make_absent(entry)
            # See if we have a malformed delta: deleting a directory must not
            # leave crud behind. This increases the number of bisects needed
        new_ids = set()
        for old_path, new_path, file_id, inv_entry in delta:
            if file_id.__class__ is not bytes:
                raise AssertionError(
                    "must be a utf8 file_id not %s" % (type(file_id), ))
            if inv_entry is not None and file_id != inv_entry.file_id:
                self._raise_invalid(new_path, file_id,
                    "mismatched entry file_id %r" % inv_entry)
            if new_path is None:
                new_path_utf8 = None
            else:
                if inv_entry is None:
                    self._raise_invalid(new_path, file_id,
                        "new_path with no entry")
                new_path_utf8 = encode(new_path)
                # note the parent for validation
                dirname_utf8, basename_utf8 = osutils.split(new_path_utf8)
                if basename_utf8:
                    parents.add((dirname_utf8, inv_entry.parent_id))
            if old_path is None:
                old_path_utf8 = None
            else:
                old_path_utf8 = encode(old_path)
            if old_path is None:
                adds.append((None, new_path_utf8, file_id,
                    inv_to_entry(inv_entry), True))
                new_ids.add(file_id)
            elif new_path is None:
                deletes.append((old_path_utf8, None, file_id, None, True))
            elif (old_path, new_path) == root_only:
                # change things in-place
                # Note: the case of a parent directory changing its file_id
                #       tends to break optimizations here, because officially
                #       the file has actually been moved, it just happens to
                #       end up at the same path. If we can figure out how to
                #       handle that case, we can avoid a lot of add+delete
                #       pairs for objects that stay put.
                # elif old_path == new_path:
                changes.append((old_path_utf8, new_path_utf8, file_id,
                    inv_to_entry(inv_entry)))
            else:
                # Because renames must preserve their children we must have
                # processed all relocations and removes before hand. The sort
                # order ensures we've examined the child paths, but we also
                # have to execute the removals, or the split to an add/delete
                # pair will result in the deleted item being reinserted, or
                # renamed items being reinserted twice - and possibly at the
                # wrong place. Splitting into a delete/add pair also simplifies
                # the handling of entries with (b'f', ...), (b'r' ...) because
                # the target of the b'r' is old_path here, and we add that to
                # deletes, meaning that the add handler does not need to check
                # for b'r' items on every pass.
                self._update_basis_apply_deletes(deletes)
                deletes = []
                # Split into an add/delete pair recursively.
                adds.append((old_path_utf8, new_path_utf8, file_id,
                    inv_to_entry(inv_entry), False))
                # Expunge deletes that we've seen so that deleted/renamed
                # children of a rename directory are handled correctly.
                new_deletes = reversed(list(
                    self._iter_child_entries(1, old_path_utf8)))
                # Remove the current contents of the tree at orig_path, and
                # reinsert at the correct new path.
                for entry in new_deletes:
                    child_dirname, child_basename, child_file_id = entry[0]
                    if child_dirname:
                        source_path = child_dirname + b'/' + child_basename
                    else:
                        source_path = child_basename
                    if new_path_utf8:
                        target_path = \
                            new_path_utf8 + source_path[len(old_path_utf8):]
                    else:
                        if old_path_utf8 == b'':
                            raise AssertionError("cannot rename directory to"
                                                 " itself")
                        target_path = source_path[len(old_path_utf8) + 1:]
                    adds.append(
                        (None, target_path, entry[0][2], entry[1][1], False))
                    deletes.append(
                        (source_path, target_path, entry[0][2], None, False))
                deletes.append(
                    (old_path_utf8, new_path_utf8, file_id, None, False))

        self._check_delta_ids_absent(new_ids, delta, 1)
        # Finish expunging deletes/first half of renames.
        # Adds are accumulated partly from renames, so can be in any input
        # order - sort it.
        # TODO: we may want to sort in dirblocks order. That way each entry
        #       will end up in the same directory, allowing the _get_entry
        #       fast-path for looking up 2 items in the same dir work.
        adds.sort(key=lambda x: x[1])
        # adds is now in lexicographic order, which places all parents before
        # their children, so we can process it linearly.
        st = static_tuple.StaticTuple
        for old_path, new_path, file_id, new_details, real_add in adds:
            dirname, basename = osutils.split(new_path)
            entry_key = st(dirname, basename, file_id)
            block_index, present = self._find_block_index_from_key(entry_key)
            if not present:
                # The block where we want to put the file is not present.
                # However, it might have just been an empty directory. Look for
                # the parent in the basis-so-far before throwing an error.
                parent_dir, parent_base = osutils.split(dirname)
                parent_block_idx, parent_entry_idx, _, parent_present = \
                    self._get_block_entry_index(parent_dir, parent_base, 1)
                if not parent_present:
                    self._raise_invalid(new_path, file_id,
                        "Unable to find block for this record."
                        " Was the parent added?")
                self._ensure_block(parent_block_idx, parent_entry_idx, dirname)

            block = self._dirblocks[block_index][1]
            entry_index, present = self._find_entry_index(entry_key, block)
            if real_add:
                if old_path is not None:
                    self._raise_invalid(new_path, file_id,
                        'considered a real add but still had old_path at %s'
                        % (old_path,))
            if present:
                entry = block[entry_index]
                basis_kind = entry[1][1][0]
                if basis_kind == b'a':
                    entry[1][1] = new_details
                elif basis_kind == b'r':
                    raise NotImplementedError()
                else:
                    self._raise_invalid(new_path, file_id,
                        "An entry was marked as a new add"
                        " but the basis target already existed")
            else:
                # The exact key was not found in the block. However, we need to
                # check if there is a key next to us that would have matched.
                # We only need to check 2 locations, because there are only 2
                # entries that can match our key.
                for maybe_index in range(entry_index - 1, entry_index + 1):
                    if maybe_index < 0 or maybe_index >= len(block):
                        continue
                    maybe_entry = block[maybe_index]
                    if maybe_entry[0][:2] != (dirname, basename):
                        # Just a random neighbor
                        continue
                    if maybe_entry[0][2] == file_id:
                        raise AssertionError(
                            '_find_entry_index didn\'t find a key match'
                            ' but walking the data did, for %s'
                            % (entry_key,))
                    basis_kind = maybe_entry[1][1][0]
                    if basis_kind not in (b'a', b'r'):
                        self._raise_invalid(new_path, file_id,
                            "we have an add record for path, but the path"
                            " is already present with another file_id %s"
                            % (maybe_entry[0][2],))

                entry = (entry_key, [DirState.NULL_PARENT_DETAILS,
                                     new_details])
                block.insert(entry_index, entry)

            active_kind = entry[1][0][0]
            if active_kind == b'a':
                # The active record shows up as absent, this could be genuine,
                # or it could be present at some other location. We need to
                # verify.
                id_index = self._get_id_index()
                # The id_index may not be perfectly accurate for tree1, because
                # we haven't been keeping it updated. However, it should be
                # fine for tree0, and that gives us enough info for what we
                # need.
                keys = id_index.get(file_id, ())
                for key in keys:
                    block_i, entry_i, d_present, f_present = \
                        self._get_block_entry_index(key[0], key[1], 0)
                    if not f_present:
                        continue
                    active_entry = self._dirblocks[block_i][1][entry_i]
                    if (active_entry[0][2] != file_id):
                        # Some other file is at this path, we don't need to
                        # link it.
                        continue
                    real_active_kind = active_entry[1][0][0]
                    if real_active_kind in (b'a', b'r'):
                        # We found a record, which was not *this* record,
                        # which matches the file_id, but is not actually
                        # present. Something seems *really* wrong.
                        self._raise_invalid(new_path, file_id,
                            "We found a tree0 entry that doesn't make sense")
                    # Now, we've found a tree0 entry which matches the file_id
                    # but is at a different location. So update them to be
                    # rename records.
                    active_dir, active_name = active_entry[0][:2]
                    if active_dir:
                        active_path = active_dir + b'/' + active_name
                    else:
                        active_path = active_name
                    active_entry[1][1] = st(b'r', new_path, 0, False, b'')
                    entry[1][0] = st(b'r', active_path, 0, False, b'')
            elif active_kind == b'r':
                raise NotImplementedError()

            new_kind = new_details[0]
            if new_kind == b'd':
                self._ensure_block(block_index, entry_index, new_path)
    def _update_basis_apply_changes(self, changes):
        """Apply a sequence of changes to tree 1 during update_basis_by_delta.
        null = DirState.NULL_PARENT_DETAILS
        for old_path, new_path, file_id, _, real_delete in deletes:
            if real_delete != (new_path is None):
                self._raise_invalid(old_path, file_id, "bad delete delta")
            # the entry for this file_id must be in tree 1.
            dirname, basename = osutils.split(old_path)
            block_index, entry_index, dir_present, file_present = \
                self._get_block_entry_index(dirname, basename, 1)
            if not file_present:
                self._raise_invalid(old_path, file_id,
                    'basis tree does not contain removed entry')
            entry = self._dirblocks[block_index][1][entry_index]
            # The state of the entry in the 'active' WT
            active_kind = entry[1][0][0]
            if entry[0][2] != file_id:
                self._raise_invalid(old_path, file_id,
                    'mismatched file_id in tree 1')
            old_kind = entry[1][1][0]
            if active_kind in b'ar':
                # The active tree doesn't have this file_id.
                # The basis tree is changing this record. If this is a
                # rename, then we don't want the record here at all
                # anymore. If it is just an in-place change, we want the
                # record here, but we'll add it if we need to. So we just
                # delete it
                if active_kind == b'r':
                    active_path = entry[1][0][1]
                    active_entry = self._get_entry(0, file_id, active_path)
                    if active_entry[1][1][0] != b'r':
                        self._raise_invalid(old_path, file_id,
                            "Dirstate did not have matching rename entries")
                    elif active_entry[1][0][0] in b'ar':
                        self._raise_invalid(old_path, file_id,
                            "Dirstate had a rename pointing at an inactive"
                            " tree0")
                    active_entry[1][1] = null
                del self._dirblocks[block_index][1][entry_index]
                if old_kind == b'd':
                    # This was a directory, and the active tree says it
                    # doesn't exist, and now the basis tree says it doesn't
                    # exist. Remove its dirblock if present
                    (dir_block_index,
                     present) = self._find_block_index_from_key(
                        (old_path, b'', b''))
                    if present:
                        dir_block = self._dirblocks[dir_block_index][1]
                        if not dir_block:
                            # This entry is empty, go ahead and just remove it
                            del self._dirblocks[dir_block_index]
            else:
                # There is still an active record, so just mark this
                # removed.
                entry[1][1] = null
                block_i, entry_i, d_present, f_present = \
                    self._get_block_entry_index(old_path, b'', 1)
                if d_present:
                    dir_block = self._dirblocks[block_i][1]
                    for child_entry in dir_block:
                        child_basis_kind = child_entry[1][1][0]
                        if child_basis_kind not in b'ar':
                            self._raise_invalid(old_path, file_id,
                                "The file id was deleted but its children were "
                                "not deleted.")
    def _after_delta_check_parents(self, parents, index):
        """Check that parents required by the delta are all intact.

        :param parents: An iterable of (path_utf8, file_id) tuples which are
            required to be present in tree 'index' at path_utf8 with id file_id
            and be a directory.
        tree present there.
        """
        self._read_dirblocks_if_needed()
        key = dirname, basename, b''
        block_index, present = self._find_block_index_from_key(key)
        if not present:
            # no such directory - return the dir index and 0 for the row.
            return block_index, 0, False, False
        block = self._dirblocks[block_index][1]  # access the entries only
        entry_index, present = self._find_entry_index(key, block)
        # linear search through entries at this path to find the one
        # requested.
        while entry_index < len(block) and block[entry_index][0][1] == basename:
            if block[entry_index][1][tree_index][0] not in (b'a', b'r'):
                # neither absent nor relocated
                return block_index, entry_index, True, True
            entry_index += 1
        return block_index, entry_index, True, False
    def _get_entry(self, tree_index, fileid_utf8=None, path_utf8=None,
                   include_deleted=False):
        """Get the dirstate entry for path in tree tree_index.

        If either file_id or path is supplied, it is used as the key to lookup.
    def _get_id_index(self):
        """Get an id index of self._dirblocks.

        This maps from file_id => [(directory, name, file_id)] entries where
        that file_id appears in one of the trees.
        """
        if self._id_index is None:
            id_index = {}
            for key, tree_details in self._iter_entries():
                self._add_to_id_index(id_index, key)
            self._id_index = id_index
        return self._id_index
2350
    def _add_to_id_index(self, id_index, entry_key):
        """Add this entry to the _id_index mapping."""
        # This code used to use a set for every entry in the id_index. However,
        # it is *rare* to have more than one entry. So a set is a large
        # overkill. And even when we do, we won't ever have more than the
        # number of parent trees. Which is still a small number (rarely >2). As
        # such, we use a simple tuple, and do our own uniqueness checks. While
        # the 'in' check is O(N) since N is nicely bounded it shouldn't ever
        # cause quadratic failure.
        file_id = entry_key[2]
        entry_key = static_tuple.StaticTuple.from_sequence(entry_key)
        if file_id not in id_index:
            id_index[file_id] = static_tuple.StaticTuple(entry_key,)
        else:
            entry_keys = id_index[file_id]
            if entry_key not in entry_keys:
                id_index[file_id] = entry_keys + (entry_key,)
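A minimal standalone sketch of the tuple-growing scheme the comment above describes, using plain tuples in place of breezy's `static_tuple.StaticTuple`; the helper name `add_to_id_index` here is hypothetical:

```python
def add_to_id_index(id_index, entry_key):
    """Grow the per-file_id tuple of entry keys, skipping duplicates."""
    file_id = entry_key[2]
    if file_id not in id_index:
        id_index[file_id] = (entry_key,)
        return
    entry_keys = id_index[file_id]
    if entry_key not in entry_keys:  # O(N), but N is bounded by the tree count
        id_index[file_id] = entry_keys + (entry_key,)


idx = {}
add_to_id_index(idx, (b'', b'a', b'id-1'))
add_to_id_index(idx, (b'', b'a', b'id-1'))     # duplicate: ignored
add_to_id_index(idx, (b'dir', b'a', b'id-1'))  # second location: appended
```

The tuple is rebuilt on every append, which is only acceptable because the number of keys per file id stays tiny.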
    def _remove_from_id_index(self, id_index, entry_key):
        """Remove this entry from the _id_index mapping.

        It is a programming error to call this when the entry_key is not
        already present.
        """
        file_id = entry_key[2]
        entry_keys = list(id_index[file_id])
        entry_keys.remove(entry_key)
        id_index[file_id] = static_tuple.StaticTuple.from_sequence(entry_keys)
    def _get_output_lines(self, lines):
        """Format lines for final output."""
        output_lines = [DirState.HEADER_FORMAT_3]
        lines.append(b'')  # a final newline
        inventory_text = b'\0\n\0'.join(lines)
        output_lines.append(b'crc32: %d\n' % (zlib.crc32(inventory_text),))
        # -3, 1 for num parents, 1 for ghosts, 1 for final newline
        num_entries = len(lines) - 3
        output_lines.append(b'num_entries: %d\n' % (num_entries,))
        output_lines.append(inventory_text)
        return output_lines
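As a rough standalone sketch of the framing `_get_output_lines` produces (the header literal comes from the format grammar; the crc32 runs over the NUL-framed payload, and the entry count excludes the parents line, the ghosts line, and the final newline). The function name `frame_dirstate` is hypothetical:

```python
import zlib

HEADER = b'#bazaar dirstate flat format 3\n'


def frame_dirstate(lines):
    """Frame payload lines the way the method above does."""
    payload = list(lines) + [b'']            # a final newline
    inventory_text = b'\0\n\0'.join(payload)
    return [HEADER,
            b'crc32: %d\n' % (zlib.crc32(inventory_text),),
            # -3: 1 for parents, 1 for ghosts, 1 for the final newline
            b'num_entries: %d\n' % (len(payload) - 3,),
            inventory_text]


framed = frame_dirstate([b'0', b'0', b'entry-row'])
```

Note that on Python 3 `zlib.crc32` always returns an unsigned value, so the `["-"]` allowed by the grammar's checksum rule only arises from files written by Python 2.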
    def _make_deleted_row(self, fileid_utf8, parents):
        """Return a deleted row for fileid_utf8."""
        return (b'/', b'RECYCLED.BIN', b'file', fileid_utf8, 0, DirState.NULLSTAT,
                b''), parents

    def _num_present_parents(self):
        """The number of parent entries in each record row."""
        return len(self._parents) - len(self._ghosts)
    @classmethod
    def on_file(cls, path, sha1_provider=None, worth_saving_limit=0,
                use_filesystem_for_exec=True):
        """Construct a DirState on the file at path "path".

        :param path: The path at which the dirstate file on disk should live.
        :param sha1_provider: an object meeting the SHA1Provider interface.
            If None, a DefaultSHA1Provider is used.
        :param worth_saving_limit: when the exact number of hash changed
            entries is known, only bother saving the dirstate if more than
            this count of entries have changed. -1 means never save.
        :param use_filesystem_for_exec: Whether to trust the filesystem
            for executable bit information
        :return: An unlocked DirState object, associated with the given path.
        """
        if sha1_provider is None:
            sha1_provider = DefaultSHA1Provider()
        result = cls(path, sha1_provider,
                     worth_saving_limit=worth_saving_limit,
                     use_filesystem_for_exec=use_filesystem_for_exec)
        return result
    def _read_dirblocks_if_needed(self):
        ...
            raise errors.BzrError(
                'invalid header line: %r' % (header,))
        crc_line = self._state_file.readline()
        if not crc_line.startswith(b'crc32: '):
            raise errors.BzrError('missing crc32 checksum: %r' % crc_line)
        self.crc_expected = int(crc_line[len(b'crc32: '):-1])
        num_entries_line = self._state_file.readline()
        if not num_entries_line.startswith(b'num_entries: '):
            raise errors.BzrError('missing num_entries line')
        self._num_entries = int(num_entries_line[len(b'num_entries: '):-1])
    def sha1_from_stat(self, path, stat_result):
        """Find a sha1 given a stat lookup."""
        return self._get_packed_stat_index().get(pack_stat(stat_result), None)

    def _get_packed_stat_index(self):
        """Get a packed_stat index of self._dirblocks."""
        if self._packed_stat_index is None:
            index = {}
            for key, tree_details in self._iter_entries():
                if tree_details[0][0] == b'f':
                    index[tree_details[0][4]] = tree_details[0][1]
            self._packed_stat_index = index
        return self._packed_stat_index
            # Should this be a warning? For now, I'm expecting that places that
            # mark it inconsistent will warn, making a warning here redundant.
            trace.mutter('Not saving DirState because '
                         '_changes_aborted is set.')
            return
        # TODO: Since we now distinguish IN_MEMORY_MODIFIED from
        #       IN_MEMORY_HASH_MODIFIED, we should only fail quietly if we fail
        #       to save an IN_MEMORY_HASH_MODIFIED, and fail *noisily* if we
        #       fail to save IN_MEMORY_MODIFIED
        if not self._worth_saving():
            return

        grabbed_write_lock = False
        if self._lock_state != 'w':
            grabbed_write_lock, new_lock = self._lock_token.temporary_write_lock()
            # Switch over to the new lock, as the old one may be closed.
            # TODO: jam 20070315 We should validate the disk file has
            #       not changed contents, since temporary_write_lock may
            #       not be an atomic operation.
            self._lock_token = new_lock
            self._state_file = new_lock.f
            if not grabbed_write_lock:
                # We couldn't grab a write lock, so we switch back to a read one
                return
        try:
            lines = self.get_lines()
            self._state_file.seek(0)
            self._state_file.writelines(lines)
            self._state_file.truncate()
            self._state_file.flush()
            self._maybe_fdatasync()
            self._mark_unmodified()
        finally:
            if grabbed_write_lock:
                self._lock_token = self._lock_token.restore_read_lock()
                self._state_file = self._lock_token.f
                # TODO: jam 20070315 We should validate the disk file has
                #       not changed contents. Since restore_read_lock may
                #       not be an atomic operation.
    def _maybe_fdatasync(self):
        """Flush to disk if possible and if not configured off."""
        if self._config_stack.get('dirstate.fdatasync'):
            osutils.fdatasync(self._state_file.fileno())

    def _worth_saving(self):
        """Is it worth saving the dirstate or not?"""
        if (self._header_state == DirState.IN_MEMORY_MODIFIED
                or self._dirblock_state == DirState.IN_MEMORY_MODIFIED):
            return True
        if self._dirblock_state == DirState.IN_MEMORY_HASH_MODIFIED:
            if self._worth_saving_limit == -1:
                # We never save hash changes when the limit is -1
                return False
            # If we're using smart saving and only a small number of
            # entries have changed their hash, don't bother saving. John has
            # suggested using a heuristic here based on the size of the
            # changed files and/or tree. For now, we go with a configurable
            # number of changes, keeping the calculation time
            # as low overhead as possible. (This also keeps all existing
            # tests passing as the default is 0, i.e. always save.)
            if len(self._known_hash_changes) >= self._worth_saving_limit:
                return True
        return False
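The save policy above can be sketched in isolation; the constants and parameter names below are stand-ins for DirState's real `IN_MEMORY_*` states and `_worth_saving_limit`/`_known_hash_changes` attributes:

```python
IN_MEMORY_UNMODIFIED, IN_MEMORY_MODIFIED, IN_MEMORY_HASH_MODIFIED = range(3)


def worth_saving(header_state, dirblock_state, known_hash_changes,
                 worth_saving_limit):
    """Mirror the decision logic described above."""
    if (header_state == IN_MEMORY_MODIFIED
            or dirblock_state == IN_MEMORY_MODIFIED):
        return True                  # structural changes: always persist
    if dirblock_state == IN_MEMORY_HASH_MODIFIED:
        if worth_saving_limit == -1:
            return False             # never persist pure hash updates
        # persist only once enough cached hashes have changed
        return len(known_hash_changes) >= worth_saving_limit
    return False
```

With the default limit of 0 every hash change is saved, which preserves the old behaviour; raising the limit trades dirstate rewrites for repeated re-hashing on later reads.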
    def _set_data(self, parent_ids, dirblocks):
        """Set the full dirstate data in memory.
        """
        ...

                        # mapping from path,id. We need to look up the correct path
                        # for the indexes from 0 to tree_index -1
                        new_details = []
                        for lookup_index in range(tree_index):
                            # boundary case: this is the first occurrence of file_id
                            # so there are no id_indexes, possibly take this out of
                            # the loop?
                            if not len(entry_keys):
                                new_details.append(DirState.NULL_PARENT_DETAILS)
                            else:
                                # grab any one entry, use it to find the right path.
                                a_key = next(iter(entry_keys))
                                if by_path[a_key][lookup_index][0] in (b'r', b'a'):
                                    # it's a pointer or missing statement, use it
                                    # as is.
                                    new_details.append(
                                        by_path[a_key][lookup_index])
                                else:
                                    # we have the right key, make a pointer to it.
                                    real_path = (b'/'.join(a_key[0:2])).strip(b'/')
                                    new_details.append(st(b'r', real_path, 0, False,
                                                          b''))
                        new_details.append(self._inv_entry_to_details(entry))
                        new_details.extend(new_location_suffix)
                        by_path[new_entry_key] = new_details
                        self._add_to_id_index(id_index, new_entry_key)
            # --- end generation of full tree mappings

            # sort and output all the entries
            new_entries = self._sort_entries(viewitems(by_path))
            self._entries_to_current_state(new_entries)
            self._parents = [rev_id for rev_id, tree in trees]
            self._ghosts = list(ghosts)
            self._mark_modified(header_modified=True)
            self._id_index = id_index
    def _sort_entries(self, entry_list):
        ...

                # the minimal required trigger is if the execute bit or cached
                # kind has changed.
                if (current_old[1][0][3] != current_new[1].executable or
                        current_old[1][0][0] != current_new_minikind):
                    if tracing:
                        trace.mutter("Updating in-place change '%s'.",
                                     new_path_utf8.decode('utf8'))
                    self.update_minimal(current_old[0], current_new_minikind,
                                        executable=current_new[1].executable,
                                        path_utf8=new_path_utf8, fingerprint=fingerprint,
                                        fullscan=True)
                # both sides are dealt with, move on
                current_old = advance(old_iterator)
                current_new = advance(new_iterator)
            elif (lt_by_dirs(new_dirname, current_old[0][0])
                  or (new_dirname == current_old[0][0] and
                      new_entry_key[1:] < current_old[0][1:])):
                # new comes before:
                # add an entry for this and advance new
                if tracing:
                    trace.mutter("Inserting from new '%s'.",
                                 new_path_utf8.decode('utf8'))
                self.update_minimal(new_entry_key, current_new_minikind,
                                    executable=current_new[1].executable,
                                    path_utf8=new_path_utf8, fingerprint=fingerprint,
                                    fullscan=True)
                current_new = advance(new_iterator)
            else:
                # we've advanced past the place where the old key would be,
                # without seeing it in the new list. so it must be gone.
                if tracing:
                    trace.mutter("Deleting from old '%s/%s'.",
                                 current_old[0][0].decode('utf8'),
                                 current_old[0][1].decode('utf8'))
                self._make_absent(current_old)
                current_old = advance(old_iterator)
        self._mark_modified()
        self._id_index = None
        self._packed_stat_index = None
        if tracing:
            trace.mutter("set_state_from_inventory complete.")
    def set_state_from_scratch(self, working_inv, parent_trees, parent_ghosts):
        """Wipe the currently stored state and set it to something new.

        This is a hard-reset for the data we are working with.
        """
        # Technically, we really want a write lock, but until we write, we
        # don't really need it.
        self._requires_lock()
        # root dir and root dir contents with no children. We have to have a
        # root for set_state_from_inventory to work correctly.
        empty_root = ((b'', b'', inventory.ROOT_ID),
                      [(b'd', b'', 0, False, DirState.NULLSTAT)])
        empty_tree_dirblocks = [(b'', [empty_root]), (b'', [])]
        self._set_data([], empty_tree_dirblocks)
        self.set_state_from_inventory(working_inv)
        self.set_parent_trees(parent_trees, parent_ghosts)
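The empty dirblock layout built above looks like this when sketched with placeholder constants (`ROOT_ID` and `NULLSTAT` stand in for the real `inventory.ROOT_ID` and `DirState.NULLSTAT` values):

```python
ROOT_ID = b'TREE_ROOT'  # placeholder for inventory.ROOT_ID
NULLSTAT = b'x' * 32    # placeholder packed_stat value

# entry key is (dirname, basename, file_id); tree-0 details say "directory"
empty_root = ((b'', b'', ROOT_ID),
              [(b'd', b'', 0, False, NULLSTAT)])
empty_tree_dirblocks = [
    (b'', [empty_root]),  # the block holding the root entry itself
    (b'', []),            # the root directory's (empty) contents block
]
```

Both blocks must exist even when the tree is empty, because `set_state_from_inventory` walks blocks by directory name and expects the root's contents block to be present.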
    def _make_absent(self, current_old):
        """Mark current_old - an entry - as absent for tree 0.
        """
        ...
                update_block_index, present = \
                    self._find_block_index_from_key(update_key)
                if not present:
                    raise AssertionError(
                        'could not find block for %s' % (update_key,))
                update_entry_index, present = \
                    self._find_entry_index(
                        update_key, self._dirblocks[update_block_index][1])
                if not present:
                    raise AssertionError(
                        'could not find entry for %s' % (update_key,))
                update_tree_details = self._dirblocks[update_block_index][1][update_entry_index][1]
                # it must not be absent at the moment
                if update_tree_details[0][0] == b'a':  # absent
                    raise AssertionError('bad row %r' % (update_tree_details,))
                update_tree_details[0] = DirState.NULL_PARENT_DETAILS
        self._mark_modified()
        return last_reference
    def update_minimal(self, key, minikind, executable=False, fingerprint=b'',
                       packed_stat=None, size=0, path_utf8=None, fullscan=False):
        """Update an entry to the state in tree 0.

        This will either create a new entry at 'key' or update an existing one.
        """
                    update_block_index, present = \
                        self._find_block_index_from_key(other_key)
                    if not present:
                        raise AssertionError(
                            'could not find block for %s' % (other_key,))
                    update_entry_index, present = \
                        self._find_entry_index(
                            other_key, self._dirblocks[update_block_index][1])
                    if not present:
                        raise AssertionError(
                            'update_minimal: could not find entry for %s' % (other_key,))
                    update_details = self._dirblocks[update_block_index][1][update_entry_index][1][lookup_index]
                    if update_details[0] in (b'a', b'r'):  # relocated, absent
                        # it's a pointer or absent in lookup_index's tree, use
                        # it as is.
                        new_entry[1].append(update_details)
                    else:
                        # we have the right key, make a pointer to it.
                        pointer_path = osutils.pathjoin(*other_key[0:2])
                        new_entry[1].append(
                            (b'r', pointer_path, 0, False, b''))
                block.insert(entry_index, new_entry)
                self._add_to_id_index(id_index, key)
        else:
            # Does the new state matter?
            block[entry_index][1][0] = new_details
            ...
                # other trees, so put absent pointers there
                # This is the vertical axis in the matrix, all pointing
                # to the real path.
                block_index, present = self._find_block_index_from_key(
                    entry_key)
                if not present:
                    raise AssertionError('not present: %r', entry_key)
                entry_index, present = self._find_entry_index(
                    entry_key, self._dirblocks[block_index][1])
                if not present:
                    raise AssertionError('not present: %r', entry_key)
                self._dirblocks[block_index][1][entry_index][1][0] = \
                    (b'r', path_utf8, 0, False, b'')
        # add a containing dirblock if needed.
        if new_details[0] == b'd':
            # GZ 2017-06-09: Using pathjoin why?
            subdir_key = (osutils.pathjoin(*key[0:2]), b'', b'')
            block_index, present = self._find_block_index_from_key(subdir_key)
            if not present:
                self._dirblocks.insert(block_index, (subdir_key[0], []))

        self._mark_modified()
    def _maybe_remove_row(self, block, index, id_index):
        """Remove index if it is absent or relocated across the row.

        id_index is updated accordingly.
        :return: True if we removed the row, False otherwise
        """
        present_in_row = False
        entry = block[index]
        for column in entry[1]:
            if column[0] not in (b'a', b'r'):
                present_in_row = True
                break
        if not present_in_row:
            block.pop(index)
            self._remove_from_id_index(id_index, entry[0])
            return True
        return False
    def _validate(self):
        """Check that invariants on the dirblock are correct.
        """
        ...
        # We check this with a dict per tree pointing either to the present
        # name, or None if absent.
        tree_count = self._num_present_parents() + 1
        id_path_maps = [{} for _ in range(tree_count)]
        # Make sure that all renamed entries point to the correct location.
        for entry in self._iter_entries():
            file_id = entry[0][2]
            this_path = osutils.pathjoin(entry[0][0], entry[0][1])
            if len(entry[1]) != tree_count:
                raise AssertionError(
                    "wrong number of entry details for row\n%s"
                    ",\nexpected %d" %
                    (pformat(entry), tree_count))
            absent_positions = 0
            for tree_index, tree_state in enumerate(entry[1]):
                this_tree_map = id_path_maps[tree_index]
                minikind = tree_state[0]
                if minikind in (b'a', b'r'):
                    absent_positions += 1
                # have we seen this id before in this column?
                if file_id in this_tree_map:
                    previous_path, previous_loc = this_tree_map[file_id]
                    # any later mention of this file must be consistent with
                    # what was said before
                    if minikind == b'a':
                        if previous_path is not None:
                            raise AssertionError(
                                "file %s is absent in row %r but also present "
                                "at %r" %
                                (file_id.decode('utf-8'), entry, previous_path))
                    elif minikind == b'r':
                        target_location = tree_state[1]
                        if previous_path != target_location:
                            raise AssertionError(
                                "file %s relocation in row %r but also at %r"
                                % (file_id, entry, previous_path))
                    else:
                        # a file, directory, etc - may have been previously
                        # pointed to by a relocation, which must point here
                # are calculated at the same time, so checking just the size
                # gains nothing w.r.t. performance.
                link_or_sha1 = state._sha1_file(abspath)
                entry[1][0] = (b'f', link_or_sha1, stat_value.st_size,
                               executable, packed_stat)
            else:
                entry[1][0] = (b'f', b'', stat_value.st_size,
                               executable, DirState.NULLSTAT)
                worth_saving = False
        elif minikind == b'd':
            link_or_sha1 = None
            entry[1][0] = (b'd', b'', 0, False, packed_stat)
            if saved_minikind != b'd':
                # This changed from something into a directory. Make sure we
                # have a directory block for it. This doesn't happen very
                # often, so this doesn't have to be super fast.
                block_index, entry_index, dir_present, file_present = \
                    state._get_block_entry_index(entry[0][0], entry[0][1], 0)
                state._ensure_block(block_index, entry_index,
                                    osutils.pathjoin(entry[0][0], entry[0][1]))
            else:
                worth_saving = False
        elif minikind == b'l':
            if saved_minikind == b'l':
                worth_saving = False
            link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
            if state._cutoff_time is None:
                state._sha_cutoff_time()
            if (stat_value.st_mtime < state._cutoff_time
                    and stat_value.st_ctime < state._cutoff_time):
                entry[1][0] = (b'l', link_or_sha1, stat_value.st_size,
                               False, packed_stat)
            else:
                entry[1][0] = (b'l', b'', stat_value.st_size,
                               False, DirState.NULLSTAT)
        if worth_saving:
            state._mark_modified([entry])
        return link_or_sha1

class ProcessEntryPython(object):

    __slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
                 "last_source_parent", "last_target_parent", "include_unchanged",
                 "partial", "use_filesystem_for_exec", "utf8_decode",
                 "searched_specific_files", "search_specific_files",
                 "searched_exact_paths", "search_specific_file_parents", "seen_ids",
                 "state", "source_index", "target_index", "want_unversioned", "tree"]

    def __init__(self, include_unchanged, use_filesystem_for_exec,
                 search_specific_files, state, source_index, target_index,
                 want_unversioned, tree):
        self.old_dirname_to_file_id = {}
        self.new_dirname_to_file_id = {}
        # Are we doing a partial iter_changes?
        self.partial = search_specific_files != {''}
        # Using a list so that we can access the values and change them in
        # nested scope. Each one is [path, file_id, entry]
        self.last_source_parent = [None, None]
                        and stat.S_IEXEC & path_info[3].st_mode)
                else:
                    target_exec = target_details[3]
                return (entry[0][2],
                        (None, self.utf8_decode(path)[0]),
                        ...
                        (None, self.utf8_decode(entry[0][1])[0]),
                        (None, path_info[2]),
                        (None, target_exec)), True
            else:
                # It's a missing file, report it as such.
                return (entry[0][2],
                        (None, self.utf8_decode(path)[0]),
                        ...
                        (None, self.utf8_decode(entry[0][1])[0]),
                        ...
                        (None, False)), True
        elif source_minikind in _fdlt and target_minikind in b'a':
            # unversioned, possibly, or possibly not deleted: we don't care.
            # if it's still on disk, *and* there's no other entry at this
            # path [we don't know this in this routine at the moment -
            # perhaps we should change this] - then it would be an unknown.
            old_path = pathjoin(entry[0][0], entry[0][1])
            # parent id is the entry for the path in the target tree
            parent_id = self.state._get_entry(
                self.source_index, path_utf8=entry[0][0])[0][2]
            if parent_id == entry[0][2]:
                parent_id = None
            return (entry[0][2],
                    (self.utf8_decode(old_path)[0], None),
                    ...
                    (self.utf8_decode(entry[0][1])[0], None),
                    (DirState._minikind_to_kind[source_minikind], None),
                    (source_details[3], None)), True
        elif source_minikind in _fdlt and target_minikind in b'r':
            # a rename; could be a true rename, or a rename inherited from
            # a renamed parent. TODO: handle this efficiently. It's not a
            # common case to rename dirs though, so a correct but slow
            # implementation will do.
            if not osutils.is_inside_any(self.searched_specific_files,
                                         target_details[1]):
                self.search_specific_files.add(target_details[1])
        elif source_minikind in _ra and target_minikind in _ra:
            # neither of the selected trees contain this file,
            # so skip over it. This is not currently directly tested, but
            # is indirectly via test_too_much.TestCommands.test_conflicts.
            pass
        else:
            raise AssertionError("don't know how to compare "
                                 "source_minikind=%r, target_minikind=%r"
                                 % (source_minikind, target_minikind))
        return None, None
    def __iter__(self):
        ...
                new_executable = bool(
                    stat.S_ISREG(root_dir_info[3].st_mode)
                    and stat.S_IEXEC & root_dir_info[3].st_mode)
                ...
                       (None, current_root_unicode),
                       ...
                       (None, splitpath(current_root_unicode)[-1]),
                       (None, root_dir_info[2]),
                       (None, new_executable)
                       )
        initial_key = (current_root, b'', b'')
        block_index, _ = self.state._find_block_index_from_key(initial_key)
        if block_index == 0:
            # we have processed the total root already, but because the
            # initial key matched it we should skip it here.
            block_index += 1
        if root_dir_info and root_dir_info[2] == 'tree-reference':
            current_dir_info = None
        else:
            dir_iterator = osutils._walkdirs_utf8(
                root_abspath, prefix=current_root)
            try:
                current_dir_info = next(dir_iterator)
            except OSError as e:
                # on win32, python2.4 has e.errno == ERROR_DIRECTORY, but
                # python 2.5 has e.errno == EINVAL,
                # and e.winerror == ERROR_DIRECTORY
                ...
                if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
                    current_dir_info = None
                elif (sys.platform == 'win32'
                      and (e.errno in win_errors or
                           e_winerror in win_errors)):
                    current_dir_info = None
                else:
                    raise
        ...
            if current_dir_info[0][0] == b'':
                # remove .bzr from iteration
                bzr_index = bisect.bisect_left(
                    current_dir_info[1], (b'.bzr',))
                if current_dir_info[1][bzr_index][0] != b'.bzr':
                    raise AssertionError()
                del current_dir_info[1][bzr_index]
        # walk until both the directory listing and the versioned metadata
        # are exhausted.
        if (block_index < len(self.state._dirblocks) and
                osutils.is_inside(current_root,
                                  self.state._dirblocks[block_index][0])):
            current_block = self.state._dirblocks[block_index]
        else:
            current_block = None
        while (current_dir_info is not None or
               current_block is not None):
            if (current_dir_info and current_block
                    and current_dir_info[0][0] != current_block[0]):
                if _lt_by_dirs(current_dir_info[0][0], current_block[0]):
                    # filesystem data refers to paths not covered by the dirblock.
                    # this has two possibilities:
                    # A) it is versioned but empty, so there is no block for it
                    ...
                    # recurse into unknown directories.
                    path_index = 0
                    while path_index < len(current_dir_info[1]):
                        current_path_info = current_dir_info[1][path_index]
                        if self.want_unversioned:
                            if current_path_info[2] == 'directory':
                                if self.tree._directory_is_tree_reference(
                                        current_path_info[0].decode('utf8')):
                                    current_path_info = current_path_info[:2] + \
                                        ('tree-reference',) + \
                                        current_path_info[3:]
                            new_executable = bool(
                                stat.S_ISREG(current_path_info[3].st_mode)
                                and stat.S_IEXEC & current_path_info[3].st_mode)
                            ...
                                   (None, utf8_decode(current_path_info[0])[0]),
                                   ...
                                   (None, utf8_decode(current_path_info[1])[0]),
                                   (None, current_path_info[2]),
                                   (None, new_executable))
                        # don't descend into this unversioned path if it is
                        # a dir
                        if current_path_info[2] in ('directory',
                                                    'tree-reference'):
                            del current_dir_info[1][path_index]
                            path_index -= 1
                        path_index += 1

                    # This dir info has been handled, go to the next
                    try:
                        current_dir_info = next(dir_iterator)
                    except StopIteration:
                        current_dir_info = None
                    raise AssertionError(
                        "Got entry<->path mismatch for specific path "
                        "%r entry %r path_info %r " % (
                            path_utf8, entry, path_info))
                # Only include changes - we're outside the users requested
                # expansion.
                if changed:
                    self._gather_result_for_consistency(result)
                    if (result.kind[0] == 'directory' and
                            result.kind[1] != 'directory'):
                        # This stopped being a directory, the old children have
                        # to be included.
                        if entry[1][self.source_index][0] == b'r':
                            # renamed, take the source path
                            entry_path_utf8 = entry[1][self.source_index][1]
                        else:
                            entry_path_utf8 = path_utf8
                        initial_key = (entry_path_utf8, b'', b'')
                        block_index, _ = self.state._find_block_index_from_key(
                            initial_key)
                        if block_index == 0:
                            # The children of the root are in block index 1.
                            block_index = block_index + 1
                        current_block = None
                        if block_index < len(self.state._dirblocks):
                            current_block = self.state._dirblocks[block_index]
                            if not osutils.is_inside(
                                    entry_path_utf8, current_block[0]):
                                # No entries for this directory at all.
                                current_block = None
                        if current_block is not None:
                            for entry in current_block[1]:
                                if entry[1][self.source_index][0] in (b'a', b'r'):
                                    # Not in the source tree, so doesn't have to be