lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.

::

    MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
    WHOLE_NUMBER = {digit}, digit;
    REVISION_ID = a non-empty utf8 string;

    dirstate format = header line, full checksum, row count, parent details,
        ghost_details, entries;
    header line = "#bazaar dirstate flat format 3", NL;
    full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
    row count = "num_entries: ", WHOLE_NUMBER, NL;
    parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
    ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;

    entry = entry_key, current_entry_details, {parent_entry_details};
    entry_key = dirname, basename, fileid;
    current_entry_details = common_entry_details, working_entry_details;
    parent_entry_details = common_entry_details, history_entry_details;
    common_entry_details = MINIKIND, fingerprint, size, executable;
    working_entry_details = packed_stat;
    history_entry_details = REVISION_ID;

    fingerprint = a nonempty utf8 sequence with meaning defined by minikind.

Given this definition, the following is useful to know::

    entry (aka row) - all the data for a given key.
    entry[0]: The key (dirname, basename, fileid)
    entry[1]: The tree(s) data for this path and id combination.
    entry[1][0]: The current tree
    entry[1][1]: The second tree

For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::

    entry[1][0][0]: minikind
    entry[1][0][1]: fingerprint
    entry[1][0][2]: size
    entry[1][0][3]: executable
    entry[1][0][4]: packed_stat

OR (for non tree-0)::

    entry[1][1][4]: revision_id
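The nested indices above can be made concrete with a short sketch. This is illustrative only: the values, sizes, and ids are invented for demonstration and are not part of dirstate.py.

```python
# Illustrative sketch: one in-memory entry for a file "foo" at the tree root,
# mirroring the documented index layout.
entry = (
    (b'', b'foo', b'file-id-1'),  # entry[0]: the key (dirname, basename, fileid)
    [
        # entry[1][0]: current tree (minikind, fingerprint, size, executable, packed_stat)
        (b'f', b'sha1-of-foo', 12, False, b'packed-stat'),
        # entry[1][1]: second (parent) tree; slot 4 holds a revision_id instead
        (b'f', b'sha1-of-foo', 12, False, b'revision-id'),
    ],
)

assert entry[1][0][0] == b'f'            # minikind
assert entry[1][0][1] == b'sha1-of-foo'  # fingerprint
assert entry[1][0][2] == 12              # size
assert entry[1][0][3] is False           # executable
assert entry[1][1][4] == b'revision-id'  # revision_id for the parent tree
```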
There may be multiple rows at the root, one per id present in the root, so the
in memory root row is now::

    self._dirblocks[0] -> ('', [entry ...]),

and the entries in there are::

    entries[0][2]: file_id
    entries[1][0]: The tree data for the current tree for this fileid at /

b'r' is a relocated entry: This path is not present in this tree with this
    id, but the id can be found at another location. The fingerprint is
    used to point to the target location.
b'a' is an absent entry: In that tree the id is not present at this path.
b'd' is a directory entry: This path in this tree is a directory with the
    current file id. There is no fingerprint for directories.
b'f' is a file entry: As for directory, but it's a file. The fingerprint is
    the sha1 value of the file's canonical form, i.e. after any read
    filters have been applied to the convenience form stored in the working
    tree.
b'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
    the link target.
b't' is a reference to a nested subtree; the fingerprint is the referenced
    revision.
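The minikind codes above can be summarised as a lookup table. This sketch mirrors the spirit of `DirState._minikind_to_kind`, but the dict here is an illustrative reconstruction, not the actual class attribute:

```python
# Illustrative reconstruction of the minikind -> kind mapping described above.
MINIKIND_TO_KIND = {
    b'f': 'file',            # fingerprint is the sha1 of the canonical form
    b'd': 'directory',       # no fingerprint
    b'l': 'symlink',         # fingerprint is the link target
    b't': 'tree-reference',  # fingerprint is the referenced revision
    b'a': 'absent',          # the id is not present at this path in this tree
    b'r': 'relocated',       # fingerprint points at the id's real location
}

def describe(minikind):
    """Return the human-readable kind for a one-byte minikind code."""
    return MINIKIND_TO_KIND[minikind]
```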
The entries on disk and in memory are ordered according to the following keys::

    directory, as a list of components
    filename
    file-id
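A hedged sketch of that ordering rule (the helper name is hypothetical): the directory compares as a list of path components, then the filename, then the file-id:

```python
# Illustrative sort key for the documented ordering: directory as a list of
# components, then filename, then file-id.
def entry_sort_key(entry_key):
    dirname, basename, file_id = entry_key
    return (dirname.split(b'/'), basename, file_id)

keys = [
    (b'a/b', b'c', b'id3'),
    (b'a', b'z', b'id2'),
    (b'', b'a', b'id1'),
]
keys.sort(key=entry_sort_key)
# The root entry sorts first, and b'a' sorts before its subdirectory b'a/b'.
assert [k[2] for k in keys] == [b'id1', b'id2', b'id3']
```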
--- Format 1 had the following different definition: ---

::

    rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
        WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
        {PARENT ROW}
    PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
        basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
        SHA1

PARENT ROW's are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
each row will have a PARENT ROW for foo and baz, but not for bar.
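That rule can be sketched as a small helper (the function name is hypothetical, not part of the format code): PARENT ROWs cover the parents minus the ghosts, in parent order.

```python
# Illustrative helper: which parents get PARENT ROWs under the rule above.
def parent_rows_for(parents, ghosts):
    ghost_set = set(ghosts)
    return [p for p in parents if p not in ghost_set]

# The documented example: parents foo, bar, baz with ghost bar.
assert parent_rows_for([b'foo', b'bar', b'baz'], [b'bar']) == [b'foo', b'baz']
```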
ERROR_DIRECTORY = 267
if not getattr(struct, '_compile', None):
    # Cannot pre-compile the dirstate pack_stat
    def pack_stat(st, _encode=binascii.b2a_base64, _pack=struct.pack):
        """Convert stat values into a packed representation."""
        return _encode(_pack('>LLLLLL', st.st_size, int(st.st_mtime),
                             int(st.st_ctime), st.st_dev,
                             st.st_ino & 0xFFFFFFFF, st.st_mode))[:-1]
else:
    # compile the struct compiler we need, so as to only do it once
    from _struct import Struct
    _compiled_pack = Struct('>LLLLLL').pack

    def pack_stat(st, _encode=binascii.b2a_base64, _pack=_compiled_pack):
        """Convert stat values into a packed representation."""
        # jam 20060614 it isn't really worth removing more entries if we
        # are going to leave it in packed form.
        # With only st_mtime and st_mode filesize is 5.5M and read time is 275ms
        # With all entries, filesize is 5.9M and read time is maybe 280ms
        # well within the noise margin

        # base64 encoding always adds a final newline, so strip it off
        # The current version
        return _encode(_pack(st.st_size, int(st.st_mtime), int(st.st_ctime),
                             st.st_dev, st.st_ino & 0xFFFFFFFF,
                             st.st_mode))[:-1]
        # This is 0.060s / 1.520s faster by not encoding as much information
        # return _encode(_pack('>LL', int(st.st_mtime), st.st_mode))[:-1]
        # This is not strictly faster than _encode(_pack())[:-1]
        # return '%X.%X.%X.%X.%X.%X' % (
        #      st.st_size, int(st.st_mtime), int(st.st_ctime),
        #      st.st_dev, st.st_ino, st.st_mode)
        # Similar to the _encode(_pack('>LL'))
        # return '%X.%X' % (int(st.st_mtime), st.st_mode)
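A usage sketch of the packing scheme above: six 32-bit big-endian fields, base64-encoded, with the trailing newline stripped. `pack_stat_like` is an illustrative stand-in for the real `pack_stat`, and a synthetic stat-like object is used so the result is deterministic.

```python
import binascii
import struct
from types import SimpleNamespace

_pack = struct.Struct('>LLLLLL').pack

def pack_stat_like(st):
    """Pack a stat result the way the code above does."""
    return binascii.b2a_base64(_pack(
        st.st_size, int(st.st_mtime), int(st.st_ctime),
        st.st_dev, st.st_ino & 0xFFFFFFFF, st.st_mode))[:-1]

# Synthetic stat values (invented): float times are truncated, inodes masked.
st = SimpleNamespace(st_size=12, st_mtime=1600000000.5, st_ctime=1600000000.5,
                     st_dev=2049, st_ino=1234567, st_mode=0o100644)
packed = pack_stat_like(st)
# 24 raw bytes encode to 32 base64 characters once the newline is stripped.
assert len(packed) == 32
```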
class DirstateCorrupt(errors.BzrError):

    _fmt = "The dirstate file (%(state)s) appears to be corrupt: %(msg)s"

    def __init__(self, state, msg):
        errors.BzrError.__init__(self)
        self.state = state
        self.msg = msg
class SHA1Provider(object):

        self._last_block_index = None
        self._last_entry_index = None
        # The set of known hash changes
        self._known_hash_changes = set()
        # How many hash changed entries can we have without saving
        self._worth_saving_limit = worth_saving_limit
        self._config_stack = config.LocationStack(urlutils.local_path_to_url(
            path))

    def __repr__(self):
        return "%s(%r)" % \
            (self.__class__.__name__, self._filename)
    def _mark_modified(self, hash_changed_entries=None, header_modified=False):
        """Mark this dirstate as modified.

        :param hash_changed_entries: if non-None, mark just these entries as
            having their hash modified.
        :param header_modified: mark the header modified as well, not just the
            dirblocks.
        """
        #trace.mutter_callsite(3, "modified hash entries: %s", hash_changed_entries)
        if hash_changed_entries:
            self._known_hash_changes.update(
                [e[0] for e in hash_changed_entries])
            if self._dirblock_state in (DirState.NOT_IN_MEMORY,
                                        DirState.IN_MEMORY_UNMODIFIED):
                # If the dirstate is already marked as IN_MEMORY_MODIFIED, then
                # that takes precedence.
                self._dirblock_state = DirState.IN_MEMORY_HASH_MODIFIED
        else:
            # TODO: Since we now have a IN_MEMORY_HASH_MODIFIED state, we
            #       should fail noisily if someone tries to set
            #       IN_MEMORY_MODIFIED but we don't have a write-lock!
            # We don't know exactly what changed so disable smart saving
            self._dirblock_state = DirState.IN_MEMORY_MODIFIED
        if header_modified:
            self._header_state = DirState.IN_MEMORY_MODIFIED

    def _mark_unmodified(self):
        """Mark this dirstate as unmodified."""
        self._header_state = DirState.IN_MEMORY_UNMODIFIED
        self._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
        self._known_hash_changes = set()
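The state transitions implemented above can be sketched in isolation. The string constants below stand in for the real `DirState` class attributes; the function is illustrative, not dirstate API:

```python
# Illustrative model of the _mark_modified transitions: hash-only changes
# upgrade NOT_IN_MEMORY / IN_MEMORY_UNMODIFIED to IN_MEMORY_HASH_MODIFIED,
# while an unknown change always forces IN_MEMORY_MODIFIED, which takes
# precedence over hash-only marking.
NOT_IN_MEMORY = 'not-in-memory'
IN_MEMORY_UNMODIFIED = 'unmodified'
IN_MEMORY_HASH_MODIFIED = 'hash-modified'
IN_MEMORY_MODIFIED = 'modified'

def next_state(current, hash_only):
    if not hash_only:
        # We don't know exactly what changed, so disable smart saving.
        return IN_MEMORY_MODIFIED
    if current in (NOT_IN_MEMORY, IN_MEMORY_UNMODIFIED):
        return IN_MEMORY_HASH_MODIFIED
    return current  # IN_MEMORY_MODIFIED takes precedence

assert next_state(IN_MEMORY_UNMODIFIED, hash_only=True) == IN_MEMORY_HASH_MODIFIED
assert next_state(IN_MEMORY_MODIFIED, hash_only=True) == IN_MEMORY_MODIFIED
```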
    def add(self, path, file_id, kind, stat, fingerprint):
        """Add a path to be tracked.

        :param path: The path within the dirstate - b'' is the root, 'foo' is the
            path foo within the root, 'foo/bar' is the path bar within foo
            within the root.
        :param file_id: The file id of the path being added.
        """

        utf8path = (dirname + '/' + basename).strip('/').encode('utf8')
        dirname, basename = osutils.split(utf8path)
        # uses __class__ for speed; the check is needed for safety
        if file_id.__class__ is not bytes:
            raise AssertionError(
                "must be a utf8 file_id not %s" % (type(file_id), ))
        # Make sure the file_id does not exist in this tree
        rename_from = None
        file_id_entry = self._get_entry(
            0, fileid_utf8=file_id, include_deleted=True)
        if file_id_entry != (None, None):
            if file_id_entry[1][0][0] == b'a':
                if file_id_entry[0] != (dirname, basename, file_id):
                    # set the old name's current operation to rename
                    self.update_minimal(file_id_entry[0],
                                        b'r',
                                        path_utf8=b'',
                                        packed_stat=b'',
                                        fingerprint=utf8path)
                    rename_from = file_id_entry[0][0:2]
            else:
                path = osutils.pathjoin(
                    file_id_entry[0][0], file_id_entry[0][1])
                kind = DirState._minikind_to_kind[file_id_entry[1][0][0]]
                info = '%s:%s' % (kind, path)
                raise errors.DuplicateFileId(file_id, info)
        first_key = (dirname, basename, b'')
        block_index, present = self._find_block_index_from_key(first_key)
        if present:
            # check the path is not in the tree
            block = self._dirblocks[block_index][1]
            entry_index, _ = self._find_entry_index(first_key, block)
            while (entry_index < len(block) and
                   block[entry_index][0][0:2] == first_key[0:2]):
                if block[entry_index][1][0][0] not in (b'a', b'r'):
                    # this path is in the dirstate in the current tree.
                    raise Exception("adding already added path!")
                entry_index += 1
        else:
            # The block where we want to put the file is not present. But it
                                     fingerprint, new_child_path)
        self._check_delta_ids_absent(new_ids, delta, 0)
        try:
            self._apply_removals(viewitems(removals))
            self._apply_insertions(viewvalues(insertions))
            # Validate parents
            self._after_delta_check_parents(parents, 0)
        except errors.BzrError as e:
            self._changes_aborted = True
            if 'integrity error' not in str(e):
                raise
            # _get_entry raises BzrError when a request is inconsistent; we
            # want such errors to be shown as InconsistentDelta - and that
            # fits the behaviour we trigger.
            raise errors.InconsistentDeltaDelta(delta,
                "error from _get_entry. %s" % (e,))

    def _apply_removals(self, removals):
        for file_id, path in sorted(removals, reverse=True,
                                    key=operator.itemgetter(1)):
            dirname, basename = osutils.split(path)
            block_i, entry_i, d_present, f_present = \
                self._get_block_entry_index(dirname, basename, 0)
            try:
                entry = self._dirblocks[block_i][1][entry_i]
            except IndexError:
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if not f_present or entry[1][0][0] in (b'a', b'r'):
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if file_id != entry[0][2]:
                self._raise_invalid(path, file_id,
                    "Attempt to remove path has wrong id - found %r."
                    % entry[0][2])
            self._make_absent(entry)
            # See if we have a malformed delta: deleting a directory must not
            # leave crud behind. This increases the number of bisects needed
        new_ids = set()
        for old_path, new_path, file_id, inv_entry in delta:
            if file_id.__class__ is not bytes:
                raise AssertionError(
                    "must be a utf8 file_id not %s" % (type(file_id), ))
            if inv_entry is not None and file_id != inv_entry.file_id:
                self._raise_invalid(new_path, file_id,
                    "mismatched entry file_id %r" % inv_entry)
            if new_path is None:
                new_path_utf8 = None
            else:
                if inv_entry is None:
                    self._raise_invalid(new_path, file_id,
                        "new_path with no entry")
                new_path_utf8 = encode(new_path)
                # note the parent for validation
                dirname_utf8, basename_utf8 = osutils.split(new_path_utf8)
                if basename_utf8:
                    parents.add((dirname_utf8, inv_entry.parent_id))
            if old_path is None:
                old_path_utf8 = None
            else:
                old_path_utf8 = encode(old_path)
            if old_path is None:
                adds.append((None, new_path_utf8, file_id,
                             inv_to_entry(inv_entry), True))
                new_ids.add(file_id)
            elif new_path is None:
                deletes.append((old_path_utf8, None, file_id, None, True))
            elif (old_path, new_path) == root_only:
                # change things in-place
                # Note: the case of a parent directory changing its file_id
                #       tends to break optimizations here, because officially
                #       the file has actually been moved, it just happens to
                #       end up at the same path. If we can figure out how to
                #       handle that case, we can avoid a lot of add+delete
                #       pairs for objects that stay put.
                # elif old_path == new_path:
                changes.append((old_path_utf8, new_path_utf8, file_id,
                                inv_to_entry(inv_entry)))
            else:
                # Renames:
                # Because renames must preserve their children we must have
                # processed all relocations and removes before hand. The sort
                # order ensures we've examined the child paths, but we also
                # have to execute the removals, or the split to an add/delete
                # pair will result in the deleted item being reinserted, or
                # renamed items being reinserted twice - and possibly at the
                # wrong place. Splitting into a delete/add pair also simplifies
                # the handling of entries with (b'f', ...), (b'r' ...) because
                # the target of the b'r' is old_path here, and we add that to
                # deletes, meaning that the add handler does not need to check
                # for b'r' items on every pass.
                self._update_basis_apply_deletes(deletes)
                deletes = []
                # Split into an add/delete pair recursively.
                adds.append((old_path_utf8, new_path_utf8, file_id,
                             inv_to_entry(inv_entry), False))
                # Expunge deletes that we've seen so that deleted/renamed
                # children of a rename directory are handled correctly.
                new_deletes = reversed(list(
                    self._iter_child_entries(1, old_path_utf8)))
                # Remove the current contents of the tree at orig_path, and
                # reinsert at the correct new path.
                for entry in new_deletes:
                    child_dirname, child_basename, child_file_id = entry[0]
                    if child_dirname:
                        source_path = child_dirname + b'/' + child_basename
                    else:
                        source_path = child_basename
                    if new_path_utf8:
                        target_path = \
                            new_path_utf8 + source_path[len(old_path_utf8):]
                    else:
                        if old_path_utf8 == b'':
                            raise AssertionError("cannot rename directory to"
                                                 " itself")
                        target_path = source_path[len(old_path_utf8) + 1:]
                    adds.append(
                        (None, target_path, entry[0][2], entry[1][1], False))
                    deletes.append(
                        (source_path, target_path, entry[0][2], None, False))
                deletes.append(
                    (old_path_utf8, new_path_utf8, file_id, None, False))

        self._check_delta_ids_absent(new_ids, delta, 1)
        try:
            # Finish expunging deletes/first half of renames.
        # Adds are accumulated partly from renames, so can be in any input
        # order - sort it.
        # TODO: we may want to sort in dirblocks order. That way each entry
        #       will end up in the same directory, allowing the _get_entry
        #       fast-path for looking up 2 items in the same dir work.
        adds.sort(key=lambda x: x[1])
        # adds is now in lexographic order, which places all parents before
        # their children, so we can process it linearly.
        st = static_tuple.StaticTuple
        for old_path, new_path, file_id, new_details, real_add in adds:
            dirname, basename = osutils.split(new_path)
            entry_key = st(dirname, basename, file_id)
            block_index, present = self._find_block_index_from_key(entry_key)
            if not present:
                # The block where we want to put the file is not present.
                # However, it might have just been an empty directory. Look for
                # the parent in the basis-so-far before throwing an error.
                parent_dir, parent_base = osutils.split(dirname)
                parent_block_idx, parent_entry_idx, _, parent_present = \
                    self._get_block_entry_index(parent_dir, parent_base, 1)
                if not parent_present:
                    self._raise_invalid(new_path, file_id,
                        "Unable to find block for this record."
                        " Was the parent added?")
                self._ensure_block(parent_block_idx, parent_entry_idx, dirname)

            block = self._dirblocks[block_index][1]
            entry_index, present = self._find_entry_index(entry_key, block)
            if real_add:
                if old_path is not None:
                    self._raise_invalid(new_path, file_id,
                        'considered a real add but still had old_path at %s'
                        % (old_path,))
            if present:
                entry = block[entry_index]
                basis_kind = entry[1][1][0]
                if basis_kind == b'a':
                    entry[1][1] = new_details
                elif basis_kind == b'r':
                    raise NotImplementedError()
                else:
                    self._raise_invalid(new_path, file_id,
                        "An entry was marked as a new add"
                        " but the basis target already existed")
            else:
                # The exact key was not found in the block. However, we need to
                # check if there is a key next to us that would have matched.
                # We only need to check 2 locations, because there are only 2
                # entries that this new entry can be adjacent to.
                for maybe_index in range(entry_index - 1, entry_index + 1):
                    if maybe_index < 0 or maybe_index >= len(block):
                        continue
                    maybe_entry = block[maybe_index]
                    if maybe_entry[0][:2] != (dirname, basename):
                        # Just a random neighbor
                        continue
                    if maybe_entry[0][2] == file_id:
                        raise AssertionError(
                            '_find_entry_index didnt find a key match'
                            ' but walking the data did, for %s'
                            % (entry_key,))
                    basis_kind = maybe_entry[1][1][0]
                    if basis_kind not in (b'a', b'r'):
                        self._raise_invalid(new_path, file_id,
                            "we have an add record for path, but the path"
                            " is already present with another file_id %s"
                            % (maybe_entry[0][2],))

                entry = (entry_key, [DirState.NULL_PARENT_DETAILS,
                                     new_details])
                block.insert(entry_index, entry)

            active_kind = entry[1][0][0]
            if active_kind == b'a':
                # The active record shows up as absent, this could be genuine,
                # or it could be present at some other location. We need to
                # verify.
                id_index = self._get_id_index()
                # The id_index may not be perfectly accurate for tree1, because
                # we haven't been keeping it updated. However, it should be
                # fine for tree0, and that gives us enough info for what we
                # need.
                keys = id_index.get(file_id, ())
                for key in keys:
                    block_i, entry_i, d_present, f_present = \
                        self._get_block_entry_index(key[0], key[1], 0)
                    if not f_present:
                        continue
                    active_entry = self._dirblocks[block_i][1][entry_i]
                    if (active_entry[0][2] != file_id):
                        # Some other file is at this path, we don't need to
                        # link it.
                        continue
                    real_active_kind = active_entry[1][0][0]
                    if real_active_kind in (b'a', b'r'):
                        # We found a record, which was not *this* record,
                        # which matches the file_id, but is not actually
                        # present. Something seems *really* wrong.
                        self._raise_invalid(new_path, file_id,
                            "We found a tree0 entry that doesnt make sense")
                    # Now, we've found a tree0 entry which matches the file_id
                    # but is at a different location. So update them to be
                    # rename records.
                    active_dir, active_name = active_entry[0][:2]
                    if active_dir:
                        active_path = active_dir + b'/' + active_name
                    else:
                        active_path = active_name
                    active_entry[1][1] = st(b'r', new_path, 0, False, b'')
                    entry[1][0] = st(b'r', active_path, 0, False, b'')
            elif active_kind == b'r':
                raise NotImplementedError()

            new_kind = new_details[0]
            if new_kind == b'd':
                self._ensure_block(block_index, entry_index, new_path)
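The relocation swap at the end of the loop above can be sketched in isolation. `make_relocation_pair` is a hypothetical helper illustrating the tuple shapes, not dirstate API: when one file_id lives at different paths in tree0 and tree1, each tree's record at the "other" path becomes a `b'r'` entry whose fingerprint is the real path.

```python
# Illustrative sketch of the b'r' (relocated) detail tuples written above:
# (minikind, fingerprint, size, executable, packed_stat-or-revision) where
# the fingerprint carries the path the id actually lives at.
def make_relocation_pair(active_path, new_path):
    basis_at_active = (b'r', new_path, 0, False, b'')   # tree1's view at the old path
    active_at_new = (b'r', active_path, 0, False, b'')  # tree0's view at the new path
    return basis_at_active, active_at_new

basis, active = make_relocation_pair(b'old/name', b'new/name')
assert basis[0] == b'r' and basis[1] == b'new/name'
assert active[1] == b'old/name'
```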
    def _update_basis_apply_changes(self, changes):
        """Apply a sequence of changes to tree 1 during update_basis_by_delta.
        """

        null = DirState.NULL_PARENT_DETAILS
        for old_path, new_path, file_id, _, real_delete in deletes:
            if real_delete != (new_path is None):
                self._raise_invalid(old_path, file_id, "bad delete delta")
            # the entry for this file_id must be in tree 1.
            dirname, basename = osutils.split(old_path)
            block_index, entry_index, dir_present, file_present = \
                self._get_block_entry_index(dirname, basename, 1)
            if not file_present:
                self._raise_invalid(old_path, file_id,
                    'basis tree does not contain removed entry')
            entry = self._dirblocks[block_index][1][entry_index]
            # The state of the entry in the 'active' WT
            active_kind = entry[1][0][0]
            if entry[0][2] != file_id:
                self._raise_invalid(old_path, file_id,
                    'mismatched file_id in tree 1')
            dir_block = ()
            old_kind = entry[1][1][0]
            if active_kind in b'ar':
                # The active tree doesn't have this file_id.
                # The basis tree is changing this record. If this is a
                # rename, then we don't want the record here at all
                # anymore. If it is just an in-place change, we want the
                # record here, but we'll add it if we need to. So we just
                # delete it
                if active_kind == b'r':
                    active_path = entry[1][0][1]
                    active_entry = self._get_entry(0, file_id, active_path)
                    if active_entry[1][1][0] != b'r':
                        self._raise_invalid(old_path, file_id,
                            "Dirstate did not have matching rename entries")
                    elif active_entry[1][0][0] in b'ar':
                        self._raise_invalid(old_path, file_id,
                            "Dirstate had a rename pointing at an inactive"
                            " tree0")
                    active_entry[1][1] = null
                del self._dirblocks[block_index][1][entry_index]
                if old_kind == b'd':
                    # This was a directory, and the active tree says it
                    # doesn't exist, and now the basis tree says it doesn't
                    # exist. Remove its dirblock if present
                    (dir_block_index,
                     present) = self._find_block_index_from_key(
                        (old_path, b'', b''))
                    if present:
                        dir_block = self._dirblocks[dir_block_index][1]
                        if not dir_block:
                            # This entry is empty, go ahead and just remove it
                            del self._dirblocks[dir_block_index]
            else:
                # There is still an active record, so just mark this
                # removed.
                entry[1][1] = null
                block_i, entry_i, d_present, f_present = \
                    self._get_block_entry_index(old_path, b'', 1)
                if d_present:
                    dir_block = self._dirblocks[block_i][1]
            for child_entry in dir_block:
                child_basis_kind = child_entry[1][1][0]
                if child_basis_kind not in b'ar':
                    self._raise_invalid(old_path, file_id,
                        "The file id was deleted but its children were "
                        "not deleted.")

    def _after_delta_check_parents(self, parents, index):
        """Check that parents required by the delta are all intact.

        :param parents: An iterable of (path_utf8, file_id) tuples which are
            required to be present in tree 'index' at path_utf8 with id file_id
            and be a directory.
        """

        tree present there.
        """
        self._read_dirblocks_if_needed()
        key = dirname, basename, b''
        block_index, present = self._find_block_index_from_key(key)
        if not present:
            # no such directory - return the dir index and 0 for the row.
            return block_index, 0, False, False
        block = self._dirblocks[block_index][1]  # access the entries only
        entry_index, present = self._find_entry_index(key, block)
        # linear search through entries at this path to find the one
        # requested.
        while entry_index < len(block) and block[entry_index][0][1] == basename:
            if block[entry_index][1][tree_index][0] not in (b'a', b'r'):
                # neither absent or relocated
                return block_index, entry_index, True, True
            entry_index += 1
        return block_index, entry_index, True, False
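The linear scan above can be sketched as a standalone helper. `find_present` is a hypothetical name, and the stub entries below carry only a minikind in each tree slot:

```python
# Illustrative sketch of the scan: within a directory block, walk the entries
# sharing the basename and return the first whose record in the requested tree
# is neither absent (b'a') nor relocated (b'r').
def find_present(block, start, basename, tree_index):
    i = start
    while i < len(block) and block[i][0][1] == basename:
        if block[i][1][tree_index][0] not in (b'a', b'r'):
            return i, True
        i += 1
    return i, False

# Two ids at the same path: id1 absent in tree0 but present in tree1.
block = [
    ((b'dir', b'x', b'id1'), [(b'a',), (b'f',)]),
    ((b'dir', b'x', b'id2'), [(b'f',), (b'a',)]),
]
assert find_present(block, 0, b'x', 0) == (1, True)
assert find_present(block, 0, b'x', 1) == (0, True)
```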
    def _get_entry(self, tree_index, fileid_utf8=None, path_utf8=None,
                   include_deleted=False):
        """Get the dirstate entry for path in tree tree_index.

        If either file_id or path is supplied, it is used as the key to lookup.
    def _get_id_index(self):
        """Get an id index of self._dirblocks.

        This maps from file_id => [(directory, name, file_id)] entries where
        that file_id appears in one of the trees.
        """
        if self._id_index is None:
            id_index = {}
            for key, tree_details in self._iter_entries():
                self._add_to_id_index(id_index, key)
            self._id_index = id_index
        return self._id_index
    def _add_to_id_index(self, id_index, entry_key):
        """Add this entry to the _id_index mapping."""
        # This code used to use a set for every entry in the id_index. However,
        # it is *rare* to have more than one entry. So a set is a large
        # overkill. And even when we do, we won't ever have more than the
        # number of parent trees. Which is still a small number (rarely >2). As
        # such, we use a simple tuple, and do our own uniqueness checks. While
        # the 'in' check is O(N) since N is nicely bounded it shouldn't ever
        # cause quadratic failure.
        file_id = entry_key[2]
        entry_key = static_tuple.StaticTuple.from_sequence(entry_key)
        if file_id not in id_index:
            id_index[file_id] = static_tuple.StaticTuple(entry_key,)
        else:
            entry_keys = id_index[file_id]
            if entry_key not in entry_keys:
                id_index[file_id] = entry_keys + (entry_key,)

    def _remove_from_id_index(self, id_index, entry_key):
        """Remove this entry from the _id_index mapping.

        It is a programming error to call this when the entry_key is not
        already present.
        """
        file_id = entry_key[2]
        entry_keys = list(id_index[file_id])
        entry_keys.remove(entry_key)
        id_index[file_id] = static_tuple.StaticTuple.from_sequence(entry_keys)
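The tuple-based index described in the comments above can be sketched with plain tuples standing in for StaticTuple; the function names here are illustrative, not the dirstate methods themselves:

```python
# Illustrative sketch of the id index: a dict mapping file_id -> tuple of
# entry keys, with O(N) membership checks that stay cheap because N is
# bounded by the number of trees.
def add_to_id_index(id_index, entry_key):
    file_id = entry_key[2]
    if file_id not in id_index:
        id_index[file_id] = (entry_key,)
    else:
        entry_keys = id_index[file_id]
        if entry_key not in entry_keys:
            id_index[file_id] = entry_keys + (entry_key,)

def remove_from_id_index(id_index, entry_key):
    file_id = entry_key[2]
    entry_keys = list(id_index[file_id])
    entry_keys.remove(entry_key)
    id_index[file_id] = tuple(entry_keys)

idx = {}
add_to_id_index(idx, (b'', b'foo', b'id1'))
add_to_id_index(idx, (b'', b'foo', b'id1'))  # duplicate is ignored
add_to_id_index(idx, (b'dir', b'foo', b'id1'))
assert idx[b'id1'] == ((b'', b'foo', b'id1'), (b'dir', b'foo', b'id1'))
```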
    def _get_output_lines(self, lines):
        """Format lines for final output.

        :param lines: A sequence of lines containing the parents list and the
            path lines.
        """
        output_lines = [DirState.HEADER_FORMAT_3]
        lines.append(b'')  # a final newline
        inventory_text = b'\0\n\0'.join(lines)
        output_lines.append(b'crc32: %d\n' % (zlib.crc32(inventory_text),))
        # -3, 1 for num parents, 1 for ghosts, 1 for final newline
        num_entries = len(lines) - 3
        output_lines.append(b'num_entries: %d\n' % (num_entries,))
        output_lines.append(inventory_text)
        return output_lines
2170
2396
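For reference, `_get_output_lines` frames the serialised entries with a `crc32: ` line and a `num_entries: ` line, which the reading side parses back. A hypothetical round-trip sketch of just that framing; `HEADER` is a stand-in for `DirState.HEADER_FORMAT_3`, and the `-3` adjustment for the parents, ghosts and final-newline lines follows the code above:

```python
import zlib

HEADER = b'#example dirstate header\n'  # stand-in, not the real constant

def frame_lines(lines):
    """Prefix entry lines with a crc32 checksum and an entry count."""
    lines = list(lines)
    lines.append(b'')  # a final newline
    body = b'\0\n\0'.join(lines)
    out = [HEADER]
    out.append(b'crc32: %d\n' % (zlib.crc32(body),))
    # -3: 1 for num parents, 1 for ghosts, 1 for the final newline
    out.append(b'num_entries: %d\n' % (len(lines) - 3,))
    out.append(body)
    return out

framed = frame_lines([b'0 parents', b'0 ghosts', b'entry-row'])
```

Note that `zlib.crc32` returns an unsigned value on Python 3, so the `["-"]` allowed by the format grammar only arises from files written by old Python 2 code.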
    def _make_deleted_row(self, fileid_utf8, parents):
        """Return a deleted row for fileid_utf8."""
        return (b'/', b'RECYCLED.BIN', b'file', fileid_utf8, 0,
                DirState.NULLSTAT, b''), parents

    def _num_present_parents(self):
        """The number of parent entries in each record row."""
        return len(self._parents) - len(self._ghosts)

    @classmethod
    def on_file(cls, path, sha1_provider=None, worth_saving_limit=0):
        """Construct a DirState on the file at path "path".

        :param path: The path at which the dirstate file on disk should live.
        :param sha1_provider: an object meeting the SHA1Provider interface.
            If None, a DefaultSHA1Provider is used.
        :param worth_saving_limit: when the exact number of hash changed
            entries is known, only bother saving the dirstate if more than
            this count of entries have changed. -1 means never save.
        :return: An unlocked DirState object, associated with the given path.
        """
        if sha1_provider is None:
            sha1_provider = DefaultSHA1Provider()
        result = cls(path, sha1_provider,
                     worth_saving_limit=worth_saving_limit)
        return result

    def _read_dirblocks_if_needed(self):

            raise errors.BzrError(
                'invalid header line: %r' % (header,))
        crc_line = self._state_file.readline()
        if not crc_line.startswith(b'crc32: '):
            raise errors.BzrError('missing crc32 checksum: %r' % crc_line)
        self.crc_expected = int(crc_line[len(b'crc32: '):-1])
        num_entries_line = self._state_file.readline()
        if not num_entries_line.startswith(b'num_entries: '):
            raise errors.BzrError('missing num_entries line')
        self._num_entries = int(num_entries_line[len(b'num_entries: '):-1])

    def sha1_from_stat(self, path, stat_result):
        """Find a sha1 given a stat lookup."""
        return self._get_packed_stat_index().get(pack_stat(stat_result), None)

    def _get_packed_stat_index(self):
        """Get a packed_stat index of self._dirblocks."""
        if self._packed_stat_index is None:
            index = {}
            for key, tree_details in self._iter_entries():
                if tree_details[0][0] == b'f':
                    index[tree_details[0][4]] = tree_details[0][1]
            self._packed_stat_index = index
        return self._packed_stat_index

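Together, `sha1_from_stat` and `_get_packed_stat_index` form a stat-keyed cache: when a file's packed stat bytes match a recorded entry, the stored sha1 is reused without rehashing the file. A toy sketch of the idea; the real `pack_stat` encodes size, mtime, ctime, dev, ino and mode, whereas this version packs only size and mtime (an illustrative assumption):

```python
import struct
import types

def toy_pack_stat(st):
    """Pack the stat fields whose change should invalidate a cached sha1."""
    return struct.pack('>QQ', st.st_size, int(st.st_mtime))

class Sha1Cache:
    """Map packed stat bytes -> sha1, mirroring _get_packed_stat_index()."""

    def __init__(self):
        self._index = {}

    def record(self, stat_result, sha1):
        self._index[toy_pack_stat(stat_result)] = sha1

    def sha1_from_stat(self, stat_result):
        # unchanged size+mtime: reuse the recorded sha1; otherwise None
        return self._index.get(toy_pack_stat(stat_result), None)

st = types.SimpleNamespace(st_size=10, st_mtime=1234.0)
cache = Sha1Cache()
cache.record(st, 'sha1-of-contents')
hit = cache.sha1_from_stat(st)
miss = cache.sha1_from_stat(types.SimpleNamespace(st_size=11, st_mtime=1234.0))
```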
            # Should this be a warning? For now, I'm expecting that places
            # that mark it inconsistent will warn, making a warning here
            # redundant.
            trace.mutter('Not saving DirState because '
                         '_changes_aborted is set.')
            return
        # TODO: Since we now distinguish IN_MEMORY_MODIFIED from
        # IN_MEMORY_HASH_MODIFIED, we should only fail quietly if we fail
        # to save an IN_MEMORY_HASH_MODIFIED, and fail *noisily* if we
        # fail to save IN_MEMORY_MODIFIED
        if not self._worth_saving():
            return

        grabbed_write_lock = False
        if self._lock_state != 'w':
            grabbed_write_lock, new_lock = self._lock_token.temporary_write_lock()
            # Switch over to the new lock, as the old one may be closed.
            # TODO: jam 20070315 We should validate the disk file has
            #       not changed contents, since temporary_write_lock may
            #       not be an atomic operation.
            self._lock_token = new_lock
            self._state_file = new_lock.f
            if not grabbed_write_lock:
                # We couldn't grab a write lock, so we switch back to a read
                # one
                return
        try:
            lines = self.get_lines()
            self._state_file.seek(0)
            self._state_file.writelines(lines)
            self._state_file.truncate()
            self._state_file.flush()
            self._maybe_fdatasync()
            self._mark_unmodified()
        finally:
            if grabbed_write_lock:
                self._lock_token = self._lock_token.restore_read_lock()
                self._state_file = self._lock_token.f
                # TODO: jam 20070315 We should validate the disk file has
                #       not changed contents. Since restore_read_lock may
                #       not be an atomic operation.

    def _maybe_fdatasync(self):
        """Flush to disk if possible and if not configured off."""
        if self._config_stack.get('dirstate.fdatasync'):
            osutils.fdatasync(self._state_file.fileno())

    def _worth_saving(self):
        """Is it worth saving the dirstate or not?"""
        if (self._header_state == DirState.IN_MEMORY_MODIFIED
                or self._dirblock_state == DirState.IN_MEMORY_MODIFIED):
            return True
        if self._dirblock_state == DirState.IN_MEMORY_HASH_MODIFIED:
            if self._worth_saving_limit == -1:
                # We never save hash changes when the limit is -1
                return False
            # If we're using smart saving and only a small number of
            # entries have changed their hash, don't bother saving. John has
            # suggested using a heuristic here based on the size of the
            # changed files and/or tree. For now, we go with a configurable
            # number of changes, keeping the calculation time as low
            # overhead as possible. (This also keeps all existing tests
            # passing as the default is 0, i.e. always save.)
            if len(self._known_hash_changes) >= self._worth_saving_limit:
                return True
        return False

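The policy `_worth_saving` implements can be summarised as a small decision table: full modifications always persist, hash-only modifications persist only once a configurable threshold of changed entries is reached, and a limit of -1 disables hash-only saves entirely. A sketch with illustrative names (not the DirState attributes):

```python
def worth_saving(fully_modified, hash_modified, num_hash_changes, limit):
    """Decide whether flushing the state to disk pays for the write."""
    if fully_modified:
        return True
    if hash_modified:
        if limit == -1:
            return False  # never persist hash-only changes
        return num_hash_changes >= limit
    return False

# the default limit of 0 keeps the old always-save behaviour for
# any hash-modified state
```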
    def _set_data(self, parent_ids, dirblocks):
        """Set the full dirstate data in memory.
                        # mapping from path,id. We need to look up the correct
                        # path for the indexes from 0 to tree_index - 1
                        new_details = []
                        for lookup_index in range(tree_index):
                            # boundary case: this is the first occurrence of
                            # file_id, so there are no id_indexes; possibly
                            # take this out of the loop?
                            if not len(entry_keys):
                                new_details.append(DirState.NULL_PARENT_DETAILS)
                            else:
                                # grab any one entry, use it to find the
                                # right path.
                                # TODO: optimise this to reduce memory use in
                                # highly fragmented situations by reusing the
                                # relocation records.
                                a_key = next(iter(entry_keys))
                                if by_path[a_key][lookup_index][0] in (b'r', b'a'):
                                    # it's a pointer or missing statement,
                                    # use it as is.
                                    new_details.append(
                                        by_path[a_key][lookup_index])
                                else:
                                    # we have the right key, make a pointer
                                    # to it.
                                    real_path = (b'/'.join(a_key[0:2])).strip(b'/')
                                    new_details.append(st(b'r', real_path, 0,
                                                          False, b''))
                        new_details.append(self._inv_entry_to_details(entry))
                        new_details.extend(new_location_suffix)
                        by_path[new_entry_key] = new_details
                        self._add_to_id_index(id_index, new_entry_key)
            # --- end generation of full tree mappings

            # sort and output all the entries
            new_entries = self._sort_entries(viewitems(by_path))
            self._entries_to_current_state(new_entries)
            self._parents = [rev_id for rev_id, tree in trees]
            self._ghosts = list(ghosts)
            self._mark_modified(header_modified=True)
        self._id_index = id_index

    def _sort_entries(self, entry_list):
                # the minimal required trigger is if the execute bit or
                # cached kind has changed.
                if (current_old[1][0][3] != current_new[1].executable or
                        current_old[1][0][0] != current_new_minikind):
                    if tracing:
                        trace.mutter("Updating in-place change '%s'.",
                                     new_path_utf8.decode('utf8'))
                    self.update_minimal(current_old[0], current_new_minikind,
                                        executable=current_new[1].executable,
                                        path_utf8=new_path_utf8,
                                        fingerprint=fingerprint)
                # both sides are dealt with, move on
                current_old = advance(old_iterator)
                current_new = advance(new_iterator)
            elif (lt_by_dirs(new_dirname, current_old[0][0])
                  or (new_dirname == current_old[0][0] and
                      new_entry_key[1:] < current_old[0][1:])):
                # new comes before:
                # add an entry for this and advance new
                if tracing:
                    trace.mutter("Inserting from new '%s'.",
                                 new_path_utf8.decode('utf8'))
                self.update_minimal(new_entry_key, current_new_minikind,
                                    executable=current_new[1].executable,
                                    path_utf8=new_path_utf8,
                                    fingerprint=fingerprint)
                current_new = advance(new_iterator)
            else:
                # we've advanced past the place where the old key would be,
                # without seeing it in the new list. so it must be gone.
                if tracing:
                    trace.mutter("Deleting from old '%s/%s'.",
                                 current_old[0][0].decode('utf8'),
                                 current_old[0][1].decode('utf8'))
                self._make_absent(current_old)
                current_old = advance(old_iterator)
        self._mark_modified()
        self._id_index = None
        self._packed_stat_index = None
        if tracing:
            trace.mutter("set_state_from_inventory complete.")

    def set_state_from_scratch(self, working_inv, parent_trees, parent_ghosts):
        """Wipe the currently stored state and set it to something new.

        This is a hard-reset for the data we are working with.
        """
        # Technically, we really want a write lock, but until we write, we
        # don't really need it.
        self._requires_lock()
        # root dir and root dir contents with no children. We have to have a
        # root for set_state_from_inventory to work correctly.
        empty_root = ((b'', b'', inventory.ROOT_ID),
                      [(b'd', b'', 0, False, DirState.NULLSTAT)])
        empty_tree_dirblocks = [(b'', [empty_root]), (b'', [])]
        self._set_data([], empty_tree_dirblocks)
        self.set_state_from_inventory(working_inv)
        self.set_parent_trees(parent_trees, parent_ghosts)

    def _make_absent(self, current_old):
        """Mark current_old - an entry - as absent for tree 0."""

            update_block_index, present = \
                self._find_block_index_from_key(update_key)
            if not present:
                raise AssertionError(
                    'could not find block for %s' % (update_key,))
            update_entry_index, present = \
                self._find_entry_index(
                    update_key, self._dirblocks[update_block_index][1])
            if not present:
                raise AssertionError(
                    'could not find entry for %s' % (update_key,))
            update_tree_details = self._dirblocks[update_block_index][1][
                update_entry_index][1]
            # it must not be absent at the moment
            if update_tree_details[0][0] == b'a':  # absent
                raise AssertionError('bad row %r' % (update_tree_details,))
            update_tree_details[0] = DirState.NULL_PARENT_DETAILS
        self._mark_modified()
        return last_reference

    def update_minimal(self, key, minikind, executable=False, fingerprint=b'',
                       packed_stat=None, size=0, path_utf8=None,
                       fullscan=False):
        """Update an entry to the state in tree 0.

        This will either create a new entry at 'key' or update an existing
        one.
                    update_block_index, present = \
                        self._find_block_index_from_key(other_key)
                    if not present:
                        raise AssertionError(
                            'could not find block for %s' % (other_key,))
                    update_entry_index, present = \
                        self._find_entry_index(
                            other_key, self._dirblocks[update_block_index][1])
                    if not present:
                        raise AssertionError(
                            'update_minimal: could not find entry for %s' %
                            (other_key,))
                    update_details = self._dirblocks[update_block_index][1][
                        update_entry_index][1][lookup_index]
                    if update_details[0] in (b'a', b'r'):  # relocated, absent
                        # it's a pointer or absent in lookup_index's tree,
                        # use it as is.
                        new_entry[1].append(update_details)
                    else:
                        # we have the right key, make a pointer to it.
                        pointer_path = osutils.pathjoin(*other_key[0:2])
                        new_entry[1].append(
                            (b'r', pointer_path, 0, False, b''))
                block.insert(entry_index, new_entry)
                self._add_to_id_index(id_index, key)
            else:
                # Does the new state matter?
                block[entry_index][1][0] = new_details
                # other trees, so put absent pointers there
                # This is the vertical axis in the matrix, all pointing
                # to the real path.
                block_index, present = self._find_block_index_from_key(
                    entry_key)
                if not present:
                    raise AssertionError('not present: %r', entry_key)
                entry_index, present = self._find_entry_index(
                    entry_key, self._dirblocks[block_index][1])
                if not present:
                    raise AssertionError('not present: %r', entry_key)
                self._dirblocks[block_index][1][entry_index][1][0] = \
                    (b'r', path_utf8, 0, False, b'')
        # add a containing dirblock if needed.
        if new_details[0] == b'd':
            # GZ 2017-06-09: Using pathjoin why?
            subdir_key = (osutils.pathjoin(*key[0:2]), b'', b'')
            block_index, present = self._find_block_index_from_key(subdir_key)
            if not present:
                self._dirblocks.insert(block_index, (subdir_key[0], []))

        self._mark_modified()

    def _maybe_remove_row(self, block, index, id_index):
        """Remove index if it is absent or relocated across the row.

        id_index is updated accordingly.

        :return: True if we removed the row, False otherwise
        """
        present_in_row = False
        entry = block[index]
        for column in entry[1]:
            if column[0] not in (b'a', b'r'):
                present_in_row = True
                break
        if not present_in_row:
            block.pop(index)
            self._remove_from_id_index(id_index, entry[0])
            return True
        return False

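`_maybe_remove_row` drops a row only when every per-tree column is absent (`b'a'`) or relocated (`b'r'`). A self-contained sketch of that check, using plain lists and a dict-of-lists id index in place of the tuple-based one:

```python
def maybe_remove_row(block, index, id_index):
    """Pop block[index] when no tree column records real content."""
    entry = block[index]
    present_in_row = any(col[0] not in (b'a', b'r') for col in entry[1])
    if not present_in_row:
        block.pop(index)
        id_index[entry[0][2]].remove(entry[0])
        return True
    return False

key = (b'', b'x', b'file-id')
# one row: absent in tree 0, relocated in tree 1 -> removable
block = [(key, [(b'a', b'', 0, False, b''), (b'r', b'y/x', 0, False, b'')])]
id_index = {b'file-id': [key]}
removed = maybe_remove_row(block, 0, id_index)
```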
    def _validate(self):
        """Check that invariants on the dirblock are correct.
        # We check this with a dict per tree pointing either to the present
        # name, or None if absent.
        tree_count = self._num_present_parents() + 1
        id_path_maps = [{} for _ in range(tree_count)]
        # Make sure that all renamed entries point to the correct location.
        for entry in self._iter_entries():
            file_id = entry[0][2]
            this_path = osutils.pathjoin(entry[0][0], entry[0][1])
            if len(entry[1]) != tree_count:
                raise AssertionError(
                    "wrong number of entry details for row\n%s"
                    ",\nexpected %d" %
                    (pformat(entry), tree_count))
            absent_positions = 0
            for tree_index, tree_state in enumerate(entry[1]):
                this_tree_map = id_path_maps[tree_index]
                minikind = tree_state[0]
                if minikind in (b'a', b'r'):
                    absent_positions += 1
                # have we seen this id before in this column?
                if file_id in this_tree_map:
                    previous_path, previous_loc = this_tree_map[file_id]
                    # any later mention of this file must be consistent with
                    # what was said before
                    if minikind == b'a':
                        if previous_path is not None:
                            raise AssertionError(
                                "file %s is absent in row %r but also present "
                                "at %r" %
                                (file_id.decode('utf-8'), entry,
                                 previous_path))
                    elif minikind == b'r':
                        target_location = tree_state[1]
                        if previous_path != target_location:
                            raise AssertionError(
                                "file %s relocation in row %r but also at %r"
                                % (file_id, entry, previous_path))
                    else:
                        # a file, directory, etc - may have been previously
                        # pointed to by a relocation, which must point here
            # are calculated at the same time, so checking just the size
            # gains nothing w.r.t. performance.
            link_or_sha1 = state._sha1_file(abspath)
            entry[1][0] = (b'f', link_or_sha1, stat_value.st_size,
                           executable, packed_stat)
        else:
            entry[1][0] = (b'f', b'', stat_value.st_size,
                           executable, DirState.NULLSTAT)
            worth_saving = False
    elif minikind == b'd':
        link_or_sha1 = None
        entry[1][0] = (b'd', b'', 0, False, packed_stat)
        if saved_minikind != b'd':
            # This changed from something into a directory. Make sure we
            # have a directory block for it. This doesn't happen very
            # often, so this doesn't have to be super fast.
            block_index, entry_index, dir_present, file_present = \
                state._get_block_entry_index(entry[0][0], entry[0][1], 0)
            state._ensure_block(block_index, entry_index,
                                osutils.pathjoin(entry[0][0], entry[0][1]))
        else:
            worth_saving = False
    elif minikind == b'l':
        if saved_minikind == b'l':
            worth_saving = False
        link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
        if state._cutoff_time is None:
            state._sha_cutoff_time()
        if (stat_value.st_mtime < state._cutoff_time
                and stat_value.st_ctime < state._cutoff_time):
            entry[1][0] = (b'l', link_or_sha1, stat_value.st_size,
                           False, packed_stat)
        else:
            entry[1][0] = (b'l', b'', stat_value.st_size,
                           False, DirState.NULLSTAT)
    if worth_saving:
        state._mark_modified([entry])
    return link_or_sha1
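The `_cutoff_time` branches above implement a race guard: a fingerprint is cached against the file's stat only when mtime and ctime are safely older than the cutoff; otherwise a null stat is stored so the file is re-examined on the next pass. A standalone sketch of the guard (names are illustrative, not the DirState API):

```python
NULLSTAT = b'\0' * 8  # stand-in sentinel: "do not trust this cached stat"

def stat_to_cache(packed_stat, st_mtime, st_ctime, cutoff_time):
    """Return packed_stat only when the file is safely older than cutoff.

    A file modified at (or after) the cutoff could change again within
    the stat granularity window, so its stat must not be cached.
    """
    if st_mtime < cutoff_time and st_ctime < cutoff_time:
        return packed_stat
    return NULLSTAT

cutoff = 1000000.0
old_file = stat_to_cache(b'packed!!', cutoff - 10, cutoff - 10, cutoff)
fresh_file = stat_to_cache(b'packed!!', cutoff + 0.5, cutoff - 10, cutoff)
```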

class ProcessEntryPython(object):

    __slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
                 "last_source_parent", "last_target_parent",
                 "include_unchanged", "partial", "use_filesystem_for_exec",
                 "utf8_decode", "searched_specific_files",
                 "search_specific_files", "searched_exact_paths",
                 "search_specific_file_parents", "seen_ids", "state",
                 "source_index", "target_index", "want_unversioned", "tree"]

    def __init__(self, include_unchanged, use_filesystem_for_exec,
                 search_specific_files, state, source_index, target_index,
                 want_unversioned, tree):
        self.old_dirname_to_file_id = {}
        self.new_dirname_to_file_id = {}
        # Are we doing a partial iter_changes?
        self.partial = search_specific_files != {''}
        # Using a list so that we can access the values and change them in
        # nested scope. Each one is [path, file_id, entry]
        self.last_source_parent = [None, None]
                target_exec = target_details[3]
                return (entry[0][2],
                        (None, self.utf8_decode(path)[0]),
                        (None, self.utf8_decode(entry[0][1])[0]),
                        (None, path_info[2]),
                        (None, target_exec)), True
            else:
                # It's a missing file, report it as such.
                return (entry[0][2],
                        (None, self.utf8_decode(path)[0]),
                        (None, self.utf8_decode(entry[0][1])[0]),
                        (None, False)), True
        elif source_minikind in _fdlt and target_minikind in b'a':
            # unversioned, possibly, or possibly not deleted: we don't care.
            # if it's still on disk, *and* there's no other entry at this
            # path [we don't know this in this routine at the moment -
            # perhaps we should change this] - then it would be an unknown.
            old_path = pathjoin(entry[0][0], entry[0][1])
            # parent id is the entry for the path in the target tree
            parent_id = self.state._get_entry(
                self.source_index, path_utf8=entry[0][0])[0][2]
            if parent_id == entry[0][2]:
                parent_id = None
            return (entry[0][2],
                    (self.utf8_decode(old_path)[0], None),
                    (self.utf8_decode(entry[0][1])[0], None),
                    (DirState._minikind_to_kind[source_minikind], None),
                    (source_details[3], None)), True
        elif source_minikind in _fdlt and target_minikind in b'r':
            # a rename; could be a true rename, or a rename inherited from
            # a renamed parent. TODO: handle this efficiently. It's not a
            # common case to rename dirs though, so a correct but slow
            # implementation will do.
            if not osutils.is_inside_any(self.searched_specific_files,
                                         target_details[1]):
                self.search_specific_files.add(target_details[1])
        elif source_minikind in _ra and target_minikind in _ra:
            # neither of the selected trees contain this file,
            # so skip over it. This is not currently directly tested, but
            # is indirectly via test_too_much.TestCommands.test_conflicts.
            pass
        else:
            raise AssertionError("don't know how to compare "
                                 "source_minikind=%r, target_minikind=%r"
                                 % (source_minikind, target_minikind))
        return None, None

    def __iter__(self):
                if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
                    current_dir_info = None
                elif (sys.platform == 'win32'
                      and (e.errno in win_errors or
                           e_winerror in win_errors)):
                    current_dir_info = None
            if current_dir_info[0][0] == b'':
                # remove .bzr from iteration
                bzr_index = bisect.bisect_left(
                    current_dir_info[1], (b'.bzr',))
                if current_dir_info[1][bzr_index][0] != b'.bzr':
                    raise AssertionError()
                del current_dir_info[1][bzr_index]
        # walk until both the directory listing and the versioned metadata
        # are exhausted.
        if (block_index < len(self.state._dirblocks) and
                osutils.is_inside(current_root,
                                  self.state._dirblocks[block_index][0])):
            current_block = self.state._dirblocks[block_index]
        else:
            current_block = None
        while (current_dir_info is not None or
               current_block is not None):
            if (current_dir_info and current_block
                    and current_dir_info[0][0] != current_block[0]):
                if _lt_by_dirs(current_dir_info[0][0], current_block[0]):
                    # filesystem data refers to paths not covered by the
                    # dirblock.  this has two possibilities:
                    # A) it is versioned but empty, so there is no block
                    #    for it
                    # recurse into unknown directories.
                    path_index = 0
                    while path_index < len(current_dir_info[1]):
                        current_path_info = current_dir_info[1][path_index]
                        if self.want_unversioned:
                            if current_path_info[2] == 'directory':
                                if self.tree._directory_is_tree_reference(
                                        current_path_info[0].decode('utf8')):
                                    current_path_info = \
                                        current_path_info[:2] + \
                                        ('tree-reference',) + \
                                        current_path_info[3:]
                            new_executable = bool(
                                stat.S_ISREG(current_path_info[3].st_mode)
                                and stat.S_IEXEC &
                                current_path_info[3].st_mode)
                                (None, utf8_decode(current_path_info[0])[0]),
                                (None, utf8_decode(current_path_info[1])[0]),
                                (None, current_path_info[2]),
                                (None, new_executable))
                        # don't descend into this unversioned path if it is
                        # a directory
                        if current_path_info[2] in ('directory',
                                                    'tree-reference'):
                            del current_dir_info[1][path_index]
                    # This dir info has been handled, go to the next
                    try:
                        current_dir_info = next(dir_iterator)
                    except StopIteration:
                        current_dir_info = None
                    raise AssertionError(
                        "Got entry<->path mismatch for specific path "
                        "%r entry %r path_info %r " % (
                            path_utf8, entry, path_info))
                # Only include changes - we're outside the user's requested
                self._gather_result_for_consistency(result)
                if (result[6][0] == 'directory' and
                        result[6][1] != 'directory'):
                    # This stopped being a directory, the old children have
                    # to be included.
                    if entry[1][self.source_index][0] == b'r':
                        # renamed, take the source path
                        entry_path_utf8 = entry[1][self.source_index][1]
                    else:
                        entry_path_utf8 = path_utf8
                    initial_key = (entry_path_utf8, b'', b'')
                    block_index, _ = self.state._find_block_index_from_key(
                        initial_key)
                    if block_index == 0:
                        # The children of the root are in block index 1.
                        block_index += 1
                    current_block = None
                    if block_index < len(self.state._dirblocks):
                        current_block = self.state._dirblocks[block_index]
                        if not osutils.is_inside(
                                entry_path_utf8, current_block[0]):
                            # No entries for this directory at all.
                            current_block = None
                    if current_block is not None:
                        for entry in current_block[1]:
                            if entry[1][self.source_index][0] in (b'a', b'r'):
                                # Not in the source tree, so doesn't have to be