lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.

MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
WHOLE_NUMBER = {digit}, digit;
REVISION_ID = a non-empty utf8 string;

dirstate format = header line, full checksum, row count, parent_details,
    ghost_details, entries;
header line = "#bazaar dirstate flat format 3", NL;
full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
row count = "num_entries: ", WHOLE_NUMBER, NL;
parent_details = WHOLE NUMBER, {REVISION_ID}*, NL;
ghost_details = WHOLE NUMBER, {REVISION_ID}*, NL;

entry = entry_key, current_entry_details, {parent_entry_details};
entry_key = dirname, basename, fileid;
current_entry_details = common_entry_details, working_entry_details;
parent_entry_details = common_entry_details, history_entry_details;
common_entry_details = MINIKIND, fingerprint, size, executable;
working_entry_details = packed_stat;
history_entry_details = REVISION_ID;

fingerprint = a nonempty utf8 sequence with meaning defined by minikind.

Given this definition, the following is useful to know::

    entry (aka row) - all the data for a given key.
    entry[0]: The key (dirname, basename, fileid)
    entry[1]: The tree(s) data for this path and id combination.
    entry[1][0]: The current tree
    entry[1][1]: The second tree

For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::

    entry[1][0][0]: minikind
    entry[1][0][1]: fingerprint
    entry[1][0][2]: size
    entry[1][0][3]: executable
    entry[1][0][4]: packed_stat

OR (for non tree-0)::

    entry[1][1][4]: revision_id
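To make the indexing above concrete, here is a hypothetical in-memory entry (all values are made up for illustration; a real fingerprint is a sha1 and a real packed_stat is a base64 blob):

```python
# Hypothetical entry for file 'a/b' with id b'b-id', tracked in the working
# tree (tree 0) and one parent tree (tree 1). Values are illustrative only.
entry = (
    (b'a', b'b', b'b-id'),                         # entry[0]: dirname, basename, fileid
    [(b'f', b'sha1-here', 12, False, b'packed'),   # entry[1][0]: current tree details
     (b'f', b'sha1-here', 12, False, b'rev-id')])  # entry[1][1]: parent tree details

assert entry[0][2] == b'b-id'       # fileid
assert entry[1][0][0] == b'f'       # minikind in the current tree
assert entry[1][0][4] == b'packed'  # packed_stat (working tree only)
assert entry[1][1][4] == b'rev-id'  # revision_id (parent trees only)
```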
There may be multiple rows at the root, one per id present in the root, so the
in memory root row is now::

    self._dirblocks[0] -> ('', [entry ...]),

and the entries in there are::

    entries[0][2]: file_id
    entries[1][0]: The tree data for the current tree for this fileid at /

b'r' is a relocated entry: This path is not present in this tree with this
    id, but the id can be found at another location. The fingerprint is
    used to point to the target location.
b'a' is an absent entry: In that tree the id is not present at this path.
b'd' is a directory entry: This path in this tree is a directory with the
    current file id. There is no fingerprint for directories.
b'f' is a file entry: As for directory, but it's a file. The fingerprint is
    the sha1 value of the file's canonical form, i.e. after any read
    filters have been applied to the convenience form stored in the working
    tree.
b'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
    the link target.
b't' is a reference to a nested subtree; the fingerprint is the referenced
    revision.
The entries on disk and in memory are ordered according to the following keys::

    directory, as a list of components
    filename
    file-id
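A minimal sketch of that ordering (`entry_sort_key` is a hypothetical helper, not part of this module): splitting the directory into path components means, e.g., that b'a-b' sorts after b'a/b', which a plain string sort would not give since '-' < '/':

```python
# Hypothetical helper: compare the directory as a list of components,
# then the filename, then the file-id.
def entry_sort_key(entry_key):
    dirname, basename, file_id = entry_key
    return (dirname.split(b'/'), basename, file_id)

keys = [(b'a-b', b'x', b'id3'), (b'a/b', b'x', b'id2'), (b'a', b'x', b'id1')]
assert sorted(keys, key=entry_sort_key) == [
    (b'a', b'x', b'id1'), (b'a/b', b'x', b'id2'), (b'a-b', b'x', b'id3')]
```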
--- Format 1 had the following different definition: ---

rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
    WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
    basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,

PARENT ROWs are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
each row will have a PARENT ROW for foo and baz, but not for bar.
ERROR_DIRECTORY = 267


if not getattr(struct, '_compile', None):
    # Cannot pre-compile the dirstate pack_stat
    def pack_stat(st, _encode=binascii.b2a_base64, _pack=struct.pack):
        """Convert stat values into a packed representation."""
        return _encode(_pack('>LLLLLL', st.st_size, int(st.st_mtime),
                             int(st.st_ctime), st.st_dev, st.st_ino & 0xFFFFFFFF,
                             st.st_mode))[:-1]
else:
    # compile the struct compiler we need, so as to only do it once
    from _struct import Struct
    _compiled_pack = Struct('>LLLLLL').pack

    def pack_stat(st, _encode=binascii.b2a_base64, _pack=_compiled_pack):
        """Convert stat values into a packed representation."""
        # jam 20060614 it isn't really worth removing more entries if we
        # are going to leave it in packed form.
        # With only st_mtime and st_mode filesize is 5.5M and read time is 275ms
        # With all entries, filesize is 5.9M and read time is maybe 280ms
        # well within the noise margin

        # base64 encoding always adds a final newline, so strip it off
        # The current version
        return _encode(_pack(st.st_size, int(st.st_mtime), int(st.st_ctime),
                             st.st_dev, st.st_ino & 0xFFFFFFFF, st.st_mode))[:-1]
        # This is 0.060s / 1.520s faster by not encoding as much information
        # return _encode(_pack('>LL', int(st.st_mtime), st.st_mode))[:-1]
        # This is not strictly faster than _encode(_pack())[:-1]
        # return '%X.%X.%X.%X.%X.%X' % (
        #     st.st_size, int(st.st_mtime), int(st.st_ctime),
        #     st.st_dev, st.st_ino, st.st_mode)
        # Similar to the _encode(_pack('>LL'))
        # return '%X.%X' % (int(st.st_mtime), st.st_mode)
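A standalone sketch of the packing scheme above: six 32-bit fields packed big-endian, base64 encoded, trailing newline stripped. Unlike the original, the fields are masked here so oversized values cannot overflow '>L' (`pack_stat_sketch` is an illustrative name, not part of this module):

```python
import binascii
import os
import struct

_sketch_pack = struct.Struct('>LLLLLL').pack


def pack_stat_sketch(st):
    """Pack six stat fields into a 32-character base64 token."""
    mask = 0xFFFFFFFF
    packed = _sketch_pack(
        st.st_size & mask, int(st.st_mtime) & mask, int(st.st_ctime) & mask,
        st.st_dev & mask, st.st_ino & mask, st.st_mode & mask)
    # base64 of 24 bytes is exactly 32 characters; [:-1] drops the newline
    # that b2a_base64 always appends.
    return binascii.b2a_base64(packed)[:-1]


token = pack_stat_sketch(os.stat('.'))
assert len(token) == 32 and not token.endswith(b'\n')
```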
class DirstateCorrupt(errors.BzrError):

    _fmt = "The dirstate file (%(state)s) appears to be corrupt: %(msg)s"

    def __init__(self, state, msg):
        errors.BzrError.__init__(self)
        self.state = state
        self.msg = msg
class SHA1Provider(object):

    NOT_IN_MEMORY = 0
    IN_MEMORY_UNMODIFIED = 1
    IN_MEMORY_MODIFIED = 2
    IN_MEMORY_HASH_MODIFIED = 3  # Only hash-cache updates

    # A pack_stat (the x's) that is just noise and will never match the output
    # of base64 encode.
    NULLSTAT = b'x' * 32
    NULL_PARENT_DETAILS = static_tuple.StaticTuple(b'a', b'', 0, False, b'')

    HEADER_FORMAT_2 = b'#bazaar dirstate flat format 2\n'
    HEADER_FORMAT_3 = b'#bazaar dirstate flat format 3\n'

    def __init__(self, path, sha1_provider, worth_saving_limit=0,
                 use_filesystem_for_exec=True):
        """Create a DirState object.

        :param path: The path at which the dirstate file on disk should live.
        :param sha1_provider: an object meeting the SHA1Provider interface.
        :param worth_saving_limit: when the exact number of hash changed
            entries is known, only bother saving the dirstate if more than
            this count of entries have changed.
            -1 means never save hash changes, 0 means always save hash changes.
        :param use_filesystem_for_exec: Whether to trust the filesystem
            for executable bit information
        """
        # _header_state and _dirblock_state represent the current state
        # of the dirstate metadata and the per-row data respectively.
        self._last_block_index = None
        self._last_entry_index = None
        # The set of known hash changes
        self._known_hash_changes = set()
        # How many hash changed entries can we have without saving
        self._worth_saving_limit = worth_saving_limit
        self._config_stack = config.LocationStack(urlutils.local_path_to_url(
            path))
        self._use_filesystem_for_exec = use_filesystem_for_exec
    def __repr__(self):
        return "%s(%r)" % \
            (self.__class__.__name__, self._filename)

    def _mark_modified(self, hash_changed_entries=None, header_modified=False):
        """Mark this dirstate as modified.

        :param hash_changed_entries: if non-None, mark just these entries as
            having their hash modified.
        :param header_modified: mark the header modified as well, not just the
            dirblocks.
        """
        # trace.mutter_callsite(3, "modified hash entries: %s", hash_changed_entries)
        if hash_changed_entries:
            self._known_hash_changes.update(
                [e[0] for e in hash_changed_entries])
            if self._dirblock_state in (DirState.NOT_IN_MEMORY,
                                        DirState.IN_MEMORY_UNMODIFIED):
                # If the dirstate is already marked as IN_MEMORY_MODIFIED, then
                # that takes precedence.
                self._dirblock_state = DirState.IN_MEMORY_HASH_MODIFIED
        else:
            # TODO: Since we now have an IN_MEMORY_HASH_MODIFIED state, we
            #       should fail noisily if someone tries to set
            #       IN_MEMORY_MODIFIED but we don't have a write-lock!
            # We don't know exactly what changed so disable smart saving
            self._dirblock_state = DirState.IN_MEMORY_MODIFIED
        if header_modified:
            self._header_state = DirState.IN_MEMORY_MODIFIED

    def _mark_unmodified(self):
        """Mark this dirstate as unmodified."""
        self._header_state = DirState.IN_MEMORY_UNMODIFIED
        self._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
        self._known_hash_changes = set()
    def add(self, path, file_id, kind, stat, fingerprint):
        """Add a path to be tracked.

        :param path: The path within the dirstate - b'' is the root, 'foo' is the
            path foo within the root, 'foo/bar' is the path bar within foo
            within the root.
        :param file_id: The file id of the path being added.
        utf8path = (dirname + '/' + basename).strip('/').encode('utf8')
        dirname, basename = osutils.split(utf8path)
        # uses __class__ for speed; the check is needed for safety
        if file_id.__class__ is not bytes:
            raise AssertionError(
                "must be a utf8 file_id not %s" % (type(file_id), ))
        # Make sure the file_id does not exist in this tree
        rename_from = None
        file_id_entry = self._get_entry(
            0, fileid_utf8=file_id, include_deleted=True)
        if file_id_entry != (None, None):
            if file_id_entry[1][0][0] == b'a':
                if file_id_entry[0] != (dirname, basename, file_id):
                    # set the old name's current operation to rename
                    self.update_minimal(file_id_entry[0], b'r',
                                        path_utf8=b'', packed_stat=b'',
                                        fingerprint=utf8path)
                    rename_from = file_id_entry[0][0:2]
            else:
                path = osutils.pathjoin(
                    file_id_entry[0][0], file_id_entry[0][1])
                kind = DirState._minikind_to_kind[file_id_entry[1][0][0]]
                info = '%s:%s' % (kind, path)
                raise errors.DuplicateFileId(file_id, info)
        first_key = (dirname, basename, b'')
        block_index, present = self._find_block_index_from_key(first_key)
        if present:
            # check the path is not in the tree
            block = self._dirblocks[block_index][1]
            entry_index, _ = self._find_entry_index(first_key, block)
            while (entry_index < len(block) and
                   block[entry_index][0][0:2] == first_key[0:2]):
                if block[entry_index][1][0][0] not in (b'a', b'r'):
                    # this path is in the dirstate in the current tree.
                    raise Exception("adding already added path!")
                entry_index += 1
        else:
            # The block where we want to put the file is not present. But it
                                     fingerprint, new_child_path)
        self._check_delta_ids_absent(new_ids, delta, 0)
        try:
            self._apply_removals(viewitems(removals))
            self._apply_insertions(viewvalues(insertions))
            # Validate parents
            self._after_delta_check_parents(parents, 0)
        except errors.BzrError as e:
            self._changes_aborted = True
            if 'integrity error' not in str(e):
                raise
            # _get_entry raises BzrError when a request is inconsistent; we
            # want such errors to be shown as InconsistentDelta - and that
            # fits the behaviour we trigger.
            raise errors.InconsistentDeltaDelta(delta,
                "error from _get_entry. %s" % (e,))
    def _apply_removals(self, removals):
        for file_id, path in sorted(removals, reverse=True,
                                    key=operator.itemgetter(1)):
            dirname, basename = osutils.split(path)
            block_i, entry_i, d_present, f_present = \
                self._get_block_entry_index(dirname, basename, 0)
            try:
                entry = self._dirblocks[block_i][1][entry_i]
            except IndexError:
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if not f_present or entry[1][0][0] in (b'a', b'r'):
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if file_id != entry[0][2]:
                self._raise_invalid(path, file_id,
                    "Attempt to remove path has wrong id - found %r."
                    % (entry[0][2],))
            self._make_absent(entry)
            # See if we have a malformed delta: deleting a directory must not
            # leave crud behind. This increases the number of bisects needed
        new_ids = set()
        for old_path, new_path, file_id, inv_entry in delta:
            if file_id.__class__ is not bytes:
                raise AssertionError(
                    "must be a utf8 file_id not %s" % (type(file_id), ))
            if inv_entry is not None and file_id != inv_entry.file_id:
                self._raise_invalid(new_path, file_id,
                    "mismatched entry file_id %r" % inv_entry)
            if new_path is None:
                new_path_utf8 = None
            else:
                if inv_entry is None:
                    self._raise_invalid(new_path, file_id,
                        "new_path with no entry")
                new_path_utf8 = encode(new_path)
                # note the parent for validation
                dirname_utf8, basename_utf8 = osutils.split(new_path_utf8)
                if basename_utf8:
                    parents.add((dirname_utf8, inv_entry.parent_id))
            if old_path is None:
                old_path_utf8 = None
            else:
                old_path_utf8 = encode(old_path)
            if old_path is None:
                adds.append((None, new_path_utf8, file_id,
                             inv_to_entry(inv_entry), True))
                new_ids.add(file_id)
            elif new_path is None:
                deletes.append((old_path_utf8, None, file_id, None, True))
            elif (old_path, new_path) == root_only:
                # change things in-place
                # Note: the case of a parent directory changing its file_id
                #       tends to break optimizations here, because officially
                #       the file has actually been moved, it just happens to
                #       end up at the same path. If we can figure out how to
                #       handle that case, we can avoid a lot of add+delete
                #       pairs for objects that stay put.
                # elif old_path == new_path:
                changes.append((old_path_utf8, new_path_utf8, file_id,
                                inv_to_entry(inv_entry)))
            else:
                # Because renames must preserve their children we must have
                # processed all relocations and removes before hand. The sort
                # pair will result in the deleted item being reinserted, or
                # renamed items being reinserted twice - and possibly at the
                # wrong place. Splitting into a delete/add pair also simplifies
                # the handling of entries with (b'f', ...), (b'r' ...) because
                # the target of the b'r' is old_path here, and we add that to
                # deletes, meaning that the add handler does not need to check
                # for b'r' items on every pass.
                self._update_basis_apply_deletes(deletes)
                # Split into an add/delete pair recursively.
                adds.append((old_path_utf8, new_path_utf8, file_id,
                             inv_to_entry(inv_entry), False))
                # Expunge deletes that we've seen so that deleted/renamed
                # children of a rename directory are handled correctly.
                new_deletes = reversed(list(
                    self._iter_child_entries(1, old_path_utf8)))
                # Remove the current contents of the tree at orig_path, and
                # reinsert at the correct new path.
                for entry in new_deletes:
                    child_dirname, child_basename, child_file_id = entry[0]
                    if child_dirname:
                        source_path = child_dirname + b'/' + child_basename
                    else:
                        source_path = child_basename
                    if new_path_utf8:
                        target_path = (
                            new_path_utf8 + source_path[len(old_path_utf8):])
                    else:
                        if old_path_utf8 == b'':
                            raise AssertionError("cannot rename directory to"
                                                 " itself")
                        target_path = source_path[len(old_path_utf8) + 1:]
                    adds.append(
                        (None, target_path, entry[0][2], entry[1][1], False))
                    deletes.append(
                        (source_path, target_path, entry[0][2], None, False))
                deletes.append(
                    (old_path_utf8, new_path_utf8, file_id, None, False))
        self._check_delta_ids_absent(new_ids, delta, 1)
        # Finish expunging deletes/first half of renames.

        # Adds are accumulated partly from renames, so can be in any input
        # order - sort it.
        # TODO: we may want to sort in dirblocks order. That way each entry
        #       will end up in the same directory, allowing the _get_entry
        #       fast-path for looking up 2 items in the same dir work.
        adds.sort(key=lambda x: x[1])
        # adds is now in lexicographic order, which places all parents before
        # their children, so we can process it linearly.
        st = static_tuple.StaticTuple
        for old_path, new_path, file_id, new_details, real_add in adds:
            dirname, basename = osutils.split(new_path)
            entry_key = st(dirname, basename, file_id)
            block_index, present = self._find_block_index_from_key(entry_key)
            if not present:
                # The block where we want to put the file is not present.
                # However, it might have just been an empty directory. Look for
                # the parent in the basis-so-far before throwing an error.
                parent_dir, parent_base = osutils.split(dirname)
                parent_block_idx, parent_entry_idx, _, parent_present = \
                    self._get_block_entry_index(parent_dir, parent_base, 1)
                if not parent_present:
                    self._raise_invalid(new_path, file_id,
                        "Unable to find block for this record."
                        " Was the parent added?")
                self._ensure_block(parent_block_idx, parent_entry_idx, dirname)

            block = self._dirblocks[block_index][1]
            entry_index, present = self._find_entry_index(entry_key, block)
            if old_path is not None:
                self._raise_invalid(new_path, file_id,
                    'considered a real add but still had old_path at %s'
                    % (old_path,))
            if present:
                entry = block[entry_index]
                basis_kind = entry[1][1][0]
                if basis_kind == b'a':
                    entry[1][1] = new_details
                elif basis_kind == b'r':
                    raise NotImplementedError()
                else:
                    self._raise_invalid(new_path, file_id,
                        "An entry was marked as a new add"
                        " but the basis target already existed")
            else:
                # The exact key was not found in the block. However, we need to
                # check if there is a key next to us that would have matched.
                # We only need to check 2 locations, because there are only 2
                for maybe_index in range(entry_index - 1, entry_index + 1):
                    if maybe_index < 0 or maybe_index >= len(block):
                        continue
                    maybe_entry = block[maybe_index]
                    if maybe_entry[0][:2] != (dirname, basename):
                        # Just a random neighbor
                        continue
                    if maybe_entry[0][2] == file_id:
                        raise AssertionError(
                            '_find_entry_index didnt find a key match'
                            ' but walking the data did, for %s'
                            % (entry_key,))
                    basis_kind = maybe_entry[1][1][0]
                    if basis_kind not in (b'a', b'r'):
                        self._raise_invalid(new_path, file_id,
                            "we have an add record for path, but the path"
                            " is already present with another file_id %s"
                            % (maybe_entry[0][2],))

                entry = (entry_key, [DirState.NULL_PARENT_DETAILS,
                                     new_details])
                block.insert(entry_index, entry)

            active_kind = entry[1][0][0]
            if active_kind == b'a':
                # The active record shows up as absent, this could be genuine,
                # or it could be present at some other location. We need to
                id_index = self._get_id_index()
                # The id_index may not be perfectly accurate for tree1, because
                # we haven't been keeping it updated. However, it should be
                # fine for tree0, and that gives us enough info for what we
                keys = id_index.get(file_id, ())
                for key in keys:
                    block_i, entry_i, d_present, f_present = \
                        self._get_block_entry_index(key[0], key[1], 0)
                    if not f_present:
                        continue
                    active_entry = self._dirblocks[block_i][1][entry_i]
                    if (active_entry[0][2] != file_id):
                        # Some other file is at this path, we don't need to
                        continue
                    real_active_kind = active_entry[1][0][0]
                    if real_active_kind in (b'a', b'r'):
                        # We found a record, which was not *this* record,
                        # which matches the file_id, but is not actually
                        # present. Something seems *really* wrong.
                        self._raise_invalid(new_path, file_id,
                            "We found a tree0 entry that doesnt make sense")
                    # Now, we've found a tree0 entry which matches the file_id
                    # but is at a different location. So update them to be
                    active_dir, active_name = active_entry[0][:2]
                    if active_dir:
                        active_path = active_dir + b'/' + active_name
                    else:
                        active_path = active_name
                    active_entry[1][1] = st(b'r', new_path, 0, False, b'')
                    entry[1][0] = st(b'r', active_path, 0, False, b'')
            elif active_kind == b'r':
                raise NotImplementedError()

            new_kind = new_details[0]
            if new_kind == b'd':
                self._ensure_block(block_index, entry_index, new_path)
    def _update_basis_apply_changes(self, changes):
        """Apply a sequence of changes to tree 1 during update_basis_by_delta.
        """

        null = DirState.NULL_PARENT_DETAILS
        for old_path, new_path, file_id, _, real_delete in deletes:
            if real_delete != (new_path is None):
                self._raise_invalid(old_path, file_id, "bad delete delta")
            # the entry for this file_id must be in tree 1.
            dirname, basename = osutils.split(old_path)
            block_index, entry_index, dir_present, file_present = \
                self._get_block_entry_index(dirname, basename, 1)
            if not file_present:
                self._raise_invalid(old_path, file_id,
                    'basis tree does not contain removed entry')
            entry = self._dirblocks[block_index][1][entry_index]
            # The state of the entry in the 'active' WT
            active_kind = entry[1][0][0]
            if entry[0][2] != file_id:
                self._raise_invalid(old_path, file_id,
                    'mismatched file_id in tree 1')
            old_kind = entry[1][1][0]
            if active_kind in b'ar':
                # The active tree doesn't have this file_id.
                # The basis tree is changing this record. If this is a
                # rename, then we don't want the record here at all
                # anymore. If it is just an in-place change, we want the
                # record here, but we'll add it if we need to. So we just
                if active_kind == b'r':
                    active_path = entry[1][0][1]
                    active_entry = self._get_entry(0, file_id, active_path)
                    if active_entry[1][1][0] != b'r':
                        self._raise_invalid(old_path, file_id,
                            "Dirstate did not have matching rename entries")
                    elif active_entry[1][0][0] in b'ar':
                        self._raise_invalid(old_path, file_id,
                            "Dirstate had a rename pointing at an inactive"
                            " tree0")
                    active_entry[1][1] = null
                del self._dirblocks[block_index][1][entry_index]
                if old_kind == b'd':
                    # This was a directory, and the active tree says it
                    # doesn't exist, and now the basis tree says it doesn't
                    # exist. Remove its dirblock if present
                    (dir_block_index,
                     present) = self._find_block_index_from_key(
                        (old_path, b'', b''))
                    if present:
                        dir_block = self._dirblocks[dir_block_index][1]
                        if not dir_block:
                            # This entry is empty, go ahead and just remove it
                            del self._dirblocks[dir_block_index]
            else:
                # There is still an active record, so just mark this
                block_i, entry_i, d_present, f_present = \
                    self._get_block_entry_index(old_path, b'', 1)
                dir_block = self._dirblocks[block_i][1]
                for child_entry in dir_block:
                    child_basis_kind = child_entry[1][1][0]
                    if child_basis_kind not in b'ar':
                        self._raise_invalid(old_path, file_id,
                            "The file id was deleted but its children were "
                            "not deleted.")
    def _after_delta_check_parents(self, parents, index):
        """Check that parents required by the delta are all intact.

        :param parents: An iterable of (path_utf8, file_id) tuples which are
            required to be present in tree 'index' at path_utf8 with id file_id
            and be a directory.
        tree present there.
        """
        self._read_dirblocks_if_needed()
        key = dirname, basename, b''
        block_index, present = self._find_block_index_from_key(key)
        if not present:
            # no such directory - return the dir index and 0 for the row.
            return block_index, 0, False, False
        block = self._dirblocks[block_index][1]  # access the entries only
        entry_index, present = self._find_entry_index(key, block)
        # linear search through entries at this path to find the one
        # requested.
        while entry_index < len(block) and block[entry_index][0][1] == basename:
            if block[entry_index][1][tree_index][0] not in (b'a', b'r'):
                # neither absent or relocated
                return block_index, entry_index, True, True
            entry_index += 1
        return block_index, entry_index, True, False
    def _get_entry(self, tree_index, fileid_utf8=None, path_utf8=None,
                   include_deleted=False):
        """Get the dirstate entry for path in tree tree_index.

        If either file_id or path is supplied, it is used as the key to lookup.
    def _get_id_index(self):
        """Get an id index of self._dirblocks.

        This maps from file_id => [(directory, name, file_id)] entries where
        that file_id appears in one of the trees.
        """
        if self._id_index is None:
            id_index = {}
            for key, tree_details in self._iter_entries():
                self._add_to_id_index(id_index, key)
            self._id_index = id_index
        return self._id_index

    def _add_to_id_index(self, id_index, entry_key):
        """Add this entry to the _id_index mapping."""
        # This code used to use a set for every entry in the id_index. However,
        # it is *rare* to have more than one entry. So a set is a large
        # overkill. And even when we do, we won't ever have more than the
        # number of parent trees. Which is still a small number (rarely >2). As
        # such, we use a simple tuple, and do our own uniqueness checks. While
        # the 'in' check is O(N) since N is nicely bounded it shouldn't ever
        # cause quadratic failure.
        file_id = entry_key[2]
        entry_key = static_tuple.StaticTuple.from_sequence(entry_key)
        if file_id not in id_index:
            id_index[file_id] = static_tuple.StaticTuple(entry_key,)
        else:
            entry_keys = id_index[file_id]
            if entry_key not in entry_keys:
                id_index[file_id] = entry_keys + (entry_key,)
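The comment above explains why a plain tuple stands in for a set here. A minimal standalone sketch of the same pattern, using ordinary tuples instead of StaticTuple (names below are illustrative, not the real helper):

```python
def add_unique(index, file_id, entry_key):
    # Tuple instead of set: the 'in' check is O(N), but N is bounded by
    # the number of parent trees, so it stays tiny in practice.
    if file_id not in index:
        index[file_id] = (entry_key,)
    else:
        keys = index[file_id]
        if entry_key not in keys:
            index[file_id] = keys + (entry_key,)

idx = {}
add_unique(idx, b'id-1', (b'', b'a', b'id-1'))
add_unique(idx, b'id-1', (b'', b'a', b'id-1'))   # duplicate, ignored
add_unique(idx, b'id-1', (b'dir', b'b', b'id-1'))
# idx[b'id-1'] now holds exactly two distinct keys
```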

    def _remove_from_id_index(self, id_index, entry_key):
        """Remove this entry from the _id_index mapping.

        It is a programming error to call this when the entry_key is not
        present.
        """
        file_id = entry_key[2]
        entry_keys = list(id_index[file_id])
        entry_keys.remove(entry_key)
        id_index[file_id] = static_tuple.StaticTuple.from_sequence(entry_keys)

    def _get_output_lines(self, lines):
        """Format lines for final output."""
        output_lines = [DirState.HEADER_FORMAT_3]
        lines.append(b'')  # a final newline
        inventory_text = b'\0\n\0'.join(lines)
        output_lines.append(b'crc32: %d\n' % (zlib.crc32(inventory_text),))
        # -3, 1 for num parents, 1 for ghosts, 1 for final newline
        num_entries = len(lines) - 3
        output_lines.append(b'num_entries: %d\n' % (num_entries,))
        output_lines.append(inventory_text)
        return output_lines
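This method implements the on-disk framing from the format grammar (header line, crc32 line, num_entries line, NUL-delimited rows). A self-contained miniature of that framing, with a placeholder constant standing in for DirState.HEADER_FORMAT_3:

```python
import zlib

HEADER = b'#bazaar dirstate flat format 3\n'   # stand-in for HEADER_FORMAT_3

def frame(lines):
    # Rows are joined with NUL-newline-NUL; the trailing b'' yields the
    # final newline the format requires.
    lines = list(lines) + [b'']
    payload = b'\0\n\0'.join(lines)
    out = [HEADER]
    out.append(b'crc32: %d\n' % (zlib.crc32(payload),))
    # -3: one line each for parents, ghosts, and the final newline
    out.append(b'num_entries: %d\n' % (len(lines) - 3,))
    out.append(payload)
    return out

framed = frame([b'0', b'0', b'row-data'])
# framed[2] is b'num_entries: 1\n' for the single row above
```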

    def _make_deleted_row(self, fileid_utf8, parents):
        """Return a deleted row for fileid_utf8."""
        return (b'/', b'RECYCLED.BIN', b'file', fileid_utf8, 0, DirState.NULLSTAT,
                b''), parents

    def _num_present_parents(self):
        """The number of parent entries in each record row."""
        return len(self._parents) - len(self._ghosts)

    @classmethod
    def on_file(cls, path, sha1_provider=None, worth_saving_limit=0,
                use_filesystem_for_exec=True):
        """Construct a DirState on the file at path "path".

        :param path: The path at which the dirstate file on disk should live.
        :param sha1_provider: an object meeting the SHA1Provider interface.
            If None, a DefaultSHA1Provider is used.
        :param worth_saving_limit: when the exact number of hash changed
            entries is known, only bother saving the dirstate if more than
            this count of entries have changed. -1 means never save.
        :param use_filesystem_for_exec: Whether to trust the filesystem
            for executable bit information
        :return: An unlocked DirState object, associated with the given path.
        """
        if sha1_provider is None:
            sha1_provider = DefaultSHA1Provider()
        result = cls(path, sha1_provider,
                     worth_saving_limit=worth_saving_limit,
                     use_filesystem_for_exec=use_filesystem_for_exec)
        return result

    def _read_dirblocks_if_needed(self):
            raise errors.BzrError(
                'invalid header line: %r' % (header,))
        crc_line = self._state_file.readline()
        if not crc_line.startswith(b'crc32: '):
            raise errors.BzrError('missing crc32 checksum: %r' % crc_line)
        self.crc_expected = int(crc_line[len(b'crc32: '):-1])
        num_entries_line = self._state_file.readline()
        if not num_entries_line.startswith(b'num_entries: '):
            raise errors.BzrError('missing num_entries line')
        self._num_entries = int(num_entries_line[len(b'num_entries: '):-1])
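The crc32/num_entries parsing above can be exercised standalone; a sketch with ValueError standing in for errors.BzrError:

```python
def parse_header_fields(crc_line, num_entries_line):
    # Mirror the checks above: each field line is b'name: value\n', and
    # slicing off the prefix and trailing newline leaves the integer text.
    if not crc_line.startswith(b'crc32: '):
        raise ValueError('missing crc32 checksum: %r' % (crc_line,))
    crc_expected = int(crc_line[len(b'crc32: '):-1])
    if not num_entries_line.startswith(b'num_entries: '):
        raise ValueError('missing num_entries line')
    num_entries = int(num_entries_line[len(b'num_entries: '):-1])
    return crc_expected, num_entries

# int() accepts bytes, including a b'-' sign for the historical
# negative crc32 values the grammar allows.
crc, n = parse_header_fields(b'crc32: 123456\n', b'num_entries: 42\n')
```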

    def sha1_from_stat(self, path, stat_result):
        """Find a sha1 given a stat lookup."""
        return self._get_packed_stat_index().get(pack_stat(stat_result), None)

    def _get_packed_stat_index(self):
        """Get a packed_stat index of self._dirblocks."""
        if self._packed_stat_index is None:
            index = {}
            for key, tree_details in self._iter_entries():
                if tree_details[0][0] == b'f':
                    index[tree_details[0][4]] = tree_details[0][1]
            self._packed_stat_index = index
        return self._packed_stat_index
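The index built above maps each plain file's packed stat to its recorded sha1, so sha1_from_stat becomes a single dict lookup. A standalone sketch with hand-made entry tuples following the (minikind, fingerprint, size, executable, packed_stat) layout used throughout (the byte values are placeholders):

```python
entries = [
    (b'f', b'sha1-of-a', 10, False, b'stat-a'),   # plain file: carries a sha1
    (b'd', b'', 0, False, b'stat-dir'),           # directory: no sha1, skipped
]
index = {}
for details in entries:
    if details[0] == b'f':
        index[details[4]] = details[1]

# Lookup answers "what sha1 did we record for a file with this stat?"
assert index.get(b'stat-a') == b'sha1-of-a'
assert index.get(b'stat-dir') is None
```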

            # Should this be a warning? For now, I'm expecting that places that
            # mark it inconsistent will warn, making a warning here redundant.
            trace.mutter('Not saving DirState because '
                         '_changes_aborted is set.')
            return
        # TODO: Since we now distinguish IN_MEMORY_MODIFIED from
        # IN_MEMORY_HASH_MODIFIED, we should only fail quietly if we fail
        # to save an IN_MEMORY_HASH_MODIFIED, and fail *noisily* if we
        # fail to save IN_MEMORY_MODIFIED
        if not self._worth_saving():
            return

        grabbed_write_lock = False
        if self._lock_state != 'w':
            grabbed_write_lock, new_lock = self._lock_token.temporary_write_lock()
            # Switch over to the new lock, as the old one may be closed.
            # TODO: jam 20070315 We should validate the disk file has
            #       not changed contents, since temporary_write_lock may
            #       not be an atomic operation.
            self._lock_token = new_lock
            self._state_file = new_lock.f
            if not grabbed_write_lock:
                # We couldn't grab a write lock, so we switch back to a read one
                return
        try:
            lines = self.get_lines()
            self._state_file.seek(0)
            self._state_file.writelines(lines)
            self._state_file.truncate()
            self._state_file.flush()
            self._maybe_fdatasync()
            self._mark_unmodified()
        finally:
            if grabbed_write_lock:
                self._lock_token = self._lock_token.restore_read_lock()
                self._state_file = self._lock_token.f
                # TODO: jam 20070315 We should validate the disk file has
                #       not changed contents. Since restore_read_lock may
                #       not be an atomic operation.

    def _maybe_fdatasync(self):
        """Flush to disk if possible and if not configured off."""
        if self._config_stack.get('dirstate.fdatasync'):
            osutils.fdatasync(self._state_file.fileno())

    def _worth_saving(self):
        """Is it worth saving the dirstate or not?"""
        if (self._header_state == DirState.IN_MEMORY_MODIFIED
                or self._dirblock_state == DirState.IN_MEMORY_MODIFIED):
            return True
        if self._dirblock_state == DirState.IN_MEMORY_HASH_MODIFIED:
            if self._worth_saving_limit == -1:
                # We never save hash changes when the limit is -1
                return False
            # If we're using smart saving and only a small number of
            # entries have changed their hash, don't bother saving. John has
            # suggested using a heuristic here based on the size of the
            # changed files and/or tree. For now, we go with a configurable
            # number of changes, keeping the calculation time
            # as low overhead as possible. (This also keeps all existing
            # tests passing as the default is 0, i.e. always save.)
            if len(self._known_hash_changes) >= self._worth_saving_limit:
                return True
        return False
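The save heuristic above can be isolated; a standalone sketch, with plain booleans and ints standing in for the DirState state flags:

```python
def worth_saving(header_modified, blocks_modified, hash_modified,
                 num_hash_changes, limit):
    # Structural changes always warrant a save; hash-only changes are
    # saved once `limit` entries have changed (-1 means never).
    if header_modified or blocks_modified:
        return True
    if hash_modified:
        if limit == -1:
            return False
        return num_hash_changes >= limit
    return False

# A limit of 0 saves on any hash change, matching the default noted above.
assert worth_saving(False, False, True, 1, 0)
assert not worth_saving(False, False, True, 5, -1)
```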

    def _set_data(self, parent_ids, dirblocks):
        """Set the full dirstate data in memory."""
                    # mapping from path,id. We need to look up the correct path
                    # for the indexes from 0 to tree_index -1
                    new_details = []
                    for lookup_index in range(tree_index):
                        # boundary case: this is the first occurrence of file_id
                        # so there are no id_indexes, possibly take this out of
                        # the loop?
                        if not len(entry_keys):
                            new_details.append(DirState.NULL_PARENT_DETAILS)
                        else:
                            # grab any one entry, use it to find the right path.
                            # TODO: optimise this to reduce memory use in highly
                            # fragmented situations by reusing the relocation
                            # records.
                            a_key = next(iter(entry_keys))
                            if by_path[a_key][lookup_index][0] in (b'r', b'a'):
                                # it's a pointer or missing statement, use it as
                                # is.
                                new_details.append(
                                    by_path[a_key][lookup_index])
                            else:
                                # we have the right key, make a pointer to it.
                                real_path = (b'/'.join(a_key[0:2])).strip(b'/')
                                new_details.append(st(b'r', real_path, 0, False,
                                                      b''))
                    new_details.append(self._inv_entry_to_details(entry))
                    new_details.extend(new_location_suffix)
                    by_path[new_entry_key] = new_details
                    self._add_to_id_index(id_index, new_entry_key)
            # --- end generation of full tree mappings

            # sort and output all the entries
            new_entries = self._sort_entries(viewitems(by_path))
            self._entries_to_current_state(new_entries)
            self._parents = [rev_id for rev_id, tree in trees]
            self._ghosts = list(ghosts)
            self._mark_modified(header_modified=True)
            self._id_index = id_index
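The (b'r', real_path, 0, False, b'') tuples built above are relocation records: for a relocated entry, the fingerprint slot holds the entry's real path. A toy resolver over hand-made entries (keys, hashes, and paths here are illustrative):

```python
# Entries keyed by (dirname, basename, file_id); details are
# (minikind, fingerprint, size, executable, packed_stat).
entries = {
    (b'', b'new-name', b'id-1'): (b'f', b'sha1-hash', 10, False, b'packed'),
    (b'old', b'name', b'id-1'): (b'r', b'new-name', 0, False, b''),
}

def resolve(key):
    details = entries[key]
    if details[0] == b'r':          # relocated: follow the pointer
        real_path = details[1]
        dirname, _, basename = real_path.rpartition(b'/')
        return resolve((dirname, basename, key[2]))
    return details

# Looking up the old location follows the relocation to the real file entry.
assert resolve((b'old', b'name', b'id-1'))[0] == b'f'
```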

    def _sort_entries(self, entry_list):
2896
# the minimal required trigger is if the execute bit or cached
2605
2897
# kind has changed.
2606
2898
if (current_old[1][0][3] != current_new[1].executable or
2607
current_old[1][0][0] != current_new_minikind):
2899
current_old[1][0][0] != current_new_minikind):
2609
2901
trace.mutter("Updating in-place change '%s'.",
2610
new_path_utf8.decode('utf8'))
2902
new_path_utf8.decode('utf8'))
2611
2903
self.update_minimal(current_old[0], current_new_minikind,
2612
executable=current_new[1].executable,
2613
path_utf8=new_path_utf8, fingerprint=fingerprint,
2904
executable=current_new[1].executable,
2905
path_utf8=new_path_utf8, fingerprint=fingerprint,
2615
2907
# both sides are dealt with, move on
2616
2908
current_old = advance(old_iterator)
2617
2909
current_new = advance(new_iterator)
2618
elif (cmp_by_dirs(new_dirname, current_old[0][0]) < 0
2619
or (new_dirname == current_old[0][0]
2620
and new_entry_key[1:] < current_old[0][1:])):
2910
elif (lt_by_dirs(new_dirname, current_old[0][0])
2911
or (new_dirname == current_old[0][0] and
2912
new_entry_key[1:] < current_old[0][1:])):
2621
2913
# new comes before:
2622
2914
# add a entry for this and advance new
2624
2916
trace.mutter("Inserting from new '%s'.",
2625
new_path_utf8.decode('utf8'))
2917
new_path_utf8.decode('utf8'))
2626
2918
self.update_minimal(new_entry_key, current_new_minikind,
2627
executable=current_new[1].executable,
2628
path_utf8=new_path_utf8, fingerprint=fingerprint,
2919
executable=current_new[1].executable,
2920
path_utf8=new_path_utf8, fingerprint=fingerprint,
2630
2922
current_new = advance(new_iterator)
2632
2924
# we've advanced past the place where the old key would be,
2633
2925
# without seeing it in the new list. so it must be gone.
2635
2927
trace.mutter("Deleting from old '%s/%s'.",
2636
current_old[0][0].decode('utf8'),
2637
current_old[0][1].decode('utf8'))
2928
current_old[0][0].decode('utf8'),
2929
current_old[0][1].decode('utf8'))
2638
2930
self._make_absent(current_old)
2639
2931
current_old = advance(old_iterator)
2640
self._dirblock_state = DirState.IN_MEMORY_MODIFIED
2932
self._mark_modified()
2641
2933
self._id_index = None
2642
2934
self._packed_stat_index = None
2644
2936
trace.mutter("set_state_from_inventory complete.")

    def set_state_from_scratch(self, working_inv, parent_trees, parent_ghosts):
        """Wipe the currently stored state and set it to something new.

        This is a hard-reset for the data we are working with.
        """
        # Technically, we really want a write lock, but until we write, we
        # don't really need it.
        self._requires_lock()
        # root dir and root dir contents with no children. We have to have a
        # root for set_state_from_inventory to work correctly.
        empty_root = ((b'', b'', inventory.ROOT_ID),
                      [(b'd', b'', 0, False, DirState.NULLSTAT)])
        empty_tree_dirblocks = [(b'', [empty_root]), (b'', [])]
        self._set_data([], empty_tree_dirblocks)
        self.set_state_from_inventory(working_inv)
        self.set_parent_trees(parent_trees, parent_ghosts)
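set_state_from_scratch builds the minimal dirblock structure by hand. The same shape, standalone, with placeholder bytes standing in for inventory.ROOT_ID and DirState.NULLSTAT:

```python
ROOT_ID = b'TREE_ROOT'    # placeholder for inventory.ROOT_ID
NULLSTAT = b'x' * 32      # placeholder for DirState.NULLSTAT

# One entry: key (dirname, basename, file_id) plus a list of per-tree
# details (minikind, fingerprint, size, executable, packed_stat).
empty_root = ((b'', b'', ROOT_ID),
              [(b'd', b'', 0, False, NULLSTAT)])
# Block list: the root block, then the (empty) contents of the root dir.
empty_tree_dirblocks = [(b'', [empty_root]), (b'', [])]

key, details = empty_tree_dirblocks[0][1][0]
assert key == (b'', b'', ROOT_ID)
assert details[0][0] == b'd'   # the root is a directory entry
```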

    def _make_absent(self, current_old):
        """Mark current_old - an entry - as absent for tree 0."""
            update_block_index, present = \
                self._find_block_index_from_key(update_key)
            if not present:
                raise AssertionError(
                    'could not find block for %s' % (update_key,))
            update_entry_index, present = \
                self._find_entry_index(
                    update_key, self._dirblocks[update_block_index][1])
            if not present:
                raise AssertionError(
                    'could not find entry for %s' % (update_key,))
            update_tree_details = self._dirblocks[update_block_index][1][update_entry_index][1]
            # it must not be absent at the moment
            if update_tree_details[0][0] == b'a':  # absent
                raise AssertionError('bad row %r' % (update_tree_details,))
            update_tree_details[0] = DirState.NULL_PARENT_DETAILS
        self._mark_modified()
        return last_reference

    def update_minimal(self, key, minikind, executable=False, fingerprint=b'',
                       packed_stat=None, size=0, path_utf8=None, fullscan=False):
        """Update an entry to the state in tree 0.

        This will either create a new entry at 'key' or update an existing one.
        """
                    update_block_index, present = \
                        self._find_block_index_from_key(other_key)
                    if not present:
                        raise AssertionError(
                            'could not find block for %s' % (other_key,))
                    update_entry_index, present = \
                        self._find_entry_index(
                            other_key, self._dirblocks[update_block_index][1])
                    if not present:
                        raise AssertionError(
                            'update_minimal: could not find entry for %s' % (other_key,))
                    update_details = self._dirblocks[update_block_index][1][update_entry_index][1][lookup_index]
                    if update_details[0] in (b'a', b'r'):  # relocated, absent
                        # it's a pointer or absent in lookup_index's tree, use
                        # it as is.
                        new_entry[1].append(update_details)
                    else:
                        # we have the right key, make a pointer to it.
                        pointer_path = osutils.pathjoin(*other_key[0:2])
                        new_entry[1].append(
                            (b'r', pointer_path, 0, False, b''))
                block.insert(entry_index, new_entry)
                self._add_to_id_index(id_index, key)
            else:
                # Does the new state matter?
                block[entry_index][1][0] = new_details
                    # other trees, so put absent pointers there
                    # This is the vertical axis in the matrix, all pointing
                    # to the real path.
                    block_index, present = self._find_block_index_from_key(
                        entry_key)
                    if not present:
                        raise AssertionError('not present: %r', entry_key)
                    entry_index, present = self._find_entry_index(
                        entry_key, self._dirblocks[block_index][1])
                    if not present:
                        raise AssertionError('not present: %r', entry_key)
                    self._dirblocks[block_index][1][entry_index][1][0] = \
                        (b'r', path_utf8, 0, False, b'')
        # add a containing dirblock if needed.
        if new_details[0] == b'd':
            # GZ 2017-06-09: Using pathjoin why?
            subdir_key = (osutils.pathjoin(*key[0:2]), b'', b'')
            block_index, present = self._find_block_index_from_key(subdir_key)
            if not present:
                self._dirblocks.insert(block_index, (subdir_key[0], []))

        self._mark_modified()

    def _maybe_remove_row(self, block, index, id_index):
        """Remove index if it is absent or relocated across the row.

        id_index is updated accordingly.

        :return: True if we removed the row, False otherwise
        """
        present_in_row = False
        entry = block[index]
        for column in entry[1]:
            if column[0] not in (b'a', b'r'):
                present_in_row = True
                break
        if not present_in_row:
            block.pop(index)
            self._remove_from_id_index(id_index, entry[0])
            return True
        return False
    def _validate(self):
        """Check that invariants on the dirblock are correct."""
        # We check this with a dict per tree pointing either to the present
        # name, or None if absent.
        tree_count = self._num_present_parents() + 1
        id_path_maps = [{} for _ in range(tree_count)]
        # Make sure that all renamed entries point to the correct location.
        for entry in self._iter_entries():
            file_id = entry[0][2]
            this_path = osutils.pathjoin(entry[0][0], entry[0][1])
            if len(entry[1]) != tree_count:
                raise AssertionError(
                    "wrong number of entry details for row\n%s"
                    ",\nexpected %d" %
                    (pformat(entry), tree_count))
            absent_positions = 0
            for tree_index, tree_state in enumerate(entry[1]):
                this_tree_map = id_path_maps[tree_index]
                minikind = tree_state[0]
                if minikind in (b'a', b'r'):
                    absent_positions += 1
                # have we seen this id before in this column?
                if file_id in this_tree_map:
                    previous_path, previous_loc = this_tree_map[file_id]
                    # any later mention of this file must be consistent with
                    # what was said before
                    if minikind == b'a':
                        if previous_path is not None:
                            raise AssertionError(
                                "file %s is absent in row %r but also present "
                                "at %r" %
                                (file_id.decode('utf-8'), entry, previous_path))
                    elif minikind == b'r':
                        target_location = tree_state[1]
                        if previous_path != target_location:
                            raise AssertionError(
                                "file %s relocation in row %r but also at %r"
                                % (file_id, entry, previous_path))
                    else:
                        # a file, directory, etc - may have been previously
                        # pointed to by a relocation, which must point here
            # are calculated at the same time, so checking just the size
            # gains nothing w.r.t. performance.
            link_or_sha1 = state._sha1_file(abspath)
            entry[1][0] = (b'f', link_or_sha1, stat_value.st_size,
                           executable, packed_stat)
        else:
            entry[1][0] = (b'f', b'', stat_value.st_size,
                           executable, DirState.NULLSTAT)
            worth_saving = False
    elif minikind == b'd':
        link_or_sha1 = None
        entry[1][0] = (b'd', b'', 0, False, packed_stat)
        if saved_minikind != b'd':
            # This changed from something into a directory. Make sure we
            # have a directory block for it. This doesn't happen very
            # often, so this doesn't have to be super fast.
            block_index, entry_index, dir_present, file_present = \
                state._get_block_entry_index(entry[0][0], entry[0][1], 0)
            state._ensure_block(block_index, entry_index,
                                osutils.pathjoin(entry[0][0], entry[0][1]))
        else:
            worth_saving = False
    elif minikind == b'l':
        if saved_minikind == b'l':
            worth_saving = False
        link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
        if state._cutoff_time is None:
            state._sha_cutoff_time()
        if (stat_value.st_mtime < state._cutoff_time
                and stat_value.st_ctime < state._cutoff_time):
            entry[1][0] = (b'l', link_or_sha1, stat_value.st_size,
                           False, packed_stat)
        else:
            entry[1][0] = (b'l', b'', stat_value.st_size,
                           False, DirState.NULLSTAT)
    if worth_saving:
        state._mark_modified([entry])
    return link_or_sha1


class ProcessEntryPython(object):

    __slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
                 "last_source_parent", "last_target_parent", "include_unchanged",
                 "partial", "use_filesystem_for_exec", "utf8_decode",
                 "searched_specific_files", "search_specific_files",
                 "searched_exact_paths", "search_specific_file_parents", "seen_ids",
                 "state", "source_index", "target_index", "want_unversioned", "tree"]

    def __init__(self, include_unchanged, use_filesystem_for_exec,
                 search_specific_files, state, source_index, target_index,
                 want_unversioned, tree):
        self.old_dirname_to_file_id = {}
        self.new_dirname_to_file_id = {}
        # Are we doing a partial iter_changes?
        self.partial = search_specific_files != {''}
        # Using a list so that we can access the values and change them in
        # nested scope. Each one is [path, file_id, entry]
        self.last_source_parent = [None, None]
                target_exec = target_details[3]
                return (entry[0][2],
                        (None, self.utf8_decode(path)[0]),
                        (None, self.utf8_decode(entry[0][1])[0]),
                        (None, path_info[2]),
                        (None, target_exec)), True
            else:
                # It's a missing file, report it as such.
                return (entry[0][2],
                        (None, self.utf8_decode(path)[0]),
                        (None, self.utf8_decode(entry[0][1])[0]),
                        (None, False)), True
        elif source_minikind in _fdlt and target_minikind in b'a':
            # unversioned, possibly, or possibly not deleted: we don't care.
            # if it's still on disk, *and* there's no other entry at this
            # path [we don't know this in this routine at the moment -
            # perhaps we should change this - then it would be an unknown.
            old_path = pathjoin(entry[0][0], entry[0][1])
            # parent id is the entry for the path in the target tree
            parent_id = self.state._get_entry(
                self.source_index, path_utf8=entry[0][0])[0][2]
            if parent_id == entry[0][2]:
                parent_id = None
            return (entry[0][2],
                    (self.utf8_decode(old_path)[0], None),
                    (self.utf8_decode(entry[0][1])[0], None),
                    (DirState._minikind_to_kind[source_minikind], None),
                    (source_details[3], None)), True
        elif source_minikind in _fdlt and target_minikind in b'r':
            # a rename; could be a true rename, or a rename inherited from
            # a renamed parent. TODO: handle this efficiently. It's not a
            # common case to rename dirs though, so a correct but slow
            # implementation will do.
            if not osutils.is_inside_any(self.searched_specific_files,
                                         target_details[1]):
                self.search_specific_files.add(target_details[1])
        elif source_minikind in _ra and target_minikind in _ra:
            # neither of the selected trees contain this file,
            # so skip over it. This is not currently directly tested, but
            # is indirectly via test_too_much.TestCommands.test_conflicts.
            pass
        else:
            raise AssertionError("don't know how to compare "
                                 "source_minikind=%r, target_minikind=%r"
                                 % (source_minikind, target_minikind))
        return None, None
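The _fdlt/_ra names used in the branch conditions above are precomputed groups of minikind bytes. A sketch of the dispatch idea (the sets are built explicitly because iterating a bytes literal on Python 3 yields ints, not single-byte strings; the classify helper is illustrative, not the real method):

```python
_fdlt = {b'f', b'd', b'l', b't'}   # file, directory, link, tree-reference
_ra = {b'r', b'a'}                 # relocated, absent

def classify(source_minikind, target_minikind):
    if source_minikind in _fdlt and target_minikind == b'a':
        return 'deleted'
    elif source_minikind in _fdlt and target_minikind == b'r':
        return 'renamed'
    elif source_minikind in _ra and target_minikind in _ra:
        return 'absent-both'
    raise AssertionError("don't know how to compare %r, %r"
                         % (source_minikind, target_minikind))

assert classify(b'f', b'a') == 'deleted'
assert classify(b'r', b'a') == 'absent-both'
```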

    def __iter__(self):
                if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
                    current_dir_info = None
                elif (sys.platform == 'win32'
                      and (e.errno in win_errors or
                           e_winerror in win_errors)):
                    current_dir_info = None
                else:
                    raise
            else:
                if current_dir_info[0][0] == b'':
                    # remove .bzr from iteration
                    bzr_index = bisect.bisect_left(
                        current_dir_info[1], (b'.bzr',))
                    if current_dir_info[1][bzr_index][0] != b'.bzr':
                        raise AssertionError()
                    del current_dir_info[1][bzr_index]
            # walk until both the directory listing and the versioned metadata
            # are exhausted.
            if (block_index < len(self.state._dirblocks) and
                    osutils.is_inside(current_root,
                                      self.state._dirblocks[block_index][0])):
                current_block = self.state._dirblocks[block_index]
            else:
                current_block = None
            while (current_dir_info is not None or
                   current_block is not None):
                if (current_dir_info and current_block
                        and current_dir_info[0][0] != current_block[0]):
                    if _lt_by_dirs(current_dir_info[0][0], current_block[0]):
                        # filesystem data refers to paths not covered by the
                        # dirblock.  this has two possibilities:
                        # A) it is versioned but empty, so there is no block
                        #    for it
                        # recurse into unknown directories.
                        path_index = 0
                        while path_index < len(current_dir_info[1]):
                            current_path_info = current_dir_info[1][path_index]
                            if self.want_unversioned:
                                if current_path_info[2] == 'directory':
                                    if self.tree._directory_is_tree_reference(
                                            current_path_info[0].decode('utf8')):
                                        current_path_info = current_path_info[:2] + \
                                            ('tree-reference',) + \
                                            current_path_info[3:]
                                new_executable = bool(
                                    stat.S_ISREG(current_path_info[3].st_mode)
                                    and stat.S_IEXEC & current_path_info[3].st_mode)
                                yield (None,
                                       (None, utf8_decode(current_path_info[0])[0]),
                                       (None, utf8_decode(current_path_info[1])[0]),
                                       (None, current_path_info[2]),
                                       (None, new_executable))
                            # don't descend into this unversioned path if it is
                            # a dir
                            if current_path_info[2] in ('directory',
                                                        'tree-reference'):
                                del current_dir_info[1][path_index]
                        # This dir info has been handled, go to the next
                        try:
                            current_dir_info = next(dir_iterator)
                        except StopIteration:
                            current_dir_info = None
                        raise AssertionError(
                            "Got entry<->path mismatch for specific path "
                            "%r entry %r path_info %r " % (
                                path_utf8, entry, path_info))
                    # Only include changes - we're outside the user's requested
                    # expansion.
                    self._gather_result_for_consistency(result)
                    if (result[6][0] == 'directory' and
                            result[6][1] != 'directory'):
                        # This stopped being a directory, the old children
                        # have to be included.
                        if entry[1][self.source_index][0] == b'r':
                            # renamed, take the source path
                            entry_path_utf8 = entry[1][self.source_index][1]
                        else:
                            entry_path_utf8 = path_utf8
                        initial_key = (entry_path_utf8, b'', b'')
                        block_index, _ = self.state._find_block_index_from_key(
                            initial_key)
                        if block_index == 0:
                            # The children of the root are in block index 1.
                            block_index += 1
                        current_block = None
                        if block_index < len(self.state._dirblocks):
                            current_block = self.state._dirblocks[block_index]
                            if not osutils.is_inside(
                                    entry_path_utf8, current_block[0]):
                                # No entries for this directory at all.
                                current_block = None
                        if current_block is not None:
                            for entry in current_block[1]:
                                if entry[1][self.source_index][0] in (b'a', b'r'):
                                    # Not in the source tree, so doesn't
                                    # have to be included.