lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.

::

    MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
    WHOLE_NUMBER = {digit}, digit;
    REVISION_ID = a non-empty utf8 string;

    dirstate format = header line, full checksum, row count, parent details,
        ghost_details, entries;
    header line = "#bazaar dirstate flat format 3", NL;
    full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
    row count = "num_entries: ", WHOLE_NUMBER, NL;
    parent_details = WHOLE NUMBER, {REVISION_ID}*, NL;
    ghost_details = WHOLE NUMBER, {REVISION_ID}*, NL;

    entry = entry_key, current_entry_details, {parent_entry_details};
    entry_key = dirname, basename, fileid;
    current_entry_details = common_entry_details, working_entry_details;
    parent_entry_details = common_entry_details, history_entry_details;
    common_entry_details = MINIKIND, fingerprint, size, executable;
    working_entry_details = packed_stat;
    history_entry_details = REVISION_ID;

    fingerprint = a nonempty utf8 sequence with meaning defined by minikind.

Given this definition, the following is useful to know::

    entry (aka row) - all the data for a given key.
    entry[0]: The key (dirname, basename, fileid)
    entry[1]: The tree(s) data for this path and id combination.
    entry[1][0]: The current tree
    entry[1][1]: The second tree

For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::

    entry[1][0][0]: minikind
    entry[1][0][1]: fingerprint
    entry[1][0][2]: size
    entry[1][0][3]: executable
    entry[1][0][4]: packed_stat

OR (for non tree-0)::

    entry[1][1][4]: revision_id
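As a concrete illustration of the layout above, here is a hypothetical entry built by hand. All values are made up for illustration; real fingerprints and packed stats are produced by the serialization code in this module.

```python
# A hypothetical in-memory entry for a tracked file "foo/bar.txt", following
# the (key, [tree_details...]) shape described above.
entry = (
    (b'foo', b'bar.txt', b'bar-file-id'),   # entry[0]: the key
    [
        # entry[1][0]: current tree: minikind, fingerprint, size, executable,
        # packed_stat
        (b'f', b'da39a3ee5e6b4b0d3255bfef95601890afd80709', 0, False,
         b'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'),
        # entry[1][1]: basis tree: same layout, but slot 4 holds a revision id
        (b'f', b'da39a3ee5e6b4b0d3255bfef95601890afd80709', 0, False,
         b'rev-id-1'),
    ],
)

dirname, basename, file_id = entry[0]
minikind = entry[1][0][0]        # b'f' -> a file
executable = entry[1][0][3]
basis_revision = entry[1][1][4]
```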
There may be multiple rows at the root, one per id present in the root, so the
in memory root row is now::

    self._dirblocks[0] -> ('', [entry ...]),

and the entries in there are::

    entries[0][2]: file_id
    entries[1][0]: The tree data for the current tree for this fileid at /

b'r' is a relocated entry: This path is not present in this tree with this
    id, but the id can be found at another location. The fingerprint is
    used to point to the target location.
b'a' is an absent entry: In that tree the id is not present at this path.
b'd' is a directory entry: This path in this tree is a directory with the
    current file id. There is no fingerprint for directories.
b'f' is a file entry: As for directory, but it's a file. The fingerprint is
    the sha1 value of the file's canonical form, i.e. after any read
    filters have been applied to the convenience form stored in the working
    tree.
b'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
    the link target.
b't' is a reference to a nested subtree; the fingerprint is the referenced
    revision.
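The minikind codes above can be sketched as a simple lookup table. This table is illustrative only; it is not the module's own `_minikind_to_kind` mapping, and `a`/`r` are bookkeeping states rather than real filesystem kinds.

```python
# Illustrative mapping of the minikind codes described above.
MINIKIND_TO_KIND = {
    b'f': 'file',
    b'd': 'directory',
    b'l': 'symlink',
    b't': 'tree-reference',
    b'a': 'absent',
    b'r': 'relocated',
}

def is_present(minikind):
    """An entry is present in a tree unless it is absent or relocated."""
    return minikind not in (b'a', b'r')
```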
The entries on disk and in memory are ordered according to the following keys::

    directory, as a list of components
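Treating the directory as a list of components (rather than one string) changes the sort order, which can be sketched as follows. The sample keys are made up for illustration.

```python
# Sketch of the ordering rule above: sort by the dirname split into path
# components, then by filename, then file-id. Splitting the dirname means
# b'a/b' sorts before b'a-x', which plain string comparison would not give
# (b'/' > b'-' as bytes).
def entry_sort_key(entry_key):
    dirname, basename, file_id = entry_key
    return (dirname.split(b'/'), basename, file_id)

keys = [
    (b'a-x', b'f', b'id3'),
    (b'a/b', b'f', b'id2'),
    (b'a', b'f', b'id1'),
]
keys.sort(key=entry_sort_key)
```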
--- Format 1 had the following different definition: ---

::

    rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
        WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
        {PARENT ROW}
    PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
        basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
        SHA1

PARENT ROW's are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
each row will have a PARENT ROW for foo and baz, but not for bar.
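The ghost-filtering rule above reduces to a simple comprehension, sketched here with the example revisions from the text:

```python
# Sketch of the PARENT ROW rule: parent rows are written for every parent
# revision that is not listed as a ghost.
parents = [b'foo', b'bar', b'baz']
ghosts = {b'bar'}

parent_rows = [rev for rev in parents if rev not in ghosts]
```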
ERROR_DIRECTORY = 267
if not getattr(struct, '_compile', None):
    # Cannot pre-compile the dirstate pack_stat
    def pack_stat(st, _encode=binascii.b2a_base64, _pack=struct.pack):
        """Convert stat values into a packed representation."""
        return _encode(_pack('>LLLLLL', st.st_size, int(st.st_mtime),
            int(st.st_ctime), st.st_dev, st.st_ino & 0xFFFFFFFF,
            st.st_mode))[:-1]
else:
    # compile the struct compiler we need, so as to only do it once
    from _struct import Struct
    _compiled_pack = Struct('>LLLLLL').pack

    def pack_stat(st, _encode=binascii.b2a_base64, _pack=_compiled_pack):
        """Convert stat values into a packed representation."""
        # jam 20060614 it isn't really worth removing more entries if we
        # are going to leave it in packed form.
        # With only st_mtime and st_mode filesize is 5.5M and read time is 275ms
        # With all entries, filesize is 5.9M and read time is maybe 280ms
        # well within the noise margin

        # base64 encoding always adds a final newline, so strip it off
        # The current version
        return _encode(_pack(st.st_size, int(st.st_mtime), int(st.st_ctime),
            st.st_dev, st.st_ino & 0xFFFFFFFF, st.st_mode))[:-1]
        # This is 0.060s / 1.520s faster by not encoding as much information
        # return _encode(_pack('>LL', int(st.st_mtime), st.st_mode))[:-1]
        # This is not strictly faster than _encode(_pack())[:-1]
        # return '%X.%X.%X.%X.%X.%X' % (
        #      st.st_size, int(st.st_mtime), int(st.st_ctime),
        #      st.st_dev, st.st_ino, st.st_mode)
        # Similar to the _encode(_pack('>LL'))
        # return '%X.%X' % (int(st.st_mtime), st.st_mode)
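A self-contained sketch of what `pack_stat` produces: six 32-bit big-endian fields, base64-encoded, with base64's trailing newline stripped. The stat values here are made up for illustration.

```python
import binascii
import struct

_pack = struct.Struct('>LLLLLL').pack

def pack_stat_demo(size, mtime, ctime, dev, ino, mode):
    # Pack 6 unsigned 32-bit fields (24 bytes), base64-encode them (32
    # characters), and strip the newline b2a_base64 always appends.
    return binascii.b2a_base64(
        _pack(size, int(mtime), int(ctime), dev, ino & 0xFFFFFFFF, mode))[:-1]

packed = pack_stat_demo(1024, 1700000000.5, 1700000000.5, 2049, 123456,
                        0o100644)
```

The `ino & 0xFFFFFFFF` mask matters because inode numbers can exceed 32 bits on some filesystems, while the `>L` format only accepts values up to 2**32 - 1.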
class DirstateCorrupt(errors.BzrError):

    _fmt = "The dirstate file (%(state)s) appears to be corrupt: %(msg)s"

    def __init__(self, state, msg):
        errors.BzrError.__init__(self)
        self.state = state
        self.msg = msg
class SHA1Provider(object):
NOT_IN_MEMORY = 0
IN_MEMORY_UNMODIFIED = 1
IN_MEMORY_MODIFIED = 2
IN_MEMORY_HASH_MODIFIED = 3  # Only hash-cache updates

# A pack_stat (the x's) that is just noise and will never match the output
# of base64 encode.
NULL_PARENT_DETAILS = static_tuple.StaticTuple(b'a', b'', 0, False, b'')

HEADER_FORMAT_2 = b'#bazaar dirstate flat format 2\n'
HEADER_FORMAT_3 = b'#bazaar dirstate flat format 3\n'

def __init__(self, path, sha1_provider, worth_saving_limit=0,
             use_filesystem_for_exec=True):
    """Create a DirState object.

    :param path: The path at which the dirstate file on disk should live.
    :param sha1_provider: an object meeting the SHA1Provider interface.
    :param worth_saving_limit: when the exact number of hash changed
        entries is known, only bother saving the dirstate if more than
        this count of entries have changed.
        -1 means never save hash changes, 0 means always save hash changes.
    :param use_filesystem_for_exec: Whether to trust the filesystem
        for executable bit information.
    """
    # _header_state and _dirblock_state represent the current state
    # of the dirstate metadata and the per-row data respectively.
    self._last_block_index = None
    self._last_entry_index = None
    # The set of known hash changes
    self._known_hash_changes = set()
    # How many hash changed entries can we have without saving
    self._worth_saving_limit = worth_saving_limit
    self._config_stack = config.LocationStack(
        urlutils.local_path_to_url(path))
    self._use_filesystem_for_exec = use_filesystem_for_exec

def __repr__(self):
    return "%s(%r)" % \
        (self.__class__.__name__, self._filename)
def _mark_modified(self, hash_changed_entries=None, header_modified=False):
    """Mark this dirstate as modified.

    :param hash_changed_entries: if non-None, mark just these entries as
        having their hash modified.
    :param header_modified: mark the header modified as well, not just the
        dirblocks.
    """
    # trace.mutter_callsite(3, "modified hash entries: %s", hash_changed_entries)
    if hash_changed_entries:
        self._known_hash_changes.update(
            [e[0] for e in hash_changed_entries])
        if self._dirblock_state in (DirState.NOT_IN_MEMORY,
                                    DirState.IN_MEMORY_UNMODIFIED):
            # If the dirstate is already marked as IN_MEMORY_MODIFIED, then
            # that takes precedence.
            self._dirblock_state = DirState.IN_MEMORY_HASH_MODIFIED
    else:
        # TODO: Since we now have a IN_MEMORY_HASH_MODIFIED state, we
        #       should fail noisily if someone tries to set
        #       IN_MEMORY_MODIFIED but we don't have a write-lock!
        # We don't know exactly what changed so disable smart saving
        self._dirblock_state = DirState.IN_MEMORY_MODIFIED
    if header_modified:
        self._header_state = DirState.IN_MEMORY_MODIFIED
def _mark_unmodified(self):
    """Mark this dirstate as unmodified."""
    self._header_state = DirState.IN_MEMORY_UNMODIFIED
    self._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
    self._known_hash_changes = set()
def add(self, path, file_id, kind, stat, fingerprint):
    """Add a path to be tracked.

    :param path: The path within the dirstate - b'' is the root, 'foo' is the
        path foo within the root, 'foo/bar' is the path bar within foo
        within the root.
    :param file_id: The file id of the path being added.
    """
    utf8path = (dirname + '/' + basename).strip('/').encode('utf8')
    dirname, basename = osutils.split(utf8path)
    # uses __class__ for speed; the check is needed for safety
    if file_id.__class__ is not bytes:
        raise AssertionError(
            "must be a utf8 file_id not %s" % (type(file_id), ))
    # Make sure the file_id does not exist in this tree
    rename_from = None
    file_id_entry = self._get_entry(
        0, fileid_utf8=file_id, include_deleted=True)
    if file_id_entry != (None, None):
        if file_id_entry[1][0][0] == b'a':
            if file_id_entry[0] != (dirname, basename, file_id):
                # set the old name's current operation to rename
                self.update_minimal(file_id_entry[0],
                rename_from = file_id_entry[0][0:2]
        else:
            path = osutils.pathjoin(
                file_id_entry[0][0], file_id_entry[0][1])
            kind = DirState._minikind_to_kind[file_id_entry[1][0][0]]
            info = '%s:%s' % (kind, path)
            raise errors.DuplicateFileId(file_id, info)
    first_key = (dirname, basename, b'')
    block_index, present = self._find_block_index_from_key(first_key)
    if present:
        # check the path is not in the tree
        block = self._dirblocks[block_index][1]
        entry_index, _ = self._find_entry_index(first_key, block)
        while (entry_index < len(block) and
               block[entry_index][0][0:2] == first_key[0:2]):
            if block[entry_index][1][0][0] not in (b'a', b'r'):
                # this path is in the dirstate in the current tree.
                raise Exception("adding already added path!")
            entry_index += 1
    else:
        # The block where we want to put the file is not present. But it
                fingerprint, new_child_path)
        self._check_delta_ids_absent(new_ids, delta, 0)
        self._apply_removals(viewitems(removals))
        self._apply_insertions(viewvalues(insertions))
        # Validate parents
        self._after_delta_check_parents(parents, 0)
    except errors.BzrError as e:
        self._changes_aborted = True
        if 'integrity error' not in str(e):
            raise
        # _get_entry raises BzrError when a request is inconsistent; we
        # want such errors to be shown as InconsistentDelta - and that
        # fits the behaviour we trigger.
        raise errors.InconsistentDeltaDelta(delta,
            "error from _get_entry. %s" % (e,))
def _apply_removals(self, removals):
    for file_id, path in sorted(removals, reverse=True,
                                key=operator.itemgetter(1)):
        dirname, basename = osutils.split(path)
        block_i, entry_i, d_present, f_present = \
            self._get_block_entry_index(dirname, basename, 0)
        try:
            entry = self._dirblocks[block_i][1][entry_i]
        except IndexError:
            self._raise_invalid(path, file_id,
                "Wrong path for old path.")
        if not f_present or entry[1][0][0] in (b'a', b'r'):
            self._raise_invalid(path, file_id,
                "Wrong path for old path.")
        if file_id != entry[0][2]:
            self._raise_invalid(path, file_id,
                "Attempt to remove path has wrong id - found %r."
                % (entry[0][2],))
        self._make_absent(entry)
        # See if we have a malformed delta: deleting a directory must not
        # leave crud behind. This increases the number of bisects needed
    new_ids = set()
    for old_path, new_path, file_id, inv_entry in delta:
        if file_id.__class__ is not bytes:
            raise AssertionError(
                "must be a utf8 file_id not %s" % (type(file_id), ))
        if inv_entry is not None and file_id != inv_entry.file_id:
            self._raise_invalid(new_path, file_id,
                "mismatched entry file_id %r" % inv_entry)
        if new_path is None:
            new_path_utf8 = None
        else:
            if inv_entry is None:
                self._raise_invalid(new_path, file_id,
                    "new_path with no entry")
            new_path_utf8 = encode(new_path)
            # note the parent for validation
            dirname_utf8, basename_utf8 = osutils.split(new_path_utf8)
            if basename_utf8:
                parents.add((dirname_utf8, inv_entry.parent_id))
        if old_path is None:
            old_path_utf8 = None
        else:
            old_path_utf8 = encode(old_path)
        if old_path is None:
            adds.append((None, new_path_utf8, file_id,
                         inv_to_entry(inv_entry), True))
            new_ids.add(file_id)
        elif new_path is None:
            deletes.append((old_path_utf8, None, file_id, None, True))
        elif (old_path, new_path) == root_only:
            # change things in-place
            # Note: the case of a parent directory changing its file_id
            #       tends to break optimizations here, because officially
            #       the file has actually been moved, it just happens to
            #       end up at the same path. If we can figure out how to
            #       handle that case, we can avoid a lot of add+delete
            #       pairs for objects that stay put.
            # elif old_path == new_path:
            changes.append((old_path_utf8, new_path_utf8, file_id,
                            inv_to_entry(inv_entry)))
        else:
            # Because renames must preserve their children we must have
            # processed all relocations and removes before hand. The sort
            # order ensures we've examined the child paths, but we also
            # have to execute the removals, or the split to an add/delete
            # pair will result in the deleted item being reinserted, or
            # renamed items being reinserted twice - and possibly at the
            # wrong place. Splitting into a delete/add pair also simplifies
            # the handling of entries with (b'f', ...), (b'r' ...) because
            # the target of the b'r' is old_path here, and we add that to
            # deletes, meaning that the add handler does not need to check
            # for b'r' items on every pass.
            self._update_basis_apply_deletes(deletes)
            deletes = []
            # Split into an add/delete pair recursively.
            adds.append((old_path_utf8, new_path_utf8, file_id,
                         inv_to_entry(inv_entry), False))
            # Expunge deletes that we've seen so that deleted/renamed
            # children of a rename directory are handled correctly.
            new_deletes = reversed(list(
                self._iter_child_entries(1, old_path_utf8)))
            # Remove the current contents of the tree at orig_path, and
            # reinsert at the correct new path.
            for entry in new_deletes:
                child_dirname, child_basename, child_file_id = entry[0]
                if child_dirname:
                    source_path = child_dirname + b'/' + child_basename
                else:
                    source_path = child_basename
                if new_path_utf8:
                    target_path = \
                        new_path_utf8 + source_path[len(old_path_utf8):]
                else:
                    if old_path_utf8 == b'':
                        raise AssertionError("cannot rename directory to"
                                             " itself")
                    target_path = source_path[len(old_path_utf8) + 1:]
                adds.append(
                    (None, target_path, entry[0][2], entry[1][1], False))
                deletes.append(
                    (source_path, target_path, entry[0][2], None, False))
            deletes.append(
                (old_path_utf8, new_path_utf8, file_id, None, False))
    self._check_delta_ids_absent(new_ids, delta, 1)
    # Finish expunging deletes/first half of renames.

    # Adds are accumulated partly from renames, so can be in any input
    # order - sort it.
    # TODO: we may want to sort in dirblocks order. That way each entry
    #       will end up in the same directory, allowing the _get_entry
    #       fast-path for looking up 2 items in the same dir work.
    adds.sort(key=lambda x: x[1])
    # adds is now in lexicographic order, which places all parents before
    # their children, so we can process it linearly.
    st = static_tuple.StaticTuple
    for old_path, new_path, file_id, new_details, real_add in adds:
        dirname, basename = osutils.split(new_path)
        entry_key = st(dirname, basename, file_id)
        block_index, present = self._find_block_index_from_key(entry_key)
        if not present:
            # The block where we want to put the file is not present.
            # However, it might have just been an empty directory. Look for
            # the parent in the basis-so-far before throwing an error.
            parent_dir, parent_base = osutils.split(dirname)
            parent_block_idx, parent_entry_idx, _, parent_present = \
                self._get_block_entry_index(parent_dir, parent_base, 1)
            if not parent_present:
                self._raise_invalid(new_path, file_id,
                    "Unable to find block for this record."
                    " Was the parent added?")
            self._ensure_block(parent_block_idx, parent_entry_idx, dirname)

        block = self._dirblocks[block_index][1]
        entry_index, present = self._find_entry_index(entry_key, block)
        if real_add and old_path is not None:
            self._raise_invalid(new_path, file_id,
                'considered a real add but still had old_path at %s'
                % (old_path,))
        if present:
            entry = block[entry_index]
            basis_kind = entry[1][1][0]
            if basis_kind == b'a':
                entry[1][1] = new_details
            elif basis_kind == b'r':
                raise NotImplementedError()
            else:
                self._raise_invalid(new_path, file_id,
                    "An entry was marked as a new add"
                    " but the basis target already existed")
        else:
            # The exact key was not found in the block. However, we need to
            # check if there is a key next to us that would have matched.
            # We only need to check the 2 neighboring locations.
            for maybe_index in range(entry_index - 1, entry_index + 1):
                if maybe_index < 0 or maybe_index >= len(block):
                    continue
                maybe_entry = block[maybe_index]
                if maybe_entry[0][:2] != (dirname, basename):
                    # Just a random neighbor
                    continue
                if maybe_entry[0][2] == file_id:
                    raise AssertionError(
                        '_find_entry_index didnt find a key match'
                        ' but walking the data did, for %s'
                        % (entry_key,))
                basis_kind = maybe_entry[1][1][0]
                if basis_kind not in (b'a', b'r'):
                    self._raise_invalid(new_path, file_id,
                        "we have an add record for path, but the path"
                        " is already present with another file_id %s"
                        % (maybe_entry[0][2],))
            entry = (entry_key, [DirState.NULL_PARENT_DETAILS,
                                 new_details])
            block.insert(entry_index, entry)

        active_kind = entry[1][0][0]
        if active_kind == b'a':
            # The active record shows up as absent, this could be genuine,
            # or it could be present at some other location. We need to
            # verify.
            id_index = self._get_id_index()
            # The id_index may not be perfectly accurate for tree1, because
            # we haven't been keeping it updated. However, it should be
            # fine for tree0, and that gives us enough info for what we
            # need.
            keys = id_index.get(file_id, ())
            for key in keys:
                block_i, entry_i, d_present, f_present = \
                    self._get_block_entry_index(key[0], key[1], 0)
                if not f_present:
                    continue
                active_entry = self._dirblocks[block_i][1][entry_i]
                if (active_entry[0][2] != file_id):
                    # Some other file is at this path, we don't need to
                    # touch it.
                    continue
                real_active_kind = active_entry[1][0][0]
                if real_active_kind in (b'a', b'r'):
                    # We found a record, which was not *this* record,
                    # which matches the file_id, but is not actually
                    # present. Something seems *really* wrong.
                    self._raise_invalid(new_path, file_id,
                        "We found a tree0 entry that doesnt make sense")
                # Now, we've found a tree0 entry which matches the file_id
                # but is at a different location. So update them to be
                # rename records.
                active_dir, active_name = active_entry[0][:2]
                if active_dir:
                    active_path = active_dir + b'/' + active_name
                else:
                    active_path = active_name
                active_entry[1][1] = st(b'r', new_path, 0, False, b'')
                entry[1][0] = st(b'r', active_path, 0, False, b'')
        elif active_kind == b'r':
            raise NotImplementedError()

        new_kind = new_details[0]
        if new_kind == b'd':
            self._ensure_block(block_index, entry_index, new_path)
def _update_basis_apply_changes(self, changes):
    """Apply a sequence of changes to tree 1 during update_basis_by_delta.
    null = DirState.NULL_PARENT_DETAILS
    for old_path, new_path, file_id, _, real_delete in deletes:
        if real_delete != (new_path is None):
            self._raise_invalid(old_path, file_id, "bad delete delta")
        # the entry for this file_id must be in tree 1.
        dirname, basename = osutils.split(old_path)
        block_index, entry_index, dir_present, file_present = \
            self._get_block_entry_index(dirname, basename, 1)
        if not file_present:
            self._raise_invalid(old_path, file_id,
                'basis tree does not contain removed entry')
        entry = self._dirblocks[block_index][1][entry_index]
        # The state of the entry in the 'active' WT
        active_kind = entry[1][0][0]
        if entry[0][2] != file_id:
            self._raise_invalid(old_path, file_id,
                'mismatched file_id in tree 1')
        old_kind = entry[1][1][0]
        if active_kind in b'ar':
            # The active tree doesn't have this file_id.
            # The basis tree is changing this record. If this is a
            # rename, then we don't want the record here at all
            # anymore. If it is just an in-place change, we want the
            # record here, but we'll add it if we need to. So we just
            # delete it.
            if active_kind == b'r':
                active_path = entry[1][0][1]
                active_entry = self._get_entry(0, file_id, active_path)
                if active_entry[1][1][0] != b'r':
                    self._raise_invalid(old_path, file_id,
                        "Dirstate did not have matching rename entries")
                elif active_entry[1][0][0] in b'ar':
                    self._raise_invalid(old_path, file_id,
                        "Dirstate had a rename pointing at an inactive"
                        " tree0")
                active_entry[1][1] = null
            del self._dirblocks[block_index][1][entry_index]
            if old_kind == b'd':
                # This was a directory, and the active tree says it
                # doesn't exist, and now the basis tree says it doesn't
                # exist. Remove its dirblock if present
                (dir_block_index,
                 present) = self._find_block_index_from_key(
                    (old_path, b'', b''))
                if present:
                    dir_block = self._dirblocks[dir_block_index][1]
                    if not dir_block:
                        # This entry is empty, go ahead and just remove it
                        del self._dirblocks[dir_block_index]
        else:
            # There is still an active record, so just mark this
            # as removed.
            entry[1][1] = null
            block_i, entry_i, d_present, f_present = \
                self._get_block_entry_index(old_path, b'', 1)
            if d_present:
                dir_block = self._dirblocks[block_i][1]
                for child_entry in dir_block:
                    child_basis_kind = child_entry[1][1][0]
                    if child_basis_kind not in b'ar':
                        self._raise_invalid(old_path, file_id,
                            "The file id was deleted but its children were "
                            "not deleted.")
def _after_delta_check_parents(self, parents, index):
    """Check that parents required by the delta are all intact.

    :param parents: An iterable of (path_utf8, file_id) tuples which are
        required to be present in tree 'index' at path_utf8 with id file_id
        and be a directory.
    """
    tree present there.
    """
    self._read_dirblocks_if_needed()
    key = dirname, basename, b''
    block_index, present = self._find_block_index_from_key(key)
    if not present:
        # no such directory - return the dir index and 0 for the row.
        return block_index, 0, False, False
    block = self._dirblocks[block_index][1]  # access the entries only
    entry_index, present = self._find_entry_index(key, block)
    # linear search through entries at this path to find the one
    # requested.
    while entry_index < len(block) and block[entry_index][0][1] == basename:
        if block[entry_index][1][tree_index][0] not in (b'a', b'r'):
            # neither absent or relocated
            return block_index, entry_index, True, True
        entry_index += 1
    return block_index, entry_index, True, False
def _get_entry(self, tree_index, fileid_utf8=None, path_utf8=None,
               include_deleted=False):
    """Get the dirstate entry for path in tree tree_index.

    If either file_id or path is supplied, it is used as the key to lookup.
def _get_id_index(self):
    """Get an id index of self._dirblocks.

    This maps from file_id => [(directory, name, file_id)] entries where
    that file_id appears in one of the trees.
    """
    if self._id_index is None:
        id_index = {}
        for key, tree_details in self._iter_entries():
            self._add_to_id_index(id_index, key)
        self._id_index = id_index
    return self._id_index
    def _add_to_id_index(self, id_index, entry_key):
        """Add this entry to the _id_index mapping."""
        # This code used to use a set for every entry in the id_index. However,
        # it is *rare* to have more than one entry. So a set is a large
        # overkill. And even when we do, we won't ever have more than the
        # number of parent trees. Which is still a small number (rarely >2). As
        # such, we use a simple tuple, and do our own uniqueness checks. While
        # the 'in' check is O(N) since N is nicely bounded it shouldn't ever
        # cause quadratic failure.
        file_id = entry_key[2]
        entry_key = static_tuple.StaticTuple.from_sequence(entry_key)
        if file_id not in id_index:
            id_index[file_id] = static_tuple.StaticTuple(entry_key,)
        else:
            entry_keys = id_index[file_id]
            if entry_key not in entry_keys:
                id_index[file_id] = entry_keys + (entry_key,)
    def _remove_from_id_index(self, id_index, entry_key):
        """Remove this entry from the _id_index mapping.

        It is a programming error to call this when the entry_key is not
        present.
        """
        file_id = entry_key[2]
        entry_keys = list(id_index[file_id])
        entry_keys.remove(entry_key)
        id_index[file_id] = static_tuple.StaticTuple.from_sequence(entry_keys)
    def _get_output_lines(self, lines):
        """Format lines for final output.

        :param lines: A sequence of lines containing the parents list and the
            path lines.
        """
        output_lines = [DirState.HEADER_FORMAT_3]
        lines.append(b'')  # a final newline
        inventory_text = b'\0\n\0'.join(lines)
        output_lines.append(b'crc32: %d\n' % (zlib.crc32(inventory_text),))
        # -3, 1 for num parents, 1 for ghosts, 1 for final newline
        num_entries = len(lines) - 3
        output_lines.append(b'num_entries: %d\n' % (num_entries,))
        output_lines.append(inventory_text)
        return output_lines
    def _make_deleted_row(self, fileid_utf8, parents):
        """Return a deleted row for fileid_utf8."""
        return (b'/', b'RECYCLED.BIN', b'file', fileid_utf8, 0, DirState.NULLSTAT,
                b''), parents

    def _num_present_parents(self):
        """The number of parent entries in each record row."""
        return len(self._parents) - len(self._ghosts)
    @classmethod
    def on_file(cls, path, sha1_provider=None, worth_saving_limit=0,
                use_filesystem_for_exec=True):
        """Construct a DirState on the file at path "path".

        :param path: The path at which the dirstate file on disk should live.
        :param sha1_provider: an object meeting the SHA1Provider interface.
            If None, a DefaultSHA1Provider is used.
        :param worth_saving_limit: when the exact number of hash changed
            entries is known, only bother saving the dirstate if more than
            this count of entries have changed. -1 means never save.
        :param use_filesystem_for_exec: Whether to trust the filesystem
            for executable bit information
        :return: An unlocked DirState object, associated with the given path.
        """
        if sha1_provider is None:
            sha1_provider = DefaultSHA1Provider()
        result = cls(path, sha1_provider,
                     worth_saving_limit=worth_saving_limit,
                     use_filesystem_for_exec=use_filesystem_for_exec)
        return result
    def _read_dirblocks_if_needed(self):

            raise errors.BzrError(
                'invalid header line: %r' % (header,))
        crc_line = self._state_file.readline()
        if not crc_line.startswith(b'crc32: '):
            raise errors.BzrError('missing crc32 checksum: %r' % crc_line)
        self.crc_expected = int(crc_line[len(b'crc32: '):-1])
        num_entries_line = self._state_file.readline()
        if not num_entries_line.startswith(b'num_entries: '):
            raise errors.BzrError('missing num_entries line')
        self._num_entries = int(num_entries_line[len(b'num_entries: '):-1])
    def sha1_from_stat(self, path, stat_result):
        """Find a sha1 given a stat lookup."""
        return self._get_packed_stat_index().get(pack_stat(stat_result), None)
    def _get_packed_stat_index(self):
        """Get a packed_stat index of self._dirblocks."""
        if self._packed_stat_index is None:
            index = {}
            for key, tree_details in self._iter_entries():
                if tree_details[0][0] == b'f':
                    index[tree_details[0][4]] = tree_details[0][1]
            self._packed_stat_index = index
        return self._packed_stat_index
            # Should this be a warning? For now, I'm expecting that places that
            # mark it inconsistent will warn, making a warning here redundant.
            trace.mutter('Not saving DirState because '
                         '_changes_aborted is set.')
            return
        # TODO: Since we now distinguish IN_MEMORY_MODIFIED from
        #       IN_MEMORY_HASH_MODIFIED, we should only fail quietly if we fail
        #       to save an IN_MEMORY_HASH_MODIFIED, and fail *noisily* if we
        #       fail to save IN_MEMORY_MODIFIED
        if not self._worth_saving():
            return

        grabbed_write_lock = False
        if self._lock_state != 'w':
            grabbed_write_lock, new_lock = self._lock_token.temporary_write_lock()
            # Switch over to the new lock, as the old one may be closed.
            # TODO: jam 20070315 We should validate the disk file has
            #       not changed contents, since temporary_write_lock may
            #       not be an atomic operation.
            self._lock_token = new_lock
            self._state_file = new_lock.f
            if not grabbed_write_lock:
                # We couldn't grab a write lock, so we switch back to a read one
                return
        try:
            lines = self.get_lines()
            self._state_file.seek(0)
            self._state_file.writelines(lines)
            self._state_file.truncate()
            self._state_file.flush()
            self._maybe_fdatasync()
            self._mark_unmodified()
        finally:
            if grabbed_write_lock:
                self._lock_token = self._lock_token.restore_read_lock()
                self._state_file = self._lock_token.f
                # TODO: jam 20070315 We should validate the disk file has
                #       not changed contents. Since restore_read_lock may
                #       not be an atomic operation.
    def _maybe_fdatasync(self):
        """Flush to disk if possible and if not configured off."""
        if self._config_stack.get('dirstate.fdatasync'):
            osutils.fdatasync(self._state_file.fileno())

    def _worth_saving(self):
        """Is it worth saving the dirstate or not?"""
        if (self._header_state == DirState.IN_MEMORY_MODIFIED
                or self._dirblock_state == DirState.IN_MEMORY_MODIFIED):
            return True
        if self._dirblock_state == DirState.IN_MEMORY_HASH_MODIFIED:
            if self._worth_saving_limit == -1:
                # We never save hash changes when the limit is -1
                return False
            # If we're using smart saving and only a small number of
            # entries have changed their hash, don't bother saving. John has
            # suggested using a heuristic here based on the size of the
            # changed files and/or tree. For now, we go with a configurable
            # number of changes, keeping the calculation time
            # as low overhead as possible. (This also keeps all existing
            # tests passing as the default is 0, i.e. always save.)
            if len(self._known_hash_changes) >= self._worth_saving_limit:
                return True
        return False
    def _set_data(self, parent_ids, dirblocks):
        """Set the full dirstate data in memory.
                        # mapping from path,id. We need to look up the correct path
                        # for the indexes from 0 to tree_index -1
                        new_details = []
                        for lookup_index in range(tree_index):
                            # boundary case: this is the first occurrence of file_id
                            # so there are no id_indexes, possibly take this out of
                            # the loop?
                            if not len(entry_keys):
                                new_details.append(DirState.NULL_PARENT_DETAILS)
                            else:
                                # grab any one entry, use it to find the right path.
                                a_key = next(iter(entry_keys))
                                if by_path[a_key][lookup_index][0] in (b'r', b'a'):
                                    # its a pointer or missing statement, use it as
                                    # is.
                                    new_details.append(
                                        by_path[a_key][lookup_index])
                                else:
                                    # we have the right key, make a pointer to it.
                                    real_path = (b'/'.join(a_key[0:2])).strip(b'/')
                                    new_details.append(st(b'r', real_path, 0, False,
                                                          b''))
                        new_details.append(self._inv_entry_to_details(entry))
                        new_details.extend(new_location_suffix)
                        by_path[new_entry_key] = new_details
                        self._add_to_id_index(id_index, new_entry_key)
        # --- end generation of full tree mappings

        # sort and output all the entries
        new_entries = self._sort_entries(viewitems(by_path))
        self._entries_to_current_state(new_entries)
        self._parents = [rev_id for rev_id, tree in trees]
        self._ghosts = list(ghosts)
        self._mark_modified(header_modified=True)
        self._id_index = id_index
    def _sort_entries(self, entry_list):
                # the minimal required trigger is if the execute bit or cached
                # kind has changed.
                if (current_old[1][0][3] != current_new[1].executable or
                        current_old[1][0][0] != current_new_minikind):
                    if tracing:
                        trace.mutter("Updating in-place change '%s'.",
                                     new_path_utf8.decode('utf8'))
                    self.update_minimal(current_old[0], current_new_minikind,
                                        executable=current_new[1].executable,
                                        path_utf8=new_path_utf8, fingerprint=fingerprint,
                                        fullscan=True)
                # both sides are dealt with, move on
                current_old = advance(old_iterator)
                current_new = advance(new_iterator)
            elif (lt_by_dirs(new_dirname, current_old[0][0])
                  or (new_dirname == current_old[0][0] and
                      new_entry_key[1:] < current_old[0][1:])):
                # new comes before:
                # add an entry for this and advance new
                if tracing:
                    trace.mutter("Inserting from new '%s'.",
                                 new_path_utf8.decode('utf8'))
                self.update_minimal(new_entry_key, current_new_minikind,
                                    executable=current_new[1].executable,
                                    path_utf8=new_path_utf8, fingerprint=fingerprint,
                                    fullscan=True)
                current_new = advance(new_iterator)
            else:
                # we've advanced past the place where the old key would be,
                # without seeing it in the new list. so it must be gone.
                if tracing:
                    trace.mutter("Deleting from old '%s/%s'.",
                                 current_old[0][0].decode('utf8'),
                                 current_old[0][1].decode('utf8'))
                self._make_absent(current_old)
                current_old = advance(old_iterator)
        self._mark_modified()
        self._id_index = None
        self._packed_stat_index = None
        if tracing:
            trace.mutter("set_state_from_inventory complete.")
    def set_state_from_scratch(self, working_inv, parent_trees, parent_ghosts):
        """Wipe the currently stored state and set it to something new.

        This is a hard-reset for the data we are working with.
        """
        # Technically, we really want a write lock, but until we write, we
        # don't really need it.
        self._requires_lock()
        # root dir and root dir contents with no children. We have to have a
        # root for set_state_from_inventory to work correctly.
        empty_root = ((b'', b'', inventory.ROOT_ID),
                      [(b'd', b'', 0, False, DirState.NULLSTAT)])
        empty_tree_dirblocks = [(b'', [empty_root]), (b'', [])]
        self._set_data([], empty_tree_dirblocks)
        self.set_state_from_inventory(working_inv)
        self.set_parent_trees(parent_trees, parent_ghosts)
    def _make_absent(self, current_old):
        """Mark current_old - an entry - as absent for tree 0.

            update_block_index, present = \
                self._find_block_index_from_key(update_key)
            if not present:
                raise AssertionError(
                    'could not find block for %s' % (update_key,))
            update_entry_index, present = \
                self._find_entry_index(
                    update_key, self._dirblocks[update_block_index][1])
            if not present:
                raise AssertionError(
                    'could not find entry for %s' % (update_key,))
            update_tree_details = self._dirblocks[update_block_index][1][update_entry_index][1]
            # it must not be absent at the moment
            if update_tree_details[0][0] == b'a':  # absent
                raise AssertionError('bad row %r' % (update_tree_details,))
            update_tree_details[0] = DirState.NULL_PARENT_DETAILS
        self._mark_modified()
        return last_reference
    def update_minimal(self, key, minikind, executable=False, fingerprint=b'',
                       packed_stat=None, size=0, path_utf8=None, fullscan=False):
        """Update an entry to the state in tree 0.

        This will either create a new entry at 'key' or update an existing one.
                    update_block_index, present = \
                        self._find_block_index_from_key(other_key)
                    if not present:
                        raise AssertionError(
                            'could not find block for %s' % (other_key,))
                    update_entry_index, present = \
                        self._find_entry_index(
                            other_key, self._dirblocks[update_block_index][1])
                    if not present:
                        raise AssertionError(
                            'update_minimal: could not find entry for %s' % (other_key,))
                    update_details = self._dirblocks[update_block_index][1][update_entry_index][1][lookup_index]
                    if update_details[0] in (b'a', b'r'):  # relocated, absent
                        # its a pointer or absent in lookup_index's tree, use
                        # it as is.
                        new_entry[1].append(update_details)
                    else:
                        # we have the right key, make a pointer to it.
                        pointer_path = osutils.pathjoin(*other_key[0:2])
                        new_entry[1].append(
                            (b'r', pointer_path, 0, False, b''))
                block.insert(entry_index, new_entry)
                self._add_to_id_index(id_index, key)
            else:
                # Does the new state matter?
                block[entry_index][1][0] = new_details
                    # other trees, so put absent pointers there
                    # This is the vertical axis in the matrix, all pointing
                    # to the real path.
                    block_index, present = self._find_block_index_from_key(
                        entry_key)
                    if not present:
                        raise AssertionError('not present: %r', entry_key)
                    entry_index, present = self._find_entry_index(
                        entry_key, self._dirblocks[block_index][1])
                    if not present:
                        raise AssertionError('not present: %r', entry_key)
                    self._dirblocks[block_index][1][entry_index][1][0] = \
                        (b'r', path_utf8, 0, False, b'')
        # add a containing dirblock if needed.
        if new_details[0] == b'd':
            # GZ 2017-06-09: Using pathjoin why?
            subdir_key = (osutils.pathjoin(*key[0:2]), b'', b'')
            block_index, present = self._find_block_index_from_key(subdir_key)
            if not present:
                self._dirblocks.insert(block_index, (subdir_key[0], []))

        self._mark_modified()
    def _maybe_remove_row(self, block, index, id_index):
        """Remove index if it is absent or relocated across the row.

        id_index is updated accordingly.
        :return: True if we removed the row, False otherwise
        """
        present_in_row = False
        entry = block[index]
        for column in entry[1]:
            if column[0] not in (b'a', b'r'):
                present_in_row = True
                break
        if not present_in_row:
            block.pop(index)
            self._remove_from_id_index(id_index, entry[0])
            return True
        return False
    def _validate(self):
        """Check that invariants on the dirblock are correct.
        # We check this with a dict per tree pointing either to the present
        # name, or None if absent.
        tree_count = self._num_present_parents() + 1
        id_path_maps = [{} for _ in range(tree_count)]
        # Make sure that all renamed entries point to the correct location.
        for entry in self._iter_entries():
            file_id = entry[0][2]
            this_path = osutils.pathjoin(entry[0][0], entry[0][1])
            if len(entry[1]) != tree_count:
                raise AssertionError(
                    "wrong number of entry details for row\n%s"
                    ",\nexpected %d" %
                    (pformat(entry), tree_count))
            absent_positions = 0
            for tree_index, tree_state in enumerate(entry[1]):
                this_tree_map = id_path_maps[tree_index]
                minikind = tree_state[0]
                if minikind in (b'a', b'r'):
                    absent_positions += 1
                # have we seen this id before in this column?
                if file_id in this_tree_map:
                    previous_path, previous_loc = this_tree_map[file_id]
                    # any later mention of this file must be consistent with
                    # what was said before
                    if minikind == b'a':
                        if previous_path is not None:
                            raise AssertionError(
                                "file %s is absent in row %r but also present "
                                "at %r" %
                                (file_id.decode('utf-8'), entry, previous_path))
                    elif minikind == b'r':
                        target_location = tree_state[1]
                        if previous_path != target_location:
                            raise AssertionError(
                                "file %s relocation in row %r but also at %r"
                                % (file_id, entry, previous_path))
                    else:
                        # a file, directory, etc - may have been previously
                        # pointed to by a relocation, which must point here
            # are calculated at the same time, so checking just the size
            # gains nothing w.r.t. performance.
            link_or_sha1 = state._sha1_file(abspath)
            entry[1][0] = (b'f', link_or_sha1, stat_value.st_size,
                           executable, packed_stat)
        else:
            entry[1][0] = (b'f', b'', stat_value.st_size,
                           executable, DirState.NULLSTAT)
            worth_saving = False
    elif minikind == b'd':
        link_or_sha1 = None
        entry[1][0] = (b'd', b'', 0, False, packed_stat)
        if saved_minikind != b'd':
            # This changed from something into a directory. Make sure we
            # have a directory block for it. This doesn't happen very
            # often, so this doesn't have to be super fast.
            block_index, entry_index, dir_present, file_present = \
                state._get_block_entry_index(entry[0][0], entry[0][1], 0)
            state._ensure_block(block_index, entry_index,
                                osutils.pathjoin(entry[0][0], entry[0][1]))
        else:
            worth_saving = False
    elif minikind == b'l':
        if saved_minikind == b'l':
            worth_saving = False
        link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
        if state._cutoff_time is None:
            state._sha_cutoff_time()
        if (stat_value.st_mtime < state._cutoff_time
                and stat_value.st_ctime < state._cutoff_time):
            entry[1][0] = (b'l', link_or_sha1, stat_value.st_size,
                           False, packed_stat)
        else:
            entry[1][0] = (b'l', b'', stat_value.st_size,
                           False, DirState.NULLSTAT)
    if worth_saving:
        state._mark_modified([entry])
    return link_or_sha1
class ProcessEntryPython(object):

    __slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
                 "last_source_parent", "last_target_parent", "include_unchanged",
                 "partial", "use_filesystem_for_exec", "utf8_decode",
                 "searched_specific_files", "search_specific_files",
                 "searched_exact_paths", "search_specific_file_parents", "seen_ids",
                 "state", "source_index", "target_index", "want_unversioned", "tree"]
    def __init__(self, include_unchanged, use_filesystem_for_exec,
                 search_specific_files, state, source_index, target_index,
                 want_unversioned, tree):
        self.old_dirname_to_file_id = {}
        self.new_dirname_to_file_id = {}
        # Are we doing a partial iter_changes?
        self.partial = search_specific_files != {''}
        # Using a list so that we can access the values and change them in
        # nested scope. Each one is [path, file_id, entry]
        self.last_source_parent = [None, None]
                        and stat.S_IEXEC & path_info[3].st_mode)
                else:
                    target_exec = target_details[3]
                return (entry[0][2],
                        (None, self.utf8_decode(path)[0]),
                        True,
                        (False, True),
                        (None, parent_id),
                        (None, self.utf8_decode(entry[0][1])[0]),
                        (None, path_info[2]),
                        (None, target_exec)), True
            else:
                # Its a missing file, report it as such.
                return (entry[0][2],
                        (None, self.utf8_decode(path)[0]),
                        True,
                        (False, False),
                        (None, parent_id),
                        (None, self.utf8_decode(entry[0][1])[0]),
                        (None, None),
                        (None, False)), True
        elif source_minikind in _fdlt and target_minikind in b'a':
            # unversioned, possibly, or possibly not deleted: we dont care.
            # if its still on disk, *and* theres no other entry at this
            # path [we dont know this in this routine at the moment -
            # perhaps we should change this - then it would be an unknown.
            old_path = pathjoin(entry[0][0], entry[0][1])
            # parent id is the entry for the path in the target tree
            parent_id = self.state._get_entry(
                self.source_index, path_utf8=entry[0][0])[0][2]
            if parent_id == entry[0][2]:
                parent_id = None
            return (entry[0][2],
                    (self.utf8_decode(old_path)[0], None),
                    True,
                    (True, False),
                    (parent_id, None),
                    (self.utf8_decode(entry[0][1])[0], None),
                    (DirState._minikind_to_kind[source_minikind], None),
                    (source_details[3], None)), True
        elif source_minikind in _fdlt and target_minikind in b'r':
            # a rename; could be a true rename, or a rename inherited from
            # a renamed parent. TODO: handle this efficiently. Its not
            # common case to rename dirs though, so a correct but slow
            # implementation will do.
            if not osutils.is_inside_any(self.searched_specific_files,
                                         target_details[1]):
                self.search_specific_files.add(target_details[1])
        elif source_minikind in _ra and target_minikind in _ra:
            # neither of the selected trees contain this file,
            # so skip over it. This is not currently directly tested, but
            # is indirectly via test_too_much.TestCommands.test_conflicts.
            pass
        else:
            raise AssertionError("don't know how to compare "
                                 "source_minikind=%r, target_minikind=%r"
                                 % (source_minikind, target_minikind))
        return None, None
    def __iter__(self):

                new_executable = bool(
                    stat.S_ISREG(root_dir_info[3].st_mode)
                    and stat.S_IEXEC & root_dir_info[3].st_mode)
                yield (None,
                       (None, current_root_unicode),
                       True,
                       (False, False),
                       (None, None),
                       (None, splitpath(current_root_unicode)[-1]),
                       (None, root_dir_info[2]),
                       (None, new_executable)
                       )
        initial_key = (current_root, b'', b'')
        block_index, _ = self.state._find_block_index_from_key(initial_key)
        if block_index == 0:
            # we have processed the total root already, but because the
            # initial key matched it we should skip it here.
            block_index += 1
        if root_dir_info and root_dir_info[2] == 'tree-reference':
            current_dir_info = None
        else:
            dir_iterator = osutils._walkdirs_utf8(
                root_abspath, prefix=current_root)
            try:
                current_dir_info = next(dir_iterator)
            except OSError as e:
                # on win32, python2.4 has e.errno == ERROR_DIRECTORY, but
                # python 2.5 has e.errno == EINVAL,
                # and e.winerror == ERROR_DIRECTORY
                e_winerror = getattr(e, 'winerror', None)
                win_errors = (ERROR_DIRECTORY, ERROR_PATH_NOT_FOUND)
                # there may be directories in the inventory even though
                # this path is not a file on disk: so mark it as end of
                # iterator
                if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
                    current_dir_info = None
                elif (sys.platform == 'win32'
                      and (e.errno in win_errors or
                           e_winerror in win_errors)):
                    current_dir_info = None
                else:
                    raise
            else:
                if current_dir_info[0][0] == b'':
                    # remove .bzr from iteration
                    bzr_index = bisect.bisect_left(
                        current_dir_info[1], (b'.bzr',))
                    if current_dir_info[1][bzr_index][0] != b'.bzr':
                        raise AssertionError()
                    del current_dir_info[1][bzr_index]
3644
4020
# walk until both the directory listing and the versioned metadata
3645
4021
# are exhausted.
3646
4022
if (block_index < len(self.state._dirblocks) and
3647
osutils.is_inside(current_root, self.state._dirblocks[block_index][0])):
4023
osutils.is_inside(current_root,
4024
self.state._dirblocks[block_index][0])):
3648
4025
current_block = self.state._dirblocks[block_index]
3650
4027
current_block = None
3651
4028
while (current_dir_info is not None or
3652
4029
current_block is not None):
3653
4030
if (current_dir_info and current_block
3654
and current_dir_info[0][0] != current_block[0]):
3655
if _cmp_by_dirs(current_dir_info[0][0], current_block[0]) < 0:
4031
and current_dir_info[0][0] != current_block[0]):
4032
if _lt_by_dirs(current_dir_info[0][0], current_block[0]):
3656
4033
# filesystem data refers to paths not covered by the dirblock.
3657
4034
# this has two possibilities:
3658
4035
# A) it is versioned but empty, so there is no block for it
3664
4041
# recurse into unknown directories.
3666
4043
while path_index < len(current_dir_info[1]):
3667
current_path_info = current_dir_info[1][path_index]
3668
if self.want_unversioned:
3669
if current_path_info[2] == 'directory':
3670
if self.tree._directory_is_tree_reference(
4044
current_path_info = current_dir_info[1][path_index]
4045
if self.want_unversioned:
4046
if current_path_info[2] == 'directory':
4047
if self.tree._directory_is_tree_reference(
3671
4048
current_path_info[0].decode('utf8')):
3672
current_path_info = current_path_info[:2] + \
3673
('tree-reference',) + current_path_info[3:]
3674
new_executable = bool(
3675
stat.S_ISREG(current_path_info[3].st_mode)
3676
and stat.S_IEXEC & current_path_info[3].st_mode)
3678
(None, utf8_decode(current_path_info[0])[0]),
3682
(None, utf8_decode(current_path_info[1])[0]),
3683
(None, current_path_info[2]),
3684
(None, new_executable))
3685
# dont descend into this unversioned path if it is
3687
if current_path_info[2] in ('directory',
3689
del current_dir_info[1][path_index]
4049
current_path_info = current_path_info[:2] + \
4050
('tree-reference',) + \
4051
current_path_info[3:]
4052
new_executable = bool(
4053
stat.S_ISREG(current_path_info[3].st_mode)
4054
and stat.S_IEXEC & current_path_info[3].st_mode)
4057
(None, utf8_decode(current_path_info[0])[0]),
4061
(None, utf8_decode(current_path_info[1])[0]),
4062
(None, current_path_info[2]),
4063
(None, new_executable))
4064
# dont descend into this unversioned path if it is
4066
if current_path_info[2] in ('directory',
4068
del current_dir_info[1][path_index]
3693
4072
# This dir info has been handled, go to the next
3695
current_dir_info = dir_iterator.next()
4074
current_dir_info = next(dir_iterator)
3696
4075
except StopIteration:
3697
4076
current_dir_info = None
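As an aside, the `.bzr` removal above relies on `bisect.bisect_left` finding a tuple in a listing that is sorted by its first element. A minimal, self-contained sketch of that technique (the function name and entry layout here are illustrative, not breezy's API):

```python
import bisect


def remove_control_dir(entries, name=b'.bzr'):
    """Delete the entry whose first field is `name` from a sorted list.

    `entries` is a list of tuples sorted by their first element, as a
    directory listing would be.  Because (name,) sorts immediately
    before any longer tuple starting with `name`, bisect_left lands on
    the matching entry if it exists.
    """
    index = bisect.bisect_left(entries, (name,))
    if index >= len(entries) or entries[index][0] != name:
        raise AssertionError('%r not present' % (name,))
    del entries[index]
    return entries
```

The same `bisect_left((b'.bzr',))` probe is what lets the code above assert that `.bzr` really was in the listing before deleting it.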
        raise AssertionError(
            "Got entry<->path mismatch for specific path "
            "%r entry %r path_info %r " % (
                path_utf8, entry, path_info))
# Only include changes - we're outside the users requested
# expansion.
if changed:
    self._gather_result_for_consistency(result)
    if (result.kind[0] == 'directory' and
            result.kind[1] != 'directory'):
        # This stopped being a directory, the old children have
        # to be included.
        if entry[1][self.source_index][0] == b'r':
            # renamed, take the source path
            entry_path_utf8 = entry[1][self.source_index][1]
        else:
            entry_path_utf8 = path_utf8
        initial_key = (entry_path_utf8, b'', b'')
        block_index, _ = self.state._find_block_index_from_key(
            initial_key)
        if block_index == 0:
            # The children of the root are in block index 1.
            block_index += 1
        current_block = None
        if block_index < len(self.state._dirblocks):
            current_block = self.state._dirblocks[block_index]
            if not osutils.is_inside(
                    entry_path_utf8, current_block[0]):
                # No entries for this directory at all.
                current_block = None
        if current_block is not None:
            for entry in current_block[1]:
                if entry[1][self.source_index][0] in (b'a', b'r'):
                    # Not in the source tree, so doesn't have to be
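The `_find_block_index_from_key` lookup above, and the "children of the root are in block index 1" bump, both follow from dirblocks being kept sorted by directory name, with the root appearing twice (once for the root entry itself, once for its children). A simplified sketch of that lookup, under the assumption that a dirblock is just a `(dirname, entries)` pair (the real structure carries more state):

```python
import bisect


def find_block_index(dirblocks, dirname):
    """Return the first block index whose directory name is `dirname`.

    `dirblocks` is a list of (dirname, entries) pairs sorted by
    dirname; bisect_left gives the leftmost match, which is why a
    lookup for the root (b'') lands on block 0 and callers wanting the
    root's *children* must then advance to block 1.
    """
    keys = [block[0] for block in dirblocks]
    return bisect.bisect_left(keys, dirname)
```

With the two root blocks first and every other directory after them, a caller that gets index 0 back knows it has found the root entry block rather than the root's children, matching the `block_index += 1` adjustment in the code above.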