lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.
::

    MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
    WHOLE_NUMBER = {digit}, digit;
    REVISION_ID = a non-empty utf8 string;

    dirstate format = header line, full checksum, row count, parent details,
     ghost_details, entries;
    header line = "#bazaar dirstate flat format 3", NL;
    full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
    row count = "num_entries: ", WHOLE_NUMBER, NL;
    parent_details = WHOLE_NUMBER, {REVISION_ID}, NL;
    ghost_details = WHOLE_NUMBER, {REVISION_ID}, NL;

    entry = entry_key, current_entry_details, {parent_entry_details};
    entry_key = dirname, basename, fileid;
    current_entry_details = common_entry_details, working_entry_details;
    parent_entry_details = common_entry_details, history_entry_details;
    common_entry_details = MINIKIND, fingerprint, size, executable;
    working_entry_details = packed_stat;
    history_entry_details = REVISION_ID;

    fingerprint = a nonempty utf8 sequence with meaning defined by minikind.

Given this definition, the following is useful to know::

    entry (aka row) - all the data for a given key.
    entry[0]: The key (dirname, basename, fileid)
    entry[1]: The tree(s) data for this path and id combination.
    entry[1][0]: The current tree
    entry[1][1]: The second tree

For an entry for a tree, we have (using tree 0, the current tree, to
demonstrate)::

    entry[1][0][0]: minikind
    entry[1][0][1]: fingerprint
    entry[1][0][3]: executable
    entry[1][0][4]: packed_stat

OR (for a parent tree)::

    entry[1][1][4]: revision_id
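The entry layout above can be sketched with plain tuples. This is illustrative only: every concrete value (the paths, ids, fake sha1 and packed_stat) is invented for the example, and the real code stores these as StaticTuples.

```python
# A hand-built entry following the documented layout.
entry = (
    (b'subdir', b'file.txt', b'file-id-1'),   # entry[0]: the key
    [
        # entry[1][0]: current tree details
        # (minikind, fingerprint, size, executable, packed_stat)
        (b'f', b'0' * 40, 12, False, b'xxxxxxxxxxxxxxxx'),
        # entry[1][1]: parent tree details
        # (minikind, fingerprint, size, executable, revision_id)
        (b'f', b'0' * 40, 12, False, b'rev-id-1'),
    ],
)
dirname, basename, file_id = entry[0]
minikind = entry[1][0][0]      # b'f': a file in the current tree
executable = entry[1][0][3]
packed_stat = entry[1][0][4]   # stat-cache data for the working tree
revision_id = entry[1][1][4]   # last-changed revision in the parent tree
```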
There may be multiple rows at the root, one per id present in the root, so the
in memory root row is now::

    self._dirblocks[0] -> ('', [entry ...]),

and the entries in there are::

    entries[0][2]: file_id
    entries[1][0]: The tree data for the current tree for this fileid at /

b'r' is a relocated entry: This path is not present in this tree with this
    id, but the id can be found at another location. The fingerprint is
    used to point to the target location.
b'a' is an absent entry: In that tree the id is not present at this path.
b'd' is a directory entry: This path in this tree is a directory with the
    current file id. There is no fingerprint for directories.
b'f' is a file entry: As for directory, but it's a file. The fingerprint is
    the sha1 value of the file's canonical form, i.e. after any read
    filters have been applied to the convenience form stored in the
    working tree.
b'l' is a symlink entry: As for directory, but a symlink. The fingerprint
    is the link target.
b't' is a reference to a nested subtree; the fingerprint is the referenced
    revision.
The entries on disk and in memory are ordered according to the following keys::

    directory, as a list of components
    filename
    file-id
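Treating the directory as a list of components matters: splitting the dirname on b'/' keeps a directory's children grouped immediately after it, which a plain byte-string comparison would not. A small sketch (the helper name is illustrative, not the module's real function):

```python
# Sort by the directory split into path components, then filename,
# then file-id, as the ordering above specifies.
def entry_sort_key(entry_key):
    dirname, basename, file_id = entry_key
    return (dirname.split(b'/'), basename, file_id)

keys = [
    (b'a/b', b'c', b'id3'),
    (b'a', b'z', b'id2'),
    (b'a-b', b'c', b'id1'),
]
keys.sort(key=entry_sort_key)
# Component-wise, dirname b'a/b' sorts before b'a-b'; a plain byte
# comparison would put b'a-b' first because '-' < '/'.
```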
--- Format 1 had the following different definition: ---

::

    rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
        WHOLE_NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
        {PARENT ROW}
    PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
        basename, NULL, WHOLE_NUMBER (* size *), NULL, "y" | "n", NULL,
        SHA1

PARENT ROWs are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
each row will have a PARENT ROW for foo and baz, but not for bar.
ERROR_DIRECTORY = 267
if not getattr(struct, '_compile', None):
    # Cannot pre-compile the dirstate pack_stat
    def pack_stat(st, _encode=binascii.b2a_base64, _pack=struct.pack):
        """Convert stat values into a packed representation."""
        return _encode(_pack('>LLLLLL', st.st_size, int(st.st_mtime),
                             int(st.st_ctime), st.st_dev,
                             st.st_ino & 0xFFFFFFFF, st.st_mode))[:-1]
else:
    # compile the struct compiler we need, so as to only do it once
    from _struct import Struct
    _compiled_pack = Struct('>LLLLLL').pack

    def pack_stat(st, _encode=binascii.b2a_base64, _pack=_compiled_pack):
        """Convert stat values into a packed representation."""
        # jam 20060614 it isn't really worth removing more entries if we
        # are going to leave it in packed form.
        # With only st_mtime and st_mode filesize is 5.5M and read time is 275ms
        # With all entries, filesize is 5.9M and read time is maybe 280ms
        # well within the noise margin
        # base64 encoding always adds a final newline, so strip it off
        # The current version
        return _encode(_pack(st.st_size, int(st.st_mtime), int(st.st_ctime),
                             st.st_dev, st.st_ino & 0xFFFFFFFF,
                             st.st_mode))[:-1]
        # This is 0.060s / 1.520s faster by not encoding as much information
        # return _encode(_pack('>LL', int(st.st_mtime), st.st_mode))[:-1]
        # This is not strictly faster than _encode(_pack())[:-1]
        # return '%X.%X.%X.%X.%X.%X' % (
        #     st.st_size, int(st.st_mtime), int(st.st_ctime),
        #     st.st_dev, st.st_ino, st.st_mode)
        # Similar to the _encode(_pack('>LL'))
        # return '%X.%X' % (int(st.st_mtime), st.st_mode)
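As a standalone illustration of what pack_stat produces: six 32-bit big-endian fields, base64-encoded with the trailing newline stripped. Note this sketch masks every field to 32 bits for portability, which the code above only does for st_ino.

```python
import binascii
import os
import struct

_pack6 = struct.Struct('>LLLLLL').pack

def pack_stat_sketch(st):
    # 6 fields * 4 bytes = 24 bytes -> 32 base64 characters
    return binascii.b2a_base64(_pack6(
        st.st_size & 0xFFFFFFFF, int(st.st_mtime) & 0xFFFFFFFF,
        int(st.st_ctime) & 0xFFFFFFFF, st.st_dev & 0xFFFFFFFF,
        st.st_ino & 0xFFFFFFFF, st.st_mode & 0xFFFFFFFF))[:-1]

packed = pack_stat_sketch(os.stat('.'))
```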
class DirstateCorrupt(errors.BzrError):

    _fmt = "The dirstate file (%(state)s) appears to be corrupt: %(msg)s"

    def __init__(self, state, msg):
        errors.BzrError.__init__(self)
        self.state = state
        self.msg = msg
class SHA1Provider(object):
    NOT_IN_MEMORY = 0
    IN_MEMORY_UNMODIFIED = 1
    IN_MEMORY_MODIFIED = 2
    IN_MEMORY_HASH_MODIFIED = 3  # Only hash-cache updates

    # A pack_stat (the x's) that is just noise and will never match the output
    # of base64 encode.
    NULL_PARENT_DETAILS = static_tuple.StaticTuple(b'a', b'', 0, False, b'')

    HEADER_FORMAT_2 = b'#bazaar dirstate flat format 2\n'
    HEADER_FORMAT_3 = b'#bazaar dirstate flat format 3\n'

    def __init__(self, path, sha1_provider, worth_saving_limit=0,
                 use_filesystem_for_exec=True):
        """Create a DirState object.

        :param path: The path at which the dirstate file on disk should live.
        :param sha1_provider: an object meeting the SHA1Provider interface.
        :param worth_saving_limit: when the exact number of hash changed
            entries is known, only bother saving the dirstate if more than
            this count of entries have changed.
            -1 means never save hash changes, 0 means always save hash changes.
        :param use_filesystem_for_exec: Whether to trust the filesystem
            for executable bit information.
        """
        # _header_state and _dirblock_state represent the current state
        # of the dirstate metadata and the per-row data respectively.
        self._last_block_index = None
        self._last_entry_index = None
        # The set of known hash changes
        self._known_hash_changes = set()
        # How many hash changed entries can we have without saving
        self._worth_saving_limit = worth_saving_limit
        self._config_stack = config.LocationStack(urlutils.local_path_to_url(
            path))
        self._use_filesystem_for_exec = use_filesystem_for_exec
    def __repr__(self):
        return "%s(%r)" % \
            (self.__class__.__name__, self._filename)
    def _mark_modified(self, hash_changed_entries=None, header_modified=False):
        """Mark this dirstate as modified.

        :param hash_changed_entries: if non-None, mark just these entries as
            having their hash modified.
        :param header_modified: mark the header modified as well, not just the
            dirblocks.
        """
        # trace.mutter_callsite(3, "modified hash entries: %s", hash_changed_entries)
        if hash_changed_entries:
            self._known_hash_changes.update(
                [e[0] for e in hash_changed_entries])
            if self._dirblock_state in (DirState.NOT_IN_MEMORY,
                                        DirState.IN_MEMORY_UNMODIFIED):
                # If the dirstate is already marked a IN_MEMORY_MODIFIED, then
                # that takes precedence.
                self._dirblock_state = DirState.IN_MEMORY_HASH_MODIFIED
        else:
            # TODO: Since we now have a IN_MEMORY_HASH_MODIFIED state, we
            #       should fail noisily if someone tries to set
            #       IN_MEMORY_MODIFIED but we don't have a write-lock!
            # We don't know exactly what changed so disable smart saving
            self._dirblock_state = DirState.IN_MEMORY_MODIFIED
        if header_modified:
            self._header_state = DirState.IN_MEMORY_MODIFIED

    def _mark_unmodified(self):
        """Mark this dirstate as unmodified."""
        self._header_state = DirState.IN_MEMORY_UNMODIFIED
        self._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
        self._known_hash_changes = set()
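The dirblock-state bookkeeping above amounts to a small state machine; a standalone sketch of the transition rule (names are illustrative, not the class's real API):

```python
NOT_IN_MEMORY, IN_MEMORY_UNMODIFIED, IN_MEMORY_MODIFIED, \
    IN_MEMORY_HASH_MODIFIED = range(4)

def next_dirblock_state(current, hash_changed_only):
    # A full modification always wins; a hash-only change upgrades the
    # state only when nothing stronger is already recorded.
    if not hash_changed_only:
        return IN_MEMORY_MODIFIED
    if current in (NOT_IN_MEMORY, IN_MEMORY_UNMODIFIED):
        return IN_MEMORY_HASH_MODIFIED
    return current
```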
    def add(self, path, file_id, kind, stat, fingerprint):
        """Add a path to be tracked.

        :param path: The path within the dirstate - b'' is the root, 'foo' is the
            path foo within the root, 'foo/bar' is the path bar within foo
            within the root.
        :param file_id: The file id of the path being added.
        utf8path = (dirname + '/' + basename).strip('/').encode('utf8')
        dirname, basename = osutils.split(utf8path)
        # uses __class__ for speed; the check is needed for safety
        if file_id.__class__ is not bytes:
            raise AssertionError(
                "must be a utf8 file_id not %s" % (type(file_id), ))
        # Make sure the file_id does not exist in this tree
        rename_from = None
        file_id_entry = self._get_entry(
            0, fileid_utf8=file_id, include_deleted=True)
        if file_id_entry != (None, None):
            if file_id_entry[1][0][0] == b'a':
                if file_id_entry[0] != (dirname, basename, file_id):
                    # set the old name's current operation to rename
                    self.update_minimal(file_id_entry[0],
                                        b'r',
                                        path_utf8=b'',
                                        packed_stat=b'',
                                        fingerprint=utf8path)
                    rename_from = file_id_entry[0][0:2]
            else:
                path = osutils.pathjoin(
                    file_id_entry[0][0], file_id_entry[0][1])
                kind = DirState._minikind_to_kind[file_id_entry[1][0][0]]
                info = '%s:%s' % (kind, path)
                raise errors.DuplicateFileId(file_id, info)
        first_key = (dirname, basename, b'')
        block_index, present = self._find_block_index_from_key(first_key)
        if present:
            # check the path is not in the tree
            block = self._dirblocks[block_index][1]
            entry_index, _ = self._find_entry_index(first_key, block)
            while (entry_index < len(block) and
                   block[entry_index][0][0:2] == first_key[0:2]):
                if block[entry_index][1][0][0] not in (b'a', b'r'):
                    # this path is in the dirstate in the current tree.
                    raise Exception("adding already added path!")
                entry_index += 1
        else:
            # The block where we want to put the file is not present. But it
                    fingerprint, new_child_path)
            self._check_delta_ids_absent(new_ids, delta, 0)
            self._apply_removals(viewitems(removals))
            self._apply_insertions(viewvalues(insertions))
            # Validate parents
            self._after_delta_check_parents(parents, 0)
        except errors.BzrError as e:
            self._changes_aborted = True
            if 'integrity error' not in str(e):
                raise
            # _get_entry raises BzrError when a request is inconsistent; we
            # want such errors to be shown as InconsistentDelta - and that
            # fits the behaviour we trigger.
            raise errors.InconsistentDeltaDelta(delta,
                "error from _get_entry. %s" % (e,))
    def _apply_removals(self, removals):
        for file_id, path in sorted(removals, reverse=True,
                                    key=operator.itemgetter(1)):
            dirname, basename = osutils.split(path)
            block_i, entry_i, d_present, f_present = \
                self._get_block_entry_index(dirname, basename, 0)
            try:
                entry = self._dirblocks[block_i][1][entry_i]
            except IndexError:
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if not f_present or entry[1][0][0] in (b'a', b'r'):
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if file_id != entry[0][2]:
                self._raise_invalid(path, file_id,
                    "Attempt to remove path has wrong id - found %r."
                    % (entry[0][2],))
            self._make_absent(entry)
            # See if we have a malformed delta: deleting a directory must not
            # leave crud behind. This increases the number of bisects needed
        new_ids = set()
        for old_path, new_path, file_id, inv_entry in delta:
            if file_id.__class__ is not bytes:
                raise AssertionError(
                    "must be a utf8 file_id not %s" % (type(file_id), ))
            if inv_entry is not None and file_id != inv_entry.file_id:
                self._raise_invalid(new_path, file_id,
                    "mismatched entry file_id %r" % inv_entry)
            if new_path is None:
                new_path_utf8 = None
            else:
                if inv_entry is None:
                    self._raise_invalid(new_path, file_id,
                        "new_path with no entry")
                new_path_utf8 = encode(new_path)
                # note the parent for validation
                dirname_utf8, basename_utf8 = osutils.split(new_path_utf8)
                if basename_utf8:
                    parents.add((dirname_utf8, inv_entry.parent_id))
            if old_path is None:
                old_path_utf8 = None
            else:
                old_path_utf8 = encode(old_path)
            if old_path is None:
                adds.append((None, new_path_utf8, file_id,
                             inv_to_entry(inv_entry), True))
                new_ids.add(file_id)
            elif new_path is None:
                deletes.append((old_path_utf8, None, file_id, None, True))
            elif (old_path, new_path) == root_only:
                # change things in-place
                # Note: the case of a parent directory changing its file_id
                #       tends to break optimizations here, because officially
                #       the file has actually been moved, it just happens to
                #       end up at the same path. If we can figure out how to
                #       handle that case, we can avoid a lot of add+delete
                #       pairs for objects that stay put.
                # elif old_path == new_path:
                changes.append((old_path_utf8, new_path_utf8, file_id,
                                inv_to_entry(inv_entry)))
            else:
                # Because renames must preserve their children we must have
                # processed all relocations and removes before hand. The sort
                # order ensures we've examined the child paths, but we also
                # have to execute the removals, or the split to an add/delete
                # pair will result in the deleted item being reinserted, or
                # renamed items being reinserted twice - and possibly at the
                # wrong place. Splitting into a delete/add pair also simplifies
                # the handling of entries with (b'f', ...), (b'r' ...) because
                # the target of the b'r' is old_path here, and we add that to
                # deletes, meaning that the add handler does not need to check
                # for b'r' items on every pass.
                self._update_basis_apply_deletes(deletes)
                deletes = []
                # Split into an add/delete pair recursively.
                adds.append((old_path_utf8, new_path_utf8, file_id,
                             inv_to_entry(inv_entry), False))
                # Expunge deletes that we've seen so that deleted/renamed
                # children of a rename directory are handled correctly.
                new_deletes = reversed(list(
                    self._iter_child_entries(1, old_path_utf8)))
                # Remove the current contents of the tree at orig_path, and
                # reinsert at the correct new path.
                for entry in new_deletes:
                    child_dirname, child_basename, child_file_id = entry[0]
                    if child_dirname:
                        source_path = child_dirname + b'/' + child_basename
                    else:
                        source_path = child_basename
                    if new_path_utf8:
                        target_path = \
                            new_path_utf8 + source_path[len(old_path_utf8):]
                    else:
                        if old_path_utf8 == b'':
                            raise AssertionError("cannot rename directory to"
                                                 " itself")
                        target_path = source_path[len(old_path_utf8) + 1:]
                    adds.append(
                        (None, target_path, entry[0][2], entry[1][1], False))
                    deletes.append(
                        (source_path, target_path, entry[0][2], None, False))
                deletes.append(
                    (old_path_utf8, new_path_utf8, file_id, None, False))
        self._check_delta_ids_absent(new_ids, delta, 1)
        try:
            # Finish expunging deletes/first half of renames.
        # Adds are accumulated partly from renames, so can be in any input
        # order - sort it.
        # TODO: we may want to sort in dirblocks order. That way each entry
        #       will end up in the same directory, allowing the _get_entry
        #       fast-path for looking up 2 items in the same dir work.
        adds.sort(key=lambda x: x[1])
        # adds is now in lexicographic order, which places all parents before
        # their children, so we can process it linearly.
        st = static_tuple.StaticTuple
        for old_path, new_path, file_id, new_details, real_add in adds:
            dirname, basename = osutils.split(new_path)
            entry_key = st(dirname, basename, file_id)
            block_index, present = self._find_block_index_from_key(entry_key)
            if not present:
                # The block where we want to put the file is not present.
                # However, it might have just been an empty directory. Look for
                # the parent in the basis-so-far before throwing an error.
                parent_dir, parent_base = osutils.split(dirname)
                parent_block_idx, parent_entry_idx, _, parent_present = \
                    self._get_block_entry_index(parent_dir, parent_base, 1)
                if not parent_present:
                    self._raise_invalid(new_path, file_id,
                        "Unable to find block for this record."
                        " Was the parent added?")
                self._ensure_block(parent_block_idx, parent_entry_idx, dirname)

            block = self._dirblocks[block_index][1]
            entry_index, present = self._find_entry_index(entry_key, block)
            if real_add and old_path is not None:
                self._raise_invalid(new_path, file_id,
                    'considered a real add but still had old_path at %s'
                    % (old_path,))
            if present:
                entry = block[entry_index]
                basis_kind = entry[1][1][0]
                if basis_kind == b'a':
                    entry[1][1] = new_details
                elif basis_kind == b'r':
                    raise NotImplementedError()
                else:
                    self._raise_invalid(new_path, file_id,
                        "An entry was marked as a new add"
                        " but the basis target already existed")
            else:
                # The exact key was not found in the block. However, we need to
                # check if there is a key next to us that would have matched.
                # We only need to check 2 locations, because there are only 2
                # trees.
                for maybe_index in range(entry_index - 1, entry_index + 1):
                    if maybe_index < 0 or maybe_index >= len(block):
                        continue
                    maybe_entry = block[maybe_index]
                    if maybe_entry[0][:2] != (dirname, basename):
                        # Just a random neighbor
                        continue
                    if maybe_entry[0][2] == file_id:
                        raise AssertionError(
                            "_find_entry_index didn't find a key match"
                            ' but walking the data did, for %s'
                            % (entry_key,))
                    basis_kind = maybe_entry[1][1][0]
                    if basis_kind not in (b'a', b'r'):
                        self._raise_invalid(new_path, file_id,
                            "we have an add record for path, but the path"
                            " is already present with another file_id %s"
                            % (maybe_entry[0][2],))
                entry = (entry_key, [DirState.NULL_PARENT_DETAILS,
                                     new_details])
                block.insert(entry_index, entry)

            active_kind = entry[1][0][0]
            if active_kind == b'a':
                # The active record shows up as absent, this could be genuine,
                # or it could be present at some other location. We need to
                # verify.
                id_index = self._get_id_index()
                # The id_index may not be perfectly accurate for tree1, because
                # we haven't been keeping it updated. However, it should be
                # fine for tree0, and that gives us enough info for what we
                # need.
                keys = id_index.get(file_id, ())
                for key in keys:
                    block_i, entry_i, d_present, f_present = \
                        self._get_block_entry_index(key[0], key[1], 0)
                    if not f_present:
                        continue
                    active_entry = self._dirblocks[block_i][1][entry_i]
                    if (active_entry[0][2] != file_id):
                        # Some other file is at this path, we don't need to
                        # consider it.
                        continue
                    real_active_kind = active_entry[1][0][0]
                    if real_active_kind in (b'a', b'r'):
                        # We found a record, which was not *this* record,
                        # which matches the file_id, but is not actually
                        # present. Something seems *really* wrong.
                        self._raise_invalid(new_path, file_id,
                            "We found a tree0 entry that doesn't make sense")
                    # Now, we've found a tree0 entry which matches the file_id
                    # but is at a different location. So update them to be
                    # rename records.
                    active_dir, active_name = active_entry[0][:2]
                    if active_dir:
                        active_path = active_dir + b'/' + active_name
                    else:
                        active_path = active_name
                    active_entry[1][1] = st(b'r', new_path, 0, False, b'')
                    entry[1][0] = st(b'r', active_path, 0, False, b'')
            elif active_kind == b'r':
                raise NotImplementedError()

            new_kind = new_details[0]
            if new_kind == b'd':
                self._ensure_block(block_index, entry_index, new_path)
    def _update_basis_apply_changes(self, changes):
        """Apply a sequence of changes to tree 1 during update_basis_by_delta.
        null = DirState.NULL_PARENT_DETAILS
        for old_path, new_path, file_id, _, real_delete in deletes:
            if real_delete != (new_path is None):
                self._raise_invalid(old_path, file_id, "bad delete delta")
            # the entry for this file_id must be in tree 1.
            dirname, basename = osutils.split(old_path)
            block_index, entry_index, dir_present, file_present = \
                self._get_block_entry_index(dirname, basename, 1)
            if not file_present:
                self._raise_invalid(old_path, file_id,
                    'basis tree does not contain removed entry')
            entry = self._dirblocks[block_index][1][entry_index]
            # The state of the entry in the 'active' WT
            active_kind = entry[1][0][0]
            if entry[0][2] != file_id:
                self._raise_invalid(old_path, file_id,
                    'mismatched file_id in tree 1')
            old_kind = entry[1][1][0]
            if active_kind in b'ar':
                # The active tree doesn't have this file_id.
                # The basis tree is changing this record. If this is a
                # rename, then we don't want the record here at all
                # anymore. If it is just an in-place change, we want the
                # record here, but we'll add it if we need to. So we just
                # delete it
                if active_kind == b'r':
                    active_path = entry[1][0][1]
                    active_entry = self._get_entry(0, file_id, active_path)
                    if active_entry[1][1][0] != b'r':
                        self._raise_invalid(old_path, file_id,
                            "Dirstate did not have matching rename entries")
                    elif active_entry[1][0][0] in b'ar':
                        self._raise_invalid(old_path, file_id,
                            "Dirstate had a rename pointing at an inactive"
                            " tree0")
                    active_entry[1][1] = null
                del self._dirblocks[block_index][1][entry_index]
                if old_kind == b'd':
                    # This was a directory, and the active tree says it
                    # doesn't exist, and now the basis tree says it doesn't
                    # exist. Remove its dirblock if present
                    (dir_block_index,
                     present) = self._find_block_index_from_key(
                        (old_path, b'', b''))
                    if present:
                        dir_block = self._dirblocks[dir_block_index][1]
                        if not dir_block:
                            # This entry is empty, go ahead and just remove it
                            del self._dirblocks[dir_block_index]
            else:
                # There is still an active record, so just mark this
                # removed.
                entry[1][1] = null
                block_i, entry_i, d_present, f_present = \
                    self._get_block_entry_index(old_path, b'', 1)
                if d_present:
                    dir_block = self._dirblocks[block_i][1]
                    for child_entry in dir_block:
                        child_basis_kind = child_entry[1][1][0]
                        if child_basis_kind not in b'ar':
                            self._raise_invalid(old_path, file_id,
                                "The file id was deleted but its children were "
                                "not deleted.")
    def _after_delta_check_parents(self, parents, index):
        """Check that parents required by the delta are all intact.

        :param parents: An iterable of (path_utf8, file_id) tuples which are
            required to be present in tree 'index' at path_utf8 with id file_id
            and be a directory.
        """
        tree present there.
        """
        self._read_dirblocks_if_needed()
        key = dirname, basename, b''
        block_index, present = self._find_block_index_from_key(key)
        if not present:
            # no such directory - return the dir index and 0 for the row.
            return block_index, 0, False, False
        block = self._dirblocks[block_index][1]  # access the entries only
        entry_index, present = self._find_entry_index(key, block)
        # linear search through entries at this path to find the one
        # requested.
        while entry_index < len(block) and block[entry_index][0][1] == basename:
            if block[entry_index][1][tree_index][0] not in (b'a', b'r'):
                # neither absent nor relocated
                return block_index, entry_index, True, True
            entry_index += 1
        return block_index, entry_index, True, False
    def _get_entry(self, tree_index, fileid_utf8=None, path_utf8=None,
                   include_deleted=False):
        """Get the dirstate entry for path in tree tree_index.

        If either file_id or path is supplied, it is used as the key to lookup.
    def _get_id_index(self):
        """Get an id index of self._dirblocks.

        This maps from file_id => [(directory, name, file_id)] entries where
        that file_id appears in one of the trees.
        """
        if self._id_index is None:
            id_index = {}
            for key, tree_details in self._iter_entries():
                self._add_to_id_index(id_index, key)
            self._id_index = id_index
        return self._id_index
    def _add_to_id_index(self, id_index, entry_key):
        """Add this entry to the _id_index mapping."""
        # This code used to use a set for every entry in the id_index. However,
        # it is *rare* to have more than one entry. So a set is a large
        # overkill. And even when we do, we won't ever have more than the
        # number of parent trees. Which is still a small number (rarely >2). As
        # such, we use a simple tuple, and do our own uniqueness checks. While
        # the 'in' check is O(N) since N is nicely bounded it shouldn't ever
        # cause quadratic failure.
        file_id = entry_key[2]
        entry_key = static_tuple.StaticTuple.from_sequence(entry_key)
        if file_id not in id_index:
            id_index[file_id] = static_tuple.StaticTuple(entry_key,)
        else:
            entry_keys = id_index[file_id]
            if entry_key not in entry_keys:
                id_index[file_id] = entry_keys + (entry_key,)

    def _remove_from_id_index(self, id_index, entry_key):
        """Remove this entry from the _id_index mapping.

        It is a programming error to call this when the entry_key is not
        present.
        """
        file_id = entry_key[2]
        entry_keys = list(id_index[file_id])
        entry_keys.remove(entry_key)
        id_index[file_id] = static_tuple.StaticTuple.from_sequence(entry_keys)
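# ---------------------------------------------------------------------
# Illustrative sketch (not part of the original module): the grow-a-tuple
# uniqueness scheme used by _add_to_id_index, shown with plain tuples in
# place of static_tuple.StaticTuple.
def _sketch_add_to_id_index(id_index, entry_key):
    file_id = entry_key[2]
    if file_id not in id_index:
        id_index[file_id] = (entry_key,)
    elif entry_key not in id_index[file_id]:
        # O(N) membership test, but N is bounded by the number of trees.
        id_index[file_id] = id_index[file_id] + (entry_key,)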
    def _get_output_lines(self, lines):
        """Format lines for final output."""
        output_lines = [DirState.HEADER_FORMAT_3]
        lines.append(b'')  # a final newline
        inventory_text = b'\0\n\0'.join(lines)
        output_lines.append(b'crc32: %d\n' % (zlib.crc32(inventory_text),))
        # -3, 1 for num parents, 1 for ghosts, 1 for final newline
        num_entries = len(lines) - 3
        output_lines.append(b'num_entries: %d\n' % (num_entries,))
        output_lines.append(inventory_text)
        return output_lines
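# ---------------------------------------------------------------------
# Illustrative sketch (not part of the original module): how the output of
# _get_output_lines matches the "dirstate flat format 3" grammar - a header
# line, a "crc32: " line, a "num_entries: " line, then the entry text.  The
# header literal below is assumed to equal DirState.HEADER_FORMAT_3.
import zlib

_SKETCH_HEADER = b'#bazaar dirstate flat format 3\n'

def _sketch_output_lines(lines):
    out = [_SKETCH_HEADER]
    lines = lines + [b'']  # a final newline
    inventory_text = b'\0\n\0'.join(lines)
    out.append(b'crc32: %d\n' % (zlib.crc32(inventory_text),))
    # -3: one line for parents, one for ghosts, one for the final newline
    out.append(b'num_entries: %d\n' % (len(lines) - 3,))
    out.append(inventory_text)
    return out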
    def _make_deleted_row(self, fileid_utf8, parents):
        """Return a deleted row for fileid_utf8."""
        return (b'/', b'RECYCLED.BIN', b'file', fileid_utf8, 0,
                DirState.NULLSTAT, b''), parents

    def _num_present_parents(self):
        """The number of parent entries in each record row."""
        return len(self._parents) - len(self._ghosts)
    @classmethod
    def on_file(cls, path, sha1_provider=None, worth_saving_limit=0,
                use_filesystem_for_exec=True):
        """Construct a DirState on the file at path "path".

        :param path: The path at which the dirstate file on disk should live.
        :param sha1_provider: an object meeting the SHA1Provider interface.
            If None, a DefaultSHA1Provider is used.
        :param worth_saving_limit: when the exact number of hash changed
            entries is known, only bother saving the dirstate if more than
            this count of entries have changed. -1 means never save.
        :param use_filesystem_for_exec: Whether to trust the filesystem
            for executable bit information
        :return: An unlocked DirState object, associated with the given path.
        """
        if sha1_provider is None:
            sha1_provider = DefaultSHA1Provider()
        result = cls(path, sha1_provider,
                     worth_saving_limit=worth_saving_limit,
                     use_filesystem_for_exec=use_filesystem_for_exec)
        return result
2428
def _read_dirblocks_if_needed(self):
2243
2478
raise errors.BzrError(
2244
2479
'invalid header line: %r' % (header,))
2245
2480
crc_line = self._state_file.readline()
2246
if not crc_line.startswith('crc32: '):
2481
if not crc_line.startswith(b'crc32: '):
2247
2482
raise errors.BzrError('missing crc32 checksum: %r' % crc_line)
2248
self.crc_expected = int(crc_line[len('crc32: '):-1])
2483
self.crc_expected = int(crc_line[len(b'crc32: '):-1])
2249
2484
num_entries_line = self._state_file.readline()
2250
if not num_entries_line.startswith('num_entries: '):
2485
if not num_entries_line.startswith(b'num_entries: '):
2251
2486
raise errors.BzrError('missing num_entries line')
2252
self._num_entries = int(num_entries_line[len('num_entries: '):-1])
2487
self._num_entries = int(num_entries_line[len(b'num_entries: '):-1])
    def sha1_from_stat(self, path, stat_result):
        """Find a sha1 given a stat lookup."""
        return self._get_packed_stat_index().get(pack_stat(stat_result), None)

    def _get_packed_stat_index(self):
        """Get a packed_stat index of self._dirblocks."""
        if self._packed_stat_index is None:
            index = {}
            for key, tree_details in self._iter_entries():
                if tree_details[0][0] == b'f':
                    index[tree_details[0][4]] = tree_details[0][1]
            self._packed_stat_index = index
        return self._packed_stat_index
            # Should this be a warning? For now, I'm expecting that places that
            # mark it inconsistent will warn, making a warning here redundant.
            trace.mutter('Not saving DirState because '
                         '_changes_aborted is set.')
            return
        # TODO: Since we now distinguish IN_MEMORY_MODIFIED from
        # IN_MEMORY_HASH_MODIFIED, we should only fail quietly if we fail
        # to save an IN_MEMORY_HASH_MODIFIED, and fail *noisily* if we
        # fail to save IN_MEMORY_MODIFIED
        if not self._worth_saving():
            return

        grabbed_write_lock = False
        if self._lock_state != 'w':
            grabbed_write_lock, new_lock = self._lock_token.temporary_write_lock()
            # Switch over to the new lock, as the old one may be closed.
            # TODO: jam 20070315 We should validate the disk file has
            #       not changed contents, since temporary_write_lock may
            #       not be an atomic operation.
            self._lock_token = new_lock
            self._state_file = new_lock.f
            if not grabbed_write_lock:
                # We couldn't grab a write lock, so we switch back to a read one
                return
        try:
            lines = self.get_lines()
            self._state_file.seek(0)
            self._state_file.writelines(lines)
            self._state_file.truncate()
            self._state_file.flush()
            self._maybe_fdatasync()
            self._mark_unmodified()
        finally:
            if grabbed_write_lock:
                self._lock_token = self._lock_token.restore_read_lock()
                self._state_file = self._lock_token.f
                # TODO: jam 20070315 We should validate the disk file has
                #       not changed contents. Since restore_read_lock may
                #       not be an atomic operation.
    def _maybe_fdatasync(self):
        """Flush to disk if possible and if not configured off."""
        if self._config_stack.get('dirstate.fdatasync'):
            osutils.fdatasync(self._state_file.fileno())

    def _worth_saving(self):
        """Is it worth saving the dirstate or not?"""
        if (self._header_state == DirState.IN_MEMORY_MODIFIED
                or self._dirblock_state == DirState.IN_MEMORY_MODIFIED):
            return True
        if self._dirblock_state == DirState.IN_MEMORY_HASH_MODIFIED:
            if self._worth_saving_limit == -1:
                # We never save hash changes when the limit is -1
                return False
            # If we're using smart saving and only a small number of
            # entries have changed their hash, don't bother saving. John has
            # suggested using a heuristic here based on the size of the
            # changed files and/or tree. For now, we go with a configurable
            # number of changes, keeping the calculation time
            # as low overhead as possible. (This also keeps all existing
            # tests passing as the default is 0, i.e. always save.)
            if len(self._known_hash_changes) >= self._worth_saving_limit:
                return True
        return False
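# ---------------------------------------------------------------------
# Illustrative sketch (not part of the original module): the decision rule
# implemented by _worth_saving, with the state flags passed in explicitly.
def _sketch_worth_saving(header_modified, blocks_modified,
                         hash_modified, num_hash_changes, limit):
    if header_modified or blocks_modified:
        return True
    if hash_modified:
        if limit == -1:
            # never persist pure hash-cache updates
            return False
        return num_hash_changes >= limit
    return False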
    def _set_data(self, parent_ids, dirblocks):
        """Set the full dirstate data in memory."""
                        # mapping from path,id. We need to look up the correct path
                        # for the indexes from 0 to tree_index -1
                        new_details = []
                        for lookup_index in range(tree_index):
                            # boundary case: this is the first occurrence of file_id
                            # so there are no id_indexes, possibly take this out of
                            # the loop?
                            if not len(entry_keys):
                                new_details.append(DirState.NULL_PARENT_DETAILS)
                            else:
                                # grab any one entry, use it to find the right path.
                                # TODO: optimise this to reduce memory use in highly
                                # fragmented situations by reusing the relocation
                                # records.
                                a_key = next(iter(entry_keys))
                                if by_path[a_key][lookup_index][0] in (b'r', b'a'):
                                    # it's a pointer or missing statement, use it as
                                    # is.
                                    new_details.append(
                                        by_path[a_key][lookup_index])
                                else:
                                    # we have the right key, make a pointer to it.
                                    real_path = (b'/'.join(a_key[0:2])).strip(b'/')
                                    new_details.append(st(b'r', real_path, 0, False,
                                                          b''))
                        new_details.append(self._inv_entry_to_details(entry))
                        new_details.extend(new_location_suffix)
                        by_path[new_entry_key] = new_details
                        self._add_to_id_index(id_index, new_entry_key)
        # --- end generation of full tree mappings

        # sort and output all the entries
        new_entries = self._sort_entries(viewitems(by_path))
        self._entries_to_current_state(new_entries)
        self._parents = [rev_id for rev_id, tree in trees]
        self._ghosts = list(ghosts)
        self._mark_modified(header_modified=True)
        self._id_index = id_index
2780
def _sort_entries(self, entry_list):
2604
                # the minimal required trigger is if the execute bit or cached
                # kind has changed.
                if (current_old[1][0][3] != current_new[1].executable or
                        current_old[1][0][0] != current_new_minikind):
                    if tracing:
                        trace.mutter("Updating in-place change '%s'.",
                                     new_path_utf8.decode('utf8'))
                    self.update_minimal(current_old[0], current_new_minikind,
                                        executable=current_new[1].executable,
                                        path_utf8=new_path_utf8, fingerprint=fingerprint,
                                        fullscan=True)
                # both sides are dealt with, move on
                current_old = advance(old_iterator)
                current_new = advance(new_iterator)
            elif (lt_by_dirs(new_dirname, current_old[0][0])
                  or (new_dirname == current_old[0][0] and
                      new_entry_key[1:] < current_old[0][1:])):
                # new comes before:
                # add an entry for this and advance new
                if tracing:
                    trace.mutter("Inserting from new '%s'.",
                                 new_path_utf8.decode('utf8'))
                self.update_minimal(new_entry_key, current_new_minikind,
                                    executable=current_new[1].executable,
                                    path_utf8=new_path_utf8, fingerprint=fingerprint,
                                    fullscan=True)
                current_new = advance(new_iterator)
            else:
                # we've advanced past the place where the old key would be,
                # without seeing it in the new list. so it must be gone.
                if tracing:
                    trace.mutter("Deleting from old '%s/%s'.",
                                 current_old[0][0].decode('utf8'),
                                 current_old[0][1].decode('utf8'))
                self._make_absent(current_old)
                current_old = advance(old_iterator)
        self._mark_modified()
        self._id_index = None
        self._packed_stat_index = None
        if tracing:
            trace.mutter("set_state_from_inventory complete.")
    def set_state_from_scratch(self, working_inv, parent_trees, parent_ghosts):
        """Wipe the currently stored state and set it to something new.

        This is a hard-reset for the data we are working with.
        """
        # Technically, we really want a write lock, but until we write, we
        # don't really need it.
        self._requires_lock()
        # root dir and root dir contents with no children. We have to have a
        # root for set_state_from_inventory to work correctly.
        empty_root = ((b'', b'', inventory.ROOT_ID),
                      [(b'd', b'', 0, False, DirState.NULLSTAT)])
        empty_tree_dirblocks = [(b'', [empty_root]), (b'', [])]
        self._set_data([], empty_tree_dirblocks)
        self.set_state_from_inventory(working_inv)
        self.set_parent_trees(parent_trees, parent_ghosts)
    def _make_absent(self, current_old):
        """Mark current_old - an entry - as absent for tree 0."""

                update_block_index, present = \
                    self._find_block_index_from_key(update_key)
                if not present:
                    raise AssertionError(
                        'could not find block for %s' % (update_key,))
                update_entry_index, present = \
                    self._find_entry_index(
                        update_key, self._dirblocks[update_block_index][1])
                if not present:
                    raise AssertionError(
                        'could not find entry for %s' % (update_key,))
                update_tree_details = self._dirblocks[update_block_index][1][update_entry_index][1]
                # it must not be absent at the moment
                if update_tree_details[0][0] == b'a':  # absent
                    raise AssertionError('bad row %r' % (update_tree_details,))
                update_tree_details[0] = DirState.NULL_PARENT_DETAILS
        self._mark_modified()
        return last_reference
    def update_minimal(self, key, minikind, executable=False, fingerprint=b'',
                       packed_stat=None, size=0, path_utf8=None, fullscan=False):
        """Update an entry to the state in tree 0.

        This will either create a new entry at 'key' or update an existing one.
        """
                    update_block_index, present = \
                        self._find_block_index_from_key(other_key)
                    if not present:
                        raise AssertionError(
                            'could not find block for %s' % (other_key,))
                    update_entry_index, present = \
                        self._find_entry_index(
                            other_key, self._dirblocks[update_block_index][1])
                    if not present:
                        raise AssertionError(
                            'update_minimal: could not find entry for %s' % (other_key,))
                    update_details = self._dirblocks[update_block_index][1][update_entry_index][1][lookup_index]
                    if update_details[0] in (b'a', b'r'):  # relocated, absent
                        # it's a pointer or absent in lookup_index's tree, use
                        # it as is.
                        new_entry[1].append(update_details)
                    else:
                        # we have the right key, make a pointer to it.
                        pointer_path = osutils.pathjoin(*other_key[0:2])
                        new_entry[1].append(
                            (b'r', pointer_path, 0, False, b''))
                block.insert(entry_index, new_entry)
                self._add_to_id_index(id_index, key)
            else:
                # Does the new state matter?
                block[entry_index][1][0] = new_details

                    # other trees, so put absent pointers there
                    # This is the vertical axis in the matrix, all pointing
                    # to the real path.
                    block_index, present = self._find_block_index_from_key(
                        entry_key)
                    if not present:
                        raise AssertionError('not present: %r', entry_key)
                    entry_index, present = self._find_entry_index(
                        entry_key, self._dirblocks[block_index][1])
                    if not present:
                        raise AssertionError('not present: %r', entry_key)
                    self._dirblocks[block_index][1][entry_index][1][0] = \
                        (b'r', path_utf8, 0, False, b'')
        # add a containing dirblock if needed.
        if new_details[0] == b'd':
            # GZ 2017-06-09: Using pathjoin why?
            subdir_key = (osutils.pathjoin(*key[0:2]), b'', b'')
            block_index, present = self._find_block_index_from_key(subdir_key)
            if not present:
                self._dirblocks.insert(block_index, (subdir_key[0], []))

        self._mark_modified()
    def _maybe_remove_row(self, block, index, id_index):
        """Remove index if it is absent or relocated across the row.

        id_index is updated accordingly.

        :return: True if we removed the row, False otherwise
        """
        present_in_row = False
        entry = block[index]
        for column in entry[1]:
            if column[0] not in (b'a', b'r'):
                present_in_row = True
                break
        if not present_in_row:
            block.pop(index)
            self._remove_from_id_index(id_index, entry[0])
            return True
        return False
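# ---------------------------------------------------------------------
# Illustrative sketch (not part of the original module): a row is dropped
# only when every per-tree column is absent (b'a') or relocated (b'r'),
# mirroring _maybe_remove_row without the id_index bookkeeping.
def _sketch_maybe_remove_row(block, index):
    entry = block[index]
    if any(column[0] not in (b'a', b'r') for column in entry[1]):
        return False  # still present in some tree: keep the row
    block.pop(index)
    return True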
    def _validate(self):
        """Check that invariants on the dirblock are correct."""
        # We check this with a dict per tree pointing either to the present
        # name, or None if absent.
        tree_count = self._num_present_parents() + 1
        id_path_maps = [{} for _ in range(tree_count)]
        # Make sure that all renamed entries point to the correct location.
        for entry in self._iter_entries():
            file_id = entry[0][2]
            this_path = osutils.pathjoin(entry[0][0], entry[0][1])
            if len(entry[1]) != tree_count:
                raise AssertionError(
                    "wrong number of entry details for row\n%s"
                    ",\nexpected %d" %
                    (pformat(entry), tree_count))
            absent_positions = 0
            for tree_index, tree_state in enumerate(entry[1]):
                this_tree_map = id_path_maps[tree_index]
                minikind = tree_state[0]
                if minikind in (b'a', b'r'):
                    absent_positions += 1
                # have we seen this id before in this column?
                if file_id in this_tree_map:
                    previous_path, previous_loc = this_tree_map[file_id]
                    # any later mention of this file must be consistent with
                    # what was said before
                    if minikind == b'a':
                        if previous_path is not None:
                            raise AssertionError(
                                "file %s is absent in row %r but also present "
                                "at %r" %
                                (file_id.decode('utf-8'), entry, previous_path))
                    elif minikind == b'r':
                        target_location = tree_state[1]
                        if previous_path != target_location:
                            raise AssertionError(
                                "file %s relocation in row %r but also at %r"
                                % (file_id, entry, previous_path))
                    else:
                        # a file, directory, etc - may have been previously
                        # pointed to by a relocation, which must point here
            # are calculated at the same time, so checking just the size
            # gains nothing w.r.t. performance.
            link_or_sha1 = state._sha1_file(abspath)
            entry[1][0] = (b'f', link_or_sha1, stat_value.st_size,
                           executable, packed_stat)
        else:
            entry[1][0] = (b'f', b'', stat_value.st_size,
                           executable, DirState.NULLSTAT)
            worth_saving = False
    elif minikind == b'd':
        link_or_sha1 = None
        entry[1][0] = (b'd', b'', 0, False, packed_stat)
        if saved_minikind != b'd':
            # This changed from something into a directory. Make sure we
            # have a directory block for it. This doesn't happen very
            # often, so this doesn't have to be super fast.
            block_index, entry_index, dir_present, file_present = \
                state._get_block_entry_index(entry[0][0], entry[0][1], 0)
            state._ensure_block(block_index, entry_index,
                                osutils.pathjoin(entry[0][0], entry[0][1]))
        else:
            worth_saving = False
    elif minikind == b'l':
        if saved_minikind == b'l':
            worth_saving = False
        link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
        if state._cutoff_time is None:
            state._sha_cutoff_time()
        if (stat_value.st_mtime < state._cutoff_time
                and stat_value.st_ctime < state._cutoff_time):
            entry[1][0] = (b'l', link_or_sha1, stat_value.st_size,
                           False, packed_stat)
        else:
            entry[1][0] = (b'l', b'', stat_value.st_size,
                           False, DirState.NULLSTAT)
    if worth_saving:
        state._mark_modified([entry])
    return link_or_sha1
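# ---------------------------------------------------------------------
# Illustrative sketch (not part of the original module): a fingerprint is
# only cached against the packed stat when both timestamps are safely older
# than the cutoff; otherwise NULLSTAT forces a re-read on the next access.
def _sketch_can_cache_fingerprint(st_mtime, st_ctime, cutoff_time):
    return st_mtime < cutoff_time and st_ctime < cutoff_time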
class ProcessEntryPython(object):

    __slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
                 "last_source_parent", "last_target_parent", "include_unchanged",
                 "partial", "use_filesystem_for_exec", "utf8_decode",
                 "searched_specific_files", "search_specific_files",
                 "searched_exact_paths", "search_specific_file_parents", "seen_ids",
                 "state", "source_index", "target_index", "want_unversioned", "tree"]

    def __init__(self, include_unchanged, use_filesystem_for_exec,
                 search_specific_files, state, source_index, target_index,
                 want_unversioned, tree):
        self.old_dirname_to_file_id = {}
        self.new_dirname_to_file_id = {}
        # Are we doing a partial iter_changes?
        self.partial = search_specific_files != {''}
        # Using a list so that we can access the values and change them in
        # nested scope. Each one is [path, file_id, entry]
        self.last_source_parent = [None, None]
                        and stat.S_IEXEC & path_info[3].st_mode)
                else:
                    target_exec = target_details[3]
                return (entry[0][2],
                        (None, self.utf8_decode(path)[0]),
                        (None, self.utf8_decode(entry[0][1])[0]),
                        (None, path_info[2]),
                        (None, target_exec)), True
            else:
                # It's a missing file, report it as such.
                return (entry[0][2],
                        (None, self.utf8_decode(path)[0]),
                        (None, self.utf8_decode(entry[0][1])[0]),
                        (None, False)), True
        elif source_minikind in _fdlt and target_minikind in b'a':
            # unversioned, possibly, or possibly not deleted: we don't care.
            # if it's still on disk, *and* there's no other entry at this
            # path [we don't know this in this routine at the moment -
            # perhaps we should change this] - then it would be an unknown.
            old_path = pathjoin(entry[0][0], entry[0][1])
            # parent id is the entry for the path in the target tree
            parent_id = self.state._get_entry(
                self.source_index, path_utf8=entry[0][0])[0][2]
            if parent_id == entry[0][2]:
                parent_id = None
            return (entry[0][2],
                    (self.utf8_decode(old_path)[0], None),
                    (self.utf8_decode(entry[0][1])[0], None),
                    (DirState._minikind_to_kind[source_minikind], None),
                    (source_details[3], None)), True
        elif source_minikind in _fdlt and target_minikind in b'r':
            # a rename; could be a true rename, or a rename inherited from
            # a renamed parent. TODO: handle this efficiently. It's not a
            # common case to rename dirs though, so a correct but slow
            # implementation will do.
            if not osutils.is_inside_any(self.searched_specific_files,
                                         target_details[1]):
                self.search_specific_files.add(target_details[1])
        elif source_minikind in _ra and target_minikind in _ra:
            # neither of the selected trees contain this file,
            # so skip over it. This is not currently directly tested, but
            # is indirectly via test_too_much.TestCommands.test_conflicts.
            pass
        else:
            raise AssertionError("don't know how to compare "
                                 "source_minikind=%r, target_minikind=%r"
                                 % (source_minikind, target_minikind))
        return None, None
3486
3860
def __iter__(self):
3596
3971
new_executable = bool(
3597
3972
stat.S_ISREG(root_dir_info[3].st_mode)
3598
3973
and stat.S_IEXEC & root_dir_info[3].st_mode)
3600
(None, current_root_unicode),
3604
(None, splitpath(current_root_unicode)[-1]),
3605
(None, root_dir_info[2]),
3606
(None, new_executable)
3608
initial_key = (current_root, '', '')
3976
(None, current_root_unicode),
3980
(None, splitpath(current_root_unicode)[-1]),
3981
(None, root_dir_info[2]),
3982
(None, new_executable)
3984
initial_key = (current_root, b'', b'')
3609
3985
block_index, _ = self.state._find_block_index_from_key(initial_key)
3610
3986
if block_index == 0:
3611
3987
# we have processed the total root already, but because the
3612
3988
# initial key matched it we should skip it here.
3614
3990
if root_dir_info and root_dir_info[2] == 'tree-reference':
3615
3991
current_dir_info = None
3617
dir_iterator = osutils._walkdirs_utf8(root_abspath, prefix=current_root)
3993
dir_iterator = osutils._walkdirs_utf8(
3994
root_abspath, prefix=current_root)
3619
current_dir_info = dir_iterator.next()
3996
current_dir_info = next(dir_iterator)
3997
except OSError as e:
3621
3998
# on win32, python2.4 has e.errno == ERROR_DIRECTORY, but
3622
3999
# python 2.5 has e.errno == EINVAL,
3623
4000
# and e.winerror == ERROR_DIRECTORY
3629
4006
if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
3630
4007
current_dir_info = None
3631
4008
elif (sys.platform == 'win32'
3632
and (e.errno in win_errors
3633
or e_winerror in win_errors)):
4009
and (e.errno in win_errors or
4010
e_winerror in win_errors)):
3634
4011
current_dir_info = None
3638
if current_dir_info[0][0] == b'':
    # remove .bzr from iteration
    bzr_index = bisect.bisect_left(
        current_dir_info[1], (b'.bzr',))
    if current_dir_info[1][bzr_index][0] != b'.bzr':
        raise AssertionError()
    del current_dir_info[1][bzr_index]
# walk until both the directory listing and the versioned metadata
# are exhausted.
if (block_index < len(self.state._dirblocks) and
        osutils.is_inside(current_root,
                          self.state._dirblocks[block_index][0])):
    current_block = self.state._dirblocks[block_index]
else:
    current_block = None
while (current_dir_info is not None or
       current_block is not None):
    if (current_dir_info and current_block
            and current_dir_info[0][0] != current_block[0]):
        if _lt_by_dirs(current_dir_info[0][0], current_block[0]):
            # filesystem data refers to paths not covered by the dirblock.
            # this has two possibilities:
            # A) it is versioned but empty, so there is no block for it
            # B) it is not versioned.

            # if (A) then we need to recurse into it to check for
            # new unknown files or directories.
            # if (B) then we should ignore it, because we don't
            # recurse into unknown directories.
            path_index = 0
            while path_index < len(current_dir_info[1]):
                current_path_info = current_dir_info[1][path_index]
                if self.want_unversioned:
                    if current_path_info[2] == 'directory':
                        if self.tree._directory_is_tree_reference(
                                current_path_info[0].decode('utf8')):
                            current_path_info = current_path_info[:2] + \
                                ('tree-reference',) + \
                                current_path_info[3:]
                    new_executable = bool(
                        stat.S_ISREG(current_path_info[3].st_mode)
                        and stat.S_IEXEC & current_path_info[3].st_mode)
                    yield (None,
                           (None, utf8_decode(current_path_info[0])[0]),
                           True,
                           (False, False),
                           (None, None),
                           (None, utf8_decode(current_path_info[1])[0]),
                           (None, current_path_info[2]),
                           (None, new_executable))
                # don't descend into this unversioned path if it is
                # a dir
                if current_path_info[2] in ('directory',
                                            'tree-reference'):
                    del current_dir_info[1][path_index]
                    path_index -= 1
                path_index += 1

            # This dir info has been handled, go to the next
            try:
                current_dir_info = next(dir_iterator)
            except StopIteration:
                current_dir_info = None
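The removal of the `.bzr` control directory above relies on how Python's `bisect` module orders tuples: the probe tuple `(b'.bzr',)` sorts immediately before any longer tuple whose first element is `b'.bzr'`, so `bisect_left` lands exactly on the control-directory entry if one exists. A standalone sketch (not breezy's API; the `listing` data is illustrative):

```python
import bisect

# A sorted directory listing of (name, kind) tuples, as produced by a
# directory walker. The probe (b'.bzr',) compares less than
# (b'.bzr', 'directory') because it is a strict prefix, so bisect_left
# returns the index of the control-directory entry when present.
listing = [(b'.bzr', 'directory'), (b'README', 'file'), (b'src', 'directory')]
idx = bisect.bisect_left(listing, (b'.bzr',))
if listing[idx][0] == b'.bzr':
    del listing[idx]  # exclude the control directory from iteration
print(listing)  # [(b'README', 'file'), (b'src', 'directory')]
```

The guard on `listing[idx][0]` matters: when no `.bzr` entry exists, `bisect_left` still returns a valid insertion point, and the code must not delete whatever happens to live there (the dirstate code raises `AssertionError` instead, since the root of a bzr tree must contain `.bzr`).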
    raise AssertionError(
        "Got entry<->path mismatch for specific path "
        "%r entry %r path_info %r " % (
            path_utf8, entry, path_info))
# Only include changes - we're outside the user's requested
# expansion.
if changed:
    self._gather_result_for_consistency(result)
    if (result[6][0] == 'directory' and
            result[6][1] != 'directory'):
        # This stopped being a directory, the old children have
        # to be included.
        if entry[1][self.source_index][0] == b'r':
            # renamed, take the source path
            entry_path_utf8 = entry[1][self.source_index][1]
        else:
            entry_path_utf8 = path_utf8
        initial_key = (entry_path_utf8, b'', b'')
        block_index, _ = self.state._find_block_index_from_key(
            initial_key)
        if block_index == 0:
            # The children of the root are in block index 1.
            block_index += 1
        current_block = None
        if block_index < len(self.state._dirblocks):
            current_block = self.state._dirblocks[block_index]
            if not osutils.is_inside(
                    entry_path_utf8, current_block[0]):
                # No entries for this directory at all.
                current_block = None
        if current_block is not None:
            for entry in current_block[1]:
                if entry[1][self.source_index][0] in (b'a', b'r'):
                    # Not in the source tree, so doesn't have to be
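The block lookup above builds a key of the form `(dirname, b'', b'')` and maps it to the dirblock holding that directory's children. A hypothetical, simplified sketch of the idea (not breezy's real API; `dirblocks` and `find_block_index` are illustrative names, and the real `_find_block_index_from_key` orders paths by their split components rather than by raw bytes):

```python
import bisect

# Dirblocks are kept sorted by directory path; each block pairs a
# directory name with the list of entries for that directory's children.
dirblocks = [(b'', []), (b'src', []), (b'src/lib', [])]
dir_names = [block[0] for block in dirblocks]

def find_block_index(key):
    # Only the dirname part of the (dirname, basename, fileid) key
    # selects the block; basename and fileid locate the entry within it.
    return bisect.bisect_left(dir_names, key[0])

print(find_block_index((b'src', b'x.c', b'file-id')))  # block for src/*
```

This is why the code above can special-case `block_index == 0`: block 0 is the root directory's own entry, while the root's children live in block 1.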