lines by NL. The field delimiters are omitted in the grammar, line delimiters
are not - this is done for clarity of reading. All string data is in utf8.

::

    MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
    WHOLE_NUMBER = {digit}, digit;
    REVISION_ID = a non-empty utf8 string;

    dirstate format = header line, full checksum, row count, parent details,
     ghost_details, entries;
    header line = "#bazaar dirstate flat format 3", NL;
    full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
    row count = "num_entries: ", WHOLE_NUMBER, NL;
    parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
    ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;

    entry = entry_key, current_entry_details, {parent_entry_details};
    entry_key = dirname, basename, fileid;
    current_entry_details = common_entry_details, working_entry_details;
    parent_entry_details = common_entry_details, history_entry_details;
    common_entry_details = MINIKIND, fingerprint, size, executable;
    working_entry_details = packed_stat;
    history_entry_details = REVISION_ID;

    fingerprint = a nonempty utf8 sequence with meaning defined by minikind.

Given this definition, the following is useful to know::

    entry (aka row) - all the data for a given key.
    entry[0]: The key (dirname, basename, fileid)
    entry[1]: The tree(s) data for this path and id combination.
    entry[1][0]: The current tree
    entry[1][1]: The second tree

For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::

    entry[1][0][0]: minikind
    entry[1][0][1]: fingerprint
    entry[1][0][2]: size
    entry[1][0][3]: executable
    entry[1][0][4]: packed_stat

OR (for non tree-0)::

    entry[1][1][4]: revision_id
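To make the indexing above concrete, here is a hypothetical in-memory entry (illustrative values, not real dirstate output) for a file ``foo`` in the root with one parent tree:

```python
# Hypothetical dirstate entry laid out as described above:
# (entry_key, [current_entry_details, parent_entry_details]).
entry = (
    (b'', b'foo', b'foo-id'),  # entry_key: dirname, basename, fileid
    [
        # tree 0: minikind, fingerprint, size, executable, packed_stat
        (b'f', b'sha1-here', 12, False, b'packed-stat-here'),
        # tree 1: minikind, fingerprint, size, executable, revision_id
        (b'f', b'sha1-here', 12, False, b'rev-id-1'),
    ],
)

assert entry[0][2] == b'foo-id'       # the file id
assert entry[1][0][0] == b'f'         # current tree minikind
assert entry[1][1][4] == b'rev-id-1'  # parent tree revision id
```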
There may be multiple rows at the root, one per id present in the root, so the
in memory root row is now::

    self._dirblocks[0] -> ('', [entry ...]),

and the entries in there are::

    entries[0][0]: b''
    entries[0][1]: b''
    entries[0][2]: file_id
    entries[1][0]: The tree data for the current tree for this fileid at /

b'r' is a relocated entry: This path is not present in this tree with this
    id, but the id can be found at another location. The fingerprint is
    used to point to the target location.
b'a' is an absent entry: In that tree the id is not present at this path.
b'd' is a directory entry: This path in this tree is a directory with the
    current file id. There is no fingerprint for directories.
b'f' is a file entry: As for directory, but it's a file. The fingerprint is
    the sha1 value of the file's canonical form, i.e. after any read
    filters have been applied to the convenience form stored in the working
    tree.
b'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
    the link target.
b't' is a reference to a nested subtree; the fingerprint is the referenced
    revision.
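The code below refers to these one-byte codes through ``DirState._minikind_to_kind``; the following sketch reconstructs that table from the kind descriptions above (illustrative copy, check the class attribute for the authoritative one):

```python
# Sketch of the minikind-to-kind mapping implied by the descriptions above.
minikind_to_kind = {
    b'f': 'file',
    b'd': 'directory',
    b'l': 'symlink',
    b'a': 'absent',
    b'r': 'relocated',
    b't': 'tree-reference',
}

assert minikind_to_kind[b'd'] == 'directory'
```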
The entries on disk and in memory are ordered according to the following keys::

    directory, as a list of components
    filename
    file-id

--- Format 1 had the following different definition: ---

::

    rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
        WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
        {PARENT ROW}
    PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
        basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
        SHA1

PARENT ROWs are emitted for every parent that is not in the ghosts details
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
each row will have a PARENT ROW for foo and baz, but not for bar.
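A minimal sketch of that ordering (hypothetical helper, not part of the module): splitting the directory into components makes ``a/b`` sort after ``a`` rather than interleaving byte-wise with sibling names.

```python
# Illustrative sort key: directory as a list of components, then filename,
# then file-id, as described above.
def entry_sort_key(entry_key):
    dirname, basename, file_id = entry_key
    return (dirname.split(b'/'), basename, file_id)

keys = [
    (b'a/b', b'c', b'id3'),
    (b'a', b'z', b'id2'),
    (b'', b'a', b'id1'),
]
keys.sort(key=entry_sort_key)
assert keys[0] == (b'', b'a', b'id1')  # root entries come first
```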
ERROR_DIRECTORY = 267


class DirstateCorrupt(errors.BzrError):

    _fmt = "The dirstate file (%(state)s) appears to be corrupt: %(msg)s"

    def __init__(self, state, msg):
        errors.BzrError.__init__(self)
        self.state = state
        self.msg = msg
if not getattr(struct, '_compile', None):
    # Cannot pre-compile the dirstate pack_stat
    def pack_stat(st, _encode=binascii.b2a_base64, _pack=struct.pack):
        """Convert stat values into a packed representation."""
        return _encode(_pack('>LLLLLL', st.st_size, int(st.st_mtime),
            int(st.st_ctime), st.st_dev, st.st_ino & 0xFFFFFFFF,
            st.st_mode))[:-1]
else:
    # compile the struct compiler we need, so as to only do it once
    from _struct import Struct
    _compiled_pack = Struct('>LLLLLL').pack

    def pack_stat(st, _encode=binascii.b2a_base64, _pack=_compiled_pack):
        """Convert stat values into a packed representation."""
        # jam 20060614 it isn't really worth removing more entries if we
        # are going to leave it in packed form.
        # With only st_mtime and st_mode filesize is 5.5M and read time is 275ms
        # With all entries, filesize is 5.9M and read time is maybe 280ms
        # well within the noise margin

        # base64 encoding always adds a final newline, so strip it off
        # The current version
        return _encode(_pack(st.st_size, int(st.st_mtime), int(st.st_ctime),
            st.st_dev, st.st_ino & 0xFFFFFFFF, st.st_mode))[:-1]
        # This is 0.060s / 1.520s faster by not encoding as much information
        # return _encode(_pack('>LL', int(st.st_mtime), st.st_mode))[:-1]
        # This is not strictly faster than _encode(_pack())[:-1]
        # return '%X.%X.%X.%X.%X.%X' % (
        #     st.st_size, int(st.st_mtime), int(st.st_ctime),
        #     st.st_dev, st.st_ino, st.st_mode)
        # Similar to the _encode(_pack('>LL'))
        # return '%X.%X' % (int(st.st_mtime), st.st_mode)
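A standalone usage sketch of the packing scheme above: six 32-bit big-endian fields, base64 encoded. Unlike the real ``pack_stat``, this demo masks every field to 32 bits so it cannot overflow on platforms with wide values.

```python
import base64
import os
import struct

# Demo of the pack_stat encoding: size, mtime, ctime, dev, inode, mode
# packed as six big-endian 32-bit unsigned ints, then base64-encoded.
def pack_stat_demo(st):
    fields = (st.st_size, int(st.st_mtime), int(st.st_ctime),
              st.st_dev, st.st_ino, st.st_mode)
    packed = struct.pack('>LLLLLL', *(f & 0xFFFFFFFF for f in fields))
    # b2a_base64 would append a trailing newline; b64encode does not,
    # so no [:-1] strip is needed here.
    return base64.b64encode(packed)

packed = pack_stat_demo(os.stat('.'))
assert len(packed) == 32  # 24 raw bytes encode to 32 base64 characters
```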


class SHA1Provider(object):
    NOT_IN_MEMORY = 0
    IN_MEMORY_UNMODIFIED = 1
    IN_MEMORY_MODIFIED = 2
    IN_MEMORY_HASH_MODIFIED = 3  # Only hash-cache updates

    # A pack_stat (the x's) that is just noise and will never match the output
    # of base64 encode.
    NULLSTAT = b'x' * 32
    NULL_PARENT_DETAILS = static_tuple.StaticTuple(b'a', b'', 0, False, b'')

    HEADER_FORMAT_2 = b'#bazaar dirstate flat format 2\n'
    HEADER_FORMAT_3 = b'#bazaar dirstate flat format 3\n'

    def __init__(self, path, sha1_provider, worth_saving_limit=0,
                 use_filesystem_for_exec=True):
        """Create a DirState object.

        :param path: The path at which the dirstate file on disk should live.
        :param sha1_provider: an object meeting the SHA1Provider interface.
        :param worth_saving_limit: when the exact number of hash changed
            entries is known, only bother saving the dirstate if more than
            this count of entries have changed.
            -1 means never save hash changes, 0 means always save hash changes.
        :param use_filesystem_for_exec: Whether to trust the filesystem
            for executable bit information
        """
        # _header_state and _dirblock_state represent the current state
        # of the dirstate metadata and the per-row data respectively.
        self._last_block_index = None
        self._last_entry_index = None
        # The set of known hash changes
        self._known_hash_changes = set()
        # How many hash changed entries can we have without saving
        self._worth_saving_limit = worth_saving_limit
        self._config_stack = config.LocationStack(urlutils.local_path_to_url(
            path))
        self._use_filesystem_for_exec = use_filesystem_for_exec

    def __repr__(self):
        return "%s(%r)" % \
            (self.__class__.__name__, self._filename)
    def _mark_modified(self, hash_changed_entries=None, header_modified=False):
        """Mark this dirstate as modified.

        :param hash_changed_entries: if non-None, mark just these entries as
            having their hash modified.
        :param header_modified: mark the header modified as well, not just the
            dirblocks.
        """
        #trace.mutter_callsite(3, "modified hash entries: %s", hash_changed_entries)
        if hash_changed_entries:
            self._known_hash_changes.update(
                [e[0] for e in hash_changed_entries])
            if self._dirblock_state in (DirState.NOT_IN_MEMORY,
                                        DirState.IN_MEMORY_UNMODIFIED):
                # If the dirstate is already marked a IN_MEMORY_MODIFIED, then
                # that takes precedence.
                self._dirblock_state = DirState.IN_MEMORY_HASH_MODIFIED
        else:
            # TODO: Since we now have a IN_MEMORY_HASH_MODIFIED state, we
            #       should fail noisily if someone tries to set
            #       IN_MEMORY_MODIFIED but we don't have a write-lock!
            # We don't know exactly what changed so disable smart saving
            self._dirblock_state = DirState.IN_MEMORY_MODIFIED
        if header_modified:
            self._header_state = DirState.IN_MEMORY_MODIFIED

    def _mark_unmodified(self):
        """Mark this dirstate as unmodified."""
        self._header_state = DirState.IN_MEMORY_UNMODIFIED
        self._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
        self._known_hash_changes = set()

    def add(self, path, file_id, kind, stat, fingerprint):
        """Add a path to be tracked.

        :param path: The path within the dirstate - b'' is the root, 'foo' is the
            path foo within the root, 'foo/bar' is the path bar within foo
            within the root.
        :param file_id: The file id of the path being added.
        # you should never have files called . or ..; just add the directory
        # in the parent, or according to the special treatment for the root
        if basename == '.' or basename == '..':
            raise inventory.InvalidEntryName(path)
        # now that we've normalised, we need the correct utf8 path and
        # dirname and basename elements. This single encode and split should be
        # faster than three separate encodes.
        utf8path = (dirname + '/' + basename).strip('/').encode('utf8')
        dirname, basename = osutils.split(utf8path)
        # uses __class__ for speed; the check is needed for safety
        if file_id.__class__ is not bytes:
            raise AssertionError(
                "must be a utf8 file_id not %s" % (type(file_id), ))
        # Make sure the file_id does not exist in this tree
        rename_from = None
        file_id_entry = self._get_entry(
            0, fileid_utf8=file_id, include_deleted=True)
        if file_id_entry != (None, None):
            if file_id_entry[1][0][0] == b'a':
                if file_id_entry[0] != (dirname, basename, file_id):
                    # set the old name's current operation to rename
                    self.update_minimal(file_id_entry[0],
                                        b'r',
                                        path_utf8=b'',
                                        packed_stat=b'',
                                        fingerprint=utf8path)
                    rename_from = file_id_entry[0][0:2]
            else:
                path = osutils.pathjoin(
                    file_id_entry[0][0], file_id_entry[0][1])
                kind = DirState._minikind_to_kind[file_id_entry[1][0][0]]
                info = '%s:%s' % (kind, path)
                raise inventory.DuplicateFileId(file_id, info)
        first_key = (dirname, basename, b'')
        block_index, present = self._find_block_index_from_key(first_key)
        if present:
            # check the path is not in the tree
            block = self._dirblocks[block_index][1]
            entry_index, _ = self._find_entry_index(first_key, block)
            while (entry_index < len(block) and
                   block[entry_index][0][0:2] == first_key[0:2]):
                if block[entry_index][1][0][0] not in (b'a', b'r'):
                    # this path is in the dirstate in the current tree.
                    raise Exception("adding already added path!")
                entry_index += 1
        else:
            # The block where we want to put the file is not present. But it
# The block where we want to put the file is not present. But it
1403
1339
minikind = child[1][0][0]
1404
1340
fingerprint = child[1][0][4]
1405
1341
executable = child[1][0][3]
1406
old_child_path = osutils.pathjoin(child_dirname,
1342
old_child_path = osutils.pathjoin(child[0][0],
1408
1344
removals[child[0][2]] = old_child_path
1409
1345
child_suffix = child_dirname[len(old_path):]
1410
1346
new_child_dirname = (new_path + child_suffix)
1411
1347
key = (new_child_dirname, child_basename, child[0][2])
1412
new_child_path = osutils.pathjoin(new_child_dirname,
1348
new_child_path = os.path.join(new_child_dirname,
1414
1350
insertions[child[0][2]] = (key, minikind, executable,
1415
1351
fingerprint, new_child_path)
1416
1352
self._check_delta_ids_absent(new_ids, delta, 0)
1418
self._apply_removals(removals.items())
1354
self._apply_removals(removals.iteritems())
1419
1355
self._apply_insertions(insertions.values())
1420
1356
# Validate parents
1421
1357
self._after_delta_check_parents(parents, 0)
1422
except errors.BzrError as e:
1358
except errors.BzrError, e:
1423
1359
self._changes_aborted = True
1424
1360
if 'integrity error' not in str(e):
1426
1362
# _get_entry raises BzrError when a request is inconsistent; we
1427
# want such errors to be shown as InconsistentDelta - and that
1363
# want such errors to be shown as InconsistentDelta - and that
1428
1364
# fits the behaviour we trigger.
1429
raise errors.InconsistentDeltaDelta(delta,
1430
"error from _get_entry. %s" % (e,))
1365
raise errors.InconsistentDeltaDelta(delta, "error from _get_entry.")

    def _apply_removals(self, removals):
        for file_id, path in sorted(removals, reverse=True,
                                    key=operator.itemgetter(1)):
            dirname, basename = osutils.split(path)
            block_i, entry_i, d_present, f_present = \
                self._get_block_entry_index(dirname, basename, 0)
            try:
                entry = self._dirblocks[block_i][1][entry_i]
            except IndexError:
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if not f_present or entry[1][0][0] in (b'a', b'r'):
                self._raise_invalid(path, file_id,
                    "Wrong path for old path.")
            if file_id != entry[0][2]:
                self._raise_invalid(path, file_id,
                    "Attempt to remove path has wrong id - found %r."
                    % (entry[0][2],))
            self._make_absent(entry)
            # See if we have a malformed delta: deleting a directory must not
            # leave crud behind. This increases the number of bisects needed
        new_ids = set()
        for old_path, new_path, file_id, inv_entry in delta:
            if file_id.__class__ is not bytes:
                raise AssertionError(
                    "must be a utf8 file_id not %s" % (type(file_id), ))
            if inv_entry is not None and file_id != inv_entry.file_id:
                self._raise_invalid(new_path, file_id,
                    "mismatched entry file_id %r" % inv_entry)
            if new_path is None:
                new_path_utf8 = None
            else:
                if inv_entry is None:
                    self._raise_invalid(new_path, file_id,
                        "new_path with no entry")
                new_path_utf8 = encode(new_path)
                # note the parent for validation
                dirname_utf8, basename_utf8 = osutils.split(new_path_utf8)
                if basename_utf8:
                    parents.add((dirname_utf8, inv_entry.parent_id))
            if old_path is None:
                old_path_utf8 = None
            else:
                old_path_utf8 = encode(old_path)
            if old_path is None:
                adds.append((None, new_path_utf8, file_id,
                    inv_to_entry(inv_entry), True))
                new_ids.add(file_id)
            elif new_path is None:
                deletes.append((old_path_utf8, None, file_id, None, True))
            elif (old_path, new_path) == root_only:
                # change things in-place
                # Note: the case of a parent directory changing its file_id
                #       tends to break optimizations here, because officially
                #       the file has actually been moved, it just happens to
                #       end up at the same path. If we can figure out how to
                #       handle that case, we can avoid a lot of add+delete
                #       pairs for objects that stay put.
                # elif old_path == new_path:
                changes.append((old_path_utf8, new_path_utf8, file_id,
                    inv_to_entry(inv_entry)))
            else:
                # Renames:
                # Because renames must preserve their children we must have
                # processed all relocations and removes before hand. The sort
                # order ensures we've examined the child paths, but we also
                # have to execute the removals, or the split to an add/delete
                # pair will result in the deleted item being reinserted, or
                # renamed items being reinserted twice - and possibly at the
                # wrong place. Splitting into a delete/add pair also simplifies
                # the handling of entries with (b'f', ...), (b'r' ...) because
                # the target of the b'r' is old_path here, and we add that to
                # deletes, meaning that the add handler does not need to check
                # for b'r' items on every pass.
                self._update_basis_apply_deletes(deletes)
                deletes = []
                # Split into an add/delete pair recursively.
                adds.append((old_path_utf8, new_path_utf8, file_id,
                    inv_to_entry(inv_entry), False))
                # Expunge deletes that we've seen so that deleted/renamed
                # children of a rename directory are handled correctly.
                new_deletes = reversed(list(
                    self._iter_child_entries(1, old_path_utf8)))
                # Remove the current contents of the tree at orig_path, and
                # reinsert at the correct new path.
                for entry in new_deletes:
                    child_dirname, child_basename, child_file_id = entry[0]
                    if child_dirname:
                        source_path = child_dirname + b'/' + child_basename
                    else:
                        source_path = child_basename
                    if new_path_utf8:
                        target_path = (new_path_utf8 +
                                       source_path[len(old_path_utf8):])
                    else:
                        if old_path_utf8 == b'':
                            raise AssertionError("cannot rename directory to"
                                                 " itself")
                        target_path = source_path[len(old_path_utf8) + 1:]
                    adds.append(
                        (None, target_path, entry[0][2], entry[1][1], False))
                    deletes.append(
                        (source_path, target_path, entry[0][2], None, False))
                deletes.append(
                    (old_path_utf8, new_path_utf8, file_id, None, False))

        self._check_delta_ids_absent(new_ids, delta, 1)
        try:
            # Finish expunging deletes/first half of renames.
        # Adds are accumulated partly from renames, so can be in any input
        # order - sort it.
        # TODO: we may want to sort in dirblocks order. That way each entry
        #       will end up in the same directory, allowing the _get_entry
        #       fast-path for looking up 2 items in the same dir work.
        adds.sort(key=lambda x: x[1])
        # adds is now in lexographic order, which places all parents before
        # their children, so we can process it linearly.
        st = static_tuple.StaticTuple
        for old_path, new_path, file_id, new_details, real_add in adds:
            dirname, basename = osutils.split(new_path)
            entry_key = st(dirname, basename, file_id)
            block_index, present = self._find_block_index_from_key(entry_key)
            if not present:
                # The block where we want to put the file is not present.
                # However, it might have just been an empty directory. Look for
                # the parent in the basis-so-far before throwing an error.
                parent_dir, parent_base = osutils.split(dirname)
                parent_block_idx, parent_entry_idx, _, parent_present = \
                    self._get_block_entry_index(parent_dir, parent_base, 1)
                if not parent_present:
                    self._raise_invalid(new_path, file_id,
                        "Unable to find block for this record."
                        " Was the parent added?")
                self._ensure_block(parent_block_idx, parent_entry_idx, dirname)

            block = self._dirblocks[block_index][1]
            entry_index, present = self._find_entry_index(entry_key, block)
            if real_add:
                if old_path is not None:
                    self._raise_invalid(new_path, file_id,
                        'considered a real add but still had old_path at %s'
                        % (old_path,))
            if present:
                entry = block[entry_index]
                basis_kind = entry[1][1][0]
                if basis_kind == b'a':
                    entry[1][1] = new_details
                elif basis_kind == b'r':
                    raise NotImplementedError()
                else:
                    self._raise_invalid(new_path, file_id,
                        "An entry was marked as a new add"
                        " but the basis target already existed")
            else:
                # The exact key was not found in the block. However, we need to
                # check if there is a key next to us that would have matched.
                # We only need to check 2 locations, because there are only 2
                # places a key can be inserted.
                for maybe_index in range(entry_index - 1, entry_index + 1):
                    if maybe_index < 0 or maybe_index >= len(block):
                        continue
                    maybe_entry = block[maybe_index]
                    if maybe_entry[0][:2] != (dirname, basename):
                        # Just a random neighbor
                        continue
                    if maybe_entry[0][2] == file_id:
                        raise AssertionError(
                            '_find_entry_index didnt find a key match'
                            ' but walking the data did, for %s'
                            % (entry_key,))
                    basis_kind = maybe_entry[1][1][0]
                    if basis_kind not in (b'a', b'r'):
                        self._raise_invalid(new_path, file_id,
                            "we have an add record for path, but the path"
                            " is already present with another file_id %s"
                            % (maybe_entry[0][2],))

                entry = (entry_key, [DirState.NULL_PARENT_DETAILS,
                                     new_details])
                block.insert(entry_index, entry)

            active_kind = entry[1][0][0]
            if active_kind == b'a':
                # The active record shows up as absent, this could be genuine,
                # or it could be present at some other location. We need to
                # verify.
                id_index = self._get_id_index()
                # The id_index may not be perfectly accurate for tree1, because
                # we haven't been keeping it updated. However, it should be
                # fine for tree0, and that gives us enough info for what we
                # need.
                keys = id_index.get(file_id, ())
                for key in keys:
                    block_i, entry_i, d_present, f_present = \
                        self._get_block_entry_index(key[0], key[1], 0)
                    if not f_present:
                        continue
                    active_entry = self._dirblocks[block_i][1][entry_i]
                    if (active_entry[0][2] != file_id):
                        # Some other file is at this path, we don't need to
                        # link it.
                        continue
                    real_active_kind = active_entry[1][0][0]
                    if real_active_kind in (b'a', b'r'):
                        # We found a record, which was not *this* record,
                        # which matches the file_id, but is not actually
                        # present. Something seems *really* wrong.
                        self._raise_invalid(new_path, file_id,
                            "We found a tree0 entry that doesnt make sense")
                    # Now, we've found a tree0 entry which matches the file_id
                    # but is at a different location. So update them to be
                    # rename records.
                    active_dir, active_name = active_entry[0][:2]
                    if active_dir:
                        active_path = active_dir + b'/' + active_name
                    else:
                        active_path = active_name
                    active_entry[1][1] = st(b'r', new_path, 0, False, b'')
                    entry[1][0] = st(b'r', active_path, 0, False, b'')
            elif active_kind == b'r':
                raise NotImplementedError()

            new_kind = new_details[0]
            if new_kind == b'd':
                self._ensure_block(block_index, entry_index, new_path)

    def _update_basis_apply_changes(self, changes):
        """Apply a sequence of changes to tree 1 during update_basis_by_delta.
        null = DirState.NULL_PARENT_DETAILS
        for old_path, new_path, file_id, _, real_delete in deletes:
            if real_delete != (new_path is None):
                self._raise_invalid(old_path, file_id, "bad delete delta")
            # the entry for this file_id must be in tree 1.
            dirname, basename = osutils.split(old_path)
            block_index, entry_index, dir_present, file_present = \
                self._get_block_entry_index(dirname, basename, 1)
            if not file_present:
                self._raise_invalid(old_path, file_id,
                    'basis tree does not contain removed entry')
            entry = self._dirblocks[block_index][1][entry_index]
            # The state of the entry in the 'active' WT
            active_kind = entry[1][0][0]
            if entry[0][2] != file_id:
                self._raise_invalid(old_path, file_id,
                    'mismatched file_id in tree 1')
            old_kind = entry[1][1][0]
            if active_kind in b'ar':
                # The active tree doesn't have this file_id.
                # The basis tree is changing this record. If this is a
                # rename, then we don't want the record here at all
                # anymore. If it is just an in-place change, we want the
                # record here, but we'll add it if we need to. So we just
                # delete it
                if active_kind == b'r':
                    active_path = entry[1][0][1]
                    active_entry = self._get_entry(0, file_id, active_path)
                    if active_entry[1][1][0] != b'r':
                        self._raise_invalid(old_path, file_id,
                            "Dirstate did not have matching rename entries")
                    elif active_entry[1][0][0] in b'ar':
                        self._raise_invalid(old_path, file_id,
                            "Dirstate had a rename pointing at an inactive"
                            " tree0")
                    active_entry[1][1] = null
                del self._dirblocks[block_index][1][entry_index]
                if old_kind == b'd':
                    # This was a directory, and the active tree says it
                    # doesn't exist, and now the basis tree says it doesn't
                    # exist. Remove its dirblock if present
                    (dir_block_index,
                     present) = self._find_block_index_from_key(
                        (old_path, b'', b''))
                    if present:
                        dir_block = self._dirblocks[dir_block_index][1]
                        if not dir_block:
                            # This entry is empty, go ahead and just remove it
                            del self._dirblocks[dir_block_index]
            else:
                # There is still an active record, so just mark this
                # as removed.
                entry[1][1] = null
                block_i, entry_i, d_present, f_present = \
                    self._get_block_entry_index(old_path, b'', 1)
                if d_present:
                    dir_block = self._dirblocks[block_i][1]
                    for child_entry in dir_block:
                        child_basis_kind = child_entry[1][1][0]
                        if child_basis_kind not in b'ar':
                            self._raise_invalid(old_path, file_id,
                                "The file id was deleted but its children were "
                                "not deleted.")

    def _after_delta_check_parents(self, parents, index):
        """Check that parents required by the delta are all intact.

        :param parents: An iterable of (path_utf8, file_id) tuples which are
            required to be present in tree 'index' at path_utf8 with id file_id
            and be a directory.
            tree present there.
        """
        self._read_dirblocks_if_needed()
        key = dirname, basename, b''
        block_index, present = self._find_block_index_from_key(key)
        if not present:
            # no such directory - return the dir index and 0 for the row.
            return block_index, 0, False, False
        block = self._dirblocks[block_index][1]  # access the entries only
        entry_index, present = self._find_entry_index(key, block)
        # linear search through entries at this path to find the one
        # requested.
        while entry_index < len(block) and block[entry_index][0][1] == basename:
            if block[entry_index][1][tree_index][0] not in (b'a', b'r'):
                # neither absent or relocated
                return block_index, entry_index, True, True
            entry_index += 1
        return block_index, entry_index, True, False
2124
def _get_entry(self, tree_index, fileid_utf8=None, path_utf8=None,
2125
include_deleted=False):
1943
def _get_entry(self, tree_index, fileid_utf8=None, path_utf8=None, include_deleted=False):
2126
1944
"""Get the dirstate entry for path in tree tree_index.
2128
1946
If either file_id or path is supplied, it is used as the key to lookup.
@@ -2143 +2329 @@
    def _get_id_index(self):
-        """Get an id index of self._dirblocks."""
+        """Get an id index of self._dirblocks.
+
+        This maps from file_id => [(directory, name, file_id)] entries where
+        that file_id appears in one of the trees.
+        """
        if self._id_index is None:
            id_index = {}
            for key, tree_details in self._iter_entries():
-                id_index.setdefault(key[2], set()).add(key)
+                self._add_to_id_index(id_index, key)
            self._id_index = id_index
        return self._id_index
@@ +2342 @@
+    def _add_to_id_index(self, id_index, entry_key):
+        """Add this entry to the _id_index mapping."""
+        # This code used to use a set for every entry in the id_index. However,
+        # it is *rare* to have more than one entry. So a set is a large
+        # overkill. And even when we do, we won't ever have more than the
+        # number of parent trees. Which is still a small number (rarely >2). As
+        # such, we use a simple tuple, and do our own uniqueness checks. While
+        # the 'in' check is O(N) since N is nicely bounded it shouldn't ever
+        # cause quadratic failure.
+        file_id = entry_key[2]
+        entry_key = static_tuple.StaticTuple.from_sequence(entry_key)
+        if file_id not in id_index:
+            id_index[file_id] = static_tuple.StaticTuple(entry_key,)
+        else:
+            entry_keys = id_index[file_id]
+            if entry_key not in entry_keys:
+                id_index[file_id] = entry_keys + (entry_key,)
+
+    def _remove_from_id_index(self, id_index, entry_key):
+        """Remove this entry from the _id_index mapping.
+
+        It is a programming error to call this when the entry_key is not
+        already present.
+        """
+        file_id = entry_key[2]
+        entry_keys = list(id_index[file_id])
+        entry_keys.remove(entry_key)
+        id_index[file_id] = static_tuple.StaticTuple.from_sequence(entry_keys)
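The tuple-based uniqueness scheme in `_add_to_id_index` above can be sketched with plain tuples standing in for breezy's `StaticTuple` (that substitution is an assumption made for brevity; the semantics are the same):

```python
# Sketch of the id index used by _add_to_id_index: file_id -> tuple of entry
# keys, deduplicated on insert. The linear 'in' check is fine because a file
# id rarely appears at more than a couple of locations across trees.

def add_to_id_index(id_index, entry_key):
    """Record that entry_key's file_id lives at entry_key's path."""
    file_id = entry_key[2]
    if file_id not in id_index:
        id_index[file_id] = (entry_key,)
    else:
        entry_keys = id_index[file_id]
        if entry_key not in entry_keys:  # O(N), but N is tiny (rarely > 2)
            id_index[file_id] = entry_keys + (entry_key,)

index = {}
add_to_id_index(index, (b'dir', b'name', b'file-id'))
add_to_id_index(index, (b'dir', b'name', b'file-id'))    # duplicate, ignored
add_to_id_index(index, (b'other', b'name', b'file-id'))  # second location
```

Rebuilding the tuple on each append looks wasteful, but since the tuples are tiny it stays cheaper than one set per file id.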
@@ -2152 +2371 @@
    def _get_output_lines(self, lines):
        """Format lines for final output.
@@ -2158 +2377 @@
        output_lines = [DirState.HEADER_FORMAT_3]
-        lines.append('') # a final newline
-        inventory_text = '\0\n\0'.join(lines)
-        output_lines.append('crc32: %s\n' % (zlib.crc32(inventory_text),))
+        lines.append(b'') # a final newline
+        inventory_text = b'\0\n\0'.join(lines)
+        output_lines.append(b'crc32: %d\n' % (zlib.crc32(inventory_text),))
        # -3, 1 for num parents, 1 for ghosts, 1 for final newline
-        num_entries = len(lines)-3
-        output_lines.append('num_entries: %s\n' % (num_entries,))
+        num_entries = len(lines) - 3
+        output_lines.append(b'num_entries: %d\n' % (num_entries,))
        output_lines.append(inventory_text)
        return output_lines
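The header layout `_get_output_lines` produces matches the grammar in the module docstring: entry lines joined with `\0\n\0`, a `crc32:` line over that blob, and a `num_entries:` count that excludes the parent line, the ghost line and the final newline. A minimal standalone sketch (the entry payload here is a made-up placeholder, not real dirstate row data):

```python
# Assemble a dirstate-style header over a synthetic body and show that the
# checksum and entry count can be recomputed from the serialized text.
import zlib

lines = [b'0 \n', b'0 \n', b'entry-data\n', b'']  # parents, ghosts, 1 entry, final newline
inventory_text = b'\0\n\0'.join(lines)

output = [b'#bazaar dirstate flat format 3\n']
output.append(b'crc32: %d\n' % (zlib.crc32(inventory_text),))
# -3: one for parents, one for ghosts, one for the final newline
output.append(b'num_entries: %d\n' % (len(lines) - 3,))
output.append(inventory_text)
```

Note the `%s` → `%d` change in the diff above: Python 3's `zlib.crc32` always returns an unsigned int, and `%d` on bytes keeps the on-disk format stable (the grammar's optional `-` sign only ever arose from Python 2's signed crc32).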
@@ -2168 +2387 @@
    def _make_deleted_row(self, fileid_utf8, parents):
        """Return a deleted row for fileid_utf8."""
-        return ('/', 'RECYCLED.BIN', 'file', fileid_utf8, 0, DirState.NULLSTAT,
+        return (b'/', b'RECYCLED.BIN', b'file', fileid_utf8, 0, DirState.NULLSTAT,
@@ -2173 +2392 @@
    def _num_present_parents(self):
        """The number of parent entries in each record row."""
        return len(self._parents) - len(self._ghosts)
@@ -2178 +2397 @@
-    def on_file(path, sha1_provider=None):
+    def on_file(cls, path, sha1_provider=None, worth_saving_limit=0,
+                use_filesystem_for_exec=True):
        """Construct a DirState on the file at path "path".

        :param path: The path at which the dirstate file on disk should live.
        :param sha1_provider: an object meeting the SHA1Provider interface.
            If None, a DefaultSHA1Provider is used.
+        :param worth_saving_limit: when the exact number of hash changed
+            entries is known, only bother saving the dirstate if more than
+            this count of entries have changed. -1 means never save.
+        :param use_filesystem_for_exec: Whether to trust the filesystem
+            for executable bit information
        :return: An unlocked DirState object, associated with the given path.
        """
        if sha1_provider is None:
            sha1_provider = DefaultSHA1Provider()
-        result = DirState(path, sha1_provider)
+        result = cls(path, sha1_provider,
+                     worth_saving_limit=worth_saving_limit,
+                     use_filesystem_for_exec=use_filesystem_for_exec)
def _read_dirblocks_if_needed(self):
2468
2241
raise errors.BzrError(
2469
2242
'invalid header line: %r' % (header,))
2470
2243
crc_line = self._state_file.readline()
2471
if not crc_line.startswith(b'crc32: '):
2244
if not crc_line.startswith('crc32: '):
2472
2245
raise errors.BzrError('missing crc32 checksum: %r' % crc_line)
2473
self.crc_expected = int(crc_line[len(b'crc32: '):-1])
2246
self.crc_expected = int(crc_line[len('crc32: '):-1])
2474
2247
num_entries_line = self._state_file.readline()
2475
if not num_entries_line.startswith(b'num_entries: '):
2248
if not num_entries_line.startswith('num_entries: '):
2476
2249
raise errors.BzrError('missing num_entries line')
2477
self._num_entries = int(num_entries_line[len(b'num_entries: '):-1])
2250
self._num_entries = int(num_entries_line[len('num_entries: '):-1])
@@ -2252 +2479 @@
-    def sha1_from_stat(self, path, stat_result, _pack_stat=pack_stat):
+    def sha1_from_stat(self, path, stat_result):
        """Find a sha1 given a stat lookup."""
-        return self._get_packed_stat_index().get(_pack_stat(stat_result), None)
+        return self._get_packed_stat_index().get(pack_stat(stat_result), None)

    def _get_packed_stat_index(self):
        """Get a packed_stat index of self._dirblocks."""
        if self._packed_stat_index is None:
            index = {}
            for key, tree_details in self._iter_entries():
-                if tree_details[0][0] == 'f':
+                if tree_details[0][0] == b'f':
                    index[tree_details[0][4]] = tree_details[0][1]
            self._packed_stat_index = index
        return self._packed_stat_index
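The fast path behind `sha1_from_stat` is just a dict from packed stat fingerprint to sha1, built only from `'f'` (file) rows: an unchanged stat answers the lookup without rereading file content. A reduced sketch, where `pack_stat` is a simplified stand-in for the real fingerprint function (an assumption; the real one packs more stat fields):

```python
# Build a packed-stat index from dirstate-style file rows and use it as a
# content cache keyed on stat fingerprints.

def pack_stat(st_size, st_mtime):
    # simplified stand-in: the real pack_stat encodes size, mtime, ctime,
    # dev, ino and mode into a compact byte string
    return (st_size, int(st_mtime))

# a file row's tree details: (minikind, sha1, size, executable, packed_stat)
rows = [
    (b'f', b'abc123', 42, False, pack_stat(42, 1000)),
    (b'd', b'', 0, False, b'null'),          # directories carry no sha1
]

packed_stat_index = {}
for details in rows:
    if details[0] == b'f':
        packed_stat_index[details[4]] = details[1]
```

A lookup with a matching stat returns the cached sha1; any stat change misses and forces a re-hash.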
@@ -2281 +2508 @@
            # Should this be a warning? For now, I'm expecting that places that
            # mark it inconsistent will warn, making a warning here redundant.
            trace.mutter('Not saving DirState because '
                         '_changes_aborted is set.')
            return
-        if (self._header_state == DirState.IN_MEMORY_MODIFIED or
-            self._dirblock_state == DirState.IN_MEMORY_MODIFIED):
-            grabbed_write_lock = False
-            if self._lock_state != 'w':
-                grabbed_write_lock, new_lock = self._lock_token.temporary_write_lock()
-                # Switch over to the new lock, as the old one may be closed.
-                # TODO: jam 20070315 We should validate the disk file has
-                #       not changed contents. Since temporary_write_lock may
-                #       not be an atomic operation.
-                self._lock_token = new_lock
-                self._state_file = new_lock.f
-                if not grabbed_write_lock:
-                    # We couldn't grab a write lock, so we switch back to a read one
-                    return
-            try:
-                self._state_file.seek(0)
-                self._state_file.writelines(self.get_lines())
-                self._state_file.truncate()
-                self._state_file.flush()
-                self._header_state = DirState.IN_MEMORY_UNMODIFIED
-                self._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
-            finally:
-                if grabbed_write_lock:
-                    self._lock_token = self._lock_token.restore_read_lock()
-                    self._state_file = self._lock_token.f
-                    # TODO: jam 20070315 We should validate the disk file has
-                    #       not changed contents. Since restore_read_lock may
-                    #       not be an atomic operation.
+        # TODO: Since we now distinguish IN_MEMORY_MODIFIED from
+        #       IN_MEMORY_HASH_MODIFIED, we should only fail quietly if we fail
+        #       to save an IN_MEMORY_HASH_MODIFIED, and fail *noisily* if we
+        #       fail to save IN_MEMORY_MODIFIED
+        if not self._worth_saving():
+            return
+        grabbed_write_lock = False
+        if self._lock_state != 'w':
+            grabbed_write_lock, new_lock = self._lock_token.temporary_write_lock()
+            # Switch over to the new lock, as the old one may be closed.
+            # TODO: jam 20070315 We should validate the disk file has
+            #       not changed contents, since temporary_write_lock may
+            #       not be an atomic operation.
+            self._lock_token = new_lock
+            self._state_file = new_lock.f
+            if not grabbed_write_lock:
+                # We couldn't grab a write lock, so we switch back to a read one
+                return
+        try:
+            lines = self.get_lines()
+            self._state_file.seek(0)
+            self._state_file.writelines(lines)
+            self._state_file.truncate()
+            self._state_file.flush()
+            self._maybe_fdatasync()
+            self._mark_unmodified()
+        finally:
+            if grabbed_write_lock:
+                self._lock_token = self._lock_token.restore_read_lock()
+                self._state_file = self._lock_token.f
+                # TODO: jam 20070315 We should validate the disk file has
+                #       not changed contents. Since restore_read_lock may
+                #       not be an atomic operation.
+
+    def _maybe_fdatasync(self):
+        """Flush to disk if possible and if not configured off."""
+        if self._config_stack.get('dirstate.fdatasync'):
+            osutils.fdatasync(self._state_file.fileno())
+
+    def _worth_saving(self):
+        """Is it worth saving the dirstate or not?"""
+        if (self._header_state == DirState.IN_MEMORY_MODIFIED
+                or self._dirblock_state == DirState.IN_MEMORY_MODIFIED):
+            return True
+        if self._dirblock_state == DirState.IN_MEMORY_HASH_MODIFIED:
+            if self._worth_saving_limit == -1:
+                # We never save hash changes when the limit is -1
+                return False
+            # If we're using smart saving and only a small number of
+            # entries have changed their hash, don't bother saving. John has
+            # suggested using a heuristic here based on the size of the
+            # changed files and/or tree. For now, we go with a configurable
+            # number of changes, keeping the calculation time
+            # as low overhead as possible. (This also keeps all existing
+            # tests passing as the default is 0, i.e. always save.)
+            if len(self._known_hash_changes) >= self._worth_saving_limit:
+                return True
+        return False
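The save-throttling decision added in `_worth_saving` reduces to three inputs. A self-contained sketch of that rule (the state flags are passed as plain booleans here rather than read from the object, an assumption for illustration):

```python
# Decide whether a dirstate flush is worthwhile: structural modifications
# always save; hash-only modifications save only once at least
# worth_saving_limit entries changed, and -1 disables saving them entirely.

def worth_saving(header_modified, blocks_modified, hash_modified,
                 num_hash_changes, worth_saving_limit=0):
    if header_modified or blocks_modified:
        return True
    if hash_modified:
        if worth_saving_limit == -1:
            return False  # never persist pure hash-cache updates
        return num_hash_changes >= worth_saving_limit
    return False
```

With the default limit of 0 every hash change still saves, which is why the existing test suite keeps passing unchanged.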
@@ -2316 +2573 @@
    def _set_data(self, parent_ids, dirblocks):
        """Set the full dirstate data in memory.
@@ -2463 +2734 @@
                    # mapping from path,id. We need to look up the correct path
                    # for the indexes from 0 to tree_index -1
                    new_details = []
-                    for lookup_index in xrange(tree_index):
+                    for lookup_index in range(tree_index):
                        # boundary case: this is the first occurrence of file_id
-                        # so there are no id_indexs, possibly take this out of
+                        # so there are no id_indexes, possibly take this out of
                        # the loop?
-                        if not len(id_index[file_id]):
+                        if not len(entry_keys):
                            new_details.append(DirState.NULL_PARENT_DETAILS)
                        else:
                            # grab any one entry, use it to find the right path.
-                            # TODO: optimise this to reduce memory use in highly
-                            #       fragmented situations by reusing the relocation
-                            a_key = iter(id_index[file_id]).next()
-                            if by_path[a_key][lookup_index][0] in ('r', 'a'):
-                                # its a pointer or missing statement, use it as is.
-                                new_details.append(by_path[a_key][lookup_index])
+                            a_key = next(iter(entry_keys))
+                            if by_path[a_key][lookup_index][0] in (b'r', b'a'):
+                                # its a pointer or missing statement, use it as
+                                # is.
+                                new_details.append(
+                                    by_path[a_key][lookup_index])
                            else:
                                # we have the right key, make a pointer to it.
-                                real_path = ('/'.join(a_key[0:2])).strip('/')
-                                new_details.append(('r', real_path, 0, False, ''))
+                                real_path = (b'/'.join(a_key[0:2])).strip(b'/')
+                                new_details.append(st(b'r', real_path, 0, False,
                    new_details.append(self._inv_entry_to_details(entry))
                    new_details.extend(new_location_suffix)
                    by_path[new_entry_key] = new_details
-                    id_index[file_id].add(new_entry_key)
+                    self._add_to_id_index(id_index, new_entry_key)
            # --- end generation of full tree mappings

            # sort and output all the entries
@@ -2604 +2887 @@
                    # the minimal required trigger is if the execute bit or cached
                    # kind has changed.
                    if (current_old[1][0][3] != current_new[1].executable or
                            current_old[1][0][0] != current_new_minikind):
                        trace.mutter("Updating in-place change '%s'.",
                                     new_path_utf8.decode('utf8'))
                        self.update_minimal(current_old[0], current_new_minikind,
                                            executable=current_new[1].executable,
                                            path_utf8=new_path_utf8, fingerprint=fingerprint,
                    # both sides are dealt with, move on
                    current_old = advance(old_iterator)
                    current_new = advance(new_iterator)
-            elif (cmp_by_dirs(new_dirname, current_old[0][0]) < 0
-                  or (new_dirname == current_old[0][0]
-                      and new_entry_key[1:] < current_old[0][1:])):
+            elif (lt_by_dirs(new_dirname, current_old[0][0])
+                  or (new_dirname == current_old[0][0] and
+                      new_entry_key[1:] < current_old[0][1:])):
                # new comes before:
                # add an entry for this and advance new
                trace.mutter("Inserting from new '%s'.",
                             new_path_utf8.decode('utf8'))
                self.update_minimal(new_entry_key, current_new_minikind,
                                    executable=current_new[1].executable,
                                    path_utf8=new_path_utf8, fingerprint=fingerprint,
                current_new = advance(new_iterator)
            else:
                # we've advanced past the place where the old key would be,
                # without seeing it in the new list. so it must be gone.
                trace.mutter("Deleting from old '%s/%s'.",
                             current_old[0][0].decode('utf8'),
                             current_old[0][1].decode('utf8'))
                self._make_absent(current_old)
                current_old = advance(old_iterator)
-        self._dirblock_state = DirState.IN_MEMORY_MODIFIED
+        self._mark_modified()
        self._id_index = None
        self._packed_stat_index = None
        trace.mutter("set_state_from_inventory complete.")
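The `cmp_by_dirs` → `lt_by_dirs` change in the hunk above swaps the old three-way comparison for a pure "less than", matching Python 3's sort protocol. The point of either helper is that dirblocks are ordered by path *segment*, not by raw bytes, so `a/b` sorts before `a-b` even though `-` < `/` as bytes. A plain-Python sketch of that ordering (the real implementation lives in breezy's `_dirstate_helpers` modules; this stand-in is an assumption):

```python
# Order directory paths segment-by-segment, as dirblocks are ordered, rather
# than by raw byte value.

def lt_by_dirs(path1, path2):
    """True if path1 sorts before path2 in dirblock order."""
    return path1.split(b'/') < path2.split(b'/')
```

Splitting on `/` means a path always sorts immediately before its own children, which is what keeps the merge walk over old and new iterators in step.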
@@ +2929 @@
+    def set_state_from_scratch(self, working_inv, parent_trees, parent_ghosts):
+        """Wipe the currently stored state and set it to something new.
+
+        This is a hard-reset for the data we are working with.
+        """
+        # Technically, we really want a write lock, but until we write, we
+        # don't really need it.
+        self._requires_lock()
+        # root dir and root dir contents with no children. We have to have a
+        # root for set_state_from_inventory to work correctly.
+        empty_root = ((b'', b'', inventory.ROOT_ID),
+                      [(b'd', b'', 0, False, DirState.NULLSTAT)])
+        empty_tree_dirblocks = [(b'', [empty_root]), (b'', [])]
+        self._set_data([], empty_tree_dirblocks)
+        self.set_state_from_inventory(working_inv)
+        self.set_parent_trees(parent_trees, parent_ghosts)
@@ -2646 +2946 @@
    def _make_absent(self, current_old):
        """Mark current_old - an entry - as absent for tree 0.
@@ -2682 +2985 @@
            update_block_index, present = \
                self._find_block_index_from_key(update_key)
            if not present:
-                raise AssertionError('could not find block for %s' % (update_key,))
+                raise AssertionError(
+                    'could not find block for %s' % (update_key,))
            update_entry_index, present = \
-                self._find_entry_index(update_key, self._dirblocks[update_block_index][1])
+                self._find_entry_index(
+                    update_key, self._dirblocks[update_block_index][1])
            if not present:
-                raise AssertionError('could not find entry for %s' % (update_key,))
+                raise AssertionError(
+                    'could not find entry for %s' % (update_key,))
            update_tree_details = self._dirblocks[update_block_index][1][update_entry_index][1]
            # it must not be absent at the moment
-            if update_tree_details[0][0] == 'a': # absent
+            if update_tree_details[0][0] == b'a': # absent
                raise AssertionError('bad row %r' % (update_tree_details,))
            update_tree_details[0] = DirState.NULL_PARENT_DETAILS
-        self._dirblock_state = DirState.IN_MEMORY_MODIFIED
+        self._mark_modified()
        return last_reference
@@ -2698 +3004 @@
-    def update_minimal(self, key, minikind, executable=False, fingerprint='',
+    def update_minimal(self, key, minikind, executable=False, fingerprint=b'',
                       packed_stat=None, size=0, path_utf8=None, fullscan=False):
        """Update an entry to the state in tree 0.

        This will either create a new entry at 'key' or update an existing one.
@@ -2803 +3114 @@
                    update_block_index, present = \
                        self._find_block_index_from_key(other_key)
                    if not present:
-                        raise AssertionError('could not find block for %s' % (other_key,))
+                        raise AssertionError(
+                            'could not find block for %s' % (other_key,))
                    update_entry_index, present = \
-                        self._find_entry_index(other_key, self._dirblocks[update_block_index][1])
+                        self._find_entry_index(
+                            other_key, self._dirblocks[update_block_index][1])
                    if not present:
-                        raise AssertionError('update_minimal: could not find entry for %s' % (other_key,))
+                        raise AssertionError(
+                            'update_minimal: could not find entry for %s' % (other_key,))
                    update_details = self._dirblocks[update_block_index][1][update_entry_index][1][lookup_index]
-                    if update_details[0] in 'ar': # relocated, absent
+                    if update_details[0] in (b'a', b'r'): # relocated, absent
                        # its a pointer or absent in lookup_index's tree, use
                        # it as is.
                        new_entry[1].append(update_details)
                    else:
                        # we have the right key, make a pointer to it.
                        pointer_path = osutils.pathjoin(*other_key[0:2])
-                        new_entry[1].append(('r', pointer_path, 0, False, ''))
+                        new_entry[1].append(
+                            (b'r', pointer_path, 0, False, b''))
            block.insert(entry_index, new_entry)
-            existing_keys.add(key)
+            self._add_to_id_index(id_index, key)
        else:
            # Does the new state matter?
            block[entry_index][1][0] = new_details
@@ -2842 +3162 @@
                # other trees, so put absent pointers there
                # This is the vertical axis in the matrix, all pointing
                # to the real path.
-                block_index, present = self._find_block_index_from_key(entry_key)
+                block_index, present = self._find_block_index_from_key(
+                    entry_key)
                if not present:
                    raise AssertionError('not present: %r', entry_key)
-                entry_index, present = self._find_entry_index(entry_key, self._dirblocks[block_index][1])
+                entry_index, present = self._find_entry_index(
+                    entry_key, self._dirblocks[block_index][1])
                if not present:
                    raise AssertionError('not present: %r', entry_key)
                self._dirblocks[block_index][1][entry_index][1][0] = \
-                    ('r', path_utf8, 0, False, '')
+                    (b'r', path_utf8, 0, False, b'')
        # add a containing dirblock if needed.
-        if new_details[0] == 'd':
-            subdir_key = (osutils.pathjoin(*key[0:2]), '', '')
+        if new_details[0] == b'd':
+            # GZ 2017-06-09: Using pathjoin why?
+            subdir_key = (osutils.pathjoin(*key[0:2]), b'', b'')
            block_index, present = self._find_block_index_from_key(subdir_key)
            if not present:
                self._dirblocks.insert(block_index, (subdir_key[0], []))
-        self._dirblock_state = DirState.IN_MEMORY_MODIFIED
+        self._mark_modified()
def _maybe_remove_row(self, block, index, id_index):
3186
2863
"""Remove index if it is absent or relocated across the row.
3188
2865
id_index is updated accordingly.
3189
:return: True if we removed the row, False otherwise
3191
2867
present_in_row = False
3192
2868
entry = block[index]
3193
2869
for column in entry[1]:
3194
if column[0] not in (b'a', b'r'):
2870
if column[0] not in 'ar':
3195
2871
present_in_row = True
3197
2873
if not present_in_row:
3198
2874
block.pop(index)
3199
self._remove_from_id_index(id_index, entry[0])
2875
id_index[entry[0][2]].remove(entry[0])
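`_maybe_remove_row`'s rule in isolation: a row may be dropped only when every tree column is absent (`'a'`) or relocated (`'r'`); a single real column keeps it alive. A minimal sketch of that predicate (the entry tuples below are hand-built examples in the `(key, [tree_details, ...])` shape, not real dirstate rows):

```python
# A row is removable only if no tree still has a real (non-absent,
# non-relocated) entry for it.

def row_is_removable(entry):
    """entry is (key, [details_per_tree]); details[0] is the minikind."""
    return all(column[0] in (b'a', b'r') for column in entry[1])

dead_row = ((b'dir', b'gone', b'id-1'),
            [(b'a', b'', 0, False, b''),          # absent in tree 0
             (b'r', b'other/path', 0, False, b'')])  # relocated in tree 1
live_row = ((b'dir', b'file', b'id-2'),
            [(b'f', b'sha1', 12, False, b'stat'),  # real file in tree 0
             (b'a', b'', 0, False, b'')])
```

This is also why the removal path must update the id index: dropping the row invalidates every `(dirname, basename, file_id)` key recorded for that file id.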
@@ -2877 +3203 @@
    def _validate(self):
        """Check that invariants on the dirblock are correct.
@@ -2958 +3284 @@
        # We check this with a dict per tree pointing either to the present
        # name, or None if absent.
        tree_count = self._num_present_parents() + 1
-        id_path_maps = [dict() for i in range(tree_count)]
+        id_path_maps = [{} for _ in range(tree_count)]
        # Make sure that all renamed entries point to the correct location.
        for entry in self._iter_entries():
            file_id = entry[0][2]
            this_path = osutils.pathjoin(entry[0][0], entry[0][1])
            if len(entry[1]) != tree_count:
                raise AssertionError(
-                "wrong number of entry details for row\n%s" \
-                ",\nexpected %d" % \
-                (pformat(entry), tree_count))
+                    "wrong number of entry details for row\n%s"
+                    ",\nexpected %d" %
+                    (pformat(entry), tree_count))
            absent_positions = 0
            for tree_index, tree_state in enumerate(entry[1]):
                this_tree_map = id_path_maps[tree_index]
                minikind = tree_state[0]
-                if minikind in 'ar':
+                if minikind in (b'a', b'r'):
                    absent_positions += 1
                # have we seen this id before in this column?
                if file_id in this_tree_map:
                    previous_path, previous_loc = this_tree_map[file_id]
                    # any later mention of this file must be consistent with
                    # what was said before
-                    if minikind == 'a':
+                    if minikind == b'a':
                        if previous_path is not None:
                            raise AssertionError(
-                            "file %s is absent in row %r but also present " \
-                            (file_id, entry, previous_path))
-                    elif minikind == 'r':
+                                "file %s is absent in row %r but also present "
+                                (file_id.decode('utf-8'), entry, previous_path))
+                    elif minikind == b'r':
                        target_location = tree_state[1]
                        if previous_path != target_location:
                            raise AssertionError(
-                            "file %s relocation in row %r but also at %r" \
-                            % (file_id, entry, previous_path))
+                                "file %s relocation in row %r but also at %r"
+                                % (file_id, entry, previous_path))
                    else:
                        # a file, directory, etc - may have been previously
                        # pointed to by a relocation, which must point here
@@ -3131 +3486 @@
                # are calculated at the same time, so checking just the size
                # gains nothing w.r.t. performance.
                link_or_sha1 = state._sha1_file(abspath)
-                entry[1][0] = ('f', link_or_sha1, stat_value.st_size,
+                entry[1][0] = (b'f', link_or_sha1, stat_value.st_size,
                               executable, packed_stat)
        else:
-            entry[1][0] = ('f', '', stat_value.st_size,
+            entry[1][0] = (b'f', b'', stat_value.st_size,
                           executable, DirState.NULLSTAT)
+            worth_saving = False
-    elif minikind == 'd':
+    elif minikind == b'd':
        link_or_sha1 = None
-        entry[1][0] = ('d', '', 0, False, packed_stat)
-        if saved_minikind != 'd':
+        entry[1][0] = (b'd', b'', 0, False, packed_stat)
+        if saved_minikind != b'd':
            # This changed from something into a directory. Make sure we
            # have a directory block for it. This doesn't happen very
            # often, so this doesn't have to be super fast.
            block_index, entry_index, dir_present, file_present = \
                state._get_block_entry_index(entry[0][0], entry[0][1], 0)
            state._ensure_block(block_index, entry_index,
                                osutils.pathjoin(entry[0][0], entry[0][1]))
+        else:
+            worth_saving = False
-    elif minikind == 'l':
+    elif minikind == b'l':
+        if saved_minikind == b'l':
+            worth_saving = False
        link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
        if state._cutoff_time is None:
            state._sha_cutoff_time()
        if (stat_value.st_mtime < state._cutoff_time
                and stat_value.st_ctime < state._cutoff_time):
-            entry[1][0] = ('l', link_or_sha1, stat_value.st_size,
+            entry[1][0] = (b'l', link_or_sha1, stat_value.st_size,
                           False, packed_stat)
        else:
-            entry[1][0] = ('l', '', stat_value.st_size,
+            entry[1][0] = (b'l', b'', stat_value.st_size,
                           False, DirState.NULLSTAT)
-    state._dirblock_state = DirState.IN_MEMORY_MODIFIED
+    state._mark_modified([entry])
    return link_or_sha1
@@ -3165 +3526 @@
class ProcessEntryPython(object):

    __slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
                 "last_source_parent", "last_target_parent", "include_unchanged",
                 "partial", "use_filesystem_for_exec", "utf8_decode",
                 "searched_specific_files", "search_specific_files",
                 "searched_exact_paths", "search_specific_file_parents", "seen_ids",
                 "state", "source_index", "target_index", "want_unversioned", "tree"]

    def __init__(self, include_unchanged, use_filesystem_for_exec,
                 search_specific_files, state, source_index, target_index,
                 want_unversioned, tree):
        self.old_dirname_to_file_id = {}
        self.new_dirname_to_file_id = {}
        # Are we doing a partial iter_changes?
-        self.partial = search_specific_files != set([''])
+        self.partial = search_specific_files != {''}
        # Using a list so that we can access the values and change them in
        # nested scope. Each one is [path, file_id, entry]
        self.last_source_parent = [None, None]
@@ -3419 +3788 @@
                                       and stat.S_IEXEC & path_info[3].st_mode)
                target_exec = target_details[3]
            return (entry[0][2],
                    (None, self.utf8_decode(path)[0]),
                    (None, self.utf8_decode(entry[0][1])[0]),
                    (None, path_info[2]),
                    (None, target_exec)), True
        else:
            # Its a missing file, report it as such.
            return (entry[0][2],
                    (None, self.utf8_decode(path)[0]),
                    (None, self.utf8_decode(entry[0][1])[0]),
                    (None, False)), True
-        elif source_minikind in 'fdlt' and target_minikind in 'a':
+        elif source_minikind in _fdlt and target_minikind in b'a':
            # unversioned, possibly, or possibly not deleted: we dont care.
            # if its still on disk, *and* theres no other entry at this
            # path [we dont know this in this routine at the moment -
            # perhaps we should change this - then it would be an unknown.
            old_path = pathjoin(entry[0][0], entry[0][1])
            # parent id is the entry for the path in the target tree
-            parent_id = self.state._get_entry(self.source_index, path_utf8=entry[0][0])[0][2]
+            parent_id = self.state._get_entry(
+                self.source_index, path_utf8=entry[0][0])[0][2]
            if parent_id == entry[0][2]:
                parent_id = None
            return (entry[0][2],
                    (self.utf8_decode(old_path)[0], None),
                    (self.utf8_decode(entry[0][1])[0], None),
                    (DirState._minikind_to_kind[source_minikind], None),
                    (source_details[3], None)), True
-        elif source_minikind in 'fdlt' and target_minikind in 'r':
+        elif source_minikind in _fdlt and target_minikind in b'r':
            # a rename; could be a true rename, or a rename inherited from
            # a renamed parent. TODO: handle this efficiently. Its not
            # common case to rename dirs though, so a correct but slow
            # implementation will do.
-            if not osutils.is_inside_any(self.searched_specific_files, target_details[1]):
+            if not osutils.is_inside_any(self.searched_specific_files,
                self.search_specific_files.add(target_details[1])
-        elif source_minikind in 'ra' and target_minikind in 'ra':
+        elif source_minikind in _ra and target_minikind in _ra:
            # neither of the selected trees contain this file,
            # so skip over it. This is not currently directly tested, but
            # is indirectly via test_too_much.TestCommands.test_conflicts.
        else:
            raise AssertionError("don't know how to compare "
                                 "source_minikind=%r, target_minikind=%r"
                                 % (source_minikind, target_minikind))
-        ## import pdb;pdb.set_trace()
        return None, None

    def __iter__(self):
3961
3587
new_executable = bool(
3962
3588
stat.S_ISREG(root_dir_info[3].st_mode)
3963
3589
and stat.S_IEXEC & root_dir_info[3].st_mode)
3966
(None, current_root_unicode),
3970
(None, splitpath(current_root_unicode)[-1]),
3971
(None, root_dir_info[2]),
3972
(None, new_executable)
3974
initial_key = (current_root, b'', b'')
3591
(None, current_root_unicode),
3595
(None, splitpath(current_root_unicode)[-1]),
3596
(None, root_dir_info[2]),
3597
(None, new_executable)
3599
initial_key = (current_root, '', '')
3975
3600
block_index, _ = self.state._find_block_index_from_key(initial_key)
3976
3601
if block_index == 0:
3977
3602
# we have processed the total root already, but because the
3978
3603
# initial key matched it we should skip it here.
3980
3605
if root_dir_info and root_dir_info[2] == 'tree-reference':
3981
3606
current_dir_info = None
3983
dir_iterator = osutils._walkdirs_utf8(
3984
root_abspath, prefix=current_root)
3608
dir_iterator = osutils._walkdirs_utf8(root_abspath, prefix=current_root)
3986
current_dir_info = next(dir_iterator)
3987
        try:
            current_dir_info = next(dir_iterator)
        except OSError as e:
            # on win32, python2.4 has e.errno == ERROR_DIRECTORY, but
            # python 2.5 has e.errno == EINVAL,
            # and e.winerror == ERROR_DIRECTORY
            e_winerror = getattr(e, 'winerror', None)
            win_errors = (ERROR_DIRECTORY, ERROR_PATH_NOT_FOUND)
            if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
                current_dir_info = None
            elif (sys.platform == 'win32'
                  and (e.errno in win_errors or
                       e_winerror in win_errors)):
                current_dir_info = None
            else:
                raise
        else:
            if current_dir_info[0][0] == b'':
                # remove .bzr from iteration
                bzr_index = bisect.bisect_left(
                    current_dir_info[1], (b'.bzr',))
                if current_dir_info[1][bzr_index][0] != b'.bzr':
                    raise AssertionError()
                del current_dir_info[1][bzr_index]
        # walk until both the directory listing and the versioned metadata
        # are exhausted.
        if (block_index < len(self.state._dirblocks) and
                osutils.is_inside(current_root,
                                  self.state._dirblocks[block_index][0])):
            current_block = self.state._dirblocks[block_index]
        else:
            current_block = None
        while (current_dir_info is not None or
               current_block is not None):
            if (current_dir_info and current_block
                    and current_dir_info[0][0] != current_block[0]):
                if _lt_by_dirs(current_dir_info[0][0], current_block[0]):
                    # filesystem data refers to paths not covered by the dirblock.
                    # this has two possibilities:
                    # A) it is versioned but empty, so there is no block for it
                    # B) it is not versioned.

                    # if (B) then we should ignore it, because we don't
                    # recurse into unknown directories.
                    path_index = 0
                    while path_index < len(current_dir_info[1]):
                        current_path_info = current_dir_info[1][path_index]
                        if self.want_unversioned:
                            if current_path_info[2] == 'directory':
                                if self.tree._directory_is_tree_reference(
                                        current_path_info[0].decode('utf8')):
                                    current_path_info = current_path_info[:2] + \
                                        ('tree-reference',) + \
                                        current_path_info[3:]
                            new_executable = bool(
                                stat.S_ISREG(current_path_info[3].st_mode)
                                and stat.S_IEXEC & current_path_info[3].st_mode)
                            yield (None,
                                   (None, utf8_decode(current_path_info[0])[0]),
                                   True,
                                   (False, False),
                                   (None, None),
                                   (None, utf8_decode(current_path_info[1])[0]),
                                   (None, current_path_info[2]),
                                   (None, new_executable))
                        # dont descend into this unversioned path if it is
                        # a dir
                        if current_path_info[2] in ('directory',
                                                    'tree-reference'):
                            del current_dir_info[1][path_index]
                            path_index -= 1
                        path_index += 1

                    # This dir info has been handled, go to the next
                    try:
                        current_dir_info = next(dir_iterator)
                    except StopIteration:
                        current_dir_info = None
                raise AssertionError(
                    "Got entry<->path mismatch for specific path "
                    "%r entry %r path_info %r " % (
                        path_utf8, entry, path_info))
            # Only include changes - we're outside the users requested
            # expansion.
            if changed:
                self._gather_result_for_consistency(result)
                if (result.kind[0] == 'directory' and
                        result.kind[1] != 'directory'):
                    # This stopped being a directory, the old children have
                    # to be included.
                    if entry[1][self.source_index][0] == b'r':
                        # renamed, take the source path
                        entry_path_utf8 = entry[1][self.source_index][1]
                    else:
                        entry_path_utf8 = path_utf8
                    initial_key = (entry_path_utf8, b'', b'')
                    block_index, _ = self.state._find_block_index_from_key(
                        initial_key)
                    if block_index == 0:
                        # The children of the root are in block index 1.
                        block_index += 1
                    current_block = None
                    if block_index < len(self.state._dirblocks):
                        current_block = self.state._dirblocks[block_index]
                        if not osutils.is_inside(
                                entry_path_utf8, current_block[0]):
                            # No entries for this directory at all.
                            current_block = None
                    if current_block is not None:
                        for entry in current_block[1]:
                            if entry[1][self.source_index][0] in (b'a', b'r'):
                                # Not in the source tree, so doesn't have to be