20
20
lines by NL. The field delimiters are omitted in the grammar, line delimiters
21
21
are not - this is done for clarity of reading. All string data is in utf8.
25
MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
28
WHOLE_NUMBER = {digit}, digit;
30
REVISION_ID = a non-empty utf8 string;
32
dirstate format = header line, full checksum, row count, parent details,
33
ghost_details, entries;
34
header line = "#bazaar dirstate flat format 3", NL;
35
full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
36
row count = "num_entries: ", WHOLE_NUMBER, NL;
37
parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
38
ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
40
entry = entry_key, current_entry_details, {parent_entry_details};
41
entry_key = dirname, basename, fileid;
42
current_entry_details = common_entry_details, working_entry_details;
43
parent_entry_details = common_entry_details, history_entry_details;
44
common_entry_details = MINIKIND, fingerprint, size, executable
45
working_entry_details = packed_stat
46
history_entry_details = REVISION_ID;
49
fingerprint = a nonempty utf8 sequence with meaning defined by minikind.
51
Given this definition, the following is useful to know::
53
entry (aka row) - all the data for a given key.
54
entry[0]: The key (dirname, basename, fileid)
58
entry[1]: The tree(s) data for this path and id combination.
59
entry[1][0]: The current tree
60
entry[1][1]: The second tree
62
For an entry for a tree, we have (using tree 0 - current tree) to demonstrate::
64
entry[1][0][0]: minikind
65
entry[1][0][1]: fingerprint
67
entry[1][0][3]: executable
68
entry[1][0][4]: packed_stat
72
entry[1][1][4]: revision_id
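
For illustration, an in-memory entry for a file at src/foo.c tracked in the
working tree and one parent tree might look like this (every value here is
invented for the example)::

    entry = ((b'src', b'foo.c', b'foo-file-id'),                        # the key
             [(b'f', b'<sha1 of content>', 30, False, b'<packed stat>'),  # tree 0
              (b'f', b'<sha1 of content>', 30, False, b'parent-revid')])  # tree 1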
23
MINIKIND = "f" | "d" | "l" | "a" | "r" | "t";
26
WHOLE_NUMBER = {digit}, digit;
28
REVISION_ID = a non-empty utf8 string;
30
dirstate format = header line, full checksum, row count, parent details,
31
ghost_details, entries;
32
header line = "#bazaar dirstate flat format 2", NL;
33
full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
34
row count = "num_entries: ", digit, NL;
35
parent_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
36
ghost_details = WHOLE_NUMBER, {REVISION_ID}*, NL;
38
entry = entry_key, current_entry_details, {parent_entry_details};
39
entry_key = dirname, basename, fileid;
40
current_entry_details = common_entry_details, working_entry_details;
41
parent_entry_details = common_entry_details, history_entry_details;
42
common_entry_details = MINIKIND, fingerprint, size, executable
43
working_entry_details = packed_stat
44
history_entry_details = REVISION_ID;
47
fingerprint = a nonempty utf8 sequence with meaning defined by minikind.
49
Given this definition, the following is useful to know:
50
entry (aka row) - all the data for a given key.
51
entry[0]: The key (dirname, basename, fileid)
55
entry[1]: The tree(s) data for this path and id combination.
56
entry[1][0]: The current tree
57
entry[1][1]: The second tree
59
For an entry for a tree, we have (using tree 0 - current tree) to demonstrate:
60
entry[1][0][0]: minikind
61
entry[1][0][1]: fingerprint
63
entry[1][0][3]: executable
64
entry[1][0][4]: packed_stat
66
entry[1][1][4]: revision_id
74
68
There may be multiple rows at the root, one per id present in the root, so the
75
in memory root row is now::
77
self._dirblocks[0] -> ('', [entry ...]),
79
and the entries in there are::
83
entries[0][2]: file_id
84
entries[1][0]: The tree data for the current tree for this fileid at /
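
For illustration (ids, paths and the entry_* names are invented), a small
tree containing a file 'a' and a directory 'src' with one file would be held
in memory as::

    self._dirblocks = [
        ('', [root_entry]),                    # the root row block
        ('', [entry_for_a, entry_for_src]),    # contents-of-root block
        ('src', [entry_for_src_main_c]),       # contents of 'src'
    ]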
89
b'r' is a relocated entry: This path is not present in this tree with this
90
id, but the id can be found at another location. The fingerprint is
91
used to point to the target location.
92
b'a' is an absent entry: In that tree the id is not present at this path.
93
b'd' is a directory entry: This path in this tree is a directory with the
94
current file id. There is no fingerprint for directories.
95
b'f' is a file entry: As for directory, but it's a file. The fingerprint is
96
the sha1 value of the file's canonical form, i.e. after any read
97
filters have been applied to the convenience form stored in the working
99
b'l' is a symlink entry: As for directory, but a symlink. The fingerprint is
101
b't' is a reference to a nested subtree; the fingerprint is the referenced
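
As a quick reference, the minikind codes summarised above map to kinds as
follows (an illustrative table, not a verbatim copy of the source)::

    minikind_to_kind = {
        b'f': 'file',
        b'd': 'directory',
        b'l': 'symlink',
        b't': 'tree-reference',
        b'a': 'absent',
        b'r': 'relocated',
    }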
69
in memory root row is now:
70
self._dirblocks[0] -> ('', [entry ...]),
71
and the entries in there are
74
entries[0][2]: file_id
75
entries[1][0]: The tree data for the current tree for this fileid at /
79
'r' is a relocated entry: This path is not present in this tree with this id,
80
but the id can be found at another location. The fingerprint is used to
81
point to the target location.
82
'a' is an absent entry: In that tree the id is not present at this path.
83
'd' is a directory entry: This path in this tree is a directory with the
84
current file id. There is no fingerprint for directories.
85
'f' is a file entry: As for directory, but it's a file. The fingerprint is a
87
'l' is a symlink entry: As for directory, but a symlink. The fingerprint is the
89
't' is a reference to a nested subtree; the fingerprint is the referenced
106
The entries on disk and in memory are ordered according to the following keys::
94
The entries on disk and in memory are ordered according to the following keys:
108
96
directory, as a list of components
112
100
--- Format 1 had the following different definition: ---
116
rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
117
WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
119
PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
120
basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
101
rows = dirname, NULL, basename, NULL, MINIKIND, NULL, fileid_utf8, NULL,
102
WHOLE NUMBER (* size *), NULL, packed stat, NULL, sha1|symlink target,
104
PARENT ROW = NULL, revision_utf8, NULL, MINIKIND, NULL, dirname, NULL,
105
basename, NULL, WHOLE NUMBER (* size *), NULL, "y" | "n", NULL,
123
108
PARENT ROWs are emitted for every parent that is not in the ghosts details
124
109
line. That is, if the parents are foo, bar, baz, and the ghosts are bar, then
1326
def _check_delta_is_valid(self, delta):
1327
delta = list(inventory._check_delta_unique_ids(
1328
inventory._check_delta_unique_old_paths(
1329
inventory._check_delta_unique_new_paths(
1330
inventory._check_delta_ids_match_entry(
1331
inventory._check_delta_ids_are_valid(
1332
inventory._check_delta_new_path_entry_both_or_None(delta)))))))
1335
(old_path, new_path, file_id, new_entry) = d
1336
if old_path is None:
1338
if new_path is None:
1340
return (old_path, new_path, file_id, new_entry)
1341
delta.sort(key=delta_key, reverse=True)
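# Hedged illustration (not part of the module): the three shapes of
# inventory-delta items that the methods below consume. Paths, ids and the
# stand-in entry object are invented; real deltas carry InventoryEntry
# instances in the last slot.
example_new_entry = object()      # placeholder for an InventoryEntry
example_delta = [
    (None, 'docs/new.txt', b'new-file-id', example_new_entry),   # add
    ('old.txt', None, b'old-file-id', None),                     # delete
    ('a/f.txt', 'b/f.txt', b'f-id', example_new_entry),          # rename -> delete+add
]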
1344
def update_by_delta(self, delta):
1345
"""Apply an inventory delta to the dirstate for tree 0
1347
This is the workhorse for apply_inventory_delta in dirstate based
1350
:param delta: An inventory delta. See Inventory.apply_delta for
1062
def update_entry(self, entry, abspath, stat_value=None):
1063
"""Update the entry based on what is actually on disk.
1065
:param entry: This is the dirblock entry for the file in question.
1066
:param abspath: The path on disk for this file.
1067
:param stat_value: (optional) if we already have done a stat on the
1069
:return: The sha1 hexdigest of the file (40 bytes) or link target of a
1353
self._read_dirblocks_if_needed()
1354
encode = cache_utf8.encode
1357
# Accumulate parent references (path_utf8, id), to check for parentless
1358
# items or items placed under files/links/tree-references. We get
1359
# references from every item in the delta that is not a deletion and
1360
# is not itself the root.
1362
# Added ids must not be in the dirstate already. This set holds those
1365
# This loop transforms the delta to single atomic operations that can
1366
# be executed and validated.
1367
delta = self._check_delta_is_valid(delta)
1368
for old_path, new_path, file_id, inv_entry in delta:
1369
if not isinstance(file_id, bytes):
1370
raise AssertionError(
1371
"must be a utf8 file_id not %s" % (type(file_id), ))
1372
if (file_id in insertions) or (file_id in removals):
1373
self._raise_invalid(old_path or new_path, file_id,
1375
if old_path is not None:
1376
old_path = old_path.encode('utf-8')
1377
removals[file_id] = old_path
1379
new_ids.add(file_id)
1380
if new_path is not None:
1381
if inv_entry is None:
1382
self._raise_invalid(new_path, file_id,
1383
"new_path with no entry")
1384
new_path = new_path.encode('utf-8')
1385
dirname_utf8, basename = osutils.split(new_path)
1387
parents.add((dirname_utf8, inv_entry.parent_id))
1388
key = (dirname_utf8, basename, file_id)
1389
minikind = DirState._kind_to_minikind[inv_entry.kind]
1390
if minikind == b't':
1391
fingerprint = inv_entry.reference_revision or b''
1394
insertions[file_id] = (key, minikind, inv_entry.executable,
1395
fingerprint, new_path)
1396
# Transform moves into delete+add pairs
1397
if None not in (old_path, new_path):
1398
for child in self._iter_child_entries(0, old_path):
1399
if child[0][2] in insertions or child[0][2] in removals:
1401
child_dirname = child[0][0]
1402
child_basename = child[0][1]
1403
minikind = child[1][0][0]
1404
fingerprint = child[1][0][4]
1405
executable = child[1][0][3]
1406
old_child_path = osutils.pathjoin(child_dirname,
1408
removals[child[0][2]] = old_child_path
1409
child_suffix = child_dirname[len(old_path):]
1410
new_child_dirname = (new_path + child_suffix)
1411
key = (new_child_dirname, child_basename, child[0][2])
1412
new_child_path = osutils.pathjoin(new_child_dirname,
1414
insertions[child[0][2]] = (key, minikind, executable,
1415
fingerprint, new_child_path)
1416
self._check_delta_ids_absent(new_ids, delta, 0)
1418
self._apply_removals(removals.items())
1419
self._apply_insertions(insertions.values())
1421
self._after_delta_check_parents(parents, 0)
1422
except errors.BzrError as e:
1423
self._changes_aborted = True
1424
if 'integrity error' not in str(e):
1426
# _get_entry raises BzrError when a request is inconsistent; we
1427
# want such errors to be shown as InconsistentDelta - and that
1428
# fits the behaviour we trigger.
1429
raise errors.InconsistentDeltaDelta(delta,
1430
"error from _get_entry. %s" % (e,))
1432
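# Sketch of the child-path rewriting used when a rename is expanded into a
# delete+add pair above (values invented; the real code joins paths with
# osutils.pathjoin rather than b'/').
old_path, new_path = b'src', b'lib'
child_dirname, child_basename = b'src/util', b'helpers.py'
child_suffix = child_dirname[len(old_path):]                 # b'/util'
new_child_dirname = new_path + child_suffix                  # b'lib/util'
new_child_path = new_child_dirname + b'/' + child_basename   # b'lib/util/helpers.py'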
def _apply_removals(self, removals):
1433
for file_id, path in sorted(removals, reverse=True,
1434
key=operator.itemgetter(1)):
1435
dirname, basename = osutils.split(path)
1436
block_i, entry_i, d_present, f_present = \
1437
self._get_block_entry_index(dirname, basename, 0)
1072
# This code assumes that the entry passed in is directly held in one of
1073
# the internal _dirblocks. So the dirblock state must have already been
1075
assert self._dirblock_state != DirState.NOT_IN_MEMORY
1076
if stat_value is None:
1439
entry = self._dirblocks[block_i][1][entry_i]
1441
self._raise_invalid(path, file_id,
1442
"Wrong path for old path.")
1443
if not f_present or entry[1][0][0] in (b'a', b'r'):
1444
self._raise_invalid(path, file_id,
1445
"Wrong path for old path.")
1446
if file_id != entry[0][2]:
1447
self._raise_invalid(path, file_id,
1448
"Attempt to remove path has wrong id - found %r."
1450
self._make_absent(entry)
1451
# See if we have a malformed delta: deleting a directory must not
1452
# leave crud behind. This increases the number of bisects needed
1453
# substantially, but deletion or renames of large numbers of paths
1454
# is rare enough it shouldn't be an issue (famous last words?) RBC
1456
block_i, entry_i, d_present, f_present = \
1457
self._get_block_entry_index(path, b'', 0)
1459
# The dir block is still present in the dirstate; this could
1460
# be due to it being in a parent tree, or a corrupt delta.
1461
for child_entry in self._dirblocks[block_i][1]:
1462
if child_entry[1][0][0] not in (b'r', b'a'):
1463
self._raise_invalid(path, entry[0][2],
1464
"The file id was deleted but its children were "
1467
def _apply_insertions(self, adds):
1469
for key, minikind, executable, fingerprint, path_utf8 in sorted(adds):
1470
self.update_minimal(key, minikind, executable, fingerprint,
1471
path_utf8=path_utf8)
1472
except errors.NotVersionedError:
1473
self._raise_invalid(path_utf8.decode('utf8'), key[2],
1476
def update_basis_by_delta(self, delta, new_revid):
1477
"""Update the parents of this tree after a commit.
1479
This gives the tree one parent, with revision id new_revid. The
1480
inventory delta is applied to the current basis tree to generate the
1481
inventory for the parent new_revid, and all other parent trees are
1484
Note that an exception during the operation of this method will leave
1485
the dirstate in a corrupt state where it should not be saved.
1487
:param new_revid: The new revision id for the trees parent.
1488
:param delta: An inventory delta (see apply_inventory_delta) describing
1489
the changes from the current leftmost parent revision to new_revid.
1491
self._read_dirblocks_if_needed()
1492
self._discard_merge_parents()
1493
if self._ghosts != []:
1494
raise NotImplementedError(self.update_basis_by_delta)
1495
if len(self._parents) == 0:
1496
# setup a blank tree, the most simple way.
1497
empty_parent = DirState.NULL_PARENT_DETAILS
1498
for entry in self._iter_entries():
1499
entry[1].append(empty_parent)
1500
self._parents.append(new_revid)
1502
self._parents[0] = new_revid
1504
delta = self._check_delta_is_valid(delta)
1508
# The paths this function accepts are unicode and must be encoded as we
1510
encode = cache_utf8.encode
1511
inv_to_entry = self._inv_entry_to_details
1512
# delta is now (deletes, changes), (adds) in reverse lexicographical
1514
# deletes in reverse lexicographic order are safe to process in situ.
1515
# renames are not, as a rename from any path could go to a path
1516
# lexicographically lower, so we transform renames into delete, add pairs,
1517
# expanding them recursively as needed.
1518
# At the same time, to reduce interface friction we convert the input
1519
# inventory entries to dirstate.
1520
root_only = ('', '')
1521
# Accumulate parent references (path_utf8, id), to check for parentless
1522
# items or items placed under files/links/tree-references. We get
1523
# references from every item in the delta that is not a deletion and
1524
# is not itself the root.
1526
# Added ids must not be in the dirstate already. This set holds those
1529
for old_path, new_path, file_id, inv_entry in delta:
1530
if file_id.__class__ is not bytes:
1531
raise AssertionError(
1532
"must be a utf8 file_id not %s" % (type(file_id), ))
1533
if inv_entry is not None and file_id != inv_entry.file_id:
1534
self._raise_invalid(new_path, file_id,
1535
"mismatched entry file_id %r" % inv_entry)
1536
if new_path is None:
1537
new_path_utf8 = None
1539
if inv_entry is None:
1540
self._raise_invalid(new_path, file_id,
1541
"new_path with no entry")
1542
new_path_utf8 = encode(new_path)
1543
# note the parent for validation
1544
dirname_utf8, basename_utf8 = osutils.split(new_path_utf8)
1546
parents.add((dirname_utf8, inv_entry.parent_id))
1547
if old_path is None:
1548
old_path_utf8 = None
1550
old_path_utf8 = encode(old_path)
1551
if old_path is None:
1552
adds.append((None, new_path_utf8, file_id,
1553
inv_to_entry(inv_entry), True))
1554
new_ids.add(file_id)
1555
elif new_path is None:
1556
deletes.append((old_path_utf8, None, file_id, None, True))
1557
elif (old_path, new_path) == root_only:
1558
# change things in-place
1559
# Note: the case of a parent directory changing its file_id
1560
# tends to break optimizations here, because officially
1561
# the file has actually been moved, it just happens to
1562
# end up at the same path. If we can figure out how to
1563
# handle that case, we can avoid a lot of add+delete
1564
# pairs for objects that stay put.
1565
# elif old_path == new_path:
1566
changes.append((old_path_utf8, new_path_utf8, file_id,
1567
inv_to_entry(inv_entry)))
1570
# Because renames must preserve their children we must have
1571
# processed all relocations and removes before hand. The sort
1572
# order ensures we've examined the child paths, but we also
1573
# have to execute the removals, or the split to an add/delete
1574
# pair will result in the deleted item being reinserted, or
1575
# renamed items being reinserted twice - and possibly at the
1576
# wrong place. Splitting into a delete/add pair also simplifies
1577
# the handling of entries with (b'f', ...), (b'r' ...) because
1578
# the target of the b'r' is old_path here, and we add that to
1579
# deletes, meaning that the add handler does not need to check
1580
# for b'r' items on every pass.
1581
self._update_basis_apply_deletes(deletes)
1583
# Split into an add/delete pair recursively.
1584
adds.append((old_path_utf8, new_path_utf8, file_id,
1585
inv_to_entry(inv_entry), False))
1586
# Expunge deletes that we've seen so that deleted/renamed
1587
# children of a rename directory are handled correctly.
1588
new_deletes = reversed(list(
1589
self._iter_child_entries(1, old_path_utf8)))
1590
# Remove the current contents of the tree at orig_path, and
1591
# reinsert at the correct new path.
1592
for entry in new_deletes:
1593
child_dirname, child_basename, child_file_id = entry[0]
1595
source_path = child_dirname + b'/' + child_basename
1597
source_path = child_basename
1600
new_path_utf8 + source_path[len(old_path_utf8):]
1602
if old_path_utf8 == b'':
1603
raise AssertionError("cannot rename directory to"
1605
target_path = source_path[len(old_path_utf8) + 1:]
1607
(None, target_path, entry[0][2], entry[1][1], False))
1609
(source_path, target_path, entry[0][2], None, False))
1611
(old_path_utf8, new_path_utf8, file_id, None, False))
1613
self._check_delta_ids_absent(new_ids, delta, 1)
1615
# Finish expunging deletes/first half of renames.
1616
self._update_basis_apply_deletes(deletes)
1617
# Reinstate second half of renames and new paths.
1618
self._update_basis_apply_adds(adds)
1619
# Apply in-situ changes.
1620
self._update_basis_apply_changes(changes)
1622
self._after_delta_check_parents(parents, 1)
1623
except errors.BzrError as e:
1624
self._changes_aborted = True
1625
if 'integrity error' not in str(e):
1078
# We could inline os.lstat but the common case is that
1079
# stat_value will be passed in, not read here.
1080
stat_value = self._lstat(abspath, entry)
1081
except (OSError, IOError), e:
1082
if e.errno in (errno.ENOENT, errno.EACCES,
1084
# The entry is missing, consider it gone
1627
# _get_entry raises BzrError when a request is inconsistent; we
1628
# want such errors to be shown as InconsistentDelta - and that
1629
# fits the behaviour we trigger.
1630
raise errors.InconsistentDeltaDelta(delta,
1631
"error from _get_entry. %s" % (e,))
1633
self._mark_modified(header_modified=True)
1634
self._id_index = None
1637
def _check_delta_ids_absent(self, new_ids, delta, tree_index):
1638
"""Check that none of the file_ids in new_ids are present in a tree."""
1641
id_index = self._get_id_index()
1642
for file_id in new_ids:
1643
for key in id_index.get(file_id, ()):
1644
block_i, entry_i, d_present, f_present = \
1645
self._get_block_entry_index(key[0], key[1], tree_index)
1647
# In a different tree
1649
entry = self._dirblocks[block_i][1][entry_i]
1650
if entry[0][2] != file_id:
1651
# Different file_id, so not what we want.
1653
self._raise_invalid((b"%s/%s" % key[0:2]).decode('utf8'), file_id,
1654
"This file_id is new in the delta but already present in "
1657
def _raise_invalid(self, path, file_id, reason):
1658
self._changes_aborted = True
1659
raise errors.InconsistentDelta(path, file_id, reason)
1661
def _update_basis_apply_adds(self, adds):
1662
"""Apply a sequence of adds to tree 1 during update_basis_by_delta.
1664
They may be adds, or renames that have been split into add/delete
1667
:param adds: A sequence of adds. Each add is a tuple:
1668
(None, new_path_utf8, file_id, (entry_details), real_add). real_add
1669
is False when the add is the second half of a remove-and-reinsert
1670
pair created to handle renames and deletes.
1672
# Adds are accumulated partly from renames, so can be in any input
1674
# TODO: we may want to sort in dirblocks order. That way each entry
1675
# will end up in the same directory, allowing the _get_entry
1676
# fast-path for looking up 2 items in the same dir work.
1677
adds.sort(key=lambda x: x[1])
1678
# adds is now in lexicographic order, which places all parents before
1679
# their children, so we can process it linearly.
1680
st = static_tuple.StaticTuple
1681
for old_path, new_path, file_id, new_details, real_add in adds:
1682
dirname, basename = osutils.split(new_path)
1683
entry_key = st(dirname, basename, file_id)
1684
block_index, present = self._find_block_index_from_key(entry_key)
1686
# The block where we want to put the file is not present.
1687
# However, it might have just been an empty directory. Look for
1688
# the parent in the basis-so-far before throwing an error.
1689
parent_dir, parent_base = osutils.split(dirname)
1690
parent_block_idx, parent_entry_idx, _, parent_present = \
1691
self._get_block_entry_index(parent_dir, parent_base, 1)
1692
if not parent_present:
1693
self._raise_invalid(new_path, file_id,
1694
"Unable to find block for this record."
1695
" Was the parent added?")
1696
self._ensure_block(parent_block_idx, parent_entry_idx, dirname)
1698
block = self._dirblocks[block_index][1]
1699
entry_index, present = self._find_entry_index(entry_key, block)
1701
if old_path is not None:
1702
self._raise_invalid(new_path, file_id,
1703
'considered a real add but still had old_path at %s'
1706
entry = block[entry_index]
1707
basis_kind = entry[1][1][0]
1708
if basis_kind == b'a':
1709
entry[1][1] = new_details
1710
elif basis_kind == b'r':
1711
raise NotImplementedError()
1713
self._raise_invalid(new_path, file_id,
1714
"An entry was marked as a new add"
1715
" but the basis target already existed")
1717
# The exact key was not found in the block. However, we need to
1718
# check if there is a key next to us that would have matched.
1719
# We only need to check 2 locations, because there are only 2
1721
for maybe_index in range(entry_index - 1, entry_index + 1):
1722
if maybe_index < 0 or maybe_index >= len(block):
1724
maybe_entry = block[maybe_index]
1725
if maybe_entry[0][:2] != (dirname, basename):
1726
# Just a random neighbor
1728
if maybe_entry[0][2] == file_id:
1729
raise AssertionError(
1730
'_find_entry_index didn't find a key match'
1731
' but walking the data did, for %s'
1733
basis_kind = maybe_entry[1][1][0]
1734
if basis_kind not in (b'a', b'r'):
1735
self._raise_invalid(new_path, file_id,
1736
"we have an add record for path, but the path"
1737
" is already present with another file_id %s"
1738
% (maybe_entry[0][2],))
1740
entry = (entry_key, [DirState.NULL_PARENT_DETAILS,
1742
block.insert(entry_index, entry)
1744
active_kind = entry[1][0][0]
1745
if active_kind == b'a':
1746
# The active record shows up as absent, this could be genuine,
1747
# or it could be present at some other location. We need to
1749
id_index = self._get_id_index()
1750
# The id_index may not be perfectly accurate for tree1, because
1751
# we haven't been keeping it updated. However, it should be
1752
# fine for tree0, and that gives us enough info for what we
1754
keys = id_index.get(file_id, ())
1756
block_i, entry_i, d_present, f_present = \
1757
self._get_block_entry_index(key[0], key[1], 0)
1760
active_entry = self._dirblocks[block_i][1][entry_i]
1761
if (active_entry[0][2] != file_id):
1762
# Some other file is at this path, we don't need to
1765
real_active_kind = active_entry[1][0][0]
1766
if real_active_kind in (b'a', b'r'):
1767
# We found a record, which was not *this* record,
1768
# which matches the file_id, but is not actually
1769
# present. Something seems *really* wrong.
1770
self._raise_invalid(new_path, file_id,
1771
"We found a tree0 entry that doesnt make sense")
1772
# Now, we've found a tree0 entry which matches the file_id
1773
# but is at a different location. So update them to be
1775
active_dir, active_name = active_entry[0][:2]
1777
active_path = active_dir + b'/' + active_name
1779
active_path = active_name
1780
active_entry[1][1] = st(b'r', new_path, 0, False, b'')
1781
entry[1][0] = st(b'r', active_path, 0, False, b'')
1782
elif active_kind == b'r':
1783
raise NotImplementedError()
1785
new_kind = new_details[0]
1786
if new_kind == b'd':
1787
self._ensure_block(block_index, entry_index, new_path)
1789
def _update_basis_apply_changes(self, changes):
1790
"""Apply a sequence of changes to tree 1 during update_basis_by_delta.
1792
:param adds: A sequence of changes. Each change is a tuple:
1793
(path_utf8, path_utf8, file_id, (entry_details))
1795
for old_path, new_path, file_id, new_details in changes:
1796
# the entry for this file_id must be in tree 0.
1797
entry = self._get_entry(1, file_id, new_path)
1798
if entry[0] is None or entry[1][1][0] in (b'a', b'r'):
1799
self._raise_invalid(new_path, file_id,
1800
'changed entry considered not present')
1801
entry[1][1] = new_details
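# Hypothetical item from the `changes` list handled above: the path stays
# the same but the basis (tree 1) details are replaced. The details tuple
# follows (minikind, fingerprint, size, executable, revision_id); every
# value is invented.
example_change = (b'a', b'a', b'a-file-id',
                  (b'f', b'deadbeef' * 5, 42, False, b'new-basis-revid'))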
1803
def _update_basis_apply_deletes(self, deletes):
1804
"""Apply a sequence of deletes to tree 1 during update_basis_by_delta.
1806
They may be deletes, or renames that have been split into add/delete
1809
:param deletes: A sequence of deletes. Each delete is a tuple:
1810
(old_path_utf8, new_path_utf8, file_id, None, real_delete).
1811
real_delete is True when the desired outcome is an actual deletion
1812
rather than the rename handling logic temporarily deleting a path
1813
during the replacement of a parent.
1815
null = DirState.NULL_PARENT_DETAILS
1816
for old_path, new_path, file_id, _, real_delete in deletes:
1817
if real_delete != (new_path is None):
1818
self._raise_invalid(old_path, file_id, "bad delete delta")
1819
# the entry for this file_id must be in tree 1.
1820
dirname, basename = osutils.split(old_path)
1821
block_index, entry_index, dir_present, file_present = \
1822
self._get_block_entry_index(dirname, basename, 1)
1823
if not file_present:
1824
self._raise_invalid(old_path, file_id,
1825
'basis tree does not contain removed entry')
1826
entry = self._dirblocks[block_index][1][entry_index]
1827
# The state of the entry in the 'active' WT
1828
active_kind = entry[1][0][0]
1829
if entry[0][2] != file_id:
1830
self._raise_invalid(old_path, file_id,
1831
'mismatched file_id in tree 1')
1833
old_kind = entry[1][1][0]
1834
if active_kind in b'ar':
1835
# The active tree doesn't have this file_id.
1836
# The basis tree is changing this record. If this is a
1837
# rename, then we don't want the record here at all
1838
# anymore. If it is just an in-place change, we want the
1839
# record here, but we'll add it if we need to. So we just
1841
if active_kind == b'r':
1842
active_path = entry[1][0][1]
1843
active_entry = self._get_entry(0, file_id, active_path)
1844
if active_entry[1][1][0] != b'r':
1845
self._raise_invalid(old_path, file_id,
1846
"Dirstate did not have matching rename entries")
1847
elif active_entry[1][0][0] in b'ar':
1848
self._raise_invalid(old_path, file_id,
1849
"Dirstate had a rename pointing at an inactive"
1851
active_entry[1][1] = null
1852
del self._dirblocks[block_index][1][entry_index]
1853
if old_kind == b'd':
1854
# This was a directory, and the active tree says it
1855
# doesn't exist, and now the basis tree says it doesn't
1856
# exist. Remove its dirblock if present
1858
present) = self._find_block_index_from_key(
1859
(old_path, b'', b''))
1861
dir_block = self._dirblocks[dir_block_index][1]
1863
# This entry is empty, go ahead and just remove it
1864
del self._dirblocks[dir_block_index]
1866
# There is still an active record, so just mark this
1869
block_i, entry_i, d_present, f_present = \
1870
self._get_block_entry_index(old_path, b'', 1)
1872
dir_block = self._dirblocks[block_i][1]
1873
for child_entry in dir_block:
1874
child_basis_kind = child_entry[1][1][0]
1875
if child_basis_kind not in b'ar':
1876
self._raise_invalid(old_path, file_id,
1877
"The file id was deleted but its children were "
1880
def _after_delta_check_parents(self, parents, index):
1881
"""Check that parents required by the delta are all intact.
1883
:param parents: An iterable of (path_utf8, file_id) tuples which are
1884
required to be present in tree 'index' at path_utf8 with id file_id
1886
:param index: The column in the dirstate to check for parents in.
1888
for dirname_utf8, file_id in parents:
1889
# Get the entry - this ensures that file_id, dirname_utf8 exists and
1890
# has the right file id.
1891
entry = self._get_entry(index, file_id, dirname_utf8)
1892
if entry[1] is None:
1893
self._raise_invalid(dirname_utf8.decode('utf8'),
1894
file_id, "This parent is not present.")
1895
# Parents of things must be directories
1896
if entry[1][index][0] != b'd':
1897
self._raise_invalid(dirname_utf8.decode('utf8'),
1898
file_id, "This parent is not a directory.")
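# Hypothetical contents of the `parents` accumulator that the check above
# receives: (dirname_utf8, file_id) pairs recorded while a delta was
# processed. Both ids are invented.
example_parents = {(b'', b'tree-root-id'), (b'src', b'src-dir-id')}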
1900
def _observed_sha1(self, entry, sha1, stat_value,
1901
_stat_to_minikind=_stat_to_minikind):
1902
"""Note the sha1 of a file.
1904
:param entry: The entry the sha1 is for.
1905
:param sha1: The observed sha1.
1906
:param stat_value: The os.lstat for the file.
1088
kind = osutils.file_kind_from_stat_mode(stat_value.st_mode)
1909
minikind = _stat_to_minikind[stat_value.st_mode & 0o170000]
1090
minikind = DirState._kind_to_minikind[kind]
1091
except KeyError: # Unknown kind
1913
if minikind == b'f':
1093
packed_stat = pack_stat(stat_value)
1094
(saved_minikind, saved_link_or_sha1, saved_file_size,
1095
saved_executable, saved_packed_stat) = entry[1][0]
1097
if (minikind == saved_minikind
1098
and packed_stat == saved_packed_stat
1099
# size should also be in packed_stat
1100
and saved_file_size == stat_value.st_size):
1101
# The stat hasn't changed since we saved, so we can potentially
1102
# re-use the saved sha hash.
1914
1106
if self._cutoff_time is None:
1915
1107
self._sha_cutoff_time()
1916
1109
if (stat_value.st_mtime < self._cutoff_time
1917
and stat_value.st_ctime < self._cutoff_time):
1918
entry[1][0] = (b'f', sha1, stat_value.st_size, entry[1][0][3],
1919
pack_stat(stat_value))
1920
self._mark_modified([entry])
1110
and stat_value.st_ctime < self._cutoff_time):
1111
# Return the existing fingerprint
1112
return saved_link_or_sha1
1114
# If we have gotten this far, that means that we need to actually
1115
# process this entry.
1118
link_or_sha1 = self._sha1_file(abspath, entry)
1119
executable = self._is_executable(stat_value.st_mode,
1121
entry[1][0] = ('f', link_or_sha1, stat_value.st_size,
1122
executable, packed_stat)
1123
elif minikind == 'd':
1125
entry[1][0] = ('d', '', 0, False, packed_stat)
1126
if saved_minikind != 'd':
1127
# This changed from something into a directory. Make sure we
1128
# have a directory block for it. This doesn't happen very
1129
# often, so this doesn't have to be super fast.
1130
block_index, entry_index, dir_present, file_present = \
1131
self._get_block_entry_index(entry[0][0], entry[0][1], 0)
1132
self._ensure_block(block_index, entry_index,
1133
osutils.pathjoin(entry[0][0], entry[0][1]))
1134
elif minikind == 'l':
1135
link_or_sha1 = self._read_link(abspath, saved_link_or_sha1)
1136
entry[1][0] = ('l', link_or_sha1, stat_value.st_size,
1138
self._dirblock_state = DirState.IN_MEMORY_MODIFIED
1922
1141
def _sha_cutoff_time(self):
1923
1142
"""Return cutoff time.
2329
1482
def _get_id_index(self):
2330
"""Get an id index of self._dirblocks.
2332
This maps from file_id => [(directory, name, file_id)] entries where
2333
that file_id appears in one of the trees.
1483
"""Get an id index of self._dirblocks."""
2335
1484
if self._id_index is None:
2337
1486
for key, tree_details in self._iter_entries():
2338
self._add_to_id_index(id_index, key)
1487
id_index.setdefault(key[2], set()).add(key)
2339
1488
self._id_index = id_index
2340
1489
return self._id_index
2342
def _add_to_id_index(self, id_index, entry_key):
2343
"""Add this entry to the _id_index mapping."""
2344
# This code used to use a set for every entry in the id_index. However,
2345
# it is *rare* to have more than one entry. So a set is a large
2346
# overkill. And even when we do, we won't ever have more than the
2347
# number of parent trees. Which is still a small number (rarely >2). As
2348
# such, we use a simple tuple, and do our own uniqueness checks. While
2349
# the 'in' check is O(N) since N is nicely bounded it shouldn't ever
2350
# cause quadratic failure.
2351
file_id = entry_key[2]
2352
entry_key = static_tuple.StaticTuple.from_sequence(entry_key)
2353
if file_id not in id_index:
2354
id_index[file_id] = static_tuple.StaticTuple(entry_key,)
2356
entry_keys = id_index[file_id]
2357
if entry_key not in entry_keys:
2358
id_index[file_id] = entry_keys + (entry_key,)
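# Illustrative shape of the id_index after file id b'a-id' was renamed from
# 'a' to 'b' while tree 1 still records the old key. The real values are
# StaticTuples; plain tuples are shown for readability.
example_id_index = {
    b'a-id': ((b'', b'a', b'a-id'), (b'', b'b', b'a-id')),
    b'c-id': ((b'', b'c', b'c-id'),),
}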
2360
def _remove_from_id_index(self, id_index, entry_key):
2361
"""Remove this entry from the _id_index mapping.
2363
It is a programming error to call this when the entry_key is not
2366
file_id = entry_key[2]
2367
entry_keys = list(id_index[file_id])
2368
entry_keys.remove(entry_key)
2369
id_index[file_id] = static_tuple.StaticTuple.from_sequence(entry_keys)
2371
1491
def _get_output_lines(self, lines):
2372
"""Format lines for final output.
1492
"""format lines for final output.
2374
:param lines: A sequence of lines containing the parents list and the
1494
:param lines: A sequence of lines containing the parents list and the
2377
1497
output_lines = [DirState.HEADER_FORMAT_3]
2378
lines.append(b'') # a final newline
2379
inventory_text = b'\0\n\0'.join(lines)
2380
output_lines.append(b'crc32: %d\n' % (zlib.crc32(inventory_text),))
1498
lines.append('') # a final newline
1499
inventory_text = '\0\n\0'.join(lines)
1500
output_lines.append('crc32: %s\n' % (zlib.crc32(inventory_text),))
2381
1501
# -3, 1 for num parents, 1 for ghosts, 1 for final newline
2382
num_entries = len(lines) - 3
2383
output_lines.append(b'num_entries: %d\n' % (num_entries,))
1502
num_entries = len(lines)-3
1503
output_lines.append('num_entries: %s\n' % (num_entries,))
2384
1504
output_lines.append(inventory_text)
2385
1505
return output_lines
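# Hedged sketch (not the library's reader): verify the crc32 and num_entries
# lines emitted above, given the full file contents laid out as described in
# the module docstring. Writers running under Python 2 could emit a negative
# crc32, hence the masking on both sides.
import zlib

def verify_dirstate_prelude(data):
    header, crc_line, count_line, body = data.split(b'\n', 3)
    if not crc_line.startswith(b'crc32: '):
        raise ValueError('missing crc32 line')
    if not count_line.startswith(b'num_entries: '):
        raise ValueError('missing num_entries line')
    expected_crc = int(crc_line[len(b'crc32: '):]) & 0xffffffff
    if zlib.crc32(body) & 0xffffffff != expected_crc:
        raise ValueError('dirstate checksum mismatch')
    return int(count_line[len(b'num_entries: '):])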
2387
1507
def _make_deleted_row(self, fileid_utf8, parents):
2388
"""Return a deleted row for fileid_utf8."""
2389
return (b'/', b'RECYCLED.BIN', b'file', fileid_utf8, 0, DirState.NULLSTAT,
1508
"""Return a deleted for for fileid_utf8."""
1509
return ('/', 'RECYCLED.BIN', 'file', fileid_utf8, 0, DirState.NULLSTAT,
2392
1512
def _num_present_parents(self):
2393
1513
"""The number of parent entries in each record row."""
2394
1514
return len(self._parents) - len(self._ghosts)
2397
def on_file(cls, path, sha1_provider=None, worth_saving_limit=0,
2398
use_filesystem_for_exec=True):
2399
"""Construct a DirState on the file at path "path".
1518
"""Construct a DirState on the file at path path.
2401
:param path: The path at which the dirstate file on disk should live.
2402
:param sha1_provider: an object meeting the SHA1Provider interface.
2403
If None, a DefaultSHA1Provider is used.
2404
:param worth_saving_limit: when the exact number of hash changed
2405
entries is known, only bother saving the dirstate if more than
2406
this count of entries have changed. -1 means never save.
2407
:param use_filesystem_for_exec: Whether to trust the filesystem
2408
for executable bit information
2409
1520
:return: An unlocked DirState object, associated with the given path.
2411
if sha1_provider is None:
2412
sha1_provider = DefaultSHA1Provider()
2413
result = cls(path, sha1_provider,
2414
worth_saving_limit=worth_saving_limit,
2415
use_filesystem_for_exec=use_filesystem_for_exec)
1522
result = DirState(path)
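# Hedged usage sketch for the constructor above (the path is illustrative;
# lock_read/unlock are assumed from DirState's locking API, and
# _iter_entries is the iterator used elsewhere in this module).
state = DirState.on_file('/path/to/branch/.bzr/checkout/dirstate')
state.lock_read()
try:
    for key, tree_details in state._iter_entries():
        dirname, basename, file_id = key
        current_minikind = tree_details[0][0]   # minikind in tree 0
finally:
    state.unlock()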
2418
1525
def _read_dirblocks_if_needed(self):
2419
1526
"""Read in all the dirblocks from the file if they are not in memory.
2421
1528
This populates self._dirblocks, and sets self._dirblock_state to
2422
1529
IN_MEMORY_UNMODIFIED. It is not currently ready for incremental block
2425
1532
self._read_header_if_needed()
2426
1533
if self._dirblock_state == DirState.NOT_IN_MEMORY:
2427
_read_dirblocks(self)
1534
# move the _state_file pointer to after the header (in case bisect
1535
# has been called in the mean time)
1536
self._state_file.seek(self._end_of_header)
1537
text = self._state_file.read()
1538
# TODO: check the crc checksums. crc_measured = zlib.crc32(text)
1540
fields = text.split('\0')
1541
# Remove the last blank entry
1542
trailing = fields.pop()
1543
assert trailing == ''
1544
# consider turning fields into a tuple.
1546
# skip the first field which is the trailing null from the header.
1548
# Each line now has an extra '\n' field which is not used
1549
# so we just skip over it
1551
# 3 fields for the key
1552
# + number of fields per tree_data (5) * tree count
1554
num_present_parents = self._num_present_parents()
1555
tree_count = 1 + num_present_parents
1556
entry_size = self._fields_per_entry()
1557
expected_field_count = entry_size * self._num_entries
1558
field_count = len(fields)
1559
# this checks our adjustment, and also catches file too short.
1560
assert field_count - cur == expected_field_count, \
1561
'field count incorrect %s != %s, entry_size=%s, '\
1562
'num_entries=%s fields=%r' % (
1563
field_count - cur, expected_field_count, entry_size,
1564
self._num_entries, fields)
1566
if num_present_parents == 1:
1567
# Bind external functions to local names
1569
# We access all fields in order, so we can just iterate over
1570
# them. Grab a straight iterator over the fields. (We use an
1571
# iterator because we don't want to do a lot of additions, nor
1572
# do we want to do a lot of slicing)
1573
next = iter(fields).next
1574
# Move the iterator to the current position
1575
for x in xrange(cur):
1577
# The two blocks here are deliberate: the root block and the
1578
# contents-of-root block.
1579
self._dirblocks = [('', []), ('', [])]
1580
current_block = self._dirblocks[0][1]
1581
current_dirname = ''
1582
append_entry = current_block.append
1583
for count in xrange(self._num_entries):
1587
if dirname != current_dirname:
1588
# new block - different dirname
1590
current_dirname = dirname
1591
self._dirblocks.append((current_dirname, current_block))
1592
append_entry = current_block.append
1593
# we know current_dirname == dirname, so re-use it to avoid
1594
# creating new strings
1595
entry = ((current_dirname, name, file_id),
1598
next(), # fingerprint
1599
_int(next()), # size
1600
next() == 'y', # executable
1601
next(), # packed_stat or revision_id
1605
next(), # fingerprint
1606
_int(next()), # size
1607
next() == 'y', # executable
1608
next(), # packed_stat or revision_id
1612
assert trailing == '\n'
1613
# append the entry to the current block
1615
self._split_root_dirblock_into_contents()
1617
fields_to_entry = self._get_fields_to_entry()
1618
entries = [fields_to_entry(fields[pos:pos+entry_size])
1619
for pos in xrange(cur, field_count, entry_size)]
1620
self._entries_to_current_state(entries)
1621
# To convert from format 2 => format 3
1622
# self._dirblocks = sorted(self._dirblocks,
1623
# key=lambda blk:blk[0].split('/'))
1624
# To convert from format 3 => format 2
1625
# self._dirblocks = sorted(self._dirblocks)
1626
self._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
2429
1628
def _read_header(self):
2430
1629
"""This reads in the metadata header, and the parent ids.
3422
2325
self._split_path_cache = {}
3424
2327
def _requires_lock(self):
3425
"""Check that a lock is currently held by someone on the dirstate."""
2328
"""Checks that a lock is currently held by someone on the dirstate"""
3426
2329
if not self._lock_token:
3427
2330
raise errors.ObjectNotLocked(self)
3430
def py_update_entry(state, entry, abspath, stat_value,
3431
_stat_to_minikind=DirState._stat_to_minikind):
3432
"""Update the entry based on what is actually on disk.
3434
This function only calculates the sha if it needs to - if the entry is
3435
uncachable, or clearly different to the first parent's entry, no sha
3436
is calculated, and None is returned.
3438
:param state: The dirstate this entry is in.
3439
:param entry: This is the dirblock entry for the file in question.
3440
:param abspath: The path on disk for this file.
3441
:param stat_value: The stat value done on the path.
3442
:return: None, or The sha1 hexdigest of the file (40 bytes) or link
3443
target of a symlink.
2333
def bisect_dirblock(dirblocks, dirname, lo=0, hi=None, cache={}):
2334
"""Return the index where to insert dirname into the dirblocks.
2336
The return value idx is such that all directories blocks in dirblock[:idx]
2337
have names < dirname, and all blocks in dirblock[idx:] have names >=
2340
Optional args lo (default 0) and hi (default len(dirblocks)) bound the
2341
slice of a to be searched.
3446
minikind = _stat_to_minikind[stat_value.st_mode & 0o170000]
2346
dirname_split = cache[dirname]
3447
2347
except KeyError:
3450
packed_stat = pack_stat(stat_value)
3451
(saved_minikind, saved_link_or_sha1, saved_file_size,
3452
saved_executable, saved_packed_stat) = entry[1][0]
3453
if not isinstance(saved_minikind, bytes):
3454
raise TypeError(saved_minikind)
3456
if minikind == b'd' and saved_minikind == b't':
3458
if (minikind == saved_minikind
3459
and packed_stat == saved_packed_stat):
3460
# The stat hasn't changed since we saved, so we can re-use the
3462
if minikind == b'd':
3465
# size should also be in packed_stat
3466
if saved_file_size == stat_value.st_size:
3467
return saved_link_or_sha1
3469
# If we have gotten this far, that means that we need to actually
3470
# process this entry.
3473
if minikind == b'f':
3474
executable = state._is_executable(stat_value.st_mode,
3476
if state._cutoff_time is None:
3477
state._sha_cutoff_time()
3478
if (stat_value.st_mtime < state._cutoff_time
3479
and stat_value.st_ctime < state._cutoff_time
3480
and len(entry[1]) > 1
3481
and entry[1][1][0] != b'a'):
3482
# Could check for size changes for further optimised
3483
# avoidance of sha1's. However the most prominent case of
3484
# over-shaing is during initial add, which this catches.
3485
# Besides, if content filtering happens, size and sha
3486
# are calculated at the same time, so checking just the size
3487
# gains nothing w.r.t. performance.
3488
link_or_sha1 = state._sha1_file(abspath)
3489
entry[1][0] = (b'f', link_or_sha1, stat_value.st_size,
3490
executable, packed_stat)
3492
entry[1][0] = (b'f', b'', stat_value.st_size,
3493
executable, DirState.NULLSTAT)
3494
worth_saving = False
3495
elif minikind == b'd':
3497
entry[1][0] = (b'd', b'', 0, False, packed_stat)
3498
if saved_minikind != b'd':
3499
# This changed from something into a directory. Make sure we
3500
# have a directory block for it. This doesn't happen very
3501
# often, so this doesn't have to be super fast.
3502
block_index, entry_index, dir_present, file_present = \
3503
state._get_block_entry_index(entry[0][0], entry[0][1], 0)
3504
state._ensure_block(block_index, entry_index,
3505
osutils.pathjoin(entry[0][0], entry[0][1]))
3507
worth_saving = False
3508
elif minikind == b'l':
3509
if saved_minikind == b'l':
3510
worth_saving = False
3511
link_or_sha1 = state._read_link(abspath, saved_link_or_sha1)
3512
if state._cutoff_time is None:
3513
state._sha_cutoff_time()
3514
if (stat_value.st_mtime < state._cutoff_time
3515
and stat_value.st_ctime < state._cutoff_time):
3516
entry[1][0] = (b'l', link_or_sha1, stat_value.st_size,
3519
entry[1][0] = (b'l', b'', stat_value.st_size,
3520
False, DirState.NULLSTAT)
3522
state._mark_modified([entry])
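# Hedged restatement of the cache-safety rule applied above: a stat result is
# only trusted for caching a fingerprint when both of its timestamps are
# strictly older than the cutoff computed by _sha_cutoff_time().
def _is_cacheable_stat(stat_value, cutoff_time):
    return (stat_value.st_mtime < cutoff_time
            and stat_value.st_ctime < cutoff_time)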
3526
class ProcessEntryPython(object):
3528
__slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
3529
"last_source_parent", "last_target_parent", "include_unchanged",
3530
"partial", "use_filesystem_for_exec", "utf8_decode",
3531
"searched_specific_files", "search_specific_files",
3532
"searched_exact_paths", "search_specific_file_parents", "seen_ids",
3533
"state", "source_index", "target_index", "want_unversioned", "tree"]
3535
def __init__(self, include_unchanged, use_filesystem_for_exec,
3536
search_specific_files, state, source_index, target_index,
3537
want_unversioned, tree):
3538
self.old_dirname_to_file_id = {}
3539
self.new_dirname_to_file_id = {}
3540
# Are we doing a partial iter_changes?
3541
self.partial = search_specific_files != {''}
3542
# Using a list so that we can access the values and change them in
3543
# nested scope. Each one is [path, file_id, entry]
3544
self.last_source_parent = [None, None]
3545
self.last_target_parent = [None, None]
3546
self.include_unchanged = include_unchanged
3547
self.use_filesystem_for_exec = use_filesystem_for_exec
3548
self.utf8_decode = cache_utf8._utf8_decode
3549
# for all search indexes in each path at or under each element of
3550
# search_specific_files, if the detail is relocated: add the id, and
3551
# add the relocated path as one to search if it's not searched already.
3552
# If the detail is not relocated, add the id.
3553
self.searched_specific_files = set()
3554
# When we search exact paths without expanding downwards, we record
3556
self.searched_exact_paths = set()
3557
self.search_specific_files = search_specific_files
3558
# The parents up to the root of the paths we are searching.
3559
# After all normal paths are returned, these specific items are returned.
3560
self.search_specific_file_parents = set()
3561
# The ids we've sent out in the delta.
3562
self.seen_ids = set()
3564
self.source_index = source_index
3565
self.target_index = target_index
3566
if target_index != 0:
3567
# A lot of code in here depends on target_index == 0
3568
raise errors.BzrError('unsupported target index')
3569
self.want_unversioned = want_unversioned
3572
def _process_entry(self, entry, path_info, pathjoin=osutils.pathjoin):
3573
"""Compare an entry and real disk to generate delta information.
3575
:param path_info: top_relpath, basename, kind, lstat, abspath for
3576
the path of entry. If None, then the path is considered absent in
3577
the target (Perhaps we should pass in a concrete entry for this ?)
3578
Basename is returned as a utf8 string because we expect this
3579
tuple will be ignored, and don't want to take the time to
3581
:return: (iter_changes_result, changed). If the entry has not been
3582
handled then changed is None. Otherwise it is False if no content
3583
or metadata changes have occurred, and True if any content or
3584
metadata change has occurred. If self.include_unchanged is True then
3585
if changed is not None, iter_changes_result will always be a result
3586
tuple. Otherwise, iter_changes_result is None unless changed is
3589
if self.source_index is None:
3590
source_details = DirState.NULL_PARENT_DETAILS
3592
source_details = entry[1][self.source_index]
3593
# GZ 2017-06-09: Eck, more sets.
3594
_fdltr = {b'f', b'd', b'l', b't', b'r'}
3595
_fdlt = {b'f', b'd', b'l', b't'}
3597
target_details = entry[1][self.target_index]
3598
target_minikind = target_details[0]
3599
if path_info is not None and target_minikind in _fdlt:
3600
if not (self.target_index == 0):
3601
raise AssertionError()
3602
link_or_sha1 = update_entry(self.state, entry,
3603
abspath=path_info[4], stat_value=path_info[3])
3604
# The entry may have been modified by update_entry
3605
target_details = entry[1][self.target_index]
3606
target_minikind = target_details[0]
3609
file_id = entry[0][2]
3610
source_minikind = source_details[0]
3611
if source_minikind in _fdltr and target_minikind in _fdlt:
3612
# claimed content in both: diff
3613
# r | fdlt | | add source to search, add id path move and perform
3614
# | | | diff check on source-target
3615
# r | fdlt | a | dangling file that was present in the basis.
3617
if source_minikind == b'r':
3618
# add the source to the search path to find any children it
3619
# has. TODO ? : only add if it is a container ?
3620
if not osutils.is_inside_any(self.searched_specific_files,
3622
self.search_specific_files.add(source_details[1])
3623
# generate the old path; this is needed for stating later
3625
old_path = source_details[1]
3626
old_dirname, old_basename = os.path.split(old_path)
3627
path = pathjoin(entry[0][0], entry[0][1])
3628
old_entry = self.state._get_entry(self.source_index,
3630
# update the source details variable to be the real
3632
if old_entry == (None, None):
3633
raise DirstateCorrupt(self.state._filename,
3634
"entry '%s/%s' is considered renamed from %r"
3635
" but source does not exist\n"
3636
"entry: %s" % (entry[0][0], entry[0][1], old_path, entry))
3637
source_details = old_entry[1][self.source_index]
3638
source_minikind = source_details[0]
3640
old_dirname = entry[0][0]
3641
old_basename = entry[0][1]
3642
old_path = path = None
3643
if path_info is None:
3644
# the file is missing on disk, show as removed.
3645
content_change = True
3649
# source and target are both versioned and disk file is present.
3650
target_kind = path_info[2]
3651
if target_kind == 'directory':
3653
old_path = path = pathjoin(old_dirname, old_basename)
3654
self.new_dirname_to_file_id[path] = file_id
3655
if source_minikind != b'd':
3656
content_change = True
3658
# directories have no fingerprint
3659
content_change = False
3661
elif target_kind == 'file':
3662
if source_minikind != b'f':
3663
content_change = True
3665
# Check the sha. We can't just rely on the size as
3666
# content filtering may mean different sizes actually
3667
# map to the same content
3668
if link_or_sha1 is None:
3670
statvalue, link_or_sha1 = \
3671
self.state._sha1_provider.stat_and_sha1(
3673
self.state._observed_sha1(entry, link_or_sha1,
3675
content_change = (link_or_sha1 != source_details[1])
3676
# Target details is updated at update_entry time
3677
if self.use_filesystem_for_exec:
3678
# We don't need S_ISREG here, because we are sure
3679
# we are dealing with a file.
3680
target_exec = bool(stat.S_IEXEC & path_info[3].st_mode)
3682
target_exec = target_details[3]
3683
elif target_kind == 'symlink':
3684
if source_minikind != b'l':
3685
content_change = True
3687
content_change = (link_or_sha1 != source_details[1])
3689
elif target_kind == 'tree-reference':
3690
if source_minikind != b't':
3691
content_change = True
3693
content_change = False
3697
path = pathjoin(old_dirname, old_basename)
3698
raise errors.BadFileKindError(path, path_info[2])
3699
if source_minikind == b'd':
3701
old_path = path = pathjoin(old_dirname, old_basename)
3702
self.old_dirname_to_file_id[old_path] = file_id
3703
# parent id is the entry for the path in the target tree
3704
if old_basename and old_dirname == self.last_source_parent[0]:
3705
source_parent_id = self.last_source_parent[1]
3708
source_parent_id = self.old_dirname_to_file_id[old_dirname]
3710
source_parent_entry = self.state._get_entry(self.source_index,
3711
path_utf8=old_dirname)
3712
source_parent_id = source_parent_entry[0][2]
3713
if source_parent_id == entry[0][2]:
3714
# This is the root, so the parent is None
3715
source_parent_id = None
3717
self.last_source_parent[0] = old_dirname
3718
self.last_source_parent[1] = source_parent_id
3719
new_dirname = entry[0][0]
3720
if entry[0][1] and new_dirname == self.last_target_parent[0]:
3721
target_parent_id = self.last_target_parent[1]
3724
target_parent_id = self.new_dirname_to_file_id[new_dirname]
3726
# TODO: We don't always need to do the lookup, because the
3727
# parent entry will be the same as the source entry.
3728
target_parent_entry = self.state._get_entry(self.target_index,
3729
path_utf8=new_dirname)
3730
if target_parent_entry == (None, None):
3731
raise AssertionError(
3732
"Could not find target parent in wt: %s\nparent of: %s"
3733
% (new_dirname, entry))
3734
target_parent_id = target_parent_entry[0][2]
3735
if target_parent_id == entry[0][2]:
3736
# This is the root, so the parent is None
3737
target_parent_id = None
3739
self.last_target_parent[0] = new_dirname
3740
self.last_target_parent[1] = target_parent_id
3742
source_exec = source_details[3]
3743
changed = (content_change
3744
or source_parent_id != target_parent_id
3745
or old_basename != entry[0][1]
3746
or source_exec != target_exec
3748
if not changed and not self.include_unchanged:
3751
if old_path is None:
3752
old_path = path = pathjoin(old_dirname, old_basename)
3753
old_path_u = self.utf8_decode(old_path)[0]
3756
old_path_u = self.utf8_decode(old_path)[0]
3757
if old_path == path:
3760
path_u = self.utf8_decode(path)[0]
3761
source_kind = DirState._minikind_to_kind[source_minikind]
3764
(old_path_u, path_u),
3767
(source_parent_id, target_parent_id),
3768
(self.utf8_decode(old_basename)[
3769
0], self.utf8_decode(entry[0][1])[0]),
3770
(source_kind, target_kind),
3771
(source_exec, target_exec)), changed
3772
elif source_minikind in b'a' and target_minikind in _fdlt:
3773
# looks like a new file
3774
path = pathjoin(entry[0][0], entry[0][1])
3775
# parent id is the entry for the path in the target tree
3776
# TODO: these are the same for an entire directory: cache em.
3777
parent_id = self.state._get_entry(self.target_index,
3778
path_utf8=entry[0][0])[0][2]
3779
if parent_id == entry[0][2]:
3781
if path_info is not None:
3783
if self.use_filesystem_for_exec:
3784
# We need S_ISREG here, because we aren't sure if this
3787
stat.S_ISREG(path_info[3].st_mode)
3788
and stat.S_IEXEC & path_info[3].st_mode)
3790
target_exec = target_details[3]
3793
(None, self.utf8_decode(path)[0]),
3797
(None, self.utf8_decode(entry[0][1])[0]),
3798
(None, path_info[2]),
3799
(None, target_exec)), True
3801
# It's a missing file, report it as such.
3804
(None, self.utf8_decode(path)[0]),
3808
(None, self.utf8_decode(entry[0][1])[0]),
3810
(None, False)), True
3811
elif source_minikind in _fdlt and target_minikind in b'a':
3812
# unversioned, possibly, or possibly not deleted: we don't care.
3813
# if it's still on disk, *and* there's no other entry at this
3814
# path [we don't know this in this routine at the moment -
3815
# perhaps we should change this - then it would be an unknown.
3816
old_path = pathjoin(entry[0][0], entry[0][1])
3817
# parent id is the entry for the path in the target tree
3818
parent_id = self.state._get_entry(
3819
self.source_index, path_utf8=entry[0][0])[0][2]
3820
if parent_id == entry[0][2]:
3824
(self.utf8_decode(old_path)[0], None),
3828
(self.utf8_decode(entry[0][1])[0], None),
3829
(DirState._minikind_to_kind[source_minikind], None),
3830
(source_details[3], None)), True
3831
elif source_minikind in _fdlt and target_minikind in b'r':
3832
# a rename; could be a true rename, or a rename inherited from
3833
# a renamed parent. TODO: handle this efficiently. It's not a
3834
# common case to rename dirs though, so a correct but slow
3835
# implementation will do.
3836
if not osutils.is_inside_any(self.searched_specific_files,
3838
self.search_specific_files.add(target_details[1])
3839
elif source_minikind in _ra and target_minikind in _ra:
3840
# neither of the selected trees contain this file,
3841
# so skip over it. This is not currently directly tested, but
3842
# is indirectly via test_too_much.TestCommands.test_conflicts.
3845
raise AssertionError("don't know how to compare "
3846
"source_minikind=%r, target_minikind=%r"
3847
% (source_minikind, target_minikind))
3853
def _gather_result_for_consistency(self, result):
3854
"""Check a result we will yield to make sure we are consistent later.
3856
This gathers result's parents into a set to output later.
3858
:param result: A result tuple.
3860
if not self.partial or not result.file_id:
3862
self.seen_ids.add(result.file_id)
3863
new_path = result.path[1]
3865
# Not the root and not a delete: queue up the parents of the path.
3866
self.search_specific_file_parents.update(
3867
p.encode('utf8') for p in osutils.parent_directories(new_path))
3868
# Add the root directory which parent_directories does not
3870
self.search_specific_file_parents.add(b'')
3872
def iter_changes(self):
3873
"""Iterate over the changes."""
3874
utf8_decode = cache_utf8._utf8_decode
3875
_lt_by_dirs = lt_by_dirs
3876
_process_entry = self._process_entry
3877
search_specific_files = self.search_specific_files
3878
searched_specific_files = self.searched_specific_files
3879
splitpath = osutils.splitpath
3881
# compare source_index and target_index at or under each element of search_specific_files.
3882
# follow the following comparison table. Note that we only want to do diff operations when
3883
# the target is fdl because that's when the walkdirs logic will have exposed the pathinfo
3887
# Source | Target | disk | action
3888
# r | fdlt | | add source to search, add id path move and perform
3889
# | | | diff check on source-target
3890
# r | fdlt | a | dangling file that was present in the basis.
3892
# r | a | | add source to search
3894
# r | r | | this path is present in a non-examined tree, skip.
3895
# r | r | a | this path is present in a non-examined tree, skip.
3896
# a | fdlt | | add new id
3897
# a | fdlt | a | dangling locally added file, skip
3898
# a | a | | not present in either tree, skip
3899
# a | a | a | not present in any tree, skip
3900
# a | r | | not present in either tree at this path, skip as it
3901
# | | | may not be selected by the users list of paths.
3902
# a | r | a | not present in either tree at this path, skip as it
3903
# | | | may not be selected by the users list of paths.
3904
# fdlt | fdlt | | content in both: diff them
3905
# fdlt | fdlt | a | deleted locally, but not unversioned - show as deleted ?
3906
# fdlt | a | | unversioned: output deleted id for now
3907
# fdlt | a | a | unversioned and deleted: output deleted id
3908
# fdlt | r | | relocated in this tree, so add target to search.
3909
# | | | Dont diff, we will see an r,fd; pair when we reach
3910
# | | | this id at the other path.
3911
# fdlt | r | a | relocated in this tree, so add target to search.
3912
# | | | Dont diff, we will see an r,fd; pair when we reach
3913
# | | | this id at the other path.
3915
# TODO: jam 20070516 - Avoid the _get_entry lookup overhead by
3916
# keeping a cache of directories that we have seen.
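        # Loop structure, for orientation: the outer ``while`` below walks
        # one search root at a time; for each root a dirstate cursor
        # (current_block / current_entry) and an on-disk cursor
        # (current_dir_info / current_path_info, from osutils._walkdirs_utf8)
        # are advanced together, and _process_entry pairs them up.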
        while search_specific_files:
            # TODO: the pending list should be lexically sorted? the
            # interface doesn't require it.
            current_root = search_specific_files.pop()
            current_root_unicode = current_root.decode('utf8')
            searched_specific_files.add(current_root)
            # process the entries for this containing directory: the rest will be
            # found by their parents recursively.
            root_entries = self.state._entries_for_path(current_root)
            root_abspath = self.tree.abspath(current_root_unicode)
            try:
                root_stat = os.lstat(root_abspath)
            except OSError as e:
                if e.errno == errno.ENOENT:
                    # the path does not exist: let _process_entry know that.
                    root_dir_info = None
                else:
                    # some other random error: hand it up.
                    raise
            else:
                root_dir_info = (b'', current_root,
                                 osutils.file_kind_from_stat_mode(
                                     root_stat.st_mode), root_stat,
                                 root_abspath)
                if root_dir_info[2] == 'directory':
                    if self.tree._directory_is_tree_reference(
                            current_root.decode('utf8')):
                        root_dir_info = root_dir_info[:2] + \
                            ('tree-reference',) + root_dir_info[3:]

            if not root_entries and not root_dir_info:
                # this specified path is not present at all, skip it.
                continue
            path_handled = False
            for entry in root_entries:
                result, changed = _process_entry(entry, root_dir_info)
                if changed is not None:
                    path_handled = True
                    if changed:
                        self._gather_result_for_consistency(result)
                    if changed or self.include_unchanged:
                        yield result
            if self.want_unversioned and not path_handled and root_dir_info:
                new_executable = bool(
                    stat.S_ISREG(root_dir_info[3].st_mode)
                    and stat.S_IEXEC & root_dir_info[3].st_mode)
                yield TreeChange(
                    None,
                    (None, current_root_unicode),
                    True,
                    (False, False),
                    (None, None),
                    (None, splitpath(current_root_unicode)[-1]),
                    (None, root_dir_info[2]),
                    (None, new_executable)
                    )
            initial_key = (current_root, b'', b'')
            block_index, _ = self.state._find_block_index_from_key(initial_key)
            if block_index == 0:
                # we have processed the total root already, but because the
                # initial key matched it we should skip it here.
                block_index += 1
            if root_dir_info and root_dir_info[2] == 'tree-reference':
                current_dir_info = None
            else:
                dir_iterator = osutils._walkdirs_utf8(
                    root_abspath, prefix=current_root)
                try:
                    current_dir_info = next(dir_iterator)
                except OSError as e:
                    # on win32, python2.4 has e.errno == ERROR_DIRECTORY, but
                    # python 2.5 has e.errno == EINVAL,
                    #            and e.winerror == ERROR_DIRECTORY
                    e_winerror = getattr(e, 'winerror', None)
                    win_errors = (ERROR_DIRECTORY, ERROR_PATH_NOT_FOUND)
                    # there may be directories in the inventory even though
                    # this path is not a file on disk: so mark it as end of
                    # iterator
                    if e.errno in (errno.ENOENT, errno.ENOTDIR, errno.EINVAL):
                        current_dir_info = None
                    elif (sys.platform == 'win32'
                          and (e.errno in win_errors or
                               e_winerror in win_errors)):
                        current_dir_info = None
                    else:
                        raise
                else:
                    if current_dir_info[0][0] == b'':
                        # remove .bzr from iteration
                        bzr_index = bisect.bisect_left(
                            current_dir_info[1], (b'.bzr',))
                        if current_dir_info[1][bzr_index][0] != b'.bzr':
                            raise AssertionError()
                        del current_dir_info[1][bzr_index]
            # walk until both the directory listing and the versioned metadata
            # are exhausted.
            if (block_index < len(self.state._dirblocks) and
                    osutils.is_inside(current_root,
                                      self.state._dirblocks[block_index][0])):
                current_block = self.state._dirblocks[block_index]
            else:
                current_block = None
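            # The two cursors are merged in directory order: whichever of the
            # on-disk listing (current_dir_info) and the dirstate block
            # (current_block) sorts first (per _lt_by_dirs) is consumed
            # first, so unversioned directories and directories removed from
            # disk are each handled exactly once.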
            while (current_dir_info is not None or
                   current_block is not None):
                if (current_dir_info and current_block
                        and current_dir_info[0][0] != current_block[0]):
                    if _lt_by_dirs(current_dir_info[0][0], current_block[0]):
                        # filesystem data refers to paths not covered by the dirblock.
                        # this has two possibilities:
                        # A) it is versioned but empty, so there is no block for it
                        # B) it is not versioned.

                        # if (A) then we need to recurse into it to check for
                        # new unknown files or directories.
                        # if (B) then we should ignore it, because we don't
                        # recurse into unknown directories.
                        path_index = 0
                        while path_index < len(current_dir_info[1]):
                            current_path_info = current_dir_info[1][path_index]
                            if self.want_unversioned:
                                if current_path_info[2] == 'directory':
                                    if self.tree._directory_is_tree_reference(
                                            current_path_info[0].decode('utf8')):
                                        current_path_info = current_path_info[:2] + \
                                            ('tree-reference',) + \
                                            current_path_info[3:]
                                new_executable = bool(
                                    stat.S_ISREG(current_path_info[3].st_mode)
                                    and stat.S_IEXEC & current_path_info[3].st_mode)
                                yield TreeChange(
                                    None,
                                    (None, utf8_decode(current_path_info[0])[0]),
                                    True,
                                    (False, False),
                                    (None, None),
                                    (None, utf8_decode(current_path_info[1])[0]),
                                    (None, current_path_info[2]),
                                    (None, new_executable))
                            # don't descend into this unversioned path if it is
                            # a dir
                            if current_path_info[2] in ('directory',
                                                        'tree-reference'):
                                del current_dir_info[1][path_index]
                                path_index -= 1
                            path_index += 1

                        # This dir info has been handled, go to the next
                        try:
                            current_dir_info = next(dir_iterator)
                        except StopIteration:
                            current_dir_info = None
                    else:
                        # We have a dirblock entry for this location, but there
                        # is no filesystem path for this. This is most likely
                        # because a directory was removed from the disk.
                        # We don't have to report the missing directory,
                        # because that should have already been handled, but we
                        # need to handle all of the files that are contained
                        # within.
                        for current_entry in current_block[1]:
                            # entry referring to file not present on disk.
                            # advance the entry only, after processing.
                            result, changed = _process_entry(
                                current_entry, None)
                            if changed is not None:
                                if changed:
                                    self._gather_result_for_consistency(result)
                                if changed or self.include_unchanged:
                                    yield result
                        block_index += 1
                        if (block_index < len(self.state._dirblocks) and
                                osutils.is_inside(current_root,
                                                  self.state._dirblocks[block_index][0])):
                            current_block = self.state._dirblocks[block_index]
                        else:
                            current_block = None
                    continue
                entry_index = 0
                if current_block and entry_index < len(current_block[1]):
                    current_entry = current_block[1][entry_index]
                else:
                    current_entry = None
                advance_entry = True
                path_index = 0
                if current_dir_info and path_index < len(current_dir_info[1]):
                    current_path_info = current_dir_info[1][path_index]
                    if current_path_info[2] == 'directory':
                        if self.tree._directory_is_tree_reference(
                                current_path_info[0].decode('utf8')):
                            current_path_info = current_path_info[:2] + \
                                ('tree-reference',) + current_path_info[3:]
                else:
                    current_path_info = None
                advance_path = True
                path_handled = False
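                # The innermost loop below pairs individual dirstate entries
                # (current_entry) with individual disk names
                # (current_path_info) for this directory, in basename order.
                # advance_entry / advance_path say which cursor moves next,
                # and path_handled records whether some entry matched the
                # disk path, so unmatched paths can be reported as
                # unversioned when the path cursor advances.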
                while (current_entry is not None or
                       current_path_info is not None):
                    if current_entry is None:
                        # the check for path_handled when the path is advanced
                        # will yield this path if needed.
                        pass
                    elif current_path_info is None:
                        # no path is fine: the per entry code will handle it.
                        result, changed = _process_entry(
                            current_entry, current_path_info)
                        if changed is not None:
                            if changed:
                                self._gather_result_for_consistency(result)
                            if changed or self.include_unchanged:
                                yield result
                    elif (current_entry[0][1] != current_path_info[1]
                          or current_entry[1][self.target_index][0] in (b'a', b'r')):
                        # The current path on disk doesn't match the dirblock
                        # record. Either the dirblock is marked as absent, or
                        # the file on disk is not present at all in the
                        # dirblock. Either way, report about the dirblock
                        # entry, and let other code handle the filesystem one.

                        # Compare the basename for these files to determine
                        # which comes first
                        if current_path_info[1] < current_entry[0][1]:
                            # extra file on disk: pass for now, but only
                            # increment the path, not the entry
                            advance_entry = False
                        else:
                            # entry referring to file not present on disk.
                            # advance the entry only, after processing.
                            result, changed = _process_entry(
                                current_entry, None)
                            if changed is not None:
                                if changed:
                                    self._gather_result_for_consistency(result)
                                if changed or self.include_unchanged:
                                    yield result
                            advance_path = False
                    else:
                        result, changed = _process_entry(
                            current_entry, current_path_info)
                        if changed is not None:
                            path_handled = True
                            if changed:
                                self._gather_result_for_consistency(result)
                            if changed or self.include_unchanged:
                                yield result
                    if advance_entry and current_entry is not None:
                        entry_index += 1
                        if entry_index < len(current_block[1]):
                            current_entry = current_block[1][entry_index]
                        else:
                            current_entry = None
                    else:
                        advance_entry = True  # reset the advance flag
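                    # When the path cursor advances: a disk path that no
                    # entry claimed (path_handled is False) is reported as
                    # unversioned if requested, and unversioned directories
                    # and tree references are pruned from the walkdirs data
                    # so the iterator does not descend into them.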
                    if advance_path and current_path_info is not None:
                        if not path_handled:
                            # unversioned in all regards
                            if self.want_unversioned:
                                new_executable = bool(
                                    stat.S_ISREG(current_path_info[3].st_mode)
                                    and stat.S_IEXEC & current_path_info[3].st_mode)
                                try:
                                    relpath_unicode = utf8_decode(
                                        current_path_info[0])[0]
                                except UnicodeDecodeError:
                                    raise errors.BadFilenameEncoding(
                                        current_path_info[0], osutils._fs_enc)
                                yield TreeChange(
                                    None,
                                    (None, relpath_unicode),
                                    True,
                                    (False, False),
                                    (None, None),
                                    (None, utf8_decode(current_path_info[1])[0]),
                                    (None, current_path_info[2]),
                                    (None, new_executable))
                            # don't descend into this unversioned path if it is
                            # a dir
                            if current_path_info[2] in ('directory',):
                                del current_dir_info[1][path_index]
                                path_index -= 1
                        # don't descend the disk iterator into any tree
                        # paths.
                        if current_path_info[2] == 'tree-reference':
                            del current_dir_info[1][path_index]
                            path_index -= 1
                        path_index += 1
                        if path_index < len(current_dir_info[1]):
                            current_path_info = current_dir_info[1][path_index]
                            if current_path_info[2] == 'directory':
                                if self.tree._directory_is_tree_reference(
                                        current_path_info[0].decode('utf8')):
                                    current_path_info = current_path_info[:2] + \
                                        ('tree-reference',) + \
                                        current_path_info[3:]
                        else:
                            current_path_info = None
                        path_handled = False
                    else:
                        advance_path = True  # reset the advance flag.
                if current_block is not None:
                    block_index += 1
                    if (block_index < len(self.state._dirblocks) and
                            osutils.is_inside(current_root,
                                              self.state._dirblocks[block_index][0])):
                        current_block = self.state._dirblocks[block_index]
                    else:
                        current_block = None
                if current_dir_info is not None:
                    try:
                        current_dir_info = next(dir_iterator)
                    except StopIteration:
                        current_dir_info = None
        for result in self._iter_specific_file_parents():
            yield result
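
    # Illustrative sketch (assumes an already-constructed ProcessEntryPython
    # instance ``pe``; the higher-level tree comparison code normally builds
    # one rather than callers doing so directly):
    #     for change in pe.iter_changes():
    #         if change.kind[0] != change.kind[1]:
    #             ...  # kind differs between source and target
    # The attributes consumed here (change.file_id, change.path, change.kind)
    # are the same ones used by _gather_result_for_consistency above.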
    def _iter_specific_file_parents(self):
        """Iter over the specific file parents."""
        while self.search_specific_file_parents:
            # Process the parent directories for the paths we were iterating.
            # Even in extremely large trees this should be modest, so currently
            # no attempt is made to optimise.
            path_utf8 = self.search_specific_file_parents.pop()
            if osutils.is_inside_any(self.searched_specific_files, path_utf8):
                # We've examined this path.
                continue
            if path_utf8 in self.searched_exact_paths:
                # We've examined this path.
                continue
            path_entries = self.state._entries_for_path(path_utf8)
            # We need either one or two entries. If the path in
            # self.target_index has moved (so the entry in source_index is in
            # 'ar') then we need to also look for the entry for this path in
            # self.source_index, to output the appropriate delete-or-rename.
            selected_entries = []
            found_item = False
            for candidate_entry in path_entries:
                # Find entries present in target at this path:
                if candidate_entry[1][self.target_index][0] not in (b'a', b'r'):
                    found_item = True
                    selected_entries.append(candidate_entry)
                # Find entries present in source at this path:
                elif (self.source_index is not None and
                      candidate_entry[1][self.source_index][0] not in (b'a', b'r')):
                    found_item = True
                    if candidate_entry[1][self.target_index][0] == b'a':
                        # Deleted, emit it here.
                        selected_entries.append(candidate_entry)
                    else:
                        # renamed, emit it when we process the directory it
                        # ended up at.
                        self.search_specific_file_parents.add(
                            candidate_entry[1][self.target_index][1])
            if not found_item:
                raise AssertionError(
                    "Missing entry for specific path parent %r, %r" % (
                        path_utf8, path_entries))
            path_info = self._path_info(path_utf8, path_utf8.decode('utf8'))
            for entry in selected_entries:
                if entry[0][2] in self.seen_ids:
                    continue
                result, changed = self._process_entry(entry, path_info)
                if changed is None:
                    raise AssertionError(
                        "Got entry<->path mismatch for specific path "
                        "%r entry %r path_info %r " % (
                            path_utf8, entry, path_info))
                # Only include changes - we're outside the user's requested
                # expansion.
                if changed:
                    self._gather_result_for_consistency(result)
                    if (result.kind[0] == 'directory' and
                            result.kind[1] != 'directory'):
                        # This stopped being a directory, the old children have
                        # to be included.
                        if entry[1][self.source_index][0] == b'r':
                            # renamed, take the source path
                            entry_path_utf8 = entry[1][self.source_index][1]
                        else:
                            entry_path_utf8 = path_utf8
                        initial_key = (entry_path_utf8, b'', b'')
                        block_index, _ = self.state._find_block_index_from_key(
                            initial_key)
                        if block_index == 0:
                            # The children of the root are in block index 1.
                            block_index += 1
                        current_block = None
                        if block_index < len(self.state._dirblocks):
                            current_block = self.state._dirblocks[block_index]
                            if not osutils.is_inside(
                                    entry_path_utf8, current_block[0]):
                                # No entries for this directory at all.
                                current_block = None
                        if current_block is not None:
                            for entry in current_block[1]:
                                if entry[1][self.source_index][0] in (b'a', b'r'):
                                    # Not in the source tree, so doesn't have to be
                                    # included.
                                    continue
                                # Path of the entry itself.
                                self.search_specific_file_parents.add(
                                    osutils.pathjoin(*entry[0][:2]))
                if changed or self.include_unchanged:
                    yield result
            self.searched_exact_paths.add(path_utf8)
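
    # This pass only does work for partial (specific-files) iterations: it
    # examines the parent directories queued by
    # _gather_result_for_consistency exactly once each, with
    # searched_exact_paths guarding against reprocessing a parent that has
    # already been emitted.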
    def _path_info(self, utf8_path, unicode_path):
        """Generate path_info for unicode_path.

        :return: None if unicode_path does not exist, or a path_info tuple.
        """
        abspath = self.tree.abspath(unicode_path)
        try:
            stat = os.lstat(abspath)
        except OSError as e:
            if e.errno == errno.ENOENT:
                # the path does not exist.
                return None
            else:
                raise
        utf8_basename = utf8_path.rsplit(b'/', 1)[-1]
        dir_info = (utf8_path, utf8_basename,
                    osutils.file_kind_from_stat_mode(stat.st_mode), stat,
                    abspath)
        if dir_info[2] == 'directory':
            if self.tree._directory_is_tree_reference(
                    unicode_path):
                self.root_dir_info = self.root_dir_info[:2] + \
                    ('tree-reference',) + self.root_dir_info[3:]
        return dir_info
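
    # For reference, the path_info tuple built here has the same shape as the
    # per-file records produced by osutils._walkdirs_utf8 and consumed by
    # _process_entry: (utf8_path, utf8_basename, kind, stat_result, abspath).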

# Try to load the compiled form if possible
try:
    from ._dirstate_helpers_pyx import (
        _read_dirblocks,
        bisect_dirblock,
        _bisect_path_left,
        _bisect_path_right,
        lt_by_dirs,
        ProcessEntryC as _process_entry,
        update_entry as update_entry,
        )
except ImportError as e:
    osutils.failed_to_load_extension(e)
    from ._dirstate_helpers_py import (
        _read_dirblocks,
        bisect_dirblock,
        _bisect_path_left,
        _bisect_path_right,
        lt_by_dirs,
        )
    # FIXME: It would be nice to be able to track moved lines so that the
    # corresponding python code can be moved to the _dirstate_helpers_py
    # module. I don't want to break the history for this important piece of
    # code so I left the code here -- vila 20090622
    update_entry = py_update_entry
    _process_entry = ProcessEntryPython


def pack_stat(st, _encode=base64.encodebytes, _pack=struct.pack):
    """Convert stat values into a packed representation."""
    # jam 20060614 it isn't really worth removing more entries if we
    # are going to leave it in packed form.
    # With only st_mtime and st_mode filesize is 5.5M and read time is 275ms
    # With all entries filesize is 5.9M and read time is maybe 280ms
    # well within the noise margin

    # base64 encoding always adds a final newline, so strip it off
    return _encode(_pack('>LLLLLL',
                         st.st_size, int(st.st_mtime), int(st.st_ctime),
                         st.st_dev, st.st_ino & 0xFFFFFFFF, st.st_mode))[:-1]
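
# Worked note on the packing above: six big-endian 32-bit fields (size,
# mtime, ctime, dev, inode, mode) give 24 bytes, which base64-encode to 32
# characters plus the trailing newline that the [:-1] slice removes.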