To get this branch, use:
bzr branch http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar

Viewing changes to bzrlib/xml_serializer.py

  • Committer: John Arbash Meinel
  • Date: 2009-09-09 18:52:56 UTC
  • mto: (4634.52.16 2.0)
  • mto: This revision was merged to the branch mainline in revision 4738.
  • Revision ID: john@arbash-meinel.com-20090909185256-rdaxy872xauoem46
Work around bug #402623 by allowing BTreeGraphIndex(...,unlimited_cache=True).

The basic issue is that the access pattern for chk pages is fully random,
because the keys are 'sha1' handles. As such, we have no locality of
reference, so a size-limited page cache keeps evicting pages that are
needed again later. Downloading a large project over HTTP can therefore
cause us to re-download all of the .cix pages multiple times. The bug
report observed the pages being downloaded 4-5 times, which caused a
significant increase in the total bytes downloaded.
(For Launchpad, downloading the 10MB cix file 5 times was 50MB, out of
around 160MB total download.)
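
As a rough illustration of why random access defeats a bounded cache, the sketch below simulates page downloads under a plain LRU cache and under an unlimited one. It is not bzrlib's implementation: `count_page_fetches`, the page count, and the cache size are all hypothetical values chosen for the example, and the uniform random accesses only approximate lookups by sha1-derived keys.

```python
import random
from collections import OrderedDict


def count_page_fetches(accesses, cache_size=None):
    """Count simulated page downloads for a sequence of page accesses.

    cache_size=None models an unlimited cache; an integer models an LRU
    cache that holds at most that many pages. (Hypothetical helper, not
    part of bzrlib.)
    """
    cache = OrderedDict()
    fetches = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)  # mark as most recently used
            continue
        fetches += 1  # cache miss: the page has to be downloaded
        cache[page] = True
        if cache_size is not None and len(cache) > cache_size:
            cache.popitem(last=False)  # evict the least recently used page
    return fetches


rng = random.Random(0)
num_pages = 1000  # assumed number of index pages
# Hash-like keys: every access picks a page uniformly at random.
accesses = [rng.randrange(num_pages) for _ in range(20 * num_pages)]

bounded = count_page_fetches(accesses, cache_size=100)
unlimited = count_page_fetches(accesses, cache_size=None)
print("bounded cache fetches:  ", bounded)
print("unlimited cache fetches:", unlimited)
```

With the unlimited cache each distinct page is fetched exactly once, at the cost of holding every page in memory. With the bounded cache most accesses miss, because a recently used page is no more likely to be requested again than any other.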

@@ old line 1, new line 1 @@
-# Copyright (C) 2005-2010 Canonical Ltd
+# Copyright (C) 2005, 2006 Canonical Ltd
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ old line 23, new line 23 @@
 # ElementTree bits
 
 from bzrlib.serializer import Serializer
-from bzrlib.trace import mutter
+from bzrlib.trace import mutter, warning
 
 try:
     try:
@@ old line 55, new line 55 @@
     squashes_xml_invalid_characters = True
 
     def read_inventory_from_string(self, xml_string, revision_id=None,
-                                   entry_cache=None, return_from_cache=False):
+                                   entry_cache=None):
         """Read xml_string into an inventory object.
 
         :param xml_string: The xml to read.
@@ old line 69, new line 69 @@
         :param entry_cache: An optional cache of InventoryEntry objects. If
             supplied we will look up entries via (file_id, revision_id) which
             should map to a valid InventoryEntry (File/Directory/etc) object.
-        :param return_from_cache: Return entries directly from the cache,
-            rather than copying them first. This is only safe if the caller
-            promises not to mutate the returned inventory entries, but it can
-            make some operations significantly faster.
         """
         try:
             return self._unpack_inventory(fromstring(xml_string), revision_id,
-                                          entry_cache=entry_cache,
-                                          return_from_cache=return_from_cache)
+                                          entry_cache=entry_cache)
         except ParseError, e:
             raise errors.UnexpectedInventoryFormat(e)