/brz/remove-bazaar : contents of breezy/cache (revision 7532)

To get this branch, use:

bzr branch http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar

2052.3.1 by John Arbash Meinel Add tests to cleanup the copyright of all source files	1	# Copyright (C) 2006 Canonical Ltd
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	2	#
	3	# This program is free software; you can redistribute it and/or modify
	4	# it under the terms of the GNU General Public License as published by
	5	# the Free Software Foundation; either version 2 of the License, or
	6	# (at your option) any later version.
	7	#
	8	# This program is distributed in the hope that it will be useful,
	9	# but WITHOUT ANY WARRANTY; without even the implied warranty of
	10	# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
	11	# GNU General Public License for more details.
	12	#
	13	# You should have received a copy of the GNU General Public License
	14	# along with this program; if not, write to the Free Software
4183.7.1 by Sabin Iacob update FSF mailing address	15	# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	16
3943.8.1 by Marius Kruger remove all trailing whitespace from bzr source	17	# TODO: Some kind of command-line display of revision properties:
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	18	# perhaps show them in log -v and allow them as options to the commit command.
	19
6379.6.7 by Jelmer Vernooij Move importing from future until after doc string, otherwise the doc string will disappear.	20	"""Some functions to enable caching the conversion between unicode and utf8"""
	21
2155.1.1 by John Arbash Meinel (Dmitry Vasiliev) pre-lookup encoders to improve performance	22	import codecs
	23
4398.5.9 by John Arbash Meinel it seems that codecs.utf_8_decode is quite a bit faster than codecs.get_decoder('utf-8')	24	_utf8_encode = codecs.utf_8_encode
	25	_utf8_decode = codecs.utf_8_decode
7143.15.2 by Jelmer Vernooĳ Run autopep8.	26
7143.15.2 by Jelmer Vernooĳ Run autopep8.	27
2255.7.95 by Robert Collins Add convenience utf8 decode routine for handling strings that might be None	28	def _utf8_decode_with_None(bytestring, _utf8_decode=_utf8_decode):
2360.1.6 by John Arbash Meinel Change utf8_decode_with_None to return what we care about.	29	    """wrap _utf8_decode to support None->None for optional strings.
	30
	31	    Also, only return the Unicode portion, since we don't care about the second
	32	    return value.
	33	    """
2255.7.95 by Robert Collins Add convenience utf8 decode routine for handling strings that might be None	34	    if bytestring is None:
2360.1.6 by John Arbash Meinel Change utf8_decode_with_None to return what we care about.	35	        return None
2255.7.95 by Robert Collins Add convenience utf8 decode routine for handling strings that might be None	36	    else:
2360.1.6 by John Arbash Meinel Change utf8_decode_with_None to return what we care about.	37	        return _utf8_decode(bytestring)[0]
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	38
7143.15.2 by Jelmer Vernooĳ Run autopep8.	39
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	40	# Map revisions from and to utf8 encoding
	41	# Whenever we do an encode/decode operation, we save the result, so that
	42	# we don't have to do it again.
	43	_unicode_to_utf8_map = {}
	44	_utf8_to_unicode_map = {}
	45
	46
	47	def encode(unicode_str,
	48	           _uni_to_utf8=_unicode_to_utf8_map,
2155.1.1 by John Arbash Meinel (Dmitry Vasiliev) pre-lookup encoders to improve performance	49	           _utf8_to_uni=_utf8_to_unicode_map,
	50	           _utf8_encode=_utf8_encode):
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	51	    """Take this unicode revision id, and get a utf8 version"""
1934.1.11 by John Arbash Meinel Document why we use try/except rather than if None	52	    # If the key is in the cache try/KeyError is 50% faster than
	53	    # val = dict.get(key), if val is None:
3943.8.1 by Marius Kruger remove all trailing whitespace from bzr source	54	    # On jam's machine the difference is
	55	    #   try/KeyError: 900ms
	56	    #   if None: 1250ms
1934.1.11 by John Arbash Meinel Document why we use try/except rather than if None	57	    # Since these are primarily used when iterating over a knit entry
	58	    # most of the time the key will already be in the cache, so use the
	59	    # fast path
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	60	    try:
	61	        return _uni_to_utf8[unicode_str]
	62	    except KeyError:
2155.1.1 by John Arbash Meinel (Dmitry Vasiliev) pre-lookup encoders to improve performance	63	        _uni_to_utf8[unicode_str] = utf8_str = _utf8_encode(unicode_str)[0]
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	64	        _utf8_to_uni[utf8_str] = unicode_str
	65	        return utf8_str
	66
	67
	68	def decode(utf8_str,
	69	           _uni_to_utf8=_unicode_to_utf8_map,
2155.1.1 by John Arbash Meinel (Dmitry Vasiliev) pre-lookup encoders to improve performance	70	           _utf8_to_uni=_utf8_to_unicode_map,
	71	           _utf8_decode=_utf8_decode):
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	72	    """Take a utf8 revision id, and decode it, but cache the result"""
	73	    try:
	74	        return _utf8_to_uni[utf8_str]
	75	    except KeyError:
2249.5.12 by John Arbash Meinel Change the APIs for VersionedFile, Store, and some of Repository into utf-8	76	        unicode_str = _utf8_decode(utf8_str)[0]
	77	        _utf8_to_uni[utf8_str] = unicode_str
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	78	        _uni_to_utf8[unicode_str] = utf8_str
	79	        return unicode_str
	80
	81
1911.2.5 by John Arbash Meinel Update cache tests, add a function to do something like intern() only for unicode objects	82	def get_cached_unicode(unicode_str):
	83	    """Return a cached version of the unicode string.
	84
	85	    This has a similar idea to that of intern() in that it tries
	86	    to return a singleton string. Only it works for unicode strings.
	87	    """
	88	    # This might return the same object, or it might return the cached one
	89	    # the decode() should just be a hash lookup, because the encode() side
	90	    # should add the entry to the maps
	91	    return decode(encode(unicode_str))
	92
	93
2249.5.2 by John Arbash Meinel Add a get_cached_utf8, which will ensure it is really utf8, and cache the strings	94	def get_cached_utf8(utf8_str):
	95	    """Return a cached version of the utf-8 string.
	96
	97	    Get a cached version of this string (similar to intern()).
	98	    At present, this will be decoded to ensure it is a utf-8 string. In the
	99	    future this might change to simply caching the string.
	100	    """
	101	    return encode(decode(utf8_str))
	102
	103
2249.5.3 by John Arbash Meinel Add get_cached_ascii for dealing with how cElementTree handles ascii strings	104	def get_cached_ascii(ascii_str,
	105	                     _uni_to_utf8=_unicode_to_utf8_map,
	106	                     _utf8_to_uni=_utf8_to_unicode_map):
	107	    """This is a string which is identical in utf-8 and unicode."""
	108	    # We don't need to do any encoding, but we want _utf8_to_uni to return a
	109	    # real Unicode string. Unicode and plain strings of this type will have the
	110	    # same hash, so we can just use it as the key in _uni_to_utf8, but we need
	111	    # the return value to be different in _utf8_to_uni
7078.15.1 by Jelmer Vernooĳ Fix some more tests.	112	    uni_str = ascii_str.decode('ascii')
	113	    ascii_str = _uni_to_utf8.setdefault(uni_str, ascii_str)
	114	    _utf8_to_uni.setdefault(ascii_str, uni_str)
2249.5.3 by John Arbash Meinel Add get_cached_ascii for dealing with how cElementTree handles ascii strings	115	    return ascii_str
	116
	117
1911.2.3 by John Arbash Meinel Moving everything into a new location so that we can cache more than just revision ids	118	def clear_encoding_cache():
	119	    """Clear the encoding and decoding caches"""
	120	    _unicode_to_utf8_map.clear()
	121	    _utf8_to_unicode_map.clear()