/brz/remove-bazaar

To get this branch, use:
bzr branch http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar
# Copyright (C) 2005, 2006 Canonical Ltd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

"""Base implementation of Transport over http.

There are separate implementation modules for each http client implementation.
"""

from cStringIO import StringIO
import mimetools
import re
import urlparse
import urllib
import sys

from bzrlib import errors, ui
from bzrlib.smart import medium
from bzrlib.trace import mutter
from bzrlib.transport import (
    Transport,
    )


# TODO: This is not used anymore by HttpTransport_urllib
# (extracting the auth info and prompting the user for a password
# have been split), only the tests still use it. It should be
# deleted and the tests rewritten ASAP to stay in sync.
def extract_auth(url, password_manager):
    """Extract auth parameters from an HTTP/HTTPS url and add them to the given
    password manager.  Return the url, minus those auth parameters (which
    confuse urllib2).
    """
    assert re.match(r'^(https?)(\+\w+)?://', url), \
            'invalid absolute url %r' % url
    scheme, netloc, path, query, fragment = urlparse.urlsplit(url)

    if '@' in netloc:
        auth, netloc = netloc.split('@', 1)
        if ':' in auth:
            username, password = auth.split(':', 1)
        else:
            username, password = auth, None
        if ':' in netloc:
            host = netloc.split(':', 1)[0]
        else:
            host = netloc
        username = urllib.unquote(username)
        if password is not None:
            password = urllib.unquote(password)
        else:
            password = ui.ui_factory.get_password(
                prompt='HTTP %(user)s@%(host)s password',
                user=username, host=host)
        password_manager.add_password(None, host, username, password)
    url = urlparse.urlunsplit((scheme, netloc, path, query, fragment))
    return url


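The credential-splitting steps of extract_auth can be tried in isolation. The following is a minimal Python 3 sketch, not the bzrlib code: `split_auth` is a hypothetical helper, it uses `urllib.parse` in place of the Python 2 `urlparse` and `urllib` modules, and it returns the credentials instead of prompting for a missing password or registering them with a password manager.

```python
from urllib.parse import urlsplit, urlunsplit, unquote


def split_auth(url):
    """Return (clean_url, host, username, password) for an http(s) URL.

    Hypothetical helper: it mirrors the parsing steps only and never
    prompts for a missing password.
    """
    scheme, netloc, path, query, fragment = urlsplit(url)
    username = password = None
    if '@' in netloc:
        # Everything before the first '@' is the credential part.
        auth, netloc = netloc.split('@', 1)
        if ':' in auth:
            username, password = auth.split(':', 1)
            password = unquote(password)
        else:
            username = auth
        username = unquote(username)
    # The host is the netloc without any ':port' suffix.
    host = netloc.split(':', 1)[0]
    clean_url = urlunsplit((scheme, netloc, path, query, fragment))
    return clean_url, host, username, password
```

Percent-escapes in the user name and password are decoded only after the split, so an escaped ':' or '@' inside a password does not confuse the parsing.
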
def _extract_headers(header_text, url):
    """Extract the mapping for an rfc2822 header

    This is a helper function for the test suite and for _pycurl.
    (urllib already parses the headers for us)

    In the case that there are multiple headers inside the file,
    the last one is returned.

    :param header_text: A string of header information.
        This expects that the first line of a header will always be HTTP ...
    :param url: The url we are parsing, so we can raise nice errors
    :return: mimetools.Message object, which basically acts like a case
        insensitive dictionary.
    """
    first_header = True
    remaining = header_text

    if not remaining:
        raise errors.InvalidHttpResponse(url, 'Empty headers')

    while remaining:
        header_file = StringIO(remaining)
        first_line = header_file.readline()
        if not first_line.startswith('HTTP'):
            if first_header: # The first header *must* start with HTTP
                raise errors.InvalidHttpResponse(url,
                    'Opening header line did not start with HTTP: %s'
                    % (first_line,))
                assert False, 'Opening header line was not HTTP'
            else:
                break # We are done parsing
        first_header = False
        m = mimetools.Message(header_file)

        # mimetools.Message parses the first header up to a blank line
        # So while there is remaining data, it probably means there is
        # another header to be parsed.
        # Get rid of any preceding whitespace, which if it is all whitespace
        # will get rid of everything.
        remaining = header_file.read().lstrip()
    return m


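The "last header block wins" behaviour described in the _extract_headers docstring can be illustrated with a small Python 3 sketch. It is an approximation under stated assumptions: `last_header_block` is a hypothetical name, `email.message_from_string` stands in for the Python 2 `mimetools.Message`, blocks are split on blank lines rather than read incrementally, and `ValueError` replaces `errors.InvalidHttpResponse`.

```python
import re
from email import message_from_string


def last_header_block(header_text):
    """Return the headers of the last HTTP header block in header_text.

    Hypothetical stand-in for the listing's loop: blocks are separated by
    blank lines and each must open with an 'HTTP...' status line.
    """
    if not header_text:
        raise ValueError('Empty headers')
    # Normalise line endings, then split on one or more blank lines.
    text = header_text.replace('\r\n', '\n').strip()
    message = None
    for index, block in enumerate(re.split(r'\n\s*\n', text)):
        status_line, _, fields = block.partition('\n')
        if not status_line.startswith('HTTP'):
            if index == 0:
                raise ValueError(
                    'Opening header line did not start with HTTP: %s'
                    % status_line)
            break  # Trailing data that is not a header block.
        message = message_from_string(fields)
    return message
```

A response that starts with an interim `100 Continue` block followed by the final `200 OK` block therefore yields the headers of the second block, and lookups on the returned object are case-insensitive.
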
class HttpTransportBase(Transport, medium.SmartClientMedium):
    """Base class for http implementations.

    Does URL parsing, etc, but not any network IO.

    The protocol can be given as e.g. http+urllib://host/ to use a particular
    implementation.
    """

    # _proto: "http" or "https"
    # _qualified_proto: may have "+pycurl", etc

    def __init__(self, base, from_transport=None):
        """Set the base path where files will be stored."""
        proto_match = re.match(r'^(https?)(\+\w+)?://', base)
        if not proto_match:
            raise AssertionError("not a http url: %r" % base)
        self._proto = proto_match.group(1)
        impl_name = proto_match.group(2)
        if impl_name:
            impl_name = impl_name[1:]
        self._impl_name = impl_name
        if base[-1] != '/':
            base = base + '/'
        super(HttpTransportBase, self).__init__(base)
        (apparent_proto, self._host,
            self._path, self._parameters,
            self._query, self._fragment) = urlparse.urlparse(self.base)
        self._qualified_proto = apparent_proto
        # The range hint is handled dynamically throughout the life
        # of the object. We start by trying multi-range requests
        # and if the server returns bogus results, we retry with
        # single range requests and, finally, we forget about
        # ranges if the server really can't understand them. Once
        # acquired, this piece of info is propagated to clones.
        if from_transport is not None:
            self._range_hint = from_transport._range_hint
        else:
            self._range_hint = 'multi'

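The scheme handling at the top of `__init__` rests on a single regular expression. As a minimal sketch (Python 3, with the hypothetical helper name `split_scheme` and `ValueError` in place of `AssertionError`), the same pattern can be exercised on its own:

```python
import re

# Same pattern as in the listing: plain scheme plus optional '+impl' suffix.
_PROTO_RE = re.compile(r'^(https?)(\+\w+)?://')


def split_scheme(base):
    """Return (proto, impl_name) for a possibly qualified http URL."""
    match = _PROTO_RE.match(base)
    if match is None:
        raise ValueError('not a http url: %r' % base)
    proto = match.group(1)
    impl_name = match.group(2)
    if impl_name:
        impl_name = impl_name[1:]  # Drop the leading '+'.
    return proto, impl_name
```

For `http+pycurl://host/` this gives `('http', 'pycurl')`, while an unqualified `https://host/` gives `('https', None)`.
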
    def abspath(self, relpath):
        """Return the full url to the given relative path.

        This can be supplied with a string or a list.

        The URL returned always has the protocol scheme originally used to
        construct the transport, even if that includes an explicit
        implementation qualifier.
        """
        assert isinstance(relpath, basestring)
        if isinstance(relpath, unicode):
            raise errors.InvalidURL(relpath, 'paths must not be unicode.')
        if isinstance(relpath, basestring):
            relpath_parts = relpath.split('/')
        else:
            # TODO: Don't call this with an array - no magic interfaces
            relpath_parts = relpath[:]
        if relpath.startswith('/'):
            basepath = []
        else:
            # Except for the root, no trailing slashes are allowed
            if len(relpath_parts) > 1 and relpath_parts[-1] == '':
                raise ValueError(
                    "path %r within branch %r seems to be a directory"
                    % (relpath, self._path))
            basepath = self._path.split('/')
            if len(basepath) > 0 and basepath[-1] == '':
                basepath = basepath[:-1]

        for p in relpath_parts:
            if p == '..':
                if len(basepath) == 0:
                    # In most filesystems, a request for the parent
                    # of root, just returns root.
                    continue
                basepath.pop()
            elif p == '.' or p == '':
                continue # No-op
            else:
                basepath.append(p)
        # Possibly, we could use urlparse.urljoin() here, but
        # I'm concerned about when it chooses to strip the last
        # portion of the path, and when it doesn't.
        path = '/'.join(basepath)
        if path == '':
            path = '/'
        result = urlparse.urlunparse((self._qualified_proto,
                                    self._host, path, '', '', ''))
        return result

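The segment-by-segment resolution in abspath can be followed more easily without the URL plumbing. The sketch below is a simplified Python 3 illustration: `resolve_path` is a hypothetical function, it works on the path component only, it omits the trailing-slash check, and it restores the leading slash itself (the listing appears to leave that to `urlparse.urlunparse`).

```python
def resolve_path(base_path, relpath):
    """Join relpath onto base_path, resolving '.' and '..' segments."""
    if relpath.startswith('/'):
        parts = []  # An absolute relpath ignores the base path.
    else:
        parts = base_path.split('/')
        if parts and parts[-1] == '':
            parts = parts[:-1]  # Ignore the base path's trailing slash.
    for segment in relpath.split('/'):
        if segment == '..':
            if parts:
                parts.pop()  # At the root, '..' stays at the root.
        elif segment not in ('.', ''):
            parts.append(segment)
    path = '/'.join(parts)
    if not path.startswith('/'):
        path = '/' + path
    return path
```

With a base path of `/repo/trunk/`, the relative path `../branches` resolves to `/repo/branches`, and any number of extra `..` segments stops at `/`.
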
    def _real_abspath(self, relpath):
        """Produce absolute path, adjusting protocol if needed"""
        abspath = self.abspath(relpath)
        qp = self._qualified_proto
        rp = self._proto
        if self._qualified_proto != self._proto:
            abspath = rp + abspath[len(qp):]
        if not isinstance(abspath, str):
            # escaping must be done at a higher level
            abspath = abspath.encode('ascii')
        return abspath

    def has(self, relpath):
        raise NotImplementedError("has() is abstract on %r" % self)

    def get(self, relpath):
        """Get the file at the given relative path.

        :param relpath: The relative path to the file
        """
        code, response_file = self._get(relpath, None)
        return response_file

    def _get(self, relpath, ranges, tail_amount=0):
        """Get a file, or part of a file.

        :param relpath: Path relative to transport base URL
        :param ranges: None to get the whole file;
            or [(start,end)+], a list of tuples to fetch parts of a file.
        :param tail_amount: The amount to get from the end of the file.

        :returns: (http_code, result_file)
        """
        raise NotImplementedError(self._get)

    def get_request(self):
        return SmartClientHTTPMediumRequest(self)

    def get_smart_medium(self):
        """See Transport.get_smart_medium.

        HttpTransportBase directly implements the minimal interface of
        SmartClientMedium, so this returns self.
        """
        return self

    def _retry_get(self, relpath, ranges, exc_info):
        """A GET request has failed, let's retry with a simpler request."""

        try_again = False
        # The server did not give us enough data, or returned a
        # bogus-looking result; let's try again with
        # a simpler request if possible.
        if self._range_hint == 'multi':
            self._range_hint = 'single'
            mutter('Retry %s with single range request' % relpath)
            try_again = True
        elif self._range_hint == 'single':
            self._range_hint = None
            mutter('Retry %s without ranges' % relpath)
            try_again = True
        if try_again:
            # Note that since the offsets and the ranges may not
            # be in the same order, we don't try to calculate a
            # restricted single range encompassing unprocessed
            # offsets.
            code, f = self._get(relpath, ranges)
            return try_again, code, f
        else:
            # We tried all the tricks, but nothing worked. We
            # re-raise original exception; the 'mutter' calls
            # above will indicate that further tries were
            # unsuccessful
            raise exc_info[0], exc_info[1], exc_info[2]

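The fallback order used by `_retry_get` (multi-range, then single range, then no ranges, then give up) amounts to a small state machine. A minimal Python 3 sketch follows, assuming a hypothetical `fetch` callable that raises `IOError` when the server mishandles a request; the real method instead mutates `self._range_hint` and re-raises the saved exception.

```python
def fetch_with_fallback(fetch, range_hint='multi'):
    """Call fetch(range_hint), degrading the hint after each failure.

    fetch is a hypothetical callable that raises IOError when the server
    mishandles the request; the last error propagates once no simpler
    request is left.
    """
    fallbacks = {'multi': 'single', 'single': None}
    while True:
        try:
            return range_hint, fetch(range_hint)
        except IOError:
            if range_hint not in fallbacks:
                raise  # Already tried without ranges; give up.
            range_hint = fallbacks[range_hint]
```

Returning the hint that finally worked lets the caller remember it, which corresponds to the listing keeping the degraded `_range_hint` on the transport for later requests.
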
    def readv(self, relpath, offsets):
        """Get parts of the file at the given relative path.

        :param offsets: A list of (offset, size) tuples.
        :return: A list or generator of (offset, data) tuples
        """
        ranges = self.offsets_to_ranges(offsets)
        mutter('http readv of %s collapsed %s offsets => %s',
                relpath, len(offsets), ranges)

        try_again = True
        while try_again:
            try_again = False
            try:
                code, f = self._get(relpath, ranges)
            except (errors.InvalidRange, errors.ShortReadvError), e:
                try_again, code, f = self._retry_get(relpath, ranges,
                                                     sys.exc_info())

        for start, size in offsets:
            try_again = True
            while try_again:
                try_again = False
                f.seek(start, (start < 0) and 2 or 0)
                start = f.tell()
                try:
                    data = f.read(size)
                    if len(data) != size:
                        raise errors.ShortReadvError(relpath, start, size,
                                                     actual=len(data))
                except (errors.InvalidRange, errors.ShortReadvError), e:
                    # Note that we replace 'f' here and that it
                    # may need cleaning one day before being
                    # thrown that way.
                    try_again, code, f = self._retry_get(relpath, ranges,
                                                         sys.exc_info())
            # After one or more tries, we get the data.
            yield start, data

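The inner loop of readv seeks to each offset, reads the requested size, and treats a short read as an error. The following Python 3 sketch shows just that part against an in-memory file; `read_offsets` is a hypothetical name, `EOFError` stands in for `errors.ShortReadvError`, and the retry logic is left out.

```python
import io


def read_offsets(f, offsets):
    """Yield (start, data) for each (start, size) pair read from f."""
    for start, size in offsets:
        # A negative start is taken relative to the end of the file.
        f.seek(start, io.SEEK_END if start < 0 else io.SEEK_SET)
        start = f.tell()  # Report the absolute position actually used.
        data = f.read(size)
        if len(data) != size:
            raise EOFError('short read at %d: wanted %d, got %d'
                           % (start, size, len(data)))
        yield start, data
```

Because it is a generator, nothing is read until the caller iterates, and results come back in the order of the requested offsets rather than in file order.
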
    @staticmethod
    def offsets_to_ranges(offsets):
        """Turn a list of offsets and sizes into a list of byte ranges.

        :param offsets: A list of tuples of (start, size).  An empty list
            is not accepted.
        :return: a list of inclusive byte ranges (start, end)
            Adjacent ranges will be combined.
        """
        # Make sure we process sorted offsets
        offsets = sorted(offsets)

        prev_end = None
        combined = []

        for start, size in offsets:
            end = start + size - 1
            if prev_end is None:
                combined.append([start, end])
            elif start <= prev_end + 1:
                combined[-1][1] = end
            else:
                combined.append([start, end])
            prev_end = end

        return combined

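The conversion from (start, size) pairs to inclusive byte ranges can be checked on a few inputs. This is a minimal standalone sketch in Python 3 with the hypothetical name `coalesce`. It differs from the listing in one respect: it keeps the larger of the two ends when merging, so a range that lies entirely inside the previous one cannot shrink the merged range.

```python
def coalesce(offsets):
    """Turn (start, size) pairs into sorted, merged inclusive ranges."""
    combined = []
    for start, size in sorted(offsets):
        end = start + size - 1  # Inclusive end, as used in HTTP ranges.
        if combined and start <= combined[-1][1] + 1:
            # Overlapping or adjacent: extend the previous range.
            combined[-1][1] = max(combined[-1][1], end)
        else:
            combined.append([start, end])
    return combined
```

For example, the offsets `[(10, 5), (0, 10), (20, 1)]` collapse to `[[0, 14], [20, 20]]`: the first two are adjacent once sorted, while the third leaves a gap and stays separate.
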
    def _post(self, body_bytes):
        """POST body_bytes to .bzr/smart on this transport.

        :returns: (response code, response body file-like object).
        """
        # TODO: Requiring all the body_bytes to be available at the beginning of
        # the POST may require large client buffers.  It would be nice to have
        # an interface that allows streaming via POST when possible (and
        # degrades to a local buffer when not).
        raise NotImplementedError(self._post)

    def put_file(self, relpath, f, mode=None):
        """Copy the file-like object into the location.

        :param relpath: Location to put the contents, relative to base.
        :param f:       File-like object.
        """
        raise errors.TransportNotPossible('http PUT not supported')

    def mkdir(self, relpath, mode=None):
        """Create a directory at the given path."""
        raise errors.TransportNotPossible('http does not support mkdir()')

    def rmdir(self, relpath):
        """See Transport.rmdir."""
        raise errors.TransportNotPossible('http does not support rmdir()')

    def append_file(self, relpath, f, mode=None):
        """Append the text in the file-like object into the final
        location.
        """
        raise errors.TransportNotPossible('http does not support append()')

    def copy(self, rel_from, rel_to):
        """Copy the item at rel_from to the location at rel_to"""
        raise errors.TransportNotPossible('http does not support copy()')

    def copy_to(self, relpaths, other, mode=None, pb=None):
        """Copy a set of entries from self into another Transport.

        :param relpaths: A list/generator of entries to be copied.

        TODO: if other is LocalTransport, is it possible to
              do better than put(get())?
        """
        # At this point HttpTransport might be able to check and see if
        # the remote location is the same, and rather than download, and
        # then upload, it could just issue a remote copy_this command.
        if isinstance(other, HttpTransportBase):
            raise errors.TransportNotPossible(
                'http cannot be the target of copy_to()')
        else:
            return super(HttpTransportBase, self).\
                    copy_to(relpaths, other, mode=mode, pb=pb)

    def move(self, rel_from, rel_to):
        """Move the item at rel_from to the location at rel_to"""
        raise errors.TransportNotPossible('http does not support move()')

    def delete(self, relpath):
        """Delete the item at relpath"""
2004.1.25 by v.ladeuil+lp at free
Shuffle http related test code. Hopefully it ends up at the right place :)
408
        raise errors.TransportNotPossible('http does not support delete()')
907.1.21 by John Arbash Meinel
Adding http transport as a valid transport protocol.
409
1530.1.3 by Robert Collins
transport implementations now tested consistently.
410
    def is_readonly(self):
411
        """See Transport.is_readonly."""
412
        return True
413
1400.1.1 by Robert Collins
implement a basic test for the ui branch command from http servers
414
    def listable(self):
415
        """See Transport.listable."""
416
        return False
907.1.21 by John Arbash Meinel
Adding http transport as a valid transport protocol.
417
418
    def stat(self, relpath):
419
        """Return the stat information for a file.
420
        """
2004.1.25 by v.ladeuil+lp at free
Shuffle http related test code. Hopefully it ends up at the right place :)
421
        raise errors.TransportNotPossible('http does not support stat()')
907.1.21 by John Arbash Meinel
Adding http transport as a valid transport protocol.
422
907.1.24 by John Arbash Meinel
Remote functionality work.
423
    def lock_read(self, relpath):
424
        """Lock the given file for shared (read) access.
425
        :return: A lock object, which should be passed to Transport.unlock()
426
        """
427
        # The old RemoteBranch ignored locks for reading, so we will
428
        # continue that tradition and return a bogus lock object.
429
        class BogusLock(object):
430
            def __init__(self, path):
431
                self.path = path
432
            def unlock(self):
433
                pass
434
        return BogusLock(relpath)
435
436
    def lock_write(self, relpath):
437
        """Lock the given file for exclusive (write) access.
438
        WARNING: many transports do not support this, so try to avoid using it
439
440
        :return: A lock object, which should be passed to Transport.unlock()
441
        """
2004.1.25 by v.ladeuil+lp at free
Shuffle http related test code. Hopefully it ends up at the right place :)
442
        raise errors.TransportNotPossible('http does not support lock_write()')
1530.1.1 by Robert Collins
Minimal infrastructure to test TransportTestProviderAdapter.
443
1540.3.26 by Martin Pool
[merge] bzr.dev; pycurl not updated for readv yet
444
    def clone(self, offset=None):
445
        """Return a new HttpTransportBase with root at self.base + offset
2025.2.1 by v.ladeuil+lp at free
Fix bug #61606 by providing cloning hint do daughter classes.
446
2004.1.6 by vila
Connection sharing between cloned transports.
447
        We let the daughter classes take advantage of the hint
448
        that this is a clone, not a raw creation.
1540.3.26 by Martin Pool
[merge] bzr.dev; pycurl not updated for readv yet
449
        """
450
        if offset is None:
2004.1.6 by vila
Connection sharing between cloned transports.
451
            return self.__class__(self.base, self)
1540.3.26 by Martin Pool
[merge] bzr.dev; pycurl not updated for readv yet
452
        else:
2004.1.6 by vila
Connection sharing between cloned transports.
453
            return self.__class__(self.abspath(offset), self)
1530.1.1 by Robert Collins
Minimal infrastructure to test TransportTestProviderAdapter.
454
2004.1.30 by v.ladeuil+lp at free
Fix #62276 and #62029 by providing a more robust http range handling.
455
    def attempted_range_header(self, ranges, tail_amount):
456
        """Prepare an HTTP Range header at a level the server should accept"""
457
458
        if self._range_hint == 'multi':
459
            # Nothing to do here
460
            return self.range_header(ranges, tail_amount)
461
        elif self._range_hint == 'single':
462
            # Combine all the requested ranges into a single
463
            # encompassing one
464
            if len(ranges) > 0:
465
                start, ignored = ranges[0]
466
                ignored, end = ranges[-1]
467
                if tail_amount not in (0, None):
468
                    # Nothing we can do here to combine ranges
469
                    # with tail_amount, just return None. The
470
                    # whole file should be downloaded.
471
                    return None
472
                else:
473
                    return self.range_header([(start, end)], 0)
474
            else:
475
                # Only tail_amount requested, let range_header
476
                # do its work
477
                return self.range_header(ranges, tail_amount)
478
        else:
479
            return None
480
1786.1.27 by John Arbash Meinel
Fix up the http transports so that tests pass with the new configuration.
481
    @staticmethod
482
    def range_header(ranges, tail_amount):
1750.1.2 by Michael Ellerman
Add support for HTTP multipart ranges and hook it into http+urllib.
483
        """Turn a list of byte ranges into an HTTP Range header value.
484
2004.1.30 by v.ladeuil+lp at free
Fix #62276 and #62029 by providing a more robust http range handling.
485
        :param ranges: A list of byte ranges, (start, end).
486
        :param tail_amount: The amount to get from the end of the file.
1750.1.2 by Michael Ellerman
Add support for HTTP multipart ranges and hook it into http+urllib.
487
488
        :return: HTTP range header string.
2004.1.30 by v.ladeuil+lp at free
Fix #62276 and #62029 by providing a more robust http range handling.
489
490
        At least one of a non-empty ranges list *or* a tail_amount must be
491
        provided.
1750.1.2 by Michael Ellerman
Add support for HTTP multipart ranges and hook it into http+urllib.
492
        """
493
        strings = []
494
        for start, end in ranges:
495
            strings.append('%d-%d' % (start, end))
496
1786.1.8 by John Arbash Meinel
[merge] Johan Rydberg test updates
497
        if tail_amount:
498
            strings.append('-%d' % tail_amount)
499
1786.1.36 by John Arbash Meinel
pycurl expects us to just set the range of bytes, not including bytes=
500
        return ','.join(strings)
1750.1.2 by Michael Ellerman
Add support for HTTP multipart ranges and hook it into http+urllib.
501
2018.2.8 by Andrew Bennetts
Make HttpTransportBase.get_smart_client return self again.
502
    def send_http_smart_request(self, bytes):
503
        code, body_filelike = self._post(bytes)
504
        assert code == 200, 'unexpected HTTP response code %r' % (code,)
505
        return body_filelike
506
507
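The range handling in range_header and attempted_range_header above can be tried outside the transport class. The following is only a sketch: the free function single_range_header is a hypothetical name for the 'single' branch of attempted_range_header, and the sample ranges are invented for illustration. Note that the returned value carries no 'bytes=' prefix, so a caller building a real Range header would presumably have to add it.

```python
def range_header(ranges, tail_amount):
    """Standalone copy of the logic in HttpTransportBase.range_header."""
    strings = ['%d-%d' % (start, end) for start, end in ranges]
    if tail_amount:
        # A suffix range: the last tail_amount bytes of the file.
        strings.append('-%d' % tail_amount)
    return ','.join(strings)


def single_range_header(ranges, tail_amount):
    """Hypothetical helper mirroring the 'single' range hint branch."""
    if len(ranges) > 0:
        start, _ = ranges[0]
        _, end = ranges[-1]
        if tail_amount not in (0, None):
            # Explicit ranges cannot be merged with a tail request;
            # None tells the caller to download the whole file.
            return None
        # One encompassing range from the first start to the last end.
        return range_header([(start, end)], 0)
    # Only a tail request: range_header handles it directly.
    return range_header(ranges, tail_amount)


multi = range_header([(0, 9), (20, 29)], 50)
single = single_range_header([(0, 9), (20, 29)], 0)
tail_only = single_range_header([], 50)
mixed = single_range_header([(0, 9)], 50)
```

With these inputs, multi lists every range plus the tail, single collapses the two ranges into one covering span (which also fetches the unwanted bytes 10 to 19), and mixed is None because ranges and a tail cannot be combined into a single range.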
2400.1.3 by Andrew Bennetts
Split smart transport code into several separate modules.
508
class SmartClientHTTPMediumRequest(medium.SmartClientMediumRequest):
2018.2.8 by Andrew Bennetts
Make HttpTransportBase.get_smart_client return self again.
509
    """A SmartClientMediumRequest that works with an HTTP medium."""
510
2414.1.1 by Andrew Bennetts
Add some unicode-related tests from the hpss branch, and a few other nits (also from the hpss branch).
511
    def __init__(self, client_medium):
512
        medium.SmartClientMediumRequest.__init__(self, client_medium)
2018.2.8 by Andrew Bennetts
Make HttpTransportBase.get_smart_client return self again.
513
        self._buffer = ''
514
515
    def _accept_bytes(self, bytes):
516
        self._buffer += bytes
517
518
    def _finished_writing(self):
519
        data = self._medium.send_http_smart_request(self._buffer)
520
        self._response_body = data
521
522
    def _read_bytes(self, count):
523
        return self._response_body.read(count)
2004.1.28 by v.ladeuil+lp at free
Merge bzr.dev. Including http modifications by "smart" related code
524
2018.2.8 by Andrew Bennetts
Make HttpTransportBase.get_smart_client return self again.
525
    def _finished_reading(self):
526
        """See SmartClientMediumRequest._finished_reading."""
527
        pass
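The buffer-then-post behaviour of SmartClientHTTPMediumRequest can be sketched without the bzrlib classes. Everything below is an assumption made for illustration: FakeMedium and BufferedRequest are hypothetical stand-ins, the fake medium simply echoes the request instead of issuing an HTTP POST, and bytes literals are used so the sketch also runs on Python 3.

```python
import io


class FakeMedium(object):
    """Hypothetical stand-in for the HTTP medium; no network involved."""

    def send_http_smart_request(self, data):
        # The real method POSTs the bytes and returns a file-like body.
        return io.BytesIO(b'reply:' + data)


class BufferedRequest(object):
    """Collect all written bytes, then send them as one request."""

    def __init__(self, medium):
        self._medium = medium
        self._buffer = b''
        self._response_body = None

    def accept_bytes(self, data):
        # Nothing is sent yet; the request is only accumulated.
        self._buffer += data

    def finished_writing(self):
        # A single round trip carries the whole buffered request.
        self._response_body = self._medium.send_http_smart_request(
            self._buffer)

    def read_bytes(self, count):
        return self._response_body.read(count)


request = BufferedRequest(FakeMedium())
request.accept_bytes(b'hello ')
request.accept_bytes(b'world')
request.finished_writing()
first = request.read_bytes(6)
rest = request.read_bytes(100)
```

Buffering suits HTTP because each request is a single round trip: the body has to be complete before it is posted, and the response is read only afterwards.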