bzr branch
http://gegoxaren.bato24.eu/bzr/brz/remove-bazaar
|
2400.1.2
by Andrew Bennetts
Move SmartTCPServer classes into bzrlib/smart/server.py |
1 |
# Copyright (C) 2006 Canonical Ltd
|
2 |
#
|
|
3 |
# This program is free software; you can redistribute it and/or modify
|
|
4 |
# it under the terms of the GNU General Public License as published by
|
|
5 |
# the Free Software Foundation; either version 2 of the License, or
|
|
6 |
# (at your option) any later version.
|
|
7 |
#
|
|
8 |
# This program is distributed in the hope that it will be useful,
|
|
9 |
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
10 |
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
11 |
# GNU General Public License for more details.
|
|
12 |
#
|
|
13 |
# You should have received a copy of the GNU General Public License
|
|
14 |
# along with this program; if not, write to the Free Software
|
|
15 |
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
|
16 |
||
17 |
"""Smart-server protocol, client and server.
|
|
18 |
||
|
2400.1.6
by Andrew Bennetts
Cosmetic changes to minimise the difference between this branch and the hpss branch. |
19 |
This code is fairly complex, so it has been split up into a package of modules,
|
20 |
rather than being a single large module. Refer to the individual module
|
|
21 |
docstrings for details.
|
|
22 |
||
23 |
Overview
|
|
24 |
========
|
|
25 |
||
|
26 |
Requests are sent as a command and list of arguments, followed by optional
|
27 |
bulk body data. Responses are similarly a response and list of arguments,
|
|
28 |
followed by bulk body data. ::
|
|
29 |
||
30 |
SEP := '\001'
|
|
31 |
Fields are separated by Ctrl-A.
|
|
32 |
BULK_DATA := CHUNK TRAILER
|
|
33 |
Chunks can be repeated as many times as necessary.
|
|
34 |
CHUNK := CHUNK_LEN CHUNK_BODY
|
|
35 |
CHUNK_LEN := DIGIT+ NEWLINE
|
|
36 |
Gives the number of bytes in the following chunk.
|
|
37 |
CHUNK_BODY := BYTE[chunk_len]
|
|
38 |
TRAILER := SUCCESS_TRAILER | ERROR_TRAILER
|
|
39 |
SUCCESS_TRAILER := 'done' NEWLINE
|
|
40 |
ERROR_TRAILER :=
|
|
41 |
||
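As a rough illustration of the grammar above, the following Python 3 sketch
encodes and decodes BULK_DATA. It is not the bzrlib implementation: the
function names are hypothetical, the newline that ends a field list is an
assumption, and because ERROR_TRAILER is not spelled out here the decoder
simply rejects any header that is neither a length nor 'done'.

```python
SEP = b'\x01'  # Fields are separated by Ctrl-A.


def encode_fields(fields):
    # Join a command and its arguments with SEP.
    # Assumption: a newline terminates the field list.
    return SEP.join(fields) + b'\n'


def encode_bulk_data(chunks):
    # Serialise byte strings as CHUNK* followed by SUCCESS_TRAILER.
    out = []
    for chunk in chunks:
        out.append(b'%d\n' % len(chunk))  # CHUNK_LEN := DIGIT+ NEWLINE
        out.append(chunk)                 # CHUNK_BODY := BYTE[chunk_len]
    out.append(b'done\n')                 # SUCCESS_TRAILER
    return b''.join(out)


def decode_bulk_data(data):
    # Return the list of chunk bodies found before the success trailer.
    chunks = []
    pos = 0
    while True:
        end = data.index(b'\n', pos)      # ValueError if no header line
        header = data[pos:end]
        pos = end + 1
        if header == b'done':
            return chunks
        if not header.isdigit():
            # ERROR_TRAILER is unspecified, so anything else is rejected.
            raise ValueError('unexpected chunk header: %r' % (header,))
        length = int(header)
        body = data[pos:pos + length]
        if len(body) != length:
            raise ValueError('truncated chunk')
        chunks.append(body)
        pos += length
```

Because each chunk carries its own length, a chunk body may itself contain
newlines without confusing the decoder.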
42 |
Paths are passed across the network. The client needs to see a namespace that
|
|
43 |
includes any repository that might need to be referenced, and the client needs
|
|
44 |
to know about a root directory beyond which it cannot ascend.
|
|
45 |
||
46 |
Servers run over ssh will typically want to be able to access any path the user
|
|
47 |
can access. Public servers on the other hand (which might be over http, ssh
|
|
48 |
or tcp) will typically want to restrict access to only a particular directory
|
|
49 |
and its children, so will want to do a software virtual root at that level.
|
|
50 |
In other words they'll want to rewrite incoming paths to be under that level
|
|
51 |
(and prevent escaping using ../ tricks).
|
|
52 |
||
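One possible way to do the path rewriting described above is sketched below
in Python 3. The function name and the example root are hypothetical, and the
check is purely lexical: it does not resolve symlinks, so it is only a
starting point rather than a complete defence.

```python
import posixpath


def rebase_under_root(root, client_path):
    # Map a path received from the network to a path under the virtual root.
    # Absolute client paths are treated as relative to the root.
    root = posixpath.normpath(root)
    relative = client_path.lstrip('/')
    candidate = posixpath.normpath(posixpath.join(root, relative))
    # After normalisation any '..' components have been collapsed, so the
    # result must be the root itself or something below it.
    prefix = root.rstrip('/') + '/'
    if candidate != root and not candidate.startswith(prefix):
        raise ValueError('path escapes virtual root: %r' % (client_path,))
    return candidate
```

For example, with a root of '/srv/bzr' the client path '/project/trunk' would
map to '/srv/bzr/project/trunk', while '../../etc/passwd' would be refused.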
53 |
URLs that include ~ should probably be passed across to the server verbatim
|
|
54 |
and the server can expand them. This will probably not be meaningful when
|
|
55 |
limited to a directory?
|
|
56 |
||
57 |
At the bottom level are sockets, pipes and the HTTP server. For sockets, we
|
|
58 |
may handle multiple requests and then get a read error because the other side
|
|
59 |
shut down. For pipes, a zero-length read on the read pipe marks
|
|
60 |
end-of-file. For the HTTP server environment there is no end-of-stream because
|
|
61 |
each request coming into the server is independent.
|
|
62 |
||
63 |
So we need a wrapper around pipes and sockets to separate out requests from
|
|
64 |
substrate and this will give us a single model which is consistent for HTTP,
|
|
65 |
sockets and pipes.
|
|
66 |
||
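A minimal Python 3 sketch of such a wrapper follows. It assumes that each
request fits on one newline-terminated line, which is a simplification, and
the name iter_requests is hypothetical. Both a socket-style read error and a
pipe-style zero-length read are treated as the end of the stream, so the
caller sees the same sequence of requests whatever the substrate.

```python
def iter_requests(read, bufsize=4096):
    # `read` is any callable taking a size and returning bytes, for example
    # socket.recv or the read method of a binary pipe.
    buffer = b''
    while True:
        try:
            data = read(bufsize)
        except OSError:
            # Socket case: the other side shut down mid-conversation.
            data = b''
        if not data:
            # Pipe case: a zero-length read marks end-of-file.
            return
        buffer += data
        # Hand over every complete request; keep any partial tail buffered.
        while b'\n' in buffer:
            line, buffer = buffer.split(b'\n', 1)
            yield line
```

An HTTP front end would not need the loop at all, since each incoming HTTP
request would already delimit exactly one request.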
67 |
Server-side
|
|
68 |
-----------
|
|
69 |
||
70 |
MEDIUM (factory for protocol, reads bytes & pushes to protocol,
|
|
71 |
uses protocol to detect end-of-request, sends written
|
|
72 |
bytes to client) e.g. socket, pipe, HTTP request handler.
|
|
73 |
^
|
|
74 |
| bytes.
|
|
75 |
v
|
|
76 |
||
77 |
PROTOCOL (serialization, deserialization) accepts bytes for one
|
|
78 |
request, decodes according to internal state, pushes
|
|
79 |
structured data to handler. accepts structured data from
|
|
80 |
handler and encodes and writes to the medium. factory for
|
|
81 |
handler.
|
|
82 |
^
|
|
83 |
| structured data
|
|
84 |
v
|
|
85 |
||
86 |
HANDLER (domain logic) accepts structured data, operates state
|
|
87 |
machine until the request can be satisfied,
|
|
88 |
sends structured data to the protocol.
|
|
89 |
||
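The server-side layering above could be sketched roughly as follows in
Python 3. All class names, the 'echo' command and the one-line request format
are hypothetical and chosen only to keep the example small; the real classes
carry considerably more state.

```python
class EchoHandler:
    # HANDLER: domain logic working purely on structured data.
    def handle(self, command, args):
        if command == 'echo':
            return ('ok',) + args
        return ('error', 'unknown command', command)


class LineProtocol:
    # PROTOCOL: decodes bytes for one request, pushes structured data to
    # the handler, then encodes the response and writes it to the medium.
    def __init__(self, write):
        self._write = write
        self._handler = EchoHandler()  # the protocol is the handler factory
        self._buffer = b''
        self.finished = False

    def accept_bytes(self, data):
        self._buffer += data
        if b'\n' not in self._buffer:
            return
        line, self._buffer = self._buffer.split(b'\n', 1)
        fields = line.decode('utf-8').split('\x01')
        response = self._handler.handle(fields[0], tuple(fields[1:]))
        self._write('\x01'.join(response).encode('utf-8') + b'\n')
        self.finished = True


class PipeMedium:
    # MEDIUM: reads bytes, pushes them to a protocol it creates, and uses
    # the protocol to detect end-of-request.
    def __init__(self, infile, outfile):
        self._infile = infile
        self._outfile = outfile

    def serve_one_request(self):
        protocol = LineProtocol(self._outfile.write)
        while not protocol.finished:
            # One byte at a time so we never read past this request.
            data = self._infile.read(1)
            if not data:
                return False  # end-of-file before a complete request
            protocol.accept_bytes(data)
        return True
```

Only the medium touches the file objects and only the handler knows what a
command means, so a socket or HTTP medium could reuse the other two layers
unchanged.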
90 |
||
91 |
Client-side
|
|
92 |
-----------
|
|
93 |
||
94 |
CLIENT domain logic, accepts domain requests, generates structured
|
|
95 |
data, reads structured data from responses and turns into
|
|
96 |
domain data. Sends structured data to the protocol.
|
|
97 |
Operates state machines until the request can be delivered
|
|
98 |
(e.g. reading from a bundle generated in bzrlib to deliver a
|
|
99 |
complete request).
|
|
100 |
||
101 |
Possibly this should just be RemoteBzrDir, RemoteTransport,
|
|
102 |
...
|
|
103 |
^
|
|
104 |
| structured data
|
|
105 |
v
|
|
106 |
||
107 |
PROTOCOL (serialization, deserialization) accepts structured data for one
|
|
108 |
request, encodes and writes to the medium. Reads bytes from the
|
|
109 |
medium, decodes and allows the client to read structured data.
|
|
110 |
^
|
|
111 |
| bytes.
|
|
112 |
v
|
|
113 |
||
114 |
MEDIUM (accepts bytes from the protocol & delivers to the remote server.
|
|
115 |
Allows the protocol to read bytes) e.g. socket, pipe, HTTP request.
|
|
116 |
"""
|
|
117 |
||
118 |
# TODO: _translate_error should be on the client, not the transport because
|
|
119 |
# error coding is wire protocol specific.
|
|
120 |
||
121 |
# TODO: A plain integer from query_version is too simple; should give some
|
|
122 |
# capabilities too?
|
|
123 |
||
124 |
# TODO: Server should probably catch exceptions within itself and send them
|
|
125 |
# back across the network. (But shouldn't catch KeyboardInterrupt etc)
|
|
126 |
# Also needs to somehow report protocol errors like bad requests. Need to
|
|
127 |
# consider how we'll handle error reporting, e.g. if we get halfway through a
|
|
128 |
# bulk transfer and then something goes wrong.
|
|
129 |
||
130 |
# TODO: Standard marker at start of request/response lines?
|
|
131 |
||
132 |
# TODO: Make each request and response self-validatable, e.g. with checksums.
|
|
133 |
#
|
|
134 |
# TODO: get/put objects could be changed to gradually read back the data as it
|
|
135 |
# comes across the network
|
|
136 |
#
|
|
137 |
# TODO: What should the server do if it hits an error and has to terminate?
|
|
138 |
#
|
|
139 |
# TODO: is it useful to allow multiple chunks in the bulk data?
|
|
140 |
#
|
|
141 |
# TODO: If we get an exception during transmission of bulk data we can't just
|
|
142 |
# emit the exception because it won't be seen.
|
|
143 |
# John proposes: I think it would be worthwhile to have a header on each
|
|
144 |
# chunk, that indicates it is another chunk. Then you can send an 'error'
|
|
145 |
# chunk as long as you finish the previous chunk.
|
|
146 |
#
|
|
147 |
# TODO: Clone method on Transport; should work up towards parent directory;
|
|
148 |
# unclear how this should be stored or communicated to the server... maybe
|
|
149 |
# just pass it on all relevant requests?
|
|
150 |
#
|
|
151 |
# TODO: Better name than clone() for changing between directories. How about
|
|
152 |
# open_dir or change_dir or chdir?
|
|
153 |
#
|
|
154 |
# TODO: Is it really good to have the notion of current directory within the
|
|
155 |
# connection? Perhaps all Transports should factor out a common connection
|
|
156 |
# from the thing that has the directory context?
|
|
157 |
#
|
|
158 |
# TODO: Pull more things common to sftp and ssh to a higher level.
|
|
159 |
#
|
|
160 |
# TODO: The server that manages a connection should be quite small and retain
|
|
161 |
# minimal state because each request is supposed to be stateless.
|
|
162 |
# Then we can write another implementation that maps to http.
|
|
163 |
#
|
|
164 |
# TODO: What to do when a client connection is garbage collected? Maybe just
|
|
165 |
# abruptly drop the connection?
|
|
166 |
#
|
|
167 |
# TODO: Server in some cases will need to restrict access to files outside of
|
|
168 |
# a particular root directory. LocalTransport doesn't do anything to stop you
|
|
169 |
# ascending above the base directory, so we need to prevent paths
|
|
170 |
# containing '..' in either the server or transport layers. (Also need to
|
|
171 |
# consider what happens if someone creates a symlink pointing outside the
|
|
172 |
# directory tree...)
|
|
173 |
#
|
|
174 |
# TODO: Server should rebase absolute paths coming across the network to put
|
|
175 |
# them under the virtual root, if one is in use. LocalTransport currently
|
|
176 |
# doesn't do that; if you give it an absolute path it just uses it.
|
|
177 |
#
|
|
178 |
# XXX: Arguments can't contain newlines or the Ctrl-A separator; possibly we should e.g.
|
|
179 |
# urlescape them instead. Indeed possibly this should just literally be
|
|
180 |
# http-over-ssh.
|
|
181 |
#
|
|
182 |
# FIXME: This transport, with several others, has imperfect handling of paths
|
|
183 |
# within urls. It'd probably be better for ".." from a root to raise an error
|
|
184 |
# rather than return the same directory as we do at present.
|
|
185 |
#
|
|
186 |
# TODO: Rather than working at the Transport layer we want Branch,
|
|
187 |
# Repository or BzrDir objects that talk to a server.
|
|
188 |
#
|
|
189 |
# TODO: Probably want some way for server commands to gradually produce body
|
|
190 |
# data rather than passing it as a string; they could perhaps pass an
|
|
191 |
# iterator-like callback that will gradually yield data; it probably needs a
|
|
192 |
# close() method that will always be called to do any necessary cleanup.
|
|
193 |
#
|
|
194 |
# TODO: Split the actual smart server from the ssh encoding of it.
|
|
195 |
#
|
|
196 |
# TODO: Perhaps support file-level readwrite operations over the transport
|
|
197 |
# too.
|
|
198 |
#
|
|
199 |
# TODO: SmartBzrDir class, proxying all Branch etc methods across to another
|
|
200 |
# branch doing file-level operations.
|
|
201 |
#
|