NCBI C++ ToolKit
ncbi_http_connector.h

#ifndef CONNECT___HTTP_CONNECTOR__H
#define CONNECT___HTTP_CONNECTOR__H

/* $Id: ncbi_http_connector.h 102054 2024-03-23 01:53:55Z lavr $
 * ===========================================================================
 *
 * PUBLIC DOMAIN NOTICE
 * National Center for Biotechnology Information
 *
 * This software/database is a "United States Government Work" under the
 * terms of the United States Copyright Act. It was written as part of
 * the author's official duties as a United States Government employee and
 * thus cannot be copyrighted. This software/database is freely available
 * to the public for use. The National Library of Medicine and the U.S.
 * Government have not placed any restriction on its use or reproduction.
 *
 * Although all reasonable efforts have been taken to ensure the accuracy
 * and reliability of the software and data, the NLM and the U.S.
 * Government do not and cannot warrant the performance or results that
 * may be obtained by using this software or data. The NLM and the U.S.
 * Government disclaim all warranties, express or implied, including
 * warranties of performance, merchantability or fitness for any particular
 * purpose.
 *
 * Please cite the author in any work or product based on this material.
 *
 * ===========================================================================
 *
 * Author: Denis Vakatov
 *
 * File Description:
 * Implement CONNECTOR for the HTTP-based network connection
 *
 * See "ncbi_connector.h" for the detailed specification of the underlying
 * connector ("CONNECTOR", "SConnectorTag") methods and structures.
 *
 */

#include <connect/ncbi_connutil.h>

#ifndef NCBI_DEPRECATED
# define NCBI_HTTP_CONNECTOR_DEPRECATED
#else
# define NCBI_HTTP_CONNECTOR_DEPRECATED NCBI_DEPRECATED
#endif


/** @addtogroup Connectors
 *
 * @{
 */


#ifdef __cplusplus
extern "C" {
#endif


/** HTTP connector flags.
 *
 * @var fHTTP_Flushable
 *
 * HTTP/1.0 or when fHTTP_WriteThru is not set:
 * by default all data written to the connection are kept until read
 * begins (even though CONN_Flush() might have been called in between the
 * writes); with this flag set, CONN_Flush() will cause the data to be
 * actually sent to the server side, so the following write will form a
 * new request, and not get added to the previous one; also this flag
 * ensures that the connector sends at least an HTTP header on "CLOSE"
 * and re-"CONNECT", even if no data for HTTP body have been written.
 *
 * HTTP/1.1 and when fHTTP_WriteThru is set:
 * CONN_Flush() attempts to send all pending data down to server.
 *
 * @var fHTTP_KeepHeader
 * Do not strip HTTP header (i.e. everything up to the first "\r\n\r\n",
 * including the "\r\n\r\n") from the incoming HTTP response (including
 * any server error, which then is made available for reading as well).
 * *NOTE* this flag disables automatic authorization and redirection.
 *
 * @var fHCC_UrlDecodeInput
 * Assume the response body is single-part, URL-encoded; perform the
 * URL-decoding on read, and deliver decoded data to the user. Obsolete!
 *
 * @var fHTTP_PushAuth
 * Present credentials to the server if they are set in the connection
 * parameters when sending the first request. Normally, the credentials are
 * only presented on a retry when the server rejects the initial request
 * with 401 / 407. This saves a hit, but is only honored with HTTP/1.1.
 *
 * @var fHTTP_WriteThru
 * Valid only with HTTP/1.1: Connection to the server is made upon the
 * first CONN_Write(), or CONN_Flush() if fHTTP_Flushable is set, or
 * CONN_Wait(eIO_Write), and each CONN_Write() forms a chunk of HTTP
 * data to be sent to the server. Reading / waiting for read from the
 * connector finalizes the body and, if reading, fetches the response.
 *
 * @var fHTTP_NoUpread
 * Do *not* do internal reading into temporary buffer while sending data
 * to HTTP server; by default any send operation tries to fetch data as
 * they are coming back from the server in order to prevent stalling due
 * to data clogging the connection.
 *
 * @var fHTTP_DropUnread
 * Do not collect incoming data in "Read" mode before switching into
 * "Write" mode for preparing next request; by default all data sent by
 * the server get stored even if not all of it has been requested prior
 * to a "Write" that followed data reading (stream emulation).
 *
 * @var fHTTP_NoAutoRetry
 * Do not attempt any auto-retries in case of failing connections
 * (this flag effectively overrides SConnNetInfo::max_try with 1).
 *
 * @var fHTTP_UnsafeRedirects
 * The following redirects pose a security risk and are therefore
 * prohibited by default: switching from https to http, and/or
 * re-POSTing data (regardless of the transport, either http or https);
 * this flag allows such redirects (when encountered) to be honored.
 *
 * @note
 * URL encoding/decoding (in the "fHCC_Url*" cases and "net_info->args")
 * is performed by URL_Encode() and URL_Decode() -- see "ncbi_connutil.[ch]".
 *
 * @sa
 * SConnNetInfo, ConnNetInfo_OverrideUserHeader, URL_Encode, URL_Decode
 */
enum EHTTP_Flag {
    fHTTP_AutoReconnect    = 0x1,   /**< See HTTP_CreateConnectorEx() */
    fHTTP_Flushable        = 0x2,   /**< Connector will really flush on Flush()*/
    fHTTP_KeepHeader       = 0x4,   /**< Keep HTTP header (see limitations) */
    /*fHCC_UrlEncodeArgs   = 0x8,   URL-encode "info->args" (w/o fragment)*/
    /*fHCC_UrlDecodeInput  = 0x10,  URL-decode response body */
    /*fHCC_UrlEncodeOutput = 0x20,  URL-encode all output data */
    /*fHCC_UrlCodec        = 0x30,  fHTTP_UrlDecodeInput | ...EncodeOutput*/
    fHTTP_PushAuth         = 0x10,  /**< HTTP/1.1 pushes out auth if present */
    fHTTP_WriteThru        = 0x20,  /**< HTTP/1.1 writes through (chunked) */
    fHTTP_NoUpread         = 0x40,  /**< Do not use SOCK_SetReadOnWrite() */
    fHTTP_DropUnread       = 0x80,  /**< Each microsession drops unread data */
    fHTTP_NoAutoRetry      = 0x100, /**< No auto-retries allowed */
    fHTTP_NoAutomagicSID   = 0x200, /**< Do not add NCBI SID automagically */
    fHTTP_UnsafeRedirects  = 0x400, /**< Any redirect will be honored */
    fHTTP_AdjustOnRedirect = 0x800, /**< Call adjust routine for redirects, too*/
    fHTTP_SuppressMessages = 0x1000 /**< Most annoying ones reduced to traces */
};
typedef unsigned int THTTP_Flags; /**< Bitwise OR of EHTTP_Flag */
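The flags are meant to be combined with bitwise OR into a THTTP_Flags value. The following is only a minimal sketch of that idiom: it redeclares a few flag values copied from the enumeration above so that it compiles without the toolkit, and `has_flag()` and `make_flags()` are hypothetical helpers, not part of this header.

```c
#include <assert.h>

/* Local copies of a few values from EHTTP_Flag, for a standalone build only;
 * real code would include <connect/ncbi_http_connector.h> instead. */
enum {
    fHTTP_AutoReconnect = 0x1,
    fHTTP_Flushable     = 0x2,
    fHTTP_WriteThru     = 0x20,
    fHTTP_NoAutoRetry   = 0x100
};
typedef unsigned int THTTP_Flags;

/* Hypothetical helper: non-zero if every bit of "flag" is set in "flags". */
static int has_flag(THTTP_Flags flags, THTTP_Flags flag)
{
    assert(flag != 0);
    return (flags & flag) == flag;
}

/* Hypothetical helper: flags for a reusable connector whose CONN_Flush()
 * really sends the pending data. */
static THTTP_Flags make_flags(void)
{
    return fHTTP_AutoReconnect | fHTTP_Flushable;
}
```

A caller would pass the result of `make_flags()` as the "flags" argument of one of the connector constructors declared in this header.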

/** DEPRECATED, do not use! */
typedef enum {
    /*fHCC_AutoReconnect = fHTTP_AutoReconnect, */
    /*fHCC_Flushable     = fHTTP_Flushable,     */
    /*fHCC_SureFlush     = fHTTP_Flushable,     */
    /*fHCC_KeepHeader    = fHTTP_KeepHeader,    */
    fHCC_UrlEncodeArgs   = 0x8,  /**< NB: Error-prone semantics, do not use!*/
    fHCC_UrlDecodeInput  = 0x10, /**< Obsolete, may not work, do not use! */
    fHCC_UrlEncodeOutput = 0x20, /**< Obsolete, may not work, do not use! */
    fHCC_UrlCodec        = 0x30  /**< fHCC_UrlDecodeInput | ...EncodeOutput */
    /*fHCC_NoUpread      = fHTTP_NoUpread,      */
    /*fHCC_DropUnread    = fHTTP_DropUnread,    */
    /*fHCC_NoAutoRetry   = fHTTP_NoAutoRetry    */
} EHCC_Flag;
NCBI_HTTP_CONNECTOR_DEPRECATED
typedef unsigned int THCC_Flags; /**< bitwise OR of EHCC_Flag, deprecated */


/** Same as HTTP_CreateConnectorEx(net_info, flags, 0, 0, 0, 0)
 * with the passed "user_header" overriding the value provided in
 * "net_info->http_user_header".
 * @sa
 * HTTP_CreateConnectorEx, ConnNetInfo_OverrideUserHeader
 */
extern NCBI_XCONNECT_EXPORT CONNECTOR HTTP_CreateConnector
(const SConnNetInfo* net_info,
 const char* user_header,
 THTTP_Flags flags
 );


/** The extended version HTTP_CreateConnectorEx() is able to track the HTTP
 * response chain and also change the URL of the server "on-the-fly":
 *
 * - FHTTP_ParseHeader() gets called every time a new HTTP response header is
 *   received from the server, and only if fHTTP_KeepHeader is NOT set.
 *   Return code from the parser adjusts the existing server error condition
 *   (if any) as follows:
 *
 *   + eHTTP_HeaderError: unconditionally flag a server error;
 *   + eHTTP_HeaderSuccess: header parse successful, retain existing condition
 *     (note that in case of an already existing server
 *     error condition the response body can be logged
 *     but will not be made available for the user code
 *     to read, and eIO_Unknown will result on read);
 *   + eHTTP_HeaderContinue: if there was already a server error condition,
 *     the response body will be made available for the
 *     user code to read (but only if HTTP connector
 *     cannot post-process the request such as for
 *     redirects, authorization etc); otherwise, this
 *     code has the same effect as eHTTP_HeaderSuccess;
 *   + eHTTP_HeaderComplete: flag this request as processed completely, and do
 *     not do any post-processing (such as redirects,
 *     authorization etc), yet make the response body (if
 *     any, and regardless of whether there was a server
 *     error or not) available for reading.
 *
 * - FHTTP_Adjust() gets invoked every time before starting a new "HTTP
 *   micro-session" (e.g. to make a hit when a previous hit has failed, to
 *   follow a redirect, or to change a URL to hit next). It is passed the live
 *   "net_info" exactly as stored within the connector, and:
 *   1. a positive number of previously unsuccessful consecutive attempts
 *      (in the least significant word of "failure_count") since the connector
 *      was successfully opened; or
 *   2. 0 in "failure_count" if being called for a redirect (when
 *      fHTTP_AdjustOnRedirect was set); or
 *   3. -1 in "failure_count" if the callback is invoked when a new request
 *      is about to be made (as a solicitation of a new URL for the hit, if
 *      any available).
 *   A zero (false) return value ends the request/retries and issues an error.
 *   A non-zero continues with the request: an advisory value of greater than
 *   0 means an adjustment was made, and a negative value indicates no changes
 *   (and in case "3" above would not initiate any subsequent new request).
 *   If a new HTTP request is started as a result of the callback, the updated
 *   SConnNetInfo parameters ("net_info") are going to be used.
 *
 * - FHTTP_Cleanup() gets called when the connector is about to be destroyed;
 *   "user_data" is guaranteed not to be referenced anymore (so this is a good
 *   place to clean up "user_data" if necessary).
 *
 * @sa
 * SConnNetInfo::max_try
 */
typedef enum {
    eHTTP_HeaderError    = 0, /**< Parse failed, treat as a server error */
    eHTTP_HeaderSuccess  = 1, /**< Parse succeeded, retain server status */
    eHTTP_HeaderContinue = 2, /**< Parse succeeded, continue with body */
    eHTTP_HeaderComplete = 3  /**< Parse succeeded, no more processing */
} EHTTP_HeaderParse;
typedef EHTTP_HeaderParse (*FHTTP_ParseHeader)
(const char* http_header, /**< HTTP header to parse */
 void* user_data, /**< supplemental user data */
 int server_error /**< != 0 if HTTP error (NOT 2xx code) */
 );
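To illustrate how the return codes described above might be used, here is a minimal sketch of a header-parsing callback with the FHTTP_ParseHeader signature. It redeclares EHTTP_HeaderParse so that it builds standalone; `SMyHeaderInfo` and `my_parse_header()` are hypothetical names, and the policy shown (expose the body of an error response to the caller) is just one possible choice.

```c
#include <assert.h>
#include <stdio.h>

/* Local copy of the enum from the listing above, for a standalone build. */
typedef enum {
    eHTTP_HeaderError    = 0,
    eHTTP_HeaderSuccess  = 1,
    eHTTP_HeaderContinue = 2,
    eHTTP_HeaderComplete = 3
} EHTTP_HeaderParse;

/* Hypothetical user data: where the callback records the status code. */
typedef struct {
    int status_code;
} SMyHeaderInfo;

/* Extract the status code from the status line; on a server error ask the
 * connector to let the response body through (eHTTP_HeaderContinue). */
static EHTTP_HeaderParse my_parse_header(const char* http_header,
                                         void*       user_data,
                                         int         server_error)
{
    SMyHeaderInfo* info = (SMyHeaderInfo*) user_data;
    int code = 0;
    assert(http_header  &&  info);
    /* The two "%*d" conversions skip the major and minor HTTP version */
    if (sscanf(http_header, "HTTP/%*d.%*d %d", &code) != 1)
        return eHTTP_HeaderError;      /* not a recognizable status line */
    info->status_code = code;
    return server_error ? eHTTP_HeaderContinue : eHTTP_HeaderSuccess;
}
```

Such a callback would be passed as "parse_header", with a pointer to an `SMyHeaderInfo` as "user_data", when creating the connector.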


/* Called with failure_count == 0 for redirects; and with failure_count == -1
 * for a new URL before starting new successive request(s). Return value 0
 * means an error, and stops processing; return value 1 means changes were
 * made, and request should proceed; and return value -1 means no changes.
 */
typedef int/*bool*/ (*FHTTP_Adjust)
(SConnNetInfo* net_info, /**< net_info to adjust (in place) */
 void* user_data, /**< supplemental user data */
 unsigned int failure_count /**< low word: # of failures since open */
 );
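The three kinds of "failure_count" values and the three return values can be sketched as follows. This is an illustration under stated assumptions, not the toolkit's code: `SConnNetInfo` is replaced by a one-field stand-in (the real structure is declared in "ncbi_connutil.h" and has many more members), the "least significant word" is assumed to be the low 16 bits, and `SMyBackups` and `my_adjust()` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the real SConnNetInfo; only a host field is modeled here. */
typedef struct {
    char host[256];
} SConnNetInfo;

/* Hypothetical user data: backup hosts to try after failures. */
typedef struct {
    const char* const* hosts;
    size_t             n_hosts;
} SMyBackups;

/* Returns 0 to stop with an error, 1 if "net_info" was changed,
 * and -1 if nothing was changed. */
static int my_adjust(SConnNetInfo* net_info,
                     void*         user_data,
                     unsigned int  failure_count)
{
    const SMyBackups* backups = (const SMyBackups*) user_data;
    unsigned int failures;
    assert(net_info  &&  backups);
    if (failure_count == (unsigned int)(-1))
        return -1;                 /* new request: no other URL to offer */
    if (failure_count == 0)
        return -1;                 /* redirect: accept it unchanged */
    failures = failure_count & 0xFFFF;  /* assumed width of the low word */
    if (failures == 0  ||  failures > backups->n_hosts)
        return 0;                  /* out of backups: give up */
    snprintf(net_info->host, sizeof(net_info->host), "%s",
             backups->hosts[failures - 1]);
    return 1;                      /* retry with the adjusted host */
}
```

Here the N-th consecutive failure switches to the N-th backup host, and the callback gives up once the list is exhausted.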

typedef void (*FHTTP_Cleanup)
(void* user_data /**< supplemental user data */
 );


/** Create new CONNECTOR structure to hit the specified URL using HTTP with
 * either POST / GET (or ANY) method. Use the configuration values stored in
 * "net_info". If "net_info" is NULL, then use the default info as created by
 * ConnNetInfo_Create(0).
 *
 * If "net_info" does not explicitly specify an HTTP request method (i.e. it
 * has it as "eReqMethod_Any"), then the actual method sent to the HTTP server
 * depends on whether any data has been written to the connection with
 * CONN_Write(): the presence of pending data will cause a POST request (with
 * a "Content-Length:" tag supplied automatically and reflecting the total
 * pending data size), and a GET request will result in the absence of any
 * data. An explicit value for the request method will cause the specified
 * request to be used regardless of pending data, and will flag an error if any
 * data would have to be sent with a GET (per the standard).
 *
 * When not using HTTP/1.1's fHTTP_WriteThru mode, in order to work around
 * some HTTP communication features, this code does the following:
 *
 * 1. Accumulate all output data in an internal memory buffer until the
 *    first CONN_Read() (including peek) or CONN_Wait(on read) is attempted
 *    (also see the fHTTP_Flushable flag).
 * 2. On the first CONN_Read() or CONN_Wait(on read), compose and send the
 *    whole HTTP request as:
 *    @verbatim
 *    METHOD <net_info->path>?<net_info->args> HTTP/1.0\r\n
 *    <user_header\r\n>
 *    Content-Length: <accumulated_data_length>\r\n
 *    \r\n
 *    <accumulated_data>
 *    @endverbatim
 *    @note
 *    If <user_header> is neither a NULL pointer nor an empty string, then:
 *    - it must NOT contain any "empty lines": "\r\n\r\n";
 *    - multiple tags must be separated by "\r\n" (*not* just "\n");
 *    - it should be terminated by a single "\r\n" (will be added, if not);
 *    - it gets inserted to the HTTP header "as is", without any automatic
 *      checking and / or encoding (except for the trailing "\r\n");
 *    - the "user_header" specified in the arguments overrides any user
 *      header that can be provided via the "net_info" argument, see
 *      ConnNetInfo_OverrideUserHeader() from <connect/ncbi_connutil.h>.
 *    @note
 *    Data may depart to the server side earlier if CONN_Flush()'ed in a
 *    fHTTP_Flushable connector, see "flags".
 * 3. Once the request has been sent, then the response data from the peer
 *    (usually, a CGI program) can be actually read out.
 * 4. On a CONN_Write() operation, which follows data reading, the connection
 *    to the peer is read out until EOF (all the data saved internally) then
 *    forcibly closed (the peer CGI process will presumably die if it has not
 *    done so yet on its own), and data to be written again get stored in the
 *    buffer until the next "Read" (see item 1). The subsequent read will
 *    first see the leftovers (if any) of data saved previously, then the new
 *    data generated in response to the latest request. The behavior can be
 *    changed by the fHTTP_DropUnread flag (not to save the unread data).
 *
 * When fHTTP_WriteThru is set with HTTP/1.1, writing to the connector begins
 * upon any write operation, and reading from the connector causes the
 * request body to finalize and response to be fetched from the server.
 * Request method must be explicitly specified with fHTTP_WriteThru; "ANY"
 * is not accepted (eIO_NotSupported returned).
 *
 * @note
 * If "fHTTP_AutoReconnect" is set in "flags", then the connector makes an
 * automatic reconnect to the same URL with just the same parameters each
 * time the micro-session steps (1, 2, 3) are repeated.
 * @note
 * If "fHTTP_AutoReconnect" is not set then only a single
 * "Write ... Write Read ... Read" micro-session is allowed, and any
 * following write attempt fails with "eIO_Closed".
 *
 * @sa
 * EHTTP_Flag
 */
extern NCBI_XCONNECT_EXPORT CONNECTOR HTTP_CreateConnectorEx
(const SConnNetInfo* net_info,
 THTTP_Flags flags,
 FHTTP_ParseHeader parse_header, /**< may be NULL, then no addtl. parsing */
 void* user_data, /**< user data for HTTP CBs (callbacks) */
 FHTTP_Adjust adjust, /**< may be NULL */
 FHTTP_Cleanup cleanup /**< may be NULL */
 );
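The request layout from step 2 of the comment above, including the rule that a missing trailing "\r\n" gets appended to the user header, can be sketched as below. This is not the connector's implementation: `compose_request()` is a hypothetical function, the body is treated as a plain C string for brevity, and omitting the "?" when there are no arguments is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration of the HTTP/1.0 request layout described above.
 * Returns the request length, or -1 if "buf" is too small. */
static int compose_request(char* buf, size_t size,
                           const char* method, const char* path,
                           const char* args,   const char* user_header,
                           const char* body)
{
    size_t hlen = user_header ? strlen(user_header) : 0;
    /* Append "\r\n" only if a non-empty user header does not end with it */
    int need_crlf = hlen > 0
        &&  (hlen < 2  ||  strcmp(user_header + hlen - 2, "\r\n") != 0);
    int n;
    assert(buf  &&  method  &&  path  &&  body);
    n = snprintf(buf, size,
                 "%s %s%s%s HTTP/1.0\r\n"
                 "%s%s"
                 "Content-Length: %lu\r\n"
                 "\r\n"
                 "%s",
                 method, path, args  &&  *args ? "?" : "", args ? args : "",
                 hlen ? user_header : "", need_crlf ? "\r\n" : "",
                 (unsigned long) strlen(body), body);
    return n < 0  ||  (size_t) n >= size ? -1 : n;
}
```

For instance, composing a POST to a hypothetical "/cgi-bin/test.cgi" with arguments "a=1", the user header "User-Agent: demo" and the body "hello" yields a request line, the user header terminated with "\r\n", "Content-Length: 5", an empty line, and then the body.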


/** Create a tunnel to "net_info->host:net_info->port" via an HTTP proxy server
 * located at "net_info->http_proxy_host:net_info->http_proxy_port". Return
 * the tunnel as a socket via the last parameter. For compatibility with
 * future API extensions, please make sure *sock is NULL when making the call.
 * "net_info->scheme" is only used to infer the proper default form of the
 * ":port" part in the "Host:" tag for the proxy request in case of HTTP[S]
 * (thus, eURL_Unspec forces the ":port" part to be always present in the tag).
 * @note
 * "net_info" can be passed as NULL to be constructed from the environment.
 * @note
 * "sock" parameter must be non-NULL but must point to a NULL SOCK (checked!).
 * @note
 * Some HTTP proxies do not process "data" correctly (e.g. Squid 3) when sent
 * along with the tunnel creation request (even though the standard
 * specifically allows such use), so they may require separate SOCK I/O calls
 * to write the data to the tunnel.
 * @return
 * eIO_Success if the tunnel (in *sock) has been successfully created;
 * otherwise, return an error code and if "*sock" was passed non-NULL and has
 * not been used at all in the call (consider: memory allocation errors or
 * invalid arguments to the call), do not modify it; else, "*sock" gets
 * closed internally and "*sock" returned cleared as 0.
 * @sa
 * THTTP_Flags, SOCK_CreateEx, SOCK_Close
 */
extern NCBI_XCONNECT_EXPORT EIO_Status HTTP_CreateTunnelEx
(const SConnNetInfo* net_info,
 THTTP_Flags flags,
 const void* init_data, /**< initial data block to send via tunnel */
 size_t init_size, /**< size of the initial data block */
 void* user_data, /**< user data for the adjust callback */
 FHTTP_Adjust adjust, /**< adjust callback, may be NULL */
 SOCK* sock /**< return socket; must be non-NULL */
 );


/** Same as HTTP_CreateTunnelEx(net_info, flags, 0, 0, 0, 0, sock) */
extern NCBI_XCONNECT_EXPORT EIO_Status HTTP_CreateTunnel
(const SConnNetInfo* net_info,
 THTTP_Flags flags,
 SOCK* sock
 );


typedef void (*FHTTP_NcbiMessageHook)(const char* message);

/** Set a message hook procedure for messages originating from NCBI via HTTP.
 * Any hook will be called no more than once. Until no hook is installed,
 * and exactly one message is caught, a critical error will be generated in
 * the standard log file upon acceptance of every message. *Not MT-safe*.
 */
extern NCBI_XCONNECT_EXPORT void HTTP_SetNcbiMessageHook
(FHTTP_NcbiMessageHook /**< New hook to be installed, NULL to reset */
 );
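A hook matching the FHTTP_NcbiMessageHook signature could look like the sketch below. The typedef is redeclared so that the example builds without the toolkit; `s_LastMessage` and `my_message_hook()` are hypothetical, and the static buffer makes the hook no more thread-safe than the installer itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same shape as the typedef in the listing above, for a standalone build. */
typedef void (*FHTTP_NcbiMessageHook)(const char* message);

/* Hypothetical storage for the most recent message (not MT-safe). */
static char s_LastMessage[256];

/* Keep a (possibly truncated) copy of the message for later display. */
static void my_message_hook(const char* message)
{
    assert(message);
    snprintf(s_LastMessage, sizeof(s_LastMessage), "%s", message);
}
```

In real code the hook would be installed with HTTP_SetNcbiMessageHook(my_message_hook), and reset by passing NULL.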


#ifdef __cplusplus
} /* extern "C" */
#endif


/* @} */

#endif /* CONNECT___HTTP_CONNECTOR__H */
Modified on Wed Jul 17 13:23:02 2024 by modify_doxy.py rev. 669887