Python URLLib / URLLib2 POST

u = urllib2.urlopen('http://myserver/inout-tracker', data)
h.request('POST', '/inout-tracker/index.php', data, headers)

Using the path /inout-tracker without a trailing / doesn’t fetch index.php. Instead, the server issues a 302 redirect to the version with the trailing /. A 302 typically causes clients to convert the POST into a GET request.
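As a rough illustration of avoiding the redirect, the sketch below builds the POST against the URL that already ends in a slash. It uses Python 3's urllib.request in place of urllib2, and the helper name build_post_request and the form fields are hypothetical; nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post_request(base_url, fields):
    """Return a POST Request aimed at the directory URL (trailing slash)."""
    # Add the trailing slash ourselves so the server has no reason to
    # answer with a 302, which clients commonly replay as a GET.
    if not base_url.endswith('/'):
        base_url += '/'
    body = urlencode(fields).encode('ascii')  # form-encoded bytes
    return Request(base_url, data=body, method='POST')


# Hypothetical form fields; this only constructs the request object.
request = build_post_request('http://myserver/inout-tracker',
                             {'name': 'alice', 'status': 'in'})
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen() would send it; that step is left out here because myserver is only the placeholder host from the excerpt.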

Replace special characters in a string in Python

One way is to use re.sub, which is my preferred way.

import re
my_str = "hey th~!ere"
my_new_string = re.sub(r'[^a-zA-Z0-9 \n\.]', '', my_str)
print(my_new_string)

Output: hey there

Another way is to use re.escape:

import string
import re
my_str = "hey th~!ere"
chars = re.escape(string.punctuation)
print(re.sub('[' + chars + ']', '', my_str))

Output: hey there

Just a small tip about …
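The two approaches are not interchangeable: the first keeps only an allowed set of characters (including the period), while the second removes everything in string.punctuation. A minimal Python 3 sketch of the difference, with hypothetical function names and sample text:

```python
import re
import string

# Approach 1: keep letters, digits, spaces, newlines and periods.
KEEP_PATTERN = re.compile(r'[^a-zA-Z0-9 \n.]')
# Approach 2: drop every character listed in string.punctuation.
PUNCT_PATTERN = re.compile('[' + re.escape(string.punctuation) + ']')


def strip_by_whitelist(text):
    """Remove anything that is not in the allowed character set."""
    return KEEP_PATTERN.sub('', text)


def strip_punctuation(text):
    """Remove ASCII punctuation, including periods."""
    return PUNCT_PATTERN.sub('', text)


sample = "hey th~!ere, v2.0"
print(strip_by_whitelist(sample))  # hey there v2.0
print(strip_punctuation(sample))   # hey there v20
```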


Which should I be using: urlparse or urlsplit?

Directly from the docs you linked yourself:

urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)

This is similar to urlparse(), but does not split the params from the URL. This should generally be used instead of urlparse() if the more recent URL syntax allowing parameters to be applied to each segment of the path portion of the URL (see RFC …
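To make the difference concrete, here is a small sketch that runs both functions on the same URL. The example.com URL with per-segment parameters is made up for illustration.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL with parameters on two path segments.
url = 'http://example.com/a;x=1/b;y=2?q=1#frag'

parsed = urlparse(url)  # 6 fields: params split off the last segment only
split = urlsplit(url)   # 5 fields: params stay inside the path

print(parsed.path, parsed.params)  # /a;x=1/b y=2
print(split.path)                  # /a;x=1/b;y=2
```

urlparse() only peels the parameters off the final path segment, so code that wants to handle per-segment parameters itself is usually better served by the untouched path from urlsplit().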


Is there a unicode-ready substitute I can use for urllib.quote and urllib.unquote in Python 2.6.5?

“Python’s urllib.quote and urllib.unquote do not handle Unicode correctly”

urllib does not handle Unicode at all. URLs don’t contain non-ASCII characters, by definition. When you’re dealing with urllib you should use only byte strings. If you want those to represent Unicode characters you will have to encode and decode them manually. IRIs can contain non-ASCII …
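A minimal sketch of the manual encode and decode steps, written against Python 3's urllib.parse rather than the Python 2.6 urllib the question asks about. The helper names are hypothetical, and UTF-8 is an assumed default, not something the excerpt prescribes.

```python
from urllib.parse import quote, unquote_to_bytes


def quote_unicode(text, encoding='utf-8'):
    """Encode text to bytes first, then percent-escape those bytes."""
    return quote(text.encode(encoding))


def unquote_unicode(quoted, encoding='utf-8'):
    """Undo the percent-escapes to bytes, then decode back to text."""
    return unquote_to_bytes(quoted).decode(encoding)


escaped = quote_unicode('sch\xe9nefeld')
print(escaped)                   # sch%C3%A9nefeld
print(unquote_unicode(escaped))  # schénefeld
```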


urllib.quote() throws KeyError

You are trying to quote Unicode data, so you need to decide how to turn that into URL-safe bytes. Encode the string to bytes first. UTF-8 is often used:

>>> import urllib
>>> urllib.quote(u'sch\xe9nefeld')
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py:1268: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  return ''.join(map(quoter, s))
…
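The choice of encoding matters because it changes the escaped result. The sketch below, using Python 3's urllib.parse.quote on explicitly encoded bytes, compares UTF-8 with Latin-1 for the same string; Latin-1 is included only as a contrasting assumption.

```python
from urllib.parse import quote

name = 'sch\xe9nefeld'

# The same character becomes different bytes, and so different escapes.
as_utf8 = quote(name.encode('utf-8'))
as_latin1 = quote(name.encode('latin-1'))

print(as_utf8)    # sch%C3%A9nefeld
print(as_latin1)  # sch%E9nefeld
```

Whichever encoding is picked, the receiving side has to decode with the same one.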


What is the global default timeout

I suspect this is implementation-dependent. That said, for CPython:

From socket.create_connection,

If no timeout is supplied, the global default timeout setting returned by :func:`getdefaulttimeout` is used.

From socketmodule.c,

static PyObject *
socket_getdefaulttimeout(PyObject *self)
{
    if (defaulttimeout < 0.0) {
        Py_INCREF(Py_None);
        return Py_None;
    }
    else
        return PyFloat_FromDouble(defaulttimeout);
}

Earlier in the same file,

static double defaulttimeout …
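From the Python side, the same setting can be read and changed through the socket module. A short sketch, assuming an arbitrary value of 5 seconds, that sets the global default, checks that a new socket inherits it, and then restores whatever was there before:

```python
import socket

# Remember the current global default (None means "no timeout").
previous = socket.getdefaulttimeout()

socket.setdefaulttimeout(5.0)  # arbitrary value for illustration
observed = socket.getdefaulttimeout()

# Sockets created after the change pick up the new default.
sock = socket.socket()
sock_timeout = sock.gettimeout()
sock.close()

# Put the global default back so other code is unaffected.
socket.setdefaulttimeout(previous)

print(observed, sock_timeout)
```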


Python: Get HTTP headers from urllib2.urlopen call?

Use the response.info() method to get the headers. From the urllib2 docs:

urllib2.urlopen(url[, data][, timeout])

…

This function returns a file-like object with two additional methods:

geturl(): return the URL of the resource retrieved, commonly used to determine if a redirect was followed
info(): return the meta-information of the page, such as headers, …
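A small sketch of both methods, using Python 3's urllib.request.urlopen as a stand-in for urllib2.urlopen. To keep it runnable without a network connection it opens a data: URL, which is an assumption of this example; an http:// URL would be read the same way.

```python
from urllib.request import urlopen

# A data: URL keeps the example offline; swap in an http:// URL as needed.
url = 'data:text/plain;charset=utf-8,hello'

with urlopen(url) as response:
    headers = response.info()      # message object holding the headers
    final_url = response.geturl()  # URL actually retrieved
    body = response.read()

# Header lookups are case-insensitive.
content_type = headers['Content-Type']
print(content_type)
print(final_url)
```

The object returned by info() also supports items() for iterating over every header name and value.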


How to unquote a urlencoded unicode string in python?

%uXXXX is a non-standard encoding scheme that has been rejected by the W3C, despite the fact that an implementation continues to live on in JavaScript land. The more common technique seems to be to UTF-8 encode the string and then %-escape the resulting bytes using %XX. This scheme is supported by urllib.unquote:

>>> urllib2.unquote("%0a")
…
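If input in the non-standard %uXXXX form does have to be decoded, one possible approach is a single regular-expression pass that handles both forms. This is only a sketch in Python 3: the function name unquote_percent_u is hypothetical, and it does not combine surrogate pairs for characters outside the Basic Multilingual Plane.

```python
import re
from urllib.parse import unquote

# Match either one %uXXXX escape or a run of standard %XX escapes.
_ESCAPE = re.compile(r'%u([0-9a-fA-F]{4})|((?:%[0-9a-fA-F]{2})+)')


def _decode(match):
    if match.group(1) is not None:
        # Non-standard form: the four hex digits are a code point.
        return chr(int(match.group(1), 16))
    # Standard form: percent-escaped bytes, decoded as UTF-8 by unquote().
    return unquote(match.group(2))


def unquote_percent_u(text):
    """Decode %uXXXX and %XX escapes in one pass."""
    return _ESCAPE.sub(_decode, text)


print(unquote_percent_u('caf%u00e9%20au%20lait'))  # café au lait
print(unquote_percent_u('caf%C3%A9'))              # café
```

Doing both substitutions in one pass avoids decoding a character produced by one kind of escape a second time.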
