Convert URL to normal Windows filename in Java

The current recommendation (with JDK 1.7+) is to convert URL → URI → Path. So to convert a URL to a File, you would write Paths.get(url.toURI()).toFile(). If you can’t use JDK 1.7 yet, I would recommend new File(uri.getSchemeSpecificPart()), where uri is the result of url.toURI().
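
As a minimal sketch of both routes (the class name UrlToFile and the helper names toFile and toFileLegacy are made up for illustration, and wrapping URISyntaxException in an unchecked exception is just one possible choice):

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Paths;

final class UrlToFile {
    // JDK 1.7+ route: URL -> URI -> Path -> File.
    static File toFile(URL url) {
        try {
            return Paths.get(url.toURI()).toFile();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + url, e);
        }
    }

    // Fallback for older JDKs: take the decoded scheme-specific part
    // (everything after "file:") and hand it to the File constructor.
    static File toFileLegacy(URL url) {
        try {
            return new File(url.toURI().getSchemeSpecificPart());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + url, e);
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        // Hypothetical file name containing a space and '+' characters.
        File original = new File("Program Files", "main.c++").getAbsoluteFile();
        URL url = original.toURI().toURL();
        System.out.println(url);
        System.out.println(toFile(url));
        System.out.println(toFileLegacy(url));
    }
}
```

Both helpers should give back the original file for an ordinary local path; the tables further down show where they differ (UNC paths and the \\?\ prefix).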

Converting file → URI: First I’ll show you some examples of the URIs you are likely to get in Java.

                          -classpath URLClassLoader File.toURI()                Path.toUri()
C:\Program Files          file:/C:/Program%20Files/ file:/C:/Program%20Files/   file:///C:/Program%20Files/
C:\main.c++               file:/C:/main.c++         file:/C:/main.c++           file:///C:/main.c++
\\VBOXSVR\Downloads       file://VBOXSVR/Downloads/ file:////VBOXSVR/Downloads/ file://VBOXSVR/Downloads/
C:\Résume.txt             file:/C:/R%c3%a9sume.txt  file:/C:/Résume.txt         file:///C:/Résume.txt
\\?\C:\Windows (non-path) file://%3f/C:/Windows/    file:////%3F/C:/Windows/    InvalidPathException

Some observations about these URIs:

  • The URI specifications are RFC 1738 (URL), superseded by RFC 2396 (URI), superseded in turn by RFC 3986 (URI). (The WHATWG also has a URL spec, but it does not specify how file URIs should be interpreted.) Any reserved characters within the path are percent-quoted, and non-ASCII characters in a URI are percent-quoted when you call URI.toASCIIString().
  • File.toURI() is worse than Path.toUri() because File.toURI() returns an unusual non-RFC 1738 URI (gives file:/ instead of file:///) and does not format URIs for UNC paths according to Microsoft’s preferred format. None of these UNC URIs work in Firefox though (Firefox requires file://///).
  • Path is more strict than File; you cannot construct an invalid Path with the “\\.\” prefix. “These prefixes are not used as part of the path itself,” but they can be passed to Win32 APIs.
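
To see the percent-quoting from the first observation in isolation, here is a small sketch that builds file URIs with the multi-argument java.net.URI constructor rather than through File or Path, so the result does not depend on the platform. The class name UriQuoting and the helper fileUri are hypothetical, and the paths are assumed to be already "slashified":

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriQuoting {
    // The (scheme, host, path, fragment) constructor percent-quotes
    // characters that are not legal in a URI path, such as spaces.
    static URI fileUri(String path) {
        try {
            return new URI("file", null, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI spaces = fileUri("/C:/Program Files/");
        URI accent = fileUri("/C:/R\u00e9sume.txt"); // \u00e9 is 'é'

        System.out.println(spaces);                 // file:/C:/Program%20Files/
        System.out.println(spaces.getPath());       // /C:/Program Files/
        // Non-ASCII characters stay as-is in toString() and are only
        // percent-quoted (as UTF-8 bytes) by toASCIIString().
        System.out.println(accent.toASCIIString()); // file:/C:/R%C3%A9sume.txt
    }
}
```

Note that '+' is a legal path character, so a path such as /C:/main.c++ comes through unquoted.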

Converting URI → file: Let’s try converting the preceding examples to files:

                            new File(URI)            Paths.get(URI)           new File(URI.getSchemeSpecificPart())
file:///C:/Program%20Files  C:\Program Files         C:\Program Files         C:\Program Files
file:/C:/Program%20Files    C:\Program Files         C:\Program Files         C:\Program Files
file:///C:/main.c++         C:\main.c++              C:\main.c++              C:\main.c++
file://VBOXSVR/Downloads/   IllegalArgumentException \\VBOXSVR\Downloads\     \\VBOXSVR\Downloads
file:////VBOXSVR/Downloads/ \\VBOXSVR\Downloads      \\VBOXSVR\Downloads\     \\VBOXSVR\Downloads
file://///VBOXSVR/Downloads \\VBOXSVR\Downloads      \\VBOXSVR\Downloads\     \\VBOXSVR\Downloads
file://%3f/C:/Windows/      IllegalArgumentException IllegalArgumentException \\?\C:\Windows
file:////%3F/C:/Windows/    \\?\C:\Windows           InvalidPathException     \\?\C:\Windows

Again, using Paths.get(URI) is preferred over new File(URI), because Path is able to handle the UNC URI and reject invalid paths with the \\?\ prefix. But if you can’t use Java 1.7, use new File(URI.getSchemeSpecificPart()) instead.
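
Since Paths.get(URI) signals bad input by throwing, callers that receive URIs from outside may want to catch those exceptions. A possible sketch follows; the class SafeUriToPath and the method toPathOrNull are hypothetical names, and returning null is only one way to report failure:

```java
import java.net.URI;
import java.nio.file.FileSystemNotFoundException;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SafeUriToPath {
    // Returns the Path for the given URI, or null if it cannot be converted.
    static Path toPathOrNull(URI uri) {
        try {
            return Paths.get(uri);
        } catch (IllegalArgumentException | FileSystemNotFoundException e) {
            // IllegalArgumentException also covers InvalidPathException
            // (its subclass) and URIs with no scheme;
            // FileSystemNotFoundException is thrown for schemes with no
            // installed file system provider, such as http.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toPathOrNull(URI.create("file:///C:/Program%20Files")));
        System.out.println(toPathOrNull(URI.create("http://example.com/x")));
    }
}
```

What the first call prints depends on the platform; the table above shows the results on Windows.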

By the way, do not use URLDecoder to decode a file URL. For files containing “+” such as “file:///C:/main.c++”, URLDecoder will turn it into “C:\main.c  ”! URLDecoder is only for parsing application/x-www-form-urlencoded HTML form submissions within a URI’s query (param=value&param=value), not for unquoting a URI’s path.
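
A short sketch of the difference (PlusSignDemo, formDecode and uriPath are made-up names for illustration):

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

final class PlusSignDemo {
    // Form decoding: '+' means space, which is wrong for file paths.
    static String formDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // URI path decoding: only %XX escapes are unquoted, '+' is left alone.
    static String uriPath(String fileUri) {
        return URI.create(fileUri).getPath();
    }

    public static void main(String[] args) {
        String uri = "file:///C:/main.c++";
        System.out.println("[" + formDecode(uri) + "]"); // [file:///C:/main.c  ]
        System.out.println("[" + uriPath(uri) + "]");    // [/C:/main.c++]
    }
}
```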

2014-09: edited to add examples.
