Friday, November 21, 2008

Validating File Type

A few weeks back, a colleague of mine who had built a .NET web page for uploading documents asked me for a good technique to validate the uploaded files.

I asked him how much he had already covered. He explained that he was using the ASP.NET FileUpload control, restricting the file extension to .doc and the file size to at most 2MB.
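For context, those two checks probably looked something like the minimal sketch below. The names FileUpload1, MaxUploadBytes and PassesBasicChecks are placeholders of mine, not his actual code, and the snippet assumes it lives in the code-behind of the page that hosts the FileUpload control.

```vb
' Hypothetical sketch of the extension and size checks.
' FileUpload1, MaxUploadBytes and PassesBasicChecks are placeholder names.
Private Const MaxUploadBytes As Integer = 2 * 1024 * 1024 ' 2MB limit

Private Function PassesBasicChecks() As Boolean
    ' Nothing was posted
    If Not FileUpload1.HasFile Then Return False

    ' Extension check: only .doc is accepted (case-insensitive)
    Dim ext As String = System.IO.Path.GetExtension(FileUpload1.FileName)
    If Not String.Equals(ext, ".doc", StringComparison.OrdinalIgnoreCase) Then Return False

    ' Size check: PostedFile.ContentLength is the upload size in bytes
    Return FileUpload1.PostedFile.ContentLength <= MaxUploadBytes
End Function
```

Both checks look only at the file's name and length, never at its content.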

Though this works, there is still one loophole. Someone could rename a malicious file to .doc, upload it, and so bring it into the system. Users access the uploaded documents through a different module, where the uploader can rename the file back to its original extension. We did not want that to happen; at the very least, a newly built system should not leave that loophole open.

One option was to check the Content-Type, but that failed miserably because the Content-Type is itself derived from nothing more than the extension of the uploaded file.

Since security was the main concern, and it did not matter whether the validation ran on the server side or the client side, I suggested detecting the MIME type from the file's actual content and gave him the following code snippet to help:


' Requires Imports System.IO at the top of the file
Public Function GetMimeDataOfFile(ByVal strFileName As String) As String

    ' Declare and initialize variables
    Dim mimeOut As IntPtr = IntPtr.Zero
    Dim returnStr As String = ""
    Dim buffer() As Byte = Nothing
    Dim urlmonResult As Integer = -1
    Dim maxContentSize As Integer = 0
    Dim fileInfo As FileInfo = Nothing

    ' Short circuit out if the file is not found
    If Not File.Exists(strFileName) Then

        Return returnStr

    Else

        fileInfo = New FileInfo(strFileName)

        ' Read the file into the buffer - the initial 1024 bytes are enough to determine the MIME type.
        ' FileInfo.Length is a Long, so cap it before converting to Integer.
        maxContentSize = CInt(Math.Min(fileInfo.Length, 1024L))

        ' Initialize the buffer. ReDim takes the upper bound, so subtract 1 to get exactly maxContentSize bytes.
        ReDim buffer(maxContentSize - 1)

        ' Open the file for reading and fill the buffer; Using closes the stream even if Read throws
        Using fstream As FileStream = fileInfo.OpenRead()
            ' Keep the number of bytes actually read, which may be less than requested
            maxContentSize = fstream.Read(buffer, 0, maxContentSize)
        End Using

        ' Call our pinvoke method
        urlmonResult = DllImports.FindMimeFromData(IntPtr.Zero, strFileName, buffer, maxContentSize, _
                                                   Nothing, 0, mimeOut, 0)

        If urlmonResult = 0 Then
            ' Convert the pointer data to Unicode text
            returnStr = System.Runtime.InteropServices.Marshal.PtrToStringUni(mimeOut)
        End If

        ' Free the string allocated by urlmon (a no-op if mimeOut is still IntPtr.Zero)
        System.Runtime.InteropServices.Marshal.FreeCoTaskMem(mimeOut)

    End If

    Return returnStr
End Function


The following P/Invoke signature is readily available from http://pinvoke.net


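Imports System.Runtime.InteropServices

Public Class DllImports

    ' FindMimeFromData is exported by urlmon.dll and returns an HRESULT (0 = S_OK)
    <DllImport("urlmon.dll", CharSet:=CharSet.Unicode, ExactSpelling:=True)> _
    Public Shared Function FindMimeFromData( _
        ByVal pBC As IntPtr, _
        <MarshalAs(UnmanagedType.LPWStr)> _
        ByVal pwzUrl As String, _
        <MarshalAs(UnmanagedType.LPArray)> ByVal _
        pBuffer As Byte(), _
        ByVal cbSize As Integer, _
        <MarshalAs(UnmanagedType.LPWStr)> _
        ByVal pwzMimeProposed As String, _
        ByVal dwMimeFlags As Integer, _
        ByRef ppwzMimeOut As IntPtr, _
        ByVal dwReserved As Integer) As Integer
    End Function

End Class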


Now even if a .jpg file is renamed to .doc and uploaded, this function sees through it and lets you flag that the content does not match the file extension.
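To show how the function might be wired into the upload page, here is a rough sketch rather than the code we actually shipped. FileUpload1 and SaveIfContentLooksLikeDoc are placeholder names, and the expected value "application/msword" is an assumption: check what GetMimeDataOfFile really returns for a genuine .doc on your own server before relying on it.

```vb
' Hypothetical sketch: FileUpload1 and SaveIfContentLooksLikeDoc are placeholder names.
' Assumption: a genuine .doc is reported as "application/msword" on your server.
Private Function SaveIfContentLooksLikeDoc(ByVal targetPath As String) As Boolean
    ' Save the upload under a temporary name that keeps the .doc extension
    Dim tempPath As String = System.IO.Path.Combine( _
        System.IO.Path.GetTempPath(), System.IO.Path.GetRandomFileName() & ".doc")
    FileUpload1.SaveAs(tempPath)

    ' Compare the detected MIME type against the expected one
    Dim detected As String = GetMimeDataOfFile(tempPath)
    If String.Equals(detected, "application/msword", StringComparison.OrdinalIgnoreCase) Then
        ' Content matches: move the file to its final location
        ' (File.Move throws if targetPath already exists)
        System.IO.File.Move(tempPath, targetPath)
        Return True
    End If

    ' Mismatch: discard the upload
    System.IO.File.Delete(tempPath)
    Return False
End Function
```

Saving to a temporary location first means a rejected file never reaches the folder that the other module reads from.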
