Friday, 15 February 2008

Input Validation... Getting It Right the First Time?

Most of the attacks on web applications happen because of poor input validation. So can we say that if input validation is done properly, these applications would avoid these attacks? Maybe, but this is easier said than done.

Given the complexity of today's multi-tier web application architectures, it is really difficult to decide where and how input validation should be done. While conducting penetration tests, I regularly come across web applications whose developers rely too heavily on client-side validation controls; only a handful of them get it right the first time.

Any user input that is processed by the application should be treated as malicious. This includes all the HTTP headers and hidden form fields that are processed by the application and not just the data entered by the user in the form fields. User input that is passed to the server should always be checked for type, length, format and range.
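As a rough illustration of the four checks (type, length, format and range), here is a minimal Python sketch for a single numeric field. The field ("age"), the function name validate_age and the limits used are assumptions made up for this example, not taken from any particular application.

```python
import re

# Hypothetical limits for an "age" form field; adjust to the real requirements.
MAX_AGE_DIGITS = 3
MIN_AGE, MAX_AGE = 0, 150


def validate_age(raw):
    """Return the age as an int, or raise ValueError if any check fails."""
    # Length: reject over-long input before doing anything else with it.
    if not isinstance(raw, str) or len(raw) > MAX_AGE_DIGITS:
        raise ValueError("age has an unexpected length")
    # Format: ASCII digits only (also rejects the empty string, signs, spaces).
    if not re.fullmatch(r"[0-9]+", raw):
        raise ValueError("age must contain digits only")
    # Type: convert to the type the application actually works with.
    age = int(raw)
    # Range: the value must make sense for the field.
    if not MIN_AGE <= age <= MAX_AGE:
        raise ValueError("age is out of range")
    return age
```

The same function would be applied to the value wherever it arrives from (form field, hidden field, header), since none of these can be trusted.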

The application should rely only on server-side validation controls and not on client-side ones, as it is trivial to bypass client-side controls by using a web proxy. Even on the server side, it can be tricky to get proper validation controls in place. So a defence-in-depth approach should be applied, with the data validated every time it crosses a trust boundary. This prevents a server at one tier from assuming that a string is legitimate just because an earlier tier accepted it. For example, it may be perfectly legitimate for a web server to treat a single quote (') as a valid character in a name field (as someone may be named O'Neil), but validation controls on the application server should then make sure that the single quote is treated as literal data, and not as part of an SQL query sent to the database server. On top of that, we can use parameterised queries when talking to the database, so that all input passed to it is treated as a parameter and never as a command.
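The O'Neil case can be sketched with a parameterised query. This example uses Python's built-in sqlite3 module and an in-memory database purely for illustration; the users table and the find_user helper are hypothetical names invented here.

```python
import sqlite3


def find_user(conn, surname):
    """Look up users by surname using a parameterised query."""
    # The ? placeholder binds surname as data; it is never parsed as SQL,
    # so a single quote in the value cannot change the query's structure.
    cur = conn.execute(
        "SELECT id, surname FROM users WHERE surname = ?", (surname,)
    )
    return cur.fetchall()


# Illustrative setup: an in-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, surname TEXT)")
conn.execute("INSERT INTO users (surname) VALUES (?)", ("O'Neil",))
conn.execute("INSERT INTO users (surname) VALUES (?)", ("Smith",))
```

With this setup, find_user(conn, "O'Neil") returns the one matching row, while a classic injection string such as ' OR '1'='1 is simply compared against the surname column and matches nothing.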

There are also two approaches to handling input validation on the server side. One is to create a white list of acceptable characters and reject everything else; the other is to create a black list of known malicious characters, strip or reject those, and accept everything else. The best practice is to create a white list of acceptable characters for each input field and reject everything else. This approach also helps the web application resist unknown and future attacks, whereas a black list only protects against the malicious characters that are currently recognised. It is also not always obvious in advance which characters are malicious, so with the black-listing approach it is easy to miss some attack vectors.
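To make the difference concrete, here is a small Python sketch comparing the two approaches on a name field. The allowed character set, the 50-character limit and the black list entries are assumptions chosen for illustration; a real name field would usually need to accept more than ASCII letters.

```python
import re

# Hypothetical white list for a name field: ASCII letters, space,
# apostrophe and hyphen, between 1 and 50 characters in total.
NAME_ALLOWED = re.compile(r"[A-Za-z' -]{1,50}")

# Hypothetical black list of "known bad" SQL-related sequences.
BLACKLIST = ("'", ";", "--")


def is_valid_name(value):
    """White list: accept only if every character is explicitly allowed."""
    return NAME_ALLOWED.fullmatch(value) is not None


def passes_blacklist(value):
    """Black list: accept unless a known bad sequence appears."""
    return not any(bad in value for bad in BLACKLIST)
```

With these definitions the black list gets both cases wrong: it rejects the legitimate name O'Neil because of the apostrophe, yet it accepts <script>alert(1)</script> because nobody thought to list angle brackets. The white list accepts O'Neil and rejects the script tag without needing to know about that attack in advance.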

So, if done correctly, input validation would prevent most of the injection attacks on your web application.