Guessing URL Parameters
Posted by Steve Green on 10 October 2012.
When testing websites, exploratory testers often spend some time looking at URL structures and editing them to learn how the system responds and to find vulnerabilities and other types of adverse behaviours.
The parameters on the end of a URL are a common target and a rich source of bugs. Many of us have seen e-commerce websites where a URL such as http://www.abc.com/basket?action=add&stockid=1234&price=19.99 can be edited such that you pay any price you wish for the product. Or nothing at all.
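To make the idea concrete, here is a minimal sketch of that kind of parameter tampering, using Python's standard library and the hypothetical basket URL from above. A well-built site would of course recalculate the price server-side; the point is only how trivially a client can rewrite the query string.

```python
from urllib.parse import urlsplit, parse_qs, urlencode, urlunsplit

def tamper_param(url: str, name: str, value: str) -> str:
    """Return a copy of `url` with one query parameter overwritten."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    params[name] = [value]  # overwrite (or add) the target parameter
    new_query = urlencode(params, doseq=True)
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       new_query, parts.fragment))

original = "http://www.abc.com/basket?action=add&stockid=1234&price=19.99"
tampered = tamper_param(original, "price", "0.01")
```

Anything the client sends can be edited like this, which is why prices, quantities and IDs must always be validated on the server.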
The parameters you can't see
However, editing the existing parameters is only part of the story. Most systems contain library code or code that has been re-used from some other application, and these sometimes contain support for parameters that the current application is not using.
If you can guess what those parameters are, you can get the system to do things the developers never intended. And sometimes that can be very interesting.
Site search engines often only present a simple search form, but under the hood lies a fully-featured search engine with advanced features. If those features have not been disabled, they will respond if you send the right parameters.
Looking at the URLs for other search engines can give clues, and I have sometimes had success with parameters such as &n=100 (to display 100 search results per page) or &sort=asc (to sort results in ascending order).
The latter may not be particularly interesting, but combined with the ability to display longer-than-normal results pages it can make analysis of the results easier. And the ability to display all the search results on a single page can be invaluable, even if it does make the server and database work harder than intended (I once crashed a system by making it create a single results page containing 250,000 records).
This technique can also allow you to bypass client-side restrictions such as greyed-out options in the search form or missing options such as a Country combobox that only lists certain countries.
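A simple way to work through guesses like these is to generate one candidate URL per guessed parameter and compare each response against a baseline search. The sketch below only builds the URLs; the parameter names are guesses borrowed from other search engines (as described above), and the `format` guess is purely hypothetical.

```python
from urllib.parse import urlencode

# Parameter names worth guessing, borrowed from other search engines.
GUESSES = [
    {"n": "100"},       # results per page
    {"sort": "asc"},    # sort order
    {"format": "xml"},  # alternative output format (hypothetical)
]

def candidate_urls(base: str, query: str, guesses=GUESSES):
    """Yield one search URL per guessed parameter."""
    for extra in guesses:
        params = {"q": query, **extra}
        yield f"{base}?{urlencode(params)}"

urls = list(candidate_urls("http://www.abc.com/search", "widgets"))
# Fetch each URL and diff the response against a plain search for the
# same query; any difference means the parameter is being honoured.
```

Comparing responses rather than just status codes matters here, because undocumented parameters are usually silently ignored rather than rejected.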
Is that the best you can come up with?
On one website I tested, the search URL allowed you to request pages by what appeared to be an index number. For instance, http://www.abc.com/?q=1 brought up the home page, while http://www.abc.com/?q=205 brought up an Unsubscribe page that I could not access via normal navigation.
I quickly built a 'sitemap' page that contained URLs with values of q from 0 to 500. By running a link checker against this page I was able to determine that every page existed for values of q between 1 and 205 and no pages existed for values higher than 205.
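That throwaway 'sitemap' page takes only a few lines to generate. A sketch, assuming the `?q=` URL pattern described above; the output is an HTML file you can feed straight into any ordinary link checker.

```python
def sitemap_html(base: str, start: int = 0, stop: int = 500) -> str:
    """Build a throwaway HTML page linking to every candidate page ID,
    ready to be fed to an ordinary link checker."""
    links = "\n".join(
        f'<a href="{base}?q={i}">{i}</a>' for i in range(start, stop + 1)
    )
    return f"<html><body>\n{links}\n</body></html>"

page = sitemap_html("http://www.abc.com/")
# Save `page` to disk and point a link checker at it; the checker's
# broken-link report tells you which ID numbers correspond to real pages.
```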
By comparing the list of 205 pages with a list of pages identified by spidering the website normally, I found a number of pages in the former that were not in the latter. In this case none of the pages contained sensitive content but they could have done, for instance if pages containing financial results had been constructed in advance of publication.
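The comparison itself is just a set difference: pages reachable by ID number but absent from the normal navigation are the interesting ones. A sketch with hypothetical sample data standing in for the two real lists:

```python
# URLs found by exhaustively enumerating ID numbers (hypothetical sample)
enumerated = {f"http://www.abc.com/?q={i}" for i in range(1, 206)}

# URLs found by spidering the site's normal navigation (hypothetical sample)
spidered = {f"http://www.abc.com/?q={i}" for i in range(1, 200)}

# Pages that exist but cannot be reached by clicking through the site.
hidden = enumerated - spidered
```

On a real test you would first normalise both lists to the same URL form (IDs versus friendly URLs) before differencing them.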
The CMS was MODX, an open-source content management system (comparable in some respects to Joomla) running on a Linux/Apache stack. It uses an ID number for each page, and it was that number we were seeing when changing the URL. Like some other CMS solutions, it uses a plug-in to generate 'friendly URLs' for good SEO. So every page has both an ID number and a friendly URL, and both work from a user's point of view.
The developers claimed that no one would be able to see hidden, unpublished or protected pages, but they never explained why we could access pages that could not be reached through normal navigation.