A Guide To Web Security Testing: Part 1 - Mapping Contents

Written by kalilinux | Published 2022/01/11
Tech Story Tags: cybersecurity | hacking | ethical-hacking | bug-bounty | cyber-security-awareness | security | data-privacy | hackernoon-top-stories

TL;DR: In this detailed series of articles, we are going to discuss how to test a web application step by step. This article covers the categories of vulnerabilities and attacks an ethical hacker works through while testing a website or web application. We cannot guarantee that these methods will unearth vulnerabilities on a particular website, but they will give us a very good understanding of how web penetration testing works. In this first part, we will look into the target website and try to get to know it.

In this detailed series of articles, we are going to discuss how to test a web application step by step. This would be a long article, so we have divided it into parts, making it a mega-series for web penetration testers and bug bounty hunters.
This article will cover the categories of vulnerabilities and attacks an ethical hacker works through while testing a website or web application. We cannot guarantee that these methods will unearth vulnerabilities on a particular website, but they will give us a very good understanding of how web penetration testing works.
In this first part, we are going to look into the target website and try to get to know it. We are not just glancing at it; we are going to watch our target very carefully, and we should read this article just as carefully.

1. Mapping Contents of a Website

Scout Visible Contents

  1. First, we configure our web browser to use our favorite intercepting proxy or spidering tool. For this we recommend Burp Suite, but WebScarab is also a good alternative. We use it to passively spider the site and to monitor and analyze the website's content as it passes through the proxy.
  2. Sometimes it is also useful to pair our browser with a tool like ZAP (OWASP Zed Attack Proxy) to inspect and parse the HTTP and HTML content being processed.
  3. We browse the website manually in a normal way, visiting every page and following every link. We try submitting forms and check how they are processed. We browse the website or web application with JavaScript enabled and disabled, and with cookies enabled and disabled.
  4. If the website has user registration, we create (or use an existing) login account to test its protected functions.
  5. While browsing, we pay attention to the requests and responses passing through our intercepting proxy to understand how data is submitted and how the client controls the server-side application.
  6. We review the sitemap generated by our intercepting proxy during passive spidering and look for functionality or content that we have not yet visited in the browser.
  7. When we have finished manual browsing and passive spidering, we use the spider to actively crawl the website; a minimal crawler sketch follows this list. Active crawling can sometimes uncover additional content that we overlooked while working manually.
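
To illustrate what an active crawler does under the hood, here is a minimal sketch in Python. It assumes the requests and beautifulsoup4 packages and a placeholder start URL; Burp Suite's or ZAP's spider does this far more thoroughly (handling forms, JavaScript, cookies, and scope rules).

```python
# minimal_spider.py - a rough sketch of a breadth-first HTML crawler.
from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def crawl(start_url, max_pages=50):
    """Crawl a single host breadth-first and return the discovered URLs."""
    host = urlparse(start_url).netloc
    seen, queue = {start_url}, deque([start_url])
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        try:
            resp = requests.get(url, timeout=10)
        except requests.RequestException:
            continue  # skip unreachable pages
        if "text/html" not in resp.headers.get("Content-Type", ""):
            continue  # only parse HTML responses
        for tag in BeautifulSoup(resp.text, "html.parser").find_all("a", href=True):
            link = urljoin(url, tag["href"]).split("#")[0]
            # stay on the target host and avoid revisiting pages
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen)


if __name__ == "__main__":
    for discovered in crawl("http://example.com/"):  # placeholder target
        print(discovered)
```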

Information from Public Sources

  1. We can use search engines and web archives (like the Wayback Machine) to identify what content from the website they have indexed and stored; a small script for querying the archive follows this list.
  2. We should use the search engines' advanced operators to improve the results. On Google, for example, we can use site: to list all indexed content for our target and link: to find other sites linking to it. This sometimes turns up old or removed content and additional information.
  3. If we find any names, e-mail addresses, or phone numbers, we can search for them on search engines. We pay special attention to forums where the target's staff may have asked about technical details; that way we may learn something about the infrastructure.
  4. We should review any published Web Services Description Language (WSDL) files to generate a list of function names and parameter values potentially employed by the website.
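
As a quick illustration of mining public archives, the sketch below asks the Wayback Machine's public CDX API which URLs it has captured for a domain. The domain is a placeholder and the requests package is assumed; treat this as an example, not a complete OSINT workflow.

```python
# wayback_urls.py - list URLs the Wayback Machine has captured for a domain.
import requests


def archived_urls(domain, limit=100):
    """Return up to `limit` distinct URLs archived for the given domain."""
    params = {
        "url": f"{domain}/*",   # everything under the domain
        "output": "json",
        "fl": "original",       # only return the original URL field
        "collapse": "urlkey",   # de-duplicate captures of the same URL
        "limit": limit,
    }
    resp = requests.get("http://web.archive.org/cdx/search/cdx",
                        params=params, timeout=30)
    resp.raise_for_status()
    rows = resp.json()
    return [row[0] for row in rows[1:]]  # first row is the header


if __name__ == "__main__":
    for url in archived_urls("example.com"):  # placeholder domain
        print(url)
```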

Playing with Hidden Content

  1. First, we run tools like DIRB and Gobuster to find hidden files and directories through a brute-force attack on our target website; a rough sketch of this approach follows this list. Larger wordlists increase the chance of success.
  2. We learn how the website or web application handles requests for non-existent items. We make some manual requests for known valid and invalid resources and compare the responses to establish a reliable way of telling when an item does not exist.
  3. We should create a list of all known files, links, and common file extensions. This helps us guess hidden files and directories. For example, if there are pages called AddDocument.jsp and ViewDocument.jsp, there is a good chance that EditDocument.jsp and RemoveDocument.jsp also exist.
  4. We should review all client-side code for clues about hidden server-side code or content.
  5. We should check HTML comments and disabled form elements.
  6. We should understand the patterns of how client-side identifiers map to server-side content and processes.
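
The sketch below shows the basic idea behind this kind of brute-forcing: fetch a deliberately invalid resource to learn what "not found" looks like, then flag any guessed path whose response differs. The target URL, wordlist, and extensions are placeholders, and the requests package is assumed; DIRB and Gobuster do this at scale.

```python
# hidden_content.py - guess files and directories and compare each response
# against a known "not found" baseline.
import requests

TARGET = "http://example.com"                           # placeholder target
WORDLIST = ["admin", "backup", "config", "test", "old"]  # use a real wordlist in practice
EXTENSIONS = ["", ".php", ".jsp", ".bak", ".txt"]


def not_found_baseline():
    """Fetch a deliberately invalid resource to learn how 404s look."""
    resp = requests.get(f"{TARGET}/definitely-not-a-real-page-12345", timeout=10)
    return resp.status_code, len(resp.content)


def brute_force():
    base = not_found_baseline()
    for word in WORDLIST:
        for ext in EXTENSIONS:
            url = f"{TARGET}/{word}{ext}"
            resp = requests.get(url, timeout=10)
            # anything that differs from the 404 baseline is worth a manual look
            if (resp.status_code, len(resp.content)) != base:
                print(f"[+] {resp.status_code} {len(resp.content):>7}  {url}")


if __name__ == "__main__":
    brute_force()
```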

Find Default Contents

  1. To learn about the web server, we can run Nikto against it to detect what kind of server is running; this may reveal default or well-known content on the server. We should use Nikto's options to increase its effectiveness. For example, the --root flag specifies a directory in which to check for default content, and the -404 flag specifies a string that identifies a custom 'Not Found' page.
  2. We should verify any interesting findings manually, as they might be false positives.
  3. We request the server's root directory while specifying the IP address in the Host header and check whether the application responds with different content. If it does, we should run Nikto again against the IP address as well as the server name.
  4. We should also request the server's root directory with a range of different User-Agent headers; a small script illustrating these Host and User-Agent checks follows this list.
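
Here is a small sketch of those Host header and User-Agent checks. The IP address, hostname, and User-Agent strings are placeholders, and the requests package is assumed; the point is simply to compare the status code and body length of each variation against the normal response.

```python
# host_header_check.py - compare the root page served for the hostname,
# the raw IP address, and a few different User-Agent strings.
import requests

IP = "203.0.113.10"            # placeholder IP address
HOSTNAME = "www.example.com"   # placeholder hostname
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Googlebot/2.1 (+http://www.google.com/bot.html)",
    "curl/8.0.1",
]


def fingerprint(headers):
    """Fetch / with the given headers and return (status code, body length)."""
    resp = requests.get(f"http://{IP}/", headers=headers, timeout=10)
    return resp.status_code, len(resp.content)


if __name__ == "__main__":
    print("Host =", HOSTNAME, "->", fingerprint({"Host": HOSTNAME}))
    print("Host =", IP, "->", fingerprint({"Host": IP}))
    for ua in USER_AGENTS:
        result = fingerprint({"Host": HOSTNAME, "User-Agent": ua})
        print("User-Agent =", ua[:30], "->", result)
```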

Identify Identifier-Specified Functions

  1. We need to identify any instances where specific web application functions are accessed by passing an identifier of the function in a request parameter.
  2. If we discover that such a mechanism is being used to access individual functions (for example, a parameter that contains a function name), we first determine how the application behaves when an individual function is specified and try to establish an easy way to tell when a valid function has been requested. We should compile a list of common function names or cycle through the syntactic range of identifiers in use, and automate this task to identify valid functions as quickly and easily as possible; a sketch of this enumeration follows this list.
  3. Where needed and applicable, we can compile a map of the web application's content based on functional paths rather than URLs, showing all the discovered functions and the logical paths and dependencies between them.
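
The sketch below cycles through a list of candidate function names passed in a request parameter and flags any response that differs from the one returned for a clearly invalid name. The URL, parameter name, and candidate list are placeholders, and the requests package is assumed.

```python
# function_enum.py - cycle through candidate function names passed in a
# request parameter and flag the ones the application appears to accept.
import requests

TARGET = "http://example.com/dispatch"   # placeholder URL
PARAM = "func"                           # placeholder parameter name
CANDIDATES = ["viewUser", "editUser", "addUser", "deleteUser",
              "viewDocument", "editDocument", "exportData", "adminPanel"]


def invalid_baseline():
    """See how the application responds to a clearly invalid function name."""
    resp = requests.get(TARGET, params={PARAM: "noSuchFunction12345"}, timeout=10)
    return resp.status_code, len(resp.content)


if __name__ == "__main__":
    base = invalid_baseline()
    for name in CANDIDATES:
        resp = requests.get(TARGET, params={PARAM: name}, timeout=10)
        # responses that differ from the invalid-function baseline suggest a valid function
        if (resp.status_code, len(resp.content)) != base:
            print(f"[+] possible valid function: {name} "
                  f"({resp.status_code}, {len(resp.content)} bytes)")
```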

Testing the Debug Parameters

  1. We need to find pages or functions where hidden debug parameters, such as debug=true, may be accepted. We are most likely to find them in functions like login, search, and file upload or download.
  2. We should make a list of common debug parameter names like debug, source, test, and hide, and their common values like true, yes, no, and 1. We then check by submitting permutations of each name/value pair to each target request, trying them both in the URL query string and in the body of POST requests. We can automate this with Burp Suite's Intruder (using the cluster bomb attack type) to generate combinations of the two payload lists; a simple script mimicking this attack follows this list.
  3. We then check the application's responses for any anomalies indicating that the added parameter had an effect on the website's processing.
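
Here is a rough sketch of that cluster-bomb-style check: every combination of name and value is sent in the query string and in the request body, and anything that changes the response relative to the unmodified request is flagged. The URL is a placeholder and the requests package is assumed; Burp Intruder does this with far better diffing and reporting.

```python
# debug_params.py - try permutations of common debug parameter names and
# values, in the query string and in the request body, and flag responses
# that differ from the unmodified request.
from itertools import product

import requests

TARGET = "http://example.com/login"   # placeholder URL
NAMES = ["debug", "test", "hide", "source"]
VALUES = ["true", "yes", "no", "1"]


def signature(resp):
    """Reduce a response to something easy to compare."""
    return resp.status_code, len(resp.content)


if __name__ == "__main__":
    get_base = signature(requests.get(TARGET, timeout=10))
    post_base = signature(requests.post(TARGET, data={}, timeout=10))

    for name, value in product(NAMES, VALUES):   # every name/value pair
        in_query = requests.get(TARGET, params={name: value}, timeout=10)
        in_body = requests.post(TARGET, data={name: value}, timeout=10)
        # any deviation from the baseline deserves a closer manual look
        if signature(in_query) != get_base:
            print(f"[+] {name}={value} in the query string changed the response")
        if signature(in_body) != post_base:
            print(f"[+] {name}={value} in the request body changed the response")
```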

End Notes

The above methods will help us learn much more about a target website. In the next part, we are going to move on to analyzing our target website.
