Warning: this page was machine-translated from a French page of this site. Automatically translated pages are not always perfect and may contain errors in vocabulary, syntax or grammar (probably similar to the errors a foreigner would make when speaking your language!). The original page, in French, can be found here...
MailWalker is software designed to browse web sites and extract the email addresses they contain. This activity breaks down into three distinct tasks performed in parallel:


Walking


When you add new web sites to a session, you tell MailWalker which sites you want it to start collecting from.
Unless you have explicitly said otherwise when adding them, MailWalker is not confined to analyzing those web sites. While analyzing them, it tries to find links to other web sites and adds these to the list of sites to analyze.
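
To illustrate the idea of discovering other web sites from a page, here is a minimal Python sketch. It is not MailWalker's actual code: the names `LinkCollector` and `external_sites` are hypothetical, and it only handles plain `<a href>` links in an HTML string that has already been downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_sites(page_url, html):
    """Return the host names linked from the page that differ from the page's own host."""
    collector = LinkCollector()
    collector.feed(html)
    own_host = urlsplit(page_url).hostname
    hosts = set()
    for href in collector.hrefs:
        # Resolve relative links against the page URL before comparing hosts.
        target = urlsplit(urljoin(page_url, href))
        if target.scheme in ("http", "https") and target.hostname and target.hostname != own_host:
            hosts.add(target.hostname)
    return hosts
```

In this sketch, each host returned by `external_sites` would be a candidate for the list of sites to analyze; relative links and non-HTTP schemes such as `mailto:` are ignored.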

This walking phase, which consists of finding and following external links from a web site, is highly customizable:
  • structurally, because you can specify how many levels deep you want it to go;
  • quantitatively, by setting limits on the number of sites to visit;
  • and qualitatively, since you can specify under what conditions you want (or do not want) a new site to be included in the browse list.
MailWalker can follow "classic" external links (hard-coded links), but it can also detect subtler relationships, such as indirect links (including scripted ones), some JavaScript-activated links, or links contained in XML feeds. If you ask it to, MailWalker can also follow scripted or HTTP redirects.
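
The three kinds of walking limits described above (depth, number of sites, inclusion conditions) can be sketched as a bounded breadth-first walk. This is only an illustration under stated assumptions: `build_browse_list`, `max_depth`, `max_sites` and `accept` are hypothetical names, and the `links` dictionary stands in for the links a real crawl would discover.

```python
from collections import deque


def build_browse_list(start_sites, links, max_depth=2, max_sites=100, accept=lambda site: True):
    """Breadth-first walk over a site graph with depth, count and filter limits.

    `links` maps a site to the external sites it links to (a stand-in for real crawling).
    """
    browse_list = []
    seen = set()
    queue = deque((site, 0) for site in start_sites)
    while queue and len(browse_list) < max_sites:
        site, depth = queue.popleft()
        if site in seen:
            continue
        seen.add(site)
        # Start sites are always kept; discovered sites must pass the filter.
        if depth > 0 and not accept(site):
            continue
        browse_list.append(site)
        # Structural limit: do not follow links beyond max_depth levels.
        if depth < max_depth:
            for neighbour in links.get(site, []):
                if neighbour not in seen:
                    queue.append((neighbour, depth + 1))
    return browse_list
```

For example, passing `accept=lambda site: site.endswith(".example")` would keep only discovered sites whose host name ends in `.example`, while `max_depth=0` would restrict the walk to the sites you added yourself.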


Exploring


As MailWalker builds its browse list, based on the sites you have added and the options you have chosen, it also tries to explore those sites, this time to build a list of internal pages for each web site in the browse list.

Like the walking engine, the exploration engine is highly configurable:
  • structurally, because you can specify how you want the exploration to be done and on what types of media;
  • quantitatively, by setting limits on the number of pages to explore and on document sizes not to exceed;
  • and qualitatively, since you can specify under what conditions you want (or do not want) a page of a site to be included in the analysis list.
MailWalker can explore a number of site types with more or less "open" options, but you can also ask it to try exploring sites built on media it does not know how to explore natively; the results are often very interesting.
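
As a rough illustration of the exploration limits described above, the following Python sketch selects which pages of a site go into the analysis list. The function `select_pages` and its parameters are hypothetical, and the pages are assumed to be already described as `(url, content_type, size_in_bytes)` tuples rather than fetched from the network.

```python
def select_pages(pages, max_pages=50, max_bytes=200_000,
                 allowed_types=("text/html",), accept=lambda url: True):
    """Pick the pages to analyze from an iterable of (url, content_type, size) tuples."""
    selected = []
    for url, content_type, size in pages:
        # Quantitative limit: stop once enough pages have been selected.
        if len(selected) >= max_pages:
            break
        # Structural limit: only keep the media types that were asked for.
        if content_type not in allowed_types:
            continue
        # Quantitative limit: skip documents larger than the size limit.
        if size > max_bytes:
            continue
        # Qualitative limit: a user-supplied condition on the page URL.
        if not accept(url):
            continue
        selected.append(url)
    return selected
```

A caller could, for instance, pass `accept=lambda url: "contact" in url` to restrict the analysis list to pages whose address mentions "contact".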


Analyzing


While the exploration engine supplies the pages, the analyzer is responsible for checking the content of those pages and extracting email addresses according to the options you have asked it to respect.

Here too, you can refine the behavior of the analyzer:
  • structurally, by choosing the method or methods to be used for the analysis;
  • and qualitatively, by specifying under what conditions you want (or do not want) a found email address to be stored and how you want it to be made usable.
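
As a simple illustration of the analysis step, here is a Python sketch of one possible method: a regular expression applied to the page text, followed by a storage condition. It is an assumption made for illustration, not MailWalker's actual analyzer; `EMAIL_PATTERN`, `extract_emails` and `keep` are hypothetical names, and the pattern is deliberately simplified (it does not cover every valid address).

```python
import re

# Simplified pattern: a local part, "@", then a dotted domain name.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text, keep=lambda address: True):
    """Return the distinct addresses found in text, lower-cased, in order of appearance.

    `keep` is a user-supplied condition deciding whether an address is stored.
    """
    found = []
    seen = set()
    for match in EMAIL_PATTERN.finditer(text):
        address = match.group(0).lower()  # normalize so duplicates are detected
        if address not in seen and keep(address):
            seen.add(address)
            found.append(address)
    return found
```

Here the `keep` condition plays the role of the qualitative options, and lower-casing is one possible way of making the result usable by avoiding duplicates that differ only in case.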


MailWalker was designed to be very "malleable": the operating characteristics of the walking and exploration engines, as well as those of the analyzer, can easily be adapted to particular scenarios, which sets it fully apart from its competitors.

It has become quite easy to add new features or heuristics to these engines, which is why, in order to stay close to your business and your applications, MailWalker updates will appear regularly... So remember to activate the automatic update check!
