Saturday, August 18, 2012

PhantomJS & finding pizza using Yelp and DEiXTo!

Recently I stumbled upon PhantomJS, a headless WebKit browser which can serve a wide variety of purposes such as web browser automation, site scraping, website testing, SVG rendering and network monitoring. It's a very interesting tool and I am sure that it could be used successfully in combination with DEiXToBot, our beloved and powerful Mechanize-based scraper. For example, it could fetch a hard-to-reach (probably JavaScript-rich) target page that WWW::Mechanize could not get due to its lack of JavaScript support, after completing some steps like clicking, selecting, checking, etc., and then pass it to DEiXToBot to do the scraping job. This is particularly useful for complex scraping cases where, in my humble opinion, PhantomJS's DOM manipulation support alone would not be enough and DEiXTo's extraction capabilities could come into play.
So, I was taking a look at the PhantomJS examples and I liked (among others) the one about finding pizza in Mountain View using Yelp (I really like pizza!). So, I thought it would be nice to port the example to DEiXToBot in order to demonstrate the latter's use and efficiency. Hence, I visually created a pretty simple and easy to build XML pattern with GUI DEiXTo for extracting the address field of each pizzeria returned (essentially equivalent to what PhantomJS does by getting the inner text of span.address items) and wrote a few lines of Perl code to execute the pattern on the target page and print the addresses extracted on the screen (either on a GNU/Linux terminal or a command prompt window on Windows).
The resulting script was as simple as this:
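use strict;
use warnings;
use DEiXToBot;

my $agent = DEiXToBot->new();
# NOTE: the calls that fetch the Yelp results page and apply the pattern are missing from this listing.
die 'Unable to access network' unless $agent->success;

# Each record is an array reference; the address is its first field.
my @addresses;
for my $record (@{$agent->records}) {
    push @addresses, $$record[0];
}
print join("\n", @addresses), "\n";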

Just note that it scrapes only the first results page (just like in the PhantomJS example). We could easily parse through all the pages by following the "Next" page link but this is out of scope.

I would like to look further into PhantomJS and check the potential of using it (along with DEiXTo) as a pre-scraping step for hard JavaScript-enabled pages. In any case, PhantomJS is a handy tool that can be quite useful for a wide range of use cases. Generally speaking, web scraping can have countless applications and uses and there are many remarkable tools out there. One of the best, we believe, is DEiXTo, so check it out! DEiXTo has helped quite a few people get their web data extraction tasks done easily and for free!

Friday, March 9, 2012

DEiXTo powers OPEN-SME

We are happy to announce that DEiXTo is going to power OPEN-SME, an exciting EU-funded project that promotes software reuse among small and medium-sized software enterprises (SMEs). OPEN-SME is coordinated by the Greek Association of Computer Engineers and aims to develop a set of methodologies, tools and business models centered on SME Associations, which will enable software SMEs to effectively introduce open source software reuse practices in their production processes.

    DEiXTo-based wrappers have been successfully deployed in order to enable the project's federated search engine, called OCEAN (developed by the Department of Informatics of the Aristotle University of Thessaloniki), to search simultaneously and in real time existing open source software search engines that do NOT offer API access (i.e. Koders and Krugle). To achieve this, custom Perl code was written to submit the user-specified queries to the native websites and scrape the first N results returned into a suitable form.

    We are really glad that we are participating in this challenging and innovative project and we hope that DEiXTo will help OPEN-SME towards implementing its goals. So, if you are looking for a web scraping framework to power your aggregator or search engine, please do not hesitate to contact us!

Monday, March 5, 2012

Uses and applications of web scraping

Some people wonder what the uses of web scraping might be. Well, your imagination is the only limit (along with the copyright notices perhaps). There is a huge wealth of data out there and many believe that the open Web is a real goldmine. So, web data extraction tools and DEiXTo in particular could help you unlock this treasure and give birth to innovations, applications and new ideas.
    Public institutions, companies and organizations, entrepreneurs, professionals as well as mere citizens and users generate an enormous amount of information every single day. The question is: how effectively is it being used? Towards this direction, web content extraction can prove a valuable ally. Along with data mining, they have much to offer in every field you can imagine. The following are only some of the uses of web scraping:
  • collect properties from real estate listings
  • scrape retailer sites on a daily basis
  • extract offers and discounts from deal-of-the-day websites
  • gather data for hotels and vacation rentals
  • scrape job postings and internships
  • crawl forums and social sites so as to enable analysis and post-processing of their rich data
  • power aggregators and product search engines
  • monitor your online reputation and check what is being said about you or your brand
  • quickly populate product catalogues with full specifications
  • monitor prices of the competition
  • scrape the content of digital libraries in order to transform it into suitable, structured forms
  • collect and aggregate government and public data
  • search (in real time) bibliographic databases and online sources that don't offer an API, thus powering federated search engines
  • look for educational material and information from across traditional formal higher education subjects and real-life context environments in order to help the contemporary learner
  • power mobile applications
  • help build geolocation apps (e.g. extracting addresses available on web pages and using their coordinates to build meaningful maps with points of interest)
  • prepare large, focused datasets for scientific tasks (e.g. data mining)
  • extract and summarize large volumes of text (e.g. summarizing product reviews)
  • <your scraping task goes here!>
    This list can grow very long. There are countless use cases and potential scenarios, either business-oriented or non-profit. As far as the access and copyright restrictions are concerned, it is a really significant issue that has raised a lot of discussion and controversy. However, the opinion that seems to be gaining ground is that (well-intentioned) web scraping is legal since the data is publicly and freely available on the Web. So, let your creativity and imagination loose; DEiXTo can probably help you to achieve your scraping-based project goals. We would be more than happy to hear from you.

Sunday, February 19, 2012

Linked Data & DEiXTo

As explained in a previous post, DEiXTo can scrape the content of digital libraries, archives and multimedia collections lacking an API and enable the transformation of their metadata (through post-processing and custom Perl code) into Dublin Core and subsequently into OAI-PMH or another suitable form, e.g. Europeana Semantic Elements (ESE).
    Meanwhile, the Web has become a dynamic collaboration platform that allows everyone to meet, read and more importantly write. Thus, it steadily approaches the vision of Tim Berners-Lee (the inventor of the World Wide Web): the Linked Data Web, a place where related data are linked and information is represented in a more structured and easily machine-processable way.
    Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. Its key technologies are URIs (a generic method to identify resources on the Internet), the Hypertext Transfer Protocol (HTTP) and RDF (a data model and a general method for conceptual description of things in the real world). It is an exciting topic of interest and it's expected to make great progress in the next few years. A video that does a nice job of explaining what Linked Open Data is all about can be found here:
    Over the last decade, the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) has become the de facto standard for metadata exchange in digital libraries and it's playing an increasingly important role. However, it has two major drawbacks: it does not make its resources accessible via dereferenceable URIs and it provides only restricted means of selective access to metadata. Therefore, there is a strong need for efficient tools that would allow metadata repositories to expose their content according to the Linked Data guidelines. This would make digitized items and media objects accessible via HTTP URIs and queryable via the SPARQL protocol.
    Dr Haslhofer has performed significant research and work in this direction. He has developed (among others) the OAI2LOD Server, based on the D2R Server implementation, and written the ESE2EDM converter, a collection of Ruby scripts that can convert given XML-based ESE source files into the RDF-based Europeana Data Model (EDM). These remarkable tools could turn out to be very useful for making large volumes of information Linked-Data ready, with all the advantages this brings.
    Linked Open Data can change the computer world as we know it. So, there is a lot of potential in combining DEiXTo with Linked Data technologies. Their blend could eventually produce an innovative and useful outcome. Many already believe that Linked Data is the next big thing. Time will tell. Meanwhile, DEiXTo could definitely help you generate structured data in a variety of formats from unstructured HTML pages, whether or not your ultimate goal is Linked Data.

Saturday, February 11, 2012

DEiXTo components clarified

From the emails and feedback received, it seems that many people get a bit confused about the utility and functionality of the DEiXTo GUI tool compared to the Perl command line executor (CLE). DEiXToBot is even more confusing for quite a few users. So, let's clarify things.
    The GUI tool is freeware (available at no cost but without any source code, at least not yet) and it allows you to visually build and execute extraction rules for web pages of interest with point and click convenience. It offers you an embedded web browser and a friendly graphical interface so that you can highlight an element/ record instance as the mouse moves over it. The GUI tool is a Windows-only application that harnesses Internet Explorer's HTML parser and rendering engine. It is worth noting that it can support simple cooperative extraction scenarios as well as periodic, scheduled execution through batch files and the Windows Task Scheduler. Perhaps its main drawback is that it can execute just one pattern on a page, although in many cases (maybe the majority) a single extraction rule is enough to get the job done.
    On the other hand, the command line executor, or CLE for short, is implemented in Perl and it is freely distributed under the GNU General Public License v3, thus its source code is included. Its purpose is to execute wrapper project files (.wpf) that have previously been created with the GUI tool. It runs on a DOS prompt window or on a Linux/ Mac terminal.  Besides the code though, we have built two standalone executables so that you can run CLE either on a Windows or a GNU/Linux machine without having Perl or any prerequisite modules installed. CLE is faster, offers more output formats and has some additional features such as an efficient post-processing mechanism and database support. However, it shares the same shortcoming as the GUI tool: it supports just one pattern on a page. Finally, it relies on DEiXToBot, a "homemade" package that facilitates the execution of GUI DEiXTo generated wrappers.
    DEiXToBot is the third and probably the most powerful and well-crafted software component of the DEiXTo scraping suite and it is available under the GPL v3 license. It is a Perl module based on WWW::Mechanize::Sleepy, a handy web browser Perl object, and several other CPAN modules. It allows extensive customization and tailor-made solutions since it facilitates the combination of multiple extraction rules/ patterns as well as the post-processing of their results through custom code. Therefore, it can deal with complex cases and cover more advanced web scraping needs. But it requires programming skills in order to use it. 
    The bottom line is that DEiXToBot is the essence of our long experience. The GUI tool might be more suitable for most everyday users (due to its visual convenience) but when things get difficult or the situation requires a more advanced solution (e.g. scheduled or on-demand execution and coordination of multiple wrappers on a GNU/Linux server), a customized DEiXToBot-based script is your choice. You can use the GUI tool first to create the necessary patterns and then deploy a Perl script that uses them to extract structured data from the pages of the target website. So, if you are familiar with Perl, you should not find it very hard to write your first DEiXTo-based spider/ crawler!

Saturday, January 28, 2012

Federated searching & dbWiz

Nowadays, most university and college students, professors as well as researchers are increasingly seeking information and finding answers on the open Web. Google has become the dominant search tool for almost everyone. Its popularity is enormous, no need to wonder or analyze why. It has a simple and effective interface and it returns fast, accurate results.

    However, libraries, in their effort to win some patrons back, have tried to offer a decent searching alternative by developing a new model: federated search engines. Federated searching (also known as metasearch or cross searching) allows users to simultaneously search multiple web resources and subscription-based bibliographic databases from a single interface. To achieve that, parallel processes are executed in real time and retrieve results from each separate source. Then, the results returned get grouped together and presented to the user in a unified way.
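To make the "query every source, then group the results" idea a bit more tangible, here is a minimal sketch in Perl. It is only an illustration under stated assumptions: the two sources (`catalogue`, `journals`) are hypothetical stand-ins that return canned hits, the duplicate rule (same title and year) is our own simplification, and the sources are queried one after the other, whereas a real federated engine would query them in parallel.

```perl
use strict;
use warnings;

# Hypothetical sources: each one is a sub that takes a query and returns
# a list of hashrefs. Real plugins would call an API or scrape a site.
my %sources = (
    catalogue => sub {
        my ($query) = @_;
        return ( { title => "Intro to $query", year => 2009 } );
    },
    journals => sub {
        my ($query) = @_;
        return (
            { title => "Advances in $query", year => 2011 },
            { title => "Intro to $query",    year => 2009 },    # same item again
        );
    },
);

# Query every source, tag each hit with its origin and merge duplicates
# (same title and year) into a single entry.
sub federated_search {
    my ($query) = @_;
    my ( %seen, @merged );
    for my $name ( sort keys %sources ) {
        for my $hit ( $sources{$name}->($query) ) {
            my $key = lc( $hit->{title} . '|' . $hit->{year} );
            if ( my $existing = $seen{$key} ) {
                push @{ $existing->{sources} }, $name;
                next;
            }
            my $entry = { %$hit, sources => [$name] };
            $seen{$key} = $entry;
            push @merged, $entry;
        }
    }
    # Newest results first, then alphabetically by title.
    my @sorted = sort { $b->{year} <=> $a->{year} or $a->{title} cmp $b->{title} } @merged;
    return @sorted;
}

for my $r ( federated_search('Perl') ) {
    printf "%s (%d) [%s]\n", $r->{title}, $r->{year}, join( ', ', @{ $r->{sources} } );
}
```

Keeping each source behind the same tiny interface (query in, list of records out) is what lets API-based and scraping-based plugins sit side by side.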
    Broadly, there are two mechanisms for pulling the data from the target sources: either through an Application Programming Interface (API) or via scraping the native web interface/ site of each database. The first method is undoubtedly better but very often a search API is not available. In such cases, web robots (or agents) come into play and capture information of interest, typically by simulating a human browsing through the target webpages. Especially in academia, there are numerous online bibliographic databases. Some of them offer Z39.50 or API access. However, a large number still do not provide protocol-based search functionality. Thus, scraping techniques should be deployed for those (unless the vendor disallows bots).
    When starting my programming adventure with Perl back in 2006, in the context of my former full-time job at the Library of the University of Macedonia (Thessaloniki, Greece), I had the chance (and luck) to run across dbWiz, a remarkable open source, federated search tool developed by the Simon Fraser University (SFU) Library in Canada. I was fascinated with Perl as well as dbWiz's internal design and implementation. So, this is how I met and fell in love with Perl.
    dbWiz offered a friendly and usable admin interface that allowed you to create search categories and select from a global list of resources which databases would be active and searchable. If you had to add a new resource though, you would have to write your own plugin (Perl knowledge and programming skills were required). Some of the dbWiz search plugins were based upon Z39.50 whereas others (the majority) relied on regular expressions and WWW::Mechanize (a handy web browser Perl object).
    The federated search engine developed while working at the University of Macedonia (2006-2008) was named "Pantou" and became a valuable everyday tool for students and professors of the University. The results of this work were presented at the 16th Panhellenic Academic Libraries Conference (Piraeus, 1-3 October 2007). Unfortunately, its maintenance stopped at the end of 2010 due to the economic crisis and severe cuts in funding. Consequently, a few months later some of its plugins started falling apart.
    Generally, delving into dbWiz taught me a lot of lessons such as web development, Perl programming and GNU/Linux administration. I loved it! Meanwhile, in my effort to improve the relatively hard and tedious procedure of creating new dbWiz plugins, I put into practice an early version of GUI DEiXTo (which was my MSc thesis being fulfilled in the same period at the Aristotle University of Thessaloniki). The result was a new Perl module that allowed the execution of W3C DOM-based, XML patterns (built with the GUI DEiXTo) inside dbWiz and eliminated, at least to a large extent, the need for heavy use of regular expressions. That module, which was the first predecessor of today's DEiXToBot package, got included in the official dbWiz distribution after contacting the dbWiz development team in 2007. Unfortunately, SFU Library ended the support and development of dbWiz in 2010.
    Looking back, I can now say with quite a bit of certainty that DEiXTo (more than ever before) can power federated search tools and help them extend their reach to previously inaccessible resources. As far as the search engine war is concerned, Google seems to be winning but nobody can say for sure what is going to happen in the next few years. Time will tell.

Friday, January 20, 2012

Open Archives & Digital Libraries

The Open Archives Initiative (OAI) develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. OAI has its roots in the open access and institutional repository movements and its cornerstone is the Protocol for Metadata Harvesting (OAI-PMH), which allows data providers/ repositories to expose their content in a structured format. A client can then make OAI-PMH service requests to harvest that metadata through HTTP.
    openarchives.gr is a great federated search engine harvesting 57 Greek digital libraries and institutional repositories (as of January 2012). It currently provides access to almost half a million(!) documents (mainly undergraduate theses and Master/ PhD dissertations) and its index gets updated on a daily basis. It began its operation back in 2006 after being designed and implemented by Vangelis Banos but since May 2011 it has been hosted, managed and co-developed by the National Documentation Centre (EKT). What makes this amazing searching tool even more remarkable is the fact that it is entirely built on open source/ free software.
    A tricky point that needs some clarification is that when a user searches openarchives.gr, the search is not submitted in real time to the target sources. Instead, it is performed locally on the server where full copies of the repositories/ libraries are stored (and updated at regular time intervals).
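Such a periodic harvest boils down to plain HTTP requests against each repository's OAI-PMH endpoint. The sketch below only builds the request URLs for the standard `ListRecords` verb; it does not fetch anything. The base URL is a made-up placeholder, the helper names are ours, and the simple percent-encoding assumes plain ASCII argument values.

```perl
use strict;
use warnings;

# Percent-encode a value for use in a query string
# (unreserved characters are kept as they are; ASCII input assumed).
sub uri_escape_value {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $value;
}

# Build an OAI-PMH ListRecords request URL.
# When a resumption token is given, it is the only argument sent
# besides the verb, as the protocol requires.
sub list_records_url {
    my ( $base_url, %args ) = @_;
    my @pairs = ( [ verb => 'ListRecords' ] );
    if ( defined $args{resumptionToken} ) {
        push @pairs, [ resumptionToken => $args{resumptionToken} ];
    }
    else {
        push @pairs, [ metadataPrefix => $args{metadataPrefix} // 'oai_dc' ];
        push @pairs, [ from => $args{from} ] if defined $args{from};
    }
    my $query = join( '&', map { $_->[0] . '=' . uri_escape_value( $_->[1] ) } @pairs );
    return $base_url . '?' . $query;
}

# Hypothetical repository endpoint, used only for illustration.
my $base = 'http://repository.example.org/oai';

# First request of an incremental harvest, then a follow-up page.
print list_records_url( $base, from => '2012-01-01' ), "\n";
print list_records_url( $base, resumptionToken => 'abc/123' ), "\n";
```

A harvester would request the first URL, store the records returned and keep following resumption tokens until the repository stops sending one.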
    The majority of the sources searched by openarchives.gr are OAI-PMH compliant repositories (such as DSpace or EPrints). Therefore, their data are periodically retrieved via their OAI-PMH endpoint. However, it is worth mentioning that non OAI-PMH digital libraries have also been included in its database. This was made possible through scraping their websites with DEiXTo and transforming their metadata into Dublin Core. So, more than 16,000 records from 6 significant online digital libraries (such as the Lyceum Club of Greek Women and the Music Library of Greece “Lilian Voudouri”) were inserted into openarchives.gr with the use of DEiXTo wrappers and custom Perl code.
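As a rough illustration of the "scraped record to Dublin Core" step, the sketch below maps the fields of one scraped record onto Dublin Core elements and prints an oai_dc XML fragment. It is not the code used for the libraries mentioned above: the scraped field names, the mapping table and the sample values are all assumptions made for the example.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical mapping from scraped field names to Dublin Core elements.
my @field_map = (
    [ name   => 'title' ],
    [ author => 'creator' ],
    [ issued => 'date' ],
    [ page   => 'identifier' ],
);

# Turn one scraped record (a hashref) into an oai_dc XML fragment.
# Fields that are missing or empty are simply left out.
sub to_dublin_core {
    my ($record) = @_;
    my @lines = (
        '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"',
        '           xmlns:dc="http://purl.org/dc/elements/1.1/">',
    );
    for my $pair (@field_map) {
        my ( $field, $element ) = @$pair;
        my $value = $record->{$field};
        next unless defined $value && length $value;
        push @lines, '  <dc:' . $element . '>' . xml_escape($value) . '</dc:' . $element . '>';
    }
    push @lines, '</oai_dc:dc>';
    return join( "\n", @lines );
}

# Sample record with made-up values.
my %sample = (
    name   => 'Songs & Dances',
    author => 'Unknown',
    page   => 'http://library.example.org/item/42',
);
print to_dublin_core( \%sample ), "\n";
```

Keeping the mapping in a small table makes it easy to adapt the same script to the next digital library, whose scraped fields will almost certainly be named differently.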
    Finally, it is known that digital collections have flourished over the last few years and enjoy growing popularity. However, most of them do NOT provide their contents in OAI-PMH or another appropriate metadata format. Actually, many of them (especially legacy systems) do NOT even offer an API or an SRW/U interface. Consequently, we believe that there is much room for DEiXTo to help cultural and educational organizations (e.g., museums, archives, libraries and multimedia collections) to export, present and distribute their digitized items and rich content to the outside world, in an efficient and structured way, through scraping and repurposing their data.

Tuesday, January 17, 2012

Netnography & Scraping

Netnography, or digital ethnography, is (or should be) the correct translation of ethnographic methods to online environments such as bulletin boards and social sites. It is more or less doing the same thing that ethnographers do in actual places like squares, pubs, clubs, etc.: observing what people say and do, and trying to participate as much as possible in order to better understand what's involved in actions and discourses. Using ethnography may answer a lot of the what, when, who and how questions that define several everyday problems. However, netnography differs from ethnography in many ways, especially in the way it is conducted.
    Forums, Wikis as well as the blogosphere are good online equivalents of public squares and pubs. There are not physical identities, but online ones; there are not faces, but avatars; there is no gender, age or any reliable info about physical identities, but there are voices discussing and arguing about common topics of interests.
    The more popular a forum is, the more difficult it gets to follow it netnographically. A netnographer has to use a Computer Assisted Qualitative Data Analysis (CAQDA) tool (such as RQDA) on certain parts of the texts collected during his research. In a forum use case, these texts would be posts and threads. If the researcher had to browse the forum and manually copy and paste its content, a huge amount of effort would be required. However, this obstacle could be overcome by scraping the forum with a web data extraction tool such as DEiXTo.
    A scraped forum is a jewel: perfectly ordered textual data corresponding to each thread, ready for further analysis. So, this is where DEiXTo comes into play and may boost the research process significantly. To our knowledge, Dr Juan Luis Chulilla Cano, CEO of Online and Offline Ltd., has been successfully utilizing scraping techniques so as to capture the threads of popular Spanish forums (and their metadata) and transform them into a structured format, suitable for post-processing. Typically, such sites have a common presentation style for their threads and offer rich metadata. Thus, they are potential goldmines upon which various methodologies can be tested and applied so as to discover knowledge and trends and draw useful conclusions.
    Finally, netnography and anthropology seem to have been gaining momentum over the last few years. They are really interesting as well as challenging fields and scraping could evolve into an important ally. It is worth mentioning that quite a few IT vendors and firms employ ethnographers for R&D and testing of new products. Therefore, there is a lot of potential in using computer aided techniques in the context of netnography. So, if you are coming from the social sciences and creating wrappers/ extraction rules is not second nature to you, why don't you drop us an email? Perhaps we could help you gather quite a few tons of usable data with DEiXTo! Unless terms of use or copyright restrictions forbid it.

Friday, January 13, 2012

Geo-location data, Yahoo! PlaceFinder & Google Maps API

Location-aware applications have enjoyed huge success over the last few years and geographic data have been used extensively in a wide variety of ways. Meanwhile, there are numerous places of interest out there, such as shopping malls, airports, restaurants, museums and transit stations, and for most of them their addresses are publicly available on the Web. Therefore, you could use DEiXTo (or a web data extraction tool of your choice) in order to scrape the desired location information for any points of interest and then post-process it so as to produce geographic data for further use.
    Yahoo! PlaceFinder is a great web service that supports world-wide geocoding of street addresses and place names. It allows developers to convert addresses and places into geographic coordinates (and vice versa). Thus, you can send an HTTP request with a street address to it and get the latitude and longitude back! It's amazing how well it works. Of course, the more complete and detailed the address, the more precise the coordinates returned.
    In the context of this post, we thought it would be nice, mostly for demonstration purposes, to build a map of Thessaloniki museums using the Google Maps API and geo-location data generated with Yahoo! PlaceFinder. The source of data for our demo was Odysseus, the WWW server of the Hellenic Ministry of Culture that provides a full list of Greek museums, monuments and archaeological sites.
    So, we searched for museums located in the city of Thessaloniki (the second-largest city in Greece and the capital of the region of Central Macedonia) and extracted through DEiXTo the street addresses of the ten results returned. At the picture below you can see a sample screenshot from the "INFORMATION" section of the Folk Art and Ethnological Museum of Macedonia and Thrace Odysseus detailed webpage (from which the address of this specific museum was scraped):
    After capturing the name and location of each museum and exporting them to a simple tab delimited text file, we wrote a Perl script harnessing the Geo::Coder::PlaceFinder CPAN module in order to automatically find their geo-location coordinates and create an XML output file containing all the necessary information (through XML::Writer). Part of this XML document is displayed right below:
    After having all the metadata we needed in this XML file, we utilized the Google Maps JavaScript API v3 and created a map (centered on Thessaloniki) displaying all city museums! To accomplish that goal, we followed the helpful guidelines given in this very informative post about Google Maps markers and wrote a short script that parsed the XML contents (via XML::LibXML) and produced a web page with the desired Google Map object embedded (including markers for each museum). Finally, the end result was pretty satisfying (after some extra manual effort to be absolutely honest):
    This is kind of cool, isn't it? Of course, the same procedure could be applied on a larger scale (e.g. for creating a map of Greece with ALL museums and/or monuments available) or expanded to other points of interest (whatever you can imagine, from schools and educational institutions to cinemas, supermarkets, shops or bank ATMs). In conclusion, we think that the combination of DEiXTo with other powerful tools and technologies can sometimes yield an innovative and hopefully useful outcome. Once you have the raw web data at your disposal (captured with DEiXTo), your imagination (and perhaps copyright restrictions) is the only limit!
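For readers who would like to try the middle step of this pipeline (tab-delimited names and addresses in, XML with coordinates out), here is a minimal sketch. It is not the script we actually used: the geocoder is replaced by a fixed lookup table with placeholder coordinates, the XML is assembled by hand instead of through XML::Writer, and the element names and sample addresses are assumptions made for the example.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Stand-in for a geocoding service call: a fixed lookup table.
# The coordinates are placeholders, NOT real geocoder output.
my %fake_geocoder = (
    '1 Example Street, Thessaloniki' => [ 40.60, 22.90 ],
    '2 Sample Avenue, Thessaloniki'  => [ 40.70, 23.00 ],
);

sub geocode {
    my ($address) = @_;
    my $coords = $fake_geocoder{$address} or return;
    return @$coords;
}

# Convert tab-delimited "name<TAB>address" lines into an XML document.
sub museums_to_xml {
    my (@lines) = @_;
    my @out = ( '<?xml version="1.0" encoding="UTF-8"?>', '<museums>' );
    for my $line (@lines) {
        chomp $line;
        my ( $name, $address ) = split /\t/, $line, 2;
        next unless defined $address;
        my ( $lat, $lng ) = geocode($address);
        next unless defined $lat;    # skip addresses that could not be resolved
        push @out,
            '  <museum>',
            '    <name>' . xml_escape($name) . '</name>',
            '    <address>' . xml_escape($address) . '</address>',
            sprintf( '    <lat>%.4f</lat>', $lat ),
            sprintf( '    <lng>%.4f</lng>', $lng ),
            '  </museum>';
    }
    push @out, '</museums>';
    return join( "\n", @out );
}

print museums_to_xml(
    "Museum A\t1 Example Street, Thessaloniki\n",
    "Museum B\t2 Sample Avenue, Thessaloniki\n",
), "\n";
```

In a real run, `geocode` would wrap a call to the geocoding service of your choice, and it would be wise to log the addresses it cannot resolve rather than silently skipping them.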

Wednesday, January 4, 2012

DEiXTo powers Michelin Maps and Guides!

One of the biggest success stories of DEiXTo is that it was used a few months ago by the Maps and Guides UK division of Michelin in order to build a France gazetteer web application. If you are going on holiday to France, probably you will need hotel and restaurant guides, maps, atlases and tourist guides relevant to where you are staying or the places you will visit. So, the free online Michelin database can help you find out which ones are for you.
    The contribution of DEiXTo to the implementation of this useful service was that it scraped geo-location data as well as other metadata fields for 36,000+ French communes from Wikipedia. In France the smallest administrative region is the commune and Wikipedia happened to have all of this relevant information freely available!
    The starting target page contained a list of 95 (or so) departments, each of which contains a large number of communes. Thus, every department detailed page would in turn list all its communes and their corresponding hyperlinks/ URLs. A sample department page looks like this. And last, at a level below, we have the actual pages of interest with all the details needed about each commune. You can see a sample commune Wikipedia page by clicking here and a screenshot from it in the picture below. Meanwhile, this "scenario" also serves as a good example of collaborative wrappers where the output of a wrapper (a txt file with URLs) gets passed as input to a second one.
    It should be noted though that there were slight variations in the layout and structure of the target pages. However, the algorithm DEiXTo uses is quite efficient and robust and usually can deal with such cases. To be more specific, the scraper that was deployed, extracted from each commune page the following metadata: region, department, arrondissement, canton and importantly the latitude and longitude.
    The precision and recall that DEiXTo achieved with these commune pages were amazing (very close to 100%) and as a result the database was finally enriched with the large volumes of information captured. We are really happy that Michelin was able to successfully utilize DEiXTo and create a free and useful online service. So, if you plan a trip to France, you know where to find an informative online map/ guide! :)

Monday, January 2, 2012

Cooperating DEiXTo agents

Basically there are two major, broad categories of cooperating DEiXTo wrappers. In the first one, the wrappers are executed and applied on the same, single page so as to capture bits of interest that are scattered all over this particular target page. On the other hand, the second category comprises cases where the output of a wrapper serves as input for a second one. For the latter, typically the output of the first wrapper is a txt file containing the target URLs leading to pages with detailed information.
    The first category is not supported directly by the GUI tool. However, DEiXToBot (a Mechanize agent object capable of executing extraction rules previously built with the GUI tool) allows the combination of multiple extraction rules/ patterns on the same page and the merging of their results through Perl code. So, if you have come across a complex, data-rich page and you are fluent in Perl and DEiXToBot's interface, you can build the necessary tree patterns separately with the GUI tool and then write a highly efficient set of cooperating Perl robots aiming at capturing all the desired data. It is not easy though, since it requires programming skills and custom code.
    As far as the second type of collaboration is concerned, we have stumbled upon numerous cases where a first wrapper collects the detailed target URLs from listing pages and passes them to a second wrapper which in turn takes over and gathers all data of interest from the pages containing the full text/ description. A typical case would be a blog or a news site or an e-shop, where a first agent could scrape the URLs of the detailed pages and a second one would visit each one of them extracting every single piece of desired information. If you wonder how you can set a DEiXTo wrapper to visit multiple target pages, this can be done either through a text file containing their addresses or via a list. Both ways can be specified in the Project Info tab of the DEiXTo GUI tool.
    Moreover, for the first wrapper which is intended to scrape the URLs, you only have to create a pattern that locates the links towards the detailed pages. Usually this is easy and straightforward. You should just point at a representative link, use it as a record instance and set the A rule node as "checked" (right click on the A node and select "Match and Extract Content"). The resulting pattern will be something like this:
    Then, by executing the rule you can extract the "href" attribute (essentially the URI) of each matching link and export the results to a txt file, say target_urls.txt, which will subsequently be fed to the next wrapper. Please note that if you provide just the A rule node as a pattern, you will capture ALL the hyperlinks found on the page, which is probably not what you want (only those leading to the detailed pages are needed).
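If you prefer to glue the two wrappers together with your own Perl code, the hand-off through target_urls.txt can be sketched roughly as follows. This is only an outline of the idea: the URLs are invented stand-ins for the first wrapper's output, the helper names are ours, and the second stage merely prints each address where a real script would fetch the page and run its own pattern.

```perl
use strict;
use warnings;

# First stage: write the extracted URLs to a text file, one per line.
sub save_urls {
    my ( $file, @urls ) = @_;
    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} "$_\n" for @urls;
    close $out or die "Cannot close $file: $!";
    return;
}

# Second stage: read the list back, skipping blank lines and duplicates
# while preserving the original order.
sub load_urls {
    my ($file) = @_;
    open my $in, '<', $file or die "Cannot read $file: $!";
    my ( %seen, @urls );
    while ( my $line = <$in> ) {
        $line =~ s/^\s+|\s+$//g;    # trim whitespace, including the newline
        next if $line eq '' or $seen{$line}++;
        push @urls, $line;
    }
    close $in;
    return @urls;
}

# Hypothetical detail-page URLs standing in for the first wrapper's output.
save_urls(
    'target_urls.txt',
    'http://www.example.com/item/1',
    'http://www.example.com/item/2',
    'http://www.example.com/item/1',    # listing pages often repeat links
);

for my $url ( load_urls('target_urls.txt') ) {
    # A real second wrapper would fetch $url and apply its pattern here.
    print "Would scrape: $url\n";
}
```

Filtering duplicates at this point is cheap and saves the second wrapper from visiting the same detailed page twice.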
    In conclusion, DEiXTo can power schemes of cooperative robots and achieve very high precision. For more advanced cases in particular, synergies of multiple wrappers are usually needed. Their coordination, though, takes some careful thought and effort. Should you have any questions, please do not hesitate to contact us!