March 6, 2021

The INE is mass-scraping the websites of Airbnb, Booking and Vrbo to find out how many tourist homes there are in Spain

Over the past year, the National Institute of Statistics (INE) has downloaded data from the main tourist-housing websites to estimate how many such homes there are in Spain. The work is part of its so-called ‘experimental statistics’, innovative projects in which the institute uses new sources or methods. The results will be released on December 17 and will represent an important advance, since not all autonomous communities keep up-to-date registries. Until now, data extracted directly from the websites was either in private hands (companies such as AirDNA charge for it) or came from open third-party projects, which do not always cover every city. This newspaper has analyzed tourist homes using DataHippo data.

The technique used by the INE is ‘web scraping’, which consists of ‘scraping’ the content of a website and extracting as much data as possible in bulk. The project's technical document, which is already public, explains that information on tourist homes is needed to analyze their impact and to apply regulations on a better-informed basis. Furthermore, Eurostat requires it and is aware that all countries have trouble accessing it. Eurostat has a pilot project with Airbnb, Booking, Expedia and Tripadvisor to share this information.
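To make the idea of extracting data in bulk concrete, here is a minimal sketch in Python. It does not reproduce the INE's programs, which the article does not describe. The sample markup, the `listing` class, the `data-id` and `data-city` attributes and the `extract_listings` helper are all hypothetical, and real listing pages are structured differently.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; real platforms use
# different, frequently changing page structures.
SAMPLE_PAGE = """
<ul>
  <li class="listing" data-id="a1" data-city="Madrid">Flat near Sol</li>
  <li class="listing" data-id="a2" data-city="Sevilla">Studio in Triana</li>
  <li class="ad">Sponsored</li>
</ul>
"""


class ListingParser(HTMLParser):
    """Collects one dict per <li class="listing"> element."""

    def __init__(self):
        super().__init__()
        self.listings = []
        self._current = None  # listing being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "listing":
            self._current = {
                "id": attrs.get("data-id"),
                "city": attrs.get("data-city"),
                "title": "",
            }

    def handle_data(self, data):
        # Text is only kept while inside a listing element.
        if self._current is not None:
            self._current["title"] += data.strip()

    def handle_endtag(self, tag):
        if tag == "li" and self._current is not None:
            self.listings.append(self._current)
            self._current = None


def extract_listings(html):
    """Return every listing found in an HTML string."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return parser.listings


if __name__ == "__main__":
    for listing in extract_listings(SAMPLE_PAGE):
        print(listing)
```

A real scraper would repeat this step over many downloaded pages; the sketch parses a single in-memory string so that it runs without network access.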

The INE's first idea was to contact the tourism authorities of each community, but the result was not “entirely satisfactory”, so it developed ‘scraping’ programs. Once the data, which includes both licensed and unlicensed apartments, has been extracted, it is run through an algorithm to detect duplicates. The three platforms analyzed are Airbnb, Booking and Vrbo.
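The article does not say how the INE's duplicate detection works. As a rough illustration of the problem, the sketch below flags listings from different platforms that share an approximate location, number of bedrooms and capacity. The field names, the sample records and the matching rule are assumptions made for this example, not the institute's method.

```python
from collections import defaultdict

# Invented sample records; none of these come from the INE or the platforms.
SAMPLE_LISTINGS = [
    {"platform": "airbnb", "lat": 40.41681, "lon": -3.70379, "bedrooms": 2, "capacity": 4},
    {"platform": "booking", "lat": 40.41684, "lon": -3.70381, "bedrooms": 2, "capacity": 4},
    {"platform": "vrbo", "lat": 37.38863, "lon": -5.98233, "bedrooms": 1, "capacity": 2},
]


def dedup_key(listing, precision=3):
    """Build a coarse matching key (assumed rule, for illustration).

    Rounding coordinates to three decimals groups points roughly within
    100 m of each other; points that fall on either side of a rounding
    boundary will not match, which a real algorithm would have to handle.
    """
    return (
        round(listing["lat"], precision),
        round(listing["lon"], precision),
        listing["bedrooms"],
        listing["capacity"],
    )


def find_duplicates(listings):
    """Return groups of listings that share a key across two or more platforms."""
    groups = defaultdict(list)
    for listing in listings:
        groups[dedup_key(listing)].append(listing)
    return [
        group
        for group in groups.values()
        if len({item["platform"] for item in group}) > 1
    ]


if __name__ == "__main__":
    for group in find_duplicates(SAMPLE_LISTINGS):
        print([item["platform"] for item in group])
```

With the sample records, the Airbnb and Booking entries are grouped as probable duplicates and the Vrbo entry is left alone. A rule this coarse would also merge distinct flats in the same building, which is one reason a production algorithm needs more signals.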

Airbnb does not usually like to share data. In the prospectus for its upcoming IPO, it described this as a risk: “If a new regulation forces us to share host data with a city, revenue will drop because there will be hosts who don’t want to and leave the platform.” This newspaper contacted the company for its opinion on the INE’s ‘scraping’, but had not received a response by the time this article was finished. Nor had it heard from Booking. We will update the article when we have more information.

This is not the only experimental tourism statistic the INE has under way. Juan Manuel Rodríguez Poo, president of the institute, announced at an event organized by the Ministry of Industry that it has two more new products. The first measures card spending by foreign visitors through POS terminals (rather than through surveys, as is done monthly). This statistic was published this Wednesday. During the pandemic, the Government has also used this type of information to analyze the evolution of the economy, and several banks publish data on card spending across their own POS networks.

The second measures domestic and foreign tourism using position data from mobile phones. There have been earlier projects with this type of information, such as the one carried out by the INE itself during the lockdown or the one prepared by the Ministry of Transport to track the evolution of mobility. The visitor measurement will complement the information from the survey of tourist movements at borders (FRONTUR) and will be published during the first half of 2021.

In both cases, Rodríguez pointed out, the aim is to access information that is “more granular” (more detailed) and “more frequent” (the snapshot will not be static, but will be updated). “The use of these massive data sources is going to be a great advantage for institutional and academic environments”, he concluded.

