Building an RSS feed for the Greek e-procurement platform

In early 2013, the Greek government launched the Central Electronic Registry for Public Contracts (CERPC, or “Κεντρικό Ηλεκτρονικό Μητρώο Δημοσίων Συμβάσεων” in Greek). It records all tendering procedures for public procurements with a budget over 1,000 euros. It was set up to control state expenditure on public contracts, to facilitate and encourage the participation and competition of companies in accordance with the principles of transparency and equal opportunities, and to comply with European and national legislation. Consequently, public sector organisations of all kinds are obliged to publish their tender notices, as well as the resulting contracts and follow-up payments, on CERPC.

promitheus.gov.gr - Ε.ΣΗ.ΔΗ.Σ.

Its data is open and freely available, and as of today (21/12/2013) almost 13,000 tenders have been published. However, the system offers neither an API (in contrast with the Cl@rity Program) nor an RSS feed. So we thought it would be interesting and potentially useful to create an RSS feed for CERPC by scraping the latest tender notices. The goal of this effort is twofold:

  • allow people to subscribe to the feed and get the latest tenders automatically through an RSS reader, and
  • help developers build innovative applications around public e-procurement data by pulling an updated feed on a daily basis.

RSS is a popular and mature technology and there are numerous RSS tools out there. Thus, one could easily utilise the tenders feed.

The e-procurement portal, however, makes heavy use of frames and AJAX/JavaScript, which typically make scraping harder. Selenium did the trick and allowed us to simulate the way a user interacts with the CERPC website. The agent we developed programmatically visits the search page, inserts values into the “From – To” (Από – Έως) date fields, submits the form and crawls through the results returned. In a second stage, we used an efficient DEiXTo pattern to capture all pieces of information on the listing/result pages. It was then quite straightforward to build the desired RSS feed, and we are happy to announce that it is available here:

Tenders RSS feed for the Greek e-procurement platform
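
For readers who would like to try something similar, the form-driving step described above might look roughly like the following Python sketch. It is not our actual agent: the search URL, the frame name, the element ids, the CSS selector and the dd/mm/yyyy date format are all placeholders we made up for illustration, and the real values have to be read from the portal's HTML. The DEiXTo extraction stage is not reproduced here; the sketch simply returns the text of each result row.

```python
from datetime import date, timedelta

# Hypothetical values: replace them with the ones found in the portal's HTML.
SEARCH_URL = "https://example.org/cerpc/search"
FRAME_NAME = "content"
FROM_FIELD_ID = "dateFrom"
TO_FIELD_ID = "dateTo"
SUBMIT_ID = "searchButton"
RESULT_ROW_CSS = "table.results tr"


def date_range(today, days_back=3):
    """Return (from, to) strings covering `days_back` days that end on `today`.

    The dd/mm/yyyy format is an assumption about what the form expects.
    """
    start = today - timedelta(days=days_back - 1)
    return start.strftime("%d/%m/%Y"), today.strftime("%d/%m/%Y")


def fetch_result_rows(today=None):
    """Open the search page, fill in the date fields and return the row texts."""
    # Selenium is imported here so that date_range() stays usable without it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    date_from, date_to = date_range(today or date.today())
    driver = webdriver.Firefox()
    try:
        driver.get(SEARCH_URL)
        # The search form is assumed to live inside a frame.
        driver.switch_to.frame(FRAME_NAME)

        from_field = driver.find_element(By.ID, FROM_FIELD_ID)
        from_field.clear()
        from_field.send_keys(date_from)

        to_field = driver.find_element(By.ID, TO_FIELD_ID)
        to_field.clear()
        to_field.send_keys(date_to)

        driver.find_element(By.ID, SUBMIT_ID).click()

        # Results are loaded via AJAX, so wait until the rows are present.
        rows = WebDriverWait(driver, 30).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, RESULT_ROW_CSS))
        )
        return [row.text for row in rows]
    finally:
        driver.quit()
```

The explicit wait is the important part: because the results arrive asynchronously, reading the page immediately after the click would often return nothing.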


It should be noted that the e-procurement platform does not offer persistent links, so you cannot bookmark the URL of a tender detail page or send it to another person. Consequently, we could not link each RSS item to its native detail page. As a workaround, we decided to download the full text (available in PDF format) and link the RSS items to the local PDF files instead.
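
To illustrate the workaround, here is a minimal sketch of how RSS 2.0 items can point to locally stored PDF files, using only the Python standard library. The function name `build_feed`, the dictionary keys (`title`, `pdf_name`, `published`) and the example.org base URL are assumptions made for this example and do not describe our actual feed generator.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(tenders, pdf_base_url):
    """Return an RSS 2.0 document (as a string) whose items link to local PDFs.

    `tenders` is a list of dicts with the hypothetical keys
    'title', 'pdf_name' and 'published' (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "CERPC tenders"
    ET.SubElement(channel, "link").text = pdf_base_url
    ET.SubElement(channel, "description").text = (
        "Latest tender notices scraped from CERPC"
    )

    for tender in tenders:
        # The portal has no persistent detail URLs, so link to our own PDF copy.
        pdf_url = pdf_base_url.rstrip("/") + "/" + tender["pdf_name"]
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = tender["title"]
        ET.SubElement(item, "link").text = pdf_url
        # The PDFs are not kept forever, so the guid is not marked as a permalink.
        ET.SubElement(item, "guid", isPermaLink="false").text = pdf_url
        # RSS expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = format_datetime(tender["published"])

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    example = [{
        "title": "Προμήθεια γραφικής ύλης",
        "pdf_name": "tender-0001.pdf",
        "published": datetime(2013, 12, 21, tzinfo=timezone.utc),
    }]
    print(build_feed(example, "https://example.org/pdfs/"))
```

Using the PDF URL as the item's guid lets RSS readers recognise a tender they have already seen when the feed is regenerated the following night.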

Finally, we will be running the scraper every night in order to gather fresh data and we have already made the necessary preparations on our server. The feed is going to provide the notices published over the last three days but due to disk space limitations we will store the PDF documents only for a month or so. Therefore, on our server you will find only the most recent tenders along with their full text.
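
The nightly housekeeping can be sketched along the following lines. This is an illustration under stated assumptions rather than our production script: the function names, the `published` key, the 30-day cut-off (our reading of "a month or so") and the use of file modification time as the age of a PDF are all choices made for the example.

```python
import os
import time
from datetime import date, timedelta


def recent_tenders(tenders, today, days=3):
    """Keep tenders published within the last `days` days, today included.

    Each tender is a dict with a hypothetical 'published' key holding a date.
    """
    cutoff = today - timedelta(days=days - 1)
    return [t for t in tenders if cutoff <= t["published"] <= today]


def prune_old_pdfs(pdf_dir, max_age_days=30, now=None):
    """Delete PDF files in `pdf_dir` older than `max_age_days`; return their names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds in a day
    removed = []
    for name in sorted(os.listdir(pdf_dir)):
        path = os.path.join(pdf_dir, name)
        is_pdf = name.lower().endswith(".pdf") and os.path.isfile(path)
        # File modification time stands in for the download date.
        if is_pdf and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```

A scheduler such as cron could call the scraper first and then `prune_old_pdfs`, so that disk usage stays roughly constant from one night to the next.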

In conclusion, as open data advocates, we would like to applaud the adoption of e-procurement in Greece and generally we encourage people and organisations to take advantage of the wealth of open, public data and use it creatively for the benefit of the general public.

 
