PPDS Co-creation workshop 22 February 2024

Agenda

  • 10:00 - 10:10: Introduction
  • 10:10 - 10:55: PPDS Data Quality Business Rules
  • 10:55 - 11:30: Analysis of the PPDS Data Quality: exercise
  • 11:30 - 11:40: Coffee break
  • 11:40 - 12:15: Analysis of the PPDS Data Quality: continuation
  • 12:15 - 12:30: Wrap-up & QAs

Session Rules

  1. During the presentation keep your mic off.
  2. Use the chat to ask your questions.
  3. Do not be shy to participate actively throughout the session.
  4. Please raise your hand if you wish to take the floor.

1. Introduction

Welcome to the 6th PPDS Co-creation session! We will focus on the latest advancements regarding Data Quality Business Rules. The session includes co-creation exercises using MIRO and Mentimeter. MIRO is an online collaborative tool for brainstorming and capturing ideas digitally. Should you wish to get familiar with it, we invite you to watch this short tutorial video: Miro tutorial.

2. PPDS Data Quality Business Rules

The objective of the Data Quality Business Rules is to ensure good data quality within the PPDS in terms of completeness, validity, consistency, and accuracy. The methodology behind these rules will be explained in detail using the PPDS Data Quality GitLab.

3. Analysis of the PPDS Data Quality

The objective of this activity is to gather feedback on the existing business rules and on possible improvements to complement them (e.g. descriptions, examples). We also want to reflect on new business rules to include in the PPDS.

The PPDS Data Quality Miro Board has been created to facilitate the discussion and for you to add/suggest your comments (e.g. “comp-006 [National Registration Number of the Buyer is provided]: This data is not available in my country”).

We invite you to consult the Miro board, which includes all the Quality rules, and to add your comments on the post-its already available in each section.

This will help us spot topics of discussion important for you!

Access the PPDS Data Quality Miro Board: PPDS Miro Board

4. Upcoming

  • Next PPDS Co-creation session (beginning of May)

5. Outcomes of the session

Meeting Coordinators: AROSA Daniel/Marc Christopher SCHMIDT/Antonio Soeiro

After an initial poll, here is some information about the session participants:

We had a total of 73 participants, from 21 different countries:

  • Finland
  • Italy
  • Latvia
  • Slovakia
  • Estonia
  • France
  • Hungary
  • Luxembourg
  • Romania
  • Spain
  • Sweden
  • United Kingdom
  • Belgium
  • Cyprus
  • Germany
  • Iceland
  • Ireland
  • Malta
  • Portugal
  • Slovenia
  • Netherlands

6 different types of users:

  • Policymaker
  • Buyer
  • Company
  • Auditor/Supervisor
  • NGO/Academia/Journalist
  • Other (i.e., eSenders, Data Scientists, System Developers, Policy Analysts)
QUESTIONS AND ANSWERS
On the question "What type of PPDS user are you?": for those who selected "Other", who are you?
- Data Scientist from JRC
- System "developer"
- Process-owner of the purchase to payment on the governmental level
- System architect
- JRC, scientific team
- IT on governmental level
- Other: Procurement Monitoring Bureau. We provide public procurement data, also we are eSender
- eSender
- Economic/policy analyst from JRC
- Data analyst
How do we differentiate between categories (DPS) and lots? Is it an issue? Indeed, the logic of categories does not exist in the forms. When we worked on the DPS guidelines, we saw that the way categories fit into a DPS varies. There are three possibilities:
1. One category, one lot
2. One category, multiple lots (e.g. hardware split into servers, desktops, and notebooks)
3. Multiple categories, one lot (e.g. gaming notebook, normal notebook, small notebook)
We need to analyse how this is used in practice.
Can a given buyer be registered in different countries with different national IDs?
- The BRIS system should provide the links between organizations.
- Actually, buyers can be companies as well, so the same buyer (name) could indeed have different buyer IDs.
Would the EU market share of winners not be an interesting analysis? Yes, this is one of the data analytics aspects under development as well.
Regarding the main CPV code: how do you extract the main CPV code from TED when there are several CPV entries? The ePO already has an attribute for the main CPV (and another for additional CPV codes).
Why is the procedure identifier not included as a business rule? A procedure can have several contracts, and the identifier gives more insight into the procedure. The procedure identifier is always present, at least in TED. We will nevertheless include it as a rule to ensure that it is checked even for below-threshold procedures.
If the PPDS contains below-threshold data from different countries, how can we make sure that there are no duplicate procedures or notices (TED notices have unique identifiers)? There may be unique identifiers in each MS, but these systems are not related to each other. For example, buyers in one country may publish notices in several countries.
- With the introduction of eForms, each procedure will (should) have a unique procedure ID. If this ID were used when publishing one notice in multiple countries, we could easily identify the copies and avoid duplication.
- Indeed, identifiers may be a way of identifying duplicates but not the only one. As part of a data curation application that will be implemented in the PPDS, there will be a set of checks to identify potential duplicates between national data and TED data - or even duplicates within the same datasets (either at national or TED level).
What is the main identifier that allows for an automated link of all the events in a procurement process (potentially from PIN to modification notices)? In TED, the notice identifiers establish such a link. However, when mixing data from TED and national registries, the code given by the contracting authority to the contract should be used. Nevertheless, this code is not always present, or can be duplicated in contracts that are not related.
What if the data is not within a reasonable range? Is the whole lot's data discarded, or only the particular data point that is out of range? If a given value is not within a reasonable range, only that specific data point is left out (e.g., if the estimated value of the lot is 0.1€, this data point will not be used when the computation includes the estimated value of the lot as a variable, but the lot can still be used in other calculations that are not affected by this issue).
Is the Single Market Scoreboard methodology available somewhere? As part of the PPDS, the methodology for calculating the Single Market Scoreboard indicators using PPDS data is also being developed. It is provided through the PPDS GitLab: https://eproc.pages.code.europa.eu/ppds/pages/indicators/
How did you like this co-creation workshop?
- Really enjoyed it / a lot.
- VERY interesting.
- A good format with online participation and Miro board, thank you.
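To make the answer above on values outside a reasonable range more concrete, here is a minimal Python sketch of the idea: a lot with an implausible estimated value is excluded only from computations that use that field and stays available for everything else. The field names, the helper names and the lower bound of 1 € are illustrative assumptions, not the actual PPDS rules or thresholds.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical lower bound for a plausible estimated value, in euros.
# The real PPDS thresholds are not stated here; this value is an assumption.
MIN_REASONABLE_VALUE_EUR = 1.0


@dataclass
class Lot:
    lot_id: str
    estimated_value_eur: Optional[float]
    offers_received: Optional[int]


def has_reasonable_estimated_value(lot: Lot) -> bool:
    """True when the estimated value is present and not implausibly low."""
    return (
        lot.estimated_value_eur is not None
        and lot.estimated_value_eur >= MIN_REASONABLE_VALUE_EUR
    )


def total_estimated_value(lots: list[Lot]) -> float:
    """Sum estimated values, skipping only the out-of-range data points."""
    return sum(
        lot.estimated_value_eur
        for lot in lots
        if has_reasonable_estimated_value(lot)
    )


def average_offers(lots: list[Lot]) -> float:
    """Average number of offers; independent of the estimated value,
    so every lot that reports an offer count is used."""
    counts = [lot.offers_received for lot in lots if lot.offers_received is not None]
    return sum(counts) / len(counts) if counts else 0.0


lots = [
    Lot("LOT-0001", 250_000.0, 4),
    Lot("LOT-0002", 0.1, 2),  # implausible estimated value
]
```

With these two lots, `total_estimated_value(lots)` ignores the 0.1 € value, while `average_offers(lots)` still uses both lots.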
FEEDBACK
Just a note on comp-001: if the source of this ID is the CN, then it could be clearer to call the business rule "Contract Notice ID" (not "contract ID", as it may be understood as an ID for a contract that appears in the CAN).
comp-001 refers to the contract ID, not the CN or CAN ID. Each procedure can be divided into lots, which in turn are divided into contracts.
We would like to ask that this site be updated with the eForms-TED code list. It seems to us that it is not aligned.
val-018 is missing requirement for the integer to be positive.
I am wondering if the EC experience in managing its own contractors could be a source of inspiration. CORDIS / CORDA, managed by RTD, collects and manages the data on the recipients of the EC R&D grants from Framework Programme projects like H2020, etc. My impression is that they assign their own participant identification code (PIC). Similarly, the CORDIS data distinguishes between various types of organizations and between large firms and SMEs. If this has not been done yet, it may be worthwhile to talk to the colleagues managing the CORDIS/CORDA database to learn about their experience, suggestions, etc. Just an idea.
GENERAL COMMENTS
A CN may not lead to a contract. The procedure may be cancelled or receive 0 offers....
We have a system for identifying organizations such as public organizations and private companies. But a buyer may legally be a single part of an organization, and they share the same identifier. There is no system in place for identifying the buyer (from a legal standpoint).
ESPD is not used under EU threshold competitions. And the data on ESPDs seems not to be very accurate.
We also publish statistics on SMEs (size of the winning party) based on national register information. But we use a simplified definition based only on the number of employees, not on economic information such as turnover or balance sheet. It is also hard to take company group relations into account. So, it is not the EU definition that is being applied.
We ask the bidder to enter the information in their profile in the portal.
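The gap between the simplified, headcount-only SME classification mentioned above and the EU definition can be illustrated with a short Python sketch. It assumes the thresholds of Commission Recommendation 2003/361/EC (fewer than 250 staff, and either an annual turnover of at most EUR 50 million or a balance sheet total of at most EUR 43 million). It deliberately ignores company group relations, the headcount threshold used in the national statistics is an assumption, and all function names are illustrative.

```python
# Thresholds from Commission Recommendation 2003/361/EC.
MAX_STAFF = 250                     # headcount must be below this
MAX_TURNOVER_EUR = 50_000_000       # annual turnover ceiling
MAX_BALANCE_SHEET_EUR = 43_000_000  # balance sheet total ceiling


def is_sme_simplified(employees: int) -> bool:
    """Headcount-only classification (assumed national simplification)."""
    return employees < MAX_STAFF


def is_sme_eu(employees: int, turnover_eur: float, balance_sheet_eur: float) -> bool:
    """EU definition for a standalone company: the headcount criterion
    plus at least one of the two financial ceilings."""
    within_financial_ceiling = (
        turnover_eur <= MAX_TURNOVER_EUR
        or balance_sheet_eur <= MAX_BALANCE_SHEET_EUR
    )
    return employees < MAX_STAFF and within_financial_ceiling


# A company with few employees but large financials is classified differently.
simplified = is_sme_simplified(120)
full = is_sme_eu(120, turnover_eur=80_000_000, balance_sheet_eur=60_000_000)
```

In this example the simplified rule counts the company as an SME while the fuller check does not, which is the kind of divergence the comment above describes.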

6. Evaluation of the session

We administered a survey to the participants to gather feedback on the organization of the session.

The first part of the survey consisted of 2 questions: one open-ended and one multiple-choice.

1. Which topics would you like to discuss in the next co-creation sessions? (16 voters out of 73 participants)
- eForms (56,25%)
- SMS (18,75%)
- Indicators (12,50%)
- TED (6,25%)
2. Which type of format would you prefer for future sessions? (24 voters out of 73 participants)
- Fully Online (37,50%)
- Fully on-site in Brussels (33,30%)
- Hybrid (29,50%)

The second part of the survey consisted of 4 questions, each answered on a scale from “strongly disagree” to “strongly agree”. We received 15 answers out of 73 participants (about 20% of participants).

3. The materials of the session were relevant and contributed to my understanding.
- 78,60% of voters responded “Rather agree” or “Strongly agree”.
4. The time assigned for the content delivery was adequate.
- 80% of voters responded “Rather agree” or “Strongly agree”.
5. The PPDS team responded to questions effectively and motivated the attendees to participate.
- 93,30% of voters responded “Rather agree” or “Strongly agree”.
6. The organization of the session seemed appropriate.
- 100% of voters responded “Rather agree” or “Strongly agree”.