Please read the following case and answer the question followed by it.
CASE STUDY
Optimisation of addresses for European football
The organisation of the European Football Championships applied strict guidelines during the sale of tickets to the international public in 2008. Each individual was allowed to buy only one ticket for themselves and one for a guest, and the personal data of both were to be registered. To avoid tickets falling into the hands of hooligans and unofficial traders, Human Inference was asked for assistance.
"We knew beforehand people would try to get hold of more than one ticket," said Jos de Kruif, ticketing sales manager. The simple de-duplication tool the organisation used to identify those people in the database produced a list of suspicious cases. Customer service, however, kept on facing weird requests.
What EURO 2008 wanted then was first validation and standardisation of the names and addresses in the database. The tickets would be sent to customers shortly before the game. Therefore it was of utmost importance for the personalised tickets to be distributed to the right addresses and the right persons.
Names were set in a standard format with a program named HIquality Name, whose function was to apply capitals in the proper way. Many names were corrected. Hundreds of names were rejected because the family name was missing or a company name was mentioned. Controlling and standardising the addresses was done with the help of HIquality Address. HIquality Address compared the filled-in address with the most similar address in the postcode databases of a specific country. An error margin was incorporated and all addresses that showed minor differences were printed on a list. For countries without available postcode databases other solutions were invented.
Many of the addresses were standardised in the format the organisation desired. There were numerous reasons for adjusting the addresses (see Table 6.1).

To de-duplicate the database, the Football Championship organisation applied HIquality Identify. Experience shows that most fraudulent people only make minor adjustments in their personal data; the system HIquality Identify assesses this similarity between records. The database for sold tickets contained, before analysis, approximately 400,000 records.
The de-duplication exercise flagged 9 per cent of the records as potential duplicate names and addresses. From a safety control perspective this was a significant filtering outcome. The same names with different first letters were found many times. First and last names were inverted, names were given with and without the maiden name, a neighbour's house number was given instead of the person's own, or a slightly adjusted birth date was given.
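The kind of similarity assessment described above can be illustrated with a small sketch. This is not how HIquality Identify works internally (the case does not say); it only assumes a generic string-similarity measure from Python's standard library. The field names, the sample records and the 0.85 threshold are all made up for illustration.

```python
from difflib import SequenceMatcher

# Assumed cut-off above which two records are flagged for manual review.
THRESHOLD = 0.85


def normalise(record):
    """Build one comparison string per record.

    Name parts are lower-cased and sorted, so that inverted first and
    last names ("Jansen Jan" vs "Jan Jansen") produce the same text.
    """
    name = " ".join(sorted(record["name"].lower().split()))
    return f"{name}|{record['address'].lower()}|{record['birth_date']}"


def similarity(a, b):
    """Return a ratio between 0.0 (nothing shared) and 1.0 (identical)."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def is_potential_duplicate(a, b, threshold=THRESHOLD):
    """Flag two records as a potential duplicate, not as proven fraud."""
    return similarity(a, b) >= threshold


# Invented sample records: rec_b inverts the name, uses the neighbour's
# house number and shifts the birth date by one day.
rec_a = {"name": "Jan Jansen", "address": "Dorpsstraat 12, Utrecht",
         "birth_date": "1975-03-14"}
rec_b = {"name": "Jansen Jan", "address": "Dorpsstraat 14, Utrecht",
         "birth_date": "1975-03-15"}
rec_c = {"name": "Maria Rossi", "address": "Via Roma 5, Milano",
         "birth_date": "1982-11-02"}

print(is_potential_duplicate(rec_a, rec_b))  # minor adjustments only
print(is_potential_duplicate(rec_a, rec_c))  # unrelated person
```

A threshold like this trades missed duplicates against false alarms, which is why the case describes the flagged records as potential duplicates that still needed review.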
To prevent organised fraud, not only individual records were compared but also groups of records. If person 1 resembles person 4, person 2 matches person 4 and person 5 shows similarities with person 1, the group relations were presented. In this way a street in Reykjavik was identified where several persons together had requested 60 tickets.
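One way to picture the group comparison is to treat every pairwise match as a link and collect all records that are connected, directly or indirectly. The sketch below uses a union-find structure for this; it is an assumed technique chosen for illustration, since the case does not describe how the groups were actually computed. The function name `group_records` is hypothetical.

```python
from collections import defaultdict


def group_records(pairs):
    """Merge pairwise matches into groups of linked records.

    `pairs` is an iterable of (record_id, record_id) tuples, one per
    pairwise match. Returns a sorted list of sorted groups.
    """
    parent = {}

    def find(x):
        # Follow parent links to the group's representative,
        # shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # join the two groups

    groups = defaultdict(set)
    for record_id in list(parent):
        groups[find(record_id)].add(record_id)
    return sorted(sorted(members) for members in groups.values())


# The example from the text: 1 resembles 4, 2 matches 4, 5 resembles 1.
print(group_records([(1, 4), (2, 4), (5, 1)]))  # [[1, 2, 4, 5]]
```

Persons 2 and 5 are never compared directly, yet they end up in the same group because both are linked to it through persons 4 and 1.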
There were three sources of personal data: the internet, a data-entry agency and the organisation's own office. Since all three sources had their own database characters, diacritics were presented in varying ways. After importing the data in an SQL file the became a or and became a ;. Although this complicated data comparison, it turned out that several people ordered tickets through multiple channels.
Hooligans formed a special risk category during the analysis. Using specialised technology, people who most likely were preventing correct identification by using variations on their names were traced. The German, English, Belgian and Dutch football associations provided databases with names of hooligans who were no longer allowed to enter a stadium. Fifty hooligans were identified this way. De Kruif: "To avoid criticism afterwards we had to do everything to get these people out of our databases!" Each of these 50 hooligans had ordered one or more tickets and all of them were cancelled.
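A common way to compare names that arrive with differently encoded diacritics is to reduce each value to an accent-free comparison key. The sketch below assumes the text is still valid Unicode; it does not repair characters that were already corrupted during import. The function name `comparison_key` is hypothetical, and the case does not say which approach was actually used.

```python
import unicodedata


def comparison_key(text):
    """Return an accent-free, case-insensitive key for matching.

    NFKD splits an accented letter into a base letter plus combining
    marks; dropping the marks makes "José" and "Jose" compare equal.
    Letters without a decomposition, such as "ø", stay unchanged.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return stripped.casefold().strip()


# The same (invented) name as it might arrive from different channels.
print(comparison_key("José Müller") == comparison_key("JOSE MULLER"))
```

The key is only meant for matching; the original spelling would still be kept for printing on the personalised ticket.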
The resulting data set was used by the Football Championship organisation to accept and fulfil orders. The list with minor errors was processed manually. This also holds for people who potentially ordered more than two tickets for a match. Many of them probably would not have known they were only allowed to book two tickets or made an unintended error. They were not necessarily cheats. The organisation in total refused 1,100 orders. In the end, only those people in Reykjavik can tell you whether they acted in bad faith. This is also the case for the 48-year-old man from a little village in the country who lives with his 47-year-old brother at the same address . . .
Question:
3 How can you estimate the financial value of (a lack of) data quality in this case? (5 marks) (The answer must relate to the case study above, and its length must correspond to the marks given.)
Table 6.1 Errors during standardisation and validation of names and addresses
Name errors:
- Name is not the name of a private person
- Name and sex are contradictory
- Two persons with the same name
- Part of the name is left out
- Unknown double-barrelled surname
- Unclear interpretation which results in different ways of writing a name
- Family name missing
- Maiden name was left out; sex unknown
- Unknown element
Address errors:
- Address can be standardised, but was abbreviated
- Alpha section of postcode was incorrect or missing
- Foreign address
- Incorrect address
- Incorrect street name
- Incorrect or missing street number
- Industry park
- Unique address missing
- No unique standardisation possible
- Address does not make sense
- City and postcode do not match
Step by Step Solution
Estimating the Financial Value of Data Quality in the Case
In this case study the financial impact of data quality can be estimated through both direct and indirect factors linked to the standardisati...
