Jóna Jónsdóttir meets John Q. Public!

Data protection leads to "charming" extraction results

Who does not know them: John Doe and his female counterpart Jane, or John Smith and John Q. Public? Placeholder names of this kind are, of course, also used in other countries. While German-speaking countries have their Max Mustermann and Lieschen Müller, the Italians are very fond of their Mario Rossi, the French love their Jean Dupont, the Finns adore Matti Meikäläinen, and Ivan Ivanovich, for his part, is dear to the hearts of the Russians. The Japanese, equally fond of their Nanno Nanigashi, share this passion with the Indians and Poles, who have their Ashok Kumar and Jan Kowalski respectively. In short, placeholders for personal data are used throughout the world.

But this is not the end of the "placeholder" story: a person called "John Q Public" or "John Sample" often lives in "Sample City", located in "Sample Country", and has a sample email address as well as a sample telephone number (e.g. 999-999999). The use of placeholders may be well justified when it protects sensitive personal data or stands in for such data in standard forms. It can, however, turn into a major problem when the JoinVision software development department has to analyse documents containing anonymized data of this kind.

CVlizer, our semantic CV parser, is much more than an ordinary text parser whose functions are limited to processing field names and standardized CV templates. Our parsing engine also analyses the content of the extracted data and assesses its plausibility, with the aim of optimizing the quality of the extraction results and, where possible, eliminating errors that appear in the document or derive from earlier extraction steps.

Although a placeholder name like "John Q. Public" does not in itself affect the document analysis, a term like "Sample City" can become problematic, simply because this place does not exist. In such cases the parsing engine cannot fully exploit its knowledge base and has to resort to heuristic methods that rely on the address format or similar clues. This works in most cases, but occasionally things go wrong.
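
The fallback described above can be illustrated with a minimal sketch. It is not CVlizer's actual implementation: the tiny `KNOWN_CITIES` set, the `resolve_city` function and the assumed "postal code followed by place name" format are all hypothetical and serve only to show the idea of trying a knowledge base first and an address-format heuristic second.

```python
import re

# Hypothetical miniature knowledge base; a real gazetteer would be far larger.
KNOWN_CITIES = {"vienna", "berlin", "paris", "helsinki"}

# Assumed address format: a 4- or 5-digit postal code followed by the place name.
POSTCODE_CITY = re.compile(r"\b(\d{4,5})\s+([^\d,]+)")


def resolve_city(address_line):
    """Return (city, method): 'knowledge_base', 'heuristic' or 'unresolved'."""
    # First choice: a comma-separated part that matches a known place.
    for part in address_line.split(","):
        # Drop a leading postal code such as "1050 " before the lookup.
        name = re.sub(r"^\d{4,5}\s+", "", part.strip())
        if name.lower() in KNOWN_CITIES:
            return name, "knowledge_base"
    # Fallback: trust the address format even though the place is unknown.
    match = POSTCODE_CITY.search(address_line)
    if match:
        return match.group(2).strip(), "heuristic"
    return None, "unresolved"


print(resolve_city("Wehrgasse 28, 1050 Vienna"))
print(resolve_city("Sample Street 1, 12345 Sample City"))
```

The second call still yields a city, but only because the text happens to follow the expected format. A result obtained this way deserves less trust than one confirmed by the knowledge base.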

The same applies to illogical date ranges: whenever people finish their studies or already work as a neurosurgeon at the age of two, it can be assumed that they are either one of Doogie Howser’s highly gifted siblings or that a plausibility error has occurred. It is equally implausible that somebody born in the 19th century still holds a position on the current labour market. Finally, six-digit telephone numbers with the dialling code 9999 are likewise improbable.
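
Checks of this kind could look roughly like the following sketch. The function name `plausibility_issues`, its parameters and both thresholds are assumptions made for illustration and are not taken from CVlizer. The telephone check is also simplified: it only flags numbers made of a single repeated digit.

```python
from datetime import date

# Thresholds are illustrative assumptions, not values taken from CVlizer.
MIN_AGE_FOR_DEGREE_OR_JOB = 14
MAX_WORKING_AGE = 100


def plausibility_issues(birth_year, milestones, phone_digits, today=None):
    """Collect human-readable plausibility warnings for one candidate.

    milestones: list of (label, year) pairs, e.g. ("graduation", 2010).
    phone_digits: the telephone number reduced to its digits.
    """
    current_year = (today or date.today()).year
    issues = []
    # Nobody graduates or works as a neurosurgeon at the age of two.
    for label, year in milestones:
        age = year - birth_year
        if age < MIN_AGE_FOR_DEGREE_OR_JOB:
            issues.append(f"{label} at age {age}")
    # A candidate born in the 19th century is unlikely to be on the market.
    if current_year - birth_year > MAX_WORKING_AGE:
        issues.append(f"birth year {birth_year} too early for an active candidate")
    # A number made of one repeated digit (999-999999) is most likely a dummy.
    if phone_digits and len(set(phone_digits)) == 1:
        issues.append("telephone number consists of a single repeated digit")
    return issues


print(plausibility_issues(1985, [("graduation", 1987)], "999999999",
                          today=date(2020, 1, 1)))
```

Returning a list of warnings rather than a single yes/no flag keeps each reason visible, which makes it easier to see why a CV was flagged.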

Even though this list could be extended considerably, the actual result would often be the same: our development team receives a CV together with a report that extraction errors occurred during data processing. Very often, though, a quick glance at the candidate's personal data provides clarity: good old “John Q. Public” has struck once again...
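
The "quick glance" could in principle be automated with a simple lookup, sketched below. The name list is a small illustrative sample drawn from the names mentioned in this article, and `normalize` and `looks_like_placeholder` are hypothetical helpers, not part of any JoinVision product.

```python
import unicodedata

# Small illustrative sample, stored in normalized form (see normalize below).
PLACEHOLDER_NAMES = {
    "john doe", "jane doe", "john smith", "john q public", "john sample",
    "max mustermann", "lieschen muller", "mario rossi", "jean dupont",
    "matti meikalainen", "jan kowalski",
}


def normalize(name):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in no_accents)
    return " ".join(cleaned.lower().split())


def looks_like_placeholder(name):
    """True if the normalized name appears in the sample list."""
    return normalize(name) in PLACEHOLDER_NAMES


for candidate in ["John Q. Public", "Lieschen Müller", "Jóna Jónsdóttir"]:
    print(candidate, looks_like_placeholder(candidate))
```

Since names such as John Smith or Mario Rossi also belong to many real people, a match is only a hint that a document may have been anonymized, not proof.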

List of all placeholder names by language: https://en.wikipedia.org/wiki/List_of_placeholder_names_by_language


JoinVision is a leading provider of multilingual semantic recruiting technology. With the two parsers CVlizer and JOBolizer, applicant documents and job advertisements are automatically recorded, analyzed and coded. Modules, such as HRclassifier, HRcapture and HRmerger, expand the possibilities to have all information immediately available as a standardized, structured candidate or job profile in XML format. At the end of 2019 JoinVision took over the commercialization of Joveo, a US-based technology provider for programmatic job advertising, in the German-speaking countries.
