Data Security and Compliance

Is ChatGPT Generative AI GDPR-compliant or not?

Introduction: Understanding the GDPR

The General Data Protection Regulation (GDPR) is a landmark piece of European Union (EU) legislation that has applied since May 2018. It was designed to harmonize data privacy laws across Europe and give individuals greater protection and rights. The GDPR affects companies worldwide, especially those operating within the EU or handling the personal data of people in the EU.

Among the GDPR's many provisions, some key rights and obligations include:

  • The right of access by the data subject;

  • The right to rectification of data;

  • The right to erasure (or "right to be forgotten");

  • The right to restrict processing;

  • The obligation for data protection by design and by default.

The legislation has had far-reaching impacts on global businesses and their operations, including on how they handle, store, and process personal data.

ChatGPT: A brief overview

ChatGPT, developed by OpenAI, is a language model designed to generate human-like text based on the input it receives. It has been widely praised for its capabilities since its release at the end of 2022. Within a few months of that release, however, concerns about its compliance with the GDPR began to emerge.

The pioneering challenge: The Italian Data Protection Authority's ban on ChatGPT

In late March 2023, the Italian Data Protection Authority, known as the "Garante della Privacy", issued an interim emergency decision ordering OpenAI to stop using ChatGPT to process the personal data of individuals in Italy until further examination could be conducted. The action was prompted by a data breach that occurred on 20 March, although its impact was described as "extremely low." The breach, caused by a technical glitch, exposed the titles of active users' chat histories and some payment-related data to a small fraction of users. The incident drew attention to the larger issue of GDPR non-compliance, a concern voiced by Mr. Guido Scorza, a member of the Garante board.

The Garante's concerns about GDPR non-compliance were three-fold:

  • Firstly, OpenAI had amassed personal data from billions of individuals for algorithm training without prior notification or a proper legal basis;

  • Secondly, users of ChatGPT were not adequately informed about how their personal data was processed and used;

  • Lastly, AI-generated responses could contain inaccurate personal data, which would then itself be processed.

Figure 1: The opening of the Italian authority's groundbreaking ban

The ban was a wake-up call for OpenAI and a sign of escalating regulatory scrutiny of generative AI technologies. The Garante's decision was not a final verdict but an interim measure to limit the risk of GDPR violations, and it pointed to the GDPR articles that ChatGPT allegedly breached. The Garante gave OpenAI an opportunity to address these concerns, stressing the need for clarity about data processing procedures and for age verification mechanisms to protect minors from exposure to inappropriate content.

By the end of April 2023, a path towards resolution was in sight: the Garante was prepared to lift the ban provided OpenAI introduced new data protection controls to address the identified concerns. OpenAI responded by amending its online notices and privacy policy to emphasize transparency, optionality, and security. The temporary ban was then lifted and ChatGPT's services were reinstated in Italy.

The lesson of this episode is that AI service providers need stringent data protection measures aligned with regional regulatory frameworks if they want to build trust and operate sustainably. However, while some of the measures were easy to implement (e.g., obtaining informed consent to the processing and use of personal data, and controlling access by people under 18), other concerns are intrinsically difficult to address with the available technology and remain largely open. Chief among them are training on personal data without prior notification or a proper legal basis, and the processing of inaccurate personal data.

The latest (and broadest) case of Generative AI scrutiny: The Polish Data Protection Authority filing

The concerns about ChatGPT and its potential GDPR violations came into the spotlight once again when a user named Lukasz Olejnik lodged a formal complaint. Olejnik, a privacy and digital rights researcher, raised concerns about OpenAI's data handling practices under the GDPR.

Olejnik's interactions with ChatGPT, coupled with his inability to exercise his GDPR rights effectively, became the catalyst for the subsequent inquiry. The specific timeline after the submission of the complaint is uncertain; however, the Polish data protection authority, UODO, started its investigation based on the filed grievance.

The core of the accusation lies in OpenAI's handling of users' personal data. The GDPR grants users several rights, including access to their personal data, rectification of inaccuracies, and erasure, among others. Olejnik's complaint emphasized the potential inability of ChatGPT users to exercise these rights.

In particular, the concerns highlighted:

  • ChatGPT's alleged inability to provide Olejnik with the data it might have stored about him;

  • Questions about OpenAI's data protection measures and whether ChatGPT's operations align with the GDPR's principle of data protection by design and by default;

  • OpenAI's purported failure to engage in prior consultations with the UODO or any other European DPA before launching ChatGPT, as required by Article 36 of the GDPR;

  • A call for OpenAI to submit a Data Protection Impact Assessment (DPIA) detailing its data processing activities for ChatGPT. A DPIA is a fundamental component of GDPR compliance in Europe, providing a structured assessment of the potential risks and mitigations when handling personal data.

These are standard requirements for any digital product distributed in the European Union, and as such were to be expected of ChatGPT as well.

But how does ChatGPT actually score against GDPR requirements?

GDPR Requirements and ChatGPT

#1: Right of Access by the Data Subject

What GDPR says: Individuals have the right to access their personal data and supplementary information. This means that they have the right to understand whether their data is being processed and to access it if so.

ChatGPT status: The recent complaint by Lukasz Olejnik against ChatGPT highlights potential challenges in accessing personal data stored by ChatGPT. The claim suggests that users, including Olejnik, may not be able to access their stored data in the system, potentially infringing upon this GDPR requirement.

#2: Right to Rectification

What GDPR says: Data subjects have the right to have inaccurate personal data rectified, and to have incomplete personal data completed.

ChatGPT status: It remains unclear whether users can modify or rectify data once it has been processed by ChatGPT. If users can't amend potentially inaccurate data entries, ChatGPT might be in violation of this principle.

#3: Right to Erasure (Right to be Forgotten)

What GDPR says: Individuals can request the deletion or removal of personal data where there's no compelling reason for its continued processing.

ChatGPT status: Olejnik's complaint raises concerns about ChatGPT's potential inability to erase user data upon request. If these concerns are substantiated, it would be a significant area of non-compliance.

#4: Right to Restrict Processing

What GDPR says: Individuals have the right to block or suppress the processing of their personal data.

ChatGPT status: As with the rights above, it is still an open question whether ChatGPT allows users to restrict how their data is processed. The ongoing investigation may shed more light on this.

#5: Obligation for Data Protection by Design and Default

What GDPR says: Companies should design their systems with data protection in mind. This means that, by default, only the data necessary for each specific purpose should be processed, stored, and made accessible.

ChatGPT status: Given the concerns raised, it's essential to evaluate how ChatGPT was designed concerning GDPR principles. If ChatGPT processes more data than necessary or doesn't prioritize user data protection from the outset, it could be at odds with this GDPR principle.

Additional Concerns: Article 36 and Data Protection Impact Assessment (DPIA)

Article 36 of the GDPR requires controllers to consult the data protection authority before starting processing when a DPIA indicates that the processing would pose a high risk to data subjects' rights and freedoms in the absence of mitigating measures.

ChatGPT status: The concern raised in Olejnik's complaint suggests that OpenAI did not engage in such prior consultations with the UODO or other European DPAs before launching ChatGPT. This may be a significant oversight in GDPR compliance.

Furthermore, the DPIA, which Article 35 of the GDPR requires for processing likely to result in a high risk, is an assessment tool used to identify and minimize data protection risks. Olejnik's complaint urges the Polish regulator to require OpenAI to submit a DPIA, which would provide clarity on how ChatGPT processes personal data.

The Price of Privacy: GDPR sanctions and their real-world impact

In the digital realm, GDPR acts as a "watchdog", ensuring that organizations handle personal data with the utmost integrity. However, when entities falter in adhering to this regulation, the consequences can be financially and operationally severe. The GDPR sanctions are tiered based on the gravity of the infringement.

  • The most severe violations can trigger fines of up to €20 million or 4% of the preceding financial year's total worldwide annual turnover (whichever is higher);

  • Less severe violations, though not to be taken lightly, can incur fines of up to €10 million or 2% of the preceding financial year's total worldwide annual turnover (whichever is higher).
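The arithmetic behind the two tiers can be sketched in a few lines of Python. This is only an illustration of the "whichever is higher" rule: the function name `max_gdpr_fine` and its parameters are made up for this example, and the result is the statutory ceiling, not a prediction of what a regulator would actually impose.

```python
def max_gdpr_fine(annual_turnover_eur: float, severe: bool = True) -> float:
    """Return the upper limit of a GDPR fine for a given worldwide annual turnover.

    severe=True  -> upper tier: EUR 20 million or 4% of turnover, whichever is higher
    severe=False -> lower tier: EUR 10 million or 2% of turnover, whichever is higher
    """
    if severe:
        fixed_amount, percent = 20_000_000, 4
    else:
        fixed_amount, percent = 10_000_000, 2
    # The cap is the larger of the fixed amount and the turnover-based amount.
    return max(fixed_amount, annual_turnover_eur * percent / 100)


if __name__ == "__main__":
    for turnover in (100_000_000, 1_000_000_000):
        print(turnover, max_gdpr_fine(turnover), max_gdpr_fine(turnover, severe=False))
```

For a company with a turnover of €100 million the fixed amounts set the ceiling, while at €1 billion the percentage takes over: 4% of €1 billion is €40 million, which exceeds the €20 million figure.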

But the GDPR is not only about financial penalties; it also empowers Data Protection Authorities (DPAs) to impose corrective measures. These range from issuing warnings, through imposing temporary or permanent bans on data processing and ordering data erasure, to suspending data transfers to third countries. Sanctions are not arbitrary: they are weighed against several factors, including the nature and severity of the violation, the preventive measures adopted, the level of cooperation with the authorities, and prior infringements.

GDPR sanctions are not just a theoretical construct. They have left lasting marks on corporate reputations and finances, with cumulative fines of roughly €4.5 billion as of October 2023.


Figure 2: Cumulative GDPR fines over time. Source: www.enforcementtracker.com

A notable instance is British Airways: following a data breach, the UK regulator announced in 2019 its intention to fine the airline £183 million (about €204.6 million), although the final penalty issued in 2020 was reduced to £20 million. Tech giants were not spared either; Google received a €50 million fine due to a lack of transparency in its processing of data for advertising. Other large companies such as Marriott International, Austrian Post, and Deutsche Wohnen SE have also faced significant penalties for GDPR infringements.

These cases are not mere scare tactics. They are a clear signal that GDPR compliance matters, and a prompt for organizations to learn from them and strengthen their data protection frameworks.

The Path Forward: Generative AI and Privacy Regulation

Generative artificial intelligence, such as OpenAI’s ChatGPT, represents a massive leap forward in AI technology. These models can produce intricate content, have real-time conversations with users, and engage in tasks that previously seemed out of reach for machines. However, with great power comes great responsibility. As we've seen with the case of OpenAI's potential GDPR concerns, there's a pressing need to ensure these Generative AI marvels also prioritize user privacy.

The Intricacy of Generative AI and Privacy Compliance

Traditional data handling and privacy measures often rely on clear definitions and tangible data points, making it relatively straightforward to address issues like data access, correction, and erasure. Generative AI, given its vast and intricate training data, brings about challenges that aren’t always addressed by traditional data protection regulations. This is mainly because generative models don't just store data; they assimilate, generalize, and then generate outputs based on patterns. The line between stored personal information and the knowledge ingrained within the model is blurry, making compliance with regulations like GDPR particularly challenging.
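To make the contrast concrete, here is a minimal sketch of how a conventional record store can serve access, rectification, restriction, and erasure requests. The class `PersonalDataStore` and all of its method names are hypothetical and invented for illustration; the point is only that each right maps onto a simple operation when personal data lives in identifiable records, which is not the case for patterns absorbed into a trained model.

```python
from dataclasses import dataclass


@dataclass
class SubjectRecord:
    data: dict
    restricted: bool = False  # set once the subject asks to restrict processing


class PersonalDataStore:
    """Hypothetical in-memory store keyed by a data-subject identifier."""

    def __init__(self) -> None:
        self._records = {}

    def save(self, subject_id: str, data: dict) -> None:
        self._records[subject_id] = SubjectRecord(dict(data))

    def access(self, subject_id: str) -> dict:
        """Right of access: return a copy of everything held about the subject."""
        record = self._records.get(subject_id)
        return dict(record.data) if record else {}

    def rectify(self, subject_id: str, field_name: str, value) -> None:
        """Right to rectification: overwrite one inaccurate field."""
        self._records[subject_id].data[field_name] = value

    def restrict(self, subject_id: str) -> None:
        """Right to restrict processing: keep the data but mark it as off-limits."""
        self._records[subject_id].restricted = True

    def may_process(self, subject_id: str) -> bool:
        record = self._records.get(subject_id)
        return record is not None and not record.restricted

    def erase(self, subject_id: str) -> bool:
        """Right to erasure: delete the record; True if something was removed."""
        return self._records.pop(subject_id, None) is not None


if __name__ == "__main__":
    store = PersonalDataStore()
    store.save("user-1", {"email": "old@example.com"})
    store.rectify("user-1", "email", "new@example.com")
    print(store.access("user-1"))
    store.restrict("user-1")
    print(store.may_process("user-1"))
    print(store.erase("user-1"), store.access("user-1"))
```

No comparably direct operation exists for removing or correcting what a generative model has generalized from its training data, which is why the rights discussed above are hard to honor in that setting.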

De-identification and Anonymization: A New Approach

To overcome these challenges and respect the essence of data protection regulations, companies leveraging generative AI need to adapt. One promising approach lies in de-identification or anonymization techniques.

De-identification involves stripping data of direct personal identifiers, so that the individual source of the data cannot be easily traced. Anonymization, on the other hand, is a more rigorous process in which data is transformed so that it can no longer be linked back to an individual, even when combined with other data sources.

By incorporating these techniques, AI companies can achieve two main objectives:

  1. Minimizing Risks of PII Leakage: By ensuring that the data fed into these models is already de-identified or anonymized, companies can drastically reduce the risk of inadvertently generating outputs that contain or reveal personally identifiable information (PII).

  2. Building Trust with Users: A user’s trust is paramount. By openly implementing and communicating about these techniques, companies can assure users that their interactions with the AI are private and secure, even if the AI model is expansive and intricate.
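As a rough illustration of the de-identification step described above, the sketch below replaces e-mail addresses and phone-like numbers with placeholders before text is passed on. The function name `deidentify` and both regular expressions are assumptions made for this example; they are deliberately simple and will miss many identifiers (names, addresses, indirect identifiers), so this is de-identification at best, not anonymization.

```python
import re

# Simplified, illustrative patterns; real-world PII detection needs far more coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def deidentify(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


if __name__ == "__main__":
    print(deidentify("Contact jane.doe@example.com or +39 06 1234 5678."))
```

In practice, pattern matching of this kind is usually only a first filter in front of more thorough detection and review.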

Beyond Traditional Approaches

However, it's important to note that while de-identification and anonymization are part of the solution, they are unlikely to be a silver bullet. Given the capabilities of generative AI models to produce detailed and unforeseen outputs, companies might also need to consider:

  • Continuous Auditing: Regularly monitoring and auditing AI outputs can help in identifying potential PII leaks or biased behavior.

  • Feedback Mechanisms: Allowing users to flag concerning outputs and incorporating that feedback can help in refining the model over time.

  • Transparency and Education: Educating users about the capabilities and limitations of generative AI can set proper expectations and ensure informed interactions.
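The continuous auditing idea can be sketched as a periodic scan over sampled model outputs that reports which ones look like they contain personal data. Everything here is hypothetical: the function `audit_outputs`, the pattern set, and the choice of e-mail addresses and IBAN-like strings as signals are assumptions for illustration, not a description of any vendor's tooling.

```python
import re
from collections import Counter

# Illustrative detectors only; a real audit would use a much broader set.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def audit_outputs(outputs):
    """Return (indexes of flagged outputs, count of flagged outputs per PII type)."""
    flagged = []
    findings = Counter()
    for index, text in enumerate(outputs):
        hits = [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]
        if hits:
            flagged.append(index)
            findings.update(hits)
    return flagged, dict(findings)


if __name__ == "__main__":
    sample = [
        "The capital of Italy is Rome.",
        "You can reach him at mario.rossi@example.org.",
        "IBAN: DE89370400440532013000",
    ]
    print(audit_outputs(sample))
```

Flagged outputs could then feed the user feedback loop mentioned above, so that automated checks and human reports reinforce each other.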


Sources used in this article:

  • https://techcrunch.com/2023/08/30/chatgpt-maker-openai-accused-of-string-of-data-protection-breaches-in-gdpr-complaint-filed-by-privacy-researcher/

  • https://www.cliffordchance.com/insights/resources/blogs/talking-tech/en/articles/2023/04/the-italian-data-protection-authority-halts-chatgpt-s-data-proce.html

  • https://www.computerweekly.com/news/365535143/Italy-to-lift-ChatGPT-ban-subject-to-new-data-protection-controls

  • https://www.dwt.com/blogs/artificial-intelligence-law-advisor/2023/05/ai-chatgpt-italy-ban-lifted

  • https://www.dw.com/en/ai-italy-lifts-ban-on-chatgpt-after-data-privacy-improvements/a-65469742

  • https://www.enforcementtracker.com/?insights

Introduction: Understanding the GDPR

The General Data Protection Regulation (GDPR) is a groundbreaking European Union (EU) legislation that came into force in 2018. It was designed to harmonize data privacy laws across Europe and provide greater protection and rights to individuals. GDPR affects companies worldwide, especially those operating within the EU or dealing with the personal data of EU citizens.

Among the numerous regulations in the GDPR, some key principles include:

  • The right of access by the data subject;

  • The right to rectification of data;

  • The right to erasure (or "right to be forgotten");

  • The right to restrict processing;

  • The obligation for data protection by design and by default.

The legislation has had far-reaching impacts on global businesses and their operations, including on how they handle, store, and process personal data.

ChatGPT: A brief overview

ChatGPT, developed by OpenAI, is a language model designed to generate human-like text based on the input it receives. Since its development and deployment, it has been met with much praise for its capabilities. However, recent concerns regarding its compliance with GDPR have also emerged a few months after the end of 2022 release.

The pioneering challenge: The case of the Italian Data Protection Authority ban to ChatGPT

In late March 2023, the Italian Data Protection Authority, known as the "Garante della Privacy", delivered an interim emergency decision mandating OpenAI to cease using ChatGPT for processing personal data of individuals in Italy, marking a pause until further examination could be conducted​. This action was spurred by a data breach that occurred on 20 March, although the impact was noted to be "extremely low." The breach, originating from a technical glitch, exposed titles of active user chat histories and some payment-related data to a small fraction of users. The situation spotlighted the larger issue of GDPR non-compliance, a concern vocalized by Mr. Guido Scorza, a notable member of the Garante board​.

The primary reasons for GDPR non-compliance were three-fold:

  • Firstly, OpenAI had amassed personal data from billions of individuals for algorithm training without prior notification or a proper legal foundation;

  • Secondly, users of ChatGPT were not adequately informed about the processing and utilization of their personal data;

  • Lastly, the possibility of processing inaccurate personal data through AI-generated responses posed another challenge​.

Figure 1: The opening of the groundbreaking ban of Italian Authority

The ban was a wake-up call not only for OpenAI but also hinted at the escalating regulatory scrutiny on generative AI technologies. The Garante's decision wasn't a final verdict but an interim measure to mitigate the risk of GDPR violations, highlighting critical GDPR articles that ChatGPT allegedly breached. The Garante provided OpenAI an opportunity to address these concerns, emphasizing the necessity for clarity on data processing procedures, and age verification mechanisms to safeguard underage individuals from potential exposure to inappropriate content​.

By the end of April 2023, a path towards resolution was in sight as the Garante was poised to lift the ban provided OpenAI introduced new data protection controls to address the identified concerns​. OpenAI responded positively by amending its online notices and privacy policy to accentuate transparency, optionality, and security, which led to the lifting of the temporary ban, reinstating ChatGPT's services in Italy​.

Through this ordeal, the crucial lesson underscored is the imperative for AI service providers to ensure stringent data protection measures aligning with regional regulatory frameworks, to foster trust and ensure the sustainable operation of AI technologies. However, if some of the measure were easy to take (e.g., the informed consent to processing and utilization of personal data and the access to people below 18 years old), some other concerns are intrinsically difficult to address provided the available technology, and are still largely open points (among others, the training without prior notification or a proper legal foundation and the processing of inaccurate personal data).

The latest (and broadest) case of Generative AI scrutiny: The Polish Data Protection Authority filing

The concerns about ChatGPT and its potential GDPR violations came into the spotlight once again when a user named Lukasz Olejnik lodged a formal complaint. Olejnik, a privacy and digital rights researcher, expressed concerns about OpenAI's data handling practices concerning the GDPR.

Olejnik's interactions with ChatGPT, coupled with his inability to exercise his GDPR rights effectively, became the catalyst for the subsequent inquiry. The specific timeline after the submission of the complaint is uncertain; however, the Polish data protection authority, UODO, started its investigation based on the filed grievance.

The core of the accusation lies in OpenAI's handling of users' personal data. The GDPR ensures several rights to users, including access to personal data, rectification of inaccuracies, the right to erasure, among others. Olejnik's complaint emphasized the potential inability of ChatGPT users to exercise these rights.

In particular, the concerns highlighted:

  • ChatGPT's alleged inability to provide Olejnik with the data it might have stored about him;

  • Questions about OpenAI's data protection measures and whether ChatGPT's operations align with the GDPR's data protection-by-design and default principle;

  • OpenAI's purported failure to engage in prior consultations with the UODO or any other European DPA before launching ChatGPT, as necessitated by Article 36 of the GDPR;

  • A call for OpenAI to submit a Data Protection Impact Assessment (DPIA) detailing its data processing activities for ChatGPT. A DPIA is a fundamental component of GDPR compliance in Europe, providing a structured assessment of the potential risks and mitigations when handling personal data.

These requirements are standard requirements for any digital product distributed in European Union, and as such were to be expected from ChatGPT as well.

But how does ChatGPT actually score against GDPR requirements?

GDPR Requirements and ChatGPT

#1: Right of Access by the Data Subject

What GDPR says: Individuals have the right to access their personal data and supplementary information. This means that they have the right to understand whether their data is being processed and to access it if so.

ChatGPT status: The recent complaint by Lukasz Olejnik against ChatGPT highlights potential challenges in accessing personal data stored by ChatGPT. The claim suggests that users, including Olejnik, may not be able to access their stored data in the system, potentially infringing upon this GDPR requirement.

#2: Right to Rectification

What GDPR says: Data subjects have the right to have inaccurate personal data rectified or completed if it's incomplete.

ChatGPT status: It remains unclear whether users can modify or rectify data once it has been processed by ChatGPT. If users can't amend potentially inaccurate data entries, ChatGPT might be in violation of this principle.

#3: Right to Erasure (Right to be Forgotten)

What GDPR says: Individuals can request the deletion or removal of personal data where there's no compelling reason for its continued processing.

ChatGPT status: Olejnik's complaint raises concerns about ChatGPT's potential inability to erase user data upon request. If these concerns are substantiated, it would be a significant area of non-compliance.

#4: Right to Restrict Processing

What GDPR says: Individuals have the right to block or suppress the processing of their personal data.

ChatGPT status: As with the aforementioned rights, it's still a topic of discussion whether ChatGPT allows users to restrict how their data is processed. The ongoing investigation will undoubtedly shed more light on this.

#5: Obligation for Data Protection by Design and Default

What GDPR says: Companies should design their systems with data protection in mind. This means that only necessary data should be processed, stored, and accessible.

ChatGPT status: Given the concerns raised, it's essential to evaluate how ChatGPT was designed concerning GDPR principles. If ChatGPT processes more data than necessary or doesn't prioritize user data protection from the outset, it could be at odds with this GDPR principle.

Additional Concerns: Article 36 and Data Protection Impact Assessment (DPIA)

Article 36 of the GDPR requires entities to engage in prior consultation with data protection authorities before launching services that might pose high risks to data subjects' rights and freedoms.

ChatGPT status: The concern raised in Olejnik's complaint suggests that OpenAI did not engage in such prior consultations with the UODO or other European DPAs before launching ChatGPT. This may be a significant oversight in GDPR compliance.

Furthermore, the DPIA, a crucial feature of GDPR compliance in Europe, is an assessment tool used to identify and minimize data protection risks. Olejnik's complaint urges the Polish regulator to require OpenAI to submit a DPIA, which would provide clarity on how ChatGPT processes personal data.

The Price of Privacy: GDPR sanctions and their real-world impact

In the digital realm, GDPR acts as a "watchdog", ensuring that organizations handle personal data with the utmost integrity. However, when entities falter in adhering to this regulation, the consequences can be financially and operationally severe. The GDPR sanctions are tiered based on the gravity of the infringement.

  • Most severe violations can trigger fines soaring up to €20 million or 4% of the preceding fiscal year's global turnover (whichever pinches more);

  • Less severe violations, though not to be taken lightly, can incur fines up to €10 million or 2% of the global turnover from the previous financial year (whichever is greater).

But the GDPR is not all about financial retribution; it also empowers Data Protection Authorities (DPAs) to enforce alternative measures. These could range from issuing stern warnings, laying down temporary or permanent bans on data processing, demanding data erasure, to suspending data transfers to third countries. The imposition of these sanctions isn't arbitrary but is meticulously weighed against several factors including the nature and severity of the violation, the preventive measures adopted, the level of cooperation with the authorities, and prior infringements, to name a few.

The landscape of GDPR sanctions isn't just a theoretical construct but has manifested in real-world scenarios, leaving indelible marks on some corporations' reputations and finances, for a cumulated € 4.5 billion up to October 2023.


Figure 2: Cumulated GDPR fines over time. Source: www.enforcementtracker.com

A notable instance is the hefty fine of €204.6 million meted out to British Airways following a data breach. The GDPR's fangs didn't spare tech giants either; Google found itself on the receiving end of a €50 million fine due to lack of transparency in advertising data processing. Other behemoths like Marriott International, Austrian Post, and Deutsche Wohnen SE also had their coffers significantly lightened due to GDPR infringements.

These instances are not mere scare tactics but a clarion call underscoring the imperativeness of GDPR compliance. The tales of these sanctions serve as a stark reminder and a learning curve for organizations to bolster their data protection frameworks, thereby navigating safely through the tempestuous seas of GDPR mandates.

The Path Forward: Generative AI and Privacy Regulation

Generative artificial intelligence, such as OpenAI’s ChatGPT, represents a massive leap forward in AI technology. These models can produce intricate content, have real-time conversations with users, and engage in tasks that previously seemed out of reach for machines. However, with great power comes great responsibility. As we've seen with the case of OpenAI's potential GDPR concerns, there's a pressing need to ensure these Generative AI marvels also prioritize user privacy.

The Intricacy of Generative AI and Privacy Compliance

Traditional data handling and privacy measures often rely on clear definitions and tangible data points, making it relatively straightforward to address issues like data access, correction, and erasure. Generative AI, given its vast and intricate training data, brings about challenges that aren’t always addressed by traditional data protection regulations. This is mainly because generative models don't just store data; they assimilate, generalize, and then generate outputs based on patterns. The line between stored personal information and the knowledge ingrained within the model is blurry, making compliance with regulations like GDPR particularly challenging.

De-identification and Anonymization: A New Approach

To overcome these challenges and respect the essence of data protection regulations, companies leveraging generative AI need to adapt. One promising approach lies in de-identification or anonymization techniques.

De-identification involves stripping data of personal identifiers, ensuring that the individual source of the data cannot be easily traced. Anonymization, on the other hand, is a more rigorous process wherein data is processed in a way that it can never be linked back to an individual, even when combined with other data sources.

By incorporating these techniques, AI companies can achieve two main objectives:

  1. Minimizing Risks of PII Leakage: By ensuring that the data fed into these models is already de-identified or anonymized, companies can drastically reduce the risk of inadvertently generating outputs that might contain or infer personal identifiable information (PII).

  2. Building Trust with Users: A user’s trust is paramount. By openly implementing and communicating about these techniques, companies can assure users that their interactions with the AI are private and secure, even if the AI model is expansive and intricate.

Beyond Traditional Approaches

However, it's important to note that while de-identification and anonymization are part of the solution, they might not be the silver bullet. Given the capabilities of generative AI models to produce detailed and unforeseen outputs, companies might also need to consider:

  • Continuous Auditing: Regularly monitoring and auditing AI outputs can help in identifying potential PII leaks or biased behavior.

  • Feedback Mechanisms: Allowing users to flag concerning outputs and incorporating that feedback can help in refining the model over time.

  • Transparency and Education: Educating users about the capabilities and limitations of generative AI can set proper expectations and ensure informed interactions.


Sources used in this article:

  • https://techcrunch.com/2023/08/30/chatgpt-maker-openai-accused-of-string-of-data-protection-breaches-in-gdpr-complaint-filed-by-privacy-researcher/

  • https://www.cliffordchance.com/insights/resources/blogs/talking-tech/en/articles/2023/04/the-italian-data-protection-authority-halts-chatgpt-s-data-proce.html

  • https://www.computerweekly.com/news/365535143/Italy-to-lift-ChatGPT-ban-subject-to-new-data-protection-controls#:~:text=Published%3A%2013%20Apr%202023%2013%3A59,the%20service%20implements%20a

  • https://www.dwt.com/blogs/artificial-intelligence-law-advisor/2023/05/ai-chatgpt-italy-ban-lifted

  • https://www.dw.com/en/ai-italy-lifts-ban-on-chatgpt-after-data-privacy-improvements/a-65469742#:~:text=The%20artificial%20intelligence%20,Italy%20blocked

  • https://www.enforcementtracker.com/?insights

