News

Why has the Italian Privacy Authority issued a second infringement notice to OpenAI's ChatGPT?

What's new

On January 29th, the Italian DPA issued a second, unexpected press release on possible GDPR infringements by OpenAI's Large Language Models. This follows a thorough investigation: while the specifics of the authority's draft findings have not been revealed, OpenAI has been formally notified of the allegations.

OpenAI is now required to respond to these allegations within 30 days, presenting a defense against the claims made by the Italian data protection authority, known as the "Garante della Privacy".

The press release highlights the significant consequences that could arise if OpenAI is found to be in breach of EU privacy regulations, including potential fines of up to €20 million or 4% of OpenAI's global annual turnover, whichever is higher. Although this well-known ceiling is rarely reached by actual GDPR sanctions, it still represents a significant financial risk for the company, all the more so considering that this is not the first warning issued to OpenAI.
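
As an aside, the Article 83(5) cap described above is simply the larger of the two amounts. A minimal sketch of the arithmetic (the turnover figure below is invented for illustration):

```python
def gdpr_fine_cap(global_annual_turnover_eur: float) -> float:
    """Upper bound of a GDPR Art. 83(5) fine: EUR 20 million or 4% of
    worldwide annual turnover, whichever is higher."""
    return max(20_000_000.0, 0.04 * global_annual_turnover_eur)

# With a hypothetical EUR 2 billion turnover, 4% (EUR 80M) exceeds EUR 20M:
print(gdpr_fine_cap(2_000_000_000))  # 80000000.0
```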

More critically, the release underscores the operational impacts that OpenAI might face. Data Protection Authorities (DPAs) across the EU have the power to issue orders mandating changes in how data is processed. This means that if OpenAI is found to have violated privacy regulations, it might be required to alter its data processing methods. Such changes could fundamentally affect how OpenAI operates its services, including ChatGPT.

Furthermore, if OpenAI disagrees with the changes demanded by the privacy authorities of the EU Member States, it might face a difficult decision. The company could either comply with these changes or choose to withdraw its services from those specific EU Member States. This would be a significant move, potentially limiting the availability of its services in those regions.

The press release concludes by noting that OpenAI was contacted for a statement regarding the notification of violation by the Garante. There is an indication that the report will be updated if OpenAI responds to these allegations. This suggests that further information may be forthcoming, which could provide more insight into both the nature of the allegations and OpenAI's stance on the matter.

What are the reasons behind the Italian Data Protection Authority's release?

As in the case of the scrutiny by the Polish authority, several areas arguably remain open concerns regarding GDPR compliance. So, while the official reasons are not yet available, we can identify some "suspected" ones.

Here are our top three suspects.

Suspect #1: Legal basis underpinning the massive collection and processing of personal data in order to ‘train’ the algorithms on which the platform relies

ChatGPT, like other well-known large language models, was developed by scraping extensive data from the internet without the consent of the individuals involved. While this practice is broadly permissible in the United States, it is far more problematic in Europe, which led to the intervention of the Italian Data Protection Authority (Garante).

Back in March 2023, the Garante ordered OpenAI to eliminate any reference to contract execution as the legal basis for its data processing and, based on the principle of accountability, to indicate consent or legitimate interest instead. According to its newly published privacy policy, OpenAI has chosen legitimate interest as its legal basis, pending the Garante's verification and assessment of this approach. The alternative, seeking user consent for the use of personal data in algorithm training, was deemed impractical.

Recently, German privacy authorities have also taken action, asking OpenAI for information about its compliance with privacy regulations, and a European task force of privacy regulators has been formed. The situation unfolding in Italy has set a precedent in Europe, and possibly beyond, and remains a slippery slope for generative AI models.

Suspect #2: Right to object to the processing of one's own personal data

The "Right to Object" under the General Data Protection Regulation (GDPR) is a critical component of data privacy rights in the European Union. This right allows individuals to object to the processing of their personal data in certain circumstances.

Individuals can object to data processing based on their particular situation, especially if the processing is based on legitimate interest or public interest. Upon receiving an objection, the data controller must cease processing unless they can demonstrate compelling legitimate grounds for the processing which override the interests, rights, and freedoms of the individual, or for the establishment, exercise, or defense of legal claims.

Given the vast amount of data used in training, it might be challenging for OpenAI to identify and isolate the specific personal data to which an individual objects. Enabling a mechanism that lets individuals effectively object to the use of their data in an AI model as complex as ChatGPT is technically hard: it would require not just identifying and removing the data, but potentially retraining the model without it.
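
To make the difficulty concrete, here is a minimal, purely illustrative sketch of the "easy" half of honoring an objection: filtering an objector's data out of a training corpus before retraining (all names and documents below are invented). Even this naive string matching misses indirect or paraphrased references, and the already-trained model would still need to be retrained, or "unlearned", afterwards, which is where the real cost lies.

```python
import re
from typing import Iterable, Iterator

def drop_objected_records(corpus: Iterable[str],
                          objector_identifiers: list[str]) -> Iterator[str]:
    """Yield only the training documents that mention none of the
    identifiers (name, email, ...) from an objection request."""
    patterns = [re.compile(re.escape(ident), re.IGNORECASE)
                for ident in objector_identifiers]
    for doc in corpus:
        if not any(p.search(doc) for p in patterns):
            yield doc

# Hypothetical usage: drop every document mentioning the objector,
# then retrain the model on what remains.
corpus = ["Mario Rossi lives in Rome.", "The weather in Rome is mild."]
print(list(drop_objected_records(corpus, ["Mario Rossi"])))
# ['The weather in Rome is mild.']
```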

To date, there is no easy way for OpenAI to enable the exercise of this right; indeed, from the user's perspective, such a request is not even easy to submit.

Suspect #3: Right to rectification of incorrect or incomplete data

LLMs are trained on vast datasets that often include personal data scraped from the internet. If this data is incorrect or incomplete, the right to rectification implies that individuals could request corrections. However, the challenge lies in the identification of specific personal data within these large datasets and the feasibility of rectifying data within a trained model. Unlike conventional databases, it's not straightforward to locate and alter individual data points in a trained neural network.

As with the right to object, implementing the right to rectification in the context of LLMs is technically complex. It is not just a matter of updating a row in a database; it involves understanding how a piece of data has influenced what the model learned, potentially requiring significant alterations to the training process or to the model itself.
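
A toy contrast illustrates the asymmetry. In a conventional database, rectification is a single targeted update (the record below is invented); a trained model has no equivalent operation:

```python
import sqlite3

# In a relational database, rectifying a record is one targeted statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, birth_year INTEGER)")
conn.execute("INSERT INTO people VALUES ('Mario Rossi', 1975)")  # wrong year
conn.execute("UPDATE people SET birth_year = 1979 WHERE name = 'Mario Rossi'")
print(conn.execute("SELECT * FROM people").fetchall())
# [('Mario Rossi', 1979)]

# A trained LLM has no such addressable cell: the wrong fact is diffused
# across millions of weights, so "updating" it means curating the corpus
# and retraining, or applying still-experimental unlearning techniques.
```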

While OpenAI released a web form for requesting the rectification of data shortly after the April 2023 ban, that form no longer seems to be readily available, reopening the issue of granting this right to users.

Conclusions

The recent actions by the Italian Data Protection Authority against OpenAI's ChatGPT have once again brought to light potential GDPR infringements in the realm of Large Language Models (LLMs). While the specific reasons for this regulatory scrutiny have not yet been disclosed, it highlights challenges in GDPR compliance that are intrinsic to the technology underlying LLMs.

Key areas of concern include the legal basis for using data, the right to object, and the right to rectification. LLMs like ChatGPT are trained on extensive datasets collected from the internet, often without explicit consent from individuals, raising questions about the legal basis of data processing. Furthermore, the right to object and the right to rectification are difficult to implement in LLMs due to the complexity in identifying and modifying personal data within massive datasets.

In this complex context, anonymization emerges as a significant ally in increasing GDPR compliance from a company perspective. By transforming personal data in such a way that individuals cannot be identified, companies can mitigate risks associated with personal data processing. However, effective anonymization must balance technical feasibility with the maintenance of data utility for the LLMs.
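
As one hedged illustration of what this can look like in practice, the sketch below pseudonymizes direct identifiers (emails, phone numbers) with stable hashed tokens, preserving the text's structure for training. Note that under the GDPR, pseudonymized data is still personal data: a real anonymization pipeline would also need NER for names and addresses, quasi-identifier analysis, and a re-identification risk assessment.

```python
import hashlib
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s./-]{7,}\d")

def pseudonymize(text: str) -> str:
    """Replace direct identifiers with stable, opaque tokens so the text
    keeps its utility for training while no longer exposing the values."""
    def token(match: re.Match) -> str:
        digest = hashlib.sha256(match.group().encode()).hexdigest()[:8]
        return f"<PII_{digest}>"
    return PHONE.sub(token, EMAIL.sub(token, text))

# Invented example: both identifiers become <PII_xxxxxxxx> tokens.
print(pseudonymize("Reach Mario at mario.rossi@example.com or +39 06 1234 5678."))
```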

Although anonymization presents its own set of challenges, it offers a pathway for LLM providers to align more closely with GDPR requirements, emphasizing the need for ongoing innovation and adaptation in data privacy practices in the AI field.

