OpenAI “Accidentally” Deletes Evidence in Copyright Infringement Cases


In a turn of events that has raised questions about legal processes and trust in technology companies, OpenAI, the maker of ChatGPT, is at the center of a major controversy: the company “accidentally” deleted data used to train its AI models — data that could have served as crucial evidence in copyright infringement lawsuits.

Background of the Scandal
Earlier this fall, OpenAI agreed to provide two virtual machines so that lawyers for The Times and the Daily News could search its AI training datasets for their copyrighted content. In a letter to the court, attorneys for the two publications said they had spent more than 150 hours examining OpenAI’s training data.

The Controversial Incident
On November 14, the company’s engineers wiped the search data from one of those virtual machines, according to a letter filed with the U.S. District Court for the Southern District of New York. Although OpenAI was able to recover most of the data, it came back in a format that rendered it practically unusable in litigation.
The Times and Daily News attorneys must now start gathering evidence anew, with no guarantee of obtaining sufficient data on a second attempt. The incident raises serious questions about the transparency and accountability of tech companies in managing training data.

Lessons Learned
The NY Times’ experience shows that AI companies remain in full control of their training data, and that any preliminary investigation can be subject to “unfortunate accidents.” The public image of OpenAI and its ilk may suffer, but at the end of the day, it is all a cost-benefit calculation.

Conclusion
It remains to be seen whether authorities will actually act on grounds of copyright and the integrity of due legal process. Meanwhile, creators and journalists must stay vigilant and pursue every available avenue to legally defend their works.
