
Microsoft Explores Crediting AI Data Contributors

Release date: April 10, 2025
Author: NicholasLewis


Microsoft is launching a research project to study how specific training examples influence the text, images, and other media that generative AI models produce. The effort came to light through a job listing from December, recently recirculated on LinkedIn, that seeks a research intern to join it.

The project's goal is to develop a way of training models so that the impact of particular data, such as photos and books, on their outputs can be "efficiently and usefully estimated." The listing notes that current neural network architectures are opaque about the sources of their outputs, and it says there are good reasons to change that. One reason it gives is the possibility of offering incentives, recognition, and even compensation to people who contribute valuable data to future AI models.

The research comes amid ongoing legal battles over intellectual property involving AI companies, Microsoft among them. AI models are often trained on vast datasets scraped from public websites, which can include copyrighted material. AI companies often claim protection under the fair use doctrine, but creators across many fields, including artists, programmers, and authors, dispute that position. Microsoft faces a lawsuit from The New York Times, which alleges that Microsoft and OpenAI infringed its copyright by using its articles to train their models. Separately, several software developers have sued Microsoft over its GitHub Copilot AI coding assistant, claiming it was trained on their copyrighted code.

The project, referred to as "training-time provenance," involves Jaron Lanier, a notable technologist at Microsoft Research. Lanier has previously written about "data dignity," advocating a system that connects digital content with its creators and potentially compensates them for their contributions to AI outputs.

Microsoft's project is still in its early stages, while other companies, such as Bria, Adobe, and Shutterstock, are already experimenting with compensating data owners based on their contributions to AI models. Large AI labs, however, have generally not set up payout programs for individual contributors. They have opted instead for licensing agreements or opt-out mechanisms for copyright holders, which can be cumbersome and limited in scope.

Microsoft's initiative might remain a proof of concept, much like OpenAI's yet-to-be-released tool for letting creators control how their works are used in training data. There is also speculation that Microsoft may be trying to "ethics wash" its AI practices or to preempt regulatory and legal challenges. The move is particularly noteworthy given recent calls from other AI labs, such as Google and OpenAI, for the U.S. government to relax copyright protections for AI development. Microsoft has not yet responded to requests for comment on the project.

Comments (30)
KeithSmith April 10, 2025 at 2:45:43 PM GMT

Microsoft's project on crediting AI data contributors is super interesting! It's about time someone looked into how our data shapes AI outputs. I'm curious to see how they'll implement this, but it's a bit of a mystery right now. Can't wait for more updates!

DanielLewis April 10, 2025 at 2:45:43 PM GMT

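Microsoft's project to credit AI data contributors is very interesting! It's time someone investigated how our data affects AI outputs. I'm curious how they will implement it, but for now it's still a mystery. I can't wait for more updates!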

JackPerez April 10, 2025 at 2:45:43 PM GMT

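Microsoft's project to give credit to AI data contributors is really interesting! The time has come for someone to look into how our data affects AI outputs. I wonder how they will implement it, but right now it's still a mystery. I can't wait for more updates!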

AlbertAllen April 10, 2025 at 2:45:43 PM GMT

Microsoft's project on giving credit to AI data contributors is super interesting! It was about time someone investigated how our data shapes AI outputs. I'm curious to see how they will implement this, but it's a bit mysterious right now. I can hardly wait for more updates!

HaroldMiller April 10, 2025 at 2:45:43 PM GMT

Microsoft's project on giving credit to AI data contributors is super interesting. It was about time someone investigated how our data shapes AI outputs! I'm curious to see how they will implement it, but it's a bit mysterious for now. I can't wait for more updates!

BruceHernández April 11, 2025 at 5:11:25 AM GMT

Microsoft's new project on understanding AI training data is super interesting! It's cool to see how they're trying to figure out the nitty-gritty of AI outputs. The only downside is it feels a bit too academic for my taste, but I'm excited to see where it goes. Keep up the good work, Microsoft!
