Microsoft Explores Crediting AI Data Contributors

April 10, 2025

Microsoft is launching a research project to understand how specific training examples influence the text, images, and other media that generative AI models produce. The initiative came to light through a job listing from December that recently resurfaced on LinkedIn, seeking a research intern to join the effort.

The project's goal is to develop a way to train models so that the impact of particular data, such as photos and books, on their outputs can be "efficiently and usefully estimated." The job listing notes that current neural network architectures are opaque about the sources of their outputs and argues that there are good reasons to change this. One reason it gives is the possibility of offering incentives, recognition, and even compensation to people who contribute valuable data to future AI models.

The research comes against a backdrop of ongoing legal battles over intellectual property involving AI companies, Microsoft among them. AI models are often trained on vast datasets scraped from public websites, which can include copyrighted material. AI companies frequently claim protection under the fair use doctrine, but creators across various fields, including artists, programmers, and authors, dispute that position. Microsoft currently faces a lawsuit from The New York Times, which alleges that Microsoft and OpenAI infringed its copyright by using its articles to train their models. Separately, several software developers have sued Microsoft over its GitHub Copilot AI coding assistant, claiming it was trained on their copyrighted code.

The research project, referred to as "training-time provenance," involves Jaron Lanier, a notable technologist at Microsoft Research. Lanier has previously written about "data dignity," advocating for a system that connects digital content with its creators and potentially compensates them for their contributions to AI outputs.

Microsoft's project is still in its early stages, but other companies, including Bria, Adobe, and Shutterstock, are already experimenting with compensating data owners based on their contributions to AI models. Large AI labs, by contrast, have generally not established payout programs for individual contributors. They have opted instead for licensing agreements or opt-out mechanisms for copyright holders, which can be cumbersome and limited in scope.

Microsoft's initiative might remain a proof of concept, much like OpenAI's yet-to-be-released tool for letting creators control how their works are used in training data. There is also speculation that Microsoft may be attempting to "ethics wash" its AI practices or to preempt regulatory and legal challenges. The move is particularly notable given recent calls from other AI labs, such as Google and OpenAI, for the U.S. government to relax copyright protections for AI development.

Microsoft has not yet responded to requests for comment on this project.
Comments (37)
ChristopherTaylor March 10, 2026 at 12:01:55 PM EDT

Interesting project, but what I really need is for someone in AI to explain why my virtual assistant still can't organize my desktop 😅. Could this attribution work change how companies share data? I'm a bit worried about the transparency of the whole process. Hopefully it's not just a PR gesture.

AvaHill December 16, 2025 at 7:30:41 PM EST

Microsoft's project seems interesting, but what about data privacy? 🤔 Sometimes I feel that with all this AI exploration, we're losing control over what gets used to train the models. Will there be fair compensation for those who contributed? I don't want this to turn into another case of 'big tech' taking advantage of 'free data'...

RogerSanchez November 1, 2025 at 10:30:40 AM EDT

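Research like this could be an important step toward providing fair compensation to AI data contributors. 😊 But I have doubts about whether MS can really solve the copyright problem. It seems like it's time for data sourcing practices to become more transparent!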

JuanWhite August 15, 2025 at 3:01:00 PM EDT

This is super intriguing! Microsoft's diving into how AI training data shapes outputs—mind-blowing stuff. Wonder how they'll credit contributors fairly? 🤔

BrianWilliams August 11, 2025 at 1:00:59 AM EDT

This Microsoft AI project sounds intriguing! Crediting data contributors could reshape how we value creative input in AI. Curious to see if it'll spark ethical debates or just be a tech flex. 🤔

ChristopherThomas August 6, 2025 at 5:00:59 PM EDT

This is wild! Microsoft’s diving into how specific data shapes AI outputs. Makes me wonder if they’ll start paying people for their data contributions 🤔. Could be a game-changer for fairness in AI!
