If one sets aside the hype around Generative AI systems, we can recognize that software tools are not sentient, nor can they (yet) solve the problem of devising creative solutions to novel problems. They are limited in what they can do by the training data they are supplied. They do hold the prospect of making us more efficient and productive, particularly for rote tasks. But given enough training data, it is worth considering how much further this could be taken. In preparation for that future, when it comes to digital twins, the landscape of ownership of the intellectual property (IP) behind them is already taking shape.

[Image: A cartoon of a man in a 1950s suit and tie standing in front of a large console of knobs and dials, with a metallic robot behind him. The headline reads "The Future of Efficiency - Smarter machines for a smarter tomorrow."]

Several chatbots have been set up to replicate long-dead historical figures so that you can engage with them in their “voice.” Hellohistory is an AI-driven chatbot that offers people the opportunity to “have in-depth conversations with history’s greatest.” A different tool, Historical Figures Chat, was widely panned shortly after its release in 2023, especially by historians who strongly objected to it. There are several variations on this theme, of varying quality. Of course, as with all things GenAI, they will improve over time, and many of the obvious and problematic issues will be resolved either by this generation of companies or the next. Whether there is real value and insight to be gained, apart from the novelty, from engaging with “real historical figures” is the multi-billion-dollar question. Much like the World Wide Web in the 1990s, there very likely is value, but it will be years before we can clearly discern what that value is and how to capitalize on it. In anticipation of that day, many organizations are positioning themselves to capture that value.

While many universities have taken a very liberal view of the ownership of their students' and faculty's intellectual property (far more liberal than many corporations might), others are considerably more restrictive. Since the 1980s, there has been an increasing trend among institutions to assert ever greater claims to scholarly outputs. A 2013 report by the American Association of University Professors highlighted the challenges for intellectual property rights created by the 2011 Supreme Court Stanford v. Roche decision. In the ensuing years, universities have increasingly moved to assert that intellectual property created by faculty, staff, and students is “work for hire” and therefore owned by the institution. In the early 2010s, these concerns bubbled up again as more and more professors sought to share their course materials through MOOCs and other online course platforms.

As value is increasingly perceived in extracting and profiting from the assertion of intellectual property rights over content used to train AI tools, attention has again focused on the question of who owns intellectual outputs in a variety of domains. Just last month, the University of Wisconsin (UW) system proposed a change in its copyright policy, asserting ownership of “scholarly works.” This ownership would extend to all lecture notes, course materials, recordings, journal articles, and syllabi, and would originate with the University of Wisconsin system, “but is then transferred to the author.” Furthermore, the UW general counsel told faculty that “the UWs reserve a non-exclusive license to use syllabi in furtherance of its business needs and mission.” It’s worth noting that the leadership of the UW system has been considering these issues quite closely, as likely have many other institutions. The UW has a very robust set of policies regarding AI and its use on UW campuses, and the change in copyright policy likely has ties to those considerations. This has led faculty to raise concerns that they might be replaced by AIs in the future. I don’t see the wholesale replacement of people in their positions anytime soon, but there are certainly other concerns about the impacts of these moves.

When I discussed this concept during the Frankfurt Book Fair Scholarly Kitchen panel in October, fellow Chef Lisa Hinchliffe responded succinctly that, for her, this is an already settled question. She explained that, according to the General Rules of the University of Illinois (Article 3: Intellectual Property), with the exception of copyrights in traditional academic works, such as course notes and scholarly publications, “the University owns all intellectual property developed (1) by any University employee in the performance of their work; or (2) by anyone, including students and visiting scholars, using any University resources.” Many other institutions, particularly corporations, have similarly broad claims on the IP of their employees. Fundamentally, individuals face a significant power imbalance in every one of these situations. Would one decline a job or student position at the University of Illinois because of its IP ownership policy? Likely not, even if one were to read through the fine print.

While some fear that GenAI tools will replace people so that no one will have a job any longer, I have a slightly more subtle concern. What makes someone valuable to a team is the set of skills they bring to the table. If that knowledge can be extracted, stored, and regenerated in novel ways in new situations, does one ever “leave” at the end of one’s employment? What rights does your former employer have to continue to “use” your agent after you quit or retire? The value you bring to your role as an editor, for example, might be a popular writing style, a distinctive editorial approach, or a nuanced sensibility for curating content. It is possible that, with a significant enough pool of your work, that style could be replicated.

Throughout human history, when a new employee was brought into a position, there was a period of training that they went through. This process has swung between more and less formal and rigorous over time, but the system of apprenticeship has existed for millennia. Over the past several decades, the costs of training for a particular career have shifted significantly onto the worker. While there was no discernible trend in corporate training expenditures from 1970 to 2000, higher education costs spiked considerably (by more than 800%) over the same period, a cost borne by the employee. Corporate investments in training have long been a target of cost cutting, and one Harvard Business School paper went so far as to describe corporate training as “The Great Training Robbery.” And yet, as the cost of investment in human capital has shifted to the individual, there is an increasing movement by employers to capture and retain that investment.

Things get more complex if one considers changing roles. If, over the past decade, all my work had been done using an AI agent that came to work symbiotically with me and made me ever more efficient, would I then be 20, 30, or 40% less efficient, and therefore less valuable on the market, without my AI support tool if I were to leave? Could I extract myself from my employer’s AI, and would they even let me? If I am allowed to extract the data in some meaningful form, will I even be able to import it into the AI at my next employer? Data portability between models is certainly not ensured, and there are strong incentives against allowing it. We have repeatedly seen technology companies attempt (successfully in only a few instances, and only for a time) to create walled gardens that prevent competition or product switching. Perhaps I will only be able to “fit” into companies that run on the same base technology. One could imagine an HR conversation going something like: “Our company only runs on the Llama AI companion. I’m sorry your OpenAI model isn’t the right fit for our firm.” Of course, one could start from scratch, but that could diminish one’s value in the job market.

Alternatively, one could simply use one’s own tools to create and manage one’s agent. Every person could build their own, using their preferred system, and control the data and the surrounding technology stack. I’ll take my own situation as an example. At NISO, I write a monthly newsletter article, which amounts to more than 225 articles. Over the years, I’ve contributed about 100 Scholarly Kitchen posts. I’ve authored or co-authored dozens more journal articles. My sent mail currently stands at 67,847 messages. In total, I’ve easily written more than 1,000,000 words on a wide range of topics. There’s certainly enough content to get a good capture of my style and perspectives, and the corpus is probably sufficient to create a good predictive model of how I might respond to a prompt. When I get the time over the holiday break, I plan to train an LLM on this corpus to see how well a GenAI “Todd” can replicate and generate content. Daniel Hook discussed a similar experiment of his own at the Charleston Conference last month. Whether this is something most non-technology-oriented people want to learn how to do, and then have the resources to execute, is a very different question. As with most technology, few people want to get their hands dirty mucking about with the actual code, gears, or chips.
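For anyone curious what such an experiment actually involves, here is a minimal sketch of the approach, assuming a Hugging Face training stack and a folder of plain-text exports of one’s own writing. The model name, file paths, and hyperparameters are illustrative assumptions, not a description of what I will actually run over the break:

```python
# A minimal sketch of fine-tuning a small causal language model on a personal
# writing corpus. The model name, file paths, and hyperparameters are
# illustrative assumptions rather than a recommended configuration.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "gpt2"                 # any small open-weight causal LM would do
CORPUS_FILES = "my_writing/*.txt"   # newsletters, blog posts, articles, email exports

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # GPT-style models have no pad token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Load the raw text files, drop blank lines, and tokenize into fixed-length chunks.
dataset = load_dataset("text", data_files={"train": CORPUS_FILES})["train"]
dataset = dataset.filter(lambda ex: ex["text"].strip() != "")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="digital-twin",
        num_train_epochs=3,
        per_device_train_batch_size=2,
        learning_rate=2e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("digital-twin")  # the checkpoint that, in theory, writes like its author
```

In practice, one would likely reach for parameter-efficient fine-tuning (such as LoRA) and more careful data cleaning to keep the compute and the output quality manageable, but the shape of the pipeline is the same: gather the corpus, tokenize it, and tune a base model until it starts to echo the original voice.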

Even if the tools I were to use were my own, such as my own computer and a personally licensed AI-tool subscription, NISO could claim copyright over most of the content used to train the model. Furthermore, it could assert that I couldn’t use that content to train an AI agent without its consent. Who would then own the agent I’ve created? NISO’s HR policies are rather clear about the use of NISO technology and about its assertion of copyright ownership over works for hire. In essence, were this agent to be built from my work, NISO could claim ownership of the model, and it could claim the right to continue to use it once I’ve retired or moved on.

Each of us trades a lot when we take on a job or a career. We give a lot of ourselves over the time we contribute. Many of us hope to make an impact on the organization itself, on its customers, or even on the broader community we participate in. When we think of how we will linger, or what kind of legacy we might leave, we don’t necessarily expect that impact to continue to flourish after we leave, let alone long after. Interestingly, for the first time in human history, companies are well positioned to continue to profit from an employee’s experience and intellectual contributions after that employee is gone. In many ways, with the training of AI agents, the economic model in which the profits of intellectual labor are time-bound, or even constrained at all, is rapidly being transformed.

It has been a running joke with colleagues that someone needed to clone me so that I could attend more meetings. Perhaps over this break, I might get the chance to do just that. Since I am the ‘original’ version of me, hopefully I would write the next Scholarly Kitchen piece better than an agent might.

In the new year, perhaps we will see how well the agent does…

Todd A Carpenter

Todd Carpenter is Executive Director of the National Information Standards Organization (NISO). He additionally serves in a number of leadership roles at a variety of organizations, including as Chair of the ISO Technical Subcommittee on Identification & Description (ISO TC46/SC9), founding partner of the Coalition for Seamless Access, Past President of FORCE11, Treasurer of the Book Industry Study Group (BISG), and a Director of the Foundation of the Baltimore County Public Library. He also previously served as Treasurer of SSP.
