Earlier this month I attended the Professional and Scholarly Publishing (PSP) conference, produced by the Association of American Publishers (AAP). One session that piqued my interest was “Technology, Ideas, and Expression: How Machines Learn and What It Means for Legal and Business Paradigms.” In particular, the presentation by Jule Sigall, an attorney at Microsoft, gave me food for thought.
Sigall spoke about Seeing AI, a free Microsoft app for people with visual impairments. The technology provides users with audio descriptions of their surroundings, including documents, products, and, most interestingly, organic environments and people they may know. In an intriguing promotional video, the app tells its user that the person facing him at a party is a “28-year-old female, wearing glasses, looking happy.”
Apart from wondering whether the app could detect sarcasm (is the 28-year-old female really happy, or grinning to mask her dismay?), I, like others in the audience, was prompted to consider the depth of machine learning needed to produce this observation. Typically, machines extract baseline data from photos or similar artifacts. Sigall’s presentation hinted at the volume of data the program would need to interpret ad hoc scenes: landscapes, social gatherings, bus stops, and so on.
- Only after ingesting data from many thousands of photos, or other data representations, can a machine develop pattern-matching capabilities to provide this level of support (a minimal sketch of what that ingestion looks like follows this list).
- How many photos of people did the program need to process to assess this subject’s age?
- Ultimately, the app aims to alert users to the presence of their friends: how many user-supplied images will the program need to draw such conclusions on the fly, and will it recognize, for example, a man who has shaved his beard since the last photo input?
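To make the ingestion idea concrete, here is a minimal sketch in Python using PyTorch and torchvision. It is illustrative only, not Seeing AI’s actual pipeline (Microsoft has not published one here), and the `photos` folder, its labels, and every parameter are hypothetical; it simply shows the generic transfer-learning pattern by which a model is fed batches of labeled photos until it can match new inputs against learned patterns.

```python
# Hypothetical sketch of "ingesting thousands of photos" to build
# pattern-matching capability; not Seeing AI's actual pipeline.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Standard preprocessing: every photo is resized and normalized so the
# network sees a consistent input format.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder of photos organized by label, one subfolder per
# class, e.g. photos/happy/ and photos/neutral/.
dataset = datasets.ImageFolder("photos", transform=preprocess)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Start from a network pretrained on over a million generic images,
# freeze it, and retrain only its final layer for our labels. This
# needs far less data than training from scratch, but still thousands
# of labeled examples.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(dataset.classes))

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):                 # a few passes over the data
    for images, labels in loader:      # each batch: 32 labeled photos
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()                # nudge the final layer toward
        optimizer.step()               # better pattern matching
```

The sketch also hints at why the answer to the questions above is “thousands, not millions”: because the pretrained backbone already encodes patterns from a huge general-purpose photo corpus, only the final layer must learn the new labels. Recognizing a specific friend on the fly, beard or no beard, pushes the data demands higher still.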
It struck me that developers will need vast quantities of photo, video, or other consumable data to train artificial intelligence (AI) products, and they will need that data on demand. If a toy company wants to debut an awe-inspiring (or awww-inspiring) AI puppy for the 2021 holiday season, surely the program behind it needs to process every available dog video, photo, and the like from now through next spring. For ‘Doggo Roboto’ to be truly impressive, it won’t be enough to monitor one dog in a lab, interacting with two or three subjects.
That was my light-bulb moment regarding the impact on the publishing industry.
As Sigall indicated, publishers have the potential to supply creators of AI tools with some of the data that machines need to learn, whether that learning serves practical, playful, or artistic purposes. Publishers have supplied humans with intellectual and creative inspiration for hundreds of years, and now an entirely new audience has appeared in the marketplace. Better still: machines consume content far faster and more voraciously than humans do. The sky’s the limit!
It may not be traditional publishers who fill the void. Crowd-sourcing will surely be needed to supply the required volume, and some, though not all, types of curation will become unnecessary. But how interesting to think of machine learning creating a market, along with associated jobs and commercial business models, rather than damaging one. The on-demand nature of the need, combined with “more is better” outcomes, makes this a model for responsiveness and ingenuity. Who’s ready?