Siri 2.0 - Apple and generative AI

Good question. iCalendar has been around for more than two decades, so what gives?
The ML tools are less than a decade old at this point. The iPhone only got an NPU in 2017.
But it won't, as neural networks inherently have an error rate.
It doesn't have to be flawless; today you can click on an email to create a calendar entry, and sometimes you have to edit the final output anyway.
While it's bad to have the wrong thing happen consistently, arguably, it's even worse to have the right thing happen inconsistently.
And the magic of ML is that the act of correcting the wrong thing is a way to train the model to do the right thing eventually.

That's how autocorrect becomes more accurate.
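Put differently: every suggestion you override is a labeled example the system can learn from later. A toy sketch of that feedback loop, in made-up Swift types (an illustration of the idea only, not Apple's actual autocorrect pipeline):

```swift
// Toy illustration of correction-as-training-signal.
// All of these types are hypothetical; nothing here is Apple's autocorrect internals.
struct CorrectionEvent {
    let context: String    // what the user had typed so far
    let suggested: String  // what the model proposed
    let accepted: String   // what the user actually kept
}

final class FeedbackLog {
    private(set) var events: [CorrectionEvent] = []

    func record(context: String, suggested: String, accepted: String) {
        events.append(CorrectionEvent(context: context, suggested: suggested, accepted: accepted))
    }

    /// Every overridden suggestion becomes a labeled example for a later fine-tuning pass.
    func trainingPairs() -> [(input: String, label: String)] {
        events
            .filter { $0.suggested != $0.accepted }
            .map { (input: $0.context, label: $0.accepted) }
    }
}
```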
 
  • Like
Reactions: Tagbert

Bonusround

Ars Scholae Palatinae
1,060
Subscriptor
You feel that podcasts are free of intellectual property laws as long as they are free? OpenAI needs to get you on their PR team ;)
Heh, sure. What I’m suggesting is that some tasks, like transcribing podcasts, fall under fair use. If you train a model to become better at performing something that’s fair use, is that… fair? We know that courts are beginning to tackle this very question.

I think the nature of the task matters, but it’s also a fine and messy line. To me, the podcast transcription case is very distinct from GPT regurgitating a NYTimes article as part of its paid service, or DALL-E copping a painter’s most popular portrait subjects. $.02
 
  • Like
Reactions: Tagbert

Scandinavian Film

Ars Scholae Palatinae
1,285
Subscriptor++
A personal wish for Safari: AI-powered dark mode.

Train it on millions of pages. Plug it into the display brightness and ambient light sensors. Learn my personal preferences. End this plague of scalded retinas.
Why limit it to Safari? They could add an OS-wide feature under the accessibility settings called “minimize sudden increases in brightness” to cap the average brightness level across the screen and how quickly it’s allowed to increase between frames. It wouldn’t be AI, but it might save your eyeballs.
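To make the idea concrete, the heart of it would just be a per-frame rate limiter. A toy sketch with entirely made-up names and numbers (nothing like this actually ships in the OS):

```swift
// Toy sketch of a "minimize sudden increases in brightness" governor.
// Hypothetical names and thresholds; not an actual Apple API.
struct BrightnessGovernor {
    var maxRisePerFrame: Double = 0.02  // at 60 fps, black-to-white takes roughly 0.8 s
    var ceiling: Double = 0.85          // cap on average screen luminance (0...1)
    private var current: Double = 0.0

    /// Dimming applies immediately; brightening is rate-limited.
    mutating func nextFrame(target: Double) -> Double {
        let clamped = min(target, ceiling)
        current = clamped < current ? clamped : min(clamped, current + maxRisePerFrame)
        return current
    }
}

// A dark page suddenly loads a pure-white pop-up:
var governor = BrightnessGovernor()
var ramp: [Double] = []
for _ in 0..<5 { ramp.append(governor.nextFrame(target: 1.0)) }
// ramp ≈ [0.02, 0.04, 0.06, 0.08, 0.10]; the white flash is spread over many frames instead of one
```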
 
  • Love
Reactions: Bonusround

wrylachlan

Ars Legatus Legionis
12,769
Subscriptor
Heh, sure. What I’m suggesting is that some tasks, like transcribing podcasts, fall under fair use. If you train a model to become better at performing something that’s fair use, is that… fair? We know that courts are beginning to tackle this very question.

I think the nature of the task matters, but it’s also a fine and messy line. To me, the podcast transcription case is very distinct from GPT regurgitating a NYTimes article as part of its paid service, or DALL-E copping a painter’s most popular portrait subjects. $.02
Absolutely not. The issue is whether you stole copyrighted work to create your product, not what your product does. A transcription engine that was trained on stolen data is still theft even if what it’s doing is fair use.
 

Bonusround

Ars Scholae Palatinae
1,060
Subscriptor
Absolutely not. The issue is whether you stole copyrighted work to create your product, not what your product does. A transcription engine that was trained on stolen data is still theft even if what it’s doing is fair use.
Good job taking a nuanced reflection on a fresh, challenging issue and collapsing it into a binary. “Absolutely,” eh? So very certain. But you’ve come right up to the nub of why this issue will befuddle courts for years to come.

These machines will adapt and learn when they do work – whether we like it or not. What happens to the product (result) of that learning? That’s something we are going to wrestle with.
 
Last edited:

Bonusround

Ars Scholae Palatinae
1,060
Subscriptor
Why limit it to Safari? They could add an OS-wide feature under the accessibility settings called “minimize sudden increases in brightness” to cap the average brightness level across the screen and how quickly it’s allowed to increase between frames. It wouldn’t be AI, but it might save your eyeballs.
That sounds very cool! Does it exclude certain surfaces or work against the entire screen output? I may still want to watch video content in its full HDR goodness without the system flattening the contrast to mush.

There are other areas of the system (Mail) where a governed dark mode would be helpful, but mostly for the web. The dark mode plugins I use seem to be fading in effectiveness and growing more error-prone over the years. Apple could, and should, fix this.
 
Last edited:

Scandinavian Film

Ars Scholae Palatinae
1,285
Subscriptor++
That sounds very cool! Does it exclude certain surfaces or work against the entire screen output? I may still want to watch video content in its full HDR goodness without the system flattening the contrast to mush.
It’s not an existing setting; I was just proposing it. But in theory it could optionally exclude video and/or images, like the “smart invert” setting currently does.
 
  • Like
Reactions: Bonusround

jaberg

Ars Praefectus
3,660
Subscriptor
Yeah, you also can’t train humans using copyrighted material against the wishes of the rights holder.
I would counter that most of my “training” as a photographer and ‘artist’ (such as I am) was through my study of copyrighted works — whether it was in accordance with the wishes of the rights holder or not.

I have certainly “copied” the look and feel of others along the path of developing my eye and finding my own voice. There are still days when I head out the door with the thought of going full Sam Abell. Or of channeling my inner (William Albert) Allard.

There is a lot of gray area here. I understand the concerns of (other) artists and working photographers, but I don’t fully support their arguments about fair use in this context. Images are meant to be seen, and by extension to be absorbed into the collective body of work. (I don’t mean that to be as trippy as it sounds.)

Interesting times.
 

Scud

Ars Legatus Legionis
12,314
Heh, sure. What I’m suggesting is that some tasks, like transcribing podcasts, fall under fair use. If you train a model to become better at performing something that’s fair use, is that… fair? We know that courts are beginning to tackle this very question.

I think the nature of the task matters, but it’s also a fine and messy line. To me, the podcast transcription case is very distinct from GPT regurgitating a NYTimes article as part of its paid service, or DALL-E copping a painter’s most popular portrait subjects. $.02
I agree with your point, but do you really see Apple being the one to lead the charge in that legal battle, against the very creatives they market to? Speaking of OpenAI, Apple and OpenAI have agreed to terms. Looks like OpenAI will be in Apple devices, just like Windows.

 
Last edited:
  • Like
Reactions: Bonusround
Speaking of OpenAI, Apple and OpenAI have agreed to terms. Looks like OpenAI will be in Apple devices, just like Windows.

I just came here to comment/gripe about this. I don't want Siri to chat/flirt with me; I don't care if it even speaks in complete sentences. I want it to reliably perform basic tasks, and until I'm convinced it can do that, I will keep it turned off. John Giannandrea supposedly said in an email in 2023 that "the last thing people needed was another chatbot," and I wish they had listened to him.
 

Bonusround

Ars Scholae Palatinae
1,060
Subscriptor
I just came here to comment/gripe about this. I don't want Siri to chat/flirt with me; I don't care if it even speaks in complete sentences. I want it to reliably perform basic tasks, and until I'm convinced it can do that, I will keep it turned off. John Giannandrea supposedly said in an email in 2023 that "the last thing people needed was another chatbot," and I wish they had listened to him.
‘Using’ OpenAI doesn’t necessarily mean ChatGPT. Let’s see what they announce in 11 days.

Sam Altman is a crook and I can't wait for OpenAI to crash and burn. Thumbs down, Apple.
I appreciate your sentiment. Apple’s alternative was… Google? It’s interesting to see Microsoft and Apple backing the same horse.
 

Scud

Ars Legatus Legionis
12,314
Yeah, it's an odd pairing for sure, but I suspect Sora was too tempting for Cook to pass up. This also confirms that Apple will be using a cloud-based solution as well, meaning I expect Siri to remain stupid for the foreseeable future. My guess is Apple will do what Salesforce is doing with OpenAI and offer a secured and private OpenAI-based LLM. The gotcha is time, as it took Microsoft and Salesforce a year to bake OpenAI into their respective tech stacks.
 

wrylachlan

Ars Legatus Legionis
12,769
Subscriptor
These machines will adapt and learn when they do work – whether we like it or not. What happens to the product (result) of that learning? That’s something we are going to wrestle with.
This is weird anthropomorphism. These models aren’t training or learning despite the fact that that’s what we lazily call it. They’re machines that we’re feeding copyrighted work into so that they can build us a pattern recognition table. They don’t have agency and aren’t ’doing’ anything in the same way that that hoe you use in the garden isn’t autonomously gardening.

Nor is it a given that there’s nothing we can do to prevent humans from using these machines this way. There are many things that humans could physically do that we nevertheless outlaw.

That a human learning from copyrighted work is considered fair use has no bearing on feeding that same work to a machine in order to calculate model weights. Calculating model weights for a commercial product that devalues the original work seems unambiguously not to be fair use. Calling it learning to equate it with scholarship that is fair use is entirely sleight of hand.
 

Bonusround

Ars Scholae Palatinae
1,060
Subscriptor
I agree with your point, but do you really see Apple being the one to lead the charge in that legal battle, against the very creatives they market to?
Great question. No, not when put like that. But that’s why I say the task matters – are podcasters going to rail against Apple for offering free transcriptions of episodes? For all we know, Apple offers an opt-out in the catalog interface.

This issue won’t be leading the legal charge – if for no other reason than Apple isn’t charging anything. The transcriptions are free, and they aren’t in the business of renting or selling models.

Calculating model weights for a commercial product that devalues the original work seems unambiguously not to be fair use.
Agreed. Are Apple’s podcast transcriptions a commercial product that devalues the original? I think not on either count.
 
Last edited:
  • Like
Reactions: Tagbert

Scud

Ars Legatus Legionis
12,314
Great question. No, not when put like that. But that’s why I say the task matters – are podcasters going to rail against Apple for offering free transcriptions of episodes? For all we know, Apple offers an opt-out in the catalog interface.

This issue won’t be leading the legal charge – if for no other reason than Apple isn’t charging anything. The transcriptions are free, and they aren’t in the business of renting or selling models.
Again, it's not the transcriptions that are the issue, but using copyrighted material to train an AI that Apple sells. It's like putting lyrics from a song on a t-shirt and selling those shirts. Most artists may not care, but if you sell billions of those shirts worldwide to generate billions in revenue, I guarantee you that either the artists or the publishers will most certainly care.
 
‘Using’ OpenAI doesn’t necessarily mean ChatGPT. Let’s see what they announce in 11 days.
Very true. The one thing I would most appreciate would be more accurate on-device audio transcription, because typing with your thumbs is terrible. Second would be translation. Both seem to be good problems for transformer models to solve, so here's hoping...
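FWIW, on-device recognition is already exposed through the Speech framework today; the open question is how much better the models behind it get. A minimal sketch of the existing path (the helper function is mine, and it assumes speech-recognition permission has been granted and the locale supports on-device recognition):

```swift
import Speech

// Minimal sketch of today's on-device transcription path via the Speech framework.
// Assumes speech-recognition authorization has already been granted.
func transcribe(fileAt url: URL) {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
          recognizer.supportsOnDeviceRecognition else {
        print("On-device recognition unavailable for this locale")
        return
    }

    let request = SFSpeechURLRecognitionRequest(url: url)
    request.requiresOnDeviceRecognition = true  // keep the audio off the network

    _ = recognizer.recognitionTask(with: request) { result, error in
        if let result = result, result.isFinal {
            print(result.bestTranscription.formattedString)
        } else if let error = error {
            print("Recognition failed: \(error)")
        }
    }
}
```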
 
  • Like
Reactions: Bonusround

Bonusround

Ars Scholae Palatinae
1,060
Subscriptor
This is weird anthropomorphism. These models aren’t training or learning despite the fact that that’s what we lazily call it.
You aren’t wrong, but… this language is already infused in the entire space. “Training” is how models are generated. We call them neural networks, fer chrissakes.

Nor is it a given that there’s nothing we can do to prevent humans from using these machines this way. There are many things that humans could physically do that we nevertheless outlaw.
100%
 

wrylachlan

Ars Legatus Legionis
12,769
Subscriptor
You aren’t wrong, but… this language is already infused in the entire space. “Training” is how models are generated. We call them neural networks, fer chrissakes.
It’s not calling them training and learning that’s the problem. Obviously those are the terms of art in the field. The issue is extrapolating, from the fact that we call it learning and training, that it should fall under fair use just because human learning and training are considered fair use. Same words, but very, very different relative to the fair use doctrine.
 
This is weird anthropomorphism. These models aren’t training or learning despite the fact that that’s what we lazily call it. They’re machines that we’re feeding copyrighted work into so that they can build us a pattern recognition table. They don’t have agency and aren’t ’doing’ anything in the same way that that hoe you use in the garden isn’t autonomously gardening.
We have to call it something though.

To use your example, a hoe is a tool, but due to its use it’s also a verb. You hoe your garden.

So it makes sense to say that a model that is fed data and has its weights modified is being trained, since that’s pretty much what we do when we train dogs. And a model that succeeds has learned, just as a dog that has mastered a command or trick has learned.
 
It’s not calling them training and learning that’s the problem. Obviously those are the terms of art in the field. The issue is extrapolating, from the fact that we call it learning and training, that it should fall under fair use just because human learning and training are considered fair use. Same words, but very, very different relative to the fair use doctrine.
Oh I see.

It’s not obvious to me how this plays out. If I go to a library and recite all the books for a model, that becomes fair use.

But scanning the books? That’s less clear.
 

wrylachlan

Ars Legatus Legionis
12,769
Subscriptor
If I go to a library and recite all the books for a model, that becomes fair use.
I don’t think that’s true at all. If that were true, you could go into a library, memorize a book, recite it, and someone else could write down what you said and publish it. I don’t think the fact that you passaged it through a human brain would be a viable defense against copyright infringement in that case.
 
I don’t think that’s true at all. If that were true, you could go into a library, memorize a book, recite it, and someone else could write down what you said and publish it. I don’t think the fact that you passaged it through a human brain would be a viable defense against copyright infringement in that case.
Yet you see adults reading out loud to dozens of kids in school libraries.
 

iljitsch

Ars Tribunus Angusticlavius
8,474
Subscriptor++
I wrote several books. I had to give the publishers permission to make copies and sell those. If I hadn’t, I could easily sue anyone who had gotten into my home, studied those books and then created something of their own based on them.
Is this sarcasm? Libraries exist.
Libraries don’t trump copyright. Now if you want to publish the books you wrote then they also end up in libraries. But if you don’t publish them, you can sue anyone who comes into your house and uses those books for whatever. Or even if you throw them in a dumpster and someone fishes them out.
 

wrylachlan

Ars Legatus Legionis
12,769
Subscriptor
I wrote several books. I had to give the publishers permission to make copies and sell those. If I hadn’t, I could easily sue anyone who had gotten into my home, studied those books and then created something of their own based on them.

Libraries don’t trump copyright. Now if you want to publish the books you wrote then they also end up in libraries. But if you don’t publish them, you can sue anyone who comes into your house and uses those books for whatever. Or even if you throw them in a dumpster and someone fishes them out.
And a key point here is that passaging your copying through fair use doesn’t help you. Imagine you had not published your book but used it as a teaching aid in a class. It would be totally fair use for your students to write down quotes out of the book in their notes in the course of their studies. If someone came along and dumpster-dived all those notes to the point that they could reconstruct your entire book and publish it themselves, that would still be copyright infringement. It doesn’t matter that the first copy was fair use.
 

Bonusround

Ars Scholae Palatinae
1,060
Subscriptor
It’s not calling them training and learning that’s the problem. Obviously those are the terms of art in the field. The issue is extrapolating, from the fact that we call it learning and training, that it should fall under fair use just because human learning and training are considered fair use. Same words, but very, very different relative to the fair use doctrine.
Again, I’m suggesting that fair use is in the act – in this case generating free transcriptions of freely-available works.

Was Apple’s podcast transcription trained on the podcasts themselves?
We don’t know. Does it matter? We also don’t know, yet – that’s NYTimes vs. OpenAI. Should it matter? I think so, but the details are important and there are many. This is our current morass.

It’s not the reading that’s the issue, it’s the ‘turning it into a commercial product’.
Exactly. That’s what I don’t see Apple having done in the case of podcast transcriptions.


What I see as one of the stickiest wickets is this: models need not be static. Let’s say a company begins with a perfectly ‘clean’ model – trained only on open source and licensed works. It then begins using said model against copyrighted works. Through feedback the model becomes better-trained… what now? This is what I meant by ‘learning’ earlier.
 

wrylachlan

Ars Legatus Legionis
12,769
Subscriptor
Let’s say a company begins with a perfectly ‘clean’ model – trained only on open source and licensed works. It then begins using said model against copyrighted works. Through feedback the model becomes better-trained… what now? This is what I meant by ‘learning’ earlier.
So… don’t do that. Models work just fine without ongoing feedback learning based on works you don’t have permission to use. If you want to do continual model refinement then pay to use the works for that purpose.
 

Bonusround

Ars Scholae Palatinae
1,060
Subscriptor
So… don’t do that. Models work just fine without ongoing feedback learning based on works you don’t have permission to use. If you want to do continual model refinement then pay to use the works for that purpose.
Disable feedback... entirely? Or allow it only while operating on these works, then forget. Or keep the feedback local but don’t share it with the ‘master’ model. And again, what if your business is not to sell or rent models but to perform your legal, allowed work and nothing more?

Your definition of free must be very different to mine, or has Apple started giving away iPhones?
Last I checked Apple’s podcast catalog was freely available to everyone. If transcriptions aren’t included as part of it then I stand corrected.
 
Last edited:

wrylachlan

Ars Legatus Legionis
12,769
Subscriptor

Disable feedback... entirely? Or allow it for the extent of operating on these works only, then forget. Or maintain the feedback locally but don’t share it with the ‘master’ model. And again, what if your business is not to sell or rent models but to perform your legal, allowed work and nothing more?
Don’t use other people’s work to improve your product without paying them. You’re making it seem much more nuanced than it actually is. Don’t build a business model around illegal copying. Are the potential benefits of a continuous learning AI that gets better and better over time enticing? Absolutely! So pay for it.
 

Bonusround

Ars Scholae Palatinae
1,060
Subscriptor
Don’t use other people’s work to improve your product without paying them. You’re making it seem much more nuanced than it actually is.
It is nuanced – sorry.

Don’t build a business model around illegal copying.
Copying? I don’t agree this is copying. It’s something else.

Are the potential benefits of a continuous learning AI that gets better and better over time enticing? Absolutely! So pay for it.
People who create works should be compensated for their works, no question, and especially by those who profit off their works.

I care a lot about fair use. Copyright can protect creators, but it is weaponized all too often by those who couldn’t give a rat’s ass about the original creator. What will constitute fair use in an age of ubiquitous, always-learning AI?

I want my future AI assistant to be able to read everything I do, to learn from it, and to recall it to me upon request. I wonder if it will be able to do so, and under what constraints. Will it be able to see the video I’m watching or does DRM step in? Does it need to purge its memory because I returned an eBook to the library?

Someone said it above: interesting times.
 
It’s not the reading that’s the issue, it’s the ‘turning it into a commercial product’.
<tangent>Then there's Borges's Pierre Menard, Author of the Quixote, in which a man gets himself in the right frame of mind to write Cervantes's "El Quijote" word for word without copying it. Would that be legal?</tangent>