Pictures of Canadian victims are among the thousands of images depicting child sexual abuse that an internet watchdog group found in databases used to train popular artificial image generators “...
The database started out empty. They added all of the content. The filtering should have been part of the intake process, not after the fact. Image recognition has been used to detect CP for many years now.
They could have and should have stopped these images from getting into the dataset at all, but they didn’t. Consequently, people who were victimized as children are having the exploitative images of them used to generate new (synthetic) child porn.
They did run filters. The group that found the new images built a completely new, stronger filter that is better at detecting them. You can’t blame the dataset’s producers for not using technology that just wasn’t available at the time. They also pulled the whole dataset the moment the group alerted them and removed the images.
This watchdog group was able to find this content. You don’t think the producer of the database should have any responsibility for the content within it? If it’s not feasible for you to guarantee that the contents of your product are legal/ethical, maybe that’s a problem?
I’m not sure about guarantee. That implies perfection which is never attainable in anything. But requiring transparent evidence of due diligence is certainly doable. As are penalties for failure to meet some kind of standard.
It’s past time to institute “grading standards” on large datasets. I have in mind the same kind of statistical standards that are applied in various kinds of defect and contamination analysis. For example, nobody ever guarantees that your food is free of animal feces, only that a fair and representative sample didn’t find any.
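To make the sampling idea concrete, here is a rough sketch of the arithmetic behind a "zero findings in a random sample" standard. It is only an illustration: the function names are made up, and it assumes simple random sampling from a very large dataset and a check that never misses a bad item in the sample, neither of which is guaranteed in practice.

```python
import math


def sample_size_for_zero_findings(max_rate: float, confidence: float = 0.95) -> int:
    """Smallest random sample size n such that, if the true contamination
    rate were max_rate, at least one bad item would turn up in the sample
    with probability >= confidence.

    Assumption: items are sampled independently from a dataset large enough
    that the binomial model applies, so P(no findings) = (1 - max_rate) ** n.
    """
    if not (0.0 < max_rate < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("max_rate and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - max_rate))


def upper_bound_after_clean_sample(n: int, confidence: float = 0.95) -> float:
    """Upper bound on the contamination rate that is still consistent with
    a random sample of n items containing zero findings, at the given
    confidence. Solves (1 - p) ** n = 1 - confidence for p.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    # Hypothetical target: be 95% sure the rate is below 1 in 100.
    n = sample_size_for_zero_findings(max_rate=0.01, confidence=0.95)
    print("sample size needed:", n)
    print("rate bound after a clean sample:", upper_bound_after_clean_sample(n))
```

The point is that a clean sample never proves the dataset is clean. It only bounds the rate, and the bound tightens slowly: cutting the tolerated rate by a factor of ten takes roughly ten times the sample.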
They are currently getting rid of them. It’s a database with 5 billion images; it’s not feasible for someone to go through it one by one.
I’m happy it got found by a new AI filter, hopefully it can also be used to get rid of the websites that hosted them in the first place.
But a slap in the face to whom? The anti-AI crowd needs to grow up.
The watchdog group made a completely new filter for it.
Yes, the producers should be running all available filters and they did. This one simply wasn’t available.