Thursday, September 2, 2021, 1:34 AM
Content Moderation via Artificial Intelligence
In-context artificial intelligence may assist communities in detecting early indications of criminal activity, such as grooming or recruiting, on a large scale.
This field of artificial intelligence examines data in the context of its environment, which means it considers both content (the raw text itself) and context (e.g., characteristics of the users and the situation, or how frequently an offender has acted before) in order to categorize a behavior.
When contextual AI is applied to a platform, it looks across every part of the platform (postings, private chats, and messaging) and links numerous messages together in order to evaluate conversations that span many encounters. At its core, contextual AI is concerned with how users' behavior develops over time, and with how they react to one another's messages, in order to distinguish conversations that are consensual from those that are not.
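As a rough sketch of the idea, a risk score for a conversation might combine per-message content scores with contextual signals such as an offender's history and how many platform surfaces the conversation touches. The class names, weights, and signals below are illustrative assumptions, not a real platform's model:

```python
from dataclasses import dataclass

@dataclass
class Message:
    user_id: str
    text: str
    channel: str      # hypothetical surface label, e.g. "post", "private_chat", "dm"
    toxicity: float   # content-only score from a text model, in [0, 1]

def conversation_risk(messages: list[Message],
                      prior_offenses: dict[str, int]) -> float:
    """Blend content scores with contextual signals (illustrative weights)."""
    if not messages:
        return 0.0
    # Content: average per-message toxicity.
    content = sum(m.toxicity for m in messages) / len(messages)
    # Context: repeat offenders and cross-channel activity raise the score.
    repeat = max(prior_offenses.get(m.user_id, 0) for m in messages)
    channels = len({m.channel for m in messages})
    context = min(1.0, 0.2 * repeat + 0.1 * (channels - 1))
    return min(1.0, 0.7 * content + 0.3 * context)
```

The same message history scores higher when one participant is a known repeat offender, which is exactly the "content plus context" distinction the text describes.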
In your community, if ten individuals are yelling at each other, is it because they are enjoying a competitive game or because a group of bullies is conspiring against someone? Context is what identifies the sentiment.
Following the identification of improper conduct by contextual AI, coupled with the precise reason why the behavior breaches a community standard, the platform's actions against the offender may be automated to a large extent. For example, you could choose to give a warning for a first violation, suspend an offender's account for three days for a second infraction, or ban a repeat offender from the site entirely.
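The escalation policy described above (warn, then suspend, then ban) can be sketched as a simple lookup. The thresholds mirror the example in the text; any real platform would tune its own:

```python
def enforcement_action(violation_count: int) -> str:
    """Map a user's running violation count to an escalating action."""
    if violation_count <= 1:
        return "warning"                # first violation: warn
    if violation_count == 2:
        return "3-day suspension"      # second violation: temporary suspension
    return "permanent ban"             # repeat offender: remove from the platform
```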
Artificial intelligence can also overcome many of the well-documented difficulties associated with face recognition technology. Investigative teams working to locate missing children, for example, may spend hours upon hours comparing photographs supplied by the children's families with photographs obtained from internet escort services. Standard face recognition software is of limited help in these situations, because its models were developed largely using images of white adults, not the young and varied individuals who are actually the victims of human trafficking.
A next-best-neighbor method, which essentially ranks pictures in decreasing order of probable match and then presents them to the investigator so that they can make an informed decision, can help resolve this issue. This saves investigators significant time and effort, allowing them to devote their energy to applying their investigative skills, making decisions, and putting facts into context, rather than scrolling through pictures.
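A minimal sketch of that ranking step, assuming each photograph has already been converted to an embedding vector by some upstream model (the vectors and IDs here are placeholders):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_candidates(query: list[float],
                    candidates: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Rank candidate photo IDs in decreasing order of probable match,
    leaving the final judgment to a human investigator."""
    scored = [(photo_id, cosine_similarity(query, emb))
              for photo_id, emb in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The key design point is that the model only orders the candidates; the investigator still makes the call, which is what keeps the human's judgment and context in the loop.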
The need for a next-best-neighbor method highlights a fundamental issue facing any artificial intelligence: models must be trained on large datasets that are correctly labeled.
The last decade has seen some remarkable instances of AI failure, delivering results that are heavily biased against particular races and groups of people. To ensure that AI models are free of bias, they must be trained on a broad collection of datasets that have been labeled by a diverse population of labelers.
A person's capacity to classify data correctly is greatly influenced by upbringing, culture, and life experience, among other factors. For example, individuals who did not grow up around drug culture may be unfamiliar with many coded drug terms. Some people are better at recognizing grooming behavior, while others are better at recognizing hate speech and other offensive language.
Aside from that, it is critical to hire labelers who are native speakers of the language of the material they will be evaluating. A speaker who is not linguistically and culturally competent may miss the subtleties, euphemisms, and cultural references of a given language, as well as the ways in which context can alter the meaning of a word or phrase. As a consequence, non-native speakers asked to label such material are likely to have a poor accuracy rate.
When developing a pool of data labelers, it is critical to assemble a diverse group of people. A man may consider a particular word or behavior harmless, or merely unpleasant, while a woman may find it downright objectionable. The same applies across gender, age, religion, national origin, race, ethnicity, and every other characteristic. A varied dataset and a varied data labeling team help guarantee that your labels are correct and that you are not introducing accidental individual human biases into the labeling process. Diversity will also help minimize the possibility of oversensitivity to certain issues.
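One common way to put a diverse labeling pool to work is to collect several labels per item, take the majority vote, and flag items where the labelers disagree for expert review. The agreement threshold below is an illustrative assumption:

```python
from collections import Counter

def aggregate_labels(labels: list[str],
                     min_agreement: float = 0.7) -> tuple[str, float, bool]:
    """Majority-vote a set of labels from different annotators.

    Returns (winning_label, agreement_ratio, needs_review), where
    needs_review is True when agreement falls below the threshold --
    a signal that diverse labelers read the item differently.
    """
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    agreement = votes / len(labels)
    return label, agreement, agreement < min_agreement
```

Low-agreement items are often exactly the culturally loaded or ambiguous cases the surrounding text describes, so routing them to review is where labeler diversity pays off.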