Toxic Comment Identifier

Improving the Internet For All

Try our Product!

Start by writing your text or paste your text here!

Toxicity on the internet is a major problem for platforms and publications. Abuse and bullying suppress vital voices, while inappropriate comments, even when they aren't directed at us, can deter users from seeking important information and sour everyone's online experience.


Online hate speech can cause severe long-term problems, and escaping it isn't as simple as turning off a computer. Adults use computers for everyday communication, while kids today go online to socialize with their peers. Online harassment can lead to problems in school, depression, slipping grades, drug use, and a heightened risk of suicide. These tragedies ripple through communities and affect everyone. Stopping such malevolence online is crucial for both the offender and the victim; the future is bleak no matter which side of the equation you are on. When it is not stopped early, it can become a habit that ends in serious legal consequences.

Overcoming Difficulties

"In the middle of difficulty lies opportunity" - Albert Einstein

Debugging Errors

Detecting and eliminating mistakes is a long process. Debugging is used to discover and repair faults or defects in software or systems to prevent improper functioning. Because this process requires human inspection and modification at every step, it was tedious and tiresome.

Brutal Comments and Lack of Datasets

Our original idea centered more around weeding out identity hate and flagging microaggressions for users. Unfortunately, this sort of dataset was difficult to find. Luckily, there were a number of toxic comment datasets that could be used to identify, more generally, toxic comments online. Looking through this dataset was sometimes emotionally taxing because of the vulgarity of the comments.

CoCalc G2/ Server Errors

When the kernel stopped while we were developing or running our code, we simply had to be patient, restart, and redo the task. Training our model often required taking time outside of class, but after enough tries, we managed to achieve a few successful runs. To solve our cell input/output problem, we got our cells working properly again using the validate button.

Web Dev Dilemma

One main problem we encountered while developing our website was text not being saved properly. Whenever we entered a paragraph's worth of text and saw it saved, the text would all be deleted as soon as we exited the page. This was a major obstacle to overcome, since we couldn't afford to keep losing large amounts of text. One possible reason is that the platform may not handle holding that much text at once. Discovering afterward that the text had been deleted meant tediously rewriting everything over again.

Mechanism Behind Our Product

  During our planning and work process for creating the toxic comment identifier, we relied on three main resources:

Jigsaw Dataset

A dataset composed of a large number of Wikipedia comments, labeled by humans to indicate whether each comment is toxic, an insult, identity hate, or a threat, among others.


BERT

An advanced pre-trained language model, able to extract word embeddings (the way computers understand the meaning of words) from datasets based on sentence context and other factors. BERT's outputs are information-rich and perfect for feeding into classification models.


Transformer

The basis for the structure that BERT utilizes: a set of encoders, which make sense of words using outside word embeddings and the relationships between words in a sentence (called self-attention), and/or decoders, which transform the input into our preferred output using the above techniques, as well as deciding the importance of each word for sentence meaning (called attention).
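The self-attention idea described above can be sketched in a few lines. This is an illustration only, not our actual model: real Transformer encoders add learned query/key/value projections and multiple attention heads, which are omitted here for clarity.

```python
import numpy as np

def self_attention(X: np.ndarray) -> np.ndarray:
    """X: (seq_len, d) matrix of word embeddings, one row per word."""
    d = X.shape[1]
    # How strongly each word relates to every other word in the sentence
    scores = X @ X.T / np.sqrt(d)
    # Softmax each row so every word's attention weights sum to 1
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    # Each word's new representation is a weighted mix of all words' embeddings
    return weights @ X

# Toy example: three "words" with one-hot embeddings
X = np.eye(3)
out = self_attention(X)
```

With one-hot inputs, each output row mixes mostly its own word with a little of every other word, which is the "relationships between words" behavior the encoders exploit.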

First, the dataset we used was sourced from Kaggle and contained 6 different categories of hate speech. These labels were created by humans, not by a computer.

Toxic
Severe Toxic
Obscene
Threat
Insult
Identity Hate
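Counting comments per category from such a dataset can be sketched with pandas. The column names below are assumptions based on the six categories above, and the three comments are made-up stand-ins, not rows from the real dataset.

```python
import pandas as pd

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Hypothetical miniature of the dataset: one 1/0 column per human-applied label
df = pd.DataFrame({
    "comment_text":  ["you are great", "you are awful", "I will find you"],
    "toxic":         [0, 1, 1],
    "severe_toxic":  [0, 0, 0],
    "obscene":       [0, 1, 0],
    "threat":        [0, 0, 1],
    "insult":        [0, 1, 0],
    "identity_hate": [0, 0, 0],
})

counts = df[LABELS].sum()                        # comments per category
clean = int((df[LABELS].sum(axis=1) == 0).sum()) # comments with no labels at all
```

Summing each label column gives the per-category totals behind the distribution graphs, while summing across a row reveals whether a comment is clean.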

Dataset Analysis

Data Distribution

The graph above displays the number of comments per category. Most of the comments were clean (positive), and the fewest fell into the threat category. In the middle were the comments classified as toxic, showing that online hate is still a pervasive problem. Keep in mind that this visualization shows how each comment was classified, NOT its type.

Toxic Comments Comparison

This graph compares the prevalence of different forms of hate speech. As with the previous graph, clean comments dominate, yet there is still a significant number of both obscene and insulting comments. Remarks categorized as threats or identity hate are significantly less common in our dataset than the other categories. This is not a classification; rather, this visual shows what level of toxicity each comment was marked with.

Multi-Tag Comments

Many of the comments in the dataset fit into multiple categories. For example, a certain comment might be both obscene and an insult. This bar graph depicts how often comments have 1, 2, 3, 4, 5, or all 6 categories.
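The multi-tag counting behind this bar graph can be sketched as follows. The rows here are invented for illustration; each inner list is one comment's 1/0 flags for the six categories.

```python
from collections import Counter

# Hypothetical label rows (toxic, severe_toxic, obscene, threat, insult, identity_hate)
label_rows = [
    [0, 0, 0, 0, 0, 0],  # clean comment, excluded from the tag counts
    [1, 0, 1, 0, 1, 0],  # toxic + obscene + insult -> 3 tags
    [1, 1, 1, 0, 1, 1],  # 5 tags
    [1, 0, 0, 0, 0, 0],  # 1 tag
]

# How many comments carry 1, 2, ..., 6 tags (clean comments excluded)
tags_per_comment = Counter(sum(row) for row in label_rows if sum(row) > 0)
```

Grouping by the per-row sum is exactly the quantity the bar graph plots on its x-axis.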

Clean vs Hate

This graph demonstrates that the majority of the comments in our dataset were clean, with just a small fraction containing some toxicity.

Overview of BERT

Encoder Stack

BERT is a Transformer model that solely utilizes encoders. These encoders can be combined with outside word embeddings or be used to extract word embeddings based on the data, as we chose to do.


The inputted text is passed through these encoders, which, as described above under Transformer, make sense of the sentence using word relationships and, in our case, extract word meanings.


BERT's output captures individual word meanings and overall sentence understanding in a computer-readable way. This can be passed on to a classifier, which, in our case, labels the input based on its level of toxicity.


We used a simple linear classifier followed by the sigmoid function in order to get our final output: whether the inputted comment is toxic, severely toxic, obscene, a threat, an insult, identity hate, some combination thereof, or none of the above.

An example of a multi-class linear classifier

The linear classifier uses past data to predict which category future data will fall into. The input (a collection of features extracted by BERT) is weighted based on which of those features are most important to figuring out the class, and then passed on to the sigmoid function, which provides a probability for each category, indicating how likely the input is toxic, a threat, etc.
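The weighting-then-sigmoid step can be sketched as below. This is a simplified stand-in, not our trained model: the weights are random rather than learned, and a random vector stands in for a real 768-dimensional BERT embedding. Using a sigmoid per label (rather than a softmax over labels) is what lets one comment be, say, both obscene and an insult.

```python
import numpy as np

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
HIDDEN = 768  # BERT-base embedding size

rng = np.random.default_rng(0)
W = rng.normal(size=(HIDDEN, len(LABELS))) * 0.01  # "learned" weights (random here)
b = np.zeros(len(LABELS))                          # bias term

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def classify(embedding: np.ndarray) -> np.ndarray:
    # Weight the BERT features, then squash each score into a 0-1 probability
    return sigmoid(embedding @ W + b)

embedding = rng.normal(size=HIDDEN)  # stand-in for a real BERT output
probs = classify(embedding)          # one independent probability per label
```

Each entry of `probs` answers one yes/no question ("is this toxic?", "is this a threat?", ...), so thresholding each entry independently yields the final multi-label tags.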

Motive & Our Story

We were drawn to the idea of toxic comment identification because of its uniquely modern applications. Today's lifestyles have become considerably more digitally focused, with online interactions being nearly as common as in-person encounters. The coronavirus pandemic has only exacerbated this trend, with daily tasks being carried out over the Internet more than ever before. Although we all know that the online world can be a wonderful place for connecting and learning, we also know that it can be an outlet for obscenity, hate speech, and general toxicity. We hope our program can be used to mitigate this phenomenon and help steer online conversations toward a more positive and productive environment.

Meet the students!

Bella Goldwasser

Web Developer

Bella is a rising sophomore at Alameda High School. In her free time, she enjoys reading, cooking, and playing the flute. Prior to this camp she only had limited programming knowledge but greatly enjoyed participating on her middle school’s robotics team. In the future, she hopes to continue expanding her AI and programming knowledge, as well as exploring how both could intersect with all of her other interests to have a positive impact on the world.

Elisa D'sa

Web Developer

Elisa is a rising junior at Mission San Jose High School. She enjoys many different forms of art and attends a weekly art academy. Along with art, she loves to bake for family and friends, listen to music, watch horror movies, and play tennis recreationally. Her main purpose in joining AI Camp was to gain some exposure to different branches of programming, since our future depends on it; ultimately, she aspires to major in Psychology in hopes of pursuing a career in a medical field.

Kimberly Chang

Web Developer

Kimberly is a rising freshman at Westridge School. She is a co-founder of a non-profit organization called Madhatter Knits, through which she spends her free time knitting beanies for premature babies in the NICU. In addition, Kimberly loves tennis, playing her zither, and making ceramic pieces! Her goal is to continue acquiring knowledge about AI in order to make a global impact!

Rachel Yang

Data Scientist

Rachel is a rising sophomore. Outside of academics, she swims, games, and reads novels. With a few years of prior coding experience, Rachel joined AI Camp to gain exposure to and learn more about AI. Looking forward, she hopes to study CS or another computer-related engineering field.

Shreeya Garg

Data Scientist

Shreeya is a rising junior at American High School. In her free time, she enjoys reading, watching TV, and playing board games with her friends and family. She is dedicated to helping everyone receive a high-quality education and is the founder of an organization called Empowering Kidzz, where kids can attend free classes led by high schoolers. She is also part of other clubs and organizations, like GenUp, FAYE, Interact, and L- Connection, and aims to make the world a better place. Shreeya joined AI Camp hoping to gain more exposure to this field, and she aims to one day use these skills to help her community.

Justin Yi

Product Manager

Justin is an undergraduate at UCLA and served as the group's product manager and instructor. Justin is passionate about education and holds a dedicated interest in fair machine learning.