This paper is available on arxiv under CC 4.0 license.

**Authors:**

(1) Muhammed Yusuf Kocyigit, Boston University;

(2) Anietie Andy, University of Pennsylvania;

(3) Derry Wijaya, Boston University.

## Table of Links

- Abstract and Intro
- Related Works
- Data
- Method
- Analysis and Results
- Conclusion
- Limitations
- Ethics Statement and References
- Appendix: Toxicity Measurement
- Appendix: Correlation Over Time
- Appendix: Wikidata
- Appendix: Hyperparameter Sensitivity

## Appendix: Correlation Over Time

We plot the correlation over time in Figure 5 to get a general picture of the embedding space. To support our observations, we also conduct a Kolmogorov–Smirnov (KS) test. We use the two-sided test, where the null hypothesis is that the two empirical distributions are the same. We take two columns from our heatmap, ignore the rows where either entry is 1, and take the absolute value of the difference between the two columns. The resulting list constitutes our samples from the first distribution for the KS test. The samples from the second distribution are the same lists for every other transition in our heatmap, appended together. Because the KS test does not depend on the number of samples, we can run the test for each transition.
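The procedure described above can be sketched in plain Python. This is a minimal illustration, not the authors' code. It assumes the heatmap is a list of rows, that a "transition" is a pair of adjacent columns, and that the p-value comes from the asymptotic Kolmogorov distribution. The function names (`transition_sample`, `ks_statistic`, `ks_pvalue`, `ks_per_transition`) are hypothetical. In practice a library routine such as `scipy.stats.ks_2samp` would normally replace the two hand-written KS functions.

```python
import math
from bisect import bisect_right


def transition_sample(heatmap, col_a, col_b):
    """Absolute differences between two columns, skipping rows where either entry is 1."""
    sample = []
    for row in heatmap:
        a, b = row[col_a], row[col_b]
        if a == 1 or b == 1:
            continue
        sample.append(abs(a - b))
    return sample


def ks_statistic(xs, ys):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap occurs at a data point.
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic two-sided p-value (an assumption; exact methods exist for small samples)."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The alternating series converges slowly here, and the p-value is effectively 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def ks_per_transition(heatmap):
    """Test each adjacent-column transition against all other transitions pooled together."""
    n_cols = len(heatmap[0])
    samples = [transition_sample(heatmap, j, j + 1) for j in range(n_cols - 1)]
    results = []
    for j, first in enumerate(samples):
        # Second distribution: the samples of every other transition, appended together.
        rest = [v for k, s in enumerate(samples) if k != j for v in s]
        d = ks_statistic(first, rest)
        results.append((j, d, ks_pvalue(d, len(first), len(rest))))
    return results
```

Each returned tuple holds the transition index, the KS statistic, and the p-value. The sketch assumes every transition keeps at least one row after filtering; an empty sample would need separate handling.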

Tables 3 and 4 below give the results of the KS test. The test indicates whether the two empirical samples are likely to come from the same distribution. We observe two cases where we can reject the null hypothesis relatively safely: the White Americans heatmap between 1920 and 1930, and the African American heatmap between 1900 and 1910. In the first case, the average similarity is well above the average similarity of the samples from the second distribution, which signals that the null hypothesis was rejected not because the change in this transition is large but because it is unusually small. In the second case, the average similarity is much smaller, which supports our point.