This paper is available on arXiv under a CC 4.0 license.
Authors:
(1) Muhammed Yusuf Kocyigit, Boston University;
(2) Anietie Andy, University of Pennsylvania;
(3) Derry Wijaya, Boston University.
Table of Links
- Abstract and Intro
- Related Works
- Data
- Method
- Analysis and Results
- Conclusion
- Limitations
- Ethics Statement and References
- Appendix: Toxicity Measurement
- Appendix: Correlation Over Time
- Appendix: Wikidata
- Appendix: Hyperparameter Sensitivity
Appendix: Wikidata
We use the query in Figure 8 to select the set of individuals who were born in, are residents of, and are citizens of the United States of America. The ethnic label is supplied to the query manually. The ethnic label returns classes that are much more fine-grained than we aim for in this study, so we manually create a dictionary to map each sub-group into the main categories presented in Figure 4: African American, White American, and Others. For African American, the main rule was that the origin country for the ethnicity is on the African continent. We also classified each European American ethnicity (for example, Italian American or Irish American) as White American.
Finally, we manually label individuals who are significant figures but whose Wikipedia pages do not contain the ethnicity label. This was more prevalent among White Americans.
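The mapping and manual-labeling steps can be illustrated with a minimal Python sketch. It is not the authors' code: the dictionary entries, the placeholder name "Example Person", the function name `categorize`, and the choice to send unmapped labels to "Others" are all assumptions made for illustration, and the full hand-built dictionary is not reproduced here.

```python
from typing import Optional

# Hypothetical excerpt of a hand-built dictionary that maps fine-grained
# ethnic labels to the three main categories. The entries are illustrative.
SUBGROUP_TO_CATEGORY = {
    "African Americans": "African American",
    "Nigerian Americans": "African American",  # origin country is in Africa
    "Italian Americans": "White American",     # European American sub-group
    "Irish Americans": "White American",       # European American sub-group
}

# Hypothetical manual labels for significant figures whose pages carry no
# ethnicity label. "Example Person" is a placeholder, not a real entry.
MANUAL_LABELS = {
    "Example Person": "White American",
}


def categorize(name: str, ethnic_label: Optional[str]) -> Optional[str]:
    """Return the main category for an individual, or None if unknown."""
    if ethnic_label is None:
        # No ethnicity label available: fall back to the manual labels.
        return MANUAL_LABELS.get(name)
    # Assumption: any label missing from the dictionary falls under "Others".
    return SUBGROUP_TO_CATEGORY.get(ethnic_label, "Others")
```

Keeping the manual labels in a separate dictionary makes it easy to see which assignments came from the ethnicity label and which were added by hand.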