Google News Lab and ProPublica partner to create a hate crime database using machine learning

In collaboration with Google News Lab, ProPublica, a non-profit investigative journalism newsroom, will build a database of media reports about hate incidents and hate crimes, collected from Google News. The database will be compiled using machine learning.

Representational image: Reuters.
In the wake of a rise in hate crimes, the most recent being the neo-Nazi attack in Charlottesville, Virginia, in the US, the database will be called Documenting Hate. The fields in which the data is presented are the article title, the date on which the article was published, the publisher, the location, keywords and a brief summary of the incident. The data will be updated weekly. Pitch Interactive may help visualise the data, reports TechCrunch.

In the weekly dataset, along with the titles of hate crime reports, there is a set of keywords showing the most prominent terms of the week. A keyword that appears bolder and bigger is more prevalent in the dataset. From 7 August to 13 August, Donald Trump was the most used keyword, and Donald J Trump also appeared in the list of top five keywords.
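The keyword ranking described above can be illustrated with a minimal sketch in Python. It assumes each report is a record with a list of keywords; the field names and sample values here are hypothetical and are not taken from the actual Documenting Hate data. Keywords are simply counted across the week's reports, and the highest counts would be the ones drawn bigger and bolder.

```python
from collections import Counter

# Hypothetical sample records: the field names follow the article's
# description of the dataset, but the values are made up for illustration.
reports = [
    {"title": "Example report A", "keywords": ["keyword one", "keyword two"]},
    {"title": "Example report B", "keywords": ["keyword one", "keyword three"]},
    {"title": "Example report C", "keywords": ["keyword one", "keyword two"]},
]


def top_keywords(records, n=5):
    """Count how often each keyword occurs and return the n most frequent."""
    counts = Counter(kw for record in records for kw in record["keywords"])
    return counts.most_common(n)


# A higher count would correspond to a bigger, bolder keyword in the display.
for keyword, count in top_keywords(reports):
    print(keyword, count)
```

This is only a frequency count; how the real index weights or groups keywords is not described in the report.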

According to ProPublica, it also plans to expand the data with accounts from people who have been victims of hate crimes, since many hate crimes may go unreported. In the US, the FBI is required to report hate crimes, but local jurisdictions are under no such obligation, which increases the chances of such crimes going unreported. ProPublica also intends to gather data from civil rights groups.

As reported by TechCrunch, Google News Lab Data Editor Simon Rogers said, "It is one of the first visualizations to use machine learning to generate its content using the Google Natural Language API, which analyses text and extracts information about people, places, and events."

Google will open its data to contributors via GitHub, where all the data will be collected.
