2017 #SMSociety Theme: Social Media for Social Good or Evil

Our online behaviour is far from virtual: it extends our offline lives. Much social media research has identified the positive opportunities of social media; for example, how people use it to form support groups online, participate in political uprisings, raise money for charities, and extend teaching and learning outside the classroom. However, mirroring offline experiences, we have also seen social media being used to spread propaganda and misinformation, recruit terrorists, live stream criminal activities, reinforce echo chambers by politicians, and perpetuate hate and oppression (such as racist, sexist, homophobic, and anti-Semitic behaviour).

Friday, July 28 • 09:00 - 10:30
Workshop 1C: Text Analytics for Social Data Using DiscoverText & Sifter



Workshop Facilitator

Dr. Stu Shulman, Texifter (USA)

Workshop Details

Participate in this workshop to learn how to build custom machine classifiers for sifting social media data. The topics covered include how to:

  • construct precise social data fetch queries,
  • use Boolean search on resulting archives,
  • filter on metadata or other project attributes,
  • count and set aside duplicates, cluster near-duplicates,
  • crowd source human coding,
  • measure inter-rater reliability,
  • adjudicate coder disagreements, and
  • build high quality word sense and topic disambiguation engines.
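As one illustration of the "measure inter-rater reliability" step, the sketch below computes Cohen's kappa for two coders who labelled the same items. It is a generic textbook statistic written in plain Python, not DiscoverText's own implementation; the function name, coder lists, and labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items where the two coders chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: probability both pick the same label if each coder
    # labelled at random according to their own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label for every item
    return (observed - expected) / (1 - expected)


# Hypothetical codings of six tweets by two coders.
coder_1 = ["relevant", "relevant", "spam", "spam", "relevant", "spam"]
coder_2 = ["relevant", "spam", "spam", "spam", "relevant", "relevant"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, so two coders who agree on four of six items with evenly split labels score well below their 67% raw agreement.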

DiscoverText is designed specifically for collecting and cleaning up messy Twitter data streams. Use basic research measurement tools to improve human and machine performance in classifying Twitter data over time. The workshop covers how to reach and substantiate inferences using a theoretical and applied model informed by a decade of interdisciplinary, National Science Foundation-funded research into the text classification problem.

Participants will learn how to apply “CoderRank” in machine learning. Just as Google holds that not all web pages are created equal and that links on some pages rank higher than others, Dr. Shulman argues that not all human coders are created equal: on any given task, the accuracy of some coders’ observations invariably ranks higher than that of others. The central idea of the workshop is that, when training machines for text analysis, greater reliance should be placed on the input of those humans most likely to create a valid observation. Texifter proposed a unique way to recursively validate, measure, and rank humans on trust and knowledge vectors, and called it CoderRank.
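The general idea of relying more on trustworthy coders can be sketched as an accuracy-weighted vote. This is not the CoderRank algorithm, whose details are not given here; it is a simplified, single-pass illustration that assumes a small set of gold-standard items is available. All names, coders, and labels are hypothetical.

```python
from collections import defaultdict


def coder_accuracy(codings, gold):
    """Share of gold-standard items each coder labelled correctly."""
    hits = defaultdict(int)
    seen = defaultdict(int)
    for coder, item, label in codings:
        if item in gold:
            seen[coder] += 1
            hits[coder] += int(label == gold[item])
    return {coder: hits[coder] / seen[coder] for coder in seen}


def weighted_label(codings, weights, item):
    """Pick the label with the largest total coder weight for one item."""
    totals = defaultdict(float)
    for coder, coded_item, label in codings:
        if coded_item == item:
            # Coders with no measured accuracy contribute nothing.
            totals[label] += weights.get(coder, 0.0)
    if not totals:
        raise ValueError(f"no codings for item {item!r}")
    return max(totals, key=totals.get)


# Hypothetical codings as (coder, item, label) triples.
codings = [
    ("ana", "t1", "spam"), ("ben", "t1", "spam"), ("cy", "t1", "spam"),
    ("ana", "t2", "relevant"), ("ben", "t2", "spam"), ("cy", "t2", "relevant"),
    ("ana", "t3", "relevant"), ("ben", "t3", "spam"), ("cy", "t3", "spam"),
    ("ana", "t4", "spam"), ("ben", "t4", "relevant"), ("cy", "t4", "relevant"),
    ("ana", "t5", "relevant"), ("ben", "t5", "spam"), ("cy", "t5", "spam"),
]
# Hypothetical gold-standard labels for the first four items.
gold = {"t1": "spam", "t2": "relevant", "t3": "relevant", "t4": "spam"}

weights = coder_accuracy(codings, gold)
label_t5 = weighted_label(codings, weights, "t5")
print(weights, label_t5)
```

In this toy data a simple majority would label t5 as spam, but the one coder who matched every gold-standard item outweighs the two less accurate coders combined, so the weighted vote follows that coder instead.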

Instructor’s Bio

Dr. Stuart W. Shulman is the founder and CEO of Texifter. He was a Research Associate Professor of Political Science at the University of Massachusetts Amherst and the founding Director of the Qualitative Data Analysis Program (QDAP) at the University of Pittsburgh and at UMass Amherst. Dr. Shulman is Editor Emeritus of the Journal of Information Technology & Politics, the official journal of the Information Technology & Politics section of the American Political Science Association.


Workshop Organizers

Stu Shulman

CEO, Texifter
Dr. Stuart W. Shulman is the founder and CEO of Texifter. Stu was formerly a UMass Amherst political science professor and the Vice President for Text Analytics at Vision Critical. Dr. Shulman is the sole inventor of the Coding Analysis Toolkit (CAT), an open source, Web-based text...


Friday July 28, 2017 09:00 - 10:30
TRS 1-075 - 7th Flr Ted Rogers School of Management, Ryerson University 55 Dundas Street West, Toronto, ON M5G 2C9
